# Tracing & Evaluation for Groq-based Agents using Maxim AI

Learn how to integrate Maxim observability with the Groq SDK. 

When you use Groq with Maxim instrumentation, the following information is automatically captured for each API call:
- Request Details: Model name, temperature, max tokens, and all other parameters
- Message History: Complete conversation context including system and user messages
- Response Content: Full assistant responses and metadata
- Usage Statistics: Input tokens, output tokens, total tokens consumed
- Cost Tracking: Estimated costs based on Groq’s pricing
- Error Handling: Any API errors or failures with detailed context

Maxim also supports node-level evaluations and real-time alerts (Slack, PagerDuty, etc.) on top of these logs.

Link to docs: https://getmax.im/HIF14Di

## Install dependencies & Set environment variables

In [None]:
# Install dependencies (uncomment the lines below to run them in a notebook)
# %pip install groq
# %pip install maxim-py

# Set environment variables
import os
os.environ["GROQ_API_KEY"] = "YOUR_GROQ_API_KEY"  # read by Groq() when no api_key is passed
os.environ["MAXIM_API_KEY"] = "YOUR_MAXIM_API_KEY"
os.environ["MAXIM_LOG_REPO_ID"] = "YOUR_MAXIM_LOG_REPO_ID"
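
Before creating any clients, it can help to check that the required variables are set. The following is a minimal sketch: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, not part of `groq` or `maxim-py`. `GROQ_API_KEY` is included on the assumption that the Groq client is created without an explicit `api_key`, as it is later in this notebook.

```python
import os

# Hypothetical helper (not part of maxim-py or groq): lists which required
# variables are absent or empty in the given environment mapping.
REQUIRED_VARS = ("MAXIM_API_KEY", "MAXIM_LOG_REPO_ID", "GROQ_API_KEY")

def missing_env_vars(env=None, required=REQUIRED_VARS):
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this check first reports a missing key directly, before any client is constructed.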

## Import the required dependencies

In [None]:
from groq import Groq

from maxim.logger.groq import instrument_groq
from maxim import Config, Maxim
from maxim.logger import LoggerConfig

## Create a Maxim Logger & Instrument Groq

In [None]:
import os

maxim_api_key = os.environ.get("MAXIM_API_KEY")
maxim_log_repo_id = os.environ.get("MAXIM_LOG_REPO_ID")

maxim = Maxim(Config(api_key=maxim_api_key))

logger = maxim.logger(LoggerConfig(id=maxim_log_repo_id))

instrument_groq(logger)

[MaximSDK] Initializing Maxim AI(v3.9.12)


Now that Maxim is configured and Groq is instrumented, calls made through the Groq client are logged to Maxim automatically.


## Simple Inference

In [6]:
client = Groq()

In [7]:
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],

    model="llama-3.3-70b-versatile"
)

In [None]:
print(chat_completion.choices[0].message.content)
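
The usage statistics and cost estimate that Maxim records can also be approximated by hand from the response's `usage` object. The sketch below is illustrative arithmetic only, not Maxim's actual cost logic. The two prices are placeholder values rather than real Groq rates, `estimate_cost` is a hypothetical helper, and a `SimpleNamespace` stands in for `chat_completion.usage` so the block runs without an API call.

```python
from types import SimpleNamespace

# Placeholder prices in USD per one million tokens. These are NOT real Groq
# prices; substitute the published rates for the model you use.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 1.00

def estimate_cost(usage, input_price=INPUT_PRICE_PER_M, output_price=OUTPUT_PRICE_PER_M):
    """Estimate request cost from prompt and completion token counts."""
    return (usage.prompt_tokens * input_price
            + usage.completion_tokens * output_price) / 1_000_000

# Stand-in for `chat_completion.usage`, so the sketch runs offline.
usage = SimpleNamespace(prompt_tokens=2000, completion_tokens=1000, total_tokens=3000)
print(f"{usage.total_tokens} tokens, estimated ${estimate_cost(usage):.6f}")
```

With a real response, pass `chat_completion.usage` in place of the stand-in object.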

## Streaming Response

In [None]:
stream = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],

    model="llama-3.3-70b-versatile",

    temperature=0.5,

    max_completion_tokens=1024,

    top_p=1,

    stop=None,

    stream=True,
)

for chunk in stream:
    # delta.content can be None (e.g. on the final chunk), so fall back to ""
    print(chunk.choices[0].delta.content or "", end="")
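
If you need the full response text after streaming rather than just printed output, the deltas can be collected as they arrive. This is a minimal sketch: `collect_stream_text` and `fake_chunk` are hypothetical helpers, and the fake chunks only mimic the `choices[0].delta.content` attribute path used above so the block runs without an API call.

```python
from types import SimpleNamespace

def collect_stream_text(chunks):
    """Join the text deltas from an iterable of chunks, skipping empty ones."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. on the final chunk
            parts.append(delta)
    return "".join(parts)

def fake_chunk(text):
    # Hypothetical stand-in mimicking the attribute path of a streamed chunk.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

demo_stream = [fake_chunk("Fast "), fake_chunk("models"), fake_chunk(None)]
print(collect_stream_text(demo_stream))
```

With a real stream, call `collect_stream_text(stream)` instead of iterating over it yourself.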

## Async Chat Completion

In [None]:
from groq import AsyncGroq

In [None]:
import asyncio

from groq import AsyncGroq

async def main():
    client = AsyncGroq()

    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            # Set a user message for the assistant to respond to.
            {
                "role": "user",
                "content": "Explain the importance of fast language models",
            }
        ],

        model="llama-3.3-70b-versatile",

        temperature=0.5,

        max_completion_tokens=1024,
        top_p=1,

        stop=None,

        stream=False,
    )

    # Print the completion returned by the LLM.
    print(chat_completion.choices[0].message.content)

await main()  # Outside Jupyter, use asyncio.run(main()) instead
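
Top-level `await` works in Jupyter because the notebook already runs an event loop, while a plain script has to start one with `asyncio.run`. The sketch below shows one way to tell the two cases apart. `has_running_loop` and `demo` are hypothetical names, and `demo` is a stand-in coroutine that makes no API call.

```python
import asyncio

def has_running_loop():
    """Return True when called from inside a running asyncio event loop."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False

async def demo():
    # Stand-in for the `main()` coroutine above; it makes no API call.
    await asyncio.sleep(0)
    return has_running_loop()

if has_running_loop():
    print("Event loop already running: use `await demo()`")
else:
    print("Inside coroutine, loop running:", asyncio.run(demo()))
```

Calling `asyncio.run` while a loop is already running raises a `RuntimeError`, which is why the notebook cells use `await main()` directly.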

## Async Completion with Streaming

In [None]:
async def main():
    client = AsyncGroq()

    stream = await client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            # Set a user message for the assistant to respond to.
            {
                "role": "user",
                "content": "Explain the importance of fast language models",
            }
        ],

        # The language model which will generate the completion.
        model="llama-3.3-70b-versatile",

        temperature=0.5,

        max_completion_tokens=1024,

        top_p=1,

        stop=None,

        # If set, partial message deltas will be sent.
        stream=True,
    )

    # Print the incremental deltas returned by the LLM.
    async for chunk in stream:
        # delta.content can be None (e.g. on the final chunk), so fall back to ""
        print(chunk.choices[0].delta.content or "", end="")

await main()

## Check Logs on Maxim Dashboard

![Maxim Dashboard](images/groq_fin_maxim.gif)

This example showed how to use Maxim to log Groq model interactions. On the Maxim dashboard you can:
- View detailed logs of all your LLM interactions
- Inspect request and response payloads
- Monitor model usage and performance metrics
- Track tool / function calls
- Run node-level evals and get real-time alerts (Slack, PagerDuty, etc.)

For more information, see this [cookbook](https://www.getmaxim.ai/docs/cookbooks/integrations/groq).