# Annotate Traces Tutorial

Annotations in Agenta let you enrich the traces created by your LLM applications. You can add scores, comments, expected answers and other metrics to help evaluate your application's performance.

In this tutorial, we'll:
1. Set up the Agenta SDK and create a traced LLM application
2. Run the application to generate traces
3. Add annotations to those traces programmatically
4. Query and view the annotations

## What You Can Do With Annotations

- Collect user feedback on LLM responses
- Run custom evaluation workflows
- Measure application performance in real time

## Step 1: Install Required Packages

First, install the Agenta SDK, the OpenAI client, the OpenTelemetry instrumentor for OpenAI, and `requests` (used later to call the annotations API):

In [None]:
%pip install -U agenta openai opentelemetry-instrumentation-openai requests

## Step 2: Configure Environment Variables

To start tracing your application and adding annotations, you'll need an API key:

1. Visit the Agenta API Keys page under settings
2. Click on **Create New API Key** and follow the prompts

Then set your environment variables:

In [None]:
import os

# Set your API keys here
os.environ["AGENTA_API_KEY"] = ""
os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai"  # Change for self-hosted
os.environ["OPENAI_API_KEY"] = ""

In [None]:
import agenta as ag
from getpass import getpass

# Prompt for any API key that is still unset or empty
api_key = os.getenv("AGENTA_API_KEY")
if not api_key:
    os.environ["AGENTA_API_KEY"] = getpass("Enter your Agenta API key: ")

openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

# Initialize Agenta
ag.init()
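
A check like the one below could run just before `ag.init()` to catch keys that are still empty. This is a minimal sketch: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names introduced for illustration and are not part of the Agenta SDK.

```python
import os
from typing import Mapping, Sequence

# Variable names taken from the cells above; AGENTA_HOST is left out
# because later cells fall back to a default host.
REQUIRED_VARS = ("AGENTA_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(
    names: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names that are unset or empty (hypothetical helper)."""
    return [name for name in names if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```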

## Step 3: Create and Instrument an LLM Application

Let's create a simple LLM application that we can trace and annotate:

In [None]:
import openai
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

# Instrument OpenAI to automatically capture traces
OpenAIInstrumentor().instrument()

In [None]:
@ag.instrument()
def answer_question(question: str) -> tuple[str, str, str]:
    """A simple question-answering function that we'll trace and annotate.
    
    Returns:
        Tuple of (answer, trace_id, span_id)
    """
    response = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that answers questions concisely."},
            {"role": "user", "content": question},
        ],
    )
    
    # Automatically get the trace_id and span_id from the current span
    link = ag.tracing.build_invocation_link()
    
    return response.choices[0].message.content, link.trace_id, link.span_id

## Step 4: Generate a Trace

Let's run our function to generate a trace. The function will automatically capture the trace_id and span_id using `ag.tracing.build_invocation_link()`.

In [None]:
# Run the function to create a trace and get the IDs automatically
question = "What is the capital of France?"
result, trace_id, span_id = answer_question(question)

print(f"Question: {question}")
print(f"Answer: {result}")
print(f"\n✅ Trace captured!")
print(f"Trace ID: {trace_id}")
print(f"Span ID: {span_id}")

## Step 5: Create an Annotation

Now let's add an annotation to the trace we just created. We'll use the trace_id and span_id that were automatically captured.

In [None]:
import requests

base_url = os.environ.get("AGENTA_HOST", "https://cloud.agenta.ai")
api_key = os.environ["AGENTA_API_KEY"]

headers = {
    "Content-Type": "application/json",
    "Authorization": f"ApiKey {api_key}"
}

# Create an annotation with a score and reasoning
annotation_data = {
    "annotation": {
        "data": {
            "outputs": {
                "score": 90,
                "normalized_score": 0.9,
                "reasoning": "The answer is correct and concise",
                "expected_answer": "The capital of France is Paris"
            }
        },
        "references": {
            "evaluator": {
                "slug": "accuracy_evaluator"
            }
        },
        "links": {
            "invocation": {
                "trace_id": trace_id,
                "span_id": span_id
            }
        },
        "metadata": {
            "annotator": "tutorial_user",
            "timestamp": "2025-10-30T00:00:00Z"
        }
    }
}

# Make the API request (note the trailing slash!)
response = requests.post(
    f"{base_url}/api/preview/annotations/",
    headers=headers,
    json=annotation_data
)

# Process the response
if response.status_code == 200:
    print("✅ Annotation created successfully!")
    annotation_response = response.json()
    print(f"\nAnnotation trace ID: {annotation_response['annotation']['trace_id']}")
    print(f"Annotation span ID: {annotation_response['annotation']['span_id']}")
    print("\nAnnotation data:")
    print(annotation_response)
else:
    print(f"❌ Error: {response.status_code}")
    print(response.text)

## Step 6: Create Additional Annotations

You can add multiple annotations to the same trace from different evaluators:

In [None]:
# Create another annotation for quality assessment
quality_annotation = {
    "annotation": {
        "data": {
            "outputs": {
                "score": 85,
                "reasoning": "Response is helpful and well-formatted",
                "labels": ["Helpful", "Accurate", "Concise"]
            }
        },
        "references": {
            "evaluator": {
                "slug": "quality_evaluator"
            }
        },
        "links": {
            "invocation": {
                "trace_id": trace_id,
                "span_id": span_id
            }
        }
    }
}

response = requests.post(
    f"{base_url}/api/preview/annotations/",  # Note the trailing slash!
    headers=headers,
    json=quality_annotation
)

if response.status_code == 200:
    print("✅ Quality annotation created successfully!")
else:
    print(f"❌ Error: {response.status_code}")
    print(response.text)

## Step 7: Query Annotations

Now let's query all annotations for our invocation:

In [None]:
# Query all annotations for the invocation
query_data = {
    "annotation": {
        "links": {
            "invocation": {
                "trace_id": trace_id,
                "span_id": span_id
            }
        }
    }
}

response = requests.post(
    f"{base_url}/api/preview/annotations/query",
    headers=headers,
    json=query_data
)

if response.status_code == 200:
    print("✅ Annotations retrieved successfully!")
    annotations = response.json()
    print(f"\nFound {len(annotations.get('annotations', []))} annotation(s)")
    print("\nAnnotations:")
    for idx, ann in enumerate(annotations.get('annotations', []), 1):
        print(f"\n--- Annotation {idx} ---")
        print(f"Evaluator: {ann['references']['evaluator']['slug']}")
        print(f"Data: {ann['data']}")
else:
    print(f"❌ Error: {response.status_code}")
    print(response.text)
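
Once the annotations are retrieved, the scores can be aggregated per evaluator. The sketch below assumes each returned annotation mirrors the payloads created earlier (`references.evaluator.slug` and `data.outputs.score`); the actual response may carry additional fields. `average_scores_by_evaluator` is a hypothetical helper, and the sample data is hand-written rather than a real API response.

```python
from collections import defaultdict
from statistics import mean


def average_scores_by_evaluator(query_result: dict) -> dict[str, float]:
    """Average the numeric `score` output per evaluator slug.

    Assumes each annotation is shaped like the payloads created above.
    """
    scores = defaultdict(list)
    for ann in query_result.get("annotations", []):
        slug = ann.get("references", {}).get("evaluator", {}).get("slug", "unknown")
        score = ann.get("data", {}).get("outputs", {}).get("score")
        # Skip annotations without a numeric score (booleans are excluded on purpose).
        if isinstance(score, (int, float)) and not isinstance(score, bool):
            scores[slug].append(score)
    return {slug: mean(values) for slug, values in scores.items()}


# Hand-written sample shaped like the two annotations created in this tutorial.
sample = {
    "annotations": [
        {"references": {"evaluator": {"slug": "accuracy_evaluator"}},
         "data": {"outputs": {"score": 90}}},
        {"references": {"evaluator": {"slug": "quality_evaluator"}},
         "data": {"outputs": {"score": 85}}},
    ]
}
print(average_scores_by_evaluator(sample))
```

With real data, `annotations` from the query cell above could be passed in place of `sample`.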

## Step 8: View Annotations in the UI

You can see all annotations for a trace in the Agenta UI:

1. Log in to your Agenta dashboard
2. Navigate to the **Observability** section
3. Find your trace
4. Check the **Annotations** tab to see detailed information

The right sidebar shows the average metrics for each evaluator.

## Understanding Automatic Trace Capture

The `ag.tracing.build_invocation_link()` function is a helper that automatically:
1. Gets the current span context from the active trace
2. Formats the trace_id and span_id as hex strings
3. Returns a Link object with both IDs ready to use

This is much more convenient than looking up trace IDs manually in the UI.

**Alternative Method:**
You can also use `ag.tracing.get_span_context()` if you need more control:

```python
span_ctx = ag.tracing.get_span_context()
trace_id = f"{span_ctx.trace_id:032x}"  # Format as hexadecimal
span_id = f"{span_ctx.span_id:016x}"    # Format as hexadecimal
```
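
To make the padding concrete, here is a small sketch that applies the same format specifiers to made-up integer values and then checks the result. The helper names `format_ids` and `looks_like_ids` are hypothetical. The widths (32 and 16 hex characters) follow the format strings above, and OpenTelemetry treats all-zero IDs as invalid.

```python
import re

TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")
SPAN_ID_RE = re.compile(r"[0-9a-f]{16}")


def format_ids(trace_id: int, span_id: int) -> tuple[str, str]:
    """Format integer IDs as zero-padded lowercase hex, as in the snippet above."""
    return f"{trace_id:032x}", f"{span_id:016x}"


def looks_like_ids(trace_id: str, span_id: str) -> bool:
    """Check length and character set; an all-zero ID is treated as invalid."""
    return (
        TRACE_ID_RE.fullmatch(trace_id) is not None
        and SPAN_ID_RE.fullmatch(span_id) is not None
        and set(trace_id) != {"0"}
        and set(span_id) != {"0"}
    )


# Made-up integer values, used only to show the padding.
trace_hex, span_hex = format_ids(0xABC, 0x1F)
print(trace_hex)
print(span_hex)
print(looks_like_ids(trace_hex, span_hex))
```

A check like this can be useful before sending IDs to the annotations API, since a malformed ID would link the annotation to nothing.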

## Understanding Annotation Structure

An annotation has four main parts:

1. **Data**: The actual evaluation content (scores, comments)
2. **References**: Which evaluator the annotation belongs to (the evaluator is created automatically if it doesn't exist)
3. **Links**: Which trace and span you're annotating
4. **Metadata** (optional): Any extra information you want to include
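
The four parts above can be assembled with a small helper. This is a sketch only: `build_annotation_payload` is a hypothetical function, not part of the Agenta SDK, and it simply reproduces the dictionary layout used in Steps 5 and 6. The IDs below are placeholders.

```python
from typing import Any, Optional


def build_annotation_payload(
    trace_id: str,
    span_id: str,
    evaluator_slug: str,
    outputs: dict[str, Any],
    metadata: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble the request body in the layout used earlier in this tutorial."""
    annotation: dict[str, Any] = {
        "data": {"outputs": outputs},
        "references": {"evaluator": {"slug": evaluator_slug}},
        "links": {"invocation": {"trace_id": trace_id, "span_id": span_id}},
    }
    # Metadata is optional, so the key is only added when something is provided.
    if metadata:
        annotation["metadata"] = metadata
    return {"annotation": annotation}


payload = build_annotation_payload(
    trace_id="0" * 29 + "abc",  # placeholder, not a real trace
    span_id="0" * 14 + "1f",    # placeholder, not a real span
    evaluator_slug="accuracy_evaluator",
    outputs={"score": 90, "reasoning": "The answer is correct and concise"},
)
print(payload)
```

The resulting dictionary could be passed as the `json=` argument of the `requests.post` call shown in Step 5.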

### Annotation Data Examples

You can include various types of data in your annotations:

In [None]:
# Example 1: Simple score
simple_annotation = {
    "outputs": {
        "score": 3
    }
}

# Example 2: Score with explanation
detailed_annotation = {
    "outputs": {
        "score": 3,
        "comment": "The response is not grounded"
    }
}

# Example 3: Multiple metrics with reference information
comprehensive_annotation = {
    "outputs": {
        "score": 3,
        "normalized_score": 0.5,
        "comment": "The response is not grounded",
        "expected_answer": "The capital of France is Paris",
        "labels": ["factual", "concise"]
    }
}

print("Annotation data can include:")
print("- Numbers (scores, ratings)")
print("- Categories (labels, classifications)")
print("- Text (comments, reasoning)")
print("- Booleans (true/false values)")
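
The tutorial does not state how `normalized_score` is derived from `score`. One common convention is min-max scaling onto the range 0 to 1. The sketch below assumes that convention, and assumes the score of 90 sits on a 0 to 100 scale and the score of 3 on a 1 to 5 scale; both scales are guesses that happen to reproduce the values used in the examples. `normalize_score` is a hypothetical helper.

```python
def normalize_score(score: float, low: float, high: float) -> float:
    """Min-max scale `score` from the range [low, high] onto [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside [{low}, {high}]")
    return (score - low) / (high - low)


# Assumed scales: 0-100 for the Step 5 example, 1-5 for Example 3 above.
print(normalize_score(90, 0, 100))
print(normalize_score(3, 1, 5))
```

Storing both the raw and the normalized value keeps annotations from evaluators with different scales comparable.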

## Optional: Remove an Annotation

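If you need to remove an annotation, you can delete it using the annotation's own `trace_id` and `span_id` (the values returned when the annotation was created, not those of the invocation you annotated):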

In [None]:
# Uncomment and replace with your annotation's trace_id and span_id to delete
# annotation_trace_id = "your_annotation_trace_id"
# annotation_span_id = "your_annotation_span_id"

# response = requests.delete(
#     f"{base_url}/api/preview/annotations/{annotation_trace_id}/{annotation_span_id}",
#     headers=headers
# )

# if response.status_code == 200:
#     print("✅ Annotation deleted successfully")
# else:
#     print(f"❌ Error: {response.status_code}")
#     print(response.text)

## Summary

In this tutorial, we've covered:

1. ✅ Setting up the Agenta SDK and instrumenting an LLM application
2. ✅ Generating traces by running the application
3. ✅ Creating annotations with scores, reasoning, and metadata
4. ✅ Adding multiple annotations from different evaluators
5. ✅ Querying annotations programmatically
6. ✅ Understanding annotation structure and capabilities

## Next Steps

Now that you know how to annotate traces, you can:

- Integrate annotation creation into your evaluation workflows
- Build custom evaluators that automatically annotate traces
- Use annotations to track user feedback in production
- Analyze annotation data to improve your LLM applications