# Observability: OpenTelemetry to Application Insights

> **Author:** Ozgur Guler | AI Solution Leader, AI Innovation Hub
> **Contact:** [ozgur.guler1@gmail.com](mailto:ozgur.guler1@gmail.com)
> **© 2025 Ozgur Guler. All rights reserved.**

---

This notebook demonstrates how to enable observability for Azure AI Foundry agents using OpenTelemetry and Application Insights.

## What is Agent Observability?

Agent observability provides:
- **End-to-end tracing** of agent interactions
- **Performance metrics** (latency, token usage, API calls)
- **Debugging capabilities** for multi-agent systems
- **Cost tracking** and optimization insights

## How It Works

Azure AI Foundry emits traces in the **OpenTelemetry** format. Three services then store and surface them:
1. **Application Insights** stores the trace data
2. **Foundry Portal** provides agent-specific views
3. **Azure Monitor** enables full-stack observability

## Prerequisites

Before running this notebook:

1. **Azure CLI authenticated**: Run `az login`
2. **Azure AI Foundry project**: From previous sections
3. **Agent created**: From `../04-foundry-agent-memory`
4. **Application Insights resource**: (notebook can create this)

---

## Section 1: Setup and Configuration

In [None]:
# Install required packages (%pip installs into the environment of the running kernel)
%pip install azure-ai-projects --pre --quiet
%pip install azure-identity python-dotenv --quiet
%pip install azure-monitor-opentelemetry opentelemetry-sdk --quiet
%pip install opentelemetry-exporter-otlp --quiet

In [None]:
import os
from dotenv import load_dotenv

# Load environment from parent directory
load_dotenv("../.env")

# Configuration
FOUNDRY_ACCOUNT = os.getenv("FOUNDRY_ACCOUNT_NAME", "ozgurguler-7212-resource")
PROJECT_NAME = os.getenv("FOUNDRY_PROJECT_NAME", "ozgurguler-7212")
PROJECT_ENDPOINT = f"https://{FOUNDRY_ACCOUNT}.services.ai.azure.com/api/projects/{PROJECT_NAME}"
RESOURCE_GROUP = os.getenv("AZURE_RESOURCE_GROUP", "rg-ozgurguler-7212")

# Model deployment
CHAT_MODEL = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-5-nano")

# Agent from previous section
AGENT_NAME = "chat-agent-with-memory"

print(f"Project Endpoint: {PROJECT_ENDPOINT}")
print(f"Resource Group: {RESOURCE_GROUP}")
print(f"Agent Name: {AGENT_NAME}")
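
The endpoint above is assembled from two environment values that fall back to hard-coded defaults. A minimal sketch of a stricter variant, assuming you would rather fail early on a missing or malformed name: `build_project_endpoint` is a hypothetical helper (not part of any SDK), and the name pattern is an illustrative assumption rather than Azure's actual naming rule.

```python
import re

# Assumed pattern for illustration only: letters, digits, and hyphens.
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]*$")

def build_project_endpoint(account: str, project: str) -> str:
    """Return the project endpoint in the same URL shape used in this notebook."""
    for label, value in (("account", account), ("project", project)):
        if not value or not _NAME_RE.match(value):
            raise ValueError(f"Invalid {label} name: {value!r}")
    return f"https://{account}.services.ai.azure.com/api/projects/{project}"

endpoint = build_project_endpoint("my-account", "my-project")
print(endpoint)
```

Raising on an empty value makes a missing `.env` file visible immediately instead of surfacing later as an authentication or 404 error.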

In [None]:
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

# Initialize the client
credential = DefaultAzureCredential()
client = AIProjectClient(endpoint=PROJECT_ENDPOINT, credential=credential)

print("AIProjectClient initialized successfully")

---

## Section 2: Connect Application Insights

Application Insights stores all trace data. You can:
1. **Create a new** Application Insights resource, or
2. **Connect an existing** one to your Foundry project

### Option A: Via Azure Portal (Recommended)

1. Go to [Azure AI Foundry Portal](https://ai.azure.com)
2. Select your project
3. Navigate to **Observability** > **Tracing**
4. Click **Connect Application Insights**
5. Create new or select existing resource

In [None]:
# Check if Application Insights connection string is available
print("Checking Application Insights connection...")

try:
    connection_string = client.telemetry.get_application_insights_connection_string()
    print("\n✅ Application Insights connected!")
    # Print only a short prefix so most of the instrumentation key stays out of the notebook output
    print(f"Connection string (truncated): {connection_string[:30]}...")
    
    # Store for later use
    APP_INSIGHTS_CONNECTION_STRING = connection_string
    
except Exception as e:
    print(f"\n⚠️  Application Insights not connected: {e}")
    print("\nTo connect Application Insights:")
    print("1. Go to https://ai.azure.com")
    print("2. Select your project")
    print("3. Go to Observability > Tracing")
    print("4. Connect an Application Insights resource")
    
    APP_INSIGHTS_CONNECTION_STRING = None
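
An Application Insights connection string is a `;`-separated list of `Key=Value` pairs that begins with `InstrumentationKey=`. The sketch below shows one way to log it without exposing the key. Both helper names are hypothetical and the sample value is made up.

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key1=Value1;Key2=Value2' into a dict, ignoring empty segments."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment.strip():
            continue
        key, _, value = segment.partition("=")
        parts[key.strip()] = value.strip()
    return parts

def mask_connection_string(conn_str: str) -> str:
    """Return the connection string with the instrumentation key shortened."""
    parts = parse_connection_string(conn_str)
    key = parts.get("InstrumentationKey", "")
    if key:
        parts["InstrumentationKey"] = key[:4] + "..."
    return ";".join(f"{k}={v}" for k, v in parts.items())

# Made-up value for illustration; not a real resource.
sample = (
    "InstrumentationKey=00000000-1111-2222-3333-444444444444;"
    "IngestionEndpoint=https://example.invalid/"
)
masked = mask_connection_string(sample)
print(masked)
```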

### Option B: Create via Azure CLI

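If you need to create Application Insights programmatically, the cell below calls the Azure CLI. The `az monitor app-insights` commands come from the `application-insights` CLI extension (`az extension add --name application-insights`).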

In [None]:
import subprocess
import json

# Only run if App Insights is not connected
CREATE_APP_INSIGHTS = False  # Set to True to create
APP_INSIGHTS_NAME = f"{PROJECT_NAME}-appinsights"

if CREATE_APP_INSIGHTS and APP_INSIGHTS_CONNECTION_STRING is None:
    print(f"Creating Application Insights: {APP_INSIGHTS_NAME}...")
    
    # Create Application Insights
    result = subprocess.run([
        "az", "monitor", "app-insights", "component", "create",
        "--app", APP_INSIGHTS_NAME,
        "--location", "eastus",
        "--resource-group", RESOURCE_GROUP,
        "--kind", "web",
        "--application-type", "web",
        "-o", "json"
    ], capture_output=True, text=True)
    
    if result.returncode == 0:
        app_insights = json.loads(result.stdout)
        APP_INSIGHTS_CONNECTION_STRING = app_insights.get("connectionString")
        print("✅ Created Application Insights")
        # Guard against a missing field before slicing; print only a short prefix
        if APP_INSIGHTS_CONNECTION_STRING:
            print(f"Connection String: {APP_INSIGHTS_CONNECTION_STRING[:30]}...")
        print("\n⚠️  Note: You still need to connect this to your Foundry project via the portal.")
    else:
        print(f"❌ Error: {result.stderr}")
else:
    if APP_INSIGHTS_CONNECTION_STRING:
        print("Application Insights already connected")
    else:
        print("Skipping App Insights creation (CREATE_APP_INSIGHTS = False)")
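
If the resource already exists, its connection string can be read back with `az monitor app-insights component show`. This is a minimal sketch, assuming the CLI and the `application-insights` extension are installed. The helper names are hypothetical, and only the argument list is built at import time, so nothing runs against Azure until `fetch_connection_string` is called.

```python
import subprocess
from typing import List, Optional

def show_connection_string_cmd(app_name: str, resource_group: str) -> List[str]:
    """Build the az CLI arguments that print only the connection string."""
    return [
        "az", "monitor", "app-insights", "component", "show",
        "--app", app_name,
        "--resource-group", resource_group,
        "--query", "connectionString",
        "-o", "tsv",
    ]

def fetch_connection_string(app_name: str, resource_group: str) -> Optional[str]:
    """Run the command; return None if the CLI fails or prints nothing."""
    result = subprocess.run(
        show_connection_string_cmd(app_name, resource_group),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None

# Placeholder names for illustration.
cmd = show_connection_string_cmd("my-appinsights", "my-rg")
print(" ".join(cmd))
```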

---

## Section 3: Configure OpenTelemetry Tracing

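Next, configure OpenTelemetry to capture agent traces. If an Application Insights connection string is available, `configure_azure_monitor` sends spans there; otherwise a console exporter prints them locally.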

In [None]:
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Enable content recording (captures input/output - disable in production if sensitive)
os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = "true"

# Configure Azure Monitor if connection string available
if APP_INSIGHTS_CONNECTION_STRING:
    configure_azure_monitor(connection_string=APP_INSIGHTS_CONNECTION_STRING)
    print("✅ Azure Monitor configured for Application Insights")
else:
    # Fallback to console output for local testing
    print("⚠️  No App Insights connection - using console exporter")
    tracer_provider = TracerProvider()
    tracer_provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(tracer_provider)

# Get tracer for our application
tracer = trace.get_tracer(__name__)
print(f"Tracer initialized: {tracer}")

---

## Section 4: Test Tracing with Agent Invocation

Let's invoke an agent and see the traces captured.

In [None]:
# Get OpenAI client for agent invocation
openai_client = client.get_openai_client()

# Verify agent exists
try:
    agent = client.agents.retrieve(agent_name=AGENT_NAME)
    print(f"✅ Agent found: {agent.name} (version: {agent.version})")
except Exception as e:
    print(f"⚠️  Agent not found: {e}")
    print("Creating a simple test agent...")
    
    from azure.ai.projects.models import PromptAgentDefinition
    
    AGENT_NAME = "observability-test-agent"
    agent = client.agents.create_version(
        agent_name=AGENT_NAME,
        definition=PromptAgentDefinition(
            model=CHAT_MODEL,
            instructions="You are a helpful assistant for testing observability."
        )
    )
    print(f"✅ Created test agent: {agent.name}")

In [None]:
# Test 1: Simple traced agent call
print("=" * 60)
print("Test 1: Simple Traced Agent Call")
print("=" * 60)

with tracer.start_as_current_span("agent-test-simple") as span:
    # Add custom attributes to the span
    span.set_attribute("agent.name", AGENT_NAME)
    span.set_attribute("test.type", "simple")
    
    # Create conversation and invoke agent
    conversation = openai_client.conversations.create()
    span.set_attribute("conversation.id", conversation.id)
    
    response = openai_client.responses.create(
        input="What is 2 + 2? Please respond briefly.",
        conversation=conversation.id,
        extra_body={"agent": {"name": AGENT_NAME, "type": "agent_reference"}},
    )
    
    span.set_attribute("response.status", response.status)
    
    print(f"\nUser: What is 2 + 2?")
    print(f"Agent: {response.output_text}")
    print(f"\nTrace ID: {span.get_span_context().trace_id:032x}")
    print(f"Span ID: {span.get_span_context().span_id:016x}")
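
The `:032x` and `:016x` format specs above turn the integer IDs into the fixed-width lowercase hex strings used by the W3C Trace Context format; the 32-character trace ID is also the value to search for in Application Insights. A small sketch with hypothetical helper names, using the example IDs from the W3C specification:

```python
def format_trace_id(trace_id: int) -> str:
    """128-bit trace ID as 32 lowercase hex characters."""
    return f"{trace_id:032x}"

def format_span_id(span_id: int) -> str:
    """64-bit span ID as 16 lowercase hex characters."""
    return f"{span_id:016x}"

def build_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Assemble a version-00 W3C traceparent header value."""
    flags = "01" if sampled else "00"
    return f"00-{format_trace_id(trace_id)}-{format_span_id(span_id)}-{flags}"

example_trace_id = 0x4BF92F3577B34DA6A3CE929D0E0E4736
example_span_id = 0x00F067AA0BA902B7
traceparent = build_traceparent(example_trace_id, example_span_id)
print(traceparent)
```

Zero padding matters: a span ID with leading zero bytes would otherwise produce a shorter string that no longer matches the stored ID.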

In [None]:
# Test 2: Multi-turn conversation with nested spans
print("=" * 60)
print("Test 2: Multi-Turn Conversation with Nested Spans")
print("=" * 60)

with tracer.start_as_current_span("agent-test-multiturn") as parent_span:
    parent_span.set_attribute("agent.name", AGENT_NAME)
    parent_span.set_attribute("test.type", "multi-turn")
    
    # Create conversation
    conversation = openai_client.conversations.create()
    parent_span.set_attribute("conversation.id", conversation.id)
    
    messages = [
        "Hello! I'm learning about AI agents.",
        "What are some common use cases for AI agents?",
        "Thanks! Can you summarize our conversation?"
    ]
    
    for i, user_input in enumerate(messages):
        with tracer.start_as_current_span(f"turn-{i+1}") as turn_span:
            turn_span.set_attribute("turn.number", i + 1)
            turn_span.set_attribute("user.input", user_input[:100])
            
            response = openai_client.responses.create(
                input=user_input,
                conversation=conversation.id,
                extra_body={"agent": {"name": AGENT_NAME, "type": "agent_reference"}},
            )
            
            turn_span.set_attribute("response.status", response.status)
            
            print(f"\nTurn {i+1}:")
            print(f"  User: {user_input}")
            reply = response.output_text
            preview = f"{reply[:200]}..." if len(reply) > 200 else reply
            print(f"  Agent: {preview}")
    
    print(f"\nParent Trace ID: {parent_span.get_span_context().trace_id:032x}")

---

## Section 5: Custom Tracing for Agent Tools

You can add custom spans for any function or tool call.

In [None]:
import time
from opentelemetry.trace import SpanKind, Status, StatusCode

def traced_tool_call(tool_name: str, input_data: dict) -> dict:
    """Example of a traced tool/function call."""
    with tracer.start_as_current_span(
        f"tool.{tool_name}",
        kind=SpanKind.INTERNAL
    ) as span:
        span.set_attribute("tool.name", tool_name)
        span.set_attribute("tool.input", str(input_data))
        
        try:
            # Simulate tool execution
            start = time.perf_counter()  # monotonic clock, suited to measuring durations
            time.sleep(0.1)  # Simulate processing
            result = {"status": "success", "data": f"Processed {input_data}"}
            
            span.set_attribute("tool.output", str(result))
            span.set_attribute("tool.duration_ms", (time.perf_counter() - start) * 1000)
            span.set_status(Status(StatusCode.OK))
            
            return result
            
        except Exception as e:
            span.set_status(Status(StatusCode.ERROR, str(e)))
            span.record_exception(e)
            raise

# Test custom tool tracing
print("Testing custom tool tracing...")

with tracer.start_as_current_span("agent-with-tools") as span:
    # Simulate agent calling multiple tools
    result1 = traced_tool_call("get_weather", {"city": "Seattle"})
    result2 = traced_tool_call("search_web", {"query": "AI agents"})
    
    print(f"Tool 1 result: {result1}")
    print(f"Tool 2 result: {result2}")
    print(f"\nTrace ID: {span.get_span_context().trace_id:032x}")
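
Span attribute values must be strings, booleans, numbers, or sequences of those, which is why the tool input above is passed through `str()`. An alternative is to flatten the dictionary into dotted keys so each field can be filtered on separately. `flatten_attributes` is a hypothetical helper sketched under that assumption:

```python
import json

PRIMITIVES = (str, bool, int, float)

def flatten_attributes(prefix: str, data: dict) -> dict:
    """Flatten nested dicts into dotted keys with attribute-safe values."""
    flat = {}
    for key, value in data.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            flat.update(flatten_attributes(name, value))
        elif isinstance(value, PRIMITIVES):
            flat[name] = value
        else:
            # Lists, None, and other objects are serialized to a JSON string.
            flat[name] = json.dumps(value, default=str)
    return flat

attrs = flatten_attributes(
    "tool.input",
    {"city": "Seattle", "options": {"units": "metric", "days": 3}, "tags": ["a", "b"]},
)
print(attrs)
```

The result can be attached in one call with `span.set_attributes(attrs)`.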

---

## Section 6: Viewing Traces

### In Azure AI Foundry Portal

1. Go to [Azure AI Foundry](https://ai.azure.com)
2. Select your project
3. Navigate to **Observability** > **Tracing**
4. Filter by time range, agent name, or trace ID

### In Application Insights

1. Go to [Azure Portal](https://portal.azure.com)
2. Find your Application Insights resource
3. Open **Transaction search**, then select a result to open its **End-to-end transaction details**
4. Search by trace ID or time range

### Agents View (Preview)

Application Insights has a new **Agents View** specifically for AI agents:
1. Open Application Insights
2. Navigate to **Monitoring** > **Agents**
3. View agent performance, errors, and usage patterns
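
To look up a single trace in **Logs**, filter on `operation_Id`, which holds the 32-character hex trace ID printed by the earlier cells. Spans exported through the Azure Monitor exporter are stored in the `dependencies` and `requests` tables, so the sketch below targets those. `kql_for_trace` is a hypothetical helper that only builds the query text.

```python
import re

def kql_for_trace(trace_id_hex: str, table: str = "dependencies") -> str:
    """Build a Kusto query that lists the spans of one trace, oldest first."""
    if not re.fullmatch(r"[0-9a-f]{32}", trace_id_hex):
        raise ValueError("trace ID must be 32 lowercase hex characters")
    if table not in ("dependencies", "requests"):
        raise ValueError("table must be 'dependencies' or 'requests'")
    return "\n".join([
        table,
        f'| where operation_Id == "{trace_id_hex}"',
        "| project timestamp, name, duration, success, id, operation_ParentId",
        "| order by timestamp asc",
    ])

query = kql_for_trace("4bf92f3577b34da6a3ce929d0e0e4736")
print(query)
```

Validating the ID before interpolating it keeps stray quotes or whitespace from a copy-paste out of the query.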

In [None]:
# Generate links to view traces
print("=" * 60)
print("View Your Traces")
print("=" * 60)

print(f"\n1. Azure AI Foundry Portal:")
print(f"   https://ai.azure.com/tracing?project={PROJECT_NAME}")

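print("\n2. Application Insights (Azure Portal):")
# Replace the <...> placeholders with your subscription ID and resource name
print(f"   https://portal.azure.com/#@/resource/subscriptions/<subscription-id>/resourceGroups/{RESOURCE_GROUP}/providers/microsoft.insights/components/<app-insights-name>/overview")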

print("\n3. Query spans with Kusto (in App Insights > Logs):")
print("""
   // Recent agent spans. Spans exported through Azure Monitor land in the
   // dependencies (and requests) tables; the traces table holds log records.
   dependencies
   | where timestamp > ago(1h)
   | where name contains "agent"
   | project timestamp, name, duration, success, operation_Id, customDimensions
   | order by timestamp desc
   | take 50
""")

---

## Section 7: Export Traces to OTLP Endpoint (Optional)

You can also export traces to any OTLP-compatible backend (Jaeger, Grafana, etc.).

In [None]:
# Example: Configure OTLP exporter for third-party backends
# Uncomment and configure for your OTLP endpoint

ENABLE_OTLP = False  # Set to True to enable

if ENABLE_OTLP:
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    
    # Configure OTLP endpoint (e.g., Jaeger, Grafana Tempo)
    otlp_endpoint = os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317")
    
    otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint)
    trace.get_tracer_provider().add_span_processor(
        BatchSpanProcessor(otlp_exporter)
    )
    
    print(f"✅ OTLP exporter configured: {otlp_endpoint}")
else:
    print("OTLP export skipped (ENABLE_OTLP = False)")
    print("\nTo export to OTLP backends like Jaeger:")
    print("1. Set ENABLE_OTLP = True")
    print("2. Set OTEL_EXPORTER_OTLP_ENDPOINT environment variable")

---

## Section 8: Best Practices

In [None]:
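# Best practice: Use a decorator for consistent tracing
from functools import wraps
from typing import Optional

# Also imported in Section 5; repeated so this cell does not depend on it
from opentelemetry.trace import Status, StatusCode

def traced(name: Optional[str] = None, attributes: Optional[dict] = None):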
    """Decorator to add tracing to any function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            span_name = name or func.__name__
            with tracer.start_as_current_span(span_name) as span:
                # Add custom attributes
                if attributes:
                    for key, value in attributes.items():
                        span.set_attribute(key, value)
                
                # Add function info
                span.set_attribute("function.name", func.__name__)
                span.set_attribute("function.module", func.__module__)
                
                try:
                    result = func(*args, **kwargs)
                    span.set_status(Status(StatusCode.OK))
                    return result
                except Exception as e:
                    span.set_status(Status(StatusCode.ERROR, str(e)))
                    span.record_exception(e)
                    raise
        return wrapper
    return decorator

# Example usage
@traced(name="process_user_request", attributes={"service": "chat-agent"})
def process_request(user_input: str) -> str:
    """Process a user request with automatic tracing."""
    return f"Processed: {user_input}"

# Test the decorator
result = process_request("Hello, agent!")
print(f"Result: {result}")

---

## Section 9: Cleanup

In [None]:
# Flush traces before exiting (important for batch exporters)

provider = trace.get_tracer_provider()
if hasattr(provider, 'force_flush'):
    provider.force_flush()
    print("✅ Traces flushed to exporters")
else:
    print("Tracer provider does not support force_flush")

print("\nTraces should now be visible in:")
print("- Azure AI Foundry Portal (Tracing section)")
print("- Application Insights (Transaction search)")

---

## Summary

### What We Built

1. **Connected Application Insights** to the Azure AI Foundry project
2. **Configured OpenTelemetry** for trace collection
3. **Traced agent invocations** with custom spans
4. **Added custom tool tracing** for internal operations
5. **Created reusable tracing utilities** (decorator pattern)

### Key Concepts

| Concept | Description |
|---------|-------------|
| **Span** | A single unit of work (e.g., agent call, tool invocation) |
| **Trace** | Collection of spans forming an end-to-end request |
| **Trace ID** | Unique identifier linking all spans in a trace |
| **Attributes** | Key-value metadata attached to spans |
| **Exporter** | Sends traces to a backend (App Insights, OTLP, Console) |

### Environment Variables

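```bash
# Read by this notebook (Section 1 falls back to defaults if unset)
FOUNDRY_ACCOUNT_NAME=<account>
FOUNDRY_PROJECT_NAME=<project>
AZURE_RESOURCE_GROUP=<resource-group>
AZURE_OPENAI_DEPLOYMENT_NAME=<model-deployment>

# Optional - this notebook reads the connection string from the Foundry project instead;
# configure_azure_monitor() uses this variable when no connection string is passed
APPLICATIONINSIGHTS_CONNECTION_STRING=InstrumentationKey=...

# Optional - enable input/output recording (disable in production)
AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED=true

# Optional - OTLP export
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
```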

### Best Practices

- **Name spans descriptively**: Use `agent.chat`, `tool.search`, etc.
- **Add relevant attributes**: agent name, user ID, conversation ID
- **Handle errors**: Use `span.record_exception()` for debugging
- **Disable content recording in production**: Protects sensitive data
- **Flush traces before exit**: Ensures all data is exported
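
One way to apply the content-recording practice is to derive the flag from the deployment environment instead of hard-coding `"true"`. This is a sketch under assumptions: `APP_ENV` is a hypothetical variable name, `content_recording_value` is a hypothetical helper, and the flag should be set before tracing is configured.

```python
import os

def content_recording_value(environment: str) -> str:
    """Return "false" for production-like environments, "true" otherwise."""
    is_production = environment.strip().lower() in {"prod", "production"}
    return "false" if is_production else "true"

# APP_ENV is an assumed variable; substitute whatever your deployment sets.
app_env = os.getenv("APP_ENV", "development")
os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = content_recording_value(app_env)
print(f"{app_env}: content recording = {content_recording_value(app_env)}")
```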

---

## Next Steps

Continue to `../06-foundry-iq-grounding-with-ai-search` for RAG with AI Search.

---

<div align="center">

## License & Attribution

This notebook is part of the **Azure AI Foundry Demo Repository**

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](../LICENSE)

**Original Author:** Ozgur Guler | AI Solution Leader, AI Innovation Hub

**Contact:** [ozgur.guler1@gmail.com](mailto:ozgur.guler1@gmail.com)

---

*If you use, modify, or distribute this work, you must provide appropriate credit to the original author as required by the [Apache License 2.0](../LICENSE).*

**Copyright © 2025 Ozgur Guler. All rights reserved.**

</div>