# Azure AI Chat Client Observability

This notebook demonstrates the **built-in telemetry** in Azure AI using the `AzureAIAgentClient` with streaming responses. It shows how to use that telemetry to monitor chat interactions, tool calls, and code execution.

## What You'll Learn

- ✅ How to use **streaming responses** with observability
- ✅ How to track **multiple tools** (weather function + code interpreter)
- ✅ How to monitor **token usage** across operations
- ✅ How to use **trace IDs** for debugging in Application Insights
- ✅ How to see **Azure AI-specific operations** in traces

## Key Differences from Agent Observability

This sample uses the **chat client** directly instead of creating a `ChatAgent`. This approach:
- Shows lower-level telemetry integration
- Surfaces Azure AI-specific operations (like `create_agent`) in the trace
- Uses the `get_streaming_response()` method
- Tracks multiple tool types in the same trace

> **Note**: You must have an **Application Insights** instance attached to your Azure AI project.

## Prerequisites

Before running this notebook:

1. **Application Insights** connected to your Azure AI Foundry project
2. **Azure CLI** authenticated (`az login --use-device-code`)
3. **Environment variables** in `../.env`:
   - `AZURE_AI_PROJECT_ENDPOINT`
4. **Packages installed**:
   ```bash
   pip install agent-framework agent-framework-azure-ai python-dotenv azure-identity
   ```
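
Before going further, a quick import check can confirm that the packages above are visible to the current kernel. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper, and the import names checked are the ones used in the import cell of this notebook.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package (e.g. "azure") is absent
            found = False
        if not found:
            missing.append(name)
    return missing


# Import names used later in this notebook
required = ["agent_framework", "dotenv", "azure.identity", "opentelemetry"]
print("Missing modules:", missing_modules(required) or "none")
```

An empty result only shows that the modules can be found, not that their versions are compatible.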

## Import Required Libraries

Import the libraries for the chat client, tools, and observability:

In [None]:
# Copyright (c) Microsoft. All rights reserved.

import asyncio
import os
from random import randint
from typing import Annotated

from dotenv import load_dotenv
from pydantic import Field

from agent_framework import HostedCodeInterpreterTool
from agent_framework.azure import AzureAIAgentClient
from agent_framework.observability import get_tracer
from azure.ai.projects.aio import AIProjectClient
from azure.identity.aio import AzureCliCredential
from opentelemetry.trace import SpanKind
from opentelemetry.trace.span import format_trace_id

## Environment Configuration

Load the Azure AI project endpoint from environment variables:

In [None]:
# Load environment variables from parent directory
load_dotenv('../.env')

# Validate required environment variable
required_vars = ['AZURE_AI_PROJECT_ENDPOINT']
missing = [var for var in required_vars if not os.getenv(var)]

if missing:
    raise RuntimeError(f'Missing required environment variables: {missing}')

endpoint = os.getenv('AZURE_AI_PROJECT_ENDPOINT')
print(f'✅ AZURE_AI_PROJECT_ENDPOINT: {endpoint}')

## Define Custom Function Tool

Create a weather function that will be traced automatically:

In [None]:
# ANSI color codes for better console output
BLUE = "\x1b[34m"
RESET = "\x1b[0m"

async def get_weather(
    location: Annotated[str, Field(description="The location to get the weather for.")],
) -> str:
    """Get the weather for a given location."""
    # Simulate a network call delay
    await asyncio.sleep(randint(0, 10) / 10.0)
    
    conditions = ["sunny", "cloudy", "rainy", "stormy"]
    temperature = randint(10, 30)
    weather_condition = conditions[randint(0, len(conditions) - 1)]
    
    return f"The weather in {location} is {weather_condition} with a high of {temperature}°C."

## Understanding Chat Client Telemetry

### What Gets Traced

When using `AzureAIAgentClient.get_streaming_response()`, telemetry captures:

1. **Azure AI Operations**:
   - `create_agent`: Agent creation in Azure AI Foundry
   - `create_thread`: Thread initialization
   - `create_run`: Run creation for processing

2. **Chat Operations**:
   - Model invocations with request/response details
   - Token usage (input/output)
   - Response streaming chunks

3. **Tool Executions**:
   - Custom function calls (`get_weather`)
   - Code interpreter executions
   - Tool arguments and results (if sensitive data enabled)

### Multiple Tools Support

This example demonstrates using **two types of tools**:
- **Custom function**: `get_weather` for weather queries
- **Hosted code interpreter**: For Python code execution

Both are automatically traced in Application Insights!
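
To make the token-usage item above concrete, here is a minimal sketch of summing usage across spans after they have been exported. The span dictionaries are made-up sample data, `total_token_usage` is a hypothetical helper, and the attribute names `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` come from the OpenTelemetry GenAI semantic conventions; that the spans in your traces carry exactly these attributes is an assumption to verify in Application Insights.

```python
from collections import Counter

# Made-up sample data shaped like exported spans (not real telemetry)
sample_spans = [
    {"name": "chat gpt-4o",
     "attributes": {"gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 45}},
    {"name": "execute_tool get_weather", "attributes": {}},
    {"name": "chat gpt-4o",
     "attributes": {"gen_ai.usage.input_tokens": 210, "gen_ai.usage.output_tokens": 80}},
]


def total_token_usage(spans):
    """Sum input and output token counts over spans; spans without usage count as zero."""
    totals = Counter()
    for span in spans:
        attrs = span.get("attributes", {})
        totals["input"] += attrs.get("gen_ai.usage.input_tokens", 0)
        totals["output"] += attrs.get("gen_ai.usage.output_tokens", 0)
    return dict(totals)


print(total_token_usage(sample_spans))  # {'input': 330, 'output': 125}
```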

## Main Demo: Streaming Chat with Multiple Tools

This demo processes multiple questions using streaming responses and different tools:

In [None]:
async def run_chat_with_observability():
    """
    Run Azure AI chat client with observability enabled.
    
    This demonstrates:
    - Streaming responses with telemetry
    - Multiple tool types (function + code interpreter)
    - Azure AI-specific operation tracing
    - Token usage tracking
    """
    # Questions demonstrating different capabilities
    questions = [
        "What's the weather in Amsterdam and in Paris?",
        "Why is the sky blue?",
        "Tell me about AI.",
        "Can you write a Python function that adds two numbers and use it to add 8483 and 5692?",
    ]
    
    # Create async context managers for Azure resources
    async with (
        AzureCliCredential() as credential,
        AIProjectClient(endpoint=endpoint, credential=credential) as project,
        AzureAIAgentClient(project_client=project) as client,
    ):
        # Enable observability - configures telemetry to Application Insights
        print(f"{BLUE}🔧 Setting up Azure AI observability...{RESET}")
        await client.setup_azure_ai_observability()
        print(f"{BLUE}✅ Observability configured{RESET}\n")
        
        # Create a custom span to group all operations
        with get_tracer().start_as_current_span(
            name="Foundry Telemetry from Agent Framework",
            kind=SpanKind.CLIENT
        ) as current_span:
            # Display trace ID for Application Insights lookup
            trace_id = format_trace_id(current_span.get_span_context().trace_id)
            print(f"{BLUE}" + "="*70 + f"{RESET}")
            print(f"{BLUE}🔍 Trace ID: {trace_id}{RESET}")
            print(f"{BLUE}   Use this to find traces in Application Insights{RESET}")
            print(f"{BLUE}" + "="*70 + f"{RESET}\n")
            
            # Process each question with streaming response
            for question in questions:
                print(f"{BLUE}👤 User: {question}{RESET}")
                print(f"{BLUE}🤖 Assistant: {RESET}", end="")
                
                # Stream response chunks
                # Telemetry automatically captures:
                # - Agent creation (if needed)
                # - Tool calls (weather function or code interpreter)
                # - Model interactions
                # - Token usage
                async for chunk in client.get_streaming_response(
                    question,
                    tools=[get_weather, HostedCodeInterpreterTool()]
                ):
                    text = str(chunk)  # convert once; empty chunks are skipped
                    if text:
                        print(f"{BLUE}{text}{RESET}", end="", flush=True)

                print("\n")  # End the response line and leave a blank line
            
            print(f"{BLUE}" + "="*70 + f"{RESET}")
            print(f"{BLUE}✅ Demo completed!{RESET}")
            print(f"{BLUE}   Check Application Insights for trace ID: {trace_id}{RESET}")
            print(f"{BLUE}" + "="*70 + f"{RESET}")

# Run the demo
await run_chat_with_observability()

## Understanding the Telemetry Output

### Spans You'll See in Application Insights

1. **Top-level span**: `Foundry Telemetry from Agent Framework`
   - Groups all operations under one trace ID
   - Duration of the entire conversation

2. **Azure AI Operations**:
   - `create_agent`: First-time agent creation
   - `create_thread`: Thread initialization (once per session)
   - `create_run`: Each question creates a new run

3. **Chat Spans**:
   - Model name and system (e.g., "gpt-4o", "openai")
   - Request and response details
   - Token usage per interaction

4. **Tool Execution Spans**:
   - `execute_tool get_weather`: Weather function calls
   - `execute_tool code_interpreter`: Python code execution
   - Function duration and results

### Questions and Tool Usage

The demo asks four questions, each exercising a different path:

1. **Weather query**: Triggers `get_weather` function (custom tool)
2. **Knowledge question**: Uses only the LLM, no tools
3. **General question**: Direct LLM response
4. **Code execution**: Triggers `HostedCodeInterpreterTool` for Python

## Comparing with Agent Observability

### Chat Client Approach (This Notebook)
```python
async for chunk in client.get_streaming_response(
    question,
    tools=[get_weather, HostedCodeInterpreterTool()]
):
    print(chunk, end="")
```

**Benefits**:
- Lower-level access to Azure AI operations
- See internal Azure AI telemetry (create_agent, create_thread)
- Direct streaming from the client
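
If the complete response text is needed in addition to the live output, the chunks can be accumulated while they are printed. This is a minimal sketch of that pattern: `fake_stream` is a hypothetical stand-in for the client's streaming call so the example runs without Azure, and `collect` is an illustrative helper, not part of the Agent Framework.

```python
import asyncio


async def fake_stream(text, chunk_size=8):
    """Hypothetical stand-in for a streaming response: yields text in small chunks."""
    for start in range(0, len(text), chunk_size):
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield text[start:start + chunk_size]


async def collect(stream):
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    async for chunk in stream:
        text = str(chunk)
        if text:
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)


full_text = asyncio.run(
    collect(fake_stream("The sky is blue because of Rayleigh scattering."))
)
```

Inside a notebook cell an event loop is already running, so use `full_text = await collect(...)` there instead of `asyncio.run(...)`.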

### ChatAgent Approach (Previous Notebook)
```python
agent = ChatAgent(
    chat_client=client,
    tools=get_weather,
    name="WeatherAgent"
)
thread = agent.get_new_thread()  # thread reused across questions
async for update in agent.run_stream(question, thread=thread):
    print(update.text, end="")
```

**Benefits**:
- Higher-level abstraction
- Thread management built-in
- Agent-level telemetry spans

### When to Use Each

- **Chat Client**: When you need fine-grained control and want to see Azure AI internals
- **ChatAgent**: When you want managed thread state and agent-level abstractions

## Next Steps

### View Traces in Azure Portal

1. Copy the **Trace ID** from the output above
2. Go to **Azure Portal** → Your AI Foundry Project
3. Navigate to **Application Insights**
4. Click **Transaction Search** or **Logs**
5. Search for the trace ID
6. View the complete trace timeline:
   - All spans in chronological order
   - Token usage per operation
   - Tool execution details
   - Performance metrics
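
For the **Logs** option, the search in step 5 can be written as a KQL query. The sketch below builds one from a trace ID. `build_trace_query` is a hypothetical helper, and the table names (`dependencies`, `requests`, `traces`) and the `operation_Id` column assume the standard Application Insights log schema; adjust them if your workspace exposes different names.

```python
import re


def build_trace_query(trace_id: str) -> str:
    """Return a KQL query that lists all telemetry rows for one trace ID."""
    # Trace IDs are printed as 32 lowercase hexadecimal characters
    if not re.fullmatch(r"[0-9a-f]{32}", trace_id):
        raise ValueError(f"Not a 32-character hex trace ID: {trace_id!r}")
    return (
        "union dependencies, requests, traces\n"
        f'| where operation_Id == "{trace_id}"\n'
        "| order by timestamp asc"
    )


# Made-up trace ID for illustration; use the one printed by the demo
example_id = format(0x1234ABCD, "032x")
print(build_trace_query(example_id))
```

Validating the ID before building the query catches copy-and-paste mistakes such as a truncated value or stray whitespace.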

### Enable Sensitive Data (Development Only)

To see prompts, responses, and function arguments in traces:

```python
from agent_framework.observability import setup_observability
setup_observability(enable_sensitive_data=True)
```

Or set the environment variable:
```bash
ENABLE_SENSITIVE_DATA=true
```

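⚠️ **Warning**: Enable this only in development environments. Prompts, responses, and tool arguments can contain personal or confidential data, and once traced they are stored in Application Insights.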
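
One way to keep this setting out of production is to derive it from the deployment environment instead of hard-coding `True`. This is a minimal sketch under stated assumptions: `APP_ENV` is a hypothetical variable name chosen for illustration, `sensitive_data_allowed` is not part of the Agent Framework, and anything other than an explicit development value is treated as production.

```python
import os

# Values of the hypothetical APP_ENV variable that count as development
DEV_ENVIRONMENTS = {"dev", "development", "local"}


def sensitive_data_allowed(env=None):
    """Return True only when APP_ENV explicitly names a development environment."""
    env = os.environ if env is None else env
    return env.get("APP_ENV", "production").strip().lower() in DEV_ENVIRONMENTS


# Set the flag before observability is configured
allowed = sensitive_data_allowed()
os.environ["ENABLE_SENSITIVE_DATA"] = "true" if allowed else "false"
print("Sensitive data in traces:", os.environ["ENABLE_SENSITIVE_DATA"])
```

The same boolean can be passed as `enable_sensitive_data=allowed` when calling `setup_observability` directly. Defaulting to production means a missing variable never turns sensitive data on.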

### Explore More

- **Custom Metrics**: Add custom counters with `get_meter()`
- **Custom Spans**: Create your own spans with `get_tracer()`
- **Workflow Telemetry**: See the next notebook for workflow observability
- **Aspire Dashboard**: Try local visualization during development

### Related Documentation

- 📖 [Agent Observability](https://learn.microsoft.com/en-us/agent-framework/user-guide/agents/agent-observability?pivots=programming-language-python)
- 📖 [Application Insights](https://learn.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview)
- 📖 [OpenTelemetry Python](https://opentelemetry.io/docs/instrumentation/python/)