```xml
<VSCode.Cell language="markdown">
# Workshop 2: Tracing and Observability

Welcome to Workshop 2! In this notebook, you'll learn how to add comprehensive observability to your Azure OpenAI applications using OpenTelemetry and Azure Monitor.

## What You'll Learn

1. **OpenTelemetry Fundamentals** - Understanding traces, spans, and telemetry
2. **Instrument Azure OpenAI Calls** - Automatic tracing of API calls
3. **Azure Application Insights Integration** - Send traces to Azure Monitor
4. **Custom Instrumentation** - Add your own traces and metrics
5. **Analyze Performance** - Use traces to debug and optimize
6. **Production Monitoring** - Set up alerts and dashboards

## Prerequisites

- Completed Workshop 1 (Deploy Your First Model)
- Azure AI Foundry project with Application Insights configured
- Environment variables set up correctly

## Learning Objectives

By the end of this workshop, you will:
- Understand distributed tracing concepts
- Instrument Azure OpenAI applications with OpenTelemetry
- Analyze traces in Azure Application Insights
- Create custom spans for business logic
- Set up monitoring for production applications
</VSCode.Cell>

<VSCode.Cell language="python">
# Install required packages for tracing
%pip install opentelemetry-sdk opentelemetry-instrumentation-openai-v2 azure-monitor-opentelemetry azure-core-tracing-opentelemetry opentelemetry-exporter-otlp
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 1. Environment Setup and Imports

Let's set up our environment and import the necessary libraries for tracing.
</VSCode.Cell>

<VSCode.Cell language="python">
import os
import time
import json
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from openai import AzureOpenAI

# OpenTelemetry imports
from opentelemetry import trace, metrics
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader

# Azure Monitor integration
from azure.monitor.opentelemetry import configure_azure_monitor
from azure.core.settings import settings

# Load environment variables
load_dotenv()

print("🔧 Tracing Workshop Environment Check:")
print("-" * 40)

# Check required environment variables
required_vars = [
    'PROJECT_ENDPOINT',
    'AZURE_AI_FOUNDRY_RESOURCE_NAME', 
    'MODEL_DEPLOYMENT_NAME'
]

for var in required_vars:
    value = os.getenv(var)
    status = "✅" if value else "❌"
    print(f"{status} {var}: {'Set' if value else 'Not set'}")

print("-" * 40)
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 2. Understanding OpenTelemetry Concepts

Before we start instrumenting, let's understand the key concepts of observability.
</VSCode.Cell>

<VSCode.Cell language="python">
def explain_telemetry_concepts():
    """
    Explain key OpenTelemetry concepts with examples.
    """
    print("📚 OpenTelemetry Concepts:")
    print("=" * 50)
    
    concepts = {
        "🔍 Traces": {
            "definition": "End-to-end journey of a request through your application",
            "example": "User question → API call → Model inference → Response",
            "use_case": "Understanding request flow and finding bottlenecks"
        },
        "⏱️ Spans": {
            "definition": "Individual operations within a trace (building blocks)",
            "example": "Database query, HTTP request, function execution",
            "use_case": "Measuring duration and recording operation details"
        },
        "🏷️ Attributes": {
            "definition": "Key-value pairs that provide context to spans",
            "example": "model_name=gpt-4o, user_id=123, temperature=0.7",
            "use_case": "Filtering and analyzing traces by specific criteria"
        },
        "📊 Metrics": {
            "definition": "Numerical measurements aggregated over time",
            "example": "Request count, response latency, token usage",
            "use_case": "Monitoring performance trends and alerting"
        },
        "📝 Logs": {
            "definition": "Structured or unstructured text records of events",
            "example": "Error messages, debug information, business events",
            "use_case": "Debugging issues and understanding application behavior"
        }
    }
    
    for concept, details in concepts.items():
        print(f"\n{concept}")
        print(f"  📖 Definition: {details['definition']}")
        print(f"  💡 Example: {details['example']}")
        print(f"  🎯 Use Case: {details['use_case']}")

explain_telemetry_concepts()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 3. Setting Up Local Console Tracing

Let's start with console tracing to see traces locally before sending them to Azure.
</VSCode.Cell>

<VSCode.Cell language="python">
# Configure local console tracing
def setup_console_tracing():
    """
    Set up OpenTelemetry with console output for local debugging.
    """
    print("🖥️ Setting up Console Tracing...")
    
    # Configure tracing
    trace.set_tracer_provider(TracerProvider())
    tracer = trace.get_tracer(__name__)
    
    # Add console exporter for local viewing
    console_exporter = ConsoleSpanExporter()
    span_processor = BatchSpanProcessor(console_exporter)
    trace.get_tracer_provider().add_span_processor(span_processor)
    
    # Configure Azure SDK tracing
    settings.tracing_implementation = "opentelemetry"
    
    print("✅ Console tracing configured")
    return tracer

# Set up console tracing
console_tracer = setup_console_tracing()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 4. Instrument Azure OpenAI SDK

Now let's instrument the OpenAI SDK to automatically trace all API calls.
</VSCode.Cell>

<VSCode.Cell language="python">
# Instrument the OpenAI SDK
def setup_openai_instrumentation():
    """
    Instrument the OpenAI SDK for automatic tracing.
    """
    print("🔧 Instrumenting OpenAI SDK...")
    
    # Content recording is opt-in because prompts and responses may contain sensitive data.
    # The OpenAI instrumentation reads OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT,
    # so mirror the workshop toggle into that variable before instrumenting.
    content_recording = os.getenv('AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED', 'false').lower() == 'true'
    os.environ["OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"] = "true" if content_recording else "false"
    if content_recording:
        print("⚠️ Content recording enabled - traces will include prompts and responses")
    else:
        print("🔒 Content recording disabled - traces will not include sensitive content")
    
    # Instrument OpenAI SDK
    OpenAIInstrumentor().instrument()
    
    print("✅ OpenAI SDK instrumentation complete")

setup_openai_instrumentation()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 5. Connect to Azure AI Foundry and Create Client

Let's connect to our AI Foundry project and create an instrumented OpenAI client.
</VSCode.Cell>

<VSCode.Cell language="python">
# Connect to Azure AI Foundry with tracing
try:
    print("🔗 Connecting to Azure AI Foundry...")
    
    # Initialize project client
    credential = DefaultAzureCredential()
    project_client = AIProjectClient(
        endpoint=os.getenv('PROJECT_ENDPOINT'),
        credential=credential
    )
    
    # Get OpenAI client (now instrumented)
    openai_client = project_client.get_openai_client()
    
    print("✅ Connected to Azure AI Foundry with tracing enabled")
    print(f"📍 Project: {os.getenv('PROJECT_ENDPOINT')}")
    print(f"🤖 Model: {os.getenv('MODEL_DEPLOYMENT_NAME')}")
    
except Exception as e:
    print(f"❌ Connection failed: {e}")
    # Re-raise instead of calling exit(), which would shut down the notebook kernel
    raise
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 6. Test Basic Tracing with Console Output

Let's make some API calls and observe the traces in the console.
</VSCode.Cell>

<VSCode.Cell language="python">
# Test basic tracing with console output
def test_console_tracing():
    """
    Test OpenAI API calls with console tracing to see spans locally.
    """
    print("🧪 Testing Console Tracing:")
    print("-" * 30)
    
    # Create a custom span for our business logic
    with console_tracer.start_as_current_span("workshop_demo") as demo_span:
        # Add attributes to our custom span
        demo_span.set_attribute("workshop.name", "tracing_demo")
        demo_span.set_attribute("user.type", "workshop_participant")
        
        print("Making API call with tracing...")
        
        # This API call will be automatically traced by OpenAI instrumentation
        response = openai_client.chat.completions.create(
            model=os.getenv('MODEL_DEPLOYMENT_NAME'),
            messages=[
                {"role": "system", "content": "You are a helpful assistant explaining AI concepts."},
                {"role": "user", "content": "Explain what observability means in AI applications in 2 sentences."}
            ],
            max_tokens=100,
            temperature=0.7
        )
        
        # Add response details to our span
        demo_span.set_attribute("response.tokens", response.usage.total_tokens)
        demo_span.set_attribute("response.model", response.model)
        
        print(f"\n📝 Response: {response.choices[0].message.content}")
        print(f"📊 Tokens used: {response.usage.total_tokens}")
        
        # Simulate some processing time
        time.sleep(0.1)
        demo_span.add_event("Processing complete")

# Run the test
test_console_tracing()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 7. Azure Application Insights Integration

Now let's configure Azure Application Insights to send traces to the cloud for persistent storage and analysis.
</VSCode.Cell>

<VSCode.Cell language="python">
# Set up Azure Application Insights integration
def setup_azure_monitor():
    """
    Configure Azure Monitor to send traces to Application Insights.
    """
    print("☁️ Setting up Azure Monitor Integration...")
    
    try:
        # Get Application Insights connection string from the project
        connection_string = project_client.telemetry.get_application_insights_connection_string()
        
        if connection_string:
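            print("✅ Retrieved Application Insights connection string from project")
            # Do not print the connection string: it contains the instrumentation key
            
            # A global TracerProvider was already set for console tracing, and OpenTelemetry
            # does not allow replacing it. Attach the Azure Monitor exporter to the existing
            # provider instead of calling configure_azure_monitor(), which would create a new one.
            from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
            azure_exporter = AzureMonitorTraceExporter(connection_string=connection_string)
            trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(azure_exporter))
            print("✅ Azure Monitor configured successfully")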
            
            return True
        else:
            print("❌ No Application Insights connection string found")
            print("💡 Make sure Application Insights is configured in your AI Foundry project")
            return False
            
    except Exception as e:
        print(f"❌ Azure Monitor setup failed: {e}")
        print("💡 You can still use console tracing for this workshop")
        return False

# Configure Azure Monitor
azure_monitor_enabled = setup_azure_monitor()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 8. Advanced Tracing with Custom Spans

Let's create a more complex example with custom spans to trace business logic.
</VSCode.Cell>

<VSCode.Cell language="python">
# Advanced tracing example with custom spans
def advanced_ai_workflow(user_question: str, user_id: str = "workshop_user"):
    """
    Example of a complex AI workflow with custom tracing.
    """
    
    # Get tracer for creating custom spans
    tracer = trace.get_tracer(__name__)
    
    with tracer.start_as_current_span("ai_workflow") as workflow_span:
        # Add workflow-level attributes
        workflow_span.set_attribute("user.id", user_id)
        workflow_span.set_attribute("workflow.type", "question_answering")
        workflow_span.set_attribute("input.question_length", len(user_question))
        
        # Step 1: Question preprocessing
        with tracer.start_as_current_span("preprocess_question") as preprocess_span:
            preprocess_span.add_event("Starting question preprocessing")
            
            # Simulate preprocessing
            cleaned_question = user_question.strip().lower()
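            # Match whole words so that, for example, "explain" does not count as "ai"
            question_words = set(cleaned_question.replace("?", " ").replace(".", " ").replace(",", " ").split())
            technical_terms = {"ai", "ml", "algorithm", "algorithms", "model", "models"}
            question_type = "technical" if question_words & technical_terms else "general"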
            
            preprocess_span.set_attribute("question.type", question_type)
            preprocess_span.set_attribute("question.cleaned_length", len(cleaned_question))
            
            time.sleep(0.05)  # Simulate processing time
            preprocess_span.add_event("Preprocessing complete")
        
        # Step 2: Select appropriate system prompt based on question type
        with tracer.start_as_current_span("select_system_prompt") as prompt_span:
            if question_type == "technical":
                system_prompt = "You are an expert AI/ML engineer. Provide technical but accessible explanations."
            else:
                system_prompt = "You are a helpful assistant. Provide clear and friendly explanations."
            
            prompt_span.set_attribute("prompt.type", question_type)
            prompt_span.set_attribute("prompt.length", len(system_prompt))
        
        # Step 3: Make AI API call (automatically traced)
        with tracer.start_as_current_span("ai_inference") as inference_span:
            inference_span.add_event("Starting AI inference")
            
            start_time = time.time()
            
            response = openai_client.chat.completions.create(
                model=os.getenv('MODEL_DEPLOYMENT_NAME'),
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_question}
                ],
                max_tokens=200,
                temperature=0.7
            )
            
            inference_time = time.time() - start_time
            
            # Add inference metrics to span
            inference_span.set_attribute("inference.duration_ms", round(inference_time * 1000, 2))
            inference_span.set_attribute("tokens.prompt", response.usage.prompt_tokens)
            inference_span.set_attribute("tokens.completion", response.usage.completion_tokens) 
            inference_span.set_attribute("tokens.total", response.usage.total_tokens)
            inference_span.set_attribute("model.name", response.model)
            
            inference_span.add_event("AI inference complete")
        
        # Step 4: Post-process response
        with tracer.start_as_current_span("postprocess_response") as postprocess_span:
            ai_response = response.choices[0].message.content
            
            # Simulate some post-processing
            word_count = len(ai_response.split())
            has_code = "```" in ai_response
            
            postprocess_span.set_attribute("response.word_count", word_count)
            postprocess_span.set_attribute("response.has_code", has_code)
            postprocess_span.set_attribute("response.length", len(ai_response))
            
            time.sleep(0.02)  # Simulate processing time
        
        # Add final workflow metrics
        workflow_span.set_attribute("workflow.success", True)
        workflow_span.set_attribute("workflow.total_tokens", response.usage.total_tokens)
        workflow_span.add_event("Workflow complete")
        
        return {
            "response": ai_response,
            "metadata": {
                "question_type": question_type,
                "tokens_used": response.usage.total_tokens,
                "inference_time_ms": round(inference_time * 1000, 2),
                "word_count": word_count
            }
        }

# Test the advanced workflow
print("🚀 Testing Advanced AI Workflow with Custom Tracing:")
print("=" * 55)

test_questions = [
    "What is machine learning and how does it work?",
    "What's the weather like today?",
    "Explain the difference between supervised and unsupervised learning algorithms."
]

for i, question in enumerate(test_questions, 1):
    print(f"\n{i}️⃣ Question: {question}")
    result = advanced_ai_workflow(question, f"user_{i}")
    print(f"📝 Response: {result['response'][:100]}...")
    print(f"📊 Metadata: {result['metadata']}")
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 9. Error Handling and Tracing

Let's see how to trace errors and exceptions in your AI applications.
</VSCode.Cell>

<VSCode.Cell language="python">
# Error handling with tracing
def trace_with_error_handling():
    """
    Demonstrate how to trace errors and exceptions.
    """
    tracer = trace.get_tracer(__name__)
    
    print("🚨 Testing Error Handling with Tracing:")
    print("-" * 40)
    
    # Test 1: API call with invalid parameters
    with tracer.start_as_current_span("test_invalid_model") as span:
        span.set_attribute("test.type", "invalid_model")
        
        try:
            # This should fail due to invalid model name
            response = openai_client.chat.completions.create(
                model="invalid-model-name",
                messages=[{"role": "user", "content": "Hello"}],
                max_tokens=50
            )
            
        except Exception as e:
            # Record the error in the span
            span.record_exception(e)
            span.set_status(trace.Status(trace.StatusCode.ERROR, str(e)))
            span.set_attribute("error.type", type(e).__name__)
            span.set_attribute("error.message", str(e))
            
            print(f"❌ Expected error captured: {type(e).__name__}")
    
    # Test 2: Rate limiting simulation
    with tracer.start_as_current_span("test_rate_limiting") as span:
        span.set_attribute("test.type", "rate_limiting_simulation")
        
        try:
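            # Send a few requests in quick succession; if the service answers with a
            # rate-limit error (HTTP 429), it is recorded on the per-request span
            print("🔄 Simulating multiple rapid requests...")
            
            for i in range(3):
                # Keep the span name constant and put the index in an attribute,
                # so span names stay low-cardinality
                with tracer.start_as_current_span("rapid_request") as req_span:
                    req_span.set_attribute("request.number", i)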
                    
                    try:
                        response = openai_client.chat.completions.create(
                            model=os.getenv('MODEL_DEPLOYMENT_NAME'),
                            messages=[{"role": "user", "content": f"Quick question {i}"}],
                            max_tokens=20
                        )
                        
                        req_span.set_attribute("request.success", True)
                        req_span.set_attribute("tokens.used", response.usage.total_tokens)
                        print(f"  ✅ Request {i} succeeded")
                        
                    except Exception as e:
                        req_span.record_exception(e)
                        req_span.set_status(trace.Status(trace.StatusCode.ERROR, str(e)))
                        req_span.set_attribute("request.success", False)
                        print(f"  ❌ Request {i} failed: {type(e).__name__}")
                    
                    time.sleep(0.1)  # Small delay between requests
                    
        except Exception as e:
            span.record_exception(e)
            span.set_status(trace.Status(trace.StatusCode.ERROR, str(e)))
            print(f"❌ Batch request error: {e}")

# Run error handling test
trace_with_error_handling()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 10. Performance Analysis with Tracing

Let's create a performance analysis example that shows how to use tracing data.
</VSCode.Cell>

<VSCode.Cell language="python">
# Performance analysis with tracing
def performance_analysis_demo():
    """
    Demonstrate performance analysis using tracing data.
    """
    tracer = trace.get_tracer(__name__)
    
    print("📊 Performance Analysis Demo:")
    print("-" * 35)
    
    # Test different temperature values and their impact on response time
    temperatures = [0.1, 0.5, 1.0]
    results = []
    
    with tracer.start_as_current_span("performance_analysis") as analysis_span:
        analysis_span.set_attribute("analysis.type", "temperature_impact")
        
        for temp in temperatures:
            # Constant span name; the temperature is recorded as an attribute
            with tracer.start_as_current_span("temperature_test") as temp_span:
                temp_span.set_attribute("model.temperature", temp)
                
                start_time = time.time()
                
                try:
                    response = openai_client.chat.completions.create(
                        model=os.getenv('MODEL_DEPLOYMENT_NAME'),
                        messages=[
                            {"role": "system", "content": "You are a creative writer."},
                            {"role": "user", "content": "Write a short poem about AI."}
                        ],
                        max_tokens=100,
                        temperature=temp
                    )
                    
                    end_time = time.time()
                    duration = round((end_time - start_time) * 1000, 2)
                    
                    # Throughput counts generated (completion) tokens only;
                    # prompt tokens are not produced during the measured time
                    completion_tokens = response.usage.completion_tokens
                    tokens_per_second = round(completion_tokens / (duration / 1000), 2) if duration > 0 else 0.0
                    
                    # Add performance metrics to span
                    temp_span.set_attribute("performance.duration_ms", duration)
                    temp_span.set_attribute("performance.tokens_per_second", tokens_per_second)
                    temp_span.set_attribute("tokens.total", response.usage.total_tokens)
                    temp_span.set_attribute("response.length", len(response.choices[0].message.content))
                    
                    results.append({
                        'temperature': temp,
                        'duration_ms': duration,
                        'tokens': response.usage.total_tokens,
                        'tokens_per_second': tokens_per_second,
                        'response_length': len(response.choices[0].message.content)
                    })
                    
                    print(f"🌡️ Temperature {temp}: {duration}ms, {response.usage.total_tokens} tokens")
                    
                except Exception as e:
                    temp_span.record_exception(e)
                    temp_span.set_status(trace.Status(trace.StatusCode.ERROR, str(e)))
                    print(f"❌ Temperature {temp} failed: {e}")
    
    # Analyze results
    if results:
        print(f"\n📈 Performance Analysis Results:")
        print("-" * 35)
        
        avg_duration = sum(r['duration_ms'] for r in results) / len(results)
        avg_tokens_per_sec = sum(r['tokens_per_second'] for r in results) / len(results)
        
        print(f"Average duration: {avg_duration:.2f}ms")
        print(f"Average tokens/sec: {avg_tokens_per_sec:.2f}")
        
        # Find fastest and slowest
        fastest = min(results, key=lambda x: x['duration_ms'])
        slowest = max(results, key=lambda x: x['duration_ms'])
        
        print(f"Fastest: Temperature {fastest['temperature']} ({fastest['duration_ms']}ms)")
        print(f"Slowest: Temperature {slowest['temperature']} ({slowest['duration_ms']}ms)")

# Run performance analysis
performance_analysis_demo()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 11. Viewing Traces in Azure Application Insights

If Azure Monitor is configured, your traces are now being sent to Application Insights. Here's how to view them.
</VSCode.Cell>

<VSCode.Cell language="python">
# Instructions for viewing traces in Azure
def show_azure_insights_instructions():
    """
    Provide instructions for viewing traces in Azure Application Insights.
    """
    print("☁️ Viewing Traces in Azure Application Insights:")
    print("=" * 50)
    
    if azure_monitor_enabled:
        print("✅ Your traces are being sent to Azure Application Insights!")
        print()
        
        steps = [
            "1. Open the Azure portal (portal.azure.com)",
            "2. Navigate to your AI Foundry project resource",
            "3. Go to 'Observability' -> 'Tracing' in the left menu",
            "4. Or open the Application Insights resource directly",
            "5. In Application Insights, go to 'Investigate' -> 'Transaction search'",
            "6. Look for traces with operation names like:",
            "   • 'ai_workflow' (our custom spans)",
            "   • 'chat <deployment name>' (OpenAI API calls)",
            "   • 'workshop_demo' (our demo spans)",
            "7. Click on a trace to see the detailed span timeline",
            "8. Explore the 'Performance' tab for aggregated metrics",
            "9. Use 'Logs' to write KQL queries for custom analysis"
        ]
        
        for step in steps:
            print(step)
        
        print(f"\n🔍 What to Look For:")
        insights = [
            "• End-to-end trace duration",
            "• Individual span timings (preprocessing, inference, postprocessing)",
            "• OpenAI API call details and token usage",
            "• Custom attributes we added (question_type, user_id, etc.)",
            "• Error traces and exception details",
            "• Performance patterns across different requests"
        ]
        
        for insight in insights:
            print(insight)
        
        print(f"\n📊 Useful KQL Queries for Application Insights:")
        queries = [
            "// Custom spans from this workshop (spans are stored in 'dependencies'; 'traces' holds logs)",
            "dependencies | where customDimensions['workshop.name'] == 'tracing_demo'",
            "",
            "// OpenAI API performance",
            "dependencies | where name startswith 'chat '",
            "| summarize avg(duration), count() by bin(timestamp, 5m)",
            "",
            "// Token usage over time",
            "dependencies | where isnotempty(customDimensions['tokens.total'])",
            "| extend tokens = toint(customDimensions['tokens.total'])",
            "| summarize avg(tokens), sum(tokens) by bin(timestamp, 1h)"
        ]
        
        for query in queries:
            print(query)
            
    else:
        print("❌ Azure Monitor not configured")
        print("💡 To enable Azure Application Insights:")
        print("1. Ensure Application Insights is configured in your AI Foundry project")
        print("2. Check the 'Observability' section in Azure AI Foundry portal")
        print("3. Verify your connection string is accessible")

show_azure_insights_instructions()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 12. Production Best Practices

Let's cover best practices for using tracing in production environments.
</VSCode.Cell>

<VSCode.Cell language="python">
# Production best practices
def production_best_practices():
    """
    Demonstrate production best practices for tracing.
    """
    print("🏭 Production Tracing Best Practices:")
    print("=" * 40)
    
    practices = {
        "🔒 Security": [
            "• Disable content recording in production (set AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED=false)",
            "• Use Azure Managed Identity instead of API keys",
            "• Be careful with custom attributes - don't include PII",
            "• Review trace data retention policies"
        ],
        "⚡ Performance": [
            "• Use sampling to reduce trace volume (start with 1% sampling)",
            "• Implement batch span processors instead of simple processors",
            "• Set appropriate timeout values for exporters",
            "• Monitor the overhead of tracing itself"
        ],
        "📊 Monitoring": [
            "• Set up alerts on error rates and high latency",
            "• Monitor token usage trends and costs",
            "• Track model performance metrics over time",
            "• Create dashboards for key business metrics"
        ],
        "🏗️ Architecture": [
            "• Use consistent span names across services",
            "• Add business context through custom attributes",
            "• Implement correlation IDs for distributed tracing",
            "• Document your tracing strategy and naming conventions"
        ]
    }
    
    for category, items in practices.items():
        print(f"\n{category}:")
        for item in items:
            print(f"  {item}")
    
    print(f"\n🚨 Common Pitfalls to Avoid:")
    pitfalls = [
        "• Tracing sensitive data (passwords, personal info)",
        "• Over-instrumenting and creating performance overhead",
        "• Not handling tracing failures gracefully",
        "• Forgetting to update trace configurations in different environments",
        "• Not correlating traces with business metrics"
    ]
    
    for pitfall in pitfalls:
        print(f"  {pitfall}")

production_best_practices()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 13. Workshop Summary and Next Steps

Congratulations! You've completed Workshop 2 on Tracing and Observability.
</VSCode.Cell>

<VSCode.Cell language="python">
# Workshop summary
def workshop_summary():
    """
    Summarize what was learned in Workshop 2.
    """
    print("🎯 Workshop 2 Summary:")
    print("=" * 50)
    
    achievements = [
        "✅ Understood OpenTelemetry concepts (traces, spans, attributes)",
        "✅ Instrumented Azure OpenAI SDK for automatic tracing",
        "✅ Set up console tracing for local debugging",
        "✅ Configured Azure Application Insights integration",
        "✅ Created custom spans for business logic",
        "✅ Implemented error handling with tracing",
        "✅ Performed performance analysis using trace data",
        "✅ Learned production best practices"
    ]
    
    for achievement in achievements:
        print(achievement)
    
    print(f"\n🔧 Technical Skills Gained:")
    skills = [
        "• OpenTelemetry SDK configuration and usage",
        "• Azure Monitor OpenTelemetry integration",
        "• Custom span creation and attribute management",
        "• Error tracking and exception recording",
        "• Performance measurement and analysis",
        "• Production monitoring setup"
    ]
    
    for skill in skills:
        print(skill)
    
    print(f"\n🚀 Next Workshop Preview:")
    print("Workshop 3: AI Agents")
    print("• Create intelligent AI agents with tools")
    print("• Implement function calling capabilities")
    print("• Build multi-step reasoning workflows")
    print("• Trace agent interactions and decision-making")
    
    print(f"\n💡 Homework:")
    print("• Explore your traces in Azure Application Insights")
    print("• Try adding custom attributes to trace business metrics")
    print("• Experiment with different sampling rates")
    print("• Set up a simple alert on token usage or error rates")

# Final cleanup
def cleanup_instrumentation():
    """
    Clean up OpenTelemetry instrumentation.
    """
    try:
        OpenAIInstrumentor().uninstrument()
        print("🧹 OpenAI instrumentation cleaned up")
    except Exception as e:
        print(f"Note: Cleanup not needed or failed: {e}")

workshop_summary()
cleanup_instrumentation()
</VSCode.Cell>

<VSCode.Cell language="markdown">
## 🔧 Troubleshooting Guide

### Common Issues and Solutions

#### Tracing Not Appearing
- **Problem**: No traces visible in console or Azure
- **Solution**: 
  - Check if OpenAI instrumentation is properly set up
  - Verify Azure Monitor connection string
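  - Ensure you're making actual API calls
  - Wait a few seconds or call `trace.get_tracer_provider().force_flush()`: `BatchSpanProcessor` exports spans on a timer (every 5 seconds by default), not immediately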

#### Application Insights Connection Issues
- **Problem**: "No connection string found"
- **Solution**:
  - Check if Application Insights is configured in your AI Foundry project
  - Go to Azure AI Foundry portal → Observability → Tracing
  - Verify your Azure permissions

#### Performance Overhead
- **Problem**: Tracing is slowing down your application
- **Solution**:
  - Implement sampling (trace only 1-10% of requests)
  - Use batch processors instead of simple processors
  - Disable content recording in production

#### Missing Trace Data
- **Problem**: Some spans or attributes are missing
- **Solution**:
  - Check for exceptions in span creation
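  - Verify attribute names are non-empty strings and attribute values are strings, booleans, numbers, or lists of those; other value types (such as `None` or dicts) are dropped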
  - Ensure spans are properly closed

### 📚 Additional Resources

- [OpenTelemetry Python Documentation](https://opentelemetry.io/docs/languages/python/)
- [Azure Monitor OpenTelemetry Documentation](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable)
- [Azure AI Foundry Tracing Guide](https://learn.microsoft.com/azure/ai-foundry/how-to/develop/trace-application)
- [Application Insights KQL Reference](https://learn.microsoft.com/azure/data-explorer/kusto/query/)

### 🎮 Try These Extensions

1. **Add Custom Metrics**: Implement custom metrics for token costs
2. **Correlation IDs**: Add correlation IDs to track requests across services  
3. **Sampling**: Implement different sampling strategies
4. **Dashboards**: Create custom dashboards in Application Insights
</VSCode.Cell>
```
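
For the "Add Custom Metrics" extension listed in the troubleshooting guide, one possible starting point is a small helper that turns token counts into an estimated cost. This is only a sketch: `TokenPrices` and `estimate_cost_usd` are hypothetical names, and the prices are placeholders, not real Azure OpenAI pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrices:
    """Hypothetical prices in USD per 1,000 tokens (placeholders, not real pricing)."""
    prompt_per_1k: float
    completion_per_1k: float


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int, prices: TokenPrices) -> float:
    """Estimate the cost of one request from its prompt and completion token counts."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    cost = (prompt_tokens / 1000) * prices.prompt_per_1k
    cost += (completion_tokens / 1000) * prices.completion_per_1k
    return round(cost, 6)


# Example with made-up prices and token counts
example_prices = TokenPrices(prompt_per_1k=0.005, completion_per_1k=0.015)
example_cost = estimate_cost_usd(1200, 300, example_prices)
print(f"Estimated cost: ${example_cost:.6f}")
```

In the workshop's workflow, the result could be attached to a span with `set_attribute` (for example under an assumed name such as `cost.estimated_usd`) next to the existing `tokens.total` attribute, so cost can be queried alongside token usage.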
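
The guide above recommends tracing only 1-10% of requests. The OpenTelemetry SDK ships ratio-based samplers for this (such as `TraceIdRatioBased` in `opentelemetry.sdk.trace.sampling`); the plain-Python sketch below only illustrates the underlying idea and is not the SDK's implementation. The decision is derived from the trace ID itself, so every span in a trace gets the same answer. `should_sample` is a hypothetical helper name.

```python
import random

TRACE_ID_BITS = 128  # OpenTelemetry trace IDs are 128-bit integers


def should_sample(trace_id: int, ratio: float) -> bool:
    """Keep a trace when the low 64 bits of its ID fall below ratio * 2**64."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    low_64_bits = trace_id & ((1 << 64) - 1)
    return low_64_bits < int(ratio * (1 << 64))


# Simulate 10,000 random trace IDs and keep roughly 10% of them
rng = random.Random(42)
trace_ids = [rng.getrandbits(TRACE_ID_BITS) for _ in range(10_000)]
kept = sum(should_sample(t, 0.10) for t in trace_ids)
print(f"Kept {kept} of {len(trace_ids)} traces")
```

Because the decision depends only on the trace ID, services that share a trace and use the same ratio agree on whether to keep it, without any coordination between them.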
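
For the "Correlation IDs" extension, a minimal sketch using only the standard library might look like the following. It assumes the correlation ID is kept in a `contextvars.ContextVar` for the duration of a request; the function names and the `correlation.id` attribute key are hypothetical, not part of any SDK.

```python
import contextvars
import uuid

# Holds the correlation ID for the request currently being handled
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def start_request(incoming_id=None) -> str:
    """Reuse the caller's correlation ID if one was supplied, otherwise create one."""
    correlation_id = incoming_id or uuid.uuid4().hex
    _correlation_id.set(correlation_id)
    return correlation_id


def correlation_attributes() -> dict:
    """Attributes to copy onto each span created while handling the request."""
    correlation_id = _correlation_id.get()
    return {"correlation.id": correlation_id} if correlation_id else {}


first = start_request()           # no incoming ID: a new one is generated
second = start_request("abc123")  # incoming ID from an upstream service is reused
print(correlation_attributes())
```

In the workshop's code, the returned dictionary could be applied to a span by looping over its items and calling `set_attribute` for each pair, which makes it possible to filter all spans for one request by a single ID.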