# Workshop 1: Deploy Your First Model

Welcome to the Azure OpenAI Workshop! In this first notebook, you'll learn how to connect to Azure OpenAI and make your first API request.

## What You'll Learn

1. **Environment Setup** - Verify your Python environment and packages
2. **Azure OpenAI Connection** - Connect to your deployed Azure OpenAI resource
3. **First API Request** - Make a basic chat completion request
4. **Understanding Responses** - Explore the API response structure
5. **Token Usage** - Monitor and understand token consumption

## Prerequisites

- Azure OpenAI resource deployed (via infrastructure scripts)
- Environment variables configured in `.env` file
- Python environment set up with required packages

## Quick Setup

If you haven't set up the environment yet:

```bash
# Install dependencies
uv sync

# Activate environment (optional - uv run handles this automatically)
source .venv/bin/activate

# Start Jupyter
uv run jupyter lab
```

## 1. Environment Setup and Verification

Let's start by verifying that all required packages are installed and your environment is ready.

In [None]:
# Verify environment setup
try:
    import openai
    import azure.ai.projects
    import azure.ai.inference
    import azure.identity
    import pandas as pd
    import numpy as np

    print("✅ Environment Setup Successful!")
    print("-" * 40)
    print(f"📦 OpenAI SDK: {openai.__version__}")
    print(f"📦 Pandas: {pd.__version__}")
    print(f"📦 NumPy: {np.__version__}")
    print("\n🚀 Ready to proceed with the workshop!")

except ImportError as e:
    print(f"❌ Import error: {e}")
    print("\n💡 Setup required:")
    print("1. Run: uv sync")
    print("2. Make sure you're using the correct kernel")
    print("3. Restart this notebook")

## 2. Environment Variables Check

Let's verify that your environment variables are properly configured for Azure OpenAI.
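
The check in the next cell prints a status line per variable and keeps going. If you would rather stop immediately when something is missing, a stricter variant could look like the sketch below. The helper names `find_missing` and `require_env` are hypothetical and not part of any SDK; the variable names are the ones this notebook uses.

```python
import os

# Variable names taken from the environment check in this notebook.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
)


def find_missing(env, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in required if not (env.get(name) or "").strip()]


def require_env(env=None):
    """Raise one error listing every missing variable (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = find_missing(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `require_env()` after `load_dotenv()` would then either return the three settings as a dictionary or fail with a single message naming everything that still needs to go into `.env`.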

In [None]:
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

print("🔧 Azure OpenAI Environment Check:")
print("-" * 50)

# Check required environment variables
required_vars = {
    'AZURE_OPENAI_ENDPOINT': 'Azure OpenAI service endpoint',
    'AZURE_OPENAI_DEPLOYMENT_NAME': 'Model deployment name (e.g., gpt-4o)',
    'AZURE_OPENAI_API_VERSION': 'API version'
}

all_set = True
for var, description in required_vars.items():
    value = os.getenv(var)
    if value:
        # Show only first 50 chars of endpoint for security
        display_value = value[:50] + "..." if var == 'AZURE_OPENAI_ENDPOINT' and len(value) > 50 else value
        print(f"✅ {var}: {display_value}")
    else:
        print(f"❌ {var}: Not set")
        print(f"   📝 {description}")
        all_set = False

# Check optional API key (if not using Entra ID)
api_key = os.getenv('AZURE_OPENAI_API_KEY')
if api_key:
    print(f"🔑 AZURE_OPENAI_API_KEY: {'*' * 10}...{api_key[-4:]} (Key-based auth)")
else:
    print("🔐 AZURE_OPENAI_API_KEY: Not set (Using Entra ID auth)")

print("-" * 50)
if all_set:
    print("🚀 Environment configuration looks good!")
else:
    print("⚠️  Please set missing environment variables in your .env file")
    print("💡 Check the README.md for setup instructions")

## 3. Connect to Azure OpenAI

Now let's create an Azure OpenAI client and connect to your deployed model.
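
The cell below picks between key-based and Entra ID authentication. One way to make that decision easier to read and test is to build the constructor arguments in a small function first. This is only a sketch: `build_client_kwargs` is a hypothetical helper, and it assumes the `AzureOpenAI` keyword arguments used elsewhere in this notebook (`azure_endpoint`, `api_version`, `api_key`, `azure_ad_token_provider`).

```python
def build_client_kwargs(endpoint, api_version, api_key=None, token_provider=None):
    """Return keyword arguments for AzureOpenAI(...) (hypothetical helper).

    Uses the API key when one is given; otherwise expects a callable that
    returns a bearer token string for Entra ID authentication.
    """
    if not endpoint:
        raise ValueError("endpoint is required")

    kwargs = {"azure_endpoint": endpoint, "api_version": api_version}
    if api_key:
        kwargs["api_key"] = api_key
    elif token_provider is not None:
        kwargs["azure_ad_token_provider"] = token_provider
    else:
        raise ValueError("provide either api_key or token_provider")
    return kwargs
```

With this in place, the client could be created as `AzureOpenAI(**build_client_kwargs(endpoint, api_version, api_key=api_key))`, and the branching logic can be checked without contacting Azure.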

In [None]:
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

print("🔗 Connecting to Azure OpenAI...")

# Get configuration from environment variables
endpoint = os.getenv('AZURE_OPENAI_ENDPOINT')
api_version = os.getenv('AZURE_OPENAI_API_VERSION', '2024-10-21')
deployment_name = os.getenv('AZURE_OPENAI_DEPLOYMENT_NAME')

# Check if we have an API key or should use Entra ID
api_key = os.getenv('AZURE_OPENAI_API_KEY')

try:
    if api_key:
        # Use API key authentication
        print("🔑 Using API key authentication")
        client = AzureOpenAI(
            api_key=api_key,
            api_version=api_version,
            azure_endpoint=endpoint
        )
    else:
        # Use Entra ID authentication (recommended)
        print("🔐 Using Entra ID authentication")
        credential = DefaultAzureCredential()
        # Returns a callable that requests a bearer token for this scope on demand
        token_provider = get_bearer_token_provider(
            credential, "https://cognitiveservices.azure.com/.default"
        )

        client = AzureOpenAI(
            api_version=api_version,
            azure_endpoint=endpoint,
            azure_ad_token_provider=token_provider
        )
    
    print("✅ Azure OpenAI client created successfully!")
    print(f"📍 Endpoint: {endpoint}")
    print(f"🤖 Model Deployment: {deployment_name}")
    print(f"📅 API Version: {api_version}")
    
except Exception as e:
    print(f"❌ Failed to create Azure OpenAI client: {e}")
    print("💡 Check your environment variables and Azure permissions")

## 4. Your First Azure OpenAI API Request

Let's make a basic chat completion request to test the connection and see the API in action!
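
A chat completion request takes a list of messages, each a dictionary with a `role` and a `content`. The sketch below shows one way to assemble that list, including earlier turns if you want the model to see them. `build_messages` is a hypothetical helper for illustration, not an SDK function.

```python
def build_messages(system_prompt, user_prompt, history=()):
    """Assemble a chat message list (hypothetical helper).

    `history` is an iterable of (user_text, assistant_text) pairs from
    earlier turns, oldest first.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The newest user message always goes last.
    messages.append({"role": "user", "content": user_prompt})
    return messages


messages = build_messages(
    "You are a helpful AI assistant.",
    "And what about max_tokens?",
    history=[("What does temperature do?", "It controls randomness.")],
)
print(len(messages))  # 4
```

Every message in the list counts toward the prompt tokens of the request, so longer histories cost more.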

In [None]:
import json
from datetime import datetime

print("🚀 Making your first Azure OpenAI API request...")
print("-" * 60)

# Define the messages for the chat
messages = [
    {
        "role": "system", 
        "content": "You are a helpful AI assistant. Be concise and friendly in your responses."
    },
    {
        "role": "user", 
        "content": "Hello! Please introduce yourself and explain what you can do in 2-3 sentences."
    }
]

try:
    # Record start time for performance measurement
    start_time = datetime.now()
    
    # Make the API request
    response = client.chat.completions.create(
        model=deployment_name,
        messages=messages,
        max_tokens=150,
        temperature=0.7,
        top_p=1.0
    )
    
    # Record end time
    end_time = datetime.now()
    response_time = (end_time - start_time).total_seconds()
    
    print("✅ API Request Successful!")
    print("-" * 60)
    
    # Extract the response content
    assistant_message = response.choices[0].message.content
    
    print("🤖 AI Assistant Response:")
    print(f"   {assistant_message}")
    print()
    
    # Display usage information
    usage = response.usage
    print("📊 Request Details:")
    print(f"   ⏱️  Response Time: {response_time:.2f} seconds")
    print(f"   🔤 Prompt Tokens: {usage.prompt_tokens}")
    print(f"   🔤 Completion Tokens: {usage.completion_tokens}")
    print(f"   🔤 Total Tokens: {usage.total_tokens}")
    print(f"   🏷️  Model Used: {response.model}")
    print(f"   🎯 Finish Reason: {response.choices[0].finish_reason}")
    
    print("-" * 60)
    print("🎉 Congratulations! You've successfully made your first Azure OpenAI API call!")
    
except Exception as e:
    print(f"❌ API request failed: {e}")
    print("💡 Troubleshooting tips:")
    print("   • Check your environment variables")
    print("   • Verify your Azure OpenAI deployment is active")
    print("   • Ensure you have proper permissions")
    print("   • Try running 'az login' if using Entra ID auth")

## 5. Understanding the API Response

Let's explore the structure of the response object to understand what Azure OpenAI returns.
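
The cost estimate in the next cell is plain arithmetic: tokens divided by 1,000, multiplied by a per-1K rate, computed separately for prompt and completion. The sketch below isolates that calculation. The rates are the same placeholder values used in the cell and are assumptions, not current pricing; `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_rate_per_1k, completion_rate_per_1k):
    """Return (prompt_cost, completion_cost, total_cost) in the rate's currency."""
    prompt_cost = prompt_tokens / 1000 * prompt_rate_per_1k
    completion_cost = completion_tokens / 1000 * completion_rate_per_1k
    return prompt_cost, completion_cost, prompt_cost + completion_cost


# Placeholder rates (assumed for illustration only).
prompt_cost, completion_cost, total = estimate_cost(
    prompt_tokens=25,
    completion_tokens=60,
    prompt_rate_per_1k=0.0015,
    completion_rate_per_1k=0.002,
)
print(f"prompt ~${prompt_cost:.7f}, completion ~${completion_cost:.7f}, total ~${total:.7f}")
```

Because prompt and completion tokens are usually billed at different rates, keeping the two terms separate makes it easy to see which side dominates.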

In [None]:
# Let's examine the response object structure in detail
if 'response' in locals():
    print("🔍 Detailed Response Analysis:")
    print("=" * 70)
    
    # Response metadata
    print("📋 Response Metadata:")
    print(f"   🆔 Response ID: {response.id}")
    print(f"   📅 Created: {datetime.fromtimestamp(response.created)}")
    print(f"   🏷️  Model: {response.model}")
    print(f"   🎯 Object Type: {response.object}")
    
    # Choices analysis
    print(f"\n🎲 Choices (Available: {len(response.choices)}):")
    for i, choice in enumerate(response.choices):
        print(f"   Choice {i}:")
        # content can be None (for example, when a filter withholds the text)
        print(f"     🔤 Content: {(choice.message.content or '')[:100]}...")
        print(f"     👤 Role: {choice.message.role}")
        print(f"     🏁 Finish Reason: {choice.finish_reason}")
        print(f"     📊 Index: {choice.index}")
    
    # Usage statistics breakdown
    print(f"\n📊 Token Usage Breakdown:")
    print(f"   📝 Prompt Tokens: {usage.prompt_tokens} (input)")
    print(f"   🤖 Completion Tokens: {usage.completion_tokens} (output)")
    print(f"   📈 Total Tokens: {usage.total_tokens} (prompt + completion)")
    
    # Estimate cost (approximate, varies by model and region)
    # Note: These are placeholder rates in USD per 1K tokens, not the price of any
    # specific model. Look up the current rate for your deployment.
    prompt_cost_per_1k = 0.0015
    completion_cost_per_1k = 0.002
    
    estimated_cost = (usage.prompt_tokens * prompt_cost_per_1k / 1000) + \
                    (usage.completion_tokens * completion_cost_per_1k / 1000)
    
    print(f"\n💰 Estimated Cost (Example Rates):")
    print(f"   💸 This Request: ~${estimated_cost:.6f}")
    print(f"   📝 Prompt Cost: ~${usage.prompt_tokens * prompt_cost_per_1k / 1000:.6f}")
    print(f"   🤖 Completion Cost: ~${usage.completion_tokens * completion_cost_per_1k / 1000:.6f}")
    print(f"   ⚠️  Note: Actual costs vary by model and region")
    
    # Response timing (end-to-end, so it includes network and prompt processing time)
    print(f"\n⏱️  Performance Metrics:")
    print(f"   🚀 Response Time: {response_time:.2f} seconds")
    print(f"   📈 Output Tokens/Second: {usage.completion_tokens / response_time:.1f}")
    
else:
    print("❌ No response object found. Please run the previous cell first.")

## 6. Try Your Own Request

Now it's your turn! Modify the message below and experiment with different parameters.
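
When you lower `max_tokens`, keep an eye on the finish reason: it tells you whether the model stopped on its own or was cut off. The lookup below is a small sketch covering the values you are most likely to see; `explain_finish_reason` and `was_truncated` are hypothetical helpers, and the list is not guaranteed to be exhaustive.

```python
# Common finish_reason values and what they usually indicate.
FINISH_REASONS = {
    "stop": "The model reached a natural stopping point or a stop sequence.",
    "length": "The output hit the max_tokens limit and may be cut off.",
    "content_filter": "Content was omitted because a content filter was triggered.",
    "tool_calls": "The model requested a tool call instead of finishing its text.",
}


def explain_finish_reason(reason):
    """Return a short explanation for a finish_reason value."""
    return FINISH_REASONS.get(reason, f"Unrecognized finish reason: {reason!r}")


def was_truncated(reason):
    """True when the response was cut off by the token limit."""
    return reason == "length"


print(explain_finish_reason("length"))
```

If `was_truncated(custom_response.choices[0].finish_reason)` is true after running the cell below, try a larger `custom_max_tokens` or ask for a shorter answer.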

In [None]:
# ✏️ Customize this request - try changing the message, temperature, or max_tokens!

# Your custom message - modify this!
your_message = "Explain the concept of artificial intelligence in simple terms that a 10-year-old could understand."

# Experiment with these parameters:
custom_temperature = 0.7    # Try values between 0.0 (focused) and 1.0 (creative)
custom_max_tokens = 200     # Adjust response length
custom_top_p = 1.0         # Try values between 0.1 and 1.0

print(f"🎯 Your Custom Request:")
print(f"📝 Message: {your_message}")
print(f"🌡️  Temperature: {custom_temperature}")
print(f"📏 Max Tokens: {custom_max_tokens}")
print(f"🎲 Top P: {custom_top_p}")
print("-" * 60)

try:
    # Make your custom request
    custom_response = client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "You are a helpful and educational AI assistant."},
            {"role": "user", "content": your_message}
        ],
        max_tokens=custom_max_tokens,
        temperature=custom_temperature,
        top_p=custom_top_p
    )
    
    # Display the result
    print("🤖 AI Response:")
    print(f"   {custom_response.choices[0].message.content}")
    print()
    
    # Quick stats
    custom_usage = custom_response.usage
    print("📊 Quick Stats:")
    print(f"   🔤 Total Tokens: {custom_usage.total_tokens}")
    print(f"   🏁 Finish Reason: {custom_response.choices[0].finish_reason}")
    
    print("\n💡 Try This Next:")
    print("   • Change the temperature (0.0 = focused, 1.0 = creative)")
    print("   • Modify the message to ask something different")
    print("   • Adjust max_tokens to control response length")
    print("   • Experiment with different system prompts")
    
except Exception as e:
    print(f"❌ Request failed: {e}")
    print("💡 Double-check your parameters and try again")

## Workshop 1 Summary

🎉 **Congratulations!** You've successfully completed Workshop 1!

### What You've Accomplished

✅ **Environment Setup** - Verified Python packages and dependencies  
✅ **Configuration** - Checked Azure OpenAI environment variables  
✅ **Connection** - Connected to Azure OpenAI service  
✅ **First API Call** - Made a successful chat completion request  
✅ **Response Analysis** - Understood the API response structure  
✅ **Experimentation** - Tried custom requests with different parameters  

### Key Concepts Learned

- **Azure OpenAI Client**: How to authenticate and connect
- **Chat Completions**: The main API for conversational AI
- **Token Usage**: Understanding input/output tokens and costs
- **Parameters**: Temperature, max_tokens, and top_p controls
- **Response Structure**: Choices, usage, and metadata

### Next Steps

🚀 **Ready for Workshop 2: Tracing and Observability**
- Learn to monitor and trace your AI applications
- Set up Application Insights integration
- Understand performance metrics and debugging

### Common Parameters to Remember

| Parameter | Purpose | Typical Range |
|-----------|---------|---------------|
| `temperature` | Controls randomness/creativity | 0.0 - 1.0 (API accepts up to 2.0) |
| `max_tokens` | Limits response length | 1 - 4096+ |
| `top_p` | Controls diversity via nucleus sampling | 0.1 - 1.0 |
| `frequency_penalty` | Reduces repetition | -2.0 - 2.0 |
| `presence_penalty` | Encourages new topics | -2.0 - 2.0 |
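
To build intuition for `temperature` and `top_p`, the sketch below applies both to a made-up set of token scores. It is a conceptual illustration of temperature scaling and nucleus sampling, not a description of how the service implements them, and the scores are invented for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass  # renormalize over the kept tokens
    return filtered


logits = [2.0, 1.0, 0.0]  # invented scores for three candidate tokens
print(softmax_with_temperature(logits, 0.5))
print(softmax_with_temperature(logits, 1.0))
print(nucleus_filter([0.6, 0.3, 0.1], top_p=0.8))
```

Lower temperatures concentrate probability on the top-scoring token, while a smaller `top_p` removes the unlikely tail entirely. A common suggestion is to adjust one of the two rather than both at once.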

### 💡 Pro Tips

- **Monitor token usage** to control costs
- **Use system prompts** to set consistent behavior
- **Experiment with temperature** for different use cases
- **Set reasonable max_tokens** to avoid runaway responses
- **Use Entra ID authentication** for production applications
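
As a starting point for monitoring token usage, the sketch below keeps a running total across requests. `UsageTracker` is a hypothetical class, and the `SimpleNamespace` objects stand in for the `response.usage` objects seen earlier; it only assumes they expose `prompt_tokens` and `completion_tokens`.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTracker:
    """Accumulates token counts across several requests."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    requests: int = 0

    def add(self, usage):
        """Add one response's usage object to the running totals."""
        self.prompt_tokens += usage.prompt_tokens
        self.completion_tokens += usage.completion_tokens
        self.requests += 1

    @property
    def total_tokens(self):
        return self.prompt_tokens + self.completion_tokens


tracker = UsageTracker()
# Stand-ins for response.usage from two requests.
tracker.add(SimpleNamespace(prompt_tokens=25, completion_tokens=60))
tracker.add(SimpleNamespace(prompt_tokens=40, completion_tokens=110))
print(tracker.requests, tracker.total_tokens)  # 2 235
```

In the notebook you would call `tracker.add(response.usage)` after each request and check the totals at the end of a session.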

Keep experimenting and have fun building with Azure OpenAI! 🤖✨