# Azure OpenAI Stored Completions & Distillation Demo 🚀

This notebook demonstrates the **Azure OpenAI Stored Completions** feature (preview), which lets you:

- 📝 **Store completions** from chat sessions for later use
- 🔍 **Query and manage** stored completion data  
- 🎯 **Create fine-tuning datasets** through distillation
- 📊 **Generate evaluation datasets** for model assessment

> **Note**: This feature requires API version `2025-02-01-preview` or later and appropriate permissions.

## 📦 Setup and Installation

### 🐍 Virtual Environment Setup (Recommended)

If you haven't set up the virtual environment yet, run these commands in your terminal:

```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment (Windows PowerShell)
.\venv\Scripts\Activate.ps1
# On macOS/Linux, use instead: source venv/bin/activate

# Install packages
pip install -r requirements.txt

# Install Jupyter kernel
python -m ipykernel install --user --name=azure-openai-stored-completions --display-name="Azure OpenAI Stored Completions"
```

### 📋 Package Installation

If you're not using the virtual environment, run the cell below to install required packages.

In [None]:
# Install packages from requirements.txt (if not using virtual environment)
# If you're using the virtual environment, these should already be installed
# !pip install -r requirements.txt

# Alternative: Install/upgrade required packages individually
!pip install --upgrade openai azure-identity python-dotenv jupyter pandas matplotlib seaborn

In [None]:
# Import required libraries
import os
import json
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from datetime import datetime
import time

## ⚙️ Azure OpenAI Configuration

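Configure your Azure OpenAI client with keyless Microsoft Entra ID authentication (recommended for security) or with environment variables. The code below uses `DefaultAzureCredential`, which picks up a **Managed Identity** when running in Azure and falls back to sources such as your `az login` session during local development.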

In [None]:
# Configure Azure OpenAI client with Managed Identity (recommended)
# Replace with your Azure OpenAI endpoint
AZURE_OPENAI_ENDPOINT = "https://YOUR-RESOURCE-NAME.openai.azure.com"
MODEL_DEPLOYMENT_NAME = "gpt-4o"  # Replace with your model deployment name

# Setup authentication using Managed Identity
try:
    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(), 
        "https://cognitiveservices.azure.com/.default"
    )
    
    client = AzureOpenAI(
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
        azure_ad_token_provider=token_provider,
        api_version="2025-02-01-preview"  # Required for stored completions
    )
    
    print("✅ Azure OpenAI client initialized successfully with Managed Identity!")
    
except Exception as e:
    print(f"❌ Failed to initialize client: {str(e)}")
    print("Please ensure:")
    print("1. You're authenticated with Azure (az login)")
    print("2. Your identity has 'Cognitive Services OpenAI Contributor' role")
    print("3. Update AZURE_OPENAI_ENDPOINT with your resource URL")
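
The environment-variable route mentioned above could look roughly like the sketch below. The helper `load_settings` and the variable name `AZURE_OPENAI_DEPLOYMENT` are assumptions made for illustration, not names taken from this notebook; adapt them to your own `.env` conventions.

```python
import os

def load_settings(env=None):
    """Read endpoint and deployment name from a mapping (defaults to os.environ).

    The variable names used here are illustrative assumptions.
    """
    env = os.environ if env is None else env
    endpoint = env.get("AZURE_OPENAI_ENDPOINT", "").strip().rstrip("/")
    if not endpoint.startswith("https://"):
        raise ValueError("AZURE_OPENAI_ENDPOINT must be set to an https:// URL")
    return {
        "endpoint": endpoint,
        "deployment": env.get("AZURE_OPENAI_DEPLOYMENT", "gpt-4o"),
    }

# Example with an explicit mapping, so nothing depends on the real environment
settings = load_settings({"AZURE_OPENAI_ENDPOINT": "https://example.openai.azure.com/"})
print(settings)
```

The returned values can then be passed to `AzureOpenAI(azure_endpoint=...)` and used as `MODEL_DEPLOYMENT_NAME`, which keeps resource names out of the notebook itself.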

## 💾 Creating Stored Completions

Let's create some chat completions with the `store=True` parameter to save them for later use in distillation and evaluation.

In [None]:
# Define different scenarios for stored completions
scenarios = [
    {
        "metadata": {"user": "data_scientist", "category": "ml_explanation", "difficulty": "intermediate"},
        "system_prompt": "Provide a clear and concise summary of machine learning concepts, highlighting key ideas and practical applications.",
        "user_message": "Explain ensemble methods in machine learning, including bagging, boosting, and stacking with practical examples."
    },
    {
        "metadata": {"user": "developer", "category": "code_review", "difficulty": "advanced"},
        "system_prompt": "You are a senior software engineer reviewing code. Provide constructive feedback focusing on best practices, performance, and maintainability.",
        "user_message": "Review this Python function for calculating Fibonacci numbers: def fib(n): return n if n <= 1 else fib(n-1) + fib(n-2)"
    },
    {
        "metadata": {"user": "business_analyst", "category": "data_analysis", "difficulty": "beginner"},
        "system_prompt": "Explain data analysis concepts in simple business terms with practical examples.",
        "user_message": "What is the difference between correlation and causation in data analysis? Provide business examples."
    }
]

stored_completions = []

print("🔄 Creating stored completions...")
for i, scenario in enumerate(scenarios):
    try:
        completion = client.chat.completions.create(
            model=MODEL_DEPLOYMENT_NAME,
            store=True,  # This enables stored completions
            metadata=scenario["metadata"],
            messages=[
                {"role": "system", "content": scenario["system_prompt"]},
                {"role": "user", "content": scenario["user_message"]}
            ]
        )
        
        stored_completions.append({
            "id": completion.id,
            "metadata": scenario["metadata"],
            "response": completion.choices[0].message.content
        })
        
        print(f"✅ Completion {i+1} stored with ID: {completion.id}")
        print(f"   Category: {scenario['metadata']['category']}")
        print(f"   Response preview: {(completion.choices[0].message.content or '')[:100]}...")
        print()
        
        # Small delay to avoid rate limiting
        time.sleep(1)
        
    except Exception as e:
        print(f"❌ Failed to create completion {i+1}: {str(e)}")

print(f"📊 Total stored completions created: {len(stored_completions)}")
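
Because metadata drives all later filtering, it can help to check it before sending a request. The sketch below assumes the limits documented for the OpenAI `metadata` field (at most 16 pairs, keys up to 64 characters, string values up to 512 characters) also apply to your Azure resource. The helper `validate_metadata` is hypothetical.

```python
# Assumed limits, taken from the OpenAI `metadata` field documentation
MAX_PAIRS = 16
MAX_KEY_LEN = 64
MAX_VALUE_LEN = 512

def validate_metadata(metadata):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if not isinstance(key, str) or len(key) > MAX_KEY_LEN:
            problems.append(f"key {key!r} must be a string of at most {MAX_KEY_LEN} characters")
        if not isinstance(value, str):
            problems.append(f"value for {key!r} must be a string, got {type(value).__name__}")
        elif len(value) > MAX_VALUE_LEN:
            problems.append(f"value for {key!r} exceeds {MAX_VALUE_LEN} characters")
    return problems

print(validate_metadata({"category": "ml_explanation", "difficulty": "intermediate"}))  # → []
print(validate_metadata({"quality_score": 5}))  # one problem: the value is not a string
```

Running this over each `scenario["metadata"]` before the loop surfaces schema mistakes locally instead of as API errors.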

## 📋 Listing Stored Completions

Now let's retrieve and explore our stored completions using the API.

In [None]:
# List all stored completions
print("🔍 Retrieving stored completions...")
try:
    response = client.chat.completions.list(limit=10)
    
    print(f"📊 Found {len(response.data)} stored completions")
    print("\n" + "="*80)
    
    for completion in response.data:
        print(f"ID: {completion.id}")
        print(f"Created: {completion.created}")
        print(f"Model: {completion.model}")
        
        # Display metadata if available
        if hasattr(completion, 'metadata') and completion.metadata:
            print(f"Metadata: {json.dumps(completion.metadata, indent=2)}")
        
        print("-" * 40)
        
except Exception as e:
    print(f"❌ Failed to list completions: {str(e)}")

# Filter by metadata (example: only machine learning related completions)
print("\n🔎 Filtering completions by category...")
try:
    ml_response = client.chat.completions.list(
        metadata={"category": "ml_explanation"},
        limit=5
    )
    
    print(f"📚 Found {len(ml_response.data)} ML-related completions")
    for completion in ml_response.data:
        print(f"  - {completion.id} (ML explanation)")
        
except Exception as e:
    print(f"❌ Failed to filter completions: {str(e)}")
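
The `created` field printed above is a Unix timestamp in seconds, which is hard to read at a glance. A small formatting helper (the name `format_created` is an illustrative assumption) might look like this:

```python
from datetime import datetime, timezone

def format_created(created):
    """Convert a Unix timestamp in seconds to a readable UTC string."""
    moment = datetime.fromtimestamp(created, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")

print(format_created(0))  # → 1970-01-01 00:00:00 UTC
```

In the listing loop, `print(f"Created: {format_created(completion.created)}")` would replace the raw integer.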

## 🔍 Retrieving Specific Completions

Retrieve detailed information about specific stored completions by their ID.

In [None]:
# Get specific completion by ID (using first stored completion)
if stored_completions:
    completion_id = stored_completions[0]["id"]
    
    print(f"🔍 Retrieving completion details for ID: {completion_id}")
    
    try:
        # Get the completion details
        completion_detail = client.chat.completions.retrieve(completion_id)
        
        print(f"✅ Completion retrieved successfully!")
        print(f"ID: {completion_detail.id}")
        print(f"Model: {completion_detail.model}")
        print(f"Created: {completion_detail.created}")
        print(f"Usage: {completion_detail.usage}")
        
        # Get the messages for this completion
        print(f"\n📝 Retrieving messages for completion...")
        messages_response = client.chat.completions.messages.list(completion_id, limit=10)
        
        print(f"Found {len(messages_response.data)} messages:")
        for i, message in enumerate(messages_response.data):
            content = message.content or ""  # content can be None for some messages
            print(f"\nMessage {i+1}:")
            print(f"  Role: {message.role}")
            print(f"  Content: {content[:200]}{'...' if len(content) > 200 else ''}")
            
    except Exception as e:
        print(f"❌ Failed to retrieve completion: {str(e)}")
else:
    print("⚠️ No stored completions available. Please run the previous cells first.")

## ✏️ Updating Completion Metadata

You can add or update metadata for existing stored completions.

In [None]:
# Update metadata for a stored completion
if stored_completions:
    completion_id = stored_completions[0]["id"]
    
    print(f"✏️ Updating metadata for completion: {completion_id}")
    
    try:
        # Add new metadata
        updated_completion = client.chat.completions.update(
            completion_id,
            metadata={
                "quality_score": "excellent",
                "reviewed_by": "human_expert", 
                "reviewed_date": datetime.now().isoformat(),
                "suitable_for_training": "yes"
            }
        )
        
        print("✅ Metadata updated successfully!")
        print(f"Updated metadata: {json.dumps(updated_completion.metadata, indent=2)}")
        
    except Exception as e:
        print(f"❌ Failed to update metadata: {str(e)}")
else:
    print("⚠️ No stored completions available. Please run the previous cells first.")
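
This notebook does not verify whether an update merges with or replaces the existing metadata. If keys such as `category` must survive, one cautious approach is to merge locally and send the full dictionary. The helper `merge_metadata` below is a hypothetical sketch of that idea.

```python
def merge_metadata(existing, updates):
    """Return a new dict with `updates` applied on top of `existing`.

    Neither input is modified; `existing` may be None.
    """
    merged = dict(existing or {})
    merged.update(updates)
    return merged

original = {"user": "data_scientist", "category": "ml_explanation"}
merged = merge_metadata(original, {"quality_score": "excellent"})
print(merged)
# The merged dict could then be passed as `metadata=merged` to the update call.
```

Sending the merged dictionary is safe under either behavior, at the cost of one extra read of the current metadata.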

## 🧪 Distillation Process

**Distillation** lets you create fine-tuning datasets from your stored completions. This is useful for training smaller, specialized models on high-quality interactions with larger models.

### Process Overview:
1. **Collect Data**: Use a large, powerful model (such as GPT-4o) to generate high-quality responses
2. **Store Completions**: Save these interactions with metadata for organization
3. **Filter & Curate**: Select the best completions for training
4. **Fine-tune**: Train a smaller model on this curated dataset

> **Note**: The actual distillation process is typically done through the Azure AI Foundry portal. This section demonstrates the data preparation and API usage.

In [None]:
# Analyze stored completions for distillation readiness
print("📊 Analyzing stored completions for distillation...")

try:
    # Fetch stored completions. Iterating the returned page object follows the
    # SDK's auto-pagination, so the total is not capped at the first 50 results.
    all_completions = list(client.chat.completions.list(limit=50))
    
    # Analyze by category
    categories = {}
    quality_scores = {}
    
    for completion in all_completions:
        if hasattr(completion, 'metadata') and completion.metadata:
            category = completion.metadata.get('category', 'unknown')
            categories[category] = categories.get(category, 0) + 1
            
            quality = completion.metadata.get('quality_score', 'unrated')
            quality_scores[quality] = quality_scores.get(quality, 0) + 1
    
    print(f"📈 Completion Analysis:")
    print(f"  Total completions: {len(all_completions)}")
    print(f"  Categories: {json.dumps(categories, indent=4)}")
    print(f"  Quality scores: {json.dumps(quality_scores, indent=4)}")
    
    # Recommendations for distillation
    print(f"\n🎯 Distillation Recommendations:")
    total_completions = len(all_completions)
    
    if total_completions >= 10:
        print(f"  ✅ Minimum requirement met ({total_completions} ≥ 10 completions)")
    else:
        print(f"  ⚠️ Need more completions ({total_completions} < 10 minimum)")
    
    if total_completions >= 100:
        print(f"  ✅ Good dataset size for quality distillation")
    elif total_completions >= 50:
        print(f"  ⚠️ Moderate dataset size - consider adding more completions")
    else:
        print(f"  📝 Small dataset - aim for 100+ completions for best results")
        
except Exception as e:
    print(f"❌ Failed to analyze completions: {str(e)}")
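
If you prefer to export curated data yourself rather than use the portal flow, the "Filter & Curate" step can be sketched as below. The record shape (`messages`, `response`, `metadata`) and the helper `to_finetune_jsonl` are assumptions for illustration. The output uses the chat fine-tuning layout of one JSON object with a `messages` list per line.

```python
import json

def to_finetune_jsonl(records, require_flag=True):
    """Serialize curated records as chat fine-tuning JSONL (one object per line).

    Each record is assumed to hold the input `messages`, the assistant
    `response`, and optional `metadata`.
    """
    lines = []
    for record in records:
        metadata = record.get("metadata") or {}
        if require_flag and metadata.get("suitable_for_training") != "yes":
            continue  # keep only completions a reviewer approved
        messages = list(record["messages"])
        messages.append({"role": "assistant", "content": record["response"]})
        lines.append(json.dumps({"messages": messages}, ensure_ascii=False))
    return "\n".join(lines)

sample_records = [
    {"messages": [{"role": "user", "content": "What is bagging?"}],
     "response": "Bagging trains models on bootstrap samples and averages them.",
     "metadata": {"suitable_for_training": "yes"}},
    {"messages": [{"role": "user", "content": "Hi"}],
     "response": "Hello!",
     "metadata": {}},
]
print(to_finetune_jsonl(sample_records))
```

Only the first sample record is emitted, because the second lacks the `suitable_for_training` flag set in the metadata update step.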

## 📊 Evaluation Dataset Creation

Stored completions can also be used to create evaluation datasets for testing model performance across various dimensions like accuracy, coherence, and helpfulness.

In [None]:
# Prepare completions for evaluation
print("🔬 Preparing evaluation dataset from stored completions...")

try:
    # Filter completions suitable for evaluation
    evaluation_candidates = []
    
    completions = client.chat.completions.list(limit=20)
    
    for completion in completions.data:
        # Get messages for this completion
        messages = client.chat.completions.messages.list(completion.id, limit=10)
        
        if len(messages.data) >= 2:  # Should have at least system/user and assistant messages
            eval_entry = {
                "completion_id": completion.id,
                "messages": [{"role": msg.role, "content": msg.content} for msg in messages.data],
                "metadata": getattr(completion, 'metadata', None) or {},  # metadata may be None
                "created": completion.created
            }
            evaluation_candidates.append(eval_entry)
    
    print(f"📋 Found {len(evaluation_candidates)} completions suitable for evaluation")
    
    # Group by category for balanced evaluation
    eval_by_category = {}
    for entry in evaluation_candidates:
        category = entry['metadata'].get('category', 'general')
        if category not in eval_by_category:
            eval_by_category[category] = []
        eval_by_category[category].append(entry)
    
    print(f"\n📈 Evaluation dataset breakdown:")
    for category, entries in eval_by_category.items():
        print(f"  {category}: {len(entries)} entries")
    
    # Create a sample evaluation format
    print(f"\n📝 Sample evaluation entry format:")
    if evaluation_candidates:
        sample = evaluation_candidates[0]
        print(json.dumps({
            "id": sample["completion_id"],
            "messages": sample["messages"][:2],  # Show first 2 messages
            "category": sample["metadata"].get("category", "unknown"),
            "difficulty": sample["metadata"].get("difficulty", "unknown")
        }, indent=2))
        
except Exception as e:
    print(f"❌ Failed to prepare evaluation dataset: {str(e)}")
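
Grouping by category is only the first half of a balanced evaluation set; the second half is drawing a similar number of entries from each group. A minimal sketch of that step follows, assuming a dictionary shaped like `eval_by_category` above. The helper `balanced_sample` is hypothetical.

```python
import random

def balanced_sample(eval_by_category, per_category, seed=0):
    """Pick up to `per_category` entries from each category, reproducibly."""
    rng = random.Random(seed)  # fixed seed so repeated runs select the same entries
    selected = []
    for category in sorted(eval_by_category):
        entries = eval_by_category[category]
        count = min(per_category, len(entries))
        selected.extend(rng.sample(entries, count))
    return selected

groups = {
    "ml_explanation": ["c1", "c2", "c3"],
    "code_review": ["c4"],
}
picked = balanced_sample(groups, per_category=2)
print(len(picked))  # → 3
```

Categories with fewer entries than requested contribute everything they have, so small groups are never dropped.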

## 🧹 Cleanup and Management

Manage your stored completions by deleting specific entries when no longer needed.

In [None]:
# Example: Delete a specific stored completion (use with caution!)
# Deletion is guarded by the delete_demo flag below; set it to True only if
# you really want to delete a completion.

# WARNING: This will permanently delete the completion!
delete_demo = False  # Set to True to enable deletion

if delete_demo and stored_completions:
    completion_to_delete = stored_completions[-1]["id"]  # Delete the last one as example
    
    print(f"🗑️ Attempting to delete completion: {completion_to_delete}")
    print("⚠️ WARNING: This action cannot be undone!")
    
    try:
        response = client.chat.completions.delete(completion_to_delete)
        print(f"✅ Completion deleted successfully!")
        print(f"Response: {response}")
        
        # Remove from our local list
        stored_completions = [sc for sc in stored_completions if sc["id"] != completion_to_delete]
        
    except Exception as e:
        print(f"❌ Failed to delete completion: {str(e)}")
else:
    print("🔒 Deletion demo disabled. Set 'delete_demo = True' to enable.")
    print("📊 Current stored completions count:", len(stored_completions) if stored_completions else 0)
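
For routine cleanup, it may be safer to compute the list of candidates first and review it before deleting anything. The sketch below works on plain dictionaries with `id`, `created` (Unix seconds), and `metadata` keys, which is an assumed shape. The helper `select_stale` and the 30-day threshold are illustrative choices.

```python
from datetime import datetime, timedelta, timezone

def select_stale(records, max_age_days, now=None):
    """Return IDs of records older than `max_age_days` that are not marked for training."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=max_age_days)).timestamp()
    stale = []
    for record in records:
        metadata = record.get("metadata") or {}
        if metadata.get("suitable_for_training") == "yes":
            continue  # never propose curated training data for deletion
        if record["created"] < cutoff:
            stale.append(record["id"])
    return stale

reference = datetime(2025, 1, 31, tzinfo=timezone.utc)
demo_records = [
    {"id": "old", "created": (reference - timedelta(days=40)).timestamp(), "metadata": {}},
    {"id": "old_keep", "created": (reference - timedelta(days=40)).timestamp(),
     "metadata": {"suitable_for_training": "yes"}},
    {"id": "new", "created": (reference - timedelta(days=1)).timestamp(), "metadata": None},
]
print(select_stale(demo_records, max_age_days=30, now=reference))  # → ['old']
```

The resulting IDs could then be passed one at a time to the delete call shown above, once you have confirmed the list.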

## 🎯 Next Steps & Best Practices

### 🏆 Best Practices for Stored Completions:

1. **📋 Organize with Metadata**:
   - Use consistent metadata schemas
   - Include categories, quality scores, and difficulty levels
   - Add review status and timestamps

2. **🎯 Quality Control**:
   - Review completions before using for training
   - Filter by quality metrics
   - Maintain diverse datasets

3. **🔒 Security & Privacy**:
   - Use Managed Identity for authentication
   - Be mindful of sensitive data in completions
   - Implement proper access controls

4. **📊 Data Management**:
   - Monitor storage usage (10 GB limit)
   - Regularly clean up outdated completions
   - Archive important datasets

### 🚀 Next Steps:

1. **Azure AI Foundry Portal**:
   - Visit [Azure AI Foundry](https://ai.azure.com) to use the visual interface
   - Filter and export completions for distillation
   - Start fine-tuning with your stored completions

2. **Distillation Workflow**:
   - Collect 100+ high-quality completions
   - Filter by metadata in the portal
   - Select target model for fine-tuning
   - Monitor training progress

3. **Evaluation**:
   - Create evaluation datasets from completions
   - Test model performance across different scenarios
   - Compare base vs fine-tuned models

### 📚 Additional Resources:

- [Stored Completions Documentation](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/stored-completions)
- [Fine-tuning Guide](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/fine-tuning)
- [Evaluation Best Practices](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/evaluations)