# RefLex LLM - Azure OpenAI Integration Example

This notebook demonstrates how to use RefLex LLM specifically with Azure OpenAI endpoints, including configuration, fallback capabilities, and best practices.

## Installation

In [None]:
!pip install reflex-llms

## Azure OpenAI Setup

### Environment Variables for Azure

In [1]:
from utils import load_and_verify_env

load_and_verify_env(required_vars=[
    'AZURE_OPENAI_ENDPOINT',
    'AZURE_OPENAI_API_KEY',
    'AZURE_OPENAI_API_VERSION',
])

Loading environment from '.env'...
Missing required variables: AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_API_VERSION

Contents of 'c:\Users\acisse\CodeWorkspace\reflex-llm\.env':
----------------------------------------
 1: OPENAI_API_KEY=sk****************************************************************************************************************************************************************wA
----------------------------------------


False
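
`load_and_verify_env` comes from a local `utils` module that this notebook does not show. As a rough stand-in, assuming the variables are already exported in the process environment, a standard-library check might look like the sketch below. `find_missing_vars` is a hypothetical helper, not part of RefLex.

```python
import os

REQUIRED_AZURE_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
]

def find_missing_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    Hypothetical helper for illustration; defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = find_missing_vars(REQUIRED_AZURE_VARS)
if missing:
    print(f"Missing required variables: {', '.join(missing)}")
else:
    print("All Azure variables are set")
```

Unlike the helper used above, this sketch does not read a `.env` file; it only inspects variables that are already set.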

## Basic Azure OpenAI Usage

In [1]:
!netstat -aon | findstr :11435

In [1]:
import reflex_llms

# Get client with Azure preference
client = reflex_llms.get_openai_client(
    preference_order=["azure", "reflex"],  # Try Azure first, fall back to local
    port=11435
)

# Use exactly like the OpenAI client
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # On Azure, this must match your deployment name
    messages=[
        {"role": "user", "content": "Hello! I'm using Azure OpenAI through RefLex."}
    ],
    max_tokens=100
)

print(f"Response: {response.choices[0].message.content}")
print(f"Using provider: {reflex_llms.get_selected_provider()}")

Checking OpenAI API providers...
  Trying azure...
  Trying reflex...
  Setting up RefLex server...
Setting up RefLex OpenAI-compatible backend...
Starting Ollama container...
Ollama is already running and ready
Waiting for Ollama to be ready...
Ollama ready! Found 16 existing models
Setting up OpenAI model mappings...
Setting up OpenAI-compatible models...

Processing: gpt-3.5-turbo -> llama3.2:3b
Model gpt-3.5-turbo already exists, skipping...

Processing: gpt-3.5-turbo-16k -> llama3.2:3b
Model gpt-3.5-turbo-16k already exists, skipping...

Processing: gpt-4 -> llama3.1:8b
Model gpt-4 already exists, skipping...

Processing: gpt-4-turbo -> gemma3:4b
Model gpt-4-turbo already exists, skipping...

Processing: gpt-4o -> gemma3:4b
Model gpt-4o already exists, skipping...

Processing: gpt-4o-mini -> gemma3:1b
Model gpt-4o-mini already exists, skipping...

Processing: o1-preview -> phi3:reasoning
Pulling model: phi3:reasoning
Failed to pull model phi3:reasoning: Error pulling model: pull m
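
The log above shows the providers being tried in the order given by `preference_order`: Azure first, then the local RefLex server. The idea can be sketched as follows. This illustrates the general pattern only, not RefLex's actual implementation, and `select_provider` and the probe functions are hypothetical names.

```python
from typing import Callable, Dict, List

def select_provider(preference_order: List[str],
                    probes: Dict[str, Callable[[], bool]]) -> str:
    """Return the first provider in `preference_order` whose probe succeeds."""
    for name in preference_order:
        probe = probes.get(name)
        if probe is None:
            continue  # no probe registered for this provider, skip it
        try:
            if probe():
                return name
        except Exception as exc:
            # A failing probe only means "try the next provider"
            print(f"  {name} unavailable: {exc}")
    raise RuntimeError(f"No provider available from {preference_order}")

# Hypothetical probes: Azure is unreachable, the local server is up
probes = {
    "azure": lambda: False,
    "reflex": lambda: True,
}
print(select_provider(["azure", "reflex"], probes))  # prints "reflex"
```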

## Azure-Specific Configuration

In [None]:
import reflex_llms

# Custom Azure configuration
client = reflex_llms.get_openai_client(
    preference_order=["azure"],
    azure_api_version="2024-06-01",  # Specific API version
    azure_base_url="https://custom-azure.openai.azure.com",  # Custom endpoint
    timeout=15.0
)

print(f"Azure client configured with provider: {reflex_llms.get_selected_provider()}")
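
How RefLex uses `azure_api_version` and `azure_base_url` internally is not shown here. For orientation, the Azure OpenAI REST API addresses a chat deployment by resource endpoint, deployment name, and an `api-version` query parameter. The sketch below only builds that URL; `azure_chat_url` is a hypothetical helper and the resource name is a placeholder.

```python
from urllib.parse import quote, urlencode

def azure_chat_url(base_url: str, deployment: str, api_version: str) -> str:
    """Build the chat-completions URL for an Azure OpenAI deployment."""
    base = base_url.rstrip("/")  # tolerate a trailing slash in the endpoint
    path = f"/openai/deployments/{quote(deployment, safe='')}/chat/completions"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"

print(azure_chat_url(
    "https://your-resource.openai.azure.com",  # placeholder resource
    "gpt-35-turbo",                            # deployment name, chosen by you in Azure
    "2024-06-01",
))
```

Because the deployment name is part of the path, a request that names a deployment you have not created fails even when the endpoint and key are valid.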

## Azure Configuration File Example

Create a `reflex_azure.json` configuration file optimized for Azure:

In [None]:
import json

# Azure-focused configuration
azure_config = {
    "preference_order": ["azure", "reflex"],
    "timeout": 20.0,
    "azure": {
        "api_version": "2024-06-01",
        "base_url": "https://your-resource.openai.azure.com"
    },
    "reflex_server_config": {
        "port": 8080,
        "container_name": "azure-fallback-server",
        "model_mappings": {
            "minimal_setup": True,
            "minimal_model_mapping": {
                "gpt-35-turbo": "llama3.2:3b",
                "gpt-4": "llama3.1:8b",
                "text-embedding-ada-002": "nomic-embed-text"
            }
        }
    }
}

# Save configuration
with open('reflex_azure.json', 'w') as f:
    json.dump(azure_config, f, indent=2)

print("Azure configuration saved to reflex_azure.json")
print(json.dumps(azure_config, indent=2))
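
Before handing a configuration file to the library, it can help to run a few sanity checks on it. The sketch below assumes the key names used in the cell above. `validate_config` is a hypothetical helper, and the checks are illustrative rather than the rules RefLex itself enforces.

```python
import json

KNOWN_PROVIDERS = {"azure", "reflex"}  # only the names used in this notebook

def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []

    order = config.get("preference_order")
    if not isinstance(order, list) or not order:
        problems.append("preference_order must be a non-empty list")
    else:
        unknown = [p for p in order if p not in KNOWN_PROVIDERS]
        if unknown:
            problems.append(f"unknown providers: {unknown}")

    base_url = config.get("azure", {}).get("base_url", "")
    if base_url and not base_url.startswith("https://"):
        problems.append("azure.base_url should start with https://")

    timeout = config.get("timeout")
    if timeout is not None and (not isinstance(timeout, (int, float)) or timeout <= 0):
        problems.append("timeout must be a positive number")

    return problems

sample = json.loads(
    '{"preference_order": ["azure", "reflex"], "timeout": 20.0, '
    '"azure": {"base_url": "https://your-resource.openai.azure.com"}}'
)
print(validate_config(sample) or "config looks consistent")
```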

## Chat Application with Azure Fallback

In [None]:
import reflex_llms

def azure_chat_with_fallback(user_input="Hello from Azure!"):
    """Chat function that prefers Azure but falls back to local AI."""
    
    # Load from Azure config file
    client = reflex_llms.get_openai_client(from_file="reflex_azure.json")
    
    provider = reflex_llms.get_selected_provider()
    print(f"Chat ready! Using {provider}")
    
    conversation = [{"role": "user", "content": user_input}]
    
    try:
        # Use Azure model deployment names
        model_name = "gpt-35-turbo" if provider == "azure" else "gpt-3.5-turbo"
        
        response = client.chat.completions.create(
            model=model_name,
            messages=conversation,
            max_tokens=150
        )
        
        ai_response = response.choices[0].message.content
        print(f"You: {user_input}")
        print(f"AI ({provider}): {ai_response}")
        
        return ai_response
        
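    except Exception as e:
        print(f"Error with {provider}: {e}")
        print("Attempting fallback...")
        
        # Clear the cached client so provider selection runs again
        reflex_llms.clear_cache()
        client = reflex_llms.get_openai_client(from_file="reflex_azure.json")
        
        new_provider = reflex_llms.get_selected_provider()
        print(f"Switched to: {new_provider}")
        
        # Retry the request once on the newly selected provider
        model_name = "gpt-35-turbo" if new_provider == "azure" else "gpt-3.5-turbo"
        response = client.chat.completions.create(
            model=model_name,
            messages=conversation,
            max_tokens=150
        )
        return response.choices[0].message.content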

# Test the chat with fallback
response = azure_chat_with_fallback("Explain the benefits of using Azure OpenAI")
print(f"\nResponse length: {len(response)} characters")
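
The chat function picks the model name with a single conditional. If more models are added, a small lookup table may be easier to maintain. The sketch below assumes the Azure deployment is named `gpt-35-turbo`, as in this notebook. Deployment names are chosen by you in Azure, and `resolve_model` is a hypothetical helper.

```python
# Per-provider overrides: logical model name -> name the provider expects.
# The Azure entry mirrors the deployment name used in this notebook.
DEPLOYMENT_NAMES = {
    "azure": {"gpt-3.5-turbo": "gpt-35-turbo"},
}

def resolve_model(provider: str, model: str) -> str:
    """Return the provider-specific name for `model`, or `model` unchanged."""
    return DEPLOYMENT_NAMES.get(provider, {}).get(model, model)

print(resolve_model("azure", "gpt-3.5-turbo"))   # prints "gpt-35-turbo"
print(resolve_model("reflex", "gpt-3.5-turbo"))  # prints "gpt-3.5-turbo"
```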

## Azure Embeddings with Fallback

In [None]:
import reflex_llms

def get_azure_embeddings(texts):
    """Generate embeddings using Azure OpenAI with local fallback."""
    client = reflex_llms.get_openai_client(from_file="reflex_azure.json")
    
    if isinstance(texts, str):
        texts = [texts]
    
    provider = reflex_llms.get_selected_provider()
    
    # Same name on both sides here: on Azure it must match your deployment name,
    # and the local config maps it to nomic-embed-text
    embedding_model = "text-embedding-ada-002"
    
    embeddings = []
    
    for text in texts:
        try:
            response = client.embeddings.create(
                model=embedding_model,
                input=text
            )
            embeddings.append(response.data[0].embedding)
            
        except Exception as e:
            print(f"Embedding failed with {provider}: {e}")
            raise
    
    return {
        "embeddings": embeddings,
        "provider": provider,
        "count": len(embeddings)
    }

# Example usage
sample_texts = [
    "Azure OpenAI provides enterprise-grade AI services",
    "RefLex enables seamless fallback to local AI servers",
    "Hybrid AI deployment strategies improve reliability"
]

try:
    result = get_azure_embeddings(sample_texts)
    print(f"Generated {result['count']} embeddings using {result['provider']}")
    print(f"First embedding dimension: {len(result['embeddings'][0])}")
    
except Exception as e:
    print(f"Embedding generation failed: {e}")
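
A common next step with embeddings is to compare them. The sketch below computes cosine similarity using only the standard library; `cosine_similarity` is a helper written for illustration and the vectors are toy values. Note that vectors from different models (for example an Azure deployment and the local `nomic-embed-text` mapping) live in different vector spaces and generally have different dimensions, so only compare embeddings produced by the same provider.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_a * norm_b)

# Toy vectors, not real embeddings
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

With the result of `get_azure_embeddings`, the same function could be applied to `result["embeddings"][0]` and `result["embeddings"][1]`.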

## Azure Monitoring and Troubleshooting

In [None]:
import reflex_llms
import time

def monitor_azure_health():
    """Comprehensive Azure OpenAI health monitoring."""
    
    print("=== Azure OpenAI Health Report ===")
    
    # Get current status
    status = reflex_llms.get_module_status()
    provider = status['selected_provider']
    
    print(f"Current Provider: {provider}")
    
    if provider == "azure":
        print("✅ Azure OpenAI is active")
        
        # Test Azure API performance
        try:
            # Create the client first so setup time is not counted as latency
            client = reflex_llms.get_openai_client()
            start_time = time.perf_counter()
            
            response = client.chat.completions.create(
                model="gpt-35-turbo",
                messages=[{"role": "user", "content": "ping"}],
                max_tokens=1
            )
            
            response_time = time.perf_counter() - start_time
            print(f"Azure Response Time: {response_time:.2f}s")
            print("Azure API Status: Healthy ✅")
            
        except Exception as e:
            print(f"Azure API Error: {e} ❌")
            
    elif provider == "reflex":
        print("⚠️  Using local fallback - Azure unavailable")
        
        server = reflex_llms.get_reflex_server()
        if server:
            server_status = server.get_status()
            print("Fallback Server: Running ✅")
            print(f"Available Models: {len(server_status.get('openai_compatible_models', []))}")
        else:
            print("Fallback Server: Not available ❌")
    
    else:
        print(f"Unexpected provider: {provider} ⚠️")
    
    # Configuration check
    print("\nConfiguration:")
    print(f"  Config Cached: {status['has_cached_config']}")
    print(f"  Fallback Ready: {status.get('reflex_server_running', False)}")
    
    return status

# Run Azure health monitoring
health_status = monitor_azure_health()

# Force fallback test
print("\n=== Testing Fallback Mechanism ===")
reflex_llms.clear_cache()

# Try with reflex priority to test fallback
try:
    fallback_client = reflex_llms.get_openai_client(
        preference_order=["reflex", "azure"]
    )
    fallback_provider = reflex_llms.get_selected_provider()
    print(f"Fallback test provider: {fallback_provider}")
except Exception as e:
    print(f"Fallback test failed: {e}")
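
A single timed request is a noisy health signal. One option, sketched below with hypothetical helpers `timed_call` and `summarize`, is to time several calls and look at the spread. The stand-in `fake_ping` only sleeps; in practice it would be replaced by a function that issues the small chat request shown above.

```python
import time

def timed_call(fn, *args, **kwargs):
    """Call `fn` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def summarize(latencies):
    """Return min, median (upper middle for even counts), and max."""
    ordered = sorted(latencies)
    return {
        "min": ordered[0],
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
    }

def fake_ping():
    """Stand-in for a real request; sleeps briefly instead of calling an API."""
    time.sleep(0.01)
    return "pong"

latencies = [timed_call(fake_ping)[1] for _ in range(5)]
stats = summarize(latencies)
print(f"min={stats['min']:.3f}s median={stats['median']:.3f}s max={stats['max']:.3f}s")
```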

## Summary

This notebook demonstrated:

1. **Azure OpenAI Integration** - Primary Azure configuration with environment variables
2. **Intelligent Fallback** - Automatic switching to local AI when Azure is unavailable
3. **Configuration Management** - Azure-specific settings and deployment mappings
4. **Chat and Embeddings** - Chat completions and embeddings that run on Azure or on the local fallback
5. **Health Monitoring** - Comprehensive Azure service monitoring and troubleshooting

Key benefits for Azure users:
- **Enterprise Compliance** - Use Azure's enterprise-grade infrastructure
- **Cost Control** - Fall back to local AI during outages or when budget limits are reached
- **Reliability** - Automatic failover ensures continuous operation
- **Flexibility** - Easy switching between Azure regions or deployments
