# RefLex LLM Usage Examples

This notebook demonstrates how to use RefLex LLM for intelligent OpenAI API provider resolution with automatic fallback to local AI servers.

## Installation

In [None]:
!pip install reflex-llms

## Basic Setup

### Environment Variables

First, set up your API credentials (optional - RefLex will fall back to local AI if they are not available):

In [1]:
import os

# Set up environment variables (optional)
# os.environ['OPENAI_API_KEY'] = 'your-openai-api-key'
# os.environ['AZURE_OPENAI_ENDPOINT'] = 'https://your-resource.openai.azure.com/'
# os.environ['AZURE_OPENAI_API_KEY'] = 'your-azure-api-key'
# os.environ['AZURE_OPENAI_API_VERSION'] = '2024-02-15-preview'

print("Environment setup complete")

Environment setup complete
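
To make "optional" concrete, the sketch below reports which of the credentials listed above are present. `configured_providers` is a hypothetical helper written for this notebook, not part of RefLex, and treating Azure as configured only when both the endpoint and the key are set is an assumption.

```python
import os

def configured_providers(env=None):
    """Return the cloud providers whose credentials appear to be set.

    Hypothetical helper for illustration; RefLex's own detection may differ.
    """
    env = os.environ if env is None else env
    found = []
    if env.get("OPENAI_API_KEY"):
        found.append("openai")
    # Assumption: Azure needs both an endpoint and a key to be usable.
    if env.get("AZURE_OPENAI_ENDPOINT") and env.get("AZURE_OPENAI_API_KEY"):
        found.append("azure")
    return found

print(configured_providers({"OPENAI_API_KEY": "sk-example"}))  # ['openai']
print(configured_providers({}))                                # []
```

An empty list would be the case in which only the local fallback remains.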


### Docker Setup (for local AI fallback)

RefLex can automatically spin up local AI servers using Docker:

In [3]:
import subprocess

subprocess.run(['docker', '--version'], capture_output=True, text=True).stdout

'Docker version 28.1.1, build 4eba377\n'
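
The cell above raises `FileNotFoundError` when Docker is not installed. A more defensive check might look like the following sketch, which uses only the standard library. `docker_available` is a hypothetical helper, and it assumes that `docker info` exits with a non-zero status when the daemon is unreachable, as recent Docker releases do.

```python
import shutil
import subprocess

def docker_available(timeout=5.0):
    """Return True if the Docker CLI exists and the daemon answers."""
    if shutil.which("docker") is None:
        return False  # CLI not installed or not on PATH
    try:
        # `docker info` contacts the daemon, unlike `docker --version`.
        result = subprocess.run(
            ["docker", "info"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

print(f"Docker available: {docker_available()}")
```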

## Quick Start Examples

### 1. Basic Usage (Automatic Provider Selection)

In [4]:
import reflex_llms

# Get OpenAI client with automatic provider resolution
client = reflex_llms.get_openai_client()

# Use exactly like the OpenAI client
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Hello! Can you help me with Python?"}
    ],
    max_tokens=100
)

print(f"Response: {response.choices[0].message.content}")
print(f"Using provider: {reflex_llms.get_selected_provider()}")

Checking OpenAI API providers...
  Trying openai...
  Using OpenAI API
Response: Of course! I'd be happy to help you with Python. What specific questions do you have or what do you need assistance with?
Using provider: openai
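
The log lines above suggest that providers are tried one after another. The sketch below shows one way such a preference-ordered resolution could work. It is not RefLex's implementation: `resolve_provider`, the `checks` dictionary, and the probe functions are all hypothetical names used for illustration.

```python
from typing import Callable, Dict, List, Optional

def resolve_provider(
    preference_order: List[str],
    checks: Dict[str, Callable[[], bool]],
) -> Optional[str]:
    """Return the first provider in preference_order whose check passes."""
    for name in preference_order:
        check = checks.get(name)
        if check is None:
            continue  # unknown provider name: skip it
        try:
            if check():
                return name
        except Exception:
            # A failing probe (timeout, bad credentials) counts as unavailable.
            continue
    return None

def broken_probe() -> bool:
    raise ConnectionError("simulated network failure")

checks = {
    "openai": broken_probe,   # cloud API unreachable
    "azure": lambda: False,   # not configured
    "reflex": lambda: True,   # local server is up
}
print(resolve_provider(["openai", "azure", "reflex"], checks))  # reflex
```

Treating a probe that raises the same way as one that returns `False` keeps a single unreachable provider from aborting the whole resolution.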


### 2. Check Which Provider is Being Used

In [5]:
import reflex_llms

# Check provider without making API calls
provider_type = reflex_llms.get_openai_client_type()
print(f"Selected provider: {provider_type}")

# Check if using local AI
if reflex_llms.is_using_reflex():
    print("Using local RefLex server")
    server = reflex_llms.get_reflex_server()
    if server:
        print(f"Server URL: {server.openai_compatible_url}")
        print(f"Server healthy: {server.is_healthy}")
else:
    print("Using cloud AI service")

# Get comprehensive status
status = reflex_llms.get_module_status()
print(f"Status: {status}")

Using cached openai configuration
Selected provider: openai
Using cloud AI service
Status: {'selected_provider': 'openai', 'has_cached_config': True, 'reflex_server_running': False, 'reflex_server_url': None}


## Configuration Examples

### 3. Custom Configuration Parameters

In [None]:
import reflex_llms

# Custom OpenAI API URL (e.g., for enterprise deployments)
client = reflex_llms.get_openai_client(
    preference_order=["openai"],
    openai_base_url="https://enterprise-api.openai.com/v1",
    timeout=10.0,
)

# Custom Azure configuration
client = reflex_llms.get_openai_client(
    preference_order=["azure", "openai"],
    azure_api_version="2024-06-01",
    azure_base_url="https://custom-azure.openai.azure.com",
)



In [None]:
from reflex_llms.server import ReflexServerConfig, ModelMapping

# Custom RefLex server configuration
reflex_config = ReflexServerConfig(
    port=8080,
    container_name="my-ai-server",
    model_mappings=ModelMapping(minimal_setup=True),
)

client = reflex_llms.get_openai_client(
    preference_order=["reflex"],
    reflex_server_config=reflex_config
)

print("Custom configurations set up successfully")

Using cached openai configuration
Custom configurations set up successfully


In [None]:
reflex_config = {
    "port": 8080,
    "container_name": "my-ai-server",
    "model_mappings": {
        "minimal_setup": True
    },
}
client = reflex_llms.get_openai_client(
    preference_order=["reflex"],
    reflex_server_config=reflex_config
)

print("Custom configurations set up successfully")

## Configuration File Examples

### 5. Using reflex.json Configuration Files

Create a `reflex.json` file in your project directory:

In [None]:
import reflex_llms

# This cell assumes a reflex.json file already exists in the project directory

# Use the configuration
client = reflex_llms.get_openai_client(from_file=True)
print(f"Using provider from file: {reflex_llms.get_selected_provider()}")

# Override specific settings while using file
reflex_llms.clear_cache()
client = reflex_llms.get_openai_client(
    from_file=True,
    timeout=15.0,  # Override file setting
    preference_order=["reflex", "openai"]  # Override file setting
)
print(f"Using provider with overrides: {reflex_llms.get_selected_provider()}")
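
The override behaviour in the cell above can be pictured as a dictionary merge in which explicit keyword arguments win over values from the file. The sketch below illustrates that idea only. `merge_settings` is a hypothetical helper, and the rule that `None` means "not provided" is an assumption, not documented RefLex behaviour.

```python
def merge_settings(file_settings, **overrides):
    """Combine settings loaded from a file with explicit keyword overrides.

    Assumption for this sketch: an override of None means "not provided",
    so the file value is kept.
    """
    merged = dict(file_settings)  # copy so the caller's dict is untouched
    for key, value in overrides.items():
        if value is not None:
            merged[key] = value
    return merged

file_settings = {"timeout": 5.0, "preference_order": ["openai", "reflex"]}
effective = merge_settings(
    file_settings,
    timeout=15.0,
    preference_order=["reflex", "openai"],
)
print(effective)
# {'timeout': 15.0, 'preference_order': ['reflex', 'openai']}
```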

### 6. Creating Example Configuration Files

In [None]:
import reflex_llms

# Create an example reflex.json with all options
reflex_llms.create_example_config("my-reflex-config.json")

# Check if a config file is being used
config_path = reflex_llms.get_reflex_config_path()
if config_path:
    print(f"Using config file: {config_path}")
else:
    print("No config file found, using defaults")

# Initialize with config discovery (like load_dotenv)
reflex_llms.init_reflex()
print("RefLex initialization complete")
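
The comparison with `load_dotenv` suggests that the config file is located by searching upward from the working directory. The sketch below shows that search pattern with `pathlib`. `find_config` is a hypothetical function, and whether RefLex searches parent directories in exactly this way is an assumption.

```python
from pathlib import Path
from typing import Optional

def find_config(start: Path, filename: str = "reflex.json") -> Optional[Path]:
    """Search start and each of its parent directories for filename."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate  # nearest match wins
    return None

print(find_config(Path.cwd()))
```

Returning the nearest match lets a subproject keep its own `reflex.json` without affecting sibling directories.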

## Practical Usage Examples

### 7. Chat Application Example

In [None]:
import reflex_llms

def chat_with_ai(user_input="Hello, how are you?"):
    """Simple chat function that works with any provider."""
    client = reflex_llms.get_openai_client()
    
    print(f"Chat ready! Using {reflex_llms.get_selected_provider()}")
    
    conversation = [{"role": "user", "content": user_input}]
    
    try:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=conversation,
            max_tokens=150
        )
        
        ai_response = response.choices[0].message.content
        print(f"You: {user_input}")
        print(f"AI: {ai_response}")
        
        return ai_response
        
    except Exception as e:
        print(f"Error: {e}")
        print("Trying to reconnect...")
        
        # Clear the cache and re-resolve the provider so the next call starts fresh
        reflex_llms.clear_cache()
        reflex_llms.get_openai_client()
        return "Sorry, I'm having trouble connecting right now."

# Run the chat example
response = chat_with_ai("What is artificial intelligence?")
print(f"\nResponse received: {len(response)} characters")

### 8. Document Analysis Example

In [None]:
import reflex_llms

def analyze_document(text, analysis_type="summary"):
    """Analyze a document using available AI provider."""
    client = reflex_llms.get_openai_client()
    
    prompts = {
        "summary": "Please provide a concise summary of the following text:",
        "sentiment": "Analyze the sentiment of the following text:",
        "keywords": "Extract the main keywords from the following text:"
    }
    
    prompt = prompts.get(analysis_type, prompts["summary"])
    
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful document analysis assistant."},
            {"role": "user", "content": f"{prompt}\n\n{text}"}
        ],
        max_tokens=200
    )
    
    return {
        "analysis": response.choices[0].message.content,
        "provider": reflex_llms.get_selected_provider(),
        "model": "gpt-3.5-turbo"
    }

# Example usage
sample_text = """
RefLex LLM is a Python library that provides intelligent OpenAI API provider 
resolution with automatic fallback capabilities. It seamlessly switches between 
OpenAI, Azure OpenAI, and local Ollama-based servers based on availability and 
user preferences. This ensures your applications remain functional even when 
cloud services are unavailable.
"""

result = analyze_document(sample_text, "summary")
print(f"Analysis: {result['analysis']}")
print(f"Provider: {result['provider']}")

# Try sentiment analysis
sentiment_result = analyze_document(sample_text, "sentiment")
print(f"\nSentiment: {sentiment_result['analysis']}")

### 9. Embedding Generation Example

In [None]:
import reflex_llms
import numpy as np

def get_embeddings(texts):
    """Generate embeddings using available provider."""
    client = reflex_llms.get_openai_client()
    
    # Handle both single text and list of texts
    if isinstance(texts, str):
        texts = [texts]
    
    embeddings = []
    
    for text in texts:
        response = client.embeddings.create(
            model="text-embedding-ada-002",
            input=text
        )
        embeddings.append(response.data[0].embedding)
    
    return np.array(embeddings)

# Example usage
sample_texts = [
    "RefLex provides AI fallback capabilities",
    "OpenAI API integration with local alternatives",
    "Docker-based local AI server management"
]

try:
    embeddings = get_embeddings(sample_texts)
    print(f"Generated {len(embeddings)} embeddings using {reflex_llms.get_selected_provider()}")
    print(f"Embedding shape: {embeddings.shape}")
    
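    # Cosine similarity between the first two texts; normalizing keeps the
    # score meaningful even if the provider returns non-unit-length vectors
    a, b = embeddings[0], embeddings[1]
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))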
    print(f"Similarity between first two texts: {similarity:.3f}")
    
except Exception as e:
    print(f"Embedding generation failed: {e}")
    print("This might happen if the provider doesn't support embeddings")
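
Once embeddings are available, a common next step is ranking texts by similarity to a query. The sketch below does this with cosine similarity in pure Python so it runs without any provider. The function names and the toy three-dimensional vectors are illustrative stand-ins for real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Toy 3-dimensional vectors standing in for real embeddings
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.1, 0.0], [1.0, 1.0, 0.0]]
print(rank_by_similarity(query, candidates))  # [1, 2, 0]
```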

## Advanced Configuration Examples

### 10. Production Deployment Configuration

In [None]:
import reflex_llms
from reflex_llms.server import ReflexServerConfig, ModelMapping
import os

def setup_production_ai():
    """Production-ready AI setup with fallback."""
    
    # Production configuration
    prod_config = ReflexServerConfig(
        host="0.0.0.0",  # Accept connections from all interfaces
        port=8080,
        container_name="production-ai-server",
        data_path="/opt/ai-models",  # Persistent storage
        model_mappings=ModelMapping(
            minimal_setup=False,  # Full model set for production
            model_mapping={
                "gpt-3.5-turbo": "llama3.1:8b",  # Larger models for better quality
                "gpt-4": "llama3.1:70b",
                "gpt-4o": "gemma3:27b",
                "text-embedding-ada-002": "nomic-embed-text"
            }
        )
    )
    
    # Try cloud first, fallback to local
    client = reflex_llms.get_openai_client(
        preference_order=["openai", "azure", "reflex"],
        timeout=30.0,  # Longer timeout for production
        reflex_server_config=prod_config
    )
    
    provider = reflex_llms.get_selected_provider()
    print(f"Production AI ready using: {provider}")
    
    if provider == "reflex":
        server = reflex_llms.get_reflex_server()
        if server:
            print(f"Local server status: {server.get_status()}")
    
    return client

# Setup production environment
try:
    prod_client = setup_production_ai()
    print("Production setup complete")
except Exception as e:
    print(f"Production setup failed: {e}")
    print("This is normal if Docker is not available")

### 11. Development Environment Setup

In [None]:
import reflex_llms
from reflex_llms.server import ReflexServerConfig, ModelMapping

def setup_development_ai():
    """Fast development setup with minimal models."""
    
    # Development configuration - fast startup
    dev_config = ReflexServerConfig(
        port=11434,
        container_name="dev-ai-server",
        model_mappings=ModelMapping(
            minimal_setup=True,  # Only essential models
            minimal_model_mapping={
                "gpt-3.5-turbo": "llama3.2:3b",  # Small, fast model
                "gpt-4o-mini": "gemma3:1b",      # Tiny model for testing
                "text-embedding-ada-002": "nomic-embed-text"
            }
        )
    )
    
    # Prefer local for development (no API costs)
    client = reflex_llms.get_client_dev_mode(
        reflex_server_config=dev_config,
        timeout=10.0
    )
    
    print(f"Development AI ready using: {reflex_llms.get_selected_provider()}")
    return client

# Setup development environment
try:
    dev_client = setup_development_ai()
    print("Development setup complete")
except Exception as e:
    print(f"Development setup failed: {e}")
    print("Falling back to available providers")
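
The model mappings in the two configurations translate OpenAI-style model names into local model tags. The sketch below shows that lookup in isolation. `resolve_local_model` and its `default` parameter are hypothetical; how RefLex handles an unmapped model name is not shown in this notebook.

```python
def resolve_local_model(requested, mapping, default=None):
    """Translate an OpenAI-style model name to a local model tag."""
    if requested in mapping:
        return mapping[requested]
    if default is not None:
        return default  # assumption: fall back to a configured default
    raise KeyError(f"No local model mapped for {requested!r}")

mapping = {
    "gpt-3.5-turbo": "llama3.2:3b",
    "text-embedding-ada-002": "nomic-embed-text",
}
print(resolve_local_model("gpt-3.5-turbo", mapping))                 # llama3.2:3b
print(resolve_local_model("gpt-4", mapping, default="llama3.2:3b"))  # llama3.2:3b
```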

## Error Handling and Monitoring

### 12. Robust Error Handling

In [None]:
import reflex_llms
import time

def robust_ai_call(messages, max_retries=3):
    """Make AI calls with automatic retry and provider switching."""
    
    for attempt in range(max_retries):
        try:
            client = reflex_llms.get_openai_client()
            
            response = client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=messages,
                max_tokens=100
            )
            
            return {
                "success": True,
                "response": response.choices[0].message.content,
                "provider": reflex_llms.get_selected_provider(),
                "attempt": attempt + 1
            }
            
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            
            if attempt < max_retries - 1:
                # Clear the cache so the next attempt re-resolves the provider
                reflex_llms.clear_cache()
                time.sleep(2 ** attempt)  # Exponential backoff
            else:
                return {
                    "success": False,
                    "error": str(e),
                    "attempts": max_retries
                }

# Example usage
messages = [{"role": "user", "content": "What is machine learning?"}]
result = robust_ai_call(messages)

if result["success"]:
    print(f"Success with {result['provider']}: {result['response'][:100]}...")
else:
    print(f"All attempts failed: {result['error']}")
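
The retry loop above mixes the backoff schedule with the API call. As a sketch, the same exponential backoff can be factored into a small reusable helper. `retry_with_backoff` and its injectable `sleep` parameter are hypothetical names chosen so that the schedule can be checked without actually waiting.

```python
import time

def retry_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func until it succeeds or max_retries attempts have failed.

    Waits base_delay * 2**attempt seconds between attempts. The sleep
    parameter exists so tests can replace time.sleep.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)

calls = {"count": 0}

def flaky():
    """Fail twice, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"

delays = []
print(retry_with_backoff(flaky, sleep=delays.append))  # ok
print(delays)                                          # [1.0, 2.0]
```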

### 13. Health Monitoring

In [None]:
import reflex_llms
import time

def monitor_ai_health():
    """Monitor AI provider health and performance."""
    
    status = reflex_llms.get_module_status()
    print("=== AI Health Report ===")
    print(f"Selected Provider: {status['selected_provider']}")
    print(f"Config Cached: {status['has_cached_config']}")
    
    if status['reflex_server_running']:
        print(f"RefLex Server: Running at {status['reflex_server_url']}")
        
        server = reflex_llms.get_reflex_server()
        if server:
            try:
                server_status = server.get_status()
                print(f"  Total Models: {server_status.get('total_models', 'Unknown')}")
                print(f"  OpenAI Models: {len(server_status.get('openai_compatible_models', []))}")
                print(f"  Container Running: {server_status.get('container_running', False)}")
                print(f"  Port Open: {server_status.get('port_open', False)}")
            except Exception as e:
                print(f"  Error getting server status: {e}")
    else:
        print("RefLex Server: Not running")
    
    # Test API call performance
    try:
        start_time = time.time()
        client = reflex_llms.get_openai_client()
        
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Hello"}],
            max_tokens=5
        )
        
        response_time = time.time() - start_time
        print(f"API Response Time: {response_time:.2f}s")
        print("API Status: Healthy")
        
    except Exception as e:
        print(f"API Status: Error - {e}")

# Run health check
monitor_ai_health()
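
A single timed request can be noisy. The sketch below takes the median over several runs, using `time.perf_counter`, which is better suited to measuring intervals than `time.time`. `timed_call` and `median_latency` are hypothetical helpers; in practice the measured function would wrap the chat completion call.

```python
import statistics
import time

def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

def median_latency(func, runs=5):
    """Median wall-clock duration of func over the given number of runs."""
    durations = [timed_call(func)[1] for _ in range(runs)]
    return statistics.median(durations)

result, elapsed = timed_call(sum, [1, 2, 3])
print(result)  # 6
print(f"Median latency: {median_latency(lambda: sum(range(1000))):.6f}s")
```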

## Cache Management

### 14. Cache Control

In [None]:
import reflex_llms

# Check current cache status
status = reflex_llms.get_module_status()
print(f"Cache status: {status['has_cached_config']}")
print(f"Current provider: {status['selected_provider']}")

# Force provider re-evaluation
print("\nForcing re-evaluation...")
config = reflex_llms.get_openai_client_config(force_recheck=True)
print(f"Re-evaluated provider: {reflex_llms.get_selected_provider()}")

# Clear cache manually
print("\nClearing cache...")
reflex_llms.clear_cache()
print(f"Cache cleared: {reflex_llms.get_module_status()['has_cached_config']}")

# Stop RefLex server if running
if reflex_llms.is_using_reflex():
    print("Stopping RefLex server...")
    reflex_llms.stop_reflex_server()
    print("RefLex server stopped")
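
The interplay of caching, `force_recheck`, and `clear_cache` in the cell above can be modelled with a small class. This is a conceptual sketch only: `ProviderCache` is a hypothetical name, and RefLex's internal cache may be organised differently.

```python
class ProviderCache:
    """Remember the last resolved provider until cleared or rechecked."""

    def __init__(self, resolver):
        self._resolver = resolver  # callable returning a provider name
        self._cached = None

    def get(self, force_recheck=False):
        """Return the cached provider, resolving it first if needed."""
        if force_recheck or self._cached is None:
            self._cached = self._resolver()
        return self._cached

    def clear(self):
        self._cached = None

    @property
    def has_cached_config(self):
        return self._cached is not None

lookups = []

def resolver():
    lookups.append("probe")  # record every real resolution
    return "openai"

cache = ProviderCache(resolver)
cache.get()
cache.get()                     # served from cache, no new probe
cache.get(force_recheck=True)   # probes again
print(len(lookups))             # 2
cache.clear()
print(cache.has_cached_config)  # False
```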

## Best Practices

### 15. Recommended Usage Patterns

In [None]:
import reflex_llms
import os

class AIManager:
    """Best practice AI manager class."""
    
    def __init__(self, environment="development"):
        self.environment = environment
        self.client = None
        self._setup_client()
    
    def _setup_client(self):
        """Setup client based on environment."""
        if self.environment == "development":
            # Fast startup, prefer local
            self.client = reflex_llms.get_client_dev_mode(
                timeout=5.0,
                from_file=True  # Load from reflex.json if available
            )
        elif self.environment == "production":
            # Reliable, prefer cloud
            self.client = reflex_llms.get_client_prod_mode(
                timeout=30.0,
                from_file=True
            )
        else:
            # Default setup
            self.client = reflex_llms.get_openai_client(from_file=True)
        
        print(f"AI Manager ready using: {reflex_llms.get_selected_provider()}")
    
    def chat_completion(self, messages, **kwargs):
        """Robust chat completion with retry."""
        try:
            return self.client.chat.completions.create(
                messages=messages,
                **kwargs
            )
        except Exception as e:
            print(f"AI request failed: {e}")
            # Retry with fresh client
            reflex_llms.clear_cache()
            self._setup_client()
            return self.client.chat.completions.create(
                messages=messages,
                **kwargs
            )
    
    def get_embeddings(self, text):
        """Generate embeddings with error handling."""
        try:
            return self.client.embeddings.create(
                model="text-embedding-ada-002",
                input=text
            )
        except Exception as e:
            print(f"Embedding request failed: {e}")
            raise
    
    def health_check(self):
        """Check AI system health."""
        return reflex_llms.get_module_status()
    
    def cleanup(self):
        """Clean shutdown."""
        if reflex_llms.is_using_reflex():
            reflex_llms.stop_reflex_server()

# Usage examples
ai_dev = AIManager("development")
ai_prod = AIManager("production")

# Use the managers
try:
    response = ai_dev.chat_completion([
        {"role": "user", "content": "Hello from development!"}
    ], model="gpt-3.5-turbo", max_tokens=20)
    
    print(f"Dev response: {response.choices[0].message.content}")
    
except Exception as e:
    print(f"Development AI failed: {e}")

# Health check
health = ai_dev.health_check()
print(f"\nAI Health: {health['selected_provider']} - {health.get('reflex_server_running', False)}")

# Cleanup
ai_dev.cleanup()
ai_prod.cleanup()

## Summary

RefLex LLM provides:

1. **Seamless API Compatibility** - Drop-in replacement for OpenAI clients
2. **Intelligent Fallback** - Automatic switching between providers
3. **Flexible Configuration** - File-based, environment, or programmatic setup
4. **Production Ready** - Caching, error handling, and monitoring
5. **Development Friendly** - Fast local AI setup for development

The library ensures your AI applications remain functional regardless of cloud service availability, making them more resilient and cost-effective.

## Next Steps

1. **Install RefLex LLM**: `pip install reflex-llms`
2. **Set up your environment**: Add API keys or ensure Docker is available
3. **Create a `reflex.json`**: Configure your preferred settings
4. **Start building**: Use RefLex as a drop-in OpenAI replacement
5. **Monitor and optimize**: Use health checks and caching for best performance

For more information, visit the [RefLex LLM documentation](https://github.com/your-repo/reflex-llm).