# Week 3 — Part 04: OpenAI Compatible API Lab

**Estimated time:** 45–60 minutes

---

## What success looks like (end of Part 04)

- You can explain what "OpenAI-compatible API" means.
- You can configure the OpenAI SDK for multiple providers.
- You can make API calls to different backends with the same code.

## Learning Objectives

- Understand the OpenAI API compatibility standard
- Configure clients for multiple providers
- Switch providers by changing only the `base_url`, `api_key`, and model name

## Setup

Before we begin, we need to:

1. **Import the OpenAI SDK** - This is the official Python client library that works with any OpenAI-compatible provider
2. **Configure our providers** - Each provider needs:
   - `base_url`: The API endpoint URL
   - `api_key`: Your authentication key
   - `default_model`: The model to use by default

The key insight is that **all OpenAI-compatible providers use the same SDK**. To switch between them, you change the `base_url` and `api_key`, and usually the model name as well, since model IDs differ between providers.

In [None]:
import os
import json
from openai import OpenAI

# Provider configurations - FREE providers (no credit card required)
# Get your API keys:
# - Groq: https://console.groq.com/keys
# - OpenRouter: https://openrouter.ai/keys

PROVIDERS = {
    # Groq - Ultra-fast inference, generous free tier
    # Docs: https://console.groq.com/docs/openai
    # Base URL: The API endpoint that accepts OpenAI-format requests
    # API Key: Your personal key from console.groq.com/keys
    "groq": {
        "base_url": "https://api.groq.com/openai/v1",
        "api_key": os.environ.get("GROQ_API_KEY", "your-groq-api-key"),
        "default_model": "llama-3.3-70b-versatile"  # 70B parameter model
    },
    
    # OpenRouter - 100+ models, free tier available
    # Docs: https://openrouter.ai/docs/quickstart
    # Free models have :free suffix in their name
    "openrouter": {
        "base_url": "https://openrouter.ai/api/v1",
        "api_key": os.environ.get("OPENROUTER_API_KEY", "your-openrouter-api-key"),
        "default_model": "meta-llama/llama-3.3-70b-instruct:free"
    },
    
    # Ollama (local) - requires Ollama running on localhost
    # See self_learn documentation for setup
    # "ollama": {
    #     "base_url": "http://localhost:11434/v1",
    #     "api_key": "ollama",  # Any value works - local only
    #     "default_model": "llama3.2"
    # },
}

print(f"Configured providers: {list(PROVIDERS.keys())}")
print("\nTo get API keys:")
print("  Groq: https://console.groq.com/keys")
print("  OpenRouter: https://openrouter.ai/keys")
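
Because the configuration above falls back to placeholder strings when an environment variable is unset, a missing key only shows up later as an authentication error. The following is a minimal sketch of a pre-flight check, assuming the placeholder convention used in this lab (values starting with `your-`). The helper name `missing_keys` and the example values are hypothetical.

```python
def missing_keys(providers: dict) -> list:
    """Return names of providers whose api_key is empty or still a placeholder."""
    missing = []
    for name, config in providers.items():
        key = config.get("api_key", "")
        # Assumption: placeholders follow the "your-...-api-key" pattern used in this lab
        if not key or key.startswith("your-"):
            missing.append(name)
    return missing


# Illustrative configs only; the first key is a made-up value, not a real credential
example_providers = {
    "groq": {"api_key": "example-key-not-real"},
    "openrouter": {"api_key": "your-openrouter-api-key"},
}
print("Providers without a usable key:", missing_keys(example_providers))
```

In the notebook you could call `missing_keys(PROVIDERS)` before running the exercises.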

## Exercise 1: List Available Models

The `/v1/models` endpoint returns a list of all models available from a provider. This is useful for:

- **Discovering what models are available** before making requests
- **Checking model names** (they vary between providers)
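- **Verifying API connectivity** - a successful call confirms the endpoint is reachable, and it also confirms your key on providers that require authentication for this endpoint (some serve the model list publicly)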

### How it works:

1. Create an `OpenAI` client with the provider's `base_url` and `api_key`
2. Call `client.models.list()` - this sends a request to `{base_url}/models` (the `/v1/models` endpoint, since each `base_url` above already ends in `/v1`)
3. Extract model IDs from the response

The response format is standardized across all OpenAI-compatible providers.
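
To make the shape of that response concrete, here is a sketch that parses a models-list payload by hand. The sample payload is illustrative (real responses carry more fields per model, and providers may add their own), and `extract_model_ids` is a hypothetical helper rather than part of the SDK.

```python
import json

# Illustrative payload in the list format returned by the models endpoint
sample_payload = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "llama-3.3-70b-versatile", "object": "model"},
    {"id": "example-model-b", "object": "model"}
  ]
}
""")


def extract_model_ids(payload: dict) -> list:
    """Collect the `id` of every entry under `data`, skipping entries without one."""
    return [m["id"] for m in payload.get("data", []) if "id" in m]


print(extract_model_ids(sample_payload))
```

The SDK wraps the same structure in objects, which is why the lab code reads `models.data` and then `.id` on each entry.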

In [None]:
def list_models(provider_name: str) -> list:
    """
    List available models from a provider.
    
    This function demonstrates the provider-agnostic nature of the OpenAI SDK.
    The same code works for Groq, OpenRouter, Ollama, or any OpenAI-compatible API.
    
    Args:
        provider_name: Key from the PROVIDERS dict (e.g., "groq", "openrouter")
    
    Returns:
        List of model ID strings
    """
    # Step 1: Get the configuration for this provider
    config = PROVIDERS.get(provider_name)
    if not config:
        print(f"Unknown provider: {provider_name}")
        return []
    
    # Step 2: Create an OpenAI client configured for this provider
    # The magic happens here - same OpenAI class, different base_url
    client = OpenAI(
        base_url=config["base_url"],
        api_key=config["api_key"]
    )
    
    # Step 3: Call the models.list() endpoint
    # This sends a GET request to {base_url}/models
    try:
        models = client.models.list()
        # Extract just the model IDs from the response objects
        return [m.id for m in models.data]
    except Exception as e:
        print(f"Error listing models for {provider_name}: {e}")
        return []

# List models from each configured provider
# This loop checks connectivity to each provider in turn
for provider in PROVIDERS:
    print(f"\n=== {provider.upper()} ===")
    models = list_models(provider)
    if models:
        print(f"Found {len(models)} models")
        for m in models[:5]:
            print(f"  - {m}")
        if len(models) > 5:
            print(f"  ... and {len(models) - 5} more")
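
Model lists can be long, so filtering them is often the next step. The sketch below relies on the `:free` suffix convention noted in the OpenRouter configuration comments. The model IDs are illustrative and `free_models` is a hypothetical helper.

```python
def free_models(model_ids: list) -> list:
    """Keep only the IDs that end with the ':free' suffix."""
    return [m for m in model_ids if m.endswith(":free")]


# Illustrative IDs; a real list would come from list_models("openrouter")
example_ids = [
    "meta-llama/llama-3.3-70b-instruct:free",
    "meta-llama/llama-3.3-70b-instruct",
    "example-vendor/example-model:free",
]
for model_id in free_models(example_ids):
    print(model_id)
```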

## Exercise 2: Make a Chat Completion

The `/v1/chat/completions` endpoint is the core of the OpenAI API. It is the endpoint that this lab, and most chat-style applications, rely on for nearly every request.

### Key Parameters:

| Parameter | Purpose |
|-----------|---------|
| `model` | Which model to use (e.g., "llama-3.3-70b-versatile") |
| `messages` | Conversation history as a list of role/content pairs |
| `max_tokens` | Maximum length of the response, in tokens |
| `temperature` | Sampling randomness (0 = near-deterministic; higher values give more varied output) |

### Message Roles:

- `system`: Instructions for the assistant's behavior
- `user`: Input from the user
- `assistant`: Previous responses (for multi-turn conversations)
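
The roles above matter most in multi-turn conversations, where earlier `assistant` replies are sent back as context. Here is a small sketch of building such a history. `add_turn` is a hypothetical helper, and it accepts only the three roles covered in this lab.

```python
def add_turn(history: list, role: str, content: str) -> list:
    """Return a new history with one message appended (the input list is not mutated)."""
    # Only the three roles described in this lab are accepted here
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"Unsupported role: {role}")
    return history + [{"role": role, "content": content}]


history = [{"role": "system", "content": "You are a helpful assistant."}]
history = add_turn(history, "user", "What is 2 + 2?")
history = add_turn(history, "assistant", "4")  # the model's earlier reply
history = add_turn(history, "user", "And doubled?")

for message in history:
    print(f"{message['role']:>9}: {message['content']}")
```

The resulting `history` list is what you would pass as `messages` in a chat completion request.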

### The Point of This Exercise:

Notice that **the same function works with any provider**. We just pass a different configuration. This is the power of the OpenAI-compatible standard.

In [None]:
def simple_chat(provider_name: str, prompt: str) -> str:
    """
    Make a simple chat completion request.
    
    This function demonstrates the minimal code needed to get a response
    from any OpenAI-compatible provider.
    
    Args:
        provider_name: Key from PROVIDERS dict
        prompt: The user's question/message
    
    Returns:
        The assistant's response text (or error message)
    """
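    # Get provider configuration; return an error message for unknown names
    config = PROVIDERS.get(provider_name)
    if not config:
        return f"Error: unknown provider '{provider_name}'"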
    
    # Create client for this provider
    # Same OpenAI class works for all providers!
    client = OpenAI(
        base_url=config["base_url"],
        api_key=config["api_key"]
    )
    
    try:
        # Make the chat completion request
        # The messages array builds the conversation context:
        # - system: Sets the assistant's personality/behavior
        # - user: The actual question
        response = client.chat.completions.create(
            model=config["default_model"],
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt}
            ],
            max_tokens=100  # Limit response length
        )
        
        # Extract the text from the response
        # response.choices is a list (can have multiple if n > 1)
        # Each choice has a message with role and content
        return response.choices[0].message.content
    except Exception as e:
        return f"Error: {e}"

# Test with each configured provider
# Using a simple math question to verify basic functionality
test_prompt = "What is 2 + 2? Answer briefly."

for provider in PROVIDERS:
    print(f"\n=== {provider.upper()} ===")
    result = simple_chat(provider, test_prompt)
    print(f"Response: {result}")

## Exercise 3: Inspect Response Structure

Understanding the response structure is crucial for building robust applications. OpenAI-compatible providers return the same core JSON format, though some add provider-specific fields.

### Response Object Structure:

```
{
  "id": "chatcmpl-xxx",           # Unique ID for this completion
  "model": "llama-3.3-70b-versatile",  # Model used
  "choices": [{                   # Array of completions (usually 1)
    "index": 0,                   # Position in choices array
    "message": {
      "role": "assistant",        # Always "assistant"
      "content": "..."            # The actual response text
    },
    "finish_reason": "stop"       # Why generation stopped
  }],
  "usage": {                      # Token counts for cost tracking
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
```

### Finish Reasons:

- `stop`: Natural end (model finished its response)
- `length`: Hit max_tokens limit (response was cut off)
- `tool_calls`: Model wants to call a function (advanced usage)

### Why This Matters:

- **Usage tracking**: Monitor costs by counting tokens
- **Error handling**: Check `finish_reason` to detect truncated responses
- **Logging**: Store `response.id` for debugging
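
As a sketch of the truncation check, the function below works on a plain dict shaped like the JSON structure above. The sample payload is illustrative, `check_completion` is a hypothetical helper, and treating `usage` as optional is an assumption about less complete backends.

```python
def check_completion(response: dict) -> dict:
    """Summarize a chat completion payload and flag truncated output."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}  # Assumption: some backends may omit usage
    return {
        "id": response.get("id"),
        "text": choice["message"].get("content"),
        # finish_reason == "length" means the max_tokens limit cut the reply short
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }


# Illustrative payload with a truncated reply
sample = {
    "id": "chatcmpl-xxx",
    "model": "llama-3.3-70b-versatile",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "The answer is"},
        "finish_reason": "length",
    }],
    "usage": {"prompt_tokens": 10, "completion_tokens": 8, "total_tokens": 18},
}

summary = check_completion(sample)
if summary["truncated"]:
    print(f"Response {summary['id']} was cut off; consider raising max_tokens")
```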

In [None]:
def inspect_response(provider_name: str) -> dict:
    """
    Make a request and return the full response structure.
    
    This function extracts all the key fields from the response object
    so you can see exactly what data is available.
    
    Args:
        provider_name: Key from PROVIDERS dict
    
    Returns:
        Dictionary with id, model, content, finish_reason, and usage
    """
    config = PROVIDERS[provider_name]
    client = OpenAI(
        base_url=config["base_url"],
        api_key=config["api_key"]
    )
    
    try:
        # Make a simple request
        response = client.chat.completions.create(
            model=config["default_model"],
            messages=[{"role": "user", "content": "Say 'hello'"}],
            max_tokens=10
        )
        
        # Extract all the important fields
        # This shows you what's available in the response object
        return {
            # Unique identifier for this completion
            "id": response.id,
            
            # The model that was actually used
            # (may differ from requested if provider aliases it)
            "model": response.model,
            
            # The actual text response
            "content": response.choices[0].message.content,
            
            # Why the model stopped generating
            # "stop" = natural end, "length" = hit max_tokens
            "finish_reason": response.choices[0].finish_reason,
            
            # Token usage - important for cost tracking
            "usage": {
                "prompt_tokens": response.usage.prompt_tokens,      # Input tokens
                "completion_tokens": response.usage.completion_tokens,  # Output tokens
                "total_tokens": response.usage.total_tokens          # Sum of both
            }
        }
    except Exception as e:
        return {"error": str(e)}

# Inspect response from each provider
# Notice how the structure is identical across providers!
for provider in PROVIDERS:
    print(f"\n=== {provider.upper()} Response ===")
    result = inspect_response(provider)
    print(json.dumps(result, indent=2))
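
For usage tracking across several requests, the dictionaries returned by `inspect_response` can be accumulated per provider. This is a minimal sketch: `UsageTracker` is a hypothetical class, and the recorded values are made up to stand in for real results.

```python
from collections import defaultdict


class UsageTracker:
    """Accumulate token counts per provider from inspect_response-style results."""

    def __init__(self):
        self.totals = defaultdict(
            lambda: {"prompt_tokens": 0, "completion_tokens": 0, "requests": 0}
        )

    def record(self, provider: str, result: dict) -> None:
        usage = result.get("usage")
        if not usage:
            return  # error results carry no usage block, so they are skipped
        entry = self.totals[provider]
        entry["prompt_tokens"] += usage.get("prompt_tokens", 0)
        entry["completion_tokens"] += usage.get("completion_tokens", 0)
        entry["requests"] += 1


tracker = UsageTracker()
# Made-up results in the same shape that inspect_response returns
tracker.record("groq", {"usage": {"prompt_tokens": 10, "completion_tokens": 8, "total_tokens": 18}})
tracker.record("groq", {"usage": {"prompt_tokens": 12, "completion_tokens": 5, "total_tokens": 17}})
tracker.record("openrouter", {"error": "simulated failure"})

for provider, totals in tracker.totals.items():
    print(provider, totals)
```

In the loop above you could call `tracker.record(provider, result)` after each `inspect_response(provider)` call.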

## Exercise 4: TODO - Multi-Provider Client

Now it's your turn! Implement a `MultiProviderClient` class that:

1. **Stores multiple provider configurations**
2. **Creates OpenAI clients for each provider** at initialization
3. **Provides a simple `chat()` method** that routes to the right provider
4. **Handles errors gracefully**

### Why This Pattern Matters:

In production applications, you often want to:
- **Use different providers for different tasks** (e.g., cheap model for simple tasks, powerful model for complex ones)
- **Implement fallback logic** (if one provider fails, try another)
- **Compare responses** from different models
- **Load balance** across providers
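
As an illustration of the fallback idea, here is a minimal sketch that tries providers in order and returns the first successful reply. It takes the chat function as a parameter so it runs without network access. `chat_with_fallback` and `fake_chat` are hypothetical names, and the outage is simulated.

```python
def chat_with_fallback(provider_order: list, prompt: str, chat_fn) -> tuple:
    """Try each provider in order; return (provider_name, reply) from the first that works."""
    errors = {}
    for name in provider_order:
        try:
            return name, chat_fn(name, prompt)
        except Exception as exc:  # broad on purpose: any failure triggers the next provider
            errors[name] = str(exc)
    raise RuntimeError(f"All providers failed: {errors}")


def fake_chat(provider_name: str, prompt: str) -> str:
    """Stand-in for a real chat call; pretends that 'groq' is unavailable."""
    if provider_name == "groq":
        raise ConnectionError("simulated outage")
    return f"[{provider_name}] echo: {prompt}"


used, answer = chat_with_fallback(["groq", "openrouter"], "Hi", fake_chat)
print(f"Answered by {used}: {answer}")
```

Fallback only triggers when the chat function raises on failure. A function that returns an error string instead, like `simple_chat` in Exercise 2, would be treated as a success.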

### Hints:

- Store clients in a dictionary: `self._clients = {"groq": OpenAI(...), ...}`
- The `chat()` method should look up the client and make the request
- Use `kwargs.pop("model", config["default_model"])` to allow model override
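
The last hint can be tried in isolation. The sketch below builds the request arguments without calling any API, so you can see what `pop` leaves behind. `build_request` and the model name `example-small-model` are hypothetical.

```python
def build_request(config: dict, messages: list, **kwargs) -> dict:
    """Assemble the keyword arguments for a chat completion call."""
    # pop() removes "model" from kwargs if present, otherwise returns the default
    model = kwargs.pop("model", config["default_model"])
    # The remaining kwargs (max_tokens, temperature, ...) pass through unchanged
    return {"model": model, "messages": messages, **kwargs}


config = {"default_model": "llama-3.3-70b-versatile"}
messages = [{"role": "user", "content": "Hi"}]

print(build_request(config, messages, max_tokens=50))
print(build_request(config, messages, model="example-small-model", temperature=0))
```

Using `pop` rather than `get` matters: if `model` stayed in `kwargs` and was also passed explicitly, Python would raise a `TypeError` for the duplicate keyword argument.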

In [None]:
# TODO: Implement MultiProviderClient
#
# This is a skeleton for you to complete. Read through the class
# and implement the missing parts marked with TODO comments.
#
# GOAL: Create a class that manages multiple API providers and lets
# you switch between them with a single method call.

class MultiProviderClient:
    """A client that can switch between multiple providers."""
    
    def __init__(self, providers: dict):
        """
        Initialize with provider configurations.
        
        Args:
            providers: Dict mapping provider names to their config
                      e.g., {"groq": {"base_url": "...", "api_key": "..."}}
        
        TODO:
        1. Store the providers dict as self.providers
        2. Create an OpenAI client for each provider
        3. Store clients in self._clients dict
        """
        # TODO: Store provider configs
        # self.providers = providers
        
        # TODO: Create OpenAI client for each provider
        # self._clients = {}
        # for name, config in providers.items():
        #     self._clients[name] = OpenAI(...)
        pass
    
    def chat(self, provider_name: str, messages: list, **kwargs) -> str:
        """
        Send a chat request to the specified provider.
        
        Args:
            provider_name: Which provider to use (e.g., "groq")
            messages: List of message dicts with role and content
            **kwargs: Additional options (model, max_tokens, temperature, etc.)
        
        Returns:
            The assistant's response text
        
        TODO:
        1. Check if provider_name exists in self._clients
        2. Get the client and config for this provider
        3. Get the model (use default if not specified in kwargs)
        4. Call client.chat.completions.create()
        5. Return the response content
        """
        # TODO: Get the client for this provider
        # TODO: Make the chat completion request
        # TODO: Return the response content
        pass
    
    def list_providers(self) -> list:
        """
        Return list of available provider names.
        
        Returns:
            List of provider name strings
        
        TODO:
        Return the keys from self.providers
        """
        # TODO: Return list of provider names
        # return list(self.providers.keys())
        pass


# Test your implementation
# Uncomment these lines after implementing the class:
# client = MultiProviderClient(PROVIDERS)
# print("Available providers:", client.list_providers())
# response = client.chat("groq", [{"role": "user", "content": "Hi"}])
# print("Response:", response)

## Appendix: Solution

Here's a complete implementation of the `MultiProviderClient` class. Compare this to your implementation to see how it all fits together.

### Key Design Decisions:

1. **Pre-create all clients at initialization** - This avoids creating a new client object on every request (more efficient)

2. **Store both providers and clients** - We need the original config to access `default_model`

3. **Use `kwargs.pop()` for model** - This allows callers to override the default model while keeping it optional

4. **Raise ValueError for unknown providers** - This makes debugging easier than failing silently

In [None]:
class MultiProviderClient:
    """
    A client that can switch between multiple providers.
    
    This class demonstrates a common pattern in production AI applications:
    managing multiple API providers through a single interface.
    
    Benefits:
    - Single interface for multiple providers
    - Easy switching between providers
    - Pre-initialized clients (no client construction per request)
    - Centralized configuration management
    """
    
    def __init__(self, providers: dict):
        """
        Initialize with provider configurations.
        
        Creates an OpenAI client for each provider upfront.
        This is more efficient than creating clients on each request.
        """
        # Store the original config (needed for default_model)
        self.providers = providers
        
        # Create a client for each provider
        # The underscore prefix marks them as internal by convention (Python does not enforce it)
        self._clients = {}
        
        for name, config in providers.items():
            # Each client is configured with the provider's base_url and api_key
            self._clients[name] = OpenAI(
                base_url=config["base_url"],
                api_key=config["api_key"]
            )
    
    def chat(self, provider_name: str, messages: list, **kwargs) -> str:
        """
        Send a chat request to a specific provider.
        
        Args:
            provider_name: Which provider to use (must be in self._clients)
            messages: Conversation as list of {"role": "...", "content": "..."}
            **kwargs: Optional parameters like max_tokens, temperature, model
        
        Returns:
            The assistant's response text
        
        Raises:
            ValueError: If provider_name is not recognized
        """
        # Validate provider exists
        if provider_name not in self._clients:
            raise ValueError(f"Unknown provider: {provider_name}")
        
        # Get the pre-configured client for this provider
        client = self._clients[provider_name]
        
        # Get the config (needed for default_model)
        config = self.providers[provider_name]
        
        # Allow model override via kwargs, otherwise use default
        # pop() removes the key if present, returns default if not
        model = kwargs.pop("model", config["default_model"])
        
        # Make the API call
        # Any remaining kwargs (max_tokens, temperature, etc.) are passed through
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            **kwargs
        )
        
        # Extract and return just the text content
        return response.choices[0].message.content
    
    def list_providers(self) -> list:
        """
        Return list of available provider names.
        
        Useful for UI displays or validation.
        """
        return list(self.providers.keys())


# Test the implementation
# This demonstrates the class in action with real API calls
if PROVIDERS:
    # Create the multi-provider client
    client = MultiProviderClient(PROVIDERS)
    print("Available providers:", client.list_providers())
    
    # Test each provider with a simple request
    for provider in client.list_providers():
        try:
            response = client.chat(
                provider,
                [{"role": "user", "content": "Say 'test'"}],
                max_tokens=5
            )
            print(f"{provider}: {response}")
        except Exception as e:
            print(f"{provider}: Error - {e}")