# Quickstart: Models and Providers

This notebook demonstrates how to use v-router for basic model interactions, fallback strategies, and cross-provider switching.

## What is v-router?

v-router is a unified LLM interface that provides:
- **Automatic fallback** between different models and providers
- **Unified response format** across all providers (Anthropic, OpenAI, Google, Azure)
- **Seamless provider switching** with the same API
- **Intelligent routing** based on model availability and configuration

## Core Components

### Request Models
- **`LLM`**: Configuration for a language model including provider, model name, and parameters
- **`BackupModel`**: Fallback model configuration with priority ordering
- **`Client`**: Main interface for sending requests to models

### Response Models  
- **`Response`**: Unified response format with content, usage, model info, and raw provider response
- **`Content`**: Text content blocks from the model response
- **`Usage`**: Token usage information (input/output tokens)


## Basic Example

Let's start with a simple example: one request to a single provider through the unified interface.

In [1]:
from v_router import Client, LLM, BackupModel

# Create an LLM configuration
llm_config = LLM(
    model_name="claude-sonnet-4",
    provider="anthropic",
    max_tokens=100,
    temperature=0
)

# Create a client with the LLM configuration
client = Client(llm_config)

# Send a message using the unified API
response = await client.messages.create(
    messages=[
        {"role": "user", "content": "Say hello in one sentence."}
    ]
)

# Access the unified response format
print(f"Response: {response.content[0].text}")
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")

2025-05-30 14:46:12,866 - v_router.router - INFO - Trying primary model: claude-sonnet-4 on anthropic


Response: Hello, it's nice to meet you!
Model: claude-sonnet-4-20250514
Provider: anthropic
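
The cells in this notebook use top-level `await`, which Jupyter and IPython allow. In a plain Python script the same call has to live inside a coroutine that is started with `asyncio.run`. The sketch below shows only that wrapping pattern: `fake_create` is a hypothetical stand-in for `client.messages.create`, so the block runs without v-router or API keys.

```python
import asyncio


async def fake_create(messages):
    """Hypothetical stand-in for `client.messages.create`; returns canned text."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return f"echo: {messages[-1]['content']}"


async def main():
    # In a real script, the notebook cell's code would go here,
    # including the `await client.messages.create(...)` call.
    return await fake_create(
        [{"role": "user", "content": "Say hello in one sentence."}]
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Replacing `fake_create` with the real client call should be the only change needed, assuming the environment is configured for the chosen provider.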


## Understanding the Response Format

v-router provides a unified response format across all providers. Let's examine the response structure in detail:

In [2]:
# Let's examine the unified Response structure
llm_config = LLM(
    model_name="claude-sonnet-4-20250514",
    provider="anthropic",
    max_tokens=100
)

client = Client(llm_config)

response = await client.messages.create(
    messages=[
        {"role": "user", "content": "What is Python? Answer in one sentence."}
    ]
)

# The unified Response structure provides:
print("🔍 Unified Response Structure:")
print(f"├── response.content: {type(response.content).__name__} of {len(response.content)} items")
print(f"│   └── content[0].type: '{response.content[0].type}'")
print(f"│   └── content[0].role: '{response.content[0].role}'")
print(f"│   └── content[0].text: '{response.content[0].text}'")
print(f"├── response.tool_use: {type(response.tool_use).__name__} of {len(response.tool_use)} items")
print(f"├── response.usage:")
print(f"│   ├── input_tokens: {response.usage.input_tokens}")
print(f"│   └── output_tokens: {response.usage.output_tokens}")
print(f"├── response.model: '{response.model}'")
print(f"├── response.provider: '{response.provider}'")
print(f"└── response.raw_response: {type(response.raw_response).__name__}")

print("\n✅ This same structure works for ALL providers!")

2025-05-30 14:46:14,336 - v_router.router - INFO - Trying primary model: claude-sonnet-4-20250514 on anthropic


🔍 Unified Response Structure:
├── response.content: list of 1 items
│   └── content[0].type: 'text'
│   └── content[0].role: 'assistant'
│   └── content[0].text: 'Python is a high-level, interpreted programming language known for its simple, readable syntax and versatility across applications like web development, data science, artificial intelligence, and automation.'
├── response.tool_use: list of 0 items
├── response.usage:
│   ├── input_tokens: 16
│   └── output_tokens: 38
├── response.model: 'claude-sonnet-4-20250514'
├── response.provider: 'anthropic'
└── response.raw_response: dict

✅ This same structure works for ALL providers!
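
To make the shape printed above easier to keep in mind, here is a minimal sketch that mirrors it with plain dataclasses. The field names come from the output above; the class names (`ResponseSketch`, `ContentSketch`, `UsageSketch`), the types, and the defaults are illustrative assumptions and are not v-router's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class UsageSketch:
    input_tokens: int
    output_tokens: int


@dataclass
class ContentSketch:
    type: str
    role: str
    text: str


@dataclass
class ResponseSketch:
    # Illustrative mirror of the printed structure, not the library's class.
    content: list[ContentSketch]
    usage: UsageSketch
    model: str
    provider: str
    tool_use: list[Any] = field(default_factory=list)
    raw_response: Any = None


def total_tokens(response: ResponseSketch) -> int:
    """Sum input and output tokens from the usage block."""
    return response.usage.input_tokens + response.usage.output_tokens


example = ResponseSketch(
    content=[ContentSketch(type="text", role="assistant", text="placeholder text")],
    usage=UsageSketch(input_tokens=16, output_tokens=38),
    model="claude-sonnet-4-20250514",
    provider="anthropic",
)
print(total_tokens(example))  # 16 + 38 = 54
```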


## Fallback Example

One of v-router's key features is automatic fallback. If the primary model fails, v-router tries the backup models in priority order.

### How Fallback Works:
1. **Primary Model**: Attempts the main model first
2. **Backup Models**: If primary fails, tries backup models by priority (1, 2, 3...)
3. **Tool Inheritance**: Backup models automatically inherit tools from the primary model
4. **Same Interface**: No changes needed in your code - v-router handles it transparently
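
The ordering described in the steps above can be sketched in a few lines of plain Python. This is not v-router's implementation: `run_with_fallback` and `fake_call` are hypothetical names, and the sketch only illustrates "primary first, then backups by ascending priority, stop at the first success".

```python
def run_with_fallback(primary, backups, call):
    """Try `primary`, then each backup in ascending priority order.

    `backups` is a list of (priority, model_name) pairs; `call` is any
    function that takes a model name and returns a result or raises.
    """
    ordered = [primary] + [name for _, name in sorted(backups)]
    failed = []
    for name in ordered:
        try:
            return name, call(name)
        except Exception:
            failed.append(name)  # remember the failure and move on
    raise RuntimeError(f"all models failed: {failed}")


def fake_call(model_name):
    """Hypothetical provider call that rejects the non-existent model."""
    if model_name == "claude-6":
        raise ValueError("unknown model")
    return f"answer from {model_name}"


used, answer = run_with_fallback(
    "claude-6", [(2, "gemini-1.5-pro"), (1, "gpt-4o")], fake_call
)
print(used, "->", answer)
```

The backups are deliberately passed out of order to show that the priority value, not the list position, decides which one is tried first in this sketch.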

In [3]:
# Configure fallback models with different providers
llm_config = LLM(
    model_name="claude-6",  # Primary model (intentionally non-existent)
    provider="anthropic",
    max_tokens=100,
    backup_models=[
        BackupModel(
            model=LLM(
                model_name="gpt-4o",
                provider="openai"
            ),
            priority=1  # First fallback
        ),
        BackupModel(
            model=LLM(
                model_name="gemini-1.5-pro",
                provider="google"
            ),
            priority=2  # Second fallback
        )
    ]
)

client = Client(llm_config)

# This will try claude-6 first (fail), then gpt-4o, then gemini-1.5-pro if needed
response = await client.messages.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's 2+2?"}
    ]
)
    
print(f"Response: {response.content[0].text}")
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")
print(f"\n💡 Notice: The fallback model was used seamlessly!")

2025-05-30 14:46:16,594 - v_router.router - INFO - Trying primary model: claude-6 on anthropic
2025-05-30 14:46:16,976 - v_router.router - INFO - Trying backup model: gpt-4o on openai


Response: 2 + 2 equals 4.
Model: gpt-4o-2024-08-06
Provider: openai

💡 Notice: The fallback model was used seamlessly!


## Cross-Provider Switch

You can enable cross-provider fallback by setting `try_other_providers=True`. If a call fails on one provider, the system will try another provider with the same model.

### How Cross-Provider Switching Works:
1. **Primary Provider**: Tries the specified provider first
2. **Model Mapping**: Uses models.yml to find the same model on other providers
3. **Automatic Retry**: Seamlessly switches to alternative providers
4. **Provider-Specific Formatting**: Handles different API formats automatically
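
As a rough illustration of the model-mapping step, the sketch below uses a small dictionary in place of `models.yml`. The dictionary contents and the `provider_order` helper are assumptions made for this example; the real file's schema and v-router's lookup logic may differ.

```python
# Hypothetical stand-in for models.yml: model name -> providers that offer it.
MODEL_PROVIDERS = {
    "claude-opus-4": ["anthropic", "vertexai"],
}


def provider_order(model_name, preferred):
    """Return the preferred provider first, then any other known providers."""
    known = MODEL_PROVIDERS.get(model_name, [])
    return [preferred] + [p for p in known if p != preferred]


print(provider_order("claude-opus-4", "vertexai"))
```

With this ordering, a failure on the preferred provider leads to a retry on the next provider in the list, which is the behaviour the steps above describe.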

In [8]:
llm_config = LLM(
    model_name="claude-opus-4",
    provider="vertexai",  # Try Vertex AI first (may not be configured)
    max_tokens=100,
    try_other_providers=True  # Enable cross-provider fallback
)

client = Client(llm_config)

response = await client.messages.create(
    messages=[
        {"role": "system", "content": "You are a friendly assistant."},
        {"role": "user", "content": "Tell me a short joke."}
    ]
)
    
print(f"Response: {response.content[0].text}")
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")
print(f"\n💡 If Vertex AI failed, it automatically tried Anthropic!")

2025-05-30 15:02:12,859 - v_router.router - INFO - Trying primary model: claude-opus-4 on vertexai
2025-05-30 15:02:15,060 - v_router.router - INFO - Trying alternative provider: claude-opus-4 on anthropic


Response: Why don't scientists trust atoms?

Because they make up everything!
Model: claude-opus-4-20250514
Provider: anthropic

💡 If Vertex AI failed, it automatically tried Anthropic!


## Testing Different Providers

Let's test the same request across different providers to show the unified interface:

In [5]:
# Test message for all providers
test_messages = [
    {"role": "user", "content": "Explain machine learning in one sentence."}
]

# Anthropic Claude
anthropic_llm = LLM(
    model_name="claude-sonnet-4",
    provider="anthropic",
    max_tokens=100
)
anthropic_client = Client(anthropic_llm)
anthropic_response = await anthropic_client.messages.create(messages=test_messages)

print("=== Anthropic Claude ===")
print(f"Model: {anthropic_response.model}")
print(f"Provider: {anthropic_response.provider}")
print(f"Response: {anthropic_response.content[0].text}")
print(f"Tokens: {anthropic_response.usage.input_tokens} in, {anthropic_response.usage.output_tokens} out")

# OpenAI GPT
openai_llm = LLM(
    model_name="gpt-4",
    provider="openai",
    max_tokens=100
)
openai_client = Client(openai_llm)
openai_response = await openai_client.messages.create(messages=test_messages)

print("\n=== OpenAI GPT ===")
print(f"Model: {openai_response.model}")
print(f"Provider: {openai_response.provider}")
print(f"Response: {openai_response.content[0].text}")
print(f"Tokens: {openai_response.usage.input_tokens} in, {openai_response.usage.output_tokens} out")

# Google Gemini
google_llm = LLM(
    model_name="gemini-1.5-pro",
    provider="google",
    max_tokens=100
)
google_client = Client(google_llm)
google_response = await google_client.messages.create(messages=test_messages)

print("\n=== Google Gemini ===")
print(f"Model: {google_response.model}")
print(f"Provider: {google_response.provider}")
print(f"Response: {google_response.content[0].text}")
print(f"Tokens: {google_response.usage.input_tokens} in, {google_response.usage.output_tokens} out")

print("\n✅ Notice: Same API, same response format, different providers!")

2025-05-30 14:46:22,558 - v_router.router - INFO - Trying primary model: claude-sonnet-4 on anthropic
2025-05-30 14:46:24,481 - v_router.router - INFO - Trying primary model: gpt-4 on openai


=== Anthropic Claude ===
Model: claude-sonnet-4-20250514
Provider: anthropic
Response: Machine learning is a method of teaching computers to recognize patterns and make predictions from data without being explicitly programmed for each specific task.
Tokens: 15 in, 29 out


2025-05-30 14:46:26,207 - v_router.router - INFO - Trying primary model: gemini-1.5-pro on google



=== OpenAI GPT ===
Model: gpt-4-0613
Provider: openai
Response: Machine learning is a subset of artificial intelligence that enables computers to learn and make decisions from data without being explicitly programmed.
Tokens: 15 in, 23 out

=== Google Gemini ===
Model: gemini-1.5-pro
Provider: google
Response: Machine learning is the process of enabling computers to learn from data without explicit programming.

Tokens: 7 in, 17 out

✅ Notice: Same API, same response format, different providers!
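
Because every response exposes the same attributes, the three near-identical print blocks in the cell above could be folded into one helper. The sketch below assumes only the attributes already used in this notebook (`model`, `provider`, `content[0].text`, `usage`); `summarize` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in so the block runs without any provider credentials.

```python
from types import SimpleNamespace


def summarize(response):
    """Format any object that follows the unified response shape."""
    return (
        f"Model: {response.model}\n"
        f"Provider: {response.provider}\n"
        f"Response: {response.content[0].text}\n"
        f"Tokens: {response.usage.input_tokens} in, "
        f"{response.usage.output_tokens} out"
    )


# Stand-in object with the same attribute layout as a real response.
stub = SimpleNamespace(
    model="gpt-4-0613",
    provider="openai",
    content=[SimpleNamespace(text="placeholder text")],
    usage=SimpleNamespace(input_tokens=15, output_tokens=23),
)
print(summarize(stub))
```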


## Advanced Configuration

You can configure various parameters for fine-tuned control:

In [6]:
# Advanced LLM configuration
advanced_llm = LLM(
    model_name="claude-sonnet-4",
    provider="anthropic",
    max_tokens=200,
    temperature=0.7,  # More creative responses
    top_p=0.9,        # Nucleus sampling
    try_other_providers=True,
    backup_models=[
        BackupModel(
            model=LLM(
                model_name="gpt-4o",
                provider="openai",
                temperature=0.7  # Same temperature for consistency
            ),
            priority=1
        )
    ]
)

client = Client(advanced_llm)

response = await client.messages.create(
    messages=[
        {"role": "system", "content": "You are a creative writing assistant."},
        {"role": "user", "content": "Write a creative opening line for a sci-fi story."}
    ]
)

print(f"Creative Response: {response.content[0].text}")
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")
print(f"Usage: {response.usage.input_tokens} + {response.usage.output_tokens} = {response.usage.input_tokens + response.usage.output_tokens} tokens")

2025-05-30 14:46:26,801 - v_router.router - INFO - Trying primary model: claude-sonnet-4 on anthropic


Creative Response: The day humanity received its eviction notice from Earth, it arrived not as a dramatic alien invasion, but as a polite holographic memo that materialized in every living room at exactly 3:47 PM, universal time.
Model: claude-sonnet-4-20250514
Provider: anthropic
Usage: 26 + 50 = 76 tokens


## Summary

### Key Features Demonstrated:

✅ **Unified Interface**: Same API across the providers exercised above (Anthropic, OpenAI, Google); Azure is listed as supported but not shown here  
✅ **Automatic Fallback**: Seamless switching between models when primary fails  
✅ **Cross-Provider Support**: Try the same model on different providers automatically  
✅ **Unified Response Format**: Consistent response structure regardless of provider  
✅ **Flexible Configuration**: Control temperature, tokens, and other parameters  

### Request Models:
- **`LLM`**: Primary configuration (model, provider, parameters)
- **`BackupModel`**: Fallback configuration with priority
- **`Client`**: Main interface for sending requests

### Response Models:
- **`Response`**: Unified response with content, usage, model info
- **`Content`**: Text content blocks from the model
- **`Usage`**: Token usage information

### Next Steps:
- Check out `quickstart_tool_calling.ipynb` to learn about function calling across providers
- Explore the `models.yml` configuration for advanced model mapping
- See the full documentation for more advanced features

v-router exposes one request and response interface across the providers shown here, so switching models or adding fallbacks is a configuration change rather than a code change.