# Lecture 1: OpenRouter API Testing

**Goal**: Explore different LLMs through the OpenRouter API:

1. Check account credit balance
2. List available models with pricing
3. Compare outputs from Claude vs free/cheap models
4. Understand API request/response patterns
5. Observe differences in model capabilities

## Setup
This notebook requires:
- `OPENROUTER_API_KEY` (enter directly in Cell 1)
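
Pasting a key into a cell makes it easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that reads the key from the environment first, assuming it has been exported as `OPENROUTER_API_KEY`. The helper name `load_api_key` is hypothetical and is not part of `openrouter_utils.py`.

```python
import os

def load_api_key(env_var: str = "OPENROUTER_API_KEY", fallback: str = "") -> str:
    """Return the API key from the environment, or the pasted fallback value."""
    # Prefer the environment variable; fall back to a value pasted in the notebook.
    key = os.environ.get(env_var, "").strip() or fallback.strip()
    if not key:
        raise RuntimeError(f"Set {env_var} in your environment or paste a key.")
    return key
```

With this in place, Cell 1 could use `OPENROUTER_API_KEY = load_api_key()` instead of a hard-coded string.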



In [None]:
# Add current directory to Python path for imports
import sys
from pathlib import Path
sys.path.insert(0, str(Path.cwd()))


import json
from typing import Any, Dict, List
import pandas as pd

from openrouter_utils import (
    check_credits,
    print_remaining_credits,
    list_models,
    chat_completion,
    safe_chat,
    display_comparison
)

OPENROUTER_API_KEY = ""  # Paste your key here

# Models to test
MODELS = {
    "claude": "anthropic/claude-3.5-sonnet",
    "google-free": "google/gemma-3n-e2b-it:free",
    "qwen-free": "qwen/qwen3-4b:free",
}

print("Imports loaded")
print(f"Testing {len(MODELS)} models:", list(MODELS.keys()))

if not OPENROUTER_API_KEY.strip():
    raise RuntimeError(
        "⚠️  Please set OPENROUTER_API_KEY above before running this notebook.\n"
        "Get your key from: https://openrouter.ai/keys"
    )

print("✓ API key configured")

✓ API key configured


In [4]:
print_remaining_credits(OPENROUTER_API_KEY)

💳 API Key Credit Balance:
   Key limit:    $15.00
   Key usage:    $0.05
   Remaining:    $14.95
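
The remaining balance shown above is the key limit minus the key usage. A minimal sketch of that arithmetic follows. It assumes the credits payload exposes a limit and a usage value in USD; the helper name `remaining_credits` is hypothetical and is not the actual `openrouter_utils` implementation. A limit of `None` is treated here as a key with no cap.

```python
from typing import Optional

def remaining_credits(limit: Optional[float], usage: float) -> Optional[float]:
    """Return limit - usage in USD, or None when the key has no limit."""
    if limit is None:
        return None
    # Round to cents to avoid floating-point noise in the displayed value.
    return round(limit - usage, 2)

print(remaining_credits(15.00, 0.05))  # 14.95
```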


In [4]:
# list_models is imported from openrouter_utils.py.
# Note: limit=100 returns only part of the catalog, so the test models may not appear below.
models_list = list_models(OPENROUTER_API_KEY, limit=100)

# Parse into DataFrame for easy viewing
models_df = pd.DataFrame([
    {
        "id": m.get("id", ""),
        "name": m.get("name", ""),
        "prompt_cost": m.get("pricing", {}).get("prompt", "N/A"),
        "completion_cost": m.get("pricing", {}).get("completion", "N/A"),
        "context_length": m.get("context_length", "N/A"),
    }
    for m in models_list
])

print(f"Found {len(models_df)} models")
print("\nOur test models:")
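for key, model_id in MODELS.items():
    match = models_df[models_df["id"] == model_id]
    if not match.empty:
        row = match.iloc[0]
        # Pricing is reported in USD per token; scale it to USD per 1M tokens
        per_million = pd.to_numeric(row["prompt_cost"], errors="coerce") * 1_000_000
        print(f"  {key:10s} → ${per_million:.2f}/1M prompt tokens")
    else:
        print(f"  {key:10s} → {model_id} (not found in list)")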

# Display sample of all models
models_df.head(20)

Found 100 models

Our test models:
  claude     → anthropic/claude-3.5-sonnet (not found in list)
  google-free → google/gemma-3n-e2b-it:free (not found in list)
  qwen-free  → qwen/qwen3-4b:free (not found in list)


| | id | name | prompt_cost | completion_cost | context_length |
|---|---|---|---|---|---|
| 0 | allenai/molmo-2-8b:free | AllenAI: Molmo2 8B (free) | 0.0 | 0.0 | 36864 |
| 1 | allenai/olmo-3.1-32b-instruct | AllenAI: Olmo 3.1 32B Instruct | 2e-07 | 6e-07 | 65536 |
| 2 | bytedance-seed/seed-1.6-flash | ByteDance Seed: Seed 1.6 Flash | 7.5e-08 | 3e-07 | 262144 |
| 3 | bytedance-seed/seed-1.6 | ByteDance Seed: Seed 1.6 | 2.5e-07 | 2e-06 | 262144 |
| 4 | minimax/minimax-m2.1 | MiniMax: MiniMax M2.1 | 2.8e-07 | 1.2e-06 | 196608 |
| 5 | z-ai/glm-4.7 | Z.AI: GLM 4.7 | 4e-07 | 1.5e-06 | 202752 |
| 6 | google/gemini-3-flash-preview | Google: Gemini 3 Flash Preview | 5e-07 | 3e-06 | 1048576 |
| 7 | mistralai/mistral-small-creative | Mistral: Mistral Small Creative | 1e-07 | 3e-07 | 32768 |
| 8 | allenai/olmo-3.1-32b-think | AllenAI: Olmo 3.1 32B Think | 1.5e-07 | 5e-07 | 65536 |
| 9 | xiaomi/mimo-v2-flash:free | Xiaomi: MiMo-V2-Flash (free) | 0.0 | 0.0 | 262144 |
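
The prices in the table are quoted in USD per token, which is why they look so small. The short sketch below converts them to USD per 1M tokens and picks out the free models, using three rows copied from the table above. The helper name `price_per_million` is hypothetical.

```python
def price_per_million(per_token_price) -> float:
    """Convert a per-token USD price (str or float) to USD per 1M tokens."""
    return float(per_token_price) * 1_000_000

# Three rows copied from the model table above.
sample_models = [
    {"id": "allenai/molmo-2-8b:free", "prompt": 0.0, "completion": 0.0},
    {"id": "allenai/olmo-3.1-32b-instruct", "prompt": 2e-07, "completion": 6e-07},
    {"id": "google/gemini-3-flash-preview", "prompt": 5e-07, "completion": 3e-06},
]

# A model counts as free here when both prompt and completion prices are zero.
free_ids = [
    m["id"] for m in sample_models
    if float(m["prompt"]) == 0.0 and float(m["completion"]) == 0.0
]

for m in sample_models:
    print(
        f"{m['id']}: ${price_per_million(m['prompt']):.2f} prompt, "
        f"${price_per_million(m['completion']):.2f} completion per 1M tokens"
    )
```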


In [5]:
# chat_completion is imported from openrouter_utils.py
# Usage: chat_completion(OPENROUTER_API_KEY, model, messages, temperature=0.7, max_tokens=500)
print("✓ Helper functions imported from openrouter_utils.py")

✓ Helper functions imported from openrouter_utils.py
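
For readers without `openrouter_utils.py` at hand, here is a minimal sketch of what a helper like `chat_completion` might do. It is not the actual implementation. It assumes OpenRouter's OpenAI-compatible `POST https://openrouter.ai/api/v1/chat/completions` endpoint with a bearer token, and the names `build_chat_request` and `parse_chat_response` are hypothetical. The returned dict mirrors the `content`, `usage`, and `error` keys used later in this notebook.

```python
import json
import urllib.request
from typing import Any, Dict, List

CHAT_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(api_key: str, model: str, messages: List[Dict[str, str]],
                       temperature: float = 0.7,
                       max_tokens: int = 500) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {"model": model, "messages": messages,
               "temperature": temperature, "max_tokens": max_tokens}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def parse_chat_response(body: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten an OpenAI-style response body into content / usage / error."""
    if "error" in body:
        return {"content": None, "usage": {}, "error": str(body["error"])}
    try:
        content = body["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        return {"content": None, "usage": {},
                "error": f"Unexpected response shape: {exc!r}"}
    return {"content": content, "usage": body.get("usage", {}), "error": None}
```

To send the request, pass the result of `build_chat_request` to `urllib.request.urlopen(request, timeout=30)` and feed `json.load(response)` into `parse_chat_response`. Keeping request construction and response parsing separate makes both easy to check without spending credits.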


In [6]:
# Define test prompts that showcase model differences

PROMPTS = [
    {
        "name": "Factual Q&A",
        "messages": [
            {
                "role": "user",
                "content": "Explain what an API is in 2-3 sentences suitable for beginners."
            }
        ]
    },
    {
        "name": "Reasoning Task",
        "messages": [
            {
                "role": "user",
                "content": "If I have 3 apples and buy 2 more, then give away half, how many do I have left? Show your reasoning."
            }
        ]
    },
    {
        "name": "Code Explanation",
        "messages": [
            {
                "role": "user",
                "content": "Explain what this Python code does: `[x**2 for x in range(5)]`"
            }
        ]
    },
    {
        "name": "Creative Writing",
        "messages": [
            {
                "role": "user",
                "content": "Write a one-sentence story about a robot learning to code."
            }
        ]
    }
]

print(f"Defined {len(PROMPTS)} test prompts:")
for p in PROMPTS:
    print(f"  - {p['name']}")

Defined 4 test prompts:
  - Factual Q&A
  - Reasoning Task
  - Code Explanation
  - Creative Writing


In [7]:
# Run all prompts through all models

results = []

for prompt_obj in PROMPTS:
    prompt_name = prompt_obj["name"]
    messages = prompt_obj["messages"]
    
    print(f"\n{'='*60}")
    print(f"Prompt: {prompt_name}")
    print(f"{'='*60}")
    
    for model_key, model_id in MODELS.items():
        print(f"\nTesting {model_key}...")
        
        result = chat_completion(
            OPENROUTER_API_KEY,
            model_id,
            messages,
            temperature=0.7,
            max_tokens=500
        )
        
        results.append({
            "prompt": prompt_name,
            "model_key": model_key,
            "model_id": model_id,
            **result
        })
        
        # Display output
        if result["error"]:
            print(f"  ❌ Error: {result['error']}")
        else:
            # Print the full response text (no truncation)
            print(f"  ✓ Response: {result['content']}")

            usage = result.get("usage", {})
            if usage:
                print(f"    Tokens: {usage.get('prompt_tokens', 0)} prompt + {usage.get('completion_tokens', 0)} completion")

print("\n✓ All tests complete")


Prompt: Factual Q&A

Testing claude...
  ✓ Response: An API (Application Programming Interface) is like a waiter at a restaurant - it takes requests and returns what you asked for. It allows different software applications to communicate with each other by providing a set of rules and tools for exchanging data and functionality.
    Tokens: 25 prompt + 54 completion

Testing google-free...
  ✓ Response: An API (Application Programming Interface) is like a menu for software. It allows different software programs to talk to each other and share information without needing to know the complicated details of how each program works.  Think of it as a set of rules that defines how two applications can request and exchange data.




    Tokens: 16 prompt + 60 completion

Testing qwen-free...
  ✓ Response: An API (Application Programming Interface) is a set of rules that allows different software systems to communicate and share data. It acts like a bridge, letting one program request i

In [8]:
# Convert results to DataFrame for analysis

results_df = pd.DataFrame(results)

# Add computed fields
results_df["response_length"] = results_df["content"].str.len()
results_df["has_error"] = results_df["error"].notna()
results_df["total_tokens"] = results_df["usage"].apply(
    lambda u: u.get("total_tokens", 0) if isinstance(u, dict) else 0
)

print(f"Collected {len(results_df)} responses")
print(f"Errors: {results_df['has_error'].sum()}")

# Summary by model
summary = results_df.groupby("model_key").agg({
    "has_error": "sum",
    "response_length": "mean",
    "total_tokens": "sum"
}).round(1)

summary.columns = ["Errors", "Avg Response Length", "Total Tokens Used"]
print("\nSummary by Model:")
print(summary)

results_df.head()

Collected 12 responses
Errors: 0

Summary by Model:
             Errors  Avg Response Length  Total Tokens Used
model_key                                                  
claude            0                294.2                482
google-free       0                587.0                670
qwen-free         0                342.8               4588


| | prompt | model_key | model_id | model | content | usage | error | response_length | has_error | total_tokens |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Factual Q&A | claude | anthropic/claude-3.5-sonnet | anthropic/claude-3.5-sonnet | An API (Application Programming Interface) is ... | {'prompt_tokens': 25, 'completion_tokens': 54,... | | 277 | False | 79 |
| 1 | Factual Q&A | google-free | google/gemma-3n-e2b-it:free | google/gemma-3n-e2b-it:free | An API (Application Programming Interface) is ... | {'prompt_tokens': 16, 'completion_tokens': 60,... | | 328 | False | 76 |
| 2 | Factual Q&A | qwen-free | qwen/qwen3-4b:free | qwen/qwen3-4b:free | An API (Application Programming Interface) is ... | {'prompt_tokens': 723, 'completion_tokens': 30... | | 401 | False | 1027 |
| 3 | Reasoning Task | claude | anthropic/claude-3.5-sonnet | anthropic/claude-3.5-sonnet | Let me solve this step by step:\n\n1. Initial ... | {'prompt_tokens': 38, 'completion_tokens': 85,... | | 178 | False | 123 |
| 4 | Reasoning Task | google-free | google/gemma-3n-e2b-it:free | google/gemma-3n-e2b-it:free | Here's the breakdown:\n\n1. **Start:** You beg... | {'prompt_tokens': 29, 'completion_tokens': 193... | | 739 | False | 222 |


In [9]:
# Function is now imported from openrouter_utils.py
# Display all comparisons
for prompt_obj in PROMPTS:
    display_comparison(results_df, prompt_obj["name"])


Prompt: Factual Q&A

[CLAUDE] (anthropic/claude-3.5-sonnet)
----------------------------------------------------------------------
An API (Application Programming Interface) is like a waiter at a restaurant - it takes requests and returns what you asked for. It allows different software applications to communicate with each other by providing a set of rules and tools for exchanging data and functionality.

Tokens: 79 total


[GOOGLE-FREE] (google/gemma-3n-e2b-it:free)
----------------------------------------------------------------------
An API (Application Programming Interface) is like a menu for software. It allows different software programs to talk to each other and share information without needing to know the complicated details of how each program works.  Think of it as a set of rules that defines how two applications can request and exchange data.





Tokens: 76 total


[QWEN-FREE] (qwen/qwen3-4b:free)
----------------------------------------------------------------------
An

In [10]:
# Demonstrate robust error handling pattern
# Function is now imported from openrouter_utils.py

# Test with a simple prompt
test_result = safe_chat(OPENROUTER_API_KEY, MODELS["claude"], "What is machine learning?")

if test_result["error"]:
    print(f"❌ Failed after retries: {test_result['error']}")
else:
    print("✓ Success!")
    print(f"Response: {test_result['content'][:200]}...")

✓ Success!
Response: Machine learning is a branch of artificial intelligence (AI) that focuses on developing computer systems that can learn and improve from experience without being explicitly programmed. It uses algorit...
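
The internals of `safe_chat` are not shown in this notebook. As a rough sketch of the retry pattern it demonstrates, the wrapper below retries any callable with exponential backoff. `with_retries` and its parameters are hypothetical names, and a real implementation would likely retry only on transient failures such as timeouts or HTTP 429/5xx responses.

```python
import time
from typing import Any, Callable, Dict

def with_retries(call: Callable[[], Any], max_attempts: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Run call(); on exception, wait base_delay * 2**attempt and try again."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return {"content": call(), "error": None}
        except Exception as exc:  # deliberately broad for this sketch
            last_error = f"{type(exc).__name__}: {exc}"
            # Do not sleep after the final failed attempt.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return {"content": None, "error": last_error}
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting. Returning a dict with an `error` key, rather than raising, matches how `test_result["error"]` is inspected in the cell above.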


## Summary

You've now:
1. ✅ Checked OpenRouter credit balance
2. ✅ Listed available models with pricing
3. ✅ Compared Claude vs. free models across different task types
4. ✅ Learned API request/response patterns
5. ✅ Observed quality vs. cost tradeoffs

## Key Takeaways

- **API Structure**: All requests follow the same pattern: an authenticated call to a dedicated endpoint (credits, models, chat)
- **Model Selection**: Balance cost vs. quality based on use case
- **Error Handling**: Always wrap API calls with try/except and implement retries
- **Token Usage**: Monitor usage to control costs

## Next Steps

- Try different temperature values (0.0 = near-deterministic, 1.0 = more varied and creative)
- Test with longer, more complex prompts
- Experiment with system messages to guide behavior
- Implement token counting for cost estimation
- Add streaming responses for real-time feedback
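
For the cost-estimation step, a minimal sketch: multiply the token counts from a response's `usage` dict by per-token prices. The `estimate_cost` helper is hypothetical, and the prices in the example are placeholders rather than real rates for any model.

```python
from typing import Dict

def estimate_cost(usage: Dict[str, int], prompt_price: float,
                  completion_price: float) -> float:
    """Estimate USD cost from token counts and per-token USD prices."""
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    return prompt_tokens * prompt_price + completion_tokens * completion_price

# Token counts from the first Claude response above; prices are placeholders.
usage = {"prompt_tokens": 25, "completion_tokens": 54}
cost = estimate_cost(usage, prompt_price=1e-06, completion_price=2e-06)
print(f"Estimated cost: ${cost:.6f}")
```

In practice the per-token prices would come from the `pricing` fields returned by the model listing, converted to floats.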

## Resources

- [OpenRouter Docs](https://openrouter.ai/docs)
- [Model Rankings](https://openrouter.ai/rankings)
- [Pricing Calculator](https://openrouter.ai/pricing)