# EloRouter - Inference

This notebook demonstrates how to use **EloRouter** for inference.

## Overview

EloRouter selects an LLM by its Elo rating. It is an inference-only router:
it requires no training and relies on Elo ratings pre-computed from benchmark data.

**Key Features**:
- No training required
- Uses the established Elo rating system
- Simple and interpretable
- Based on pairwise comparisons

**Note**: EloRouter always selects the highest-rated LLM based on Elo scores.

## 1. Environment Setup

In [None]:
import os
import sys
from pathlib import Path

PROJECT_ROOT = Path(os.getcwd()).parent.parent
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

os.chdir(PROJECT_ROOT)
print(f"Working directory: {os.getcwd()}")

In [None]:
from llmrouter.models.elorouter import EloRouter
from llmrouter.utils import setup_environment

setup_environment()
print("Environment setup complete!")

## 2. Configuration

In [None]:
import yaml

CONFIG_PATH = "configs/model_config_train/elorouter.yaml"

with open(CONFIG_PATH, 'r') as f:
    config = yaml.safe_load(f)

print("Current Configuration:")
print("=" * 50)
print(yaml.dump(config, default_flow_style=False))

## 3. Load Router

In [None]:
router = EloRouter(yaml_path=CONFIG_PATH)

print("Router initialized successfully!")
print(f"Number of LLM candidates: {len(router.llm_data)}")
print(f"LLM candidates: {list(router.llm_data.keys())}")

## 4. Understanding Elo Ratings

The Elo rating system in brief:
- A higher rating indicates a stronger model
- Ratings are updated from the outcomes of pairwise comparisons
- Used by the Chatbot Arena leaderboard
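
To make the update rule concrete, here is a minimal sketch of the textbook Elo formulas. The scale factor of 400 and `k=32` are conventional defaults and are assumptions here; they are not necessarily the values used to produce the ratings EloRouter loads, and `expected_score` and `update_ratings` are illustrative helpers, not part of llmrouter.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    score_a is 1.0 if A wins, 0.5 for a tie, and 0.0 if A loses.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two hypothetical models start level; A wins one comparison.
new_a, new_b = update_ratings(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)  # 1516.0 1484.0
```

A win against an equally rated opponent moves each rating by half of `k`; a win against a much weaker opponent moves them far less, because the expected score was already close to 1.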

In [None]:
# Display Elo ratings for each LLM
print("LLM Elo Ratings:")
print("=" * 50)

# Note: Elo ratings should be available in the router or LLM data
if hasattr(router, 'elo_ratings'):
    for model, rating in sorted(router.elo_ratings.items(), key=lambda x: x[1], reverse=True):
        print(f"  {model:30} {rating}")
else:
    print("Elo ratings are computed from benchmark data.")
    print("The router will select the model with highest average performance.")

## 5. Query Routing

In [None]:
EXAMPLE_QUERIES = [
    {"query": "What is the capital of France?"},
    {"query": "Solve the equation: 2x + 5 = 15"},
    {"query": "Write a Python function to check if a number is prime."},
    {"query": "Explain quantum computing in simple terms."},
]

print("Routing Results:")
print("=" * 60)

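for i, query in enumerate(EXAMPLE_QUERIES, 1):
    result = router.route_single(query)
    text = query['query']
    # Only append an ellipsis when the query was actually truncated
    suffix = "..." if len(text) > 50 else ""
    print(f"{i}. {text[:50]}{suffix}")
    print(f"   Routed to: {result['model_name']}")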

## 6. Note on EloRouter Behavior

EloRouter is a **static** router: it always selects the same LLM (the highest-rated one)
regardless of the query content. This is useful:
- As a baseline for comparison
- When you want the overall best model
- When query-specific routing isn't necessary
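
Conceptually, static selection reduces to taking the maximum over a rating table. The following is a minimal sketch under that assumption; the model names, the numbers, and the helpers `select_top_rated` and `route_static` are hypothetical and do not reflect EloRouter's internals.

```python
# Hypothetical ratings, made up purely for illustration.
ratings = {"model-a": 1210.0, "model-b": 1175.0, "model-c": 1098.0}


def select_top_rated(ratings: dict) -> str:
    """Return the name of the model with the highest rating."""
    return max(ratings, key=ratings.get)


def route_static(query: dict, ratings: dict) -> dict:
    """Route a query without inspecting its content."""
    return {"query": query["query"], "model_name": select_top_rated(ratings)}


print(route_static({"query": "What is the capital of France?"}, ratings)["model_name"])
print(route_static({"query": "Solve 2x + 5 = 15"}, ratings)["model_name"])
```

Both calls print the same model name, which is the behavior the next cell checks on the real router.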

In [None]:
# Verify static behavior
results = [router.route_single(q)['model_name'] for q in EXAMPLE_QUERIES]
all_same = len(set(results)) == 1

print(f"All queries routed to same model: {all_same}")
if all_same:
    print(f"Selected model: {results[0]}")

## 7. File-Based Inference

Load queries from a file and save results.

In [None]:
import json

# Load queries from a JSONL file
def load_queries_from_file(file_path):
    """Load queries from a JSONL file."""
    queries = []
    with open(file_path, 'r', encoding='utf-8') as f:
        for line in f:
            if line.strip():
                queries.append(json.loads(line))
    return queries

# Save results to a JSONL file
def save_results_to_file(results, output_path):
    """Save routing results to a JSONL file."""
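    # Create the parent directory only if the path has one;
    # os.path.dirname returns '' for a bare filename and os.makedirs('') raises.
    out_dir = os.path.dirname(output_path)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)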
    with open(output_path, 'w', encoding='utf-8') as f:
        for result in results:
            f.write(json.dumps(result, ensure_ascii=False) + '\n')
    print(f"Results saved to: {output_path}")

# Example: Load from default query file
QUERY_FILE = "data/example_data/query_data/default_query_test.jsonl"
OUTPUT_FILE = "outputs/elorouter_results.jsonl"

if os.path.exists(QUERY_FILE):
    # Load queries
    file_queries = load_queries_from_file(QUERY_FILE)
    print(f"Loaded {len(file_queries)} queries from: {QUERY_FILE}")
    
    # Route queries
    file_results = router.route_batch(batch=file_queries[:10])
    print(f"Routed {len(file_results)} queries")
    
    # Save results
    save_results_to_file(file_results, OUTPUT_FILE)
    
    # Show sample results
    print("\nSample results:")
    for i, result in enumerate(file_results[:3], 1):
        print(f"  {i}. {result.get('query', '')[:40]}... -> {result['model_name']}")
else:
    print(f"Query file not found: {QUERY_FILE}")
    print("Create a JSONL file with format: {\"query\": \"Your question\"}")
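
If the query file is missing, a small one can be created by hand. The sketch below writes one JSON object per line in the `{"query": ...}` format described above. The helper `write_sample_queries` and the temporary output location are assumptions for illustration; they are not part of llmrouter.

```python
import json
import tempfile
from pathlib import Path


def write_sample_queries(path, questions):
    """Write each question as a {"query": ...} object on its own line (JSONL)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for question in questions:
            f.write(json.dumps({"query": question}, ensure_ascii=False) + "\n")
    return path


# Write to a temporary directory so the example does not touch project files.
sample_path = Path(tempfile.mkdtemp()) / "sample_queries.jsonl"
write_sample_queries(
    sample_path,
    ["What is the capital of France?", "Explain quantum computing in simple terms."],
)
print(sample_path.read_text(encoding="utf-8"))
```

Pointing `QUERY_FILE` at a file written this way should let the cell above load it with `load_queries_from_file`.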

## Summary

In this notebook, we:

1. **Loaded EloRouter**: No training required
2. **Reviewed the Elo System**: Rating-based model selection
3. **Performed Routing**: Static selection of the best-rated model
4. **Ran File-Based Inference**: Loaded queries from JSONL and saved the routing results

**Key Takeaways**:
- EloRouter needs no training, which makes it one of the simplest routers to set up
- Always selects the highest-rated model
- Useful as a baseline or when simplicity is preferred

**When to use EloRouter**:
- As a baseline for comparison
- When you always want the "best" model
- When you cannot afford to train or run a learned router

**When NOT to use EloRouter**:
- When query-specific routing is important
- When cost optimization is needed
- When different queries need different model capabilities