# SVMRouter - Training

This notebook demonstrates how to train the **SVMRouter** (Support Vector Machine Router).

## Overview

SVMRouter uses a Support Vector Machine classifier to route queries to the most suitable LLM based on:
- Query embeddings (using Longformer)
- Historical performance data

**Key Features**:
- Effective in high-dimensional spaces
- Works well when classes are separated by a clear margin
- Supports probability estimation
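
To make the routing idea concrete, here is a minimal sketch that uses scikit-learn's `SVC` directly on synthetic data. The random 16-dimensional vectors stand in for Longformer query embeddings, and the labels `llm_a` and `llm_b` are hypothetical names for the LLM that performed best on each query. This is an illustration of the general approach, not the SVMRouter implementation.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for query embeddings: two well-separated clusters.
emb_a = rng.normal(loc=-2.0, scale=0.5, size=(40, 16))
emb_b = rng.normal(loc=2.0, scale=0.5, size=(40, 16))
X_demo = np.vstack([emb_a, emb_b])

# Hypothetical labels: the LLM that historically performed best on each query.
y_demo = np.array(["llm_a"] * 40 + ["llm_b"] * 40)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_demo, y_demo)

# Route a new query by classifying its embedding.
new_query = np.full((1, 16), 2.0)
routed_to = clf.predict(new_query)[0]
print(f"Routed to: {routed_to}")
```

Routing then reduces to a single `predict` call on the embedding of the incoming query.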

## 1. Environment Setup

In [None]:
# Install required packages (for Colab)
# !pip install llmrouter scikit-learn transformers torch

In [None]:
import os
import sys
from pathlib import Path

# Set project root (assumes this notebook sits two directories below it)
PROJECT_ROOT = Path(os.getcwd()).parent.parent
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

os.chdir(PROJECT_ROOT)
print(f"Working directory: {os.getcwd()}")

In [None]:
# Import required modules
from llmrouter.models.svmrouter import SVMRouter, SVMRouterTrainer
from llmrouter.utils import setup_environment

setup_environment()
print("Environment setup complete!")

## 2. Configuration

SVMRouter uses the following configuration parameters:

| Parameter | Description | Default |
|-----------|-------------|--------|
| `kernel` | Kernel type: "rbf", "linear", "poly", "sigmoid" | "rbf" |
| `C` | Regularization parameter | 1.0 |
| `gamma` | Kernel coefficient | "scale" |
| `probability` | Enable probability estimates | true |
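
The parameter names in the table match the constructor arguments of scikit-learn's `SVC`, and later cells call `get_params()` and read `n_support_` on the model, so it seems reasonable to assume the router wraps an `SVC`. Under that assumption, the defaults correspond roughly to the following. The `svm_params` dict is a hypothetical stand-in for the parsed values, not the actual config schema.

```python
from sklearn.svm import SVC

# Hypothetical dict mirroring the defaults in the table above.
svm_params = {
    "kernel": "rbf",
    "C": 1.0,
    "gamma": "scale",
    "probability": True,
}

demo_svm = SVC(**svm_params)
print(demo_svm.get_params())
```

With `gamma="scale"`, scikit-learn computes the coefficient as `1 / (n_features * X.var())`. Setting `probability=True` enables `predict_proba` but slows training, because scikit-learn runs an internal 5-fold cross-validation to calibrate the probabilities.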

In [None]:
import yaml

# Configuration file path
CONFIG_PATH = "configs/model_config_train/svmrouter.yaml"

# Load and display configuration
with open(CONFIG_PATH, 'r') as f:
    config = yaml.safe_load(f)

print("Current Configuration:")
print("=" * 50)
print(yaml.dump(config, default_flow_style=False))

## 3. Initialize Router

In [None]:
# Initialize SVMRouter with configuration
router = SVMRouter(yaml_path=CONFIG_PATH)

print("Router initialized successfully!")
print(f"Number of training samples: {len(router.routing_data_train)}")
print(f"Number of LLM candidates: {len(router.llm_data)}")
print(f"LLM candidates: {list(router.llm_data.keys())}")

In [None]:
# Inspect the SVM model configuration
print("SVM Model Parameters:")
print(router.svm_model.get_params())

## 4. Training

In [None]:
# Initialize trainer
trainer = SVMRouterTrainer(router=router, device='cpu')

print("Trainer initialized!")
print(f"Training samples: {len(trainer.query_embedding_list)}")
print(f"Save path: {trainer.save_model_path}")

In [None]:
# Train the model
print("Starting training...")
print("=" * 50)

trainer.train()

print("=" * 50)
print("Training completed!")

## 5. Model Verification

In [None]:
# Verify the trained model
from llmrouter.utils import load_model
import numpy as np

# Load the saved model
saved_model = load_model(trainer.save_model_path)

print("Model loaded successfully!")
print(f"Model type: {type(saved_model).__name__}")
print(f"Support vectors per class: {saved_model.n_support_}")
print(f"Classes: {saved_model.classes_}")

In [None]:
# Quick prediction test
test_embedding = trainer.query_embedding_list[0].reshape(1, -1)
prediction = saved_model.predict(test_embedding)

print(f"Test prediction: {prediction[0]}")

# Get prediction probabilities
proba = saved_model.predict_proba(test_embedding)
print("\nPrediction probabilities:")
for model, prob in zip(saved_model.classes_, proba[0]):
    print(f"  {model}: {prob:.4f}")
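
The probabilities can also be used to rank candidates rather than pick only the top one. The sketch below fits a small `SVC` on synthetic data (the three `llm_*` labels are hypothetical) and sorts the classes by predicted probability. Note that scikit-learn documents that `predict_proba` may be inconsistent with `predict` for `SVC`, because the probabilities come from a separately cross-validated calibration step.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Three synthetic clusters standing in for query embeddings.
centers = {"llm_a": -3.0, "llm_b": 0.0, "llm_c": 3.0}
X_demo = np.vstack([rng.normal(c, 0.5, size=(30, 8)) for c in centers.values()])
y_demo = np.repeat(list(centers.keys()), 30)

clf = SVC(probability=True, random_state=0).fit(X_demo, y_demo)

query = np.full((1, 8), 3.0)
proba_demo = clf.predict_proba(query)[0]

# Sort candidates from most to least likely.
order = np.argsort(proba_demo)[::-1]
ranking = [(clf.classes_[i], float(proba_demo[i])) for i in order]
for name, p in ranking:
    print(f"{name}: {p:.4f}")
```

A ranked list like this could serve as a fallback order if the top candidate is unavailable, though whether the router uses it that way is not stated in this notebook.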

## 6. Hyperparameter Tuning

In [None]:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
import numpy as np

# Prepare data
X = np.array(trainer.query_embedding_list)
y = np.array(trainer.model_name_list)

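# Define parameter grid. gamma is ignored by the linear kernel, so it is
# searched only for rbf to avoid redundant fits.
param_grid = [
    {'kernel': ['rbf'], 'C': [0.1, 1, 10], 'gamma': ['scale', 'auto']},
    {'kernel': ['linear'], 'C': [0.1, 1, 10]},
]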

print("Running grid search (this may take a while)...")

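# Grid search. probability=True matches the router config but adds an internal
# cross-validation to every fit; accuracy scoring does not require it.
svm = SVC(probability=True)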
grid_search = GridSearchCV(svm, param_grid, cv=3, scoring='accuracy', verbose=1, n_jobs=-1)
grid_search.fit(X, y)

print(f"\nBest parameters: {grid_search.best_params_}")
print(f"Best score: {grid_search.best_score_:.4f}")
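
`best_score_` is the mean cross-validated accuracy on the same data used to select the parameters, so it tends to be optimistic. A common safeguard is to hold out a test split before searching. The sketch below shows the pattern on synthetic data. The array names and `llm_*` labels are placeholders, not the router's data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(-2.0, 1.0, size=(60, 8)),
                    rng.normal(2.0, 1.0, size=(60, 8))])
y_demo = np.array(["llm_a"] * 60 + ["llm_b"] * 60)

# Hold out 25% of the data; stratify to keep class proportions.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3, scoring="accuracy")
search.fit(X_tr, y_tr)

# refit=True (the default) retrains the best model on all of X_tr.
test_accuracy = search.best_estimator_.score(X_te, y_te)
print(f"CV accuracy: {search.best_score_:.4f}")
print(f"Held-out accuracy: {test_accuracy:.4f}")
```

The same split could be applied to `X` and `y` from the cell above before calling `grid_search.fit`, so that the reported accuracy comes from queries the search never saw.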

## Summary

In this notebook, we:

1. **Loaded Configuration**: Set up SVMRouter with YAML configuration
2. **Trained Model**: Used SVMRouterTrainer to fit the SVM classifier
3. **Verified Model**: Loaded and tested the saved model
4. **Tuned Hyperparameters**: Searched a small grid of `C`, `kernel`, and `gamma` values with 3-fold cross-validation

**Next Steps**:
- Use `02_svmrouter_inference.ipynb` for inference
- Compare with other routers (KNN, MLP)