# PersonalizedRouter - Training

This notebook demonstrates how to train the **PersonalizedRouter** (Personalized GNN-based Router).

## Overview

PersonalizedRouter uses a Graph Neural Network (GNN) to model the relationships between queries and LLMs **with personalization**. It constructs a heterogeneous graph with four node types:
- **Query nodes**: each query has an embedding
- **LLM nodes**: each LLM has an embedding
- **User nodes**: each user has one-hot features (for personalization)
- **Task nodes**: each task has an embedding

**Key Features**:
- Personalized routing based on user features
- Multi-task learning support
- Message passing for learning representations
- Can capture complex relational patterns between users, queries, and LLMs
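
To make the one-hot user features and the message-passing idea concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the node names, the toy edges, and the helper functions `one_hot` and `mean_aggregate` are hypothetical, and a real GNN layer would also apply learned weights and handle node types with different feature sizes.

```python
# Toy illustration only: one-hot user features and one round of
# mean-aggregation message passing. All names and values are hypothetical.

def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def mean_aggregate(features, edges):
    """Average each node's own features with those of its in-neighbors."""
    updated = {}
    for node, own in features.items():
        # Collect messages from every source node that points at `node`.
        messages = [features[src] for src, dst in edges if dst == node]
        stacked = [own] + messages
        updated[node] = [sum(col) / len(stacked) for col in zip(*stacked)]
    return updated

user_num = 3
features = {
    "user_0": one_hot(0, user_num),
    "user_1": one_hot(1, user_num),
    "query_0": [0.0, 0.0, 0.0],
}
# Directed edges (source, destination): both users send messages to query_0.
edges = [("user_0", "query_0"), ("user_1", "query_0")]

updated = mean_aggregate(features, edges)
print(updated["query_0"])
```

After one round, `query_0` carries a mix of the two user vectors, while the user nodes, which have no incoming edges here, keep their original features.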

## 1. Environment Setup

In [None]:
# Install required packages (for Colab)
# !pip install llmrouter-lib torch torch-geometric

In [None]:
import os
import sys
from pathlib import Path

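# Assumes the notebook is launched from a directory two levels below the project root.
# Re-running this cell after the chdir below moves up two more levels, so run it once per kernel session.
PROJECT_ROOT = Path(os.getcwd()).parent.parent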
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

os.chdir(PROJECT_ROOT)
print(f"Working directory: {os.getcwd()}")

In [None]:
import torch
from llmrouter.models.personalizedrouter import PersonalizedRouter
from llmrouter.models.personalizedrouter.trainer import PersonalizedRouterTrainer
from llmrouter.utils import setup_environment

setup_environment()

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

## 2. Configuration

PersonalizedRouter uses the following configuration parameters:

| Parameter | Description | Default |
|-----------|-------------|---------|
| `embedding_dim` | GNN hidden layer dimension | 64 |
| `edge_dim` | Edge feature dimension (cost, effect) | 2 |
| `user_num` | Number of users for personalization | 10 |
| `num_task` | Number of tasks | 5 |
| `learning_rate` | Learning rate | 0.001 |
| `weight_decay` | L2 regularization | 0.0001 |
| `train_epoch` | Training epochs | 100 |
| `batch_size` | Batch size | 4 |
| `train_mask_rate` | Edge masking rate | 0.3 |
| `split_ratio` | Train/Val/Test split | [0.6, 0.2, 0.2] |
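
The sketch below illustrates what `split_ratio` and `train_mask_rate` mean in practice, using only the standard library. How PersonalizedRouter actually splits data and masks edges is not shown in this notebook, so the functions `split_indices` and `mask_edges` are hypothetical stand-ins, and the toy edge list is made up.

```python
import random

def split_indices(n, split_ratio, seed=0):
    """Shuffle range(n) and cut it into train/val/test parts by `split_ratio`."""
    if abs(sum(split_ratio) - 1.0) > 1e-6:
        raise ValueError("split_ratio must sum to 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * split_ratio[0])
    n_val = round(n * split_ratio[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

def mask_edges(edges, mask_rate, seed=0):
    """Hide a `mask_rate` fraction of edges; return (visible, masked)."""
    rng = random.Random(seed)
    n_masked = round(len(edges) * mask_rate)
    masked_ids = set(rng.sample(range(len(edges)), n_masked))
    visible = [e for i, e in enumerate(edges) if i not in masked_ids]
    masked = [e for i, e in enumerate(edges) if i in masked_ids]
    return visible, masked

train, val, test = split_indices(100, [0.6, 0.2, 0.2])
print(len(train), len(val), len(test))

# 20 toy (query, llm) edges: 10 queries, each connected to 2 LLMs.
edges = [(q, m) for q in range(10) for m in range(2)]
visible, masked = mask_edges(edges, 0.3)
print(len(visible), len(masked))
```

With the default values from the table, 100 samples split into 60/20/20, and a mask rate of 0.3 hides 6 of the 20 toy edges, which the model would then be asked to predict from the remaining 14.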

In [None]:
import yaml

CONFIG_PATH = "configs/model_config_train/personalizedrouter.yaml"

with open(CONFIG_PATH, 'r') as f:
    config = yaml.safe_load(f)

print("Current Configuration:")
print("=" * 50)
print(yaml.dump(config, default_flow_style=False))
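
Before training, it can help to sanity-check the hyperparameters. The sketch below assumes they are available as a flat dictionary keyed by the parameter names from the table above; the real YAML file may nest them under another key, so adapt the lookup accordingly. `check_config` is a hypothetical helper, not part of `llmrouter`.

```python
def check_config(hparams):
    """Return a list of problems found in a flat hyperparameter dict."""
    problems = []
    ratio = hparams.get("split_ratio", [])
    if len(ratio) != 3 or abs(sum(ratio) - 1.0) > 1e-6:
        problems.append("split_ratio must have three entries summing to 1")
    if not 0.0 <= hparams.get("train_mask_rate", 0.0) < 1.0:
        problems.append("train_mask_rate must be in [0, 1)")
    for key in ("embedding_dim", "user_num", "num_task", "train_epoch", "batch_size"):
        value = hparams.get(key)
        if not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be a positive integer")
    return problems

# Default values copied from the configuration table above.
defaults = {
    "embedding_dim": 64,
    "edge_dim": 2,
    "user_num": 10,
    "num_task": 5,
    "learning_rate": 0.001,
    "weight_decay": 0.0001,
    "train_epoch": 100,
    "batch_size": 4,
    "train_mask_rate": 0.3,
    "split_ratio": [0.6, 0.2, 0.2],
}

print(check_config(defaults))
```

An empty list means no problems were found; each string in a non-empty list names one setting to fix.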

## 3. Initialize Router

In [None]:
router = PersonalizedRouter(yaml_path=CONFIG_PATH)

print("Router initialized successfully!")
print(f"Number of LLM candidates: {len(router.llm_data)}")
print(f"Number of users: {router.user_num}")
print(f"Number of tasks: {router.num_task}")
print(f"LLM candidates: {list(router.llm_data.keys())}")

## 4. Graph Structure Visualization

In [None]:
# Understand the graph structure
print("Graph Structure Information:")
print("=" * 50)
print("\nNode types:")
print(f"  - Query nodes: {router.num_queries_train} queries from training data")
print(f"  - LLM nodes: {len(router.llm_data)} models")
print(f"  - User nodes: {router.user_num} users (one-hot encoding)")
print(f"  - Task nodes: {router.num_task} tasks")
print("\nEdge types:")
print("  - Query -> LLM edges (performance scores)")
print("  - LLM -> LLM edges (within same family)")
print("\nPersonalization:")
print("  - Each user has a unique node")
print("  - User features are one-hot encoded")
print("  - GNN learns user-specific routing patterns")

## 5. Training

In [None]:
trainer = PersonalizedRouterTrainer(router=router, device=device)

print("Trainer initialized!")
print(f"Device: {device}")
print(f"Save path: {trainer.save_model_path}")

In [None]:
print("Starting training...")
print("=" * 50)

trainer.train()

print("=" * 50)
print("Training completed!")

## 6. Model Verification

In [None]:
# Verify the trained model
model_path = trainer.save_model_path
if os.path.exists(model_path):
    checkpoint = torch.load(model_path, map_location='cpu')
    print(f"Model saved to: {model_path}")
    print(f"Model keys: {list(checkpoint.keys())[:5]}...")
else:
    print(f"Model not found at: {model_path}")

In [None]:
test_query = {"query": "What is the capital of France?"}
result = router.route_single(test_query)

print(f"Test query: {test_query['query']}")
print(f"Routed to: {result['model_name']}")

## Summary

In this notebook, we:

1. **Loaded Configuration**: Set up PersonalizedRouter with YAML configuration
2. **Understood Graph Structure**: Heterogeneous graph with four node types (Query, LLM, User, Task)
3. **Trained GNN Model**: Used message passing to learn personalized representations
4. **Verified Model**: Checked that the trained checkpoint was saved and routed a test query

**Key Takeaways**:
- PersonalizedRouter models personalized routing with user features
- Each user has a unique representation in the graph
- GNN learns to route differently for different users
- Edge masking during training (`train_mask_rate`) hides a fraction of edges from the model, which is intended to improve generalization

**Next Steps**:
- Use `02_personalizedrouter_inference.ipynb` for inference
- Experiment with different user numbers and task configurations