# MFRouter - Training

This notebook demonstrates how to train the **MFRouter** (Matrix Factorization Router).

## Overview

MFRouter uses matrix factorization to learn latent representations for both queries and LLMs,
then routes each query to the LLM whose latent vector is most similar to the query's.

**Key Features**:
- Learns embeddings for both queries and models
- Captures collaborative-filtering patterns: queries that perform similarly across models end up close together in latent space
- Suited to large query-model interaction matrices
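
To make the latent-space idea concrete, here is a minimal, dependency-free sketch of how a matrix factorization router could score candidates. It is an illustration only, not MFRouter's actual implementation: the names `project`, `route`, `projection`, and `model_latents` are hypothetical, and the dot product is assumed as the similarity measure.

```python
def project(query_embedding, projection):
    """Map a text-embedding vector into latent space (one projection row per latent dim)."""
    return [sum(w * x for w, x in zip(row, query_embedding)) for row in projection]


def route(query_latent, model_latents):
    """Score every model by dot product with the query and return the best one."""
    scores = {
        name: sum(q * m for q, m in zip(query_latent, vec))
        for name, vec in model_latents.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy setup (hypothetical values): text_dim = 4, latent_dim = 2, two candidate models.
projection = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
model_latents = {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0]}
query_embedding = [0.2, 0.9, 0.1, 0.0]

best, scores = route(project(query_embedding, projection), model_latents)
print(f"Best model: {best}")
print(f"Scores: {scores}")
```

In this toy case the query's latent vector lines up with `model_b`, so `model_b` gets the higher score. A real router would learn both the projection and the model vectors from data.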

## 1. Environment Setup

In [None]:
import os
import sys
from pathlib import Path

# Assumes the notebook is run from a directory two levels below the project root.
PROJECT_ROOT = Path.cwd().parent.parent
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

os.chdir(PROJECT_ROOT)
print(f"Working directory: {os.getcwd()}")

In [None]:
from llmrouter.models.mfrouter import MFRouter, MFRouterTrainer
from llmrouter.utils import setup_environment

setup_environment()
print("Environment setup complete!")

## 2. Configuration

MFRouter uses the following configuration parameters:

| Parameter | Description | Default |
|-----------|-------------|--------|
| `latent_dim` | Dimension of latent space | 128 |
| `text_dim` | Query embedding dimension | 768 |
| `lr` | Learning rate | 0.001 |
| `epochs` | Training epochs | 5 |
| `noise_alpha` | Noise for regularization | 0.0 |
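
When experimenting, it can help to keep the defaults in one place and apply overrides explicitly. The sketch below assumes a flat dictionary holding the defaults from the table above. The real layout of `mfrouter.yaml` (nesting, key names) is not shown here and may differ, and `with_overrides` is a hypothetical helper, not part of `llmrouter`.

```python
# Defaults copied from the parameter table; the flat layout is an assumption.
DEFAULTS = {
    "latent_dim": 128,
    "text_dim": 768,
    "lr": 0.001,
    "epochs": 5,
    "noise_alpha": 0.0,
}


def with_overrides(defaults, **overrides):
    """Return a copy of `defaults` with `overrides` applied, rejecting unknown keys."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    merged = dict(defaults)  # copy so the defaults stay untouched
    merged.update(overrides)
    return merged


trial = with_overrides(DEFAULTS, latent_dim=64, epochs=10)
print(trial)
```

Rejecting unknown keys catches typos such as `latent_dims` before they silently fall back to the default.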

In [None]:
import yaml

CONFIG_PATH = "configs/model_config_train/mfrouter.yaml"

with open(CONFIG_PATH, 'r') as f:
    config = yaml.safe_load(f)

print("Current Configuration:")
print("=" * 50)
print(yaml.dump(config, default_flow_style=False))

## 3. Initialize Router

In [None]:
router = MFRouter(yaml_path=CONFIG_PATH)

print("Router initialized successfully!")
print(f"Number of training samples: {len(router.routing_data_train)}")
print(f"Number of LLM candidates: {len(router.llm_data)}")
print(f"LLM candidates: {list(router.llm_data.keys())}")

## 4. Training

In [None]:
trainer = MFRouterTrainer(router=router, device='cpu')

print("Trainer initialized!")
print(f"Save path: {trainer.save_model_path}")

In [None]:
print("Starting training...")
print("=" * 50)

trainer.train()

print("=" * 50)
print("Training completed!")
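
For intuition about what a matrix factorization trainer optimizes, here is a small, dependency-free sketch that fits query and model vectors to a toy score matrix with stochastic gradient descent on squared error. This is generic matrix factorization, not the internals of `MFRouterTrainer`, whose loss and optimizer are not shown in this notebook. The function `train_mf` and the `scores` matrix are hypothetical; `latent_dim`, `lr`, and `epochs` only mirror the configuration names.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_mf(scores, latent_dim=2, lr=0.05, epochs=300, seed=0):
    """Factor `scores` (queries x models) into query and model latent vectors."""
    rng = random.Random(seed)
    n_queries, n_models = len(scores), len(scores[0])
    Q = [[rng.uniform(-0.5, 0.5) for _ in range(latent_dim)] for _ in range(n_queries)]
    M = [[rng.uniform(-0.5, 0.5) for _ in range(latent_dim)] for _ in range(n_models)]
    history = []  # mean squared error accumulated during each epoch
    for _ in range(epochs):
        total = 0.0
        for i in range(n_queries):
            for j in range(n_models):
                err = dot(Q[i], M[j]) - scores[i][j]
                total += err * err
                for k in range(latent_dim):
                    # Read both old values before updating either one.
                    q_ik, m_jk = Q[i][k], M[j][k]
                    Q[i][k] -= lr * err * m_jk
                    M[j][k] -= lr * err * q_ik
        history.append(total / (n_queries * n_models))
    return Q, M, history


# Toy performance matrix: rows are queries, columns are models (hypothetical values).
scores = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
Q, M, history = train_mf(scores)
print(f"MSE: first epoch {history[0]:.4f}, last epoch {history[-1]:.4f}")
```

After training, `dot(Q[i], M[j])` approximates `scores[i][j]`, so routing reduces to picking the model with the highest predicted score for a query.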

## 5. Model Verification

In [None]:
from llmrouter.utils import load_model

saved_model = load_model(trainer.save_model_path)

print("Model loaded successfully!")
print(f"Model type: {type(saved_model).__name__}")

In [None]:
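# Test prediction: route a sample query with the in-memory `router`
# (note that this does not use the reloaded `saved_model`).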
test_query = {"query": "What is the capital of France?"}
result = router.route_single(test_query)

print(f"Test query: {test_query['query']}")
print(f"Routed to: {result['model_name']}")

## Summary

In this notebook, we:

1. **Loaded Configuration**: Set up MFRouter with YAML configuration
2. **Trained Model**: Used MFRouterTrainer to learn latent representations
3. **Verified Model**: Reloaded the saved model and routed a sample query

**Next Steps**:
- Use `02_mfrouter_inference.ipynb` for inference
- Experiment with different latent dimensions (`latent_dim`)