```{contents}
```
## Model Routing

**Model Routing** in LangGraph is the pattern by which a graph dynamically selects **which LLM (or set of LLMs)** should handle a given step of execution, based on **task requirements, state, cost, latency, accuracy, or policy constraints**.
It is built from ordinary graph primitives (nodes and conditional edges) and lets a multi-model system trade cost against quality step by step.

---

### **1. Why Model Routing Exists**

Modern LLM systems operate under competing constraints:

| Constraint     | Reality                                          |
| -------------- | ------------------------------------------------ |
| Accuracy       | Large models perform better                      |
| Cost           | Large models are expensive                       |
| Latency        | Smaller models are faster                        |
| Context length | Some models handle longer context                |
| Capabilities   | Some models specialize (code, vision, reasoning) |

Using one fixed model for every step forces a compromise: either overpaying for simple steps or under-serving hard ones.
**Routing allows each task step to use the most appropriate model.**

---

### **2. Core Idea**

```
State → Router → Model Selection → Execution → State Update
```

The router decides the model **at runtime** based on the evolving state.
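
The flow above can be sketched in plain Python, without any LangGraph machinery. This is a minimal illustration only: `small_model`, `large_model`, the `MODELS` registry, and the complexity threshold of 5 are hypothetical stand-ins, not real LLM calls or library APIs.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]

# Hypothetical stand-ins for real LLM clients.
def small_model(state: State) -> State:
    return {"response": f"small: {state['prompt']}"}

def large_model(state: State) -> State:
    return {"response": f"large: {state['prompt']}"}

# Registry mapping a model name to the callable that runs it.
MODELS: Dict[str, Callable[[State], State]] = {
    "small": small_model,
    "large": large_model,
}

def router(state: State) -> str:
    # Decide at runtime from the current state (threshold is an assumption).
    return "large" if state["complexity"] > 5 else "small"

def step(state: State) -> State:
    name = router(state)                        # Router -> model selection
    update = MODELS[name](state)                # Execution
    return {**state, **update, "model": name}   # State update

result = step({"prompt": "Explain AI", "complexity": 3})
```

Here `result["model"]` records which model handled the step, so a later pass through the router can take the previous choice into account.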

---

### **3. Where Routing Happens in LangGraph**

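Routing is implemented as a **routing function** attached to a conditional edge. The function inspects the current state and returns the name of the node to execute next. Because each candidate node wraps one model, that single return value determines both:

* Which model to use
* Which node to execute next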

```python
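def model_router(state):
    # Return the name of the node (and therefore the model) to run next.
    if state["task"] == "coding":
        return "code_model"       # specialized code model
    elif state["complexity"] > 7:
        return "large_model"      # hard tasks go to the larger model
    else:
        return "small_model"      # default: cheapest, fastest option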
```

---

### **4. Architecture Pattern**

```
            ┌──────────────┐
State ───▶  │ Model Router │
            └──────┬───────┘
                   │
     ┌─────────────┼─────────────┐
     ↓             ↓             ↓
 Small LLM     Large LLM     Code LLM
```

---

### **5. Minimal Working Example**

```python
from langgraph.graph import StateGraph, END
from typing import TypedDict

class State(TypedDict):
    prompt: str
    complexity: int
    response: str

def small_model(state):
    return {"response": f"SMALL: {state['prompt']}"}

def large_model(state):
    return {"response": f"LARGE: {state['prompt']}"}

def router(state):
    return "large" if state["complexity"] > 5 else "small"

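builder = StateGraph(State)

# The router node is a pass-through; the routing decision itself lives in
# the conditional edge attached to it.
builder.add_node("router", lambda state: state)
builder.add_node("small", small_model)
builder.add_node("large", large_model)

# Map the router function's return value to the next node.
builder.add_conditional_edges("router", router, {
    "small": "small",
    "large": "large"
})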

builder.set_entry_point("router")
builder.add_edge("small", END)
builder.add_edge("large", END)

graph = builder.compile()
```

---

### **6. Advanced Routing Criteria**

| Factor     | Routing Logic                   |
| ---------- | ------------------------------- |
| Input size | Long input → long-context model |
| Task type  | Code → code model               |
| User tier  | Premium → stronger model        |
| Budget     | Low → smaller model             |
| Latency    | Realtime → fast model           |
| Risk level | High risk → best model          |
| Language   | Non-English → specialized model |
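
Several of these factors can be combined in one routing function. The sketch below is a hypothetical example: the state keys (`risk`, `task`, `prompt_tokens`, `realtime`, `user_tier`), the 8000-token threshold, and the node names are all assumptions chosen for illustration, and it assumes the large model is also the one with the longest context window. Rules are checked in priority order and the first match wins.

```python
from typing import TypedDict

class RoutingState(TypedDict, total=False):
    task: str
    prompt_tokens: int
    user_tier: str
    realtime: bool
    risk: str

def route_by_criteria(state: RoutingState) -> str:
    """Return the name of the node (model) to run next; first match wins."""
    if state.get("risk") == "high":
        return "large_model"            # high risk -> best model
    if state.get("task") == "coding":
        return "code_model"             # code -> code model
    if state.get("prompt_tokens", 0) > 8000:   # hypothetical threshold
        return "large_model"            # long input -> larger-context model
    if state.get("realtime", False):
        return "small_model"            # realtime -> fast model
    if state.get("user_tier") == "premium":
        return "large_model"            # premium -> stronger model
    return "small_model"                # default: cheapest option

choice = route_by_criteria({"task": "coding", "user_tier": "premium"})
```

The ordering is a design choice: placing the risk check first means a high-risk coding task goes to the large model rather than the code model. A function with this shape can be passed as the routing function of a conditional edge.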

---

### **7. Production-Grade Model Routing**

| Feature            | Implementation               |
| ------------------ | ---------------------------- |
| Multi-model pool   | OpenAI, Claude, local models |
| Cost tracking      | Token usage + budget guard   |
| Fallback           | Automatic model failover     |
| Canary routing     | Test new models safely       |
| A/B routing        | Quality evaluation           |
| Policy enforcement | Compliance rules             |
| Observability      | Per-model metrics            |
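
Three of these features (budget guard, fallback, and per-model metrics) fit in a short sketch. Everything here is an assumption for illustration: the `(name, cost, callable)` tuple layout, the per-call costs, and the `flaky_large` function that simulates an outage. Real deployments would track actual token usage rather than a flat cost per call.

```python
from collections import Counter
from typing import Callable, List, Tuple

# (name, assumed cost per call, callable that runs the model)
Model = Tuple[str, float, Callable[[str], str]]

metrics: Counter = Counter()  # per-model outcome counts for observability

def call_with_fallback(prompt: str, models: List[Model], budget: float) -> Tuple[str, str]:
    """Try models in order; skip any over budget, fail over on any error."""
    for name, cost, fn in models:
        if cost > budget:
            metrics[f"{name}.skipped_budget"] += 1
            continue
        try:
            answer = fn(prompt)
        except Exception:
            metrics[f"{name}.error"] += 1
            continue  # automatic failover to the next model
        metrics[f"{name}.ok"] += 1
        return name, answer
    raise RuntimeError("no model could serve the request")

def flaky_large(prompt: str) -> str:
    raise TimeoutError("simulated outage")  # hypothetical failure

def small(prompt: str) -> str:
    return f"small: {prompt}"

used, answer = call_with_fallback(
    "hi",
    [("large", 0.05, flaky_large), ("small", 0.001, small)],
    budget=0.10,
)
```

After this call, `used` is `"small"` because the large model raised, and `metrics` holds one error for `large` and one success for `small`.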

---

### **8. Model Routing with Loops & Agents**

In agent systems, routing happens **inside cycles**:

```
Plan → Choose Model → Execute → Evaluate → Re-route → Repeat
```

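Because the router runs on every pass through the cycle, the graph can escalate to a stronger model when the evaluation step rejects a result, and it only pays for that model on the steps that need it.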
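
The cycle above can be sketched as a plain Python loop that escalates one tier each time evaluation fails. The tier names, the simulated quality scores in `execute`, and the `min_quality` threshold are hypothetical; in a real graph the evaluation would come from a grader node or a validation check.

```python
from typing import Any, Dict, List, Tuple

TIERS = ["small", "medium", "large"]  # hypothetical model tiers, weakest first

def execute(model: str, task: str) -> Dict[str, Any]:
    # Simulated execution: quality scores are made up for illustration.
    quality = {"small": 0.4, "medium": 0.6, "large": 0.9}[model]
    return {"answer": f"[{model}] {task}", "quality": quality}

def run_agent(task: str, min_quality: float = 0.8,
              max_steps: int = 5) -> Tuple[Dict[str, Any], List[str]]:
    tier = 0
    tried: List[str] = []
    result: Dict[str, Any] = {}
    for _ in range(max_steps):
        model = TIERS[tier]                     # choose model
        tried.append(model)
        result = execute(model, task)           # execute
        if result["quality"] >= min_quality:    # evaluate
            break
        if tier == len(TIERS) - 1:
            break                               # nothing stronger to escalate to
        tier += 1                               # re-route to the next tier
    return result, tried

result, tried = run_agent("Summarize the report")
```

With the simulated scores, the loop tries `small`, then `medium`, then stops at `large`. The `max_steps` cap keeps the cycle from running indefinitely.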

---

### **9. Common Routing Variants**

| Variant              | Description              |
| -------------------- | ------------------------ |
| Static routing       | Fixed model per node     |
| Dynamic routing      | State-based decisions    |
| Hierarchical routing | Supervisor selects model |
| Ensemble routing     | Multiple models vote     |
| Fallback routing     | Failover on error        |
| Budget-aware routing | Stops overspending       |
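
As one example from the table, ensemble routing can be sketched as a majority vote over several model outputs. The three lambda "models" are hypothetical stand-ins, and exact string matching is a simplifying assumption; free-form LLM answers usually need normalization before they can be compared.

```python
from collections import Counter
from typing import Callable, List

def ensemble_vote(prompt: str, models: List[Callable[[str], str]]) -> str:
    """Send the prompt to every model and return the most common answer."""
    answers = [model(prompt) for model in models]
    # most_common orders equal counts by first occurrence, so ties go to
    # the model listed earliest.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

# Hypothetical models that return a fixed classification label.
models = [
    lambda prompt: "positive",
    lambda prompt: "negative",
    lambda prompt: "positive",
]

label = ensemble_vote("Great product, would buy again", models)
```

Voting multiplies cost by the number of models, so it tends to suit short, high-stakes outputs such as labels or yes/no decisions.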

---

### **10. Mental Model**

Model routing turns a LangGraph workflow from a fixed pipeline into one that **adapts its model choice to each step**:

> **Right model, right task, right time.**


### Demonstration

```python
from langgraph.graph import StateGraph, END
from typing import TypedDict

# -------------------- State Definition --------------------

class State(TypedDict):
    prompt: str
    complexity: int
    response: str

# -------------------- Models (simulated) --------------------

def small_model(state: State):
    return {"response": f"[SMALL MODEL] {state['prompt']}"}

def large_model(state: State):
    return {"response": f"[LARGE MODEL] {state['prompt']}"}

# -------------------- Router --------------------

def model_router(state: State):
    if state["complexity"] > 5:
        return "large"
    else:
        return "small"

# -------------------- Build Graph --------------------

builder = StateGraph(State)

builder.add_node("router", lambda state: state)
builder.add_node("small", small_model)
builder.add_node("large", large_model)

builder.set_entry_point("router")

builder.add_conditional_edges("router", model_router, {
    "small": "small",
    "large": "large"
})

builder.add_edge("small", END)
builder.add_edge("large", END)

graph = builder.compile()

# -------------------- Run Examples --------------------

print(graph.invoke({"prompt": "Explain AI", "complexity": 3}))
print(graph.invoke({"prompt": "Prove convergence of SGD", "complexity": 9}))
```

Output:

```
{'prompt': 'Explain AI', 'complexity': 3, 'response': '[SMALL MODEL] Explain AI'}
{'prompt': 'Prove convergence of SGD', 'complexity': 9, 'response': '[LARGE MODEL] Prove convergence of SGD'}
```
