# 2.5 Mastering LLM Routing

Routing allows you to direct a user's query to different processing paths (chains) based on its intent. In this exercise, we will build a router that decides whether to send a question to a **GPT-4o** model (for coding) or a **Llama 3.2** model (for general chat).

## Why Use Routing?
- **Efficiency**: Use smaller models for easy questions.
- **Expertise**: Route technical questions to a chain whose system prompt is tailored to that domain.
- **Cost**: Save money by only using expensive models when necessary.

## 1. Setup Models and Parser
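We initialize two models: GPT-4o, a high-performance hosted model, and Llama 3.2, a smaller model served locally through Ollama. We also create a `StrOutputParser`, which turns each model's chat message into a plain string.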

In [5]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch, RunnableLambda
from langchain_openai import ChatOpenAI
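# Note: recent LangChain releases also provide ChatOllama in the separate langchain-ollama package.
from langchain_community.chat_models import ChatOllama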

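# 1. Models
# Hosted model for coding questions (reads OPENAI_API_KEY from the environment).
llm_complex = ChatOpenAI(model="gpt-4o", temperature=0)
# Local model for general chat; base_url must point at your Ollama server (default port 11434).
llm_slow = ChatOllama(model="llama3.2", temperature=0, base_url="http://ollama:11434")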

parser = StrOutputParser()

## 2. Define Specialized Prompts
Each path in our router needs a specific "personality". We create one for coding and one for general assistance.

In [6]:
# 2. Prompts
prompt_code = ChatPromptTemplate.from_messages([
    ("system", "You are a senior software engineer. Answer with code-focused explanations."),
    ("user", "{input}")
])

prompt_general = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant for general questions."),
    ("user", "{input}")
])

# 3. Chains
coding_chain = prompt_code | llm_complex | parser
chat_chain = prompt_general | llm_slow | parser
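
The `|` operator above chains steps so that each step's output becomes the next step's input. The following is a rough plain-Python analogy of that idea, not LangChain's actual implementation: `Step`, `fake_prompt`, `fake_llm`, and `fake_parser` are hypothetical stand-ins that make no model calls.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable: wraps one function and supports `|`."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)


# Stubs standing in for the prompt, the model, and the parser (no network calls).
fake_prompt = Step(lambda d: f"system: be helpful\nuser: {d['input']}")
fake_llm = Step(lambda text: {"content": f"echo<{text.splitlines()[-1]}>"})
fake_parser = Step(lambda msg: msg["content"])

fake_chain = fake_prompt | fake_llm | fake_parser
reply = fake_chain.invoke({"input": "hi"})
print(reply)  # echo<user: hi>
```

Reading `prompt_code | llm_complex | parser` the same way: the prompt formats the input dict into messages, the model answers them, and the parser extracts the text.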

## 3. Pre-processing the Input (Intent Extraction)
Before routing, we normalize the input: the `topic` is lowercased so that keyword matching is case-insensitive, and the `input` is passed through unchanged.

In [7]:
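# 4. Intent extractor
def extract_intent(x: dict) -> dict:
    """Lowercase the topic for keyword matching; pass the input through unchanged."""
    # A missing "topic" key falls back to "", which routes to the default chain.
    topic = x.get("topic", "").lower()
    return {
        "topic": topic,
        "input": x["input"]
    }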

intent_extractor = RunnableLambda(extract_intent)
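
Because `extract_intent` is a plain function, its behavior can be checked without LangChain or any model. The sketch below repeats the function from the cell above and probes three inputs; the sample dictionaries are made up for illustration.

```python
def extract_intent(x: dict) -> dict:
    # Same logic as the cell above: lowercase the topic, pass the input through.
    topic = x.get("topic", "").lower()
    return {"topic": topic, "input": x["input"]}


# Mixed-case topics are normalized to lowercase.
normalized = extract_intent({"topic": "Python Code", "input": "q"})

# A missing topic becomes an empty string, which matches no keyword later on.
missing = extract_intent({"input": "q"})

# An explicit None topic is not covered by the .get() default,
# so calling .lower() on it raises AttributeError.
try:
    extract_intent({"topic": None, "input": "q"})
    none_topic_fails = False
except AttributeError:
    none_topic_fails = True

print(normalized, missing, none_topic_fails)
```

If your callers may send `"topic": None`, one option is `(x.get("topic") or "").lower()`, which treats `None` the same as a missing key.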

## 4. The Router Logic
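The `RunnableBranch` acts like an `if-else` statement: it evaluates its conditions in order and runs the chain paired with the first condition that is true.
- **IF** the topic contains any of the keywords "code", "python", "bug", or "error", use the `coding_chain`.
- **ELSE** (default), use the `chat_chain`.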

In [8]:
# 5. Router (RunnableBranch)
branch = RunnableBranch(
    (
        lambda x: any(k in x["topic"] for k in ["code", "python", "bug", "error"]),
        coding_chain
    ),
    chat_chain  # Default fallback
)
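
The condition in the branch is a substring test, so it can be exercised on its own before any model is involved. The sketch below is a plain-Python approximation of the routing decision: `pick_route`, `matches_whole_word`, and the sample topics are hypothetical and not part of the notebook. It compares the substring test with a stricter whole-word variant, assuming topics are already lowercased by the extractor.

```python
import re

KEYWORDS = ["code", "python", "bug", "error"]


def matches_substring(topic: str) -> bool:
    # Same test as the lambda in the router: keyword anywhere in the string.
    return any(k in topic for k in KEYWORDS)


def matches_whole_word(topic: str) -> bool:
    # Hypothetical stricter variant: the keyword must appear as a whole word.
    words = set(re.findall(r"[a-z0-9]+", topic))
    return any(k in words for k in KEYWORDS)


def pick_route(topic: str, predicate=matches_substring) -> str:
    # Mirrors the branch: a matching condition selects the coding chain,
    # anything else falls through to the default.
    return "coding_chain" if predicate(topic) else "chat_chain"


topics = [
    "python code debugging",
    "barcode scanners",
    "terror in film history",
    "debugging tips",
    "cooking",
]
substring_routes = [pick_route(t) for t in topics]
word_routes = [pick_route(t, matches_whole_word) for t in topics]

for t, s, w in zip(topics, substring_routes, word_routes):
    print(f"{t!r}: substring -> {s}, whole word -> {w}")
```

Substring matching sends "barcode scanners" and "terror in film history" to the expensive coding chain, while whole-word matching avoids those but also misses "debugging tips". Which trade-off is acceptable depends on your traffic.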

## 5. Build and Run the Full Pipeline
We combine the extractor and the router into one `full_pipeline` and test it with a coding question.

In [9]:
# 6. Full pipeline
full_pipeline = intent_extractor | branch

# 7. Invoke
result = full_pipeline.invoke({
    "topic": "Python code debugging",
    "input": "Why does my list comprehension throw an IndexError?"
})

print(result)

An `IndexError` in a list comprehension typically occurs when you try to access an index that is out of range for a list. This can happen if you're iterating over one list but using indices to access another list that is shorter or not properly aligned with the first.

Here are a few common scenarios that might lead to an `IndexError` in a list comprehension:

1. **Mismatched List Lengths**: If you're using indices to access elements from multiple lists, ensure that all lists are of the same length.

   ```python
   list1 = [1, 2, 3]
   list2 = [4, 5]  # Shorter list
   result = [list1[i] + list2[i] for i in range(len(list1))]  # IndexError when i = 2
   ```

   **Solution**: Use `zip` to iterate over both lists simultaneously:

   ```python
   result = [a + b for a, b in zip(list1, list2)]
   ```

2. **Incorrect Range**: If you're using `range()` to generate indices, make sure the range does not exceed the length of the list.

   ```python
   my_list = [10, 20, 30]
   result = [my_lis