# **Routing in Retrieval Augmented Generation Systems**  

## **Introduction to Routing**  
Routing in AI systems refers to the process of directing user queries or data to the most appropriate resource, model, or database. It ensures that relevant information is accessed efficiently, improving the accuracy and effectiveness of AI applications.  

### **Why is Routing Important?**  
- Enhances **efficiency** by directing queries to the correct data source.  
- Improves **accuracy** by ensuring the right model or prompt is used.  
- Reduces **processing overhead** by avoiding unnecessary computations.  
- Enables **scalability**, making AI systems more adaptable to diverse queries.  

Routing can be categorized into two primary types:  
1. **Logical Routing** – Based on predefined rules and structured decision-making.  
2. **Semantic Routing** – Based on the meaning (semantics) of the query using AI techniques like embeddings and similarity measures. 

---

### **1. Logical Routing**  
Logical routing is a structured approach where a query is directed based on predefined logic, rules, or categorization. It follows a deterministic path based on explicit criteria such as keywords, categories, or metadata.  
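The simplest form of this idea needs no model at all. The sketch below is a minimal keyword-based router, included only to illustrate what "deterministic path based on explicit criteria" can look like. The `ROUTING_RULES` table, its keyword sets, and the `route_by_keyword` function are hypothetical names chosen for this illustration; the datasource labels match the ones used later in this notebook.

```python
import re
from typing import Optional

# Hypothetical keyword rules: each datasource maps to terms that signal it.
ROUTING_RULES = {
    "python_docs": {"python", "pip", "django"},
    "js_docs": {"javascript", "js", "npm"},
    "golang_docs": {"golang", "goroutine", "gofmt"},
}

def route_by_keyword(question: str) -> Optional[str]:
    """Return the first datasource whose keywords appear in the question."""
    # Lowercase and split into simple word tokens so matching is case-insensitive.
    tokens = set(re.findall(r"[a-z0-9]+", question.lower()))
    for datasource, keywords in ROUTING_RULES.items():
        if tokens & keywords:
            return datasource
    # No rule matched: the caller decides on a fallback.
    return None

print(route_by_keyword("Why doesn't the following Python code work?"))
print(route_by_keyword("How do I start a goroutine?"))
```

The same question always produces the same route, but any query that uses none of the listed keywords falls through to `None`, which is the rule-coverage limitation discussed later in the comparison tables.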

### **Example: Routing Based on Programming Language**  
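Consider a case where user queries relate to different programming languages (Python, JavaScript, Golang). We can define a **logical router** that classifies each query and directs it to the most relevant data source. In this example an LLM performs the classification, but its output is constrained to a fixed set of data sources, so the routing decision always falls into one of the predefined categories.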

In [None]:
import os

os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'

In [1]:
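# Load the required API keys from .env and check that they are set
import os

from dotenv import load_dotenv
load_dotenv()

LANGCHAIN_API_KEY = os.getenv("LANGCHAIN_API_KEY")
GROQ_API_KEY = os.getenv("GROQ_API_KEY")

# Fail early with a clear message if a key is missing
missing = [name for name in ("LANGCHAIN_API_KEY", "GROQ_API_KEY") if not os.getenv(name)]
if missing:
    raise EnvironmentError(f"Missing environment variables: {', '.join(missing)}")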

In [None]:
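# Install the packages used in this notebook (uncomment to run)

# !pip install langchain-groq langchain-huggingface sentence-transformers scikit-learn python-dotenv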

In [None]:
from typing import Literal
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

### **Define Logical Router**

In [None]:
# Data model: the router must pick exactly one of these datasources
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "golang_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )

# Groq-hosted LLM, configured to return a RouteQuery object (structured output)
llm = ChatGroq(
    model_name="llama3-70b-8192",
    temperature=0
)
structured_llm = llm.with_structured_output(RouteQuery)

# Prompt
system = """You are an expert at routing a user question to the appropriate data source.

Based on the programming language the question is referring to, route it to the relevant data source."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

# Define router: the prompt feeds the structured-output LLM
router = prompt | structured_llm

In the above implementation, the model decides which programming language documentation is most relevant based on user queries.  

#### **Example Query**  


In [11]:
question = "Why doesn't the following Python code work?"
result = router.invoke({"question": question})
print(result.datasource)  # Output: python_docs

python_docs


Logical routing provides **structured**, category-based query handling: every query is assigned to exactly one of a fixed set of destinations, which makes the routing behavior easy to inspect and test.  
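Once the router has returned a datasource, something still has to act on it. The sketch below shows one possible way to dispatch on that value. `RouteDecision` is a stand-in for the `RouteQuery` result so the example runs without an LLM call, and the `HANDLERS` table and `choose_route` function are hypothetical; in a real pipeline each handler would more likely be a retriever or a chain than a string formatter.

```python
from dataclasses import dataclass

@dataclass
class RouteDecision:
    """Stand-in for the RouteQuery result so the sketch runs without an LLM."""
    datasource: str

# Hypothetical handlers: placeholders for per-datasource retrievers or chains.
HANDLERS = {
    "python_docs": lambda q: f"[python_docs] {q}",
    "js_docs": lambda q: f"[js_docs] {q}",
    "golang_docs": lambda q: f"[golang_docs] {q}",
}

def choose_route(result, question: str) -> str:
    """Look up the handler for the chosen datasource and run it."""
    handler = HANDLERS.get(result.datasource.lower())
    if handler is None:
        # Surface unexpected labels instead of silently guessing a route.
        raise ValueError(f"Unknown datasource: {result.datasource}")
    return handler(question)

print(choose_route(RouteDecision("python_docs"), "Why doesn't this code work?"))
```

Assuming the router defined earlier, the object returned by `router.invoke(...)` could be passed as `result`, since this function only reads its `datasource` attribute.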

---

### **2. Semantic Routing**  
Semantic routing involves directing a query based on its meaning rather than predefined rules. It leverages AI techniques such as **embeddings**, **vector similarity**, and **machine learning models** to determine the most appropriate response.  

### **Example: Routing Based on Query Meaning**  
Suppose we want to route a user’s query to either a **Physics** or a **Chemistry** expert based on its meaning. Instead of matching keywords, we embed the query and compare it with the embedding of each expert prompt, then use the prompt that is most similar.  

In [14]:
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.output_parsers import StrOutputParser
from langchain_huggingface import HuggingFaceEmbeddings
from sklearn.metrics.pairwise import cosine_similarity
import os

In [25]:
# Define two expert templates
physics_template = """You are a physics professor. Answer physics-related questions clearly and concisely.
When unsure, state that you don't know. Here is the question: {query}"""

chemistry_template = """You are a chemistry professor. Provide clear and concise answers to chemistry-related questions.
If you're unsure, state that you don't know. Here is the question: {query}"""

# Initialize embeddings
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
prompt_templates = [physics_template, chemistry_template]

# Embed the templates (must be plain text)
prompt_embeddings = embeddings.embed_documents(prompt_templates)

# Define routing function
def prompt_router(input_data):
    # Accept either a plain string or a dict of the form {"query": "..."}
    query_text = input_data["query"] if isinstance(input_data, dict) else input_data
    query_embedding = embeddings.embed_query(query_text)

    # Cosine similarity between the query and each expert template
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]

    print("Using CHEMISTRY" if most_similar == chemistry_template else "Using PHYSICS")

    # Return the formatted template as a string
    return most_similar.format(query=query_text)

# Set up Groq's chat model
groq_model = ChatGroq(model_name="llama3-70b-8192", temperature=0.1)

# Chain: pick and fill the best-matching prompt, send it to the model, parse the reply as text
chain = (
    RunnableLambda(prompt_router)
    | groq_model
    | StrOutputParser()
)

In [26]:
# Example query, passed as a dict with a "query" key
query = {"query": "What is mathane?"}

result = chain.invoke(query)
print(result)

Using CHEMISTRY
I think there may be a small mistake there!

I believe you meant to ask "What is methane?"

Methane is a chemical compound with the molecular formula CH₄. It is a colorless, odorless, and highly flammable gas. It is the main component of natural gas and is widely used as a fuel and a chemical feedstock. Methane is also a potent greenhouse gas, and its presence in the atmosphere contributes to climate change.

Is that what you were looking for?


In [28]:
# Second example query, also passed as a dict with a "query" key
query = {"query": "What is a black hole??"}

result = chain.invoke(query)
print(result)

Using PHYSICS
A black hole is a region in space where the gravitational pull is so strong that nothing, including light, can escape. It's formed when a massive star collapses in on itself and its gravity becomes so strong that it warps the fabric of spacetime around it.

Here's what happens: when a star with a mass at least three times that of the sun runs out of fuel, it can no longer support its own weight. It collapses under its own gravity, causing a massive amount of matter to be compressed into an incredibly small point called a singularity. The gravity of this singularity is so strong that it creates a boundary called the event horizon, which marks the point of no return. Any matter or radiation that crosses the event horizon is trapped by the black hole's gravity and cannot escape.

The term "black" refers to the fact that not even light can escape the gravitational pull, making it invisible to us. The "hole" part refers to the void or empty space left behind by the collapsed s

In the above example:  
- **Semantic routing** is used to determine the most relevant expert based on the meaning of the query.  
- The **cosine similarity** function measures how closely the query embedding matches the embedding of the Physics or the Chemistry prompt; the prompt with the higher score is used.  
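To make the similarity step concrete, the sketch below computes cosine similarity by hand, as the dot product of two vectors divided by the product of their lengths, and picks the best-scoring route. The three-dimensional vectors are made-up values standing in for real embeddings (the model used above produces much longer vectors), and the `cosine` helper is a hypothetical name for this illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (values are invented).
template_vectors = {
    "physics": [0.9, 0.1, 0.2],
    "chemistry": [0.1, 0.9, 0.3],
}
query_vector = [0.2, 0.8, 0.4]

# Score the query against every template and keep the highest score.
scores = {name: cosine(query_vector, vec) for name, vec in template_vectors.items()}
best = max(scores, key=scores.get)

print(scores)
print(best)
```

Because the score depends only on the angle between vectors, not their length, a short query and a long prompt can still match closely if they point in a similar direction in embedding space.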


### **Key Differences from Logical Routing**  
| Feature             | Logical Routing | Semantic Routing |
|---------------------|----------------|------------------|
| **Basis**          | Predefined rules | AI-based similarity matching |
| **Flexibility**    | Fixed decision paths | Dynamic and adaptable |
| **Accuracy**       | Depends on rule coverage | Adapts to user intent |
| **Scalability**    | Limited by rule complexity | Scales well with embeddings |

## **Comparison: Logical vs. Semantic Routing**  

| Aspect               | Logical Routing | Semantic Routing |
|----------------------|----------------|------------------|
| **Routing Method**  | Predefined logic and rules | AI-based meaning analysis |
| **Use Cases**       | Well-defined query categories | Complex, intent-driven queries |
| **Scalability**     | Limited by predefined rules | Adapts dynamically |
| **Flexibility**     | Rigid | Highly flexible |
| **Implementation Complexity** | Easier | More complex (requires AI models) |