## 🧭 Routing in Retrieval-Augmented Generation (RAG)

Routing in RAG systems determines **how queries are processed** and **which retrievers, models, or tools** should handle them.  
It ensures that user inputs are directed to the most suitable path for accurate and efficient responses.

---

### 🔹 1. Semantic Routing
Semantic Routing uses **embedding-based similarity** to understand the **meaning** of the query and decide where to route it.

**How it works:**
- The query is converted into an **embedding vector**.
- The system compares it with predefined route embeddings (e.g., “math questions”, “programming help”, “document retrieval”).
- The query is routed to the retriever or model with the **closest semantic match**.

**Example:**
- If the query is *"Summarize the meeting notes"*,  
  it routes to the **document summarization chain**.
- If the query is *"Explain overfitting in machine learning"*,  
  it routes to the **knowledge retrieval chain**.

**Use case:** Multi-domain RAG systems where queries can belong to diverse semantic categories.
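
The steps above can be sketched in a few lines. This is a minimal illustration, not a production router: `toy_embed` is a hypothetical stand-in for a real embedding model (it only counts words), and the route names and descriptions in `ROUTES` are assumptions chosen to match the two example queries.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical route descriptions; a real system would embed these once at startup.
ROUTES = {
    "document_summarization_chain": "summarize summary notes meeting document report",
    "knowledge_retrieval_chain": "explain define concept machine learning overfitting model",
}
ROUTE_VECTORS = {name: toy_embed(desc) for name, desc in ROUTES.items()}


def semantic_route(query: str) -> str:
    """Return the route whose description is most similar to the query."""
    query_vec = toy_embed(query)
    return max(ROUTE_VECTORS, key=lambda name: cosine(query_vec, ROUTE_VECTORS[name]))


print(semantic_route("Summarize the meeting notes"))              # document_summarization_chain
print(semantic_route("Explain overfitting in machine learning"))  # knowledge_retrieval_chain
```

With a real embedding model the structure stays the same; only `toy_embed` changes. A query that matches no route still gets the highest-scoring one, so a similarity threshold with a fallback route is often worth adding.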

---

### 🔹 2. Logical Routing
Logical Routing uses **explicit rules, conditions, or metadata** to direct the query flow.

**How it works:**
- The router evaluates **if–else logic** or **structured conditions** (e.g., metadata tags, keywords, document type).
- It does not rely on embeddings; the rules are deterministic, so the same query always takes the same route.
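
Structured conditions can also read metadata attached to the query rather than its text. The sketch below assumes a hypothetical `Query` wrapper and a hypothetical `doc_type` metadata key; neither comes from a specific library. Rules are checked in order and the first match wins.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical container: the query text plus optional metadata tags."""
    text: str
    metadata: dict = field(default_factory=dict)


def logical_route(query: Query) -> str:
    """Apply ordered, deterministic rules; the first match decides the route."""
    text = query.text.lower()
    # Rule 1: explicit metadata wins over anything inferred from the text.
    if query.metadata.get("doc_type") == "image":
        return "Vision_Model"
    # Rule 2: case-insensitive keyword check.
    if "sql" in text:
        return "Database_Retriever"
    # Fallback when no rule matches.
    return "General_RAG_Chain"


print(logical_route(Query("Write SQL to count users")))                    # Database_Retriever
print(logical_route(Query("What is shown here?", {"doc_type": "image"})))  # Vision_Model
print(logical_route(Query("Tell me a joke")))                              # General_RAG_Chain
```

Because the rules are ordered, putting the most specific or most trusted signal first (here, metadata) keeps the behavior predictable when several conditions could match.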

**Example:**
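```python
query = "Show me the SQL for monthly revenue"  # example input

# Deterministic keyword rules: the first matching condition wins
if "SQL" in query:
    route_to = "Database_Retriever"
elif "image" in query:
    route_to = "Vision_Model"
else:
    route_to = "General_RAG_Chain"
```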


## Building two vector databases: one for medical documents and one for sports documents

In [2]:
import getpass
import os
from uuid import uuid4

# OpenAIEmbeddings is imported once, from langchain_openai; the older
# langchain.embeddings import is deprecated and would shadow it.
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")


embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

### Medical Vector Database

In [3]:
# 2️⃣ Initialize OpenAI embeddings
medical_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# 3️⃣ Create (or load existing) Chroma vector store
medical_vector_store = Chroma(
    collection_name="medical_rag_collection",
    embedding_function=medical_embeddings,
    persist_directory="medical_collection",
)

# 4️⃣ Read medical.txt and turn each line into a Document object
medical_documents = []
medical_text_folder = "C:/Users/aniln/Desktop/github_celery_redis/Advance_RAG2"
filename = "medical.txt"
file_path = os.path.join(medical_text_folder, filename)
with open(file_path, "r", encoding="utf-8") as f:
    texts = f.read().split("\n")
    for text in texts:
        doc = Document(
            page_content=text.strip(),
            metadata={"source": filename}
        )
        medical_documents.append(doc)

# 5️⃣ Generate unique IDs for all documents
uuids = [str(uuid4()) for _ in range(len(medical_documents))]

# 6️⃣ Add documents to vector store
medical_vector_store.add_documents(documents=medical_documents, ids=uuids)

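# 7️⃣ Persist (save) the database
# Chroma 0.4+ persists automatically when persist_directory is set; this explicit
# call only matters on older versions and raises a deprecation warning on newer ones.
medical_vector_store.persist()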

print(f"✅ Added {len(medical_documents)} text files to the vector database.")




✅ Added 53 text files to the vector database.




### Sport Vector Database

In [4]:
# 2️⃣ Initialize OpenAI embeddings
sport_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# 3️⃣ Create (or load existing) Chroma vector store

sport_vector_store = Chroma(
    collection_name="sport_rag_collection",
    embedding_function=sport_embeddings,
    persist_directory="sport_collection",
)

# 4️⃣ Read sport.txt and turn each line into a Document object
sport_text_folder = "C:/Users/aniln/Desktop/github_celery_redis/Advance_RAG2"
filename = "sport.txt"
sport_documents = []
file_path = os.path.join(sport_text_folder, filename)
with open(file_path, "r", encoding="utf-8") as f:
    texts = f.read().split("\n")

    for text in texts:
        doc = Document(
            page_content=text.strip(),
            metadata={"source": filename}
        )
        sport_documents.append(doc)


# 5️⃣ Generate unique IDs for all documents
uuids = [str(uuid4()) for _ in range(len(sport_documents))]

# 6️⃣ Add documents to vector store
sport_vector_store.add_documents(documents=sport_documents, ids=uuids)

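# 7️⃣ Persist (save) the database
# Chroma 0.4+ persists automatically when persist_directory is set; this explicit
# call only matters on older versions and raises a deprecation warning on newer ones.
sport_vector_store.persist()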

print(f"✅ Added {len(sport_documents)} text files to the vector database.")


✅ Added 521 text files to the vector database.


### Logical Routing

In [5]:
import os
from uuid import uuid4

from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
# langchain.embeddings / langchain.chat_models are deprecated import paths
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# -------------------------------
# 1️⃣ Initialize OpenAI embeddings
# -------------------------------
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# -------------------------------
# 2️⃣ Load vector stores (collections)
# -------------------------------
medical_vector_store = Chroma(
    collection_name="medical_rag_collection",
    persist_directory="medical_collection",
    embedding_function=embeddings,
)

sport_vector_store = Chroma(
    collection_name="sport_rag_collection",
    persist_directory="sport_collection",
    embedding_function=embeddings,
)

# -------------------------------
# 3️⃣ Create retrievers for each domain
# -------------------------------
medical_retriever = medical_vector_store.as_retriever(search_kwargs={"k": 3})
sport_retriever = sport_vector_store.as_retriever(search_kwargs={"k": 3})

# -------------------------------
# 4️⃣ Initialize LLM
# -------------------------------
llm = ChatOpenAI(
    model_name="gpt-4",  # or "gpt-3.5-turbo"
    temperature=0
)

# -------------------------------
# 5️⃣ Define logical router
# -------------------------------
def route_query(query: str):
    """
    Route query to the appropriate domain retriever based on keywords.
    """
    medical_keywords = ["disease", "treatment", "doctor", "symptom", "medical", "health"]
    sport_keywords = ["football", "soccer", "cricket", "match", "tournament", "sport"]

    query_lower = query.lower()

    if any(word in query_lower for word in medical_keywords):
        return medical_retriever
    elif any(word in query_lower for word in sport_keywords):
        return sport_retriever
    else:
        # Default: search both and merge results
        return [medical_retriever, sport_retriever]

# -------------------------------
# 6️⃣ Answer query with retrieved docs
# -------------------------------
def answer_query(query: str):
    retriever = route_query(query)
    
    # Retrieve documents (invoke replaces the deprecated get_relevant_documents)
    if isinstance(retriever, list):
        # No keyword matched: query every retriever and merge the results
        all_docs = []
        for r in retriever:
            all_docs.extend(r.invoke(query))
    else:
        all_docs = retriever.invoke(query)
    
    # Print retrieved documents
    print("\nRetrieved Documents:")
    for i, doc in enumerate(all_docs, 1):
        print(f"[Doc {i}] Source: {doc.metadata.get('source', 'unknown')}")
        print(doc.page_content)
        print("-" * 80)
    
    # Concatenate documents content as context
    context = "\n\n".join([doc.page_content for doc in all_docs])

    # Create prompt template for LLM
    template = """Answer the following question based on this context:

{context}

Question: {question}
"""
    prompt = ChatPromptTemplate.from_template(template)

    # Run LLM; invoke returns an AIMessage, so take its text content
    answer = llm.invoke(prompt.format_messages(context=context, question=query)).content
    
    # Return final answer
    return answer

# -------------------------------
# 7️⃣ Test the routing
# -------------------------------
queries = [
    "What are the symptoms of diabetes?",
    "Who won the last football world cup?",
    "What is the treatment for common cold?",
    "List top cricket players in 2025."
]

for q in queries:
    print("\n" + "="*100)
    print("Query:", q)
    answer = answer_query(q)
    print("\nFinal Answer:", answer)
    print("="*100)





Query: What are the symptoms of diabetes?





Retrieved Documents:
[Doc 1] Source: medical.txt
Additionally, the patient should know who to call in the event of an emergency. Many readers will note that these elements closely resemble a competency assessment; indeed, that is the point at hand. If the physician asks the patient the questions implied above, and records the patient's responses, monitoring of changes in the patient's condition may be delegated to that patient.
--------------------------------------------------------------------------------
[Doc 2] Source: medical.txt
The last sovereign principle of documentation relates to the patient's capacity to participate in his or her own care. Examples of this include the patient's ability to understand the purposes of the various medications being prescribed, the patient's awareness of what symptoms to look for regarding exacerbation of the condition, and the patient's knowledge of what symptoms or states of mind constitute an emergency.
--------------------------------------




Final Answer: The text does not provide information on the symptoms of diabetes.

Query: Who won the last football world cup?

Retrieved Documents:
[Doc 1] Source: sport.txt
· World Cups
--------------------------------------------------------------------------------
[Doc 2] Source: sport.txt
· FIH World Cup Qualifying Tournaments
--------------------------------------------------------------------------------
[Doc 3] Source: sport.txt
· Junior World Cups
--------------------------------------------------------------------------------

Final Answer: The context does not provide information on the last football world cup winner.

Query: What is the treatment for common cold?

Retrieved Documents:
[Doc 1] Source: medical.txt
First, record the risk-benefit analysis of important decisions in the clinical care of the patient. This risk-benefit analysis should include even obvious or “given” benefits. This is a point where many clinicians fall short because, in being risk-aversive, they ten

### Semantic Routing

In [9]:
import os
from uuid import uuid4

import numpy as np
from langchain_community.utils.math import cosine_similarity
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    

# -------------------------------
# 1️⃣ Initialize OpenAI embeddings
# -------------------------------
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# -------------------------------
# 2️⃣ Load vector stores (collections)
# -------------------------------
medical_vector_store = Chroma(
    collection_name="medical_rag_collection",
    persist_directory="medical_collection",
    embedding_function=embeddings,
)

sport_vector_store = Chroma(
    collection_name="sport_rag_collection",
    persist_directory="sport_collection",
    embedding_function=embeddings,
)

# -------------------------------
# 3️⃣ Create retrievers
# -------------------------------
medical_retriever = medical_vector_store.as_retriever(search_kwargs={"k": 3})
sport_retriever = sport_vector_store.as_retriever(search_kwargs={"k": 3})

# -------------------------------
# 4️⃣ Initialize LLM
# -------------------------------
llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# -------------------------------
# 5️⃣ Precompute "domain embeddings"
#    For example, each domain has a representative prompt
# -------------------------------
domain_prompts = {
    "medical": "Medical domain: diseases, symptoms, treatments, health, doctors",
    "sport": "Sports domain: football, soccer, cricket, matches, tournaments, players"
}

domain_embeddings = {
    domain: embeddings.embed_query(text)
    for domain, text in domain_prompts.items()
}

# -------------------------------
# 6️⃣ Semantic routing based on query embedding
# -------------------------------
def semantic_route(query: str):
    query_emb = np.array(embeddings.embed_query(query)).reshape(1, -1)  # make 2D
    
    # Compute cosine similarity safely
    scores = {}
    for domain, dom_emb in domain_embeddings.items():
        dom_emb_2d = np.array(dom_emb).reshape(1, -1)
        scores[domain] = cosine_similarity(query_emb, dom_emb_2d)[0][0]  # extract scalar

    # Pick the domain with the highest similarity
    best_domain = max(scores, key=scores.get)
    
    if best_domain == "medical":
        return medical_retriever
    else:
        return sport_retriever

# -------------------------------
# 7️⃣ Answer query function
# -------------------------------
def answer_query(query: str):
    retriever = semantic_route(query)
    
    # invoke replaces the deprecated get_relevant_documents
    docs = retriever.invoke(query)
    
    # Print retrieved docs
    print("\nRetrieved Documents:")
    for i, doc in enumerate(docs, 1):
        print(f"[Doc {i}] Source: {doc.metadata.get('source', 'unknown')}")
        print(doc.page_content)
        print("-" * 80)
    
    context = "\n\n".join([doc.page_content for doc in docs])
    
    # Use prompt template
    template = """Answer the following question based on this context:

{context}

Question: {question}
"""
    prompt = ChatPromptTemplate.from_template(template)
    
    # invoke returns an AIMessage, so take its text content
    answer = llm.invoke(prompt.format_messages(context=context, question=query)).content
    return answer

# -------------------------------
# 8️⃣ Test
# -------------------------------
queries = [
    "What are the symptoms of diabetes?",
    "Who won the last football world cup?"
]

for q in queries:
    print("\n" + "="*80)
    print("Query:", q)
    answer = answer_query(q)
    print("\nFinal Answer:", answer)
    print("="*80)



Query: What are the symptoms of diabetes?

Retrieved Documents:
[Doc 1] Source: medical.txt
Additionally, the patient should know who to call in the event of an emergency. Many readers will note that these elements closely resemble a competency assessment; indeed, that is the point at hand. If the physician asks the patient the questions implied above, and records the patient's responses, monitoring of changes in the patient's condition may be delegated to that patient.
--------------------------------------------------------------------------------
[Doc 2] Source: medical.txt
The last sovereign principle of documentation relates to the patient's capacity to participate in his or her own care. Examples of this include the patient's ability to understand the purposes of the various medications being prescribed, the patient's awareness of what symptoms to look for regarding exacerbation of the condition, and the patient's knowledge of what symptoms or states of mind constitute an emerge