# RAG from Scratch: Routing

When using RAG, the model often receives the wrong context: it is too broad, too narrow, or not relevant.

Several things can cause this:
1. Not every question needs retrieval, yet the pipeline retrieves data in every case.
2. Knowledge sources can come in different formats (documents, SQL, databases, code, APIs, etc.).
3. A single retriever cannot fit every task.

<br/>
<br/>
For this reason, we can add an extra step after query transformation: **Routing**. Routing is the step where we choose the most appropriate source for a query before retrieval, because the right data might be stored in a different database or even a different type of database (relational databases, graph databases, and vector stores).

Without routing, we would send every question to the same retriever every single time. With a routing system in place, the router can decide the following:
1. Whether a particular question needs retrieval at all.
2. If yes, which retriever or knowledge base should be used.
3. If no, whether the LLM should answer directly.
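
The three decisions above can be sketched as a small piece of control flow. This is only an illustration under stated assumptions: `RouteDecision`, `answer`, `toy_router`, `echo_llm`, and `toy_retrievers` are hypothetical names introduced here, and the router and LLM are plain Python callables standing in for real components.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RouteDecision:
    needs_retrieval: bool         # decision 1: retrieve at all?
    source: Optional[str] = None  # decision 2: which retriever (None = no retrieval)


def answer(
    question: str,
    router: Callable[[str], RouteDecision],
    retrievers: Dict[str, Callable[[str], List[str]]],
    llm: Callable[[str], str],
) -> str:
    """Route the question, then either answer directly or retrieve first."""
    decision = router(question)
    if not decision.needs_retrieval or decision.source is None:
        # Decision 3: no retrieval needed, so the LLM answers directly.
        return llm(question)
    context = retrievers[decision.source](question)
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
    return llm(prompt)


# Hypothetical stand-ins so the sketch runs without a model or a database.
def toy_router(question: str) -> RouteDecision:
    if "order" in question.lower():
        return RouteDecision(needs_retrieval=True, source="sql")
    return RouteDecision(needs_retrieval=False)


def echo_llm(prompt: str) -> str:
    return prompt  # a real LLM would generate an answer here


toy_retrievers = {"sql": lambda q: ["order 17 shipped on Monday"]}

print(answer("Where is my order?", toy_router, toy_retrievers, echo_llm))
```

The two routing methods below differ only in how the `router` callable makes its decision.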

## Method 1: Logical Routing

We give the LLM knowledge of the various data sources at our disposal, and we let it reason about which one the question should be sent to.

Logical routing is based on rules or categories: it classifies the question and then decides on an action. When we ask something like "Explain quantum computing", it routes to scientific pages such as Wikipedia. For a question like "Where is the nearest ATM?", it routes to a real-time API retriever.
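
As a rough sketch of this idea, the snippet below describes each source to the LLM in a prompt, asks for a source name, and validates the reply. The source names, descriptions, and helper functions are assumptions made for illustration, and `fake_llm` is a keyword-based stand-in so the example runs without a model. In practice `llm` would be a call to a real chat model.

```python
from typing import Callable, Dict

# Hypothetical source names and descriptions, for illustration only.
SOURCES: Dict[str, str] = {
    "wikipedia": "general and scientific background knowledge",
    "realtime_api": "live, location- or time-dependent information",
}
DEFAULT_SOURCE = "wikipedia"


def build_router_prompt(question: str) -> str:
    """Describe the available sources and ask for exactly one name."""
    lines = [f"- {name}: {desc}" for name, desc in SOURCES.items()]
    return (
        "Choose the best data source for the question.\n"
        "Sources:\n" + "\n".join(lines) + "\n"
        "Reply with the source name only.\n"
        f"Question: {question}"
    )


def parse_choice(raw: str) -> str:
    """Normalize the model's reply and fall back if it is not a known source."""
    cleaned = raw.strip().strip("\"'.").lower()
    return cleaned if cleaned in SOURCES else DEFAULT_SOURCE


def logical_route(question: str, llm: Callable[[str], str]) -> str:
    return parse_choice(llm(build_router_prompt(question)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: a crude keyword check on the question line."""
    question = prompt.rsplit("Question: ", 1)[-1].lower()
    return "realtime_api" if "nearest" in question else "Wikipedia."


print(logical_route("Explain quantum computing", fake_llm))
print(logical_route("Where is the nearest ATM?", fake_llm))
```

Validating the reply against the known source names keeps a malformed or unexpected model output from breaking the pipeline.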

In [2]:
! pip install -q langchain_community tiktoken langchain-ollama langchainhub chromadb langchain youtube-transcript-api pytube


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.2.1[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [None]:
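import os
from access import Access  # local helper module (not shown here) that holds the API key

# Enable LangSmith tracing so each chain run can be inspected later
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = Access.LANGCHAIN_API_KEY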

## Method 2: Semantic Routing

Semantic routing is based on embedding similarity. In this method, we don't define rules. The steps are as follows:
1. Convert the question into a vector (embedding).
2. Compare it to the vector "profile" of each retriever.
3. Pick the retriever with the closest semantic match.
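
The three steps above can be sketched with the standard library alone. To keep the example self-contained, `embed` is a toy bag-of-words embedding rather than a real embedding model, and the profile texts and the names `PROFILES` and `semantic_route` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical retriever "profiles": a short text describing each knowledge base.
PROFILES: Dict[str, str] = {
    "medical": "symptoms diagnosis treatment medication dosage patient disease",
    "finance": "stock bond interest rate loan investment portfolio tax",
}
PROFILE_VECTORS = {name: embed(text) for name, text in PROFILES.items()}


def semantic_route(question: str) -> str:
    """Step 1: embed the question. Steps 2-3: compare and pick the closest profile."""
    q = embed(question)
    return max(PROFILE_VECTORS, key=lambda name: cosine(q, PROFILE_VECTORS[name]))


print(semantic_route("What is the usual dosage of this medication?"))
print(semantic_route("Should I refinance my loan at a lower interest rate?"))
```

With this toy embedding, a question that shares no words with any profile scores zero everywhere and falls to the first profile, so a real router would likely add a minimum-similarity threshold and a fallback.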

With this approach, we can have a separate knowledge base for each domain, such as medical, finance, and more. This method is more flexible and works even when the topic is vague or overlapping, which makes it a good fit for large or unstructured knowledge spaces. The disadvantage is that it requires embeddings and vector search, and it is harder to debug because the decision is not rule-based.