<a href="https://colab.research.google.com/github/salehihaniyeh/LLM-RAG-Routing/blob/main/RAG_Routing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
! pip install langchain_community tiktoken langchain-groq langchainhub chromadb langchain youtube-transcript-api pytube --use-deprecated=legacy-resolver
!pip install langchain_groq[embeddings]
!pip install sentence_transformers

In [39]:
import os
from getpass import getpass

os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
# Never hardcode API keys in a notebook; prompt for them at runtime instead.
os.environ['LANGCHAIN_API_KEY'] = getpass("LangSmith API key: ")
os.environ['GROQ_API_KEY'] = getpass("Groq API key: ")

#### Logical & Semantic Routing

In [40]:
from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_groq import ChatGroq

# Data model
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "literature_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )

# LLM with function call
llm = ChatGroq(model="mixtral-8x7b-32768", temperature=0)
structured_llm = llm.with_structured_output(RouteQuery)

# Prompt
system = """You are an expert at routing a user question to the appropriate data source.

Based on the programming language the question is referring to, route it to the relevant data source."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

# Define router
router = prompt | structured_llm

In [41]:
question1 = """Which paper answers the question:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(["human", "speak in {language}"])
prompt.invoke("french")
"""

result1 = router.invoke({"question": question1})
print(result1)
print(result1.datasource)

datasource='literature_docs'
literature_docs


In [42]:
question2 = """Why is my code giving me errors:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(["human", "speak in {language}"])
prompt.invoke("french")
"""

result2 = router.invoke({"question": question2})
print(result2)
print(result2.datasource)

datasource='python_docs'
python_docs


In [43]:
def choose_route(result):
    if "python_docs" in result.datasource.lower():
        ### Logic here
        return "chain for python_docs"
    elif "js_docs" in result.datasource.lower():
        ### Logic here
        return "chain for js_docs"
    else:
        ### Logic here
        return "literature_docs"

from langchain_core.runnables import RunnableLambda

full_chain = router | RunnableLambda(choose_route)

In [44]:
full_chain.invoke({"question": question1})


'literature_docs'
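
The `choose_route` function above returns placeholder strings. A minimal sketch of how those branches could dispatch to real handlers is shown below, using a lookup table instead of an `if`/`elif` ladder. The `Route` dataclass and the three `*_chain` functions are hypothetical stand-ins (they are not part of the notebook or of LangChain); in practice each handler would be a retrieval chain for that datasource.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Route:
    """Hypothetical stand-in for the RouteQuery object returned by the router."""
    datasource: str


# Hypothetical placeholder handlers; a real notebook would plug in retrieval chains.
def python_chain(question: str) -> str:
    return f"[python_docs] {question}"


def js_chain(question: str) -> str:
    return f"[js_docs] {question}"


def literature_chain(question: str) -> str:
    return f"[literature_docs] {question}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "python_docs": python_chain,
    "js_docs": js_chain,
    "literature_docs": literature_chain,
}


def dispatch(route: Route, question: str) -> str:
    """Call the handler for the chosen datasource, defaulting to literature_docs."""
    handler = HANDLERS.get(route.datasource.lower(), literature_chain)
    return handler(question)


print(dispatch(Route("python_docs"), "Why does my prompt fail?"))
```

With a table like this, adding a datasource means adding one entry rather than another branch, and the fallback mirrors the `else` clause of `choose_route`.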

#### Semantic Routing
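
Semantic routing picks the prompt whose embedding is closest to the embedding of the query. The sketch below illustrates only the selection step, in plain Python with made-up 3-dimensional vectors; the vectors, the `labels` list, and the helper names `cosine` and `pick_route` are assumptions for illustration, whereas the notebook cell that follows uses real sentence-transformer embeddings and LangChain's `cosine_similarity`.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pick_route(query_vec: Sequence[float], route_vecs: List[Sequence[float]]) -> int:
    """Return the index of the route vector most similar to the query vector."""
    scores = [cosine(query_vec, vec) for vec in route_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy embeddings (invented values): index 0 stands for the ML prompt,
# index 1 for the medicine prompt.
labels = ["ML", "medicine"]
route_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.2]]
query_vec = [0.8, 0.2, 0.1]

print(labels[pick_route(query_vec, route_vecs)])
```

Because the choice is a plain argmax over similarities, a query that matches neither prompt well is still routed to one of them; a similarity threshold would be one way to detect that case.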

In [45]:
from langchain.utils.math import cosine_similarity
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_groq import ChatGroq

# Two prompts
ML_template = """You are an expert in machine learning. \
You are great at answering questions about large language models (LLM) in a concise and easy to understand manner. \
When you don't know the answer to a question you admit that you don't know.

Here is a question:
{query}"""

medicine_template = """You are a very good physician. You are great at answering medical questions. \
You are so good because you have years of experience, know the human body very well, and have broad \
knowledge of illnesses and their symptoms. \
When you don't know the answer to a question you admit that you don't know.


Here is a question:
{query}"""

# Embed prompts
embeddings = SentenceTransformerEmbeddings()
prompt_templates = [ML_template, medicine_template]
prompt_embeddings = embeddings.embed_documents(prompt_templates)

# Route question to prompt
def prompt_router(inputs):
    # Embed question
    query_embedding = embeddings.embed_query(inputs["query"])
    # Compute similarity
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]
    # Chosen prompt
    print("Using Machine Learning" if most_similar == ML_template else "Using Medicine")
    return PromptTemplate.from_template(most_similar)


chain = (
    {"query": RunnablePassthrough()}
    | RunnableLambda(prompt_router)
    | ChatGroq()
    | StrOutputParser()
)

print(chain.invoke("How does RAG improve the ability of language models?"))
print(chain.invoke("I've been experiencing persistent fatigue and headaches,\
 along with unexplained weight loss and increased thirst and urination. What is the underlying cause?"))

Using Machine Learning
RAG, or Retrieval-Augmented Generation, is a technique that combines a language model with a retrieval system to improve its ability to generate accurate and relevant responses.

In a RAG model, the language model is enhanced with the ability to retrieve relevant information from a large corpus of documents, such as a database or the entire Wikipedia, before generating a response. This allows the model to access a wider range of information beyond its pre-trained knowledge, which can lead to more accurate and up-to-date responses.

RAG can improve the ability of language models in several ways:

1. Increased Factual Accuracy: By retrieving relevant information from a large corpus of documents, RAG models can generate more factually accurate responses.
2. Improved Relevance: RAG models can generate responses that are more relevant to the user's question or input, as they have access to a wider range of information.
3. Reduced Hallucinations: RAG models are less li