# RAG From Scratch: Routing

![Screenshot 2024-03-25 at 8.08.30 PM.png](https://i.imgur.com/jwRw5Je.png)

## Environment

`(1) Packages`

In [1]:
%pip install langchain_community==0.3.3 tiktoken==0.8.0 langchain-openai==0.2.3 langchain_together==0.2.0 langchainhub==0.1.21 chromadb==0.5.15 langchain==0.3.4 youtube-transcript-api==0.6.2 pytube==15.0.0


`(2) LangSmith`

https://docs.smith.langchain.com/

In [6]:
from google.colab import userdata
import os
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = userdata.get('LANGCHAIN_API_KEY')

`(3) API Keys`

In [7]:
os.environ['TOGETHER_API_KEY'] = userdata.get('TOGETHER_API_KEY')

## Part 10: Logical and Semantic routing

Logical routing uses function calling to classify a question into one of a fixed set of datasources.

Flow:

![Screenshot 2024-03-15 at 3.29.30 PM.png](https://i.imgur.com/9Ni4Fdr.png)

Docs:

https://python.langchain.com/docs/use_cases/query_analysis/techniques/routing#routing-to-multiple-indexes

In [11]:
from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field  # langchain_core.pydantic_v1 is deprecated in langchain-core 0.3
# from langchain_openai import ChatOpenAI
from langchain_together import ChatTogether, TogetherEmbeddings

# Data model
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "golang_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )

# LLM with function call
# llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)


llm = ChatTogether(
    model="meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo"
)
embd = TogetherEmbeddings(model="BAAI/bge-large-en-v1.5")
structured_llm = llm.with_structured_output(RouteQuery)

# Prompt
system = """You are an expert at routing a user question to the appropriate data source.

Based on the programming language the question is referring to, route it to the relevant data source."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

# Define router
router = prompt | structured_llm

Note: we used function calling to produce structured output.

![Screenshot 2024-03-16 at 12.38.23 PM.png](https://i.imgur.com/i43OWp4.png)
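
To make the note above concrete: with function calling, the model returns JSON arguments for a function whose schema mirrors `RouteQuery`, and those arguments are validated against the allowed labels. The sketch below imitates only that validation step using the standard library. `parse_route_call` and the sample payloads are hypothetical stand-ins for illustration, not LangChain internals.

```python
import json
from typing import Literal, get_args

# Same label set as the `datasource` field on RouteQuery.
Datasource = Literal["python_docs", "js_docs", "golang_docs"]


def parse_route_call(raw_arguments: str) -> str:
    """Validate the JSON arguments of a (hypothetical) routing function call."""
    arguments = json.loads(raw_arguments)
    datasource = arguments.get("datasource")
    if datasource not in get_args(Datasource):
        raise ValueError(f"Unexpected datasource: {datasource!r}")
    return datasource


# A well-formed call is accepted ...
print(parse_route_call('{"datasource": "python_docs"}'))

# ... while a label outside the Literal is rejected.
try:
    parse_route_call('{"datasource": "rust_docs"}')
except ValueError as err:
    print(err)
```

Constraining the field to a `Literal` is what turns free-form generation into classification: the model can only pick one of the listed labels.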

In [12]:
question = """Why doesn't the following code work:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(["human", "speak in {language}"])
prompt.invoke("french")
"""

result = router.invoke({"question": question})

In [13]:
result

RouteQuery(datasource='python_docs')

In [14]:
result.datasource

'python_docs'

Once we have this, it is trivial to define a branch that uses `result.datasource`.

https://python.langchain.com/docs/expression_language/how_to/routing

In [15]:
def choose_route(result):
    if "python_docs" in result.datasource.lower():
        ### Logic here
        return "chain for python_docs"
    elif "js_docs" in result.datasource.lower():
        ### Logic here
        return "chain for js_docs"
    else:
        ### Logic here
        return "chain for golang_docs"

from langchain_core.runnables import RunnableLambda

full_chain = router | RunnableLambda(choose_route)

In [16]:
full_chain.invoke({"question": question})

'chain for python_docs'

Trace:

https://smith.langchain.com/public/c2ca61b4-3810-45d0-a156-3d6a73e9ee2a/r
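
The `if`/`elif` branch in `choose_route` can also be written as a lookup table, which tends to stay readable as more datasources are added. This is a minimal sketch, assuming each datasource maps to its own chain. The string values are placeholders for real chains, and the `Route` dataclass is a hypothetical stand-in for the `RouteQuery` object returned by the router.

```python
from dataclasses import dataclass

# Placeholder "chains": in practice each value would be a runnable
# (for example, a retriever plus prompt plus LLM for that datasource).
CHAINS = {
    "python_docs": "chain for python_docs",
    "js_docs": "chain for js_docs",
    "golang_docs": "chain for golang_docs",
}


@dataclass
class Route:
    """Hypothetical stand-in for the RouteQuery object returned by the router."""

    datasource: str


def choose_route_by_lookup(result: Route) -> str:
    """Return the chain registered for the datasource; fail loudly on unknown labels."""
    key = result.datasource.lower()
    if key not in CHAINS:
        raise KeyError(f"No chain registered for datasource {key!r}")
    return CHAINS[key]


print(choose_route_by_lookup(Route(datasource="python_docs")))
```

Unlike a trailing `else`, the lookup raises on an unexpected label instead of silently falling through to a default chain.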

### Semantic routing

Flow:

![Screenshot 2024-03-15 at 3.30.08 PM.png](https://i.imgur.com/XdGFAdF.png)

Docs:

https://python.langchain.com/docs/expression_language/cookbook/embedding_router

In [18]:
from langchain_together import ChatTogether, TogetherEmbeddings

llm = ChatTogether(
    model="meta-llama/Meta-Llama-3-70B-Instruct-Lite"
)
embd = TogetherEmbeddings(model="BAAI/bge-large-en-v1.5")

In [19]:
from langchain.utils.math import cosine_similarity
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
# from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Two prompts
physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise and easy to understand manner. \
When you don't know the answer to a question you admit that you don't know.

Here is a question:
{query}"""

math_template = """You are a very good mathematician. You are great at answering math questions. \
You are so good because you are able to break down hard problems into their component parts, \
answer the component parts, and then put them together to answer the broader question.

Here is a question:
{query}"""

# Embed prompts
# embeddings = OpenAIEmbeddings()

prompt_templates = [physics_template, math_template]
prompt_embeddings = embd.embed_documents(prompt_templates)

# Route question to prompt
def prompt_router(input):
    # Embed question
    query_embedding = embd.embed_query(input["query"])
    # Compute similarity
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]
    # Chosen prompt
    print("Using MATH" if most_similar == math_template else "Using PHYSICS")
    return PromptTemplate.from_template(most_similar)


chain = (
    {"query": RunnablePassthrough()}
    | RunnableLambda(prompt_router)
    | llm
    | StrOutputParser()
)

print(chain.invoke("What's a black hole"))

Using PHYSICS
Black holes! One of the most fascinating and mind-bending phenomena in the universe.

A black hole is a region in space where the gravitational pull is so strong that nothing, including light, can escape. It's formed when a massive star collapses in on itself and its gravity becomes so strong that it warps the fabric of spacetime around it.

Imagine a sinkhole in spacetime, where everything that gets too close to the edge gets pulled in and can't climb back out. That's roughly the idea behind a black hole.

Here's how it happens: when a massive star runs out of fuel, it collapses under its own gravity, causing a massive amount of matter to be compressed into an incredibly small point, called a singularity. The gravity at this point becomes so strong that it creates a boundary called the event horizon, which marks the point of no return. Once something crosses the event horizon, it's trapped forever.

Now, here's the really cool part: black holes come in different sizes, r

Trace:

https://smith.langchain.com/public/98c25405-2631-4de8-b12a-1891aded3359/r
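
The routing decision in `prompt_router` comes down to a cosine-similarity argmax between the query embedding and each prompt embedding. The sketch below shows that step in isolation with the standard library. The three-dimensional vectors are made-up toy values, not real `BAAI/bge-large-en-v1.5` embeddings, and `route` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "prompt embeddings" (assumed values, for illustration only).
toy_prompt_embeddings = {
    "physics": [0.9, 0.1, 0.2],
    "math": [0.1, 0.9, 0.3],
}


def route(query_embedding):
    """Pick the prompt whose embedding is most similar to the query embedding."""
    return max(
        toy_prompt_embeddings,
        key=lambda name: cosine(query_embedding, toy_prompt_embeddings[name]),
    )


print(route([0.8, 0.2, 0.1]))  # closest to the "physics" vector
```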

# RAG From Scratch: Query Construction

![Screenshot 2024-03-25 at 8.20.28 PM.png](https://i.imgur.com/1XD2HKl.png)

For graph and SQL, see helpful resources:

https://blog.langchain.dev/query-construction/

https://blog.langchain.dev/enhancing-rag-based-applications-accuracy-by-constructing-and-leveraging-knowledge-graphs/

## Part 11: Query structuring for metadata filters

Flow:

![Screenshot 2024-03-16 at 1.12.10 PM.png](https://i.imgur.com/y5Vt4Mx.png)

Many vectorstores contain metadata fields.

This makes it possible to filter for specific chunks based on metadata.

Let's look at some example metadata we might see in a database of YouTube transcripts.

Docs:

https://python.langchain.com/docs/use_cases/query_analysis/techniques/structuring

In [20]:
from langchain_community.document_loaders import YoutubeLoader

docs = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=pbAd8O1Lvm4", add_video_info=True
).load()

docs[0].metadata

PytubeError: Exception while accessing title of https://youtube.com/watch?v=pbAd8O1Lvm4. Please file a bug report at https://github.com/pytube/pytube

Let’s assume we’ve built an index that:

1. Allows us to perform unstructured search over the `contents` and `title` of each document.
2. Allows us to use range filtering on `view count`, `publication date`, and `length`.

We want to convert natural language into structured search queries.

We can define a schema for structured search queries.

In [21]:
import datetime
from typing import Literal, Optional, Tuple
from pydantic import BaseModel, Field  # langchain_core.pydantic_v1 is deprecated in langchain-core 0.3

class TutorialSearch(BaseModel):
    """Search over a database of tutorial videos about a software library."""

    content_search: str = Field(
        ...,
        description="Similarity search query applied to video transcripts.",
    )
    title_search: str = Field(
        ...,
        description=(
            "Alternate version of the content search query to apply to video titles. "
            "Should be succinct and only include key words that could be in a video "
            "title."
        ),
    )
    min_view_count: Optional[int] = Field(
        None,
        description="Minimum view count filter, inclusive. Only use if explicitly specified.",
    )
    max_view_count: Optional[int] = Field(
        None,
        description="Maximum view count filter, exclusive. Only use if explicitly specified.",
    )
    earliest_publish_date: Optional[datetime.date] = Field(
        None,
        description="Earliest publish date filter, inclusive. Only use if explicitly specified.",
    )
    latest_publish_date: Optional[datetime.date] = Field(
        None,
        description="Latest publish date filter, exclusive. Only use if explicitly specified.",
    )
    min_length_sec: Optional[int] = Field(
        None,
        description="Minimum video length in seconds, inclusive. Only use if explicitly specified.",
    )
    max_length_sec: Optional[int] = Field(
        None,
        description="Maximum video length in seconds, exclusive. Only use if explicitly specified.",
    )

    def pretty_print(self) -> None:
        for field in self.__fields__:
            if getattr(self, field) is not None and getattr(self, field) != getattr(
                self.__fields__[field], "default", None
            ):
                print(f"{field}: {getattr(self, field)}")

Now, we prompt the LLM to produce queries.

In [24]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_together import ChatTogether, TogetherEmbeddings

system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Given a question, return a database query optimized to retrieve the most relevant results.

If there are acronyms or words you are not familiar with, do not try to rephrase them."""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)
# llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)


llm = ChatTogether(
    model="meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo"
)
embd = TogetherEmbeddings(model="BAAI/bge-large-en-v1.5")

# Bind the TutorialSearch schema so the LLM returns structured search queries
structured_llm = llm.with_structured_output(TutorialSearch)
query_analyzer = prompt | structured_llm

In [25]:
query_analyzer.invoke({"question": "Let's find all videos for RAG pipeline released after 2023"}).pretty_print()

content_search: RAG pipeline
title_search: RAG pipeline
earliest_publish_date: 2023-01-01


In [26]:
query_analyzer.invoke(
    {"question": "videos on chat langchain published in 2023"}
).pretty_print()

content_search: chat langchain
title_search: chat langchain
earliest_publish_date: 2023-01-01
latest_publish_date: 2023-12-31
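
The `TutorialSearch` schema describes the earliest publish date as inclusive and the latest as exclusive, so a date filter is a half-open interval. Under that convention, a full calendar year runs from January 1 of that year up to, but not including, January 1 of the next. The sketch below illustrates the arithmetic. `year_bounds` and `in_range` are hypothetical helpers, not part of the notebook.

```python
import datetime


def year_bounds(year: int):
    """Half-open [start, end) date range covering one calendar year."""
    return datetime.date(year, 1, 1), datetime.date(year + 1, 1, 1)


def in_range(day, earliest, latest):
    """Inclusive lower bound, exclusive upper bound."""
    return earliest <= day < latest


earliest, latest = year_bounds(2023)
print(in_range(datetime.date(2023, 12, 31), earliest, latest))  # True
print(in_range(datetime.date(2024, 1, 1), earliest, latest))    # False
```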


In [27]:
query_analyzer.invoke(
    {"question": "videos that are focused on the topic of chat langchain that are published before 2024"}
).pretty_print()

content_search: chat langchain
title_search: chat langchain
latest_publish_date: 2024-01-01


In [29]:
query_analyzer.invoke(
    {
        "question": "how to use multi-modal models in an agent, only videos under 5 minutes"
    }
).pretty_print()

To connect this to a vectorstore, see the [self-query retriever docs](https://python.langchain.com/docs/modules/data_connection/retrievers/self_query#constructing-from-scratch-with-lcel).
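
As a rough idea of what that connection involves, the sketch below turns the optional range fields of a structured query into a metadata filter. It is an illustration under several assumptions: the operator names (`$gte`, `$lt`, `$and`) follow the MongoDB-style syntax used by Chroma's `where` filters, the metadata keys (`view_count`, `publish_date`, `length_sec`) are guesses about how the index was built, and dates are assumed to be stored as integer ordinals. `StructuredQuery` and `build_filter` are hypothetical names.

```python
import datetime
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class StructuredQuery:
    """Hypothetical stand-in carrying the filter fields of TutorialSearch."""

    min_view_count: Optional[int] = None
    max_view_count: Optional[int] = None
    earliest_publish_date: Optional[datetime.date] = None
    latest_publish_date: Optional[datetime.date] = None
    min_length_sec: Optional[int] = None
    max_length_sec: Optional[int] = None


def build_filter(query: StructuredQuery) -> Optional[Dict[str, Any]]:
    """Translate set fields into range conditions: minimums inclusive, maximums exclusive."""
    # (query attribute, assumed metadata key, comparison operator)
    rules = [
        ("min_view_count", "view_count", "$gte"),
        ("max_view_count", "view_count", "$lt"),
        ("earliest_publish_date", "publish_date", "$gte"),
        ("latest_publish_date", "publish_date", "$lt"),
        ("min_length_sec", "length_sec", "$gte"),
        ("max_length_sec", "length_sec", "$lt"),
    ]
    conditions: List[Dict[str, Any]] = []
    for attribute, metadata_key, operator in rules:
        value = getattr(query, attribute)
        if value is None:
            continue  # Field not specified by the user: no filter.
        if isinstance(value, datetime.date):
            # Assumption: dates are stored as integer ordinals so that
            # numeric range operators apply.
            value = value.toordinal()
        conditions.append({metadata_key: {operator: value}})
    if not conditions:
        return None
    if len(conditions) == 1:
        return conditions[0]
    return {"$and": conditions}


# Example: a five-minute cap on video length.
print(build_filter(StructuredQuery(max_length_sec=300)))
```

The free-text fields (`content_search`, `title_search`) would go to the similarity search itself, while a filter like this one narrows the candidate set by metadata.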