# RAG basics (Video 1-4)

This notebook covers the three basic parts of RAG:
- Indexing
- Retrieval
- Generation

In [1]:
import os
from dotenv import load_dotenv

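# load_dotenv() copies the keys from .env into os.environ, so they do not need
# to be reassigned (assigning os.getenv(...) raises TypeError when a key is missing).
load_dotenv()
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'

# Fail early with a clear message if a key is missing.
for key in ("LANGCHAIN_API_KEY", "OPENAI_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to your .env file")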

## Part 1
Quickstart: [LangChain question-answering quickstart](https://python.langchain.com/docs/use_cases/question_answering/quickstart/)

In [3]:
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [4]:
# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

In [5]:
docs

[Document(page_content='\n\n      LLM Powered Autonomous Agents\n    \nDate: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng\n\n\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview#\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final re

In [6]:

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

In [7]:
len(splits)

66

In [8]:
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()

In [10]:
#### RETRIEVAL and GENERATION ####
from langchain_openai import ChatOpenAI

# Prompt
prompt = hub.pull("rlm/rag-prompt")

# LLM
llm = ChatOpenAI(model_name="gpt-4-turbo", temperature=0)

# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Question
rag_chain.invoke("What is Task Decomposition?")

'Task Decomposition refers to the process of breaking down a complex task into smaller, more manageable parts or steps. This technique allows an agent or system to handle and solve intricate problems more effectively by addressing each component separately. Methods such as Chain of Thought and Tree of Thoughts utilize this approach by structuring the problem-solving process into sequential or branching steps, respectively.'

In [11]:
#Question
from IPython.display import display, Markdown

display(Markdown(rag_chain.invoke("What is Task Decomposition?")))

Task Decomposition refers to the process of breaking down a complex task into smaller, more manageable parts or steps. This technique allows an agent or system to handle and solve intricate problems more effectively by addressing each component separately. Methods such as Chain of Thought and Tree of Thoughts enhance this process by structuring the decomposition into sequential or branching steps, respectively.

In [12]:
display(Markdown(rag_chain.invoke("What is MIPS?")))

MIPS stands for Maximum Inner Product Search. It involves saving the embedding representation of information into a vector store database to support fast searches for the maximum inner product. This process is commonly optimized using approximate nearest neighbors (ANN) algorithms to balance speed and accuracy.

In [14]:
display(Markdown(rag_chain.invoke("Summarize me the section Component 1: Planning in context of this piece of text.")))

In the section "Component 1: Planning" from the provided text, planning involves an agent understanding the multiple steps required for a complex task and strategizing accordingly. The technique of Chain of Thought (CoT) is highlighted as a method to enhance model performance by breaking down complex tasks into simpler, manageable steps, thereby aiding in the planning process. This approach not only simplifies tasks but also provides insight into the model's thought process, making it easier to plan and execute tasks effectively.

## Part 2: Indexing

A deep dive into indexing with dummy data.

In [17]:
# Documents
question = "What kinds of pets do I like?" 
document = """My favorite pet is a cat."""

In [18]:
import tiktoken  # BPE tokenizer provided by OpenAI

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

num_tokens_from_string(question, "cl100k_base")

8

In [19]:
from langchain_openai import OpenAIEmbeddings
embd = OpenAIEmbeddings()
query_result = embd.embed_query(question)
document_result = embd.embed_query(document)
len(query_result)

1536

In [20]:
len(document_result)

1536

So both the question and the document have been converted to 1536-dimensional vectors. Next, we can compute the cosine similarity between them.

In [21]:
import numpy as np

def cosine_similarity(vec1, vec2):
    dot_product = np.dot(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)
    norm_vec2 = np.linalg.norm(vec2)
    return dot_product / (norm_vec1 * norm_vec2)

similarity = cosine_similarity(query_result, document_result)
print("Cosine Similarity:", similarity)

Cosine Similarity: 0.8289352264743555
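
As a sanity check on the formula, here is a small worked example on hand-picked 2-D vectors. The vectors are illustrative, not real embeddings, and `cosine_similarity_plain` is a hypothetical standard-library variant of the function above, so it runs without NumPy or an API key.

```python
import math

def cosine_similarity_plain(vec1, vec2):
    """Cosine similarity = dot(vec1, vec2) / (|vec1| * |vec2|)."""
    dot_product = sum(a * b for a, b in zip(vec1, vec2))
    norm_vec1 = math.sqrt(sum(a * a for a in vec1))
    norm_vec2 = math.sqrt(sum(b * b for b in vec2))
    return dot_product / (norm_vec1 * norm_vec2)

# Same direction, different length: similarity is 1.
same_direction = cosine_similarity_plain([1.0, 0.0], [2.0, 0.0])
# 45 degrees apart: similarity is 1 / sqrt(2).
diagonal = cosine_similarity_plain([1.0, 0.0], [1.0, 1.0])
# Perpendicular vectors: similarity is 0.
orthogonal = cosine_similarity_plain([1.0, 0.0], [0.0, 1.0])

print(same_direction, diagonal, orthogonal)
```

Cosine similarity depends only on the angle between the vectors, not on their lengths, which is why `[1, 0]` and `[2, 0]` score 1.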


Let's try a bigger document.

We will surround the original one-liner with unrelated text, split the result into chunks, and see which chunks match the question best.

In [22]:
question = "What kinds of pets do I like?" 
document = """The sun dripped honey through the leaves of the old oak tree, casting dappled shadows across the worn swing set. A gentle breeze stirred the air, carrying with it the sweet scent of honeysuckle. It was a perfect afternoon for lounging, and that's exactly what I was doing, sprawled out on the creaky wooden swing. But I wasn't alone. Curled up in my lap, a purring rumble emanating from her tiny chest, was Luna, my favorite pet. My favorite pet is a cat. Luna's fur, the color of twilight, shimmered in the sunlight. Her eyes, emerald pools flecked with gold, blinked contentedly. Every so often, she'd twitch her whiskers, captivated by a flitting butterfly or a rustling leaf. As I stroked her soft fur, a wave of peace washed over me. In that moment, with Luna by my side, the world seemed to melt away. The worries and stresses of the day faded into insignificance. All that mattered was the warmth of the sun, the gentle breeze, and the rhythmic purr of my furry companion. """

In [26]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
splits = text_splitter.split_text(document)

In [27]:
len(splits)

13

In [28]:
splits

['The sun dripped honey through the leaves of the old oak tree, casting dappled shadows across the',
 'shadows across the worn swing set. A gentle breeze stirred the air, carrying with it the sweet',
 "with it the sweet scent of honeysuckle. It was a perfect afternoon for lounging, and that's exactly",
 "and that's exactly what I was doing, sprawled out on the creaky wooden swing. But I wasn't alone.",
 "But I wasn't alone. Curled up in my lap, a purring rumble emanating from her tiny chest, was Luna,",
 "chest, was Luna, my favorite pet. My favorite pet is a cat. Luna's fur, the color of twilight,",
 'color of twilight, shimmered in the sunlight. Her eyes, emerald pools flecked with gold, blinked',
 "with gold, blinked contentedly. Every so often, she'd twitch her whiskers, captivated by a flitting",
 'by a flitting butterfly or a rustling leaf. As I stroked her soft fur, a wave of peace washed over',
 'peace washed over me. In that moment, with Luna by my side, the world seemed to me
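
To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sketch that cuts fixed-size character windows with a shared overlap. It is not how `RecursiveCharacterTextSplitter` works internally: that splitter prefers to break on separators such as paragraphs, newlines and spaces, which is why the chunks above end on word boundaries. `split_with_overlap` is a hypothetical helper written for illustration.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size character windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.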

In [32]:
from langchain_openai import OpenAIEmbeddings
embd = OpenAIEmbeddings()
query_result = embd.embed_query(question)
document_result = embd.embed_documents(splits)

In [34]:
len(query_result)

1536

In [35]:
len(document_result)

13

In [36]:
for i, page_result in enumerate(document_result):
    print(splits[i])
    print("similarity: ", cosine_similarity(query_result, page_result))
    print("-"*100)

The sun dripped honey through the leaves of the old oak tree, casting dappled shadows across the
similarity:  0.6827142722652356
----------------------------------------------------------------------------------------------------
shadows across the worn swing set. A gentle breeze stirred the air, carrying with it the sweet
similarity:  0.7008705706501422
----------------------------------------------------------------------------------------------------
with it the sweet scent of honeysuckle. It was a perfect afternoon for lounging, and that's exactly
similarity:  0.7018123928004747
----------------------------------------------------------------------------------------------------
and that's exactly what I was doing, sprawled out on the creaky wooden swing. But I wasn't alone.
similarity:  0.724511092495357
----------------------------------------------------------------------------------------------------
But I wasn't alone. Curled up in my lap, a purring rumble emanating from her ti
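
The loop above prints a score for every chunk in document order. To see only the top few matches, the scores can be sorted first. The sketch below does this with toy 2-D vectors standing in for real embeddings; `top_k_chunks` and `cosine_similarity_plain` are hypothetical helpers, and the toy data is made up for illustration.

```python
import math

def cosine_similarity_plain(vec1, vec2):
    """Standard-library cosine similarity."""
    dot_product = sum(a * b for a, b in zip(vec1, vec2))
    norm_vec1 = math.sqrt(sum(a * a for a in vec1))
    norm_vec2 = math.sqrt(sum(b * b for b in vec2))
    return dot_product / (norm_vec1 * norm_vec2)

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k (score, chunk) pairs with the highest similarity to the query."""
    scored = [
        (cosine_similarity_plain(query_vec, vec), chunk)
        for vec, chunk in zip(chunk_vecs, chunks)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy data (assumption): three chunks with hand-picked 2-D "embeddings".
toy_chunks = ["about cats", "about weather", "about swings"]
toy_vectors = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
toy_query = [1.0, 0.0]

top = top_k_chunks(toy_query, toy_vectors, toy_chunks, k=2)
for score, chunk in top:
    print(f"{score:.3f}  {chunk}")
```

With the notebook's own variables, a call such as `top_k_chunks(query_result, document_result, splits)` should list the best-matching chunks for the question; that call is not run here because it needs the OpenAI embeddings.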

In [37]:
question2 = "What is the name of my pet?"
query_result2 = embd.embed_query(question2)
for i, page_result in enumerate(document_result):
    print(splits[i])
    print("similarity: ", cosine_similarity(query_result2, page_result))
    print("-"*100)

The sun dripped honey through the leaves of the old oak tree, casting dappled shadows across the
similarity:  0.6900611163677115
----------------------------------------------------------------------------------------------------
shadows across the worn swing set. A gentle breeze stirred the air, carrying with it the sweet
similarity:  0.6942903022912007
----------------------------------------------------------------------------------------------------
with it the sweet scent of honeysuckle. It was a perfect afternoon for lounging, and that's exactly
similarity:  0.6907643538102749
----------------------------------------------------------------------------------------------------
and that's exactly what I was doing, sprawled out on the creaky wooden swing. But I wasn't alone.
similarity:  0.7192597918205863
----------------------------------------------------------------------------------------------------
But I wasn't alone. Curled up in my lap, a purring rumble emanating from her t

In [38]:
question3 = "What colour is my pet?"
query_result3 = embd.embed_query(question3)
for i, page_result in enumerate(document_result):
    print(splits[i])
    print("similarity: ", cosine_similarity(query_result3, page_result))
    print("-"*100)

The sun dripped honey through the leaves of the old oak tree, casting dappled shadows across the
similarity:  0.6925147429317072
----------------------------------------------------------------------------------------------------
shadows across the worn swing set. A gentle breeze stirred the air, carrying with it the sweet
similarity:  0.6919675790482366
----------------------------------------------------------------------------------------------------
with it the sweet scent of honeysuckle. It was a perfect afternoon for lounging, and that's exactly
similarity:  0.6913791661242072
----------------------------------------------------------------------------------------------------
and that's exactly what I was doing, sprawled out on the creaky wooden swing. But I wasn't alone.
similarity:  0.7101865913765214
----------------------------------------------------------------------------------------------------
But I wasn't alone. Curled up in my lap, a purring rumble emanating from her t

In [39]:
question4 = "Describe the weather."
query_result4 = embd.embed_query(question4)
for i, page_result in enumerate(document_result):
    print(splits[i])
    print("similarity: ", cosine_similarity(query_result4, page_result))
    print("-"*100)

The sun dripped honey through the leaves of the old oak tree, casting dappled shadows across the
similarity:  0.7609812484170175
----------------------------------------------------------------------------------------------------
shadows across the worn swing set. A gentle breeze stirred the air, carrying with it the sweet
similarity:  0.7648093217934215
----------------------------------------------------------------------------------------------------
with it the sweet scent of honeysuckle. It was a perfect afternoon for lounging, and that's exactly
similarity:  0.7396629735021693
----------------------------------------------------------------------------------------------------
and that's exactly what I was doing, sprawled out on the creaky wooden swing. But I wasn't alone.
similarity:  0.7386856351062898
----------------------------------------------------------------------------------------------------
But I wasn't alone. Curled up in my lap, a purring rumble emanating from her t

Now load the full blog post from before.

In [79]:
import bs4
# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

In [80]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
splits = text_splitter.split_documents(docs)
len(splits)

212

In [82]:
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings(model="text-embedding-3-small"))

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()

## Part 3: Retrieval

Let's start with the cat example.

In [89]:
question = "What is my pet's name?" 
document = """The sun dripped honey through the leaves of the old oak tree, casting dappled shadows across the worn swing set. A gentle breeze stirred the air, carrying with it the sweet scent of honeysuckle. It was a perfect afternoon for lounging, and that's exactly what I was doing, sprawled out on the creaky wooden swing. But I wasn't alone. Curled up in my lap, a purring rumble emanating from her tiny chest, was Luna, my favorite pet. My favorite pet is a cat. Luna's fur, the color of twilight, shimmered in the sunlight. Her eyes, emerald pools flecked with gold, blinked contentedly. Every so often, she'd twitch her whiskers, captivated by a flitting butterfly or a rustling leaf. As I stroked her soft fur, a wave of peace washed over me. In that moment, with Luna by my side, the world seemed to melt away. The worries and stresses of the day faded into insignificance. All that mattered was the warmth of the sun, the gentle breeze, and the rhythmic purr of my furry companion. """

In [90]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=10)
dummy_splits = text_splitter.split_text(document)

In [91]:
dummy_splits

['The sun dripped honey through the leaves of the',
 'of the old oak tree, casting dappled shadows',
 'shadows across the worn swing set. A gentle',
 'A gentle breeze stirred the air, carrying with it',
 'with it the sweet scent of honeysuckle. It was a',
 'It was a perfect afternoon for lounging, and',
 "and that's exactly what I was doing, sprawled out",
 "out on the creaky wooden swing. But I wasn't",
 "I wasn't alone. Curled up in my lap, a purring",
 'a purring rumble emanating from her tiny chest,',
 'chest, was Luna, my favorite pet. My favorite pet',
 "pet is a cat. Luna's fur, the color of twilight,",
 'twilight, shimmered in the sunlight. Her eyes,',
 'Her eyes, emerald pools flecked with gold,',
 "gold, blinked contentedly. Every so often, she'd",
 "she'd twitch her whiskers, captivated by a",
 'by a flitting butterfly or a rustling leaf. As I',
 'As I stroked her soft fur, a wave of peace washed',
 'washed over me. In that moment, with Luna by my',
 'by my side, the world s

In [92]:
dummy_vectorstore = Chroma.from_texts(texts=dummy_splits, embedding=OpenAIEmbeddings(model="text-embedding-3-small"))

# Build a retriever over the dummy chunks.
dummy_retriever = dummy_vectorstore.as_retriever()

In [73]:
docs = dummy_retriever.get_relevant_documents(question)

In [74]:
len(docs)

4

In [75]:
docs

[Document(page_content='chest, was Luna, my favorite pet. My favorite pet'),
 Document(page_content='my furry companion.'),
 Document(page_content="pet is a cat. Luna's fur, the color of twilight,"),
 Document(page_content='gentle breeze, and the rhythmic purr of my furry')]

In [76]:
docs = dummy_retriever.get_relevant_documents("What are my favourite pets?")
docs

[Document(page_content='chest, was Luna, my favorite pet. My favorite pet'),
 Document(page_content='my furry companion.'),
 Document(page_content="pet is a cat. Luna's fur, the color of twilight,"),
 Document(page_content='gentle breeze, and the rhythmic purr of my furry')]

In [77]:
docs = dummy_retriever.get_relevant_documents("Describe the weather.")
docs

[Document(page_content='mattered was the warmth of the sun, the gentle'),
 Document(page_content='A gentle breeze stirred the air, carrying with it'),
 Document(page_content='The sun dripped honey through the leaves of the'),
 Document(page_content='gentle breeze, and the rhythmic purr of my furry')]

In [78]:
docs = dummy_retriever.get_relevant_documents("What colour is my pet ?")
docs

[Document(page_content="pet is a cat. Luna's fur, the color of twilight,"),
 Document(page_content='chest, was Luna, my favorite pet. My favorite pet'),
 Document(page_content='my furry companion.'),
 Document(page_content='gentle breeze, and the rhythmic purr of my furry')]

Now the full blog post.

In [83]:
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings(model="text-embedding-3-small"))

# Build a retriever that returns only the single most similar chunk (k=1).
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})

In [85]:
docs = retriever.get_relevant_documents("What is Task Decomposition?")
len(docs)

1

## Part 4: Generation

In generation, we place the retrieved documents in the context window of the LLM prompt and ask the model to answer from them.
![rag-part1](rag-part1.png)
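
Before bringing in LangChain's prompt classes, it may help to see the idea as plain string formatting. The sketch below joins retrieved chunk texts and fills them into a template with the same wording as the prompt used in this part of the notebook. `build_prompt` is a hypothetical helper, and the two chunks are copied from the retrieval output in Part 3.

```python
TEMPLATE = """Answer the question based only on the following context:
{context}

Question: {question}
"""

def build_prompt(chunks, question):
    """Join the retrieved chunk texts and place them in the prompt's context slot."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, question=question)

retrieved = [
    "chest, was Luna, my favorite pet. My favorite pet",
    "pet is a cat. Luna's fur, the color of twilight,",
]
prompt_text = build_prompt(retrieved, "What colour is my pet?")
print(prompt_text)
```

The resulting string is what the model actually sees: the retrieved text first, then the question. Joining the chunk texts explicitly, rather than passing a list of `Document` objects, keeps metadata and `repr` noise out of the prompt.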

In [86]:
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

# Prompt
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
prompt

ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template='Answer the question based only on the following context:\n{context}\n\nQuestion: {question}\n'))])

In [87]:
# LLM
llm = ChatOpenAI(model_name="gpt-4-turbo", temperature=0)

In [88]:
# Chain
chain = prompt | llm

In [122]:
question = "What is my pet doing ?"
docs = dummy_retriever.get_relevant_documents(question)
docs

[Document(page_content='my furry companion.'),
 Document(page_content='my furry companion.'),
 Document(page_content='chest, was Luna, my favorite pet. My favorite pet'),
 Document(page_content='chest, was Luna, my favorite pet. My favorite pet')]

In [123]:
chain.invoke({"context": docs, "question": question}).content

'Based on the context provided, your pet, Luna, is on your chest.'

In [102]:
from langchain import hub
prompt_hub_rag = hub.pull("rlm/rag-prompt")

In [103]:
prompt_hub_rag

ChatPromptTemplate(input_variables=['context', 'question'], metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))])

In [109]:
# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [110]:
rag_chain = (
    {"context": dummy_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_hub_rag
    | llm
    | StrOutputParser()
)

In [125]:
rag_chain.invoke("What colour is my pet ?")

'The color of your pet, Luna the cat, is described as the color of twilight.'