In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

True

## What is RAG (retrieval augmented generation)?
Basically, it means putting extra information that is relevant to the question into the prompt, so the model can answer from it.

In [2]:
# Example 

from langchain.chat_models import init_chat_model
model = init_chat_model("llama-3.3-70b-versatile", model_provider="groq")

prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(
    prompt_template.format(
        context="Ajit has two sisters, Preeti and Sweta. Ajit is male.",
        question="Who is Preeti's brother?"
    ))

print(response)

content="Ajit is Preeti's brother. He is also the brother of Sweta. There is no other information about any other brother of Preeti." additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 33, 'prompt_tokens': 119, 'total_tokens': 152, 'completion_time': 0.132235416, 'prompt_time': 0.01627205, 'queue_time': 0.0595006, 'total_time': 0.148507466}, 'model_name': 'llama-3.3-70b-versatile', 'system_fingerprint': 'fp_3f3b593e33', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None} id='run--bbc5be90-688f-4bfd-8524-2c52535070f1-0' usage_metadata={'input_tokens': 119, 'output_tokens': 33, 'total_tokens': 152}


In [3]:
print(response.content)

Ajit is Preeti's brother. He is also the brother of Sweta. This is stated in the given context where Ajit is mentioned as the male sibling of Preeti and Sweta.


>RAG is all about cleverly packing as much relevant information as possible into the context, using as few tokens as possible.

In [3]:
# a little non-trivial RAG example

# grab a novel
with open("indianTales.txt", encoding="utf-8") as file:
    text = file.read()

In [4]:
# let's try to talk with this book

from langchain.chat_models import init_chat_model
model = init_chat_model("llama-3.3-70b-versatile", model_provider="groq")

# Prepare your prompt
prompt_template = """You are a novel reader. You are given collection of stories:
{collection_of_stories}
You are tasked to make a list of story titles in this collection. Write a short summary for each story in Hindi Language. Skip the story from the list, if the story is not provided in the text. 
"""

response = model.invoke(prompt_template.format(collection_of_stories = text[:len(text)//20]))

print(response)

content='यहाँ दी गई कहानियों की सूची है:\n\n1. **शेर और बगुला** - इस कहानी में, एक बगुला एक शेर की मदद करता है जिसके गले में एक हड्डी फंस जाती है। बगुला हड्डी को निकाल देता है, लेकिन शेर उसका शुक्रिया नहीं अदा करता है और कहता है कि वह भाग्यशाली है कि वह जीवित है।\n\n2. **राजकुमार और राजकुमारी लबाम** - एक राजकुमार अपनी माता की आज्ञा के विरुद्ध जाकर राजकुमारी लबाम को ढूंढने निकलता है। वह कई जंगलों से गुजरता है, जहां उसे विभिन्न जानवरों से मिलने का अवसर मिलता है, जिनमें से कुछ उसकी मदद करते हैं।\n\nकृपया ध्यान दें कि अन्य कहानियों के लिए पाठ उपलब्ध नहीं है, इसलिए उनके लिए सारांश नहीं लिखा जा सकता है।' additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 304, 'prompt_tokens': 5232, 'total_tokens': 5536, 'completion_time': 1.043832378, 'prompt_time': 0.406922307, 'queue_time': 0.049479601, 'total_time': 1.450754685}, 'model_name': 'llama-3.3-70b-versatile', 'system_fingerprint': 'fp_2ddfbb0da0', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None} 

In [8]:
# Better printing

from IPython.display import Markdown
Markdown(response.content)

यहाँ दी गई कहानियों की सूची है:

1. **शेर और बगुला** - इस कहानी में, एक बगुला एक शेर की जान बचाता है जिसके गले में एक हड्डी फंस जाती है। बगुला शेर को बचाने के लिए उसके मुंह में जाता है, लेकिन शेर उसे निगलने की कोशिश नहीं करता है। बाद में, शेर बगुला को धन्यवाद देने के बजाय उसे खतरे में डालता है, जिससे बगुला निराश होता है।
2. **राजा के बेटे ने कैसे राजकुमारी लबाम को जीता** - इस कहानी में, एक राजकुमार अपनी माँ की सलाह की अवहेलना करता है और एक जंगल में जाता है, जहाँ वह एक परिवार के राजा से मिलता है। राजकुमार एक राजकुमारी लबाम के बारे में सुनता है और उसे ढूंढने का फैसला करता है। वह कई परीक्षणों और चुनौतियों का सामना करता है, जिसमें एक एंट राजा और एक बाघ से मिलना शामिल है, जो उसे अपनी यात्रा में मदद करते हैं।

इन कहानियों के अलावा, अन्य कहानियों की सूची दी गई है, लेकिन उनकी कहानियाँ प्रदान नहीं की गई हैं।

### Problem: Too much raw information in the context makes the prompt too long.
- Costly
- adds noise
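
To put a rough number on the cost, here is a small sketch. It extrapolates from the `prompt_tokens` value of 5232 reported above for one twentieth of the book, and compares it with a hypothetical retrieved context of ten 1000-character chunks. The 4-characters-per-token ratio is an assumed rule of thumb, not the tokenizer of the model used here.

```python
# Reported in the output above: prompt_tokens for text[:len(text)//20] plus the template.
prompt_tokens_for_slice = 5232
slice_denominator = 20

# Linear extrapolation to the whole book. This is approximate: it ignores
# the fixed cost of the template and any variation in token density.
estimated_full_book_tokens = prompt_tokens_for_slice * slice_denominator


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. chars_per_token=4.0 is an assumed rule of thumb,
    not the tokenizer of any particular model."""
    return int(len(text) / chars_per_token)


# Hypothetical retrieved context: ten chunks of 1000 characters each.
retrieved_context = "\n\n".join("y" * 1000 for _ in range(10))
estimated_rag_tokens = estimate_tokens(retrieved_context)

print(f"Whole book : ~{estimated_full_book_tokens} tokens")
print(f"Ten chunks : ~{estimated_rag_tokens} tokens")
```

Under these assumptions the retrieved context is roughly forty times smaller than the whole book, which is the saving that retrieval is meant to deliver.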
### Solution: Use RAG
https://python.langchain.com/docs/tutorials/rag/
### To understand RAG, we need to understand Semantic Search
https://python.langchain.com/docs/tutorials/retrievers/
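
Semantic search ranks pieces of text by how close their vectors are to the vector of the query. The sketch below shows only the ranking step. It uses plain word counts as a stand-in for a real embedding model, so it matches on shared words rather than on meaning; the `embed` function and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts.
    A real embedding model returns a dense vector that captures meaning."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample chunks, loosely in the spirit of the book.
chunks = [
    "the crane removed the bone from the throat of the lion",
    "the prince set out to find the princess labam",
    "the goldsmith was trapped in the well",
]

query = "what did the lion do"
query_vector = embed(query)

# Rank chunks from most to least similar to the query.
ranked = sorted(
    chunks,
    key=lambda chunk: cosine_similarity(query_vector, embed(chunk)),
    reverse=True,
)
print(ranked[0])
```

With a real embedding model the structure stays the same; only `embed` changes, and the ranking then reflects similarity in meaning instead of word overlap.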



In [3]:
# load text using RELEVANT loader
from langchain_community.document_loaders import TextLoader

loader = TextLoader("indianTales.txt", encoding="utf-8")
docs = loader.load()


In [4]:
# Split document into small chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

print(f"Split given book into {len(all_splits)} sub-documents.")

Split given book into 563 sub-documents.
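
To see what `chunk_size` and `chunk_overlap` mean, here is a simplified sketch using a plain sliding window over characters. This is not how `RecursiveCharacterTextSplitter` works internally (it tries to split on separators such as paragraph and line breaks first); the function name `sliding_window_chunks` is hypothetical and only illustrates the two parameters and the recorded start index.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows.
    Each window starts chunk_size - chunk_overlap characters after the previous one.
    Returns a list of (start_index, chunk_text) pairs."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# A 2500-character sample string made of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_window_chunks(sample)

for start, chunk in chunks:
    print(start, len(chunk))
```

The last 200 characters of one chunk are repeated at the start of the next, so a sentence cut at a chunk boundary still appears whole in at least one chunk.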


### Embedding

In [7]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Create a vector store
# (CLASSROOM DISCUSSION: What are vector stores? Why do we make them?)
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)

# Adding documents to vector store
document_ids = vector_store.add_documents(documents=all_splits)
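
As a starting point for the discussion, here is a toy sketch of what a vector store does: it embeds each text once when it is added, keeps the vectors, and at query time embeds only the query and compares it against the stored vectors. `ToyVectorStore`, `toy_embed`, and the fixed vocabulary are hypothetical names for illustration; this is not the implementation of `InMemoryVectorStore`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length lists of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (vector, text) pairs in a Python list."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.rows = []

    def add_texts(self, texts):
        # Embedding happens once here, not on every query.
        for text in texts:
            self.rows.append((self.embed_fn(text), text))

    def similarity_search_with_score(self, query, k=4):
        query_vector = self.embed_fn(query)
        scored = [(text, cosine(query_vector, vector)) for vector, text in self.rows]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# Toy embedding: counts of a few fixed words. A real model returns dense vectors.
VOCABULARY = ["lion", "tiger", "crane", "king", "prince"]


def toy_embed(text):
    lowered = text.lower()
    return [lowered.count(word) for word in VOCABULARY]


store = ToyVectorStore(toy_embed)
store.add_texts([
    "The crane helped the lion.",
    "The tiger seized the king.",
    "The prince met the king.",
])
results = store.similarity_search_with_score("lion and crane", k=2)
print(results[0])
```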


In [None]:
# document_ids

In [8]:
# extract the chunks that best match your query

search_results = vector_store.similarity_search_with_score(
    "What is the role of Lion in the story?",
    k = 10
)

In [9]:
search_results

[(Document(id='e201d001-d9d2-4d48-8884-40138bbc5df3', metadata={'source': 'indianTales.txt', 'start_index': 117080}, page_content='How came the crown in the jaws of the tiger? The king of Ujjaini had a\nweek before gone with all his hunters on a hunting expedition. All of\na sudden the tiger-king started from the wood, seized the king, and\nvanished.'),
  0.7410933853822097),
 (Document(id='f43b479b-dbc6-44f7-9dfc-20f718fb2876', metadata={'source': 'indianTales.txt', 'start_index': 110013}, page_content='goldsmith. Do not release him; and if you do, you shall surely repent\nof it one day or other." Thus advising, the hungry tiger went away\nwithout waiting for an answer.'),
  0.7157341176789158),
 (Document(id='b4cc63cd-7a8d-4c41-8761-ba4057f4cdf7', metadata={'source': 'indianTales.txt', 'start_index': 110182}, page_content='[Illustration:]'),
  0.7022634590788741),
 (Document(id='09fecd75-d0a0-4b92-8849-8c9c5047286d', metadata={'source': 'indianTales.txt', 'start_index': 12770}, page_

### What is RAG?
Retrieve the chunks most similar to the question using semantic search, and place them in the context of the prompt.
The LLM sees the question and the retrieved chunks in its prompt and generates its answer from them.
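
The "place them in the context" step can be sketched as a small helper. This version takes plain `(text, score)` pairs as stand-ins for the `(Document, score)` pairs returned by the vector store, and it drops very short chunks such as the `[Illustration:]` hit seen in the search results above. The helper name `build_prompt`, the `min_chars` and `max_chunks` parameters, and the scores are illustrative assumptions.

```python
PROMPT_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def build_prompt(question, scored_chunks, min_chars=40, max_chunks=5):
    """Order chunks by score, drop near-empty ones, and fill the template."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    kept = [text for text, _score in ranked if len(text.strip()) >= min_chars]
    context = "\n\n".join(kept[:max_chunks])
    return PROMPT_TEMPLATE.format(question=question, context=context)


# Illustrative stand-ins for search results; the scores are made up.
scored_chunks = [
    ("[Illustration:]", 0.70),
    ("The lion gets well, and one day was eating a buffalo he had killed.", 0.69),
    ("All of a sudden the tiger-king started from the wood, seized the king, and vanished.", 0.74),
]

prompt = build_prompt("What is the role of Lion in the story?", scored_chunks)
print(prompt)
```

The resulting string can be passed to `model.invoke(...)` in the same way as the hand-built prompts in this notebook.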

In [17]:
prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

In [18]:
doc_content = "\n\n".join(doc.page_content+"\n"+"="*50+"\n" for (doc,score) in search_results)
print(doc_content)

How came the crown in the jaws of the tiger? The king of Ujjaini had a
week before gone with all his hunters on a hunting expedition. All of
a sudden the tiger-king started from the wood, seized the king, and
vanished.


goldsmith. Do not release him; and if you do, you shall surely repent
of it one day or other." Thus advising, the hungry tiger went away
without waiting for an answer.


[Illustration:]


mouth struck one end of the bone with his beak. Whereupon the bone
dropped and fell out. As soon as he had caused the bone to fall, he
got out of the lion's mouth, striking the stick with his beak so that
it fell out, and then settled on a branch. The lion gets well, and
one day was eating a buffalo he had killed. The crane thinking "I will
sound him," settled on a branch just over him, and in conversation
spoke this first verse:


"Where do you come from? Who are you?" asked the king, entering the
room.

"O king!" replied the prince, "I am the son of a king who rules over
such-and-su

In [19]:
# make the LLM read the prompt, analyse the retrieved documents, and generate a response

from langchain.chat_models import init_chat_model
model = init_chat_model("llama-3.3-70b-versatile", model_provider="groq")

prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(prompt_template.format(
    context=doc_content,
    question="What is the role of Lion in the story?"))

In [20]:
# Better printing

from IPython.display import Markdown
Markdown(response.content)

There is no mention of a lion playing a role in the overall story, only a brief mention of a lion in a separate anecdote where a crane helps a lion by removing a bone from its mouth. The lion's role is limited to this isolated incident. I don't know the lion's role in the main story.

# RAG Summary

In [22]:
# RAG summary

# Read a doc
from langchain_community.document_loaders import TextLoader
loader = TextLoader("indianTales.txt", encoding="utf-8")
docs = loader.load()


# Split document into small chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

print(f"Split given book into {len(all_splits)} sub-documents.")

# embedding
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Create a vector store
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)

# Adding documents to vector store
document_ids = vector_store.add_documents(documents=all_splits)


Split given book into 563 sub-documents.


In [23]:
# extract the chunks that best match your query

search_results = vector_store.similarity_search_with_score(
    "What is the role of Lion in the story?",
    k = 10
)

doc_content = "\n\n".join(doc.page_content+"\n"+"="*50+"\n" for (doc,score) in search_results)
print(doc_content)

How came the crown in the jaws of the tiger? The king of Ujjaini had a
week before gone with all his hunters on a hunting expedition. All of
a sudden the tiger-king started from the wood, seized the king, and
vanished.


goldsmith. Do not release him; and if you do, you shall surely repent
of it one day or other." Thus advising, the hungry tiger went away
without waiting for an answer.


[Illustration:]


mouth struck one end of the bone with his beak. Whereupon the bone
dropped and fell out. As soon as he had caused the bone to fall, he
got out of the lion's mouth, striking the stick with his beak so that
it fell out, and then settled on a branch. The lion gets well, and
one day was eating a buffalo he had killed. The crane thinking "I will
sound him," settled on a branch just over him, and in conversation
spoke this first verse:


"Where do you come from? Who are you?" asked the king, entering the
room.

"O king!" replied the prince, "I am the son of a king who rules over
such-and-su

In [24]:
prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(prompt_template.format(
    context=doc_content,
    question="What is the role of Lion in the story?"))


from IPython.display import Markdown
Markdown(response.content)

There is no mention of a lion playing a role in the story provided. The context seems to be about a tiger, a king, and other characters, but does not include a lion as a significant character. I don't know the role of a lion in the story as it is not mentioned.