In [46]:
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.getenv("GOOGLE_API_KEY")
print("API key found:", bool(api_key))


API key found: True


## What is RAG (retrieval augmented generation)?
Basically, stuffing a lot of extra information into the prompt.

In [47]:
# Example 

from langchain.chat_models import init_chat_model
model = init_chat_model("llama-3.3-70b-versatile", model_provider="groq")

prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(
    prompt_template.format(
        context="Founder of SpaceX, CEO of twitter",
        question="Who am i talking about?"         
    ))

print(response)

content='You are talking about Elon Musk, who is the founder of SpaceX and the CEO of Twitter. He is a well-known entrepreneur and business magnate. Elon Musk is likely the person you are referring to based on the given context.' additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 47, 'prompt_tokens': 106, 'total_tokens': 153, 'completion_time': 0.123317843, 'prompt_time': 0.040865754, 'queue_time': 0.045371036, 'total_time': 0.164183597}, 'model_name': 'llama-3.3-70b-versatile', 'system_fingerprint': 'fp_3f3b593e33', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None} id='run--4c758b5d-1553-41a9-9bba-38b40e41eda6-0' usage_metadata={'input_tokens': 106, 'output_tokens': 47, 'total_tokens': 153}


In [48]:
print(response.content)

You are talking about Elon Musk, who is the founder of SpaceX and the CEO of Twitter. He is a well-known entrepreneur and business magnate. Elon Musk is likely the person you are referring to based on the given context.


>RAG is all about cleverly pushing as much information as possible into the context with the fewest possible tokens.

In [49]:
# A slightly less trivial RAG example

# Grab a novel (the with-block closes the file once it has been read)
with open("indianTales.txt", encoding="utf-8") as file:
    text = file.read()

In [50]:
# Let's try to talk with this book

from langchain.chat_models import init_chat_model
model = init_chat_model("openai/gpt-oss-120b", model_provider="groq")

# Prepare your prompt
prompt_template = """You are a novel reader. You are given a collection of stories:
{collection_of_stories}
Your task is to list the story titles in this collection. Write a short summary of each story in English. Skip a story if it is not provided in the text.
"""

# Send only the first 1/20th (5%) of the book to keep the prompt short
response = model.invoke(prompt_template.format(collection_of_stories=text[:len(text) // 20]))

print(response)



In [51]:
# Better printing

from IPython.display import Markdown
Markdown(response.content)

**Story Titles and Summaries from the Provided Text**

| # | Title | Short Summary |
|---|-------|----------------|
| 1 | **The Lion and the Crane** | A lion is choking on a bone; a white crane (the Bodhisatta in a past life) bravely extracts it by inserting its head into the lion’s mouth and using a stick to pry the bone out. After the lion recovers, the crane asks for gratitude, but the lion replies that he would gladly eat the crane again. The tale ends with the moral that ingratitude follows a good deed. |
| 2 | **How the Raja’s Son Won the Princess Labam** | A prince is forbidden by his mother to hunt on the fourth side of the kingdom, where the beautiful Princess Labam lives. Ignoring the warning, he ventures there, encounters a lone parrot that refuses to reveal Labam’s location, and returns home despondent. Determined, he sets out on a long quest, aided by a series of kindnesses: an ant‑king promises help if needed, he removes a thorn from a tiger’s foot and gains the tiger’s gratitude, and later meets four fakirs who each possess a magical object. The story ends (in the excerpt) with the prince gathering these allies, preparing him for the ultimate challenge of finding Princess Labam. |

*Only these two stories appear in the portion of the ebook that was supplied, so they are the only titles for which a summary can be provided.*

### Problem: Too much raw information in the context makes the prompt too long.
- Costly
- Adds noise
### Solution: Use RAG
https://python.langchain.com/docs/tutorials/rag/
### To understand RAG, we need to understand Semantic Search
https://python.langchain.com/docs/tutorials/retrievers/
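
Semantic search ranks texts by how close their embedding vectors are to the embedding of the query. The sketch below illustrates the idea with cosine similarity on hand-made vectors. The three-dimensional vectors, the example sentences, and the names `toy_index` and `query_vector` are assumptions for illustration only; a real embedding model returns vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings", made up by hand for this example.
toy_index = {
    "The tiger thanked the Raja's son.": [0.9, 0.1, 0.0],
    "The crane pulled the bone out.": [0.1, 0.9, 0.1],
    "The princess lived far away.": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the query "Who helped the tiger?"
query_vector = [0.8, 0.2, 0.1]

# Rank the stored texts from most to least similar to the query.
ranked = sorted(
    toy_index,
    key=lambda text: cosine_similarity(query_vector, toy_index[text]),
    reverse=True,
)
best_match = ranked[0]
print(best_match)
```

The query vector points in almost the same direction as the tiger sentence, so that sentence ranks first even though no keyword matching takes place.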

In [52]:
# Load the text with the loader that matches the file type (TextLoader for plain text)
from langchain_community.document_loaders import TextLoader
loader = TextLoader("indianTales.txt",encoding="utf-8")
docs = loader.load()

In [53]:
# Split document into small chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

print(f"Split given book into {len(all_splits)} sub-documents.")

Split given book into 563 sub-documents.
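
To see what `chunk_size` and `chunk_overlap` mean, here is a minimal sketch of fixed-size windowing with overlap. It is a simplification, not the algorithm `RecursiveCharacterTextSplitter` uses: that splitter tries to break on separators such as paragraphs and line breaks before falling back to smaller pieces. The function name `chunk_with_overlap` is hypothetical, and the returned start offset plays the role of `add_start_index`.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters, where each window
    shares chunk_overlap characters with the previous one.

    Returns a list of (start_index, chunk) pairs.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
windows = chunk_with_overlap(sample, chunk_size=10, chunk_overlap=4)
for start, chunk in windows:
    print(start, chunk)
```

Each window starts 6 characters after the previous one, so its first 4 characters repeat the end of the previous window. The overlap keeps a sentence that straddles a boundary visible in at least one chunk.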


### Embedding

In [54]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

# Load API key from environment
api_key = os.getenv("GOOGLE_API_KEY")

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/text-embedding-004",
    api_key=api_key
)

vector_store = InMemoryVectorStore(embeddings)

# all_splits must be a list of Document objects
# Example:
# from langchain_core.documents import Document
# all_splits = [Document(page_content="This is text")]

document_ids = vector_store.add_documents(documents=all_splits)


In [55]:
document_ids

['6226ab41-14af-4b85-8b87-21667e54ccba',
 '9339d358-e681-4886-8020-968d4ad0c85e',
 '6fa9deb0-9531-492f-bb96-0a3c60da44fa',
 'f6855bbf-491b-4786-a306-8f148d9281fc',
 'dbe7a5ce-e9e9-4b31-b724-c81d6e79f47f',
 'a446f69a-d6c5-4531-8d97-12a448490d36',
 'bb694dbc-4337-4ac4-a4fa-3c75216cd3cf',
 '4f26baa7-6b9e-4adc-86f5-3688211c51de',
 '7fe796ce-f74b-477f-88d8-1718b9f1ca86',
 '559389d7-c943-484c-9df5-ca01281ac151',
 'dc2fcaac-bb00-4a88-b73a-b662f82bd204',
 'a4ee8093-c8a8-40fa-9ea1-6e1577b7b569',
 'b4b72645-a8fd-4a19-b00b-04fe63318f3a',
 'e3344c95-d530-4354-a663-445c37960ea0',
 '5c521b41-6e41-46a3-9a2b-6f31c9f16013',
 'e71bd753-397e-4ad9-8582-c203095b35f0',
 '5098a2da-3e40-4e1b-8a1f-168dd22bc7db',
 '1925512b-4eed-4bfb-b127-f064d64e3eda',
 '21b92d45-ff07-40f8-b986-54e97f6e8bd6',
 '183af207-2674-4776-b3e8-77d3e0293b30',
 '3afd715d-1842-4cb9-b693-d0e2ede64744',
 '16de8f9b-1a29-4127-b759-ddccc0dfa5e5',
 '502e8ced-210e-4b7c-92bb-2736aa192f91',
 '71154d54-af2c-49fa-a88a-3bb385260527',
 'dea8b5d9-0558-

In [56]:
# Extract the chunks that best match the query

search_results = vector_store.similarity_search_with_score(
    "What is the role of Lion in the story?",
    k=10
)

In [57]:
search_results

[(Document(id='caf3b161-c8f4-4281-b1ea-08e9534b680f', metadata={'source': 'indianTales.txt', 'start_index': 110013}, page_content='goldsmith. Do not release him; and if you do, you shall surely repent\nof it one day or other." Thus advising, the hungry tiger went away\nwithout waiting for an answer.'),
  0.730208683032708),
 (Document(id='5a06c984-4405-4b91-ae02-b5266446a48d', metadata={'source': 'indianTales.txt', 'start_index': 30490}, page_content='When he heard of the demons the Raja\'s son was very sad. "What can I\ndo?" he said to himself. "How can I fight with these two demons?" Then\nhe thought of his tiger: and the tiger and his wife came to him and\nsaid, "Why are you so sad?" The Raja\'s son answered, "The king has\nordered me to fight with his two demons and kill them. How can I do\nthis?" "Do not be frightened," said the tiger. "Be happy. I and my\nwife will fight with them for you."\n\n[Illustration:]'),
  0.7208798323482595),
 (Document(id='f1f9169c-6748-47cd-bc77-1478b1
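
The raw list of `(Document, score)` tuples is hard to read. Below is a small sketch for printing one line per hit. To keep it runnable without an API key it uses a hypothetical stand-in class, `StandInDocument`, that exposes only `page_content` and `metadata`; the two sample rows copy the scores and start indices from the output above, with the text shortened. The helper name `summarize_results` is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class StandInDocument:
    """Hypothetical stand-in with the two attributes used below."""
    page_content: str
    metadata: dict = field(default_factory=dict)


# Two (document, score) pairs shaped like the output above; text shortened.
sample_results = [
    (StandInDocument("goldsmith. Do not release him;", {"start_index": 110013}),
     0.730208683032708),
    (StandInDocument("When he heard of the demons the Raja's son was very sad.",
                     {"start_index": 30490}),
     0.7208798323482595),
]


def summarize_results(results, preview_chars=30):
    """Return one 'score | start | preview' line per (document, score) pair."""
    lines = []
    for doc, score in results:
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        start = doc.metadata.get("start_index")
        lines.append(f"{score:.3f} | start={start} | {preview}")
    return lines


summary_lines = summarize_results(sample_results)
print("\n".join(summary_lines))
```

In the notebook, the same helper should work on `search_results` directly, since it only reads `page_content` and `metadata` from each document.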

### What is RAG?
Retrieve the most similar chunks using semantic search and place them in the context of the prompt.
The LLM sees the question and the retrieved chunks in its prompt and generates its answer from them.
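
The retrieve-then-prompt flow can be sketched without any model or API key. In the sketch below, word overlap stands in for embedding similarity, which is a deliberate simplification and not how the vector store scores chunks. The names `word_set`, `retrieve`, `build_prompt`, and `toy_chunks`, and the shortened template, are assumptions for illustration.

```python
PROMPT_TEMPLATE = """Use the following retrieved context to answer the question.
Question: {question}
Context: {context}
Answer:"""


def word_set(text):
    """Lower-cased words of text with surrounding punctuation removed."""
    return {word.strip('.,?!";:').lower() for word in text.split()} - {""}


def retrieve(question, chunks, k=2):
    """Return the k chunks that share the most words with the question."""
    question_words = word_set(question)
    scored = [(len(question_words & word_set(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(question, chunks, k=2):
    """Retrieve the top-k chunks and place them in the prompt template."""
    context = "\n\n".join(retrieve(question, chunks, k))
    return PROMPT_TEMPLATE.format(question=question, context=context)


toy_chunks = [
    "A Raja's son took the thorn out of the tiger's foot.",
    "The crane pulled a bone out of the lion's throat.",
    "Princess Labam lived on the fourth side of the kingdom.",
]
rag_prompt = build_prompt("Who took the thorn out of the tiger's foot?", toy_chunks, k=1)
print(rag_prompt)
```

The resulting string is what would be passed to `model.invoke(...)`. The model never sees the whole book, only the question and the chunks that retrieval selected.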

In [58]:
prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

In [59]:
doc_content = "\n\n".join(doc.page_content + "\n" + "=" * 50 + "\n" for doc, score in search_results)
print(doc_content)

goldsmith. Do not release him; and if you do, you shall surely repent
of it one day or other." Thus advising, the hungry tiger went away
without waiting for an answer.


When he heard of the demons the Raja's son was very sad. "What can I
do?" he said to himself. "How can I fight with these two demons?" Then
he thought of his tiger: and the tiger and his wife came to him and
said, "Why are you so sad?" The Raja's son answered, "The king has
ordered me to fight with his two demons and kill them. How can I do
this?" "Do not be frightened," said the tiger. "Be happy. I and my
wife will fight with them for you."

[Illustration:]


[Illustration:]

"What man hurt you that you roared so loud?" said the wife.

"No one hurt me," answered the husband; "but a Raja's son came and
took the thorn out of my foot."

"Where is he? Show him to me," said his wife.

"If you promise not to kill him, I will call him," said the tiger.

"I won't kill him; only let me see him," answered his wife.

Then the ti
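
Joining all ten chunks works here, but the goal stated earlier is to use as few tokens as possible. One possible refinement, sketched below, is to keep only the best-scoring chunks that fit a size budget. This is not part of the notebook's pipeline. It assumes that a higher score means a closer match (as the descending scores in the search output suggest) and uses character count as a rough proxy for token count. The name `pack_context` and the toy data are hypothetical.

```python
def pack_context(scored_chunks, max_chars):
    """Greedily keep the highest-scoring chunks whose total length fits max_chars.

    scored_chunks is a list of (text, score) pairs; higher score = better match.
    """
    picked = []
    used = 0
    for text, _score in sorted(scored_chunks, key=lambda pair: pair[1], reverse=True):
        if used + len(text) > max_chars:
            continue  # this chunk would exceed the budget, try the next one
        picked.append(text)
        used += len(text)
    return picked


toy_scored = [("aaaa", 0.9), ("bbbbbbbbbb", 0.8), ("cc", 0.5)]
packed = pack_context(toy_scored, max_chars=7)
print(packed)
```

With a budget of 7 characters, the 10-character chunk is skipped even though it scores second best, and the shorter third chunk is kept instead.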

In [60]:
# Make the LLM read the prompt, analyse the retrieved chunks, and generate a response

from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

# Create a vector store (this rebuilds the store from In [54], so all chunks are embedded again)
# (CLASSROOM DISCUSSION: What are vector stores? Why do we make them?)
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)

# Adding documents to vector store
document_ids = vector_store.add_documents(documents=all_splits)



from langchain.chat_models import init_chat_model
model = init_chat_model("llama-3.3-70b-versatile", model_provider="groq")

prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(prompt_template.format(
    context=doc_content,
    question="What is the role of Lion in the story?"))

In [61]:
# Better printing

from IPython.display import Markdown
Markdown(response.content)

There is no mention of a lion playing a specific role in the story provided. The context appears to be about a tiger, a Raja's son, and other characters, but it does not discuss a lion's role. I don't know the role of a lion in the story.

# RAG Summary

In [62]:
# RAG summary

# Read a doc
from langchain_community.document_loaders import TextLoader
loader = TextLoader("indianTales.txt",encoding="utf-8")
docs = loader.load()

# Split document into small chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

print(f"Split given book into {len(all_splits)} sub-documents.")

# embedding
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

# Create a vector store
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)

# Adding documents to vector store
document_ids = vector_store.add_documents(documents=all_splits)


Split given book into 563 sub-documents.


In [63]:
# Extract the chunks that best match the query

search_results = vector_store.similarity_search_with_score(
    "What is the role of Lion in the story?",
    k=10
)

doc_content = "\n\n".join(doc.page_content + "\n" + "=" * 50 + "\n" for doc, score in search_results)
print(doc_content)

goldsmith. Do not release him; and if you do, you shall surely repent
of it one day or other." Thus advising, the hungry tiger went away
without waiting for an answer.


When he heard of the demons the Raja's son was very sad. "What can I
do?" he said to himself. "How can I fight with these two demons?" Then
he thought of his tiger: and the tiger and his wife came to him and
said, "Why are you so sad?" The Raja's son answered, "The king has
ordered me to fight with his two demons and kill them. How can I do
this?" "Do not be frightened," said the tiger. "Be happy. I and my
wife will fight with them for you."

[Illustration:]


[Illustration:]

"What man hurt you that you roared so loud?" said the wife.

"No one hurt me," answered the husband; "but a Raja's son came and
took the thorn out of my foot."

"Where is he? Show him to me," said his wife.

"If you promise not to kill him, I will call him," said the tiger.

"I won't kill him; only let me see him," answered his wife.

Then the ti

In [64]:
prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(prompt_template.format(
    context=doc_content,
    question="What is the role of Lion in the story?"))


from IPython.display import Markdown
Markdown(response.content)

There is no mention of a lion playing a role in the provided story context. The context seems to be about a tiger and other characters, but does not mention a lion's role. I don't know the answer based on the given context.