In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

True

## What is RAG (retrieval augmented generation)?
Basically, shoving lot of extra information in the prompt.

In [2]:
# Example 

from langchain.chat_models import init_chat_model

model = init_chat_model(
    "llama-3.3-70b-versatile",
    model_provider="groq",
    api_key="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")

prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(
    prompt_template.format(
        context="Ajit has two sisters, Preeti and Sweta. Ajit is male.",
        question="Who is Preeti's bother?"         
    ))

print(response)

content="Ajit is Preeti's brother. He is also the brother of Sweta. This is stated in the context where it mentions Ajit has two sisters, Preeti and Sweta, implying Ajit is their male sibling." additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 49, 'prompt_tokens': 119, 'total_tokens': 168, 'completion_time': 0.172710254, 'prompt_time': 0.016677427, 'queue_time': 0.052131503, 'total_time': 0.189387681}, 'model_name': 'llama-3.3-70b-versatile', 'system_fingerprint': 'fp_2ddfbb0da0', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None} id='run--2f526dde-3684-4c25-b446-cab089bae9c2-0' usage_metadata={'input_tokens': 119, 'output_tokens': 49, 'total_tokens': 168}


In [3]:
print(response.content)

Ajit is Preeti's brother. He is also the brother of Sweta. This is stated in the context where it mentions Ajit has two sisters, Preeti and Sweta, implying Ajit is their male sibling.


>RAG is all about cleverly pushing is as much information in the context with minimum possible tokens

In [None]:
# little non-trival RAG example

# grab a novel
file = open("The Old Curiosity Shop.txt")
text = file.read()

In [6]:
# lets try to talk with this book

from langchain.chat_models import init_chat_model

model = init_chat_model(
    "llama-3.3-70b-versatile",
    model_provider="groq",
    api_key="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")

# Prepare your prompt
prompt_template = """You are a novel reader. You are given collection of stories:
{collection_of_stories}
You are tasked to make a list of story titles in this collection. Write a short summary for each story in English Language. Skip the story from the list, if the story is not provided in the text. 
"""

response = model.invoke(prompt_template.format(collection_of_stories = text[:len(text)//2000]))

print(response)

content="Based on the provided text, here is the list of story titles with a short summary for each:\n\n1. **The Finest Story in the World** - This story is not provided in the text, so I'll skip it.\n2. **With the Main Guard** - This story is not provided in the text, so I'll skip it.\n3. **Wee Willie Winkie** - This story is not provided in the text, so I'll skip it.\n4. **The Rout of the White Hussars** - This story is not provided in the text, so I'll skip it.\n5. **At Twenty-two** - This story is not provided in the text, so I'll skip it.\n6. **The Courting of Dinah Shadd** - This story is not provided in the text, so I'll skip it.\n7. **The Story of Muhammad Din** - This story is not provided in the text, so I'll skip it.\n\nUnfortunately, none of the stories are provided in the text, so I couldn't write a summary for any of them. If you provide the actual stories, I'd be happy to help with the summaries. \n\nHowever, I can suggest that these stories are part of Rudyard Kipling's

In [7]:
# Better printing

from IPython.display import Markdown
Markdown(response.content)

Based on the provided text, here is the list of story titles with a short summary for each:

1. **The Finest Story in the World** - This story is not provided in the text, so I'll skip it.
2. **With the Main Guard** - This story is not provided in the text, so I'll skip it.
3. **Wee Willie Winkie** - This story is not provided in the text, so I'll skip it.
4. **The Rout of the White Hussars** - This story is not provided in the text, so I'll skip it.
5. **At Twenty-two** - This story is not provided in the text, so I'll skip it.
6. **The Courting of Dinah Shadd** - This story is not provided in the text, so I'll skip it.
7. **The Story of Muhammad Din** - This story is not provided in the text, so I'll skip it.

Unfortunately, none of the stories are provided in the text, so I couldn't write a summary for any of them. If you provide the actual stories, I'd be happy to help with the summaries. 

However, I can suggest that these stories are part of Rudyard Kipling's collection, and they might be related to his typical themes of colonialism, adventure, and human relationships, given his style and the time period in which he wrote.

### Problem: Too much raw information in the context makes the prompt too long.
- Costly
- adds noise
### Solution: Use RAG
https://python.langchain.com/docs/tutorials/rag/
### To understand RAG, we need to understand Semantic Search
https://python.langchain.com/docs/tutorials/retrievers/

In [20]:
# load text using RELEVANT loader
from langchain_community.document_loaders import TextLoader
loader = TextLoader("The Old curiosity Shop.txt")
docs = loader.load()

In [21]:
# Split document into small chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

print(f"Split given book into {len(all_splits)} sub-documents.")

Split given book into 881 sub-documents.


### Embedding

In [22]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key="AIzaSyDQU_npYx8c_ntmKXQKBkU6GBmHV-ndHRI"
)


In [26]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key="AIzaSyDQU_npYx8c_ntmKXQKBkU6GBmHV-ndHRI"
)

# Create a vector store
# (CLASSROOM DISCUSSION: What are vector stores? What do we make them?)
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)

# Adding documents to vector store
document_ids = vector_store.add_documents(documents=all_splits)


GoogleGenerativeAIError: Error embedding content: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. [violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerDayPerUserPerProjectPerModel-FreeTier"
}
violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerMinutePerUserPerProjectPerModel-FreeTier"
}
violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerMinutePerProjectPerModel-FreeTier"
}
violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerDayPerProjectPerModel-FreeTier"
}
, links {
  description: "Learn more about Gemini API quotas"
  url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
]

In [12]:
document_ids

NameError: name 'document_ids' is not defined

In [None]:
# extract chunks which matches with your query

search_results = vector_store.similarity_search_with_score(
    "What is the theme of the book?",
    k = 10
)

GoogleGenerativeAIError: Error embedding content: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. [violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerDayPerProjectPerModel-FreeTier"
}
violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerDayPerUserPerProjectPerModel-FreeTier"
}
violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerMinutePerUserPerProjectPerModel-FreeTier"
}
violations {
  quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
  quota_id: "EmbedContentRequestsPerMinutePerProjectPerModel-FreeTier"
}
, links {
  description: "Learn more about Gemini API quotas"
  url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
]

In [15]:
search_results

NameError: name 'search_results' is not defined

### What is RAG?
Retrieve using semantic search and dump the similar chunks in the context of the prompt.
LLM sees the question and retrieved docs in its prompt and generates tokens accordingly.

In [13]:
prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

In [14]:
doc_content = "\n\n".join(doc.page_content+"\n"+"="*50+"\n" for (doc,score) in search_results)
print(doc_content)

NameError: name 'search_results' is not defined

In [None]:
# make the LLM read see the prompt, and analyse the retrieved document, and generate response

from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key="AIzaSyDQU_npYx8c_ntmKXQKBkU6GBmHV-ndHRI"
)

# Create a vector store
# (CLASSROOM DISCUSSION: What are vector stores? What do we make them?)
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)

# Adding documents to vector store
document_ids = vector_store.add_documents(documents=all_splits)



from langchain.chat_models import init_chat_model
model = init_chat_model("llama-3.3-70b-versatile", model_provider="groq")

prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(prompt_template.format(
    context=doc_content,
    question="What is the theme of the book?"))

GoogleGenerativeAIError: Error embedding content: 400 API key expired. Please renew the API key. [reason: "API_KEY_INVALID"
domain: "googleapis.com"
metadata {
  key: "service"
  value: "generativelanguage.googleapis.com"
}
, locale: "en-US"
message: "API key expired. Please renew the API key."
]

In [39]:
# Better printing

from IPython.display import Markdown
Markdown(response.content)

There is no mention of a lion in the provided context. The context appears to be a list of tale titles and snippets of text from various stories, but none of them mention a lion. I don't know the role of a lion in the story as it is not referenced in the given context.

# RAG Summary

In [None]:
# RAG summary

# Read a doc
from langchain_community.document_loaders import TextLoader
loader = TextLoader("The Old Curiosity Shop.txt")
docs = loader.load()

# Split document into small chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

print(f"Split given book into {len(all_splits)} sub-documents.")

# embedding
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key="AIzaSyCEe4LxbYaFR0qW3hHRk8l-097bt0CE7B0"
)

# Create a vector store
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)

# Adding documents to vector store
document_ids = vector_store.add_documents(documents=all_splits)


Split given book into 1579 sub-documents.


In [None]:
# extract chunks which matches with your query

search_results = vector_store.similarity_search_with_score(
    "What is the theme of the book?",
    k = 11
)

doc_content = "\n\n".join(doc.page_content+"\n"+"="*50+"\n" for (doc,score) in search_results)
print(doc_content)

416  Indian  Tales 

"Take  some  more  whiskey  and  go  on,"  I  said. 
"That  was  the  first  village  you  came  into.  How 
did  you  get  to  be  King  ?  "


Indian  Tales


1 78  Indian  Tales


238  Indian   Tales


The  Man  Who  Would  be  King  427


288  Indian    Tales


226  Indian  Tales


134  Indian  Tales 

' 'But  what  do  you  know  about  Polonius?"  I 
demanded.  This  was  a  new  side  of  Mulvaney's 
character.


The  Gate  of  the  Hundred  Sorrows  451


In  the  House  of  Suddhoo  559


The  Incarnation  of  Krishna  Mulvaney        473



In [None]:
prompt_template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

response = model.invoke(prompt_template.format(
    context=doc_content,
    question="What is the theme of the book?"))


from IPython.display import Markdown
Markdown(response.content)

I don't know the role of the Lion in the story as there is no mention of a Lion in the provided context. The context appears to be a collection of Indian Tales with various titles and snippets of text, but none of them mention a Lion. I couldn't find any relevant information to answer the question.

: 