## **What Is a RAG System?**

Retrieval-augmented generation (RAG) retrieves the data relevant to a query and passes it to the LLM as additional context. Instead of relying solely on knowledge derived from its training data, a RAG workflow pulls relevant information from a knowledge base, connecting a static LLM with up-to-date data retrieval.
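The retrieve-then-augment idea can be sketched without any library. This is only an illustration, not the pipeline built later in this notebook: the toy knowledge base, the word-overlap scoring, and the helper names `tokenize`, `retrieve`, and `build_prompt` are all assumptions made for the example.

```python
import re

# Toy knowledge base (hypothetical facts, for illustration only).
KNOWLEDGE_BASE = [
    "The refund window for online orders is 30 days.",
    "Support is available on weekdays from 9am to 5pm.",
    "Shipping to Europe takes five to seven business days.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, docs, k=1):
    """Return the k docs that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, docs):
    """Augment the question with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(question, docs))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("How long is the refund window?", KNOWLEDGE_BASE))
```

A real system replaces the word-overlap score with embedding similarity and sends the resulting prompt to an LLM, but the shape of the loop stays the same.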


### **Why Create a RAG System?**

RAG gives LLM systems access to factual, access-controlled, and timely information.

**1. RAG Helps Avoid Making Things Up (Hallucination)**

**Example:**
In finance, giving wrong advice about investments like stocks or mutual funds can cause people to lose money. RAG helps avoid this by looking up the right information from trusted sources before giving answers.


**2. Saves Money Compared to Training a Whole New Model**

**Example:**
Banks assess whether an applicant is likely to repay a loan. Training a special AI model just for this can be expensive. Instead, RAG can use an existing AI model and simply look up the needed information in databases, saving time and money.


**3. Gives Accurate and Trustworthy Answers**

**Example:**
When customers ask questions, they expect correct and clear answers. RAG can pull real information from product manuals, rules, or past chats to give smart and helpful replies that build trust.


**4. Good at Giving Info for a Specific Field**

**Example:**
In law, advice must match the specific rules of a place or case. RAG can search legal documents like local laws or past cases to give answers that fit exactly what the client needs.


[10 Reasons Why You Need to Implement RAG: A Game Changer in AI](https://www.advancinganalytics.co.uk/blog/2023/11/7/10-reasons-why-you-need-to-implement-rag-a-game-changer-in-ai)

#### **Practical RAG Use Cases**
1. Document Question Answering Systems
2. Conversational agents
3. Real-time Event Commentary
4. Content Generation
5. Personalised Recommendation
6. Virtual Assistants

In [2]:
# ! uv pip install langchain openai tiktoken rapidocr-onnxruntime

In [1]:
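import os
from dotenv import load_dotenv

# Alternative to a .env file: set the key by hand, e.g.
# os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

# Load API keys from the .env file (this notebook uses Groq models)
load_dotenv()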

True
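`load_dotenv()` returning `True` only says that a `.env` file was found, not that the right keys are in it. A small check like the following can catch a missing key early. The variable names `GROQ_API_KEY` and `GOOGLE_API_KEY` are assumptions based on what the Groq and Google integrations used later typically read; adjust them to your setup.

```python
import os

# Assumed variable names: GROQ_API_KEY for langchain_groq and
# GOOGLE_API_KEY for langchain_google_genai. Adjust if yours differ.
REQUIRED_KEYS = ("GROQ_API_KEY", "GOOGLE_API_KEY")

def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = missing_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys are set.")
```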

1. Data Ingestion
2. Data Retrieval
3. Data Generation

#### **Data Ingestion**

The data used here is the full text of the "State of the Union" address, which will be processed for retrieval-augmented generation tasks.

[State of the Union - Wikipedia](https://en.wikipedia.org/wiki/State_of_the_Union#:~:text=Though%20the%20language%20of%20the,as%20late%20as%20March%207)

In [5]:
# ! uv pip install --upgrade langchain

In [7]:
# TextLoader was imported twice; one import from langchain_community is enough
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS

In [16]:
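# Read the raw text directly (optional; handy for a quick look at the file)
with open("state_of_the_union.txt", "r", encoding="utf8") as f:
    data = f.read()

# Load the same file as a list of LangChain Document objects
loader = TextLoader("state_of_the_union.txt", encoding="utf8")
document = loader.load()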

In [11]:
print('\n'.join(document[0].page_content.splitlines()[:5]))

Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  

Last year COVID-19 kept us apart. This year we are finally together again. 

Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. 


### **Chunking of the Data**

#### **All Text Splitters Available in LangChain**

🔗 [LangChain Text Splitters Documentation](https://python.langchain.com/docs/how_to/#text-splitters)


#### **CharacterTextSplitter vs RecursiveCharacterTextSplitter**

* **CharacterTextSplitter**: Splits text on a single separator (such as a newline or a period).
* **RecursiveCharacterTextSplitter**: Smarter. It tries to split on larger units first (like paragraphs) and, if a piece is still too large, falls back to smaller ones (sentences, words).
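The "coarse separators first, finer ones only when needed" idea can be sketched in plain Python. This is a simplified illustration and not LangChain's implementation: the function name `recursive_split` and the separator list are assumptions, there is no overlap handling, and the separator at a chunk boundary is simply dropped.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", ". ", " ")):
    """Split text into pieces of at most chunk_size characters,
    trying coarse separators first and finer ones only when needed."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: fall back to a hard cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small parts
        else:
            if current:
                chunks.append(current)
            if len(part) <= chunk_size:
                current = part
            else:
                # This part is still too big: recurse with finer separators.
                chunks.extend(recursive_split(part, chunk_size, rest))
                current = ""
    if current:
        chunks.append(current)
    return chunks

text = "First paragraph is short.\n\nSecond paragraph is a bit longer. It has two sentences."
for chunk in recursive_split(text, chunk_size=40):
    print(repr(chunk))
```

The short first paragraph survives as one chunk, while the longer second paragraph is broken at the sentence boundary because it does not fit in 40 characters.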


#### **Visualize Text Chunking**

🔗 [Chunking Visualizer Tool](https://chunkviz.up.railway.app/)

Use this tool to see how your text will be split using different splitters.
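The splitter used in this notebook takes a `chunk_size` and a `chunk_overlap`. To see locally what those two numbers mean, here is a minimal fixed-window sketch. It is an assumption-laden simplification (the helper `chunk_with_overlap` is hypothetical and cuts purely by character count, ignoring separators), so its chunks will not match LangChain's output.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in chunk_with_overlap(sample, chunk_size=10, chunk_overlap=3):
    print(chunk)
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so the last three characters of one chunk reappear at the start of the next. The overlap keeps a sentence that straddles a boundary partly visible in both chunks.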

In [14]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Chunks of at most 500 characters, with up to 50 characters shared between neighbours
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
text_chunk = text_splitter.split_documents(documents=document)
text_chunk

[Document(metadata={'source': 'state_of_the_union.txt'}, page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny.'),
 Document(metadata={'source': 'state_of_the_union.txt'}, page_content='Six days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelensk

In [17]:
text_chunk[1].page_content

'Six days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.'

In [15]:
print(text_chunk[3].page_content)

Please rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. 

Throughout our history we’ve learned this lesson when dictators do not pay a price for their aggression they cause more chaos.   

They keep moving.   

And the costs and the threats to America and the world keep rising.   

That’s why the NATO Alliance was created to secure peace and stability in Europe after World War 2.


In [26]:
# ! uv pip install faiss-cpu

In [29]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

# Embedding model that turns each chunk into a vector
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Build the vector store from the chunks and their embeddings
vectorstore = FAISS.from_documents(text_chunk, embeddings)

# Create a retriever
retriver=vectorstore.as_retriever()
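Conceptually, the retriever embeds the query and returns the chunks whose vectors are closest to it. The sketch below shows that ranking step with hand-written vectors. Everything in it is an assumption for illustration: the three-dimensional "embeddings" are made up, the helper names are hypothetical, and cosine similarity is used because it is easy to read (FAISS may use a different metric, such as Euclidean distance, depending on configuration).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, indexed_chunks, k=2):
    """Return the texts of the k chunks most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Hypothetical 3-dimensional "embeddings"; real models use many more dimensions.
toy_index = [
    ("chunk about Ukraine", [0.9, 0.1, 0.0]),
    ("chunk about gas prices", [0.1, 0.9, 0.1]),
    ("chunk about vaccines", [0.0, 0.2, 0.9]),
]
toy_query = [0.8, 0.2, 0.0]
print(top_k(toy_query, toy_index, k=2))
```

A vector store does the same job at scale, using an index structure so that it does not have to compare the query against every stored vector one by one.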

In [30]:
from langchain.prompts import ChatPromptTemplate

template = """
You are an intelligent assistant for answering questions using retrieved context.

Instructions:
- Use only the information provided in the context to answer the question.
- If the answer is not in the context, respond with: "I don't know."
- Do not make up information.
- Use as many relevant sentences as needed from the context.
- Keep your answer clear and concise.

Question: {question}
Context: {context}
Answer:
"""

prompt = ChatPromptTemplate.from_template(template)
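The template has two placeholders, `{question}` and `{context}`, which are filled in at query time. The substitution step can be sketched with plain string formatting. The shortened `toy_template` and the helper `fill_prompt` are assumptions for illustration; `ChatPromptTemplate` additionally wraps the result in chat messages.

```python
# Shortened stand-in for the notebook's template (same two placeholders).
toy_template = (
    "Use only the context to answer.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)

def fill_prompt(question, chunks):
    """Join the retrieved chunks and substitute both placeholders."""
    context = "\n\n".join(chunks)
    return toy_template.format(question=question, context=context)

print(fill_prompt(
    "What is the topic?",
    ["First retrieved chunk.", "Second retrieved chunk."],
))
```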

In [35]:
from langchain_groq import ChatGroq
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Converts the model's chat message into a plain string
output_parser = StrOutputParser()

# Chat model served by Groq
llm = ChatGroq(model="moonshotai/kimi-k2-instruct")

In [36]:
chain = (
    {"context": retriver, "question": RunnablePassthrough()} | prompt | llm | output_parser
)
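The `|` operator pipes each step's output into the next: the dictionary gathers the context and the question, the prompt formats them, the LLM answers, and the parser extracts a string. The toy class below mimics that composition so the data flow is easy to follow. It is a sketch under stated assumptions, not how LangChain implements runnables: `Step`, `gather`, `toy_prompt`, `fake_llm`, and `toy_parser` are all hypothetical stand-ins.

```python
class Step:
    """Wrap a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b runs a first, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

# Hypothetical stand-ins for retriever + passthrough, prompt, LLM, and parser.
gather = Step(lambda q: {"context": f"<chunks for: {q}>", "question": q})
toy_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda text: {"content": text.upper()})
toy_parser = Step(lambda message: message["content"])

toy_chain = gather | toy_prompt | fake_llm | toy_parser
print(toy_chain.invoke("who spoke?"))
```

The fake LLM just upper-cases its input, which makes it easy to see that the question went through every stage in order.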

In [37]:
response = chain.invoke("Who is the Russian president?")
type(response)

str

In [41]:
response.split(".")

['The context refers to Putin as a Russian dictator and mentions his actions regarding Ukraine',
 ' However, it does not explicitly state that he is the Russian president',
 " Therefore, based solely on the provided context, I don't know",
 '']
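Splitting on `"."` leaves a leading space on each piece, drops the periods, and produces a trailing empty string, as the output above shows. If cleaner sentences are wanted, a regular expression can be used instead. This is a rough sketch (the helper `split_sentences` is hypothetical, and it will mis-split abbreviations such as "U.S. is"):

```python
import re

def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' and drop empty pieces."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

sample_answer = "The context mentions Putin. It does not give his title. I don't know."
print(split_sentences(sample_answer))
```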

In [42]:
chain.invoke("How is the United States supporting Ukraine economically and militarily?")

'The United States is supporting Ukraine economically and militarily by providing military assistance, economic assistance, and humanitarian assistance. They are giving more than $1 billion in direct assistance to Ukraine and will continue to aid the Ukrainian people as they defend their country.'

In [44]:
chain.invoke("What action is the U.S. taking to address rising gas prices?")

'The U.S. is addressing rising gas prices by releasing 30 million barrels from its own Strategic Petroleum Reserve and stands ready to do more if necessary. Additionally, the United States has worked with 30 other countries to release 60 million barrels of oil from reserves around the world.'

In [None]:

# Dry Run Explanation:
# 1. Import necessary libraries for environment setup, document loading, text splitting, embeddings, vector storage, and LLM integration.
# 2. Load environment variables (e.g., API keys) from a .env file using dotenv.
# 3. Load the "State of the Union" text file using TextLoader from LangChain.
# 4. Split the loaded document into smaller chunks using RecursiveCharacterTextSplitter for efficient processing.
# 5. Create embeddings for the text chunks using GoogleGenerativeAIEmbeddings (model: embedding-001).
# 6. Build a FAISS vector store from the text chunks and embeddings for efficient similarity search.
# 7. Set up a retriever from the FAISS vector store to fetch relevant context.
# 8. Define a prompt template for the LLM to answer questions based only on the retrieved context.
# 9. Initialize a ChatGroq LLM model (moonshotai/kimi-k2-instruct) for generating answers.
# 10. Create a chain that combines the retriever, prompt, LLM, and output parser to process queries.
# 11. Example queries are executed to demonstrate the RAG system's functionality.

import os
from dotenv import load_dotenv
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Load environment variables (e.g., API keys)
load_dotenv()

# Load the State of the Union text file
# Dry Run: TextLoader reads the file 'state_of_the_union.txt' and loads it into a LangChain Document object.
loader = TextLoader('state_of_the_union.txt', encoding="utf8")
document = loader.load()

# Split the document into chunks
# Dry Run: RecursiveCharacterTextSplitter splits the document into chunks of 500 characters with 50-character overlap.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
text_chunks = text_splitter.split_documents(documents=document)

# Initialize embeddings
# Dry Run: GoogleGenerativeAIEmbeddings creates vector representations of text chunks using Google's embedding-001 model.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Build FAISS vector store
# Dry Run: FAISS.from_documents creates a vector store from the text chunks and their embeddings for similarity search.
vectorstore = FAISS.from_documents(text_chunks, embeddings)

# Create a retriever
# Dry Run: The vector store is converted to a retriever to fetch relevant document chunks based on query similarity.
retriever = vectorstore.as_retriever()

# Define the prompt template
# Dry Run: ChatPromptTemplate defines a template instructing the LLM to answer using only the provided context.
template = """
You are an intelligent assistant for answering questions using retrieved context.

Instructions:
- Use only the information provided in the context to answer the question.
- If the answer is not in the context, respond with: "I don't know."
- Do not make up information.
- Use as many relevant sentences as needed from the context.
- Keep your answer clear and concise.

Question: {question}
Context: {context}
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)

# Initialize the LLM
llm = ChatGroq(model="moonshotai/kimi-k2-instruct")

# Create the RAG chain
# Dry Run: The chain combines the retriever (fetches context), prompt (formats input), LLM (generates answer), and output parser (extracts string output).
output_parser = StrOutputParser()
chain = (
    {"context": retriever, "question": RunnablePassthrough()} | prompt | llm | output_parser
)


# Run the example queries
response1 = chain.invoke("Who is the Russian president?")
print("Response 1:", response1)

response2 = chain.invoke("How is the United States supporting Ukraine economically and militarily?")
print("Response 2:", response2)

response3 = chain.invoke("What action is the U.S. taking to address rising gas prices?")
print("Response 3:", response3)
