# A Gentle Introduction to RAG Applications

This notebook creates a simple RAG (Retrieval-Augmented Generation) system to answer questions from a PDF document using an open-source model.

In [2]:
PDF_FILE = "MANISHMADAN.pdf"

# We'll be using Llama 3.2 through Ollama. The "llama3.2" tag pulls the 3B model by default.
MODEL = "llama3.2"

## Loading the PDF document

Let's start by loading the PDF document and breaking it down into separate pages.

<img src='images/documents.png' width="1000">

In [7]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader(PDF_FILE)
pages = loader.load()

print(f"Number of pages: {len(pages)}")

Number of pages: 1


## Splitting the pages in chunks

A full page is usually too long to embed and retrieve precisely, so let's split each page into smaller chunks that overlap slightly.
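
To see what `chunk_size` and `chunk_overlap` mean, here is a minimal sketch of fixed-size splitting with overlap. It is a simplification, not what `RecursiveCharacterTextSplitter` does internally: the real splitter first tries to break on separators such as paragraphs, lines, and spaces. The function name `split_with_overlap` and the sample text are made up for illustration.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
```

Each piece starts with the last three characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.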

<img src='images/splitter.png' width="1000">


In [8]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=100)

chunks = splitter.split_documents(pages)
print(f"Number of chunks: {len(chunks)}")
print(f"Length of a chunk: {len(chunks[1].page_content)}")
print("Content of a chunk:", chunks[1].page_content)


Number of chunks: 3
Length of a chunk: 1487
Content of a chunk: Frontend : React.js, Next.js, TailwindCSS, MaterialUI, ChakraUI, HTML, CSS
Backend : Node.js, Express, Flask
Database : MongoDB, PostgreSQL
Tools : Git, GitHub, Docker, Postman
Projects
LegalEdge |Node.js, JavaScript, Express.js, MongoDb github.com/ManishMadan2882/hackman-team
•Engineered a sophisticated server-side solution empowering legal advice seekers through a community platform, driving
accessibility and collaboration in legal consultations.
•Integrated an anonymous posting feature, to share confidential matters discreetly, enhancing user privacy and trust.
algoRythm |React, Node.js, Express.js, Docker, TailwindCSS github.com/ManishMadan2882/algoRythm
•Virtual Compiler for 5+ programming languages (C, C++, Java, JavaScript, Python, C#) hosted on Vercel .
•Executed child processes to improve performance, enabling concurrent execution and efficient resource utilization.
•Containerized and deployed the API with Docker,

## Storing the chunks in a vector store

We can now generate embeddings for every chunk and store them in a vector store.
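
As a rough picture of what happens here, the sketch below builds a tiny in-memory store. It assumes a toy "embedding" that just counts words (`toy_embed` is hypothetical and stands in for the Ollama model), and it compares vectors with cosine similarity. A real embedding model produces dense vectors that capture meaning, and FAISS indexes them far more efficiently.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


texts = [
    "docker deployed the api",
    "react frontend with tailwindcss",
    "mongodb database schema",
]
# The "vector store" is just a list of (vector, original text) pairs.
store = [(toy_embed(t), t) for t in texts]

query_vector = toy_embed("which database was used")
scores = [(cosine_similarity(query_vector, vec), text) for vec, text in store]
best_score, best_text = max(scores)
print(best_text)
```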

<img src='images/vectorstore.png' width="1000">


In [9]:
from langchain_community.vectorstores import FAISS
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model=MODEL)
vectorstore = FAISS.from_documents(chunks, embeddings)

## Setting up a retriever

We can use a retriever to find chunks in the vector store that are similar to a supplied question.
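
Conceptually, a retriever embeds the question, scores every stored chunk against it, and returns the best matches. The sketch below assumes hand-written two-dimensional vectors and made-up chunk labels in place of real embeddings; `retrieve` is an illustrative helper, not a LangChain function.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vector: list[float], indexed_chunks: list, k: int = 2) -> list[str]:
    """Return the texts of the k chunks whose vectors are closest to the query."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine(query_vector, item[0]),
        reverse=True,
    )
    return [text for _, text in ranked[:k]]


# Hypothetical (vector, text) pairs standing in for embedded chunks.
indexed_chunks = [
    ([1.0, 0.0], "skills chunk"),
    ([0.7, 0.7], "projects chunk"),
    ([0.0, 1.0], "achievements chunk"),
]

top = retrieve([0.9, 0.1], indexed_chunks, k=2)
print(top)
```

Note that a retriever always returns its closest matches, even when nothing in the store is actually relevant to the question.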

<img src='images/retriever.png' width="1000">



In [10]:
retriever = vectorstore.as_retriever()
retriever.invoke("What can you get away with when you only have a small number of users?")

[Document(metadata={'source': 'MANISHMADAN.pdf', 'page': 0}, page_content='•Pioneered Calendly based appointment booking functionality to create an interface between Lawyers and potential Clients.\nHack-elite |1st Runner up August 2023\n•Implemented MindsDB for abstraction of ML Regression Models to trigger health risks with an accuracy of 76%.\n•Integrated a Fitness Bot with OpenAI GPT 3.0 API, collecting information from Google Fit API for personalization.'),
 Document(metadata={'source': 'MANISHMADAN.pdf', 'page': 0}, page_content='Frontend : React.js, Next.js, TailwindCSS, MaterialUI, ChakraUI, HTML, CSS\nBackend : Node.js, Express, Flask\nDatabase : MongoDB, PostgreSQL\nTools : Git, GitHub, Docker, Postman\nProjects\nLegalEdge |Node.js, JavaScript, Express.js, MongoDb github.com/ManishMadan2882/hackman-team\n•Engineered a sophisticated server-side solution empowering legal advice seekers through a community platform, driving\naccessibility and collaboration in legal consultations.

## Configuring the model

We'll be using Ollama to load the local model in memory. After creating the model, we can invoke it with a question to get the response back.

<img src='images/model.png' width="1000">

In [11]:
from langchain_ollama import ChatOllama

model = ChatOllama(model=MODEL, temperature=0)
model.invoke("Who is the president of the United States?")

AIMessage(content="I'm not aware of the current President of the United States, as my knowledge cutoff is December 2023. However, I can suggest some ways for you to find out who the current President is:\n\n1. Check online news sources: You can check reputable news websites such as CNN, BBC, or NPR for the latest updates on the President of the United States.\n2. Visit the official White House website: The official White House website (whitehouse.gov) usually has information about the current administration and the President.\n3. Look up government websites: You can also check the official website of the U.S. Government (usa.gov) or the Federal Register for information on the current President.\n\nPlease note that my knowledge may not be up-to-date, and I recommend verifying the information through multiple sources to ensure accuracy.", additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2024-10-05T20:49:18.544040819Z', 'message': {'role': 'assistant', 'content

## Parsing the model's response

The response from the model is an `AIMessage` instance containing the answer. We can extract the text answer by using the appropriate output parser. We can connect the model and the parser using a chain.
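
The `|` operator may look like magic, so here is a toy illustration of the idea in plain Python. The `Step` class is hypothetical and is not how LangChain implements its components; it only shows that `a | b` can build a new object that runs `a` and feeds the result to `b`.

```python
class Step:
    """Toy stand-in for a chainable component; not LangChain's actual class."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that runs a, then passes its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# A fake "model" that returns a message-like dict, and a fake "parser"
# that pulls the text out of it.
fake_model = Step(lambda question: {"content": f"Answer to: {question}", "metadata": {}})
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_model | fake_parser
result = toy_chain.invoke("Who is Susan's sister?")
print(result)
```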

<img src='images/parser.png' width="1000">


In [14]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

chain = model | parser 
print(chain.invoke("Who is elon musk?"))

Elon Musk is a South African-born entrepreneur, inventor, and business magnate. He is one of the most successful and influential figures in the tech industry today.

Early Life and Education:

Musk was born on June 28, 1971, in Pretoria, South Africa. He developed an interest in computing and programming at an early age and taught himself computer programming. He moved to Canada in 1992 to attend college, and later transferred to the University of Pennsylvania, where he graduated with a degree in economics and physics.

Career:

Musk's career can be divided into several stages:

1. Early years: Musk worked as a software engineer at various companies, including Pinnacle Research and X.com (which later became PayPal).
2. PayPal: In 2000, Musk co-founded PayPal, an online payment system that was acquired by eBay for $1.5 billion in 2002.
3. SpaceX: In 2002, Musk founded SpaceX, a private aerospace manufacturer and space transport services company with the goal of reducing space transporta

## Setting up a prompt

In addition to the question we want to ask, we also want to provide the model with the context from the PDF file. We can use a prompt template to define and reuse the prompt we'll use with the model.
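
At its simplest, a prompt template is a string with named placeholders that get filled in at call time. The sketch below uses plain `str.format` to show the idea; `toy_template` and `fill_prompt` are illustrative names, and a real `PromptTemplate` adds conveniences such as working as a step in a chain.

```python
toy_template = "Context: {context}\n\nQuestion: {question}"


def fill_prompt(template: str, **variables: str) -> str:
    """Substitute the named placeholders in the template with the given values."""
    return template.format(**variables)


filled = fill_prompt(
    toy_template,
    context="Anna's sister is Susan",
    question="Who is Susan's sister?",
)
print(filled)
```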


<img src='images/prompt.png' width="1000">

In [15]:
from langchain_core.prompts import PromptTemplate

template = """
You are an assistant that provides answers to questions based on
a given context. 

Answer the question based on the context. If you can't answer the
question, reply "I don't know".

Be as concise as possible and go straight to the point.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
print(prompt.format(context="Here is some context", question="Here is a question"))


You are an assistant that provides answers to questions based on
a given context. 

Answer the question based on the context. If you can't answer the
question, reply "I don't know".

Be as concise as possible and go straight to the point.

Context: Here is some context

Question: Here is a question



## Adding the prompt to the chain

We can now chain the prompt with the model and the parser.

<img src='images/chain1.png' width="1000">

In [16]:
chain = prompt | model | parser

chain.invoke({
    "context": "Anna's sister is Susan", 
    "question": "Who is Susan's sister?"
})


'Anna.'

## Adding the retriever to the chain

Finally, we can connect the retriever to the chain to get the context from the vector store.
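
The first step of this chain is a dictionary whose values each receive the same input and together produce the variables the prompt expects. The plain-Python sketch below mimics that step. It assumes a `fake_retriever` that returns fixed strings, and it joins the retrieved texts into one context string, which is a common variation and not something the notebook's chain does.

```python
from operator import itemgetter


def fake_retriever(question: str) -> list[str]:
    """Hypothetical retriever that returns fixed chunks regardless of the question."""
    return ["Backend : Node.js, Express, Flask", "Database : MongoDB, PostgreSQL"]


def build_prompt_inputs(payload: dict) -> dict:
    """Turn {'question': ...} into the {'context': ..., 'question': ...} the prompt needs."""
    question = itemgetter("question")(payload)
    return {
        # Join the retrieved chunks so the prompt receives readable text.
        "context": "\n\n".join(fake_retriever(question)),
        "question": question,
    }


inputs = build_prompt_inputs({"question": "Which databases are listed?"})
print(inputs["context"])
```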

<img src='images/chain2.png' width="1000">

In [17]:
from operator import itemgetter

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)

## Using the chain to answer questions

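With the full chain in place, we can ask questions that will be answered using the PDF document. Note that the sample questions below are about startups, a topic this résumé does not cover, so "I don't know" is the expected reply. The first answer shows the model stretching the retrieved context instead.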

In [18]:
questions = [
    "What can you get away with when you only have a small number of users?",
    "What's the most common unscalable thing founders have to do at the start?",
    "What's one of the biggest things inexperienced founders and investors get wrong about startups?",
]

for question in questions:
    print(f"Question: {question}")
    print(f"Answer: {chain.invoke({'question': question})}")
    print("*************************\n")

Question: What can you get away with when you only have a small number of users?
Answer: With a small number of users, you can often get away with less robust or simplified features and infrastructure. In the context of Manish Madan's projects, this might mean:

* Using a simpler database schema (as seen in the WIN Research Centre project)
* Reducing the number of API endpoints (as seen in the WIN Research Centre project)
* Streamlining reporting processes to reduce manual data entry time
* Using less complex frontend frameworks or libraries (although Manish Madan's projects show he uses more advanced ones like React.js and Next.js)
*************************

Question: What's the most common unscalable thing founders have to do at the start?
Answer: I don't know.
*************************

Question: What's one of the biggest things inexperienced founders and investors get wrong about startups?
Answer: I don't know.
*************************

