# A Gentle Introduction to RAG Applications

This notebook creates a simple RAG (Retrieval-Augmented Generation) system to answer questions from a PDF document using an open-source model.

In [2]:
PDF_FILE = "LexMachina.pdf"

# We'll be using Llama 3.2, served locally through Ollama, for this example.
MODEL = "llama3.2:latest"

## Loading the PDF document

Let's start by loading the PDF document and breaking it down into separate pages.

<img src='images/documents.png' width="1000">

In [3]:
! pip install langchain_community 



In [4]:
! pip install pypdf



In [5]:

from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader(PDF_FILE)
pages = loader.load()

print(f"Number of pages: {len(pages)}")
print(f"Length of a page: {len(pages[1].page_content)}")
print("Content of a page:", pages[1].page_content)

Number of pages: 4
Length of a page: 923
Content of a page: 3. Technical Details
Architecture&Workflow:Thearchitectureof LexMachinacomprisesseveral components:
● Frontend:DevelopedentirelywithFlutterallowingseamlessdeployment acrossmobile(Android, iOS), web, anddesktop.● Backend:Node.jswithExpresshandleshighthroughput andintegratesLangChainfornatural languageprocessingandRAG. Queriesareprocessedthroughthebackend,whichinteractswithlocallyhostedLLaMAmodels. K3sorchestratesbackenddeployment onprivatecloudoron-premises.● AI Model Integration:Meta’sLlama3.2(11Band90BVisionInstruct)forlegalquestions, document interpretation, andPDFextraction.● Database&CloudStorage:Self-hostedPostgreSQLforreal-timedatabase, MinIOforstorage, andOAuth2forauthentication, ensuringprivacy.
DataSources:
● PubliclyAvailableLegal Texts: Laws, regulations, government notifications, andlegalsummariesfromofficial sources.● UserContributions: Suggestionsfromcommunityusersforupdatedlegalinterpretations.



## Splitting the pages in chunks

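Whole pages can be too long to embed and retrieve precisely, so let's split them into smaller chunks. With `chunk_size=1500` and pages as short as the ones in this PDF, each page ends up as a single chunk.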

<img src='images/splitter.png' width="1000">


In [6]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=100)

chunks = splitter.split_documents(pages)
print(f"Number of chunks: {len(chunks)}")
print(f"Length of a chunk: {len(chunks[1].page_content)}")
print("Content of a chunk:", chunks[1].page_content)


Number of chunks: 4
Length of a chunk: 922
Content of a chunk: 3. Technical Details
Architecture&Workflow:Thearchitectureof LexMachinacomprisesseveral components:
● Frontend:DevelopedentirelywithFlutterallowingseamlessdeployment acrossmobile(Android, iOS), web, anddesktop.● Backend:Node.jswithExpresshandleshighthroughput andintegratesLangChainfornatural languageprocessingandRAG. Queriesareprocessedthroughthebackend,whichinteractswithlocallyhostedLLaMAmodels. K3sorchestratesbackenddeployment onprivatecloudoron-premises.● AI Model Integration:Meta’sLlama3.2(11Band90BVisionInstruct)forlegalquestions, document interpretation, andPDFextraction.● Database&CloudStorage:Self-hostedPostgreSQLforreal-timedatabase, MinIOforstorage, andOAuth2forauthentication, ensuringprivacy.
DataSources:
● PubliclyAvailableLegal Texts: Laws, regulations, government notifications, andlegalsummariesfromofficial sources.● UserContributions: Suggestionsfromcommunityusersforupdatedlegalinterpretations.
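
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size splitting with overlap in plain Python. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this sketch does not do. The function name `split_with_overlap` is hypothetical and not part of LangChain.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


demo_chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(demo_chunks)
```

Each chunk starts with the last character of the previous one, which is what the overlap buys: text cut at a chunk boundary still appears with some of its surrounding context.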


## Storing the chunks in a vector store

We can now generate embeddings for every chunk and store them in a vector store.

<img src='images/vectorstore.png' width="1000">


In [7]:
! pip install -U langchain-ollama



In [19]:
! pip install faiss-gpu

ERROR: Could not find a version that satisfies the requirement faiss-gpu (from versions: none)
ERROR: No matching distribution found for faiss-gpu


No `faiss-gpu` build is available for this environment, so we install the CPU build instead.

In [10]:
! pip install faiss-cpu



In [9]:
from langchain_community.vectorstores import FAISS
from langchain_ollama import OllamaEmbeddings

# Embed every chunk with the Ollama model and index the vectors with FAISS.
embeddings = OllamaEmbeddings(model=MODEL)
vectorstore = FAISS.from_documents(chunks, embeddings)
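
To illustrate what "generate embeddings and store them" means, here is a toy sketch in plain Python. It assumes a bag-of-words count vector as the embedding, which is far cruder than what `OllamaEmbeddings` produces, and it uses a plain list as the "vector store" instead of a FAISS index. The names `embed`, `cosine_similarity`, and `store` are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real embedding model returns a dense float vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# The "vector store" in this sketch is just a list of (vector, text) pairs.
texts = [
    "flutter frontend for mobile and web",
    "postgresql database and minio storage",
]
store = [(embed(text), text) for text in texts]

query = embed("which database is used")
for vector, text in store:
    print(f"{cosine_similarity(query, vector):.2f}  {text}")
```

Texts that share more words with the query score higher. A learned embedding model plays the same role but can also match texts that use different words for the same idea.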



## Setting up a retriever

We can use a retriever to find chunks in the vector store that are similar to a supplied question.

<img src='images/retriever.png' width="1000">



In [20]:
retriever = vectorstore.as_retriever()
retriever.invoke(" Technical Details of LexMachina ")

[Document(metadata={'source': 'LexMachina.pdf', 'page': 3}, page_content="5. Impact&Potential\nTargetAudience:\n● Citizensseekingeasyaccesstolegal information.● Legal Professionalslookingforaquickreferencetool.● Students&ResearchersstudyingIndianlaw.\nImpact&Benefits:\n● Social Impact: Empowercitizenswithknowledgeof theirrights, reducingrelianceonlegal intermediaries.● EconomicImpact: Lowerlegal costsbyprovidingaccessiblelegal information.\nScalability&FuturePlans:\n● FutureFeatures: Integrateacommunityreviewsystemforlegal updates. Developvoice-basedsearchanddocument analysis.● Global Expansion: Expandtoincludelegal systemsof othercountries.\n6. ConclusionLexMachinaismorethananAI tool, it'sabeaconofhopeforthoseoverwhelmedbytheIndianlegal system. Bysimplifyingcomplexlawsandbreakingdownlanguagebarriers, weempowercitizenstounderstandandassert theirrights. LexMachinabridgesthegapbetweenpeopleandjustice, turningconfusionintoclarityandhelplessnessintoempowerment. Webelievethat whenknowledgei
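
Conceptually, a retriever ranks the stored vectors by similarity to the query vector and returns the top `k` entries. The sketch below shows that ranking in plain Python with hand-written three-dimensional vectors. The vectors, the texts, and the `retrieve` function are made up for illustration and say nothing about how FAISS searches internally. In LangChain, the number of results can be set with `vectorstore.as_retriever(search_kwargs={"k": 2})`; as far as we know, the default is four results, which for this PDF means every chunk is returned.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: list[float], store: list[tuple[list[float], str]], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to query_vec, best first."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]


# Hypothetical, hand-written vectors standing in for real embeddings.
toy_store = [
    ([1.0, 0.0, 0.0], "Technical details"),
    ([0.0, 1.0, 0.0], "Impact and potential"),
    ([0.9, 0.1, 0.0], "Architecture and workflow"),
]

top = retrieve([1.0, 0.0, 0.0], toy_store, k=2)
print(top)
```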

## Configuring the model

We'll use Ollama to load the local model into memory. Once the model is created, we can invoke it with a question and get a response back.

<img src='images/model.png' width="1000">

In [21]:
from langchain_ollama import ChatOllama

model = ChatOllama(model=MODEL, temperature=0)
model.invoke("What's LexMachina?")

AIMessage(content="Lex Machina is a company that provides data and analytics for intellectual property (IP) litigation. They offer a platform that allows users to analyze and compare the performance of attorneys, law firms, and judges in patent cases.\n\nThe platform uses data from publicly available sources, such as court filings and transcripts, to provide insights on various aspects of IP litigation, including:\n\n1. Attorney performance: Lex Machina analyzes the success rates, win-loss records, and other metrics for individual attorneys and law firms.\n2. Judge performance: The platform evaluates judges' decisions and outcomes in patent cases, providing information on their track record and consistency.\n3. Case strategy: Lex Machina offers data-driven insights on case strategies, including the use of expert witnesses, discovery tactics, and settlement negotiations.\n\nBy analyzing this data, users can gain a better understanding of the IP litigation landscape, identify trends and 

In [22]:
model.invoke(" What's the technical details of LexMachina? ")

AIMessage(content='I couldn\'t find any information on "LexMachina." It\'s possible that it\'s a fictional or non-existent entity, or it may be a term that is not widely known or used.\n\nIf you could provide more context or clarify what you mean by "LexMachina," I\'d be happy to try and help you further.', additional_kwargs={}, response_metadata={'model': 'llama3.2:latest', 'created_at': '2024-10-30T06:25:27.4963824Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 3260488700, 'load_duration': 55021800, 'prompt_eval_count': 37, 'prompt_eval_duration': 436804000, 'eval_count': 70, 'eval_duration': 2767226000}, id='run-3f169f12-ea65-4b70-b7bd-461d81832c35-0', usage_metadata={'input_tokens': 37, 'output_tokens': 70, 'total_tokens': 107})

## Parsing the model's response

The response from the model is an `AIMessage` instance containing the answer. We can extract the text answer by using the appropriate output parser. We can connect the model and the parser using a chain.

<img src='images/parser.png' width="1000">


In [24]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

chain = model | parser 
print(chain.invoke("What's the technical details of LexMachina ?"))

I couldn't find any information on "LexMachina." It's possible that it's a fictional or non-existent entity, or it may be a term that is not widely known or used.

If you could provide more context or clarify what you mean by "LexMachina," I'll do my best to help.
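
The `|` operator may look like magic, so here is a rough sketch of the idea in plain Python: each step wraps a function, and `a | b` builds a new step that feeds the output of `a` into `b`. The `Step` and `Message` classes below are hypothetical stand-ins written for this example. They are not LangChain's actual classes, and `fake_model` only echoes its input.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Message:
    """Hypothetical stand-in for AIMessage: the answer text plus some metadata."""
    content: str
    metadata: dict = field(default_factory=dict)


class Step:
    """Wraps a function so that steps can be composed with the | operator."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The combined step runs self first, then passes the result to other.
        return Step(lambda value: other.invoke(self.invoke(value)))


# A fake model that returns a Message, and a parser that keeps only the text.
fake_model = Step(lambda question: Message(content=f"Echo: {question}", metadata={"tokens": 3}))
str_parser = Step(lambda message: message.content)

toy_chain = fake_model | str_parser
result = toy_chain.invoke("hello")
print(result)
```

The parser step discards the metadata and returns a plain string, which mirrors what the chain above does with the model's `AIMessage`.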


## Setting up a prompt

In addition to the question we want to ask, we also want to provide the model with the context from the PDF file. We can use a prompt template to define and reuse the prompt we'll use with the model.


<img src='images/prompt.png' width="1000">

In [15]:
from langchain_core.prompts import PromptTemplate

template = """
You are an assistant that provides answers to questions based on
a given context. 

Answer the question based on the context. If you can't answer the
question, reply "I don't know".

Be as concise as possible and go straight to the point.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
print(prompt.format(context="Here is some context", question="Here is a question"))


You are an assistant that provides answers to questions based on
a given context. 

Answer the question based on the context. If you can't answer the
question, reply "I don't know".

Be as concise as possible and go straight to the point.

Context: Here is some context

Question: Here is a question



## Adding the prompt to the chain

We can now chain the prompt with the model and the parser.

<img src='images/chain1.png' width="1000">

In [16]:
chain = prompt | model | parser

chain.invoke({
    "context": "Anna's sister is Susan", 
    "question": "Who is Susan's sister?"
})


'Anna.'

## Adding the retriever to the chain

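Next, we can connect the retriever to the chain so that the context comes from the vector store. The dictionary at the head of the chain passes the question to the retriever to build `context` and forwards the same question unchanged as `question`.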

<img src='images/chain2.png' width="1000">

In [26]:
from operator import itemgetter

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)
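
The dictionary at the start of the chain can be read as "call every value with the same input and collect the results by key". Here is a minimal sketch of that behavior in plain Python. `fake_retriever`, `format_chunks`, and `run_mapping` are hypothetical helpers written for this example. Joining the chunks into one string is an optional extra step that the chain above does not perform; there, the retrieved documents are inserted into the prompt as they are.

```python
from operator import itemgetter


def fake_retriever(question: str) -> list[str]:
    """Stand-in for the real retriever: returns canned chunks for any question."""
    return [f"First chunk about: {question}", f"Second chunk about: {question}"]


def format_chunks(chunks: list[str]) -> str:
    """Join retrieved chunks into one context string separated by blank lines."""
    return "\n\n".join(chunks)


def run_mapping(mapping: dict, inputs: dict) -> dict:
    """Call every value in mapping with the same inputs and collect results by key."""
    return {key: func(inputs) for key, func in mapping.items()}


mapping = {
    # Pull out the question, retrieve chunks for it, then format them as text.
    "context": lambda inputs: format_chunks(fake_retriever(itemgetter("question")(inputs))),
    # Forward the question unchanged.
    "question": itemgetter("question"),
}

prompt_inputs = run_mapping(mapping, {"question": "What is RAG?"})
print(prompt_inputs["context"])
print(prompt_inputs["question"])
```

The resulting dictionary has exactly the two keys the prompt template expects, `context` and `question`.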

## Using the chain to answer questions

Finally, we can use the chain to ask questions that will be answered using the PDF document.

In [28]:
questions = [
    "What's the technical details of LexMachina ?",
    "What's the problem LexMachina solves ?",
    "Who's US President ?",
]

for question in questions:
    print(f"Question: {question}")
    print(f"Answer: {chain.invoke({'question': question})}")
    print("*************************\n")

Question: What's the technical details of LexMachina ?
Answer: The technical details of LexMachina are:

* Architecture & Workflow:
	+ Frontend: Developed entirely with Flutter for seamless deployment across mobile (Android, iOS), web, and desktop.
	+ Backend: Node.js with Express handles high throughput and integrates LangChain for natural language processing and RAG.
	+ AI Model Integration: Meta's Llama 3.2 (11B) and 90B Vision Instruct for legal questions, document interpretation, and PDF extraction.
* Database & Cloud Storage:
	+ Self-hosted PostgreSQL for real-time database
	+ MinIO for storage
	+ OAuth2 for authentication, ensuring privacy
*************************

Question: What's the problem LexMachina solves ?
Answer: LexMachina solves the problem of complex and intimidating access to Indian legal laws for the average citizen, reducing reliance on legal intermediaries and providing accessible legal information.
*************************

Question: Who's US President ?
Answer