In [None]:
import os 
os.environ

# A Gentle Introduction to RAG Applications

This notebook creates a simple RAG (Retrieval-Augmented Generation) system to answer questions from a PDF document using an open-source model.

In [1]:
from transformers import AutoModel

# Get the default cache directory
from transformers import TRANSFORMERS_CACHE

print(TRANSFORMERS_CACHE)

  from .autonotebook import tqdm as notebook_tqdm


C:\Users\MSI\.cache\huggingface\hub


In [2]:
import torch

# Check default directory where models are saved
print(torch.hub.get_dir())

C:\Users\MSI/.cache\torch\hub


In [3]:
import os 
print(os.path.expanduser('~/.cache/huggingface/transformers/'))


C:\Users\MSI/.cache/huggingface/transformers/


In [4]:
import os

# Get the path of the Hugging Face cache directory
hf_cache_dir = os.path.expanduser('~/.cache/huggingface/transformers/')

# List the contents of the cache directory
for root, dirs, files in os.walk(hf_cache_dir):
    print("Downloaded models:", dirs)
    break

In [5]:
# change dir 
os.environ['TRANSFORMERS_CACHE'] = '/new/path/to/store/models' #HF 
torch.hub.set_dir('/new/path/to/store/models') #pytorch

In [3]:
PDF_FILE = "FAQ_GDG.pdf"

# We'll be using Llama 3.1 8B for this example.
MODEL = "llama3.1"

## Loading the PDF document

Let's start by loading the PDF document and breaking it down into separate pages.

<img src='images/documents1.png' width="1000">

In [4]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader(PDF_FILE)
pages = loader.load()

print(f"Number of pages: {len(pages)}")
print(f"Length of a page: {len(pages[1].page_content)}")
print("Content of a page:", pages[1].page_content)

Number of pages: 12
Length of a page: 2000
Content of a page: W h y
w a s
m y
a c c o u n t
t u r n e d
t o
p r i v a t e ?
If
we
r easonably
belie v e
content
in
y our
pr oﬁle
violates
our
content
policy
,
y our
account
will
be
switched
t o
priv ate
and
the
content
in
y our
pr oﬁle
will
be
deleted.
Y ou
won 't
be
able
t o
mak e
y our
account
public
again
for
at
least
60
da ys.
Google
also
r eser v es
the
right
t o
suspend
or
terminate
y our
access
t o
the
ser vices
or
delete
y our
Google
Account,
as
described
in
the
T aking
action
in
case
of
pr oblems
section
of
the
Google
T erms
of
Ser vice.
W h a t
h a p p e n s
w h e n
I
i n t e g r a t e
m y
p r o ﬁ l e
w i t h
a
t h i r d - p a r t y
a p p
o r
s e r v i c e ?
If
y ou
authoriz e
an
application
t o
access
y our
Google
De v eloper
Pr ogr am
pr oﬁle,
that
application
will
be
able
t o
see
y our
pr oﬁle
information,
e v en
if
y ou
ha v e
not
made
y our
pr oﬁle
public.
Learn
mor e
about
how
t o
manage
thir d-par ty
apps
and
ser vices
wi

## Splitting the pages in chunks

Pages are too long, so let's split pages into different chunks.

<img src='images/splitter1.png' width="1000">


In [5]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=100)

chunks = splitter.split_documents(pages)
print(f"Number of chunks: {len(chunks)}")
print(f"Length of a chunk: {len(chunks[1].page_content)}")
print("Content of a chunk:", chunks[1].page_content)


Number of chunks: 33
Length of a chunk: 693
Content of a chunk: b y
going
t o
de v elopers.google.com/pr oﬁle/u/me
.
H o w
d o
I
e d i t
m y
p r o ﬁ l e ?
Y ou
can
edit
y our
Google
De v eloper
Pr ogr am
pr oﬁle
b y
going
t o
de v elopers.google.com/pr oﬁle/u/me
.
W h a t
h a p p e n s
i f
I
m a k e
m y
p r o ﬁ l e
p u b l i c ?
Making
y our
pr oﬁle
public
mak es
it
viewable
b y
any one
online.
This
includes
y our
name,
image,
r ole,
company
or
school,
bio,
badges
y ou'v e
r eceiv ed,
stats,
and
y our
social
media
links
(including
GitHub,
GitLab,
X,
Link edIn,
and
Stack
Ov erﬂow).
Y our
pages
sa v ed,
pages
r ated,
and
e v ents
attended
ar e
not
par t
of
y our
public
pr oﬁle.
Y ou
can
change
y our
pr oﬁle
priv acy
settings
under
the
Account
tab
at


## Storing the chunks in a vector store

We can now generate embeddings for every chunk and store them in a vector store.

<img src='images/vectorstore1.png' width="1000">


In [6]:
!pip install -qU langchain-community faiss-cpu

In [7]:
import faiss
print(faiss.__version__)

1.9.0


In [8]:
pip install -qU langchain-huggingface

Note: you may need to restart the kernel to use updated packages.


In [9]:
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings

embed_model = "nomic-embed-text"
embeddings = OllamaEmbeddings(model=embed_model)
vectorstore = FAISS.from_documents(chunks, embeddings)

## Setting up a retriever

We can use a retriever to find chunks in the vector store that are similar to a supplied question.

<img src='images/retriever1.png' width="1000">



In [10]:
retriever = vectorstore.as_retriever()
retriever.invoke("what is GDSC ?")

[Document(metadata={'source': 'FAQ_GDG.pdf', 'page': 9}, page_content='GDSC\nFAQ\nThe\npurpose\nof\nthis\ndocument\nis\nto\ncapture\nfrequently\nasked\nquestions\nabout\nthe\nGDSC\nprogram.\nJoin\nGDSC\nWho\nshould\njoin\nGoogle\nDeveloper\nStudent\nClubs?\nCollege\nand\nuniv ersity\nstudents\nar e\nencour aged\nt o\njoin\nGoogle\nDe v eloper\nStudent\nClubs.\nCan\nI\njoin\nmultiple\nchapters?\nY ou\ncan\npar ticipate\nin\ne v ents\nor ganiz ed\nb y\nmultiple\nchapters,\nhowe v er\nif\ny ou\ndecide\nt o\ndedicate\ny ourself\nt o\nbecome\na\nGDSC\nLead\nor\nCor e\nT eam\nMember ,\ny ou\nwill\nbe\noﬃcially\nassigned\nt o\none\nchapter .\nWhat\ndoes\na\nGDSC\nlead\ndo?\nIn\ngener al,\nGDSC\nleaders\nar e\nfocused\non\nthe\nfollowing\nar eas:\n●\nStar t\na\nclub\n-\nW ork\nwith\ny our\nuniv ersity\nor\ncollege\nt o\nstar t\na\nstudent\nclub.\nSelect\na\ncor e\nteam\nand'),
 Document(metadata={'source': 'FAQ_GDG.pdf', 'page': 11}, page_content='GDSC\nLeads\nshould\nbe\na v ailable\nt o\nrun

## Configuring the model

We'll be using Ollama to load the local model in memory. After creating the model, we can invoke it with a question to get the response back.

<img src='images/model.png' width="1000">

In [11]:
import langchain_community
print(dir(langchain_community))

['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', 'docstore', 'document_loaders', 'embeddings', 'utils', 'vectorstores']


In [12]:
from langchain_ollama import ChatOllama

model = ChatOllama(model=MODEL, temperature=0)
model.invoke("Who is the president of the United States?")

AIMessage(content='As of my last update in April 2023, Joe Biden is the President of the United States. He took office on January 20, 2021, succeeding Donald Trump as the 46th President of the United States. Please note that this information might change over time due to elections or other political developments.', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2024-11-26T09:29:02.474455Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 45513527900, 'load_duration': 19073096000, 'prompt_eval_count': 19, 'prompt_eval_duration': 3279129000, 'eval_count': 65, 'eval_duration': 23143425000}, id='run-52f6821a-4be6-41ca-ae08-4127063e71fa-0', usage_metadata={'input_tokens': 19, 'output_tokens': 65, 'total_tokens': 84})

In [25]:
%pip install -qU langchain-groq


Note: you may need to restart the kernel to use updated packages.


In [13]:
# Initialize the model
from langchain_groq import ChatGroq

model = ChatGroq(
    temperature=0.4,
    model= "llama-3.1-70b-versatile", #"llama3-70b-8192",
    api_key="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",
    verbose= True,
    max_retries=3,

)

## Parsing the model's response

The response from the model is an `AIMessage` instance containing the answer. We can extract the text answer by using the appropriate output parser. We can connect the model and the parser using a chain.

<img src='images/parser.png' width="1000">


In [14]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

chain = model | parser 
print(chain.invoke("Who is the president of the United States?"))

As of my knowledge cutoff in 2023, the President of the United States is Joe Biden. However, please note that my information may not be up to date. For the most recent information, I recommend checking a reliable news source.


## Setting up a prompt

In addition to the question we want to ask, we also want to provide the model with the context from the PDF file. We can use a prompt template to define and reuse the prompt we'll use with the model.


<img src='images/prompt.png' width="1000">

In [15]:
from langchain.prompts import PromptTemplate

template = """
You are an assistant that provides answers to questions based on
a given context. 

Answer the question based on the context. If you can't answer the
question, reply "I don't know".

Be as concise as possible and go straight to the point.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
print(prompt.format(context="Here is some context", question="Here is a question"))


You are an assistant that provides answers to questions based on
a given context. 

Answer the question based on the context. If you can't answer the
question, reply "I don't know".

Be as concise as possible and go straight to the point.

Context: Here is some context

Question: Here is a question



## Adding the prompt to the chain

We can now chain the prompt with the model and the parser.

<img src='images/chain11.png' width="1000">

In [16]:
chain = prompt | model | parser

chain.invoke({
    "context": "Anna's sister is Susan", 
    "question": "Who is Susan's sister?"
})


'Anna.'

## Adding the retriever to the chain

Finally, we can connect the retriever to the chain to get the context from the vector store.

<img src='images/chain22.png' width="1000">

In [17]:
from operator import itemgetter

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)

## Using the chain to answer questions

Finally, we can use the chain to ask questions that will be answered using the PDF document.

In [18]:
questions = [
    "What is GDG ?",
    "What is GDSC ?",
    "What is GDE?",
]

for question in questions:
    print(f"Question: {question}")
    print(f"Answer: {chain.invoke({'question': question})}")
    print("*************************\n")

Question: What is GDG ?
Answer: GDG stands for Google Developer Groups.
*************************

Question: What is GDSC ?
Answer: The context doesn't explicitly define what GDSC is, but based on the content, it appears to be "Google Developer Student Clubs".
*************************

Question: What is GDE?
Answer: GDE stands for Google Developer Experts.
*************************



In [27]:
q= "how many members in GDG carthage ? "

chain.invoke({'question': q})

"I don't know"

In [19]:
q= "how can i join GDG and be a GDE "
chain.invoke({'question': q})

'To join GDG: \n1. Visit the members site at https://gdg.community.dev/. \n2. If a nearby chapter doesn’t exist, you can apply to create a new GDG chapter in your city.\n\nTo become a GDE (Google Developer Expert), you need to meet certain requirements (mentioned in the context), but the application process is not explicitly mentioned in the context.'

In [25]:
q= "do i need to pay any fees to be a member in GDG and attend workshops ? "

chain.invoke({'question': q})

'No, there is no cost to join a chapter or attend events.'