### Importing Required Libraries

In [1]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_cohere.embeddings import CohereEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_groq import ChatGroq
from langchain.chains import RetrievalQA
from dotenv import load_dotenv
import os

In [2]:
# LOADING ENVIRONMENT VARIABLES
load_dotenv()

True
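`load_dotenv()` returning `True` only says that a `.env` file was found and read, not that it holds the keys the later cells need. The sketch below checks for them up front. The variable names `COHERE_API_KEY` and `GROQ_API_KEY` are assumptions about what the Cohere and Groq integrations read by default, and `missing_keys` is a hypothetical helper, not part of any library.

```python
import os

# Assumed variable names for the Cohere and Groq integrations.
REQUIRED_KEYS = ("COHERE_API_KEY", "GROQ_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required variable names that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_keys(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys found.")
```

Running this right after `load_dotenv()` surfaces a missing key before the embedding or LLM cells fail with a less obvious error.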

In [3]:
# SWITCHING TO PROJECT'S ROOT DIRECTORY IN ORDER TO DOWNLOAD AND READ THE SAMPLE DOCUMENTS
os.chdir("../")

In [4]:
# DOWNLOADING SAMPLE DOCUMENTS
!python utils/download_sample_documents.py

Downloading Kidney-Stones-Patient-Guide.pdf to sample_documents/...
sample_documents/Kidney-Stones-Patient-Guide.pdf already exists. Skipping download.

Downloading budget_speech.pdf to sample_documents/...
sample_documents/budget_speech.pdf already exists. Skipping download.



### Loading and Extracting PDF Content

In [5]:
pdf_loader = PyPDFLoader("sample_documents/Kidney-Stones-Patient-Guide.pdf")
pages = pdf_loader.load()

In [6]:
pages[0].page_content

'KIDNEY STONES\nKidney Stones Patient Guide'

### Creating Chunks from Page Content

In [7]:
# JOIN THE TEXT OF ALL PAGES OF THE MULTI-PAGE PDF INTO A SINGLE STRING BEFORE SPLITTING.
# `PyPDFLoader` RETURNS ONE DOCUMENT PER PAGE, AND A TEXT SPLITTER SPLITS EACH DOCUMENT SEPARATELY,
# SO SPLITTING `pages` DIRECTLY WOULD NEVER COMBINE THE END OF ONE PAGE WITH THE START OF THE NEXT.
# JOINING FIRST LETS CHUNKS SPAN ADJACENT PAGES AND RETAIN CONTEXT.
full_text = "\n".join([page.page_content for page in pages])
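The effect of joining the pages first can be seen on a toy example. The sketch below uses a deliberately naive fixed-width splitter (`fixed_windows`, a hypothetical helper, not the algorithm `RecursiveCharacterTextSplitter` uses) and two made-up "pages" whose sentence runs across the page break.

```python
def fixed_windows(text, size):
    """Cut text into consecutive pieces of at most `size` characters (no overlap)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Two hypothetical pages; the sentence continues across the page break.
pages_text = ["Drink plenty of", "water every day."]

# Page-wise: each page is split on its own, so no piece crosses the boundary.
page_wise = [piece for page in pages_text for piece in fixed_windows(page, 20)]

# Joined: the pages are merged first, so a piece can span the boundary.
joined = fixed_windows("\n".join(pages_text), 20)

print(page_wise)
print(joined)
```

In the page-wise result every piece stops at the end of its page, while the first piece of the joined result contains text from both pages.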

In [8]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=400)
chunks = text_splitter.create_documents([full_text])
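To make `chunk_size=2000` and `chunk_overlap=400` concrete, here is a minimal sliding-window sketch. It is a simplified stand-in: the real splitter also tries to break on separators such as paragraphs and sentences, so its chunks are usually shorter than the maximum. `sliding_chunks` and the synthetic sample text are assumptions made for illustration only.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into windows of `chunk_size` that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    pieces = []
    start = 0
    while start < len(text):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return pieces


# Synthetic 5000-character text, used only to show the arithmetic.
sample = "".join(str(i % 10) for i in range(5000))
demo_chunks = sliding_chunks(sample, chunk_size=2000, chunk_overlap=400)

print([len(c) for c in demo_chunks])
# The tail of one chunk is repeated at the head of the next.
print(demo_chunks[0][-400:] == demo_chunks[1][:400])
```

With these settings each new chunk starts 1600 characters after the previous one, so a sentence that falls on a chunk boundary still appears whole in at least one chunk as long as it is shorter than the overlap.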

### Initializing Embedding Model

In [9]:
embeddings = CohereEmbeddings(model="embed-english-light-v3.0")

### Storing Embeddings to Vector DB (FAISS)

In [10]:
db = FAISS.from_documents(chunks, embeddings)
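Conceptually, the vector store keeps one embedding per chunk and answers a query by returning the chunks whose embeddings lie closest to the query's embedding. The sketch below shows that idea with a brute-force search over toy vectors, assuming Euclidean distance as the metric. It is an illustration only, not how FAISS is implemented, and the three-dimensional vectors and texts are made up.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, stored, k=2):
    """Return the k (text, distance) pairs closest to `query_vec`, nearest first."""
    scored = [(text, l2_distance(query_vec, vec)) for text, vec in stored]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical 3-dimensional "embeddings"; real embeddings have far more dimensions.
toy_store = [
    ("uric acid stones", [0.9, 0.1, 0.0]),
    ("calcium stones", [0.7, 0.3, 0.1]),
    ("budget speech", [0.0, 0.1, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], toy_store, k=2)
print([text for text, _ in results])
```

The unrelated "budget speech" entry is far from the query vector and is left out of the top two results.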

### Initializing LLM

In [11]:
llm = ChatGroq(model="mistral-saba-24b", temperature=0)

### Initializing QA Chain

In [12]:
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=db.as_retriever(),
    chain_type_kwargs={"verbose": True},
)
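With the default "stuff" chain type, the retrieved chunks are pasted into a single prompt ahead of the question. The sketch below mimics that assembly step in plain Python. The template wording is modeled on the system prompt visible in the chain's verbose output; `build_stuff_prompt` is a hypothetical helper, and joining chunks with a blank line is an assumption about the default separator.

```python
# Wording modeled on the system prompt printed in the chain's verbose output.
SYSTEM_TEMPLATE = (
    "Use the following pieces of context to answer the user's question.\n"
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n"
    "----------------\n"
    "{context}"
)


def build_stuff_prompt(question, retrieved_texts):
    """Place all retrieved chunks into one system message, followed by the question."""
    context = "\n\n".join(retrieved_texts)  # assumed separator between chunks
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]


messages = build_stuff_prompt(
    "What are kidney stones made of?",
    ["first retrieved chunk", "second retrieved chunk"],
)
for role, content in messages:
    print(f"[{role}]\n{content}\n")
```

Because every retrieved chunk goes into the same prompt, the chunk size and the number of retrieved chunks together determine how much of the model's context window each question consumes.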

### Asking Questions to the Document

In [13]:
query = "What type of stone is formed due to high volume of uric acid in urine?"
answer = qa_chain.invoke(query)
print(answer["result"])

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
•  Passing urine more often or a burning feeling when you 
pass urine .
•  Urine that is dark or red due to blood . (Sometimes urine 
has only small amounts of red blood cells that can’t be 
seen with the naked eye .)
• Nausea and vomiting .
• A feeling of pain at the tip of the penis in men .
What are Kidney Stones Made of? 
Kidney stones come in many types and colors . The way your 
kidney stones will be treated depends on the type of stone 
you have . The path to prevent new stones from forming will 
also depend on your stone type . 
Calcium stones (80% of stones)
Calcium stones are the most common type . There are two 
types of calcium stones: calcium oxalate and calc

In [14]:
query = "How can we prevent forming them?"
answer = qa_chain.invoke(query)
print(answer["result"])

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
health care provider will perform tests to find out what is causing 
your stones . After finding out why you get stones, your health 
care provider may give you tips to help stop them from coming 
back . Some of the tests he or she may do are listed below .
Medical and dietary history
Your health care provider will ask questions about your 
personal and family medical history . He or she may ask:
•  Have you had more than one kidney stone before?
•  Has anyone in your family had stones?
•  Do you have a medical condition that may increase your 
chance of having stones, like frequent diarrhea, gout or 
diabetes?
Knowing your eating habits is also helpful . You may be 
eati

In [15]:
query = "List down all types of kidney stones mentioned in the document."
answer = qa_chain.invoke(query)
print(answer["result"])

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
stone at some point in their life . In 1994, that number rose 
to about 5 in every 100 people . At this time, about 1 in 10 
Americans will have a kidney stone during his or her lifetime . 
Children getting kidney stones has also become more 
common in recent years .
Race, gender and ethnicity play a part in who may get kidney 
stones . Whites are more likely to get kidney stones than 
African-Americans or other races . Men get kidney stones 
more often than women . Still, the number of women getting 
kidney stones is rising .
Kidney stones are often very painful and can keep happening 
in some people . Kidney stone attacks lead to over 2 million 
visits to the doctor and