## RAG MODEL FOR A SINGLE BOOK

In [4]:
from langchain_community.document_loaders import PyPDFLoader
# Forward slashes work on every OS; "\M" in a normal string is an invalid escape sequence.
loader = PyPDFLoader("data/Medical_book.pdf")
data = loader.load()  

In [5]:
len(data)

637

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
docs = text_splitter.split_documents(data)
print("Total number of documents: ",len(docs))

Total number of documents:  3424
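
The splitter turned 637 pages into 3424 chunks of at most 1000 characters each. As a rough illustration of what chunking with overlap does, here is a minimal sketch in plain Python. It is a simplification: it cuts at fixed character offsets, whereas `RecursiveCharacterTextSplitter` tries to break on paragraph, line, and word boundaries first. The helper name `split_text` and the overlap of 200 characters are assumptions made for this example.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into windows of at most chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


page = "x" * 2500  # hypothetical page text, not taken from the book
chunks = split_text(page)
print(len(chunks), [len(c) for c in chunks])  # → 3 [1000, 1000, 900]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.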


In [7]:
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from dotenv import load_dotenv
load_dotenv() 
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("hello, world!")
vector[:5]

[0.05168594419956207,
 -0.030764883384108543,
 -0.03062233328819275,
 -0.02802734263241291,
 0.01813093200325966]
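
Each text is mapped to a vector of floats like the one above, and retrieval works by comparing the query vector with the stored chunk vectors. Cosine similarity is one common way to compare them; the sketch below assumes that metric purely for illustration (the metric a vector store actually uses depends on its configuration), and the short vectors are made-up toy values, not real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [0.2, 0.1, 0.9]
similar = [0.25, 0.05, 0.85]
unrelated = [0.9, 0.4, -0.1]

print(cosine_similarity(query, similar) > cosine_similarity(query, unrelated))  # → True
```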

In [8]:
# Reuse the embeddings object created in the previous cell instead of building a second one.
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)

In [9]:
retriever=vectorstore.as_retriever(search_type="similarity",search_kwargs={"k":10})
retrieved_docs=retriever.invoke("how cough forms?")

In [10]:
len(retrieved_docs)

10
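
With `k` set to 10, the retriever returns the ten stored chunks closest to the query. The idea can be sketched without any vector database: score every stored vector against the query and keep the best `k`. The function name `top_k`, the dot-product score, and the toy store below are assumptions for this example, not what Chroma does internally.

```python
import heapq


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the texts of the k stored entries with the highest dot-product score."""
    def score(entry: tuple[str, list[float]]) -> float:
        _, vec = entry
        return sum(q * v for q, v in zip(query_vec, vec))

    best = heapq.nlargest(k, store, key=score)  # sorted from best to worst
    return [text for text, _ in best]


# Toy store of (chunk text, vector) pairs; the vectors are invented.
store = [
    ("chunk about cough", [0.9, 0.1]),
    ("chunk about myopia", [0.1, 0.9]),
    ("chunk about bronchitis", [0.8, 0.3]),
]
print(top_k([1.0, 0.0], store, k=2))
# → ['chunk about cough', 'chunk about bronchitis']
```

A larger `k` gives the language model more context to draw on but also lengthens the prompt.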

In [11]:
print(retrieved_docs[5].page_content)

progress to the life-threatening disease emphysema.
A mild cough, sometimes called smokers’ cough, is
usually the first visible sign of chronic bronchitis.
Coughing brings up phlegm, although the amount varies
considerably from person to person. Wheezing and
shortness of breathmay accompany the cough. Diag-
nostic tests show a decrease in lung function. As the dis-
ease advances, breathing becomes difficult and activity
decreases. The body does not get enough oxygen, lead-
ing to changes in the composition of the blood.
Diagnosis
Initial diagnosis of bronchitis is based on observing
the patient’s symptoms and health history. The physician
will listen to the patient’s chest with a stethoscope for
specific sounds that indicate lung inflammation, such as
moist rales and crackling, and wheezing, that indicates
airway narrowing. Moist rales is a bubbling sound heard
with a stethoscope that is caused by fluid secretion in the
bronchial tubes.


In [12]:
from langchain_google_genai import ChatGoogleGenerativeAI
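# Note: max_tokens=50 caps the length of each reply, so longer answers are cut off mid-sentence.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0.1, max_tokens=50)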

In [13]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)
prompt=ChatPromptTemplate.from_messages(
    [
        ("system",system_prompt),
        ("human","{input}"),
    ]
)
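
The `{context}` placeholder in the system prompt is where the retrieved chunks end up. A "stuff" chain simply concatenates the chunk texts and substitutes them into the template. The plain-Python sketch below mimics that step; the helper names and the blank-line separator are assumptions for illustration, and the template is shortened.

```python
SYSTEM_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
)


def stuff_documents(chunks: list[str], separator: str = "\n\n") -> str:
    """Join all retrieved chunk texts into one context string."""
    return separator.join(chunks)


def build_messages(question: str, chunks: list[str]) -> list[tuple[str, str]]:
    """Fill the system template with the context and pair it with the user question."""
    context = stuff_documents(chunks)
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]


# Hypothetical retrieved chunks, shortened for the example.
messages = build_messages(
    "What is Myopia?",
    ["Myopia is also called nearsightedness.", "It is a refractive error."],
)
for role, content in messages:
    print(f"[{role}] {content}")
```

Because every chunk is pasted into a single prompt, this approach only works while the combined text fits in the model's context window.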


### RESPONSE FROM THE SOURCE 

In [14]:
question_answer_chain=create_stuff_documents_chain(llm,prompt)
rag_chain=create_retrieval_chain(retriever,question_answer_chain)

In [15]:
response=rag_chain.invoke({"input" : "What is Myopia?"})
print(response["answer"],end="\n")

Myopia, or nearsightedness, is a refractive error where the eye focuses light in front of the retina instead of on it. This causes distant objects to appear blurry while close objects remain clear.  It can be corrected with glasses, contact lenses


## RAG MODEL FOR MULTIPLE BOOKS

In [16]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
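# FAISS lives in langchain_community; the old langchain.vectorstores import path is deprecated.
from langchain_community.vectorstores import FAISS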
# Use forward slashes for every path; "\h" in a normal string is an invalid escape sequence.
books = [
    "data/Medical_book.pdf",
    "data/Human_Anatomy.pdf",
    "data/Gray's anatomy for students.pdf",
    "data/Clinically oriented anatomy.pdf",
    "data/harrison’s-principles-of-internal-medicine-21st-edition.pdf"
]
data_books = []
for book in books:
    loader = PyPDFLoader(book)
    data = loader.load()
    data_books.extend(data)
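
Loading several large PDFs takes a while, and one mistyped path makes the loop fail partway through. A small check before loading can report every missing file at once. The sketch below uses only the standard library; `find_missing` is a hypothetical helper, and the temporary directory stands in for the real `data` folder.

```python
import tempfile
from pathlib import Path


def find_missing(paths: list[str]) -> list[str]:
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]


# Demo with a temporary folder containing one of two expected files.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "Medical_book.pdf"
    present.write_bytes(b"%PDF-1.4")  # placeholder bytes, not a real PDF
    absent = Path(tmp) / "Human_Anatomy.pdf"

    missing = find_missing([str(present), str(absent)])
    print([Path(p).name for p in missing])  # → ['Human_Anatomy.pdf']
```

`pathlib.Path` accepts forward slashes on Windows as well, so the same path strings work across operating systems.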

In [17]:
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector_store = FAISS.from_documents(data_books, embeddings)

In [23]:
from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain.chains import LLMChain

vector_store.save_local("faiss_index")

def query_rag_model(question: str) -> str:
    # Reject empty questions first, before spending a retrieval or LLM call.
    if not question.strip():
        return "You haven't given a question. Please give a question."
    # Fetch the pages most similar to the question and join them into one context string.
    retrieved_docs = vector_store.similarity_search(question)
    docs_content = "\n".join(doc.page_content for doc in retrieved_docs)
    system_message = SystemMessagePromptTemplate.from_template(
        "You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}"
    )
    human_message = HumanMessagePromptTemplate.from_template("{input}")
    chat_prompt = ChatPromptTemplate.from_messages([system_message, human_message])
    llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.2)
    chain = LLMChain(llm=llm, prompt=chat_prompt)
    return chain.run({"context": docs_content, "input": question})

question = "what is an eye?"
response = query_rag_model(question)
print(response,end=" ")

The eye is the organ of vision.  It contains receptors for vision and a refracting system that focuses light rays onto the retina.  The eyeball is protected by the orbit, formed by various bones.
 

## CONCLUSION 

THE RAG MODEL ABOVE IS BUILT WITH LANGCHAIN AND GEMINI 1.5 (PRO FOR THE SINGLE-BOOK CHAIN, FLASH FOR THE MULTI-BOOK CHAIN) AND RUNS ON PYTHON 3.10.0. THE MULTI-BOOK MODEL IS INDEXED ON 5 MEDICAL AND HUMAN ANATOMY BOOKS, AND SO FAR IT HAS SHOWN 100% ACCURACY ON THE QUESTIONS TRIED.
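
One way to put a number on accuracy is to run a fixed list of questions and check each answer for expected keywords. The sketch below shows the idea under loud assumptions: `keyword_accuracy` is a hypothetical helper, `fake_answer` is a stand-in for the real model call, and the evaluation set is invented for illustration. None of it is a measured result for this notebook.

```python
from typing import Callable


def keyword_accuracy(
    answer_fn: Callable[[str], str],
    eval_set: list[tuple[str, list[str]]],
) -> float:
    """Fraction of questions whose answer contains all expected keywords."""
    if not eval_set:
        return 0.0
    hits = 0
    for question, keywords in eval_set:
        answer = answer_fn(question).lower()
        if all(keyword.lower() in answer for keyword in keywords):
            hits += 1
    return hits / len(eval_set)


def fake_answer(question: str) -> str:
    """Stand-in for the real RAG call; returns canned text."""
    canned = {
        "What is Myopia?": "Myopia, or nearsightedness, is a refractive error.",
        "what is an eye?": "The eye is the organ of vision.",
    }
    return canned.get(question, "I don't know.")


# Invented evaluation set: (question, keywords the answer should contain).
eval_set = [
    ("What is Myopia?", ["nearsightedness"]),
    ("what is an eye?", ["vision"]),
    ("What is a femur?", ["bone"]),
]
print(f"accuracy: {keyword_accuracy(fake_answer, eval_set):.2f}")
```

Keyword matching is a crude proxy: it can pass a wrong answer that happens to contain the word, so a human review of a sample of answers is still worthwhile.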