In [1]:
!pip install langchain langchain_community langchain-google-genai python-dotenv langchain_experimental langchain_chroma langchainhub pypdf

In [2]:
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Update the path to your PDF file
loader = PyPDFLoader("/content/HealthCareAI.pdf")
data = loader.load()  # returns a list with one Document per PDF page

# Verify the data
print(data)

# Split the pages into chunks of at most 500 characters each
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500)
docs = text_splitter.split_documents(data)

print("Total number of documents:", len(docs))

[Document(metadata={'source': '/content/HealthCareAI.pdf', 'page': 0, 'page_label': '1'}, page_content='Artificial Intelligence in Healthcare \nResearch Papers and Summaries: \n1. "Artificial Intelligence in Medical Imaging" by John Smith \no Summary: This paper explores the application of AI in medical imaging, focusing on \nthe use of deep learning algorithms to improve the accuracy and efficiency of image \nanalysis. It discusses various models such as convolutional neural networks \n(CNNs) and their impact on diagnostic processes. \n2. "AI-Driven Personalized Medicine" by Jane Doe \no Summary: This paper examines how AI can be used to tailor medical treatments to \nindividual patients. It highlights the role of machine learning in analyzing patient \ndata to predict treatment outcomes and optimize therapeutic strategies. \n3. "Robotic Surgery: The Future of Minimally Invasive Procedures" by Michael Johnson \no Summary: This paper reviews the advancements in robotic surgery, discuss
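To make the effect of chunk_size more concrete, here is a minimal sketch of fixed-width chunking with overlap. It is not how RecursiveCharacterTextSplitter works internally (that class splits on separators such as paragraph and line breaks before falling back to characters); the function name split_with_overlap and the toy sizes are assumptions made purely for illustration.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 200) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy example: 30 characters, windows of 12 with 4 characters shared
sample = "abcdefghij" * 3
chunks = split_with_overlap(sample, chunk_size=12, chunk_overlap=4)
for chunk in chunks:
    print(repr(chunk))
```

Overlap is the reason neighbouring chunks can repeat a sentence fragment: the tail of one window is the head of the next.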

In [3]:
import os
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from dotenv import load_dotenv
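from google.colab import userdata

# Optionally load variables from a local .env file (does nothing if no file exists)
load_dotenv()

# Read the key from Colab's secret store and expose it to the Google client
api_key = userdata.get('GOOGLE_API_KEY')
os.environ["GOOGLE_API_KEY"] = api_key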
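The cell above only works inside Colab, because google.colab is not importable elsewhere. A hedged sketch of a fallback follows: it tries the Colab secret store first and then an ordinary environment variable. The helper name resolve_api_key is hypothetical, not part of any library.

```python
import os


def resolve_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the key from Colab secrets if available, else from the environment."""
    try:
        # Only importable inside Google Colab; any failure falls through to os.environ
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        key = None
    key = key or os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} was not found in Colab secrets or the environment")
    return key
```

A typical use would be os.environ["GOOGLE_API_KEY"] = resolve_api_key(), placed after load_dotenv() so that a key stored in a .env file is also picked up.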


In [4]:
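# Embed each chunk with Google's embedding model and index it in a Chroma vector store
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)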

In [14]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10})

retrieved_docs = retriever.invoke("Can you summarize recent AI research papers?")

In [15]:
len(retrieved_docs)

10
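Conceptually, a similarity retriever with k=10 embeds the query and returns the k stored chunks whose vectors are closest to it. The sketch below illustrates that idea with cosine similarity on toy 2-D vectors. It is an assumption-laden illustration only: the distance metric and index that Chroma actually uses depend on its collection settings, and cosine_similarity and top_k are names invented here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int) -> list[int]:
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy stand-ins for embeddings; real embeddings have hundreds of dimensions
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
toy_query = [1.0, 0.1]
nearest = top_k(toy_query, toy_docs, k=2)
print(nearest)
```

Because the retriever always returns k results when at least k chunks exist, a count of 10 says nothing about how relevant each chunk is.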

In [16]:
for doc in retrieved_docs:
    print(doc.page_content)

Artificial Intelligence in Healthcare 
Research Papers and Summaries: 
1. "Artificial Intelligence in Medical Imaging" by John Smith 
o Summary: This paper explores the application of AI in medical imaging, focusing on 
the use of deep learning algorithms to improve the accuracy and efficiency of image 
analysis. It discusses various models such as convolutional neural networks 
(CNNs) and their impact on diagnostic processes. 
2. "AI-Driven Personalized Medicine" by Jane Doe
healthcare applications, such as electronic health records (EHRs) and clinical 
decision support systems. It explores how NLP can improve data extraction and 
patient care. 
5. "AI in Drug Discovery and Development" by Robert Brown 
o Summary: This paper explores the role of AI in accelerating the drug discovery and 
development process. It discusses how machine learning algorithms can identify 
potential drug candidates and predict their efficacy and safety.
development process. It discusses how machine learning 
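In the output above the chunks run together, which makes it hard to see where one ends and the next begins. A small sketch of a formatter that adds a header per chunk is shown here. FakeDocument is a hypothetical stand-in for LangChain's Document so the example runs without any dependencies; the "page" metadata key is the one visible in the loader output earlier.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in with the two attributes used here: page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_chunks(docs) -> str:
    """Join chunks with a numbered header showing the source page of each one."""
    blocks = []
    for i, doc in enumerate(docs, start=1):
        page = doc.metadata.get("page", "?")
        blocks.append(f"--- chunk {i} (page {page}) ---\n{doc.page_content.strip()}")
    return "\n\n".join(blocks)


demo_docs = [
    FakeDocument("First chunk of text. ", {"page": 0}),
    FakeDocument("Second chunk of text.", {}),
]
formatted = format_chunks(demo_docs)
print(formatted)
```

The same function should accept retrieved_docs directly, since it only reads page_content and metadata.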

In [19]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0.6, max_tokens=500)

In [20]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [21]:
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
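The "stuff" strategy simply concatenates the retrieved chunks and substitutes them for the {context} placeholder before the model is called. The sketch below mimics that step with plain string formatting. It is a simplified illustration rather than LangChain's implementation; the helper names and the blank-line separator are assumptions.

```python
SYSTEM_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)


def stuff_context(chunks: list[str], separator: str = "\n\n") -> str:
    """Concatenate all chunk texts into one context string."""
    return separator.join(chunks)


def build_messages(question: str, chunks: list[str]) -> list[tuple[str, str]]:
    """Fill the system template with the stuffed context and pair it with the question."""
    system_text = SYSTEM_TEMPLATE.format(context=stuff_context(chunks))
    return [("system", system_text), ("human", question)]


messages = build_messages(
    "Which paper covers drug discovery?",
    ["Chunk about medical imaging.", "Chunk about drug discovery."],
)
for role, text in messages:
    print(f"[{role}] {text}\n")
```

Since every retrieved chunk is placed in a single prompt, the total length grows with k and chunk_size, which is worth keeping in mind when choosing either value.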

In [22]:
response = rag_chain.invoke({"input": "Can you provide a detailed summary of the latest advancements in artificial intelligence applications in healthcare, including key research papers and their findings?"})
print("RAG Output:", response["answer"])

RAG Output: Several research papers highlight AI's growing role in healthcare.  These include AI's use in medical imaging analysis (Smith), personalized medicine (Doe), drug discovery (Brown), and mental health (Taylor).  Other applications include predictive analytics for patient outcomes (Anderson), enhanced clinical trials (White), and chronic disease management (Young).
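Besides "answer", the dictionary returned by a retrieval chain also carries the retrieved documents under "context", which makes it possible to check where an answer came from. The sketch below lists the source pages from such a response. The fake_response object and the cited_pages helper are hypothetical stand-ins so the example runs without calling the model.

```python
from types import SimpleNamespace


def cited_pages(response: dict) -> list[int]:
    """Return the sorted, de-duplicated page numbers of the retrieved chunks."""
    pages = {
        doc.metadata["page"]
        for doc in response.get("context", [])
        if "page" in doc.metadata
    }
    return sorted(pages)


# Stand-in shaped like a retrieval-chain result: input, context, answer
fake_response = {
    "input": "example question",
    "context": [
        SimpleNamespace(page_content="chunk a", metadata={"page": 0}),
        SimpleNamespace(page_content="chunk b", metadata={"page": 2}),
        SimpleNamespace(page_content="chunk c", metadata={"page": 0}),
    ],
    "answer": "example answer",
}
pages = cited_pages(fake_response)
print("Pages used as context:", pages)
```

Calling cited_pages(response) on the real result should work the same way, assuming the documents keep the "page" metadata set by the PDF loader.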
