# Medical Chatbot with Retrieval-Augmented Generation (RAG)

This Jupyter Notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) medical chatbot. RAG is a powerful approach that combines language models with a retrieval component to enhance responses by incorporating information from a knowledge base. This setup is particularly useful for domains like healthcare, where accurate and factual information is essential.

## 1. Importing Libraries

We begin by importing the necessary libraries:
- **`langchain`**: Used to manage prompt templates, chains, and memory for conversational contexts.
- **`OpenAI`**: Provides the language model for response generation.

In [1]:
%%writefile requirements.txt
langchain
langchain-community
langchain-openai
pypdf
langchain-chroma
gradio

Writing requirements.txt


In [2]:
!pip install -q -r  requirements.txt

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/67.3 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m67.3/67.3 kB[0m [31m2.9 MB/s[0m eta [36m0:00:00[0m
[?25h  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.4/2.4 MB[0m [31m31.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.4/50.4 kB[0m [31m2.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m298.0/298.0 kB[0m [31m18.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m56.7/56.7 MB[0m [31m10.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m319.8/319.8 kB[0m [31m17.0 MB/s[0m eta [36m0:00:

In [3]:
from langchain_openai import OpenAI
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain import hub
from google.colab import userdata
import os

##2. Setting Up Environment Variables
To interact with the OpenAI API, you need an API key. The code snippet below fetches the API key stored as an environment variable. Ensure the variable OPENAI_API_KEY is set in your environment for the chatbot to function correctly.

In [4]:
os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

## 3. Indexing

We start by preparing our documents for retrieval using embeddings and vector storage.

### 3.1 Load

In this step, we load medical literature from a PDF file. The document used here is **The GALE Encyclopedia of Medicine**, which provides reliable medical information that the chatbot will draw upon when answering questions.

You can upload this or other medical documents to provide a robust foundation for the chatbot's responses.

In [5]:
from google.colab import files
files.upload()

Saving data.pdf to data.pdf


In [6]:
def load_documents(path):
  loader = PyPDFLoader(file_path=path)
  pages = loader.load_and_split()
  return pages

In [24]:
data_path = "/content/data.pdf"
pages =load_documents(data_path)

In [None]:
print(pages[50].page_content)

### 3.2 Split

The documents are split into chunks to make the retrieval process more efficient. This allows for targeted responses to user queries.

In [25]:
def text_split(document, chunk_size = 1000, chunk_overlap = 0):
  text_splitter = RecursiveCharacterTextSplitter(chunk_size = chunk_size, chunk_overlap =  chunk_overlap)
  text_splitted = text_splitter.split_documents(document)
  return text_splitted

In [26]:
text_splitted = text_split(pages)

In [27]:
print(text_splitted[1500].page_content)

Agoraphobic Foundation of Canada. P.O. Box 132, Chomedey,
Laval, Quebec. H7W 4K2, Canada.
Agoraphobics In Motion. 605 W. 11 Mile Rd., Royal Oak, MI
48067. (248) 547-0400.
American Psychiatric Association. 1400 K Street NW, Washing-
ton DC 20005. (888) 357-7924. <http://www.psych.org>.
Anxiety Disorders Association of America. 11900 Parklawn
Dr., Ste. 100, Rockville, MD 20852. (301) 231-9350.
<http://www.adaa.org>.
National Alliance for the Mentally Ill (NAMI). Colonial Place
Three, 2107 Wilson Blvd., Ste. 300, Arlington, V A 22201-
3042. (800) 950-6264. <http://www.nami.org>.
National Anxiety Foundation. 3135 Custer Dr., Lexington, KY
40517. (606) 272-7166. <http://www.lexington-on-line.
com/naf.html>.
National Institute of Mental Health. Mental Health Public
Inquiries, 5600 Fishers Lane, Room 15C-05, Rockville,
MD 20857. (888) 826-9438. <http://www.nimh.nih.gov>.
National Mental Health Association. 1021 Prince St., Alexan-
dria, V A 22314. (703) 684-7722. <http://www.nmha.org>.


### 3.3 Store

We create embeddings of our text chunks and store them in a vector database. This allows us to search for similar content efficiently.

In [28]:
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
db = Chroma.from_documents(text_splitted, embeddings)

## 4. Retrieval and Generation

With the indexed documents, we can now retrieve relevant information and generate responses based on user questions.

### 4.1 Retrieve

We set up a retriever to find content related to a user's query based on similarity with stored embeddings.

In [29]:
retriever = db.as_retriever(search_type= "similarity")

In [30]:
question = retriever.invoke("what's diabetes")

In [31]:
for i in range(len(question)):
  print(f"the {i+1}th similar content :\n \n {question[i].page_content}\n \n")

the 1th similar content :
 
 about 10 years after the beginning of diabetes. In the Unit-
ed States, new cases of blindness are most often caused by
diabetic retinopathy. Among these new cases of blindness,
12% are people between the ages of 20 to 44 years, and
19% are people between the ages of 45 to 64 years.
Causes and symptoms
There are many causes of retinopathy. Some of the
more common ones are listed below.
Diabetic retinopathy
Diabetes is a complex disorder characterized by an
inability of the body to properly regulate the levels of
sugar and insulin (a hormone made by the pancreas) in the
blood. As diabetes progresses, the blood vessels that feed
the retina become damaged in different ways. The dam-
aged vessels can have bulges in their walls (aneurysms),
they can leak blood into the surrounding jelly-like material
(vitreous) that fills the inside of the eyeball, they can
become completely closed, or new vessels can begin to
grow where there would not normally be blood vessels

### 4.2 Multi Query

we create multiple versions of a question to capture different perspectives or possible interpretations. This improves retrieval diversity.

In [32]:
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")

In [33]:
from langchain.prompts import ChatPromptTemplate

# Multi Query: Different Perspectives
template = """You are an AI language model assistant. Your task is to generate five
different versions of the given user question to retrieve relevant documents from a vector
database. By generating multiple perspectives on the user question, your goal is to help
the user overcome some of the limitations of the distance-based similarity search.
Provide these alternative questions separated by newlines. Original question: {question}"""
prompt_perspectives = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

generate_queries = (
    prompt_perspectives
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)

In [34]:
generate_queries.invoke("what's diabetes")

['1. Can you provide information on diabetes?',
 '2. What are the key aspects of diabetes that I should know about?',
 '3. Could you explain the causes and symptoms of diabetes?',
 '4. What are the different types of diabetes and their effects on the body?',
 '5. How can diabetes be managed and treated effectively?']

In [35]:
from langchain.load import dumps, loads

def get_unique_union(documents: list[list]):
    """ Unique union of retrieved docs """
    # Flatten list of lists, and convert each Document to string
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Get unique documents
    unique_docs = list(set(flattened_docs))
    return [loads(doc) for doc in unique_docs]

# Retrieve
question = "What's diabetes'?"
retrieval_chain = generate_queries | retriever.map() | get_unique_union
docs = retrieval_chain.invoke({"question":question})

4

In [42]:
for i in range(len(docs)):
  print(f"{i+1}. {docs[1].page_content}\n \n")


1. myasthenia gravis. This type of polyglandular defi-
ciency syndrome often produces insulin-dependent dia-
betes mellitus (IDDM).
• Type III disease may produce diabetes or adrenal fail-
ure combined with thyroid problems. It may also
include baldness (alopecia), anemia, and vitiligo (con-
dition characterized by white patches on normally pig-
mented skin).
Not all symptoms of any syndrome appear at once or
in the same patient.
Diagnosis
Because these diseases evolve over time, the final
diagnosis may not appear for years. A family history is
very helpful in knowing what to expect. Any single
endocrine abnormality should heighten suspicion that
KEY TERMS
Antibody —A weapon in the body’s immune
defense arsenal that attacks a specific antigen.
Congenital—Present at birth.
Myasthenia gravis—A disease that causes muscle
weakness.
Rubella—German measles.
Syndrome —A collection of abnormalities that
occur often enough to suggest they have a com-
mon cause.
 

2. myasthenia gravis. This typ

### 4.3 Generate

We process the retrieved information to generate a response using a language model, enhancing the chatbot's ability to answer complex queries.

In [53]:
from operator import itemgetter
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnablePassthrough
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers.string import StrOutputParser

# RAG template
template = """Answer the following question based on this context and previous conversation:

Context:
{context}

Chat history:
{chat_history}

New human question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

# Initialize model and memory
llm = ChatOpenAI(temperature=1)
memory = ConversationBufferWindowMemory(k = 3, memory_key="chat_history")

# Define RAG pipeline without memory directly
final_rag_chain = (
    {"context": itemgetter("context"),
     "question": itemgetter("question"),
     "chat_history": itemgetter("chat_history")}  # Include chat_history as a key here
    | prompt
    | llm
    | StrOutputParser()
)

# Retrieve the chat history from memory and include it in the input
def invoke_with_memory(question, retrieval_chain):
    # Load current chat history from memory
    chat_history = memory.load_memory_variables({}).get("chat_history", "")
    result = final_rag_chain.invoke({
        "question": question,
        "context": retrieval_chain,  # Assuming retrieval_chain is defined elsewhere
        "chat_history": chat_history,
    })
    # Update memory with the new interaction
    memory.save_context({"question": question}, {"answer": result})
    return result

# Example call
invoke_with_memory("what's diabetes?", retrieval_chain=retrieval_chain)


  memory = ConversationBufferWindowMemory(k = 3, memory_key="chat_history")


'Diabetes is a complex disorder characterized by an inability of the body to properly regulate the levels of sugar and insulin in the blood.'

In [38]:
invoke_with_memory("what's its cause?", retrieval_chain=retrieval_chain)

'Based on the context provided, the cause of disorders like somatoform disorders and paraphilias can be influenced by factors such as unconscious reflection or imitation of parental behaviors, cultural influences, biological factors, difficulty forming personal relationships, childhood trauma, and conditioning.'

## 5. UI

Finally, we build a simple interface with Gradio to interact with the medical chatbot. Users can type questions, and the chatbot will respond with relevant information.

In [56]:
import gradio as gr

chat_history = []

def medical_chatbot(query):
    global chat_history

    response = invoke_with_memory(query, retrieval_chain)
    response = response.replace("Based on the context and previous conversation, ", "")
    response = response.replace("Based on the conversation and context provided, ", "")
    response = response.replace("Based on our previous conversation,", "")
    response = response.replace("Based on the context provided and our previous conversation,", "")


    chat_history.append((query, response))

    return chat_history

def reset_conversation():
    global chat_history
    chat_history = []
    memory.aclear()
    return chat_history

with gr.Blocks() as interface:
    gr.Markdown("# Medical Chatbot Assistant")
    gr.Markdown("Ask me any medical question, and I'll try to provide helpful information based on the provided data.")

    chatbot = gr.Chatbot()
    query = gr.Textbox(label="Your Question", placeholder="Type your medical question here...")

    submit_button = gr.Button("Get Answer")
    reset_button = gr.Button("Start New Conversation")

    submit_button.click(fn=medical_chatbot, inputs=query, outputs=chatbot)
    reset_button.click(fn=reset_conversation, inputs=None, outputs=chatbot)

    submit_button.click(lambda: "", None, query)

interface.launch()



Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://84a3d3c458fd6b3f4f.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


