<a href="https://colab.research.google.com/github/SriVarshaCheruku/MedicalChatbot/blob/main/Medical_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

BioMistral Medical RAG Chatbot Using an Open-Source LLM

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [9]:
# Install required packages (llama-cpp-python is needed by the LlamaCpp wrapper used later)
!pip install langchain langchain_community sentence-transformers chromadb pypdf llama-cpp-python




In [3]:
from langchain_community.document_loaders import PyPDFDirectoryLoader

# Note: the Drive folder name begins with a space (" BioMistral"), so the path must include it
loader = PyPDFDirectoryLoader("/content/drive/MyDrive/ BioMistral/Data")
docs = loader.load()

print(f"Loaded documents: {len(docs)}")

# View a sample document
if docs:
    print(docs[0].page_content[:500])
else:
    print("⚠️ No documents loaded. Double-check the folder path, file format, and permissions.")


Loaded documents: 95
YOUR GUIDE TO
A Healthy Heart
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
National Institutes of Health
National Heart, Lung, and Blood Institute


In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the loaded documents into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(docs)

print(f"Total chunks created: {len(chunks)}")

# Optional: See a sample chunk
print("\nSample chunk:")
print(chunks[0].page_content)


Total chunks created: 585

Sample chunk:
YOUR GUIDE TO
A Healthy Heart
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
National Institutes of Health
National Heart, Lung, and Blood Institute
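
To make the effect of chunk_size and chunk_overlap concrete, here is a simplified sketch of fixed-size character chunking with overlap. It is not how RecursiveCharacterTextSplitter works internally (that splitter prefers to break on separators such as paragraph and line breaks before falling back to characters); the function name sliding_chunks and the sample text are hypothetical and only illustrate how consecutive chunks share their boundary text.

```python
def sliding_chunks(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Illustration only: each window starts (chunk_size - chunk_overlap)
    characters after the previous one, so neighbours share chunk_overlap chars.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical sample text: 700 characters cycling through A-Z
sample = "".join(chr(65 + i % 26) for i in range(700))
pieces = sliding_chunks(sample)
print([len(p) for p in pieces])
print(pieces[0][-50:] == pieces[1][:50])
```

A larger overlap lowers the chance that a sentence is cut between two chunks, at the cost of storing and embedding more near-duplicate text.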


In [5]:
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

# Load embedding model (specialized for medical data)
embedding_model = SentenceTransformerEmbeddings(model_name="NeuML/pubmedbert-base-embeddings")

# Build vector database from chunks
vectorstore = Chroma.from_documents(chunks, embedding_model)

print("✅ Vectorstore created successfully.")


✅ Vectorstore created successfully.


In [6]:
# Example medical question
query = "Who is at risk of heart disease?"

# Perform similarity search in the vectorstore
results = vectorstore.similarity_search(query, k=3)

# Show the top 3 results
for i, doc in enumerate(results):
    print(f"\nResult {i+1}:\n{doc.page_content}")



Result 1:
4
Who Is at Risk?
Risk factors are conditions or habits that make a person more likely
to develop a disease. They can also increase the chances that an
existing disease will get worse. Important risk factors for heart dis-
ease that you can do something about are cigarette smoking, high

Result 2:
heart disease risk increases enormously. The message is clear: You
need to take heart disease risk seriously, and the best time to reduce
that risk is now.
Your Guide to a Healthy Heart

Result 3:
at least one risk factor for heart disease.
Every risk factor counts. Research shows that each individual risk
factor greatly increases the chances of developing heart disease.
Moreover, the worse a particular risk factor is, the more likely you
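
The search above embeds the query and returns the chunks whose vectors are closest to it. A minimal sketch of that idea, using cosine similarity over made-up two-dimensional vectors, is shown here. The helper names cosine and top_k and the toy vectors are assumptions for illustration; Chroma's actual index and distance metric may differ, and real embeddings have hundreds of dimensions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy "embeddings" (made up): documents 0 and 2 point roughly the same way as the query
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
toy_query = [1.0, 0.0]
print(top_k(toy_query, toy_docs, k=2))
```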


In [8]:
from langchain_community.llms import LlamaCpp
from langchain.chains import RetrievalQA

# Step 1: Load the Mistral LLM from your uploaded .gguf file
llm = LlamaCpp(
    model_path="/content/drive/MyDrive/ BioMistral/mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    n_ctx=2048,
    temperature=0.1,
    max_tokens=512,
    verbose=True
)

# Step 2: Connect your retriever (built from vectorstore)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Step 3: Build the chatbot pipeline using RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type="stuff"
)


llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from /content/drive/MyDrive/ BioMistral/mistral-7b-instruct-v0.1.Q4_K_M.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.1
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: -
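
The chain_type="stuff" setting means that all retrieved chunks are placed ("stuffed") into a single prompt together with the question. A rough sketch of that idea follows. The function build_stuff_prompt and the template wording are hypothetical and are not LangChain's actual prompt; the sketch only shows the shape of what the model receives.

```python
def build_stuff_prompt(question: str, passages: list[str]) -> str:
    """Join all retrieved passages into one context block ahead of the question."""
    context = "\n\n".join(passages)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder passages standing in for the k=3 chunks returned by the retriever
retrieved = ["passage one", "passage two", "passage three"]
prompt = build_stuff_prompt("Who is at risk of heart disease?", retrieved)
print(prompt)
```

Because every chunk goes into one prompt, the number of chunks (k) and their size have to fit within the model's context window (n_ctx) alongside the generated answer.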

In [10]:

# Step 4: Ask your medical chatbot a question
query = input("Enter your medical question: ")
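# invoke() replaces the deprecated run(); RetrievalQA returns the answer under "result"
response = qa_chain.invoke({"query": query})["result"]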

print("💬 Answer from Chatbot:\n", response)


Enter your medical question: What are the symptoms of heart disease in women?


llama_perf_context_print:        load time =   75915.51 ms
llama_perf_context_print: prompt eval time =   75915.31 ms /   278 tokens (  273.08 ms per token,     3.66 tokens per second)
llama_perf_context_print:        eval time =   40434.65 ms /   109 runs   (  370.96 ms per token,     2.70 tokens per second)
llama_perf_context_print:       total time =  116426.38 ms /   387 tokens


💬 Answer from Chatbot:
  The symptoms of heart disease in women can be similar to those in men, such as chest pain or discomfort, shortness of breath, nausea, vomiting, sweating, and lightheadedness. However, some women may have different symptoms, such as fatigue, weakness, swelling in the ankles and feet, and difficulty sleeping. It’s important to note that many women with heart disease have no symptoms at all, which is why regular check-ups and screenings are crucial for early detection and treatment.
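
The timing log above can be checked with a little arithmetic. The sketch below recomputes the throughput figures from the logged token counts and durations, and estimates how much of the 2048-token context window is left after the 278-token prompt and a worst-case 512-token answer. The helper names are hypothetical; the numbers are copied from the cell and log above.

```python
def context_headroom(prompt_tokens: int, max_new_tokens: int, n_ctx: int) -> int:
    """Tokens left in the context window after the prompt and a worst-case answer."""
    return n_ctx - (prompt_tokens + max_new_tokens)


def tokens_per_second(n_tokens: int, elapsed_ms: float) -> float:
    """Throughput in tokens per second from a token count and a duration in ms."""
    return n_tokens / (elapsed_ms / 1000.0)


# Values taken from the LlamaCpp settings and the llama_perf output above
headroom = context_headroom(prompt_tokens=278, max_new_tokens=512, n_ctx=2048)
prompt_speed = tokens_per_second(278, 75915.31)
generation_speed = tokens_per_second(109, 40434.65)

print(f"Context headroom: {headroom} tokens")
print(f"Prompt eval: {prompt_speed:.2f} tokens/s")
print(f"Generation: {generation_speed:.2f} tokens/s")
```

A negative headroom would mean the prompt plus the maximum answer length no longer fits in n_ctx, which is a signal to retrieve fewer or smaller chunks.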


Sample Questions:
Who is at risk of heart disease?
What are the symptoms of heart disease in women?
What are the major risk factors for a heart attack?
How does high blood pressure affect heart health?
What are the warning signs of a heart attack?
What are the early signs of heart disease?
How can I reduce my risk of heart disease?
What foods are good for heart health?
How much sodium should I consume daily?
What do HDL and LDL cholesterol mean?
How does exercise help prevent heart disease?
What is a normal blood pressure reading?
What is angioplasty and when is it needed?
Are heart attack symptoms different in men and women?
What is a stress test used for?
What should I do if I think I’m having a heart attack?
Can young people get heart disease?
What is the role of cholesterol in heart health?
What lifestyle changes help prevent heart problems?
How often should I get my cholesterol checked?
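
To try several of these questions in one go, a small loop can collect the answers. This is a sketch under assumptions: ask_all is a hypothetical helper, and fake_ask is a stand-in so the example runs without the model. In the notebook, the stand-in could be replaced with a function that calls the chain, for example lambda q: qa_chain.invoke({"query": q})["result"].

```python
from typing import Callable


def ask_all(questions: list[str], ask: Callable[[str], str]) -> dict[str, str]:
    """Send each non-empty question to `ask` and collect the answers by question."""
    answers = {}
    for question in questions:
        question = question.strip()
        if not question:
            continue  # skip blank lines
        answers[question] = ask(question)
    return answers


def fake_ask(question: str) -> str:
    """Stand-in for the real chain; it only echoes the question."""
    return f"[answer to: {question}]"


sample_questions = [
    "Who is at risk of heart disease?",
    "What are the warning signs of a heart attack?",
    "",
]
for question, answer in ask_all(sample_questions, fake_ask).items():
    print(f"Q: {question}\nA: {answer}\n")
```

At the generation speed logged earlier, each question can take a minute or more on CPU, so it may be worth starting with a short list.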
