Building BioMistral RAG Chatbot using BioMistral Open Source LLM

Load the google drive

In [13]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Installing packages

In [1]:
!pip install langchain sentence-transformers chromadb llama-cpp-python langchain_community pypdf



Import libraries

In [2]:
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain_community.llms import LlamaCpp
from langchain.chains import RetrievalQA, LLMChain

Import document

In [3]:
loader = PyPDFDirectoryLoader("/content/drive/MyDrive/Biomistral/Data")
docs = loader.load()

In [4]:
len(docs)

20

In [5]:
docs[10]

Document(metadata={'producer': 'ReportLab PDF Library - www.reportlab.com', 'creator': '(unspecified)', 'creationdate': '2025-09-25T09:00:43+00:00', 'author': '(anonymous)', 'keywords': '', 'moddate': '2025-09-25T09:00:43+00:00', 'subject': '(unspecified)', 'title': '(anonymous)', 'trapped': '/False', 'source': '/content/drive/MyDrive/Biomistral/Data/heart_diseases_training_full.pdf', 'total_pages': 20, 'page': 10, 'page_label': '11'}, page_content='Surgical and Interventional Procedures\nKey surgical approaches include: - Coronary artery bypass grafting (CABG): Restores\nblood flow to heart. - Valve repair/replacement: Treats stenosis or regurgitation. -\nPacemakers and ICDs: Correct electrical conduction problems. - Heart transplant: Option\nfor end-stage heart failure. Minimally invasive approaches and robotic-assisted surgeries\nare improving patient outcomes and recovery times.')

Chunking

In [6]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = text_splitter.split_documents(docs)

In [7]:
len(chunks)

42

In [8]:
chunks[30]

Document(metadata={'producer': 'ReportLab PDF Library - www.reportlab.com', 'creator': '(unspecified)', 'creationdate': '2025-09-25T09:00:43+00:00', 'author': '(anonymous)', 'keywords': '', 'moddate': '2025-09-25T09:00:43+00:00', 'subject': '(unspecified)', 'title': '(anonymous)', 'trapped': '/False', 'source': '/content/drive/MyDrive/Biomistral/Data/heart_diseases_training_full.pdf', 'total_pages': 20, 'page': 14, 'page_label': '15'}, page_content='Arrhythmias\nArrhythmias are disorders of heart rhythm, ranging from harmless to life-threatening: -\nBradycardia: Slow heartbeat. - Tachycardia: Fast heartbeat. - Atrial fibrillation: Common\narrhythmia with risk of stroke. - Ventricular fibrillation: Medical emergency leading to')

In [9]:
chunks[31]

Document(metadata={'producer': 'ReportLab PDF Library - www.reportlab.com', 'creator': '(unspecified)', 'creationdate': '2025-09-25T09:00:43+00:00', 'author': '(anonymous)', 'keywords': '', 'moddate': '2025-09-25T09:00:43+00:00', 'subject': '(unspecified)', 'title': '(anonymous)', 'trapped': '/False', 'source': '/content/drive/MyDrive/Biomistral/Data/heart_diseases_training_full.pdf', 'total_pages': 20, 'page': 14, 'page_label': '15'}, page_content='cardiac arrest. Treatment includes anti-arrhythmic drugs, catheter ablation, and\nimplantable devices (pacemakers, defibrillators).')

Embedding

In [None]:
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "paste your access token here"

In [11]:
embeddings = SentenceTransformerEmbeddings(model_name="NeuML/pubmedbert-base-embeddings")

  embeddings = SentenceTransformerEmbeddings(model_name="NeuML/pubmedbert-base-embeddings")
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Vector store creation

In [12]:
vectorstore = Chroma.from_documents(chunks, embeddings)

In [13]:
query = "What are the different types of heart diseases?"
search_results  = vectorstore.similarity_search(query)

In [14]:
search_results

[Document(metadata={'page': 3, 'author': '(anonymous)', 'moddate': '2025-09-25T09:00:43+00:00', 'title': '(anonymous)', 'keywords': '', 'total_pages': 20, 'creationdate': '2025-09-25T09:00:43+00:00', 'producer': 'ReportLab PDF Library - www.reportlab.com', 'creator': '(unspecified)', 'trapped': '/False', 'page_label': '4', 'subject': '(unspecified)', 'source': '/content/drive/MyDrive/Biomistral/Data/heart_diseases_training_full.pdf'}, page_content='Types of Heart Diseases\nHeart diseases can be broadly classified into: 1. Coronary artery disease (plaque buildup\nin arteries). 2. Arrhythmias (irregular heartbeats). 3. Cardiomyopathy (diseases of heart\nmuscle). 4. Congenital heart disease (structural defects from birth). 5. Valvular heart'),
 Document(metadata={'total_pages': 20, 'subject': '(unspecified)', 'page_label': '1', 'creationdate': '2025-09-25T09:00:43+00:00', 'trapped': '/False', 'keywords': '', 'moddate': '2025-09-25T09:00:43+00:00', 'source': '/content/drive/MyDrive/Biomist

In [15]:
retriever = vectorstore.as_retriever(search_kwargs={'k':3})

In [16]:
retriever.get_relevant_documents(query)

  retriever.get_relevant_documents(query)


[Document(metadata={'subject': '(unspecified)', 'source': '/content/drive/MyDrive/Biomistral/Data/heart_diseases_training_full.pdf', 'page_label': '4', 'keywords': '', 'producer': 'ReportLab PDF Library - www.reportlab.com', 'title': '(anonymous)', 'trapped': '/False', 'creationdate': '2025-09-25T09:00:43+00:00', 'creator': '(unspecified)', 'moddate': '2025-09-25T09:00:43+00:00', 'author': '(anonymous)', 'total_pages': 20, 'page': 3}, page_content='Types of Heart Diseases\nHeart diseases can be broadly classified into: 1. Coronary artery disease (plaque buildup\nin arteries). 2. Arrhythmias (irregular heartbeats). 3. Cardiomyopathy (diseases of heart\nmuscle). 4. Congenital heart disease (structural defects from birth). 5. Valvular heart'),
 Document(metadata={'producer': 'ReportLab PDF Library - www.reportlab.com', 'page_label': '1', 'creationdate': '2025-09-25T09:00:43+00:00', 'keywords': '', 'creator': '(unspecified)', 'author': '(anonymous)', 'subject': '(unspecified)', 'total_page

LLM Model Loading

In [18]:
llm = LlamaCpp(
    model_path = "/content/drive/MyDrive/Biomistral/ggml-model-Q4_K_M.gguf",
    temperature = 0.2,
    max_tokens = 2048,
    top_p = 1
)

llama_model_loader: loaded meta data with 21 key-value pairs and 291 tensors from /content/drive/MyDrive/Biomistral/ggml-model-Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = models
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.att

Use LLM, Retriever and Query to gnerate final response

In [19]:
template = """
<|context|>
You are a medical chatbot which follows the instructions and generate the accurate
response based on the query and the context provided. Please be truthful and give direct answers.
</s>
<|user|>
{query}
</s>
<|assisstant|>
"""

In [20]:
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain.prompts import ChatPromptTemplate

In [21]:
prompt = ChatPromptTemplate.from_template(template)

In [22]:
rag_chain = (
    {"context": retriever, "query": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

In [25]:
query = "What are the causes of heart failure?"
response = rag_chain.invoke(query)

Llama.generate: 59 prefix-match hit, remaining 17 prompt tokens to eval
llama_perf_context_print:        load time =   44456.63 ms
llama_perf_context_print: prompt eval time =   11438.56 ms /    17 tokens (  672.86 ms per token,     1.49 tokens per second)
llama_perf_context_print:        eval time =   45817.11 ms /    62 runs   (  738.99 ms per token,     1.35 tokens per second)
llama_perf_context_print:       total time =   57346.28 ms /    79 tokens
llama_perf_context_print:    graphs reused =         61


In [26]:
response

'Heart failure can be caused by several factors, including high blood pressure, diabetes, obesity, smoking, and a family history of heart disease. It can also result from infections such as pneumonia or influenza, which cause myocarditis (inflammation of the heart muscle).'

In [27]:
import sys

while True:
  user_input = input(f"Input query:")
  if user_input == "exit":
    print("Exiting...")
    sys.exit()
  if user_input == "":
    continue
  response = rag_chain.invoke(user_input)
  print(response)

Input query:What is meant by heart disease?


Llama.generate: 57 prefix-match hit, remaining 18 prompt tokens to eval
llama_perf_context_print:        load time =   44456.63 ms
llama_perf_context_print: prompt eval time =   15316.23 ms /    18 tokens (  850.90 ms per token,     1.18 tokens per second)
llama_perf_context_print:        eval time =   80527.34 ms /    99 runs   (  813.41 ms per token,     1.23 tokens per second)
llama_perf_context_print:       total time =   96005.73 ms /   117 tokens
llama_perf_context_print:    graphs reused =         96


Heart disease is a condition in which the heart is not functioning properly. It can be caused by various factors such as high blood pressure, diabetes, smoking, and high cholesterol levels. Heart disease can lead to serious complications such as heart attack or stroke, so it is important to take steps to prevent it. This can include eating a healthy diet, exercising regularly, and avoiding tobacco products. If you are concerned about your risk of heart disease, talk to your doctor.
Input query:What are the symptoms of heart attack?


Llama.generate: 57 prefix-match hit, remaining 19 prompt tokens to eval
llama_perf_context_print:        load time =   44456.63 ms
llama_perf_context_print: prompt eval time =    8204.58 ms /    19 tokens (  431.82 ms per token,     2.32 tokens per second)
llama_perf_context_print:        eval time =   58544.49 ms /    77 runs   (  760.32 ms per token,     1.32 tokens per second)
llama_perf_context_print:       total time =   66859.90 ms /    96 tokens
llama_perf_context_print:    graphs reused =         75


The symptoms of a heart attack are chest pain or discomfort that lasts more than 5 minutes, especially if it is accompanied by shortness of breath, nausea, vomiting, sweating, or dizziness. Other signs and symptoms can include pain or discomfort in one or both arms or shoulders, breaking out in a cold sweat, or lightheadedness.
Input query:exit
Exiting...


SystemExit: 

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)
