## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to create accurate diagnoses and treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

**Common Questions to Answer**

**1. Diagnostic Assistance**: "What are the common symptoms and treatments for pulmonary embolism?"

**2. Drug Information**: "Can you provide the trade names of medications used for treating hypertension?"

**3. Treatment Plans**: "What are the first-line options and alternatives for managing rheumatoid arthritis?"

**4. Specialty Knowledge**: "What are the diagnostic steps for suspected endocrine disorders?"

**5. Critical Care Protocols**: "What is the protocol for managing sepsis in a critical care unit?"

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co. They cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [1]:
# --- one-time compatibility setup (do BEFORE running the notebook's own cells) ---
# pins versions that play nicely with Python 3.12 and the older LangChain API used in your notebook
!pip install -q --force-reinstall \
  "pydantic==1.10.18" "typing-extensions==4.11.0" \
  "numpy==1.26.4" "packaging==23.2" \
  "langchain==0.1.1" "langchain-core==0.1.22" "langchain-community==0.0.13" \
  "huggingface_hub==0.23.2" "chromadb==0.4.22" "sentence-transformers==2.3.1" \
  "pymupdf==1.25.1" "tiktoken==0.6.0" "pandas==1.5.3"


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.34.0 requires jedi>=0.16, which is not installed.
google-colab 1.0.0 requires google-auth==2.38.0, but you have google-auth 2.43.0 which is incompatible.
google-colab 1.0.0 requires pandas==2.2.2, but you have pandas 1.5.3 which is incompatible.
google-colab 1.0.0 requires requests==2.32.4, but you have requests 2.32.5 which is incompatible.
plotnine 0.14.5 requires pandas>=2.2.0, but you have pandas 1.5.3 which is incompatible.
google-adk 1.17.0 requires opentelemetry-api<=1.37.0,>=1.37.0, but you have opentelemetry-api 1.38.0 which is incompatible.
google-adk 1.17.0 requires opentelemetry-sdk<=1.37.0,>=1.37.0, but you have opentelemetry-sdk 1.38.0 which is incompatible.
google-adk 1.17.0 requires pydantic<3.0.0,>=2.0, but you have pydantic 1.10.18 which is incompatible.
peft 0.17.1 requires huggin

In [1]:
# Installation for GPU llama-cpp-python
# run the following line when a GPU runtime is being used
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

# Installation for CPU llama-cpp-python
# comment out the GPU line above, then uncomment and run the following line when no GPU is available
# !CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q


**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in ***this notebook***.

In [2]:
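# Install the libraries used for PDF loading, chunking, embedding, the vector store, and Hugging Face Hub downloads
# Note: numpy==1.25.2 here differs from the numpy==1.26.4 pin in the compatibility cell above
!pip install huggingface_hub==0.23.2 pandas==1.5.3 tiktoken==0.6.0 pymupdf==1.25.1 langchain==0.1.1 langchain-community==0.0.13 chromadb==0.4.22 sentence-transformers==2.3.1 numpy==1.25.2 -q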

  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.


**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in ***this notebook***.

In [1]:
# Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

# Libraries for loading data, chunking, embedding, and vector databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

# Libraries for downloading and loading the LLM
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Question Answering using LLM

#### Downloading and Loading the model

In [35]:
# Define model repository and quantized file details
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [36]:
# Download model file from Hugging Face Hub
model_path = hf_hub_download(
    repo_id=model_name_or_path,     # specify repository ID
    filename=model_basename         # specify model file name
)

print("Model downloaded and cached at:", model_path)

Model downloaded and cached at: /root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q6_K.gguf


In [37]:
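# Load the model with GPU offloading (this notebook runs on a GPU runtime)
llm = Llama(
    model_path=model_path,
    n_ctx=2300,        # context window in tokens (prompt plus generated output)
    n_gpu_layers=38,   # number of model layers to offload to the GPU
    n_batch=512        # number of prompt tokens processed per batch
)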

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Response

In [7]:
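def response(query, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    """Send a prompt to the LLM and return the generated text."""
    model_output = llm(
        prompt=query,
        max_tokens=max_tokens,    # upper bound on generated tokens; longer answers are cut off
        temperature=temperature,  # 0 makes the output as deterministic as possible
        top_p=top_p,
        top_k=top_k
    )

    # The completion text is stored in the first entry of 'choices'
    return model_output['choices'][0]['text']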
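
With `max_tokens=128`, longer answers can stop mid-sentence. The sketch below shows one way to detect that, assuming the completion dictionary carries a `finish_reason` field next to `text` (as in OpenAI-style completion output). The names `generate_text` and `fake_llm` are hypothetical; `fake_llm` is only a stand-in so the example runs without loading the model.

```python
def generate_text(llm_callable, prompt, max_tokens=128):
    """Return (text, was_truncated) for a completion-style callable."""
    output = llm_callable(prompt=prompt, max_tokens=max_tokens)
    choice = output["choices"][0]
    # Assumption: 'length' means generation stopped because max_tokens was reached
    was_truncated = choice.get("finish_reason") == "length"
    return choice["text"], was_truncated


def fake_llm(prompt, max_tokens):
    """Hypothetical stub that mimics the shape of a completion response."""
    words = "Sepsis requires early recognition and rapid resuscitation".split()
    cut = len(words) > max_tokens
    return {
        "choices": [
            {
                "text": " ".join(words[:max_tokens]),
                "finish_reason": "length" if cut else "stop",
            }
        ]
    }


text, truncated = generate_text(fake_llm, "What is sepsis?", max_tokens=3)
print(text, truncated)
```

If the flag is set, one option is to rerun the query with a larger `max_tokens`, keeping the prompt plus the output within `n_ctx`.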

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [8]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
response(user_input)

'\n\nSepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:\n\n1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.\n2. Resusc'

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [9]:
user_input_2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
response(user_input_2)

Llama.generate: prefix-match hit


'\n\nAppendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:\n\n1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing'

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [10]:
user_input_3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
response(user_input_3)

Llama.generate: prefix-match hit


"\n\nSudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.\n\nThe exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications."

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [11]:
user_input_4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
response(user_input_4)

Llama.generate: prefix-match hit


"\n\nA person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:\n\n1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.\n2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as"

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [12]:
user_input_5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
response(user_input_5)

Llama.generate: prefix-match hit


"\n\nFirst and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:\n\n1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.\n2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.\n3. Immobilize the leg: Use a splint, sl"

## Question Answering using LLM with Prompt Engineering

In [13]:
system_prompt = "You are a knowledgeable and helpful medical assistant. Provide clear, accurate, and concise explanations to health-related questions."
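
The queries in this section join the system prompt and the question with a newline. As an alternative, here is a minimal sketch that wraps both in the `[INST] ... [/INST]` tags described in the Mistral-7B-Instruct model card. The helper name `build_instruct_prompt` is hypothetical, and it is an assumption (not tested in this notebook) that this format improves the answers; the beginning-of-sequence token is left for the tokenizer to add.

```python
def build_instruct_prompt(system_message, question):
    """Wrap a system message and a question in Mistral-style instruction tags."""
    # Assumption: the instruct model has no separate system role, so the
    # system message is placed at the start of the instruction block.
    return f"[INST] {system_message.strip()}\n\n{question.strip()} [/INST]"


example_system_message = "You are a knowledgeable and helpful medical assistant."
example_question = "What is the protocol for managing sepsis in a critical care unit?"

prompt = build_instruct_prompt(example_system_message, example_question)
print(prompt)
```

The resulting string can be passed to `response(...)` in place of the plain concatenation.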

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [14]:
user_input = system_prompt + "\n" + "What is the protocol for managing sepsis in a critical care unit?"
response(user_input)

Llama.generate: prefix-match hit


"\nSepsis is a life-threatening condition caused by the body's response to an infection. In a critical care unit, managing sepsis involves a multidisciplinary approach that focuses on early recognition, rapid resuscitation, and ongoing supportive care. Here are the key steps in managing sepsis in a critical care unit:\n1. Early Recognition: Identify sepsis suspects based on clinical suspicion, laboratory findings (such as leukocytosis or leukopenia, lactic acidosis, and inflammatory markers), and vital sign abnormal"

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [15]:
user_input = system_prompt + "\n" + "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
response(user_input)

Llama.generate: prefix-match hit


'\nAppendicitis is a condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The common symptoms of appendicitis include:\n1. Sudden and severe pain in the lower right abdomen that may migrate to other areas\n2. Loss of appetite and feeling sick to your stomach\n3. Nausea and vomiting\n4. Fever, often over 101°F (38.3°C)\n5. Diarrhea or constipation\n6. Abdominal swelling'

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [16]:
user_input = system_prompt + "\n" + "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
response(user_input)

Llama.generate: prefix-match hit


"\nSudden patchy hair loss, also known as alopecia areata, is an autoimmune condition that causes hair loss in small, round patches on the scalp, beard, or other areas of the body. The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system.\nThere are several treatments for addressing sudden patchy hair loss:\n1. Corticosteroids: These medications can be applied directly to the affected area or taken orally to reduce inflammation and suppress the immune system response that causes hair loss"

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [17]:
user_input = system_prompt + "\n" + "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
response(user_input)

Llama.generate: prefix-match hit


'\n\nA person with a brain injury may require various treatments depending on the severity and location of the injury. Here are some common treatments:\n\n1. Emergency care: For severe brain injuries, immediate medical attention is necessary to prevent further damage or complications. This may include surgery to remove hematomas or other obstructions, administration of medications to manage swelling or seizures, and supportive care such as ventilation and fluid replacement.\n2. Rehabilitation: Brain injury rehabilitation aims to help individuals regain lost skills and improve their overall functioning. Rehabilitation may include physical therapy to address motor'

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [18]:
user_input = system_prompt + "\n" + "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
response(user_input)


Llama.generate: prefix-match hit


'\nA leg fracture, especially one sustained during a hiking trip, can be a serious injury that requires prompt medical attention. Here are some necessary precautions and treatment steps:\n1. Assess the severity of the injury: If you suspect a leg fracture, do not move the person unless it is necessary to prevent further harm or ensure their safety. Try to keep them as still as possible to minimize discomfort and potential damage.\n2. Call for emergency medical help: If the injury is severe or if the person is unable to walk, call for emergency medical assistance. If you are in a remote area without'

## Data Preparation for RAG

### Loading the Data

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [3]:
manual_pdf_path = "/content/drive/MyDrive/AI Course/Module 5/Project 5/medical_diagnosis_manual.pdf"

In [4]:
pdf_loader = PyMuPDFLoader(manual_pdf_path)

In [5]:
manual = pdf_loader.load()

### Data Overview

#### Checking the first 5 pages

In [23]:
for i in range(5):
    print(f"Page Number : {i+1}")
    print(manual[i].page_content)

Page Number : 1
ridha.chehime@wmich.edu
6KIJLVN8X7
ant for personal use by ridha.chehime@w
shing the contents in part or full is liable 

Page Number : 2
ridha.chehime@wmich.edu
6KIJLVN8X7
This file is meant for personal use by ridha.chehime@wmich.edu only.
Sharing or publishing the contents in part or full is liable for legal action.

Page Number : 3
Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    ...................................................................................................................................................................................

#### Checking the number of pages

In [24]:
len(manual)

4114

### Data Chunking

In [6]:
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=1000,    # Split text into chunks of up to ~1000 tokens (length is measured with tiktoken)
    chunk_overlap=200   # Overlap of up to ~200 tokens between consecutive chunks to preserve context
)
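
To illustrate how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-window sketch. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which splits on separators and then merges pieces; the function name `split_with_overlap` and the integer "tokens" are illustrative assumptions.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


tokens = list(range(10))  # stand-in for token IDs
chunks = split_with_overlap(tokens, chunk_size=4, chunk_overlap=2)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last two tokens of the previous one, which is the role the 200-token overlap plays for the 1000-token chunks above.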

In [7]:
document_chunks = pdf_loader.load_and_split(text_splitter)

In [8]:
len(document_chunks)

4725

In [9]:
document_chunks[0].page_content

'ridha.chehime@wmich.edu\n6KIJLVN8X7\nant for personal use by ridha.chehime@w\nshing the contents in part or full is liable'

In [10]:
document_chunks[2].page_content

"Table of Contents\n1\nFront    ................................................................................................................................................................................................................\n1\nCover    .......................................................................................................................................................................................................\n2\nFront Matter    ...........................................................................................................................................................................................\n53\n1 - Nutritional Disorders    ...............................................................................................................................................................\n53\nChapter 1. Nutrition: General Considerations    ...........................................................................................

In [11]:
document_chunks[3].page_content

'491\nChapter 44. Foot & Ankle Disorders    .....................................................................................................................................\n502\nChapter 45. Tumors of Bones & Joints    ...............................................................................................................................\n510\n5 - Ear, Nose, Throat & Dental Disorders    ..................................................................................................................\n510\nChapter 46. Approach to the Patient With Ear Problems    ...........................................................................................\n523\nChapter 47. Hearing Loss    .........................................................................................................................................................\n535\nChapter 48. Inner Ear Disorders    ...................................................................................................
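
The first chunks consist of watermark lines and table-of-contents dot leaders, which carry little retrievable content. A minimal sketch of a cleaning step that could run before chunking follows. The helper name `clean_page_text` and both regular expressions are assumptions based on the page text printed above, and would need adjusting for other files.

```python
import re

# Assumed patterns, derived from the sample pages shown in this notebook
DOT_LEADER = re.compile(r"\.{5,}")
WATERMARK = re.compile(r"personal use by|liable for legal action", re.IGNORECASE)


def clean_page_text(text):
    """Drop watermark lines and remove table-of-contents dot leaders."""
    kept = []
    for line in text.splitlines():
        if WATERMARK.search(line):
            continue  # skip the licensing watermark entirely
        line = DOT_LEADER.sub(" ", line).strip()
        if line:
            kept.append(line)
    return "\n".join(kept)


sample = (
    "This file is meant for personal use by someone only.\n"
    "Chapter 47. Hearing Loss    ........\n"
    "523"
)
print(clean_page_text(sample))
```

Whether filtering these pages improves retrieval for this manual is not evaluated here.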

### Embedding

In [17]:
!pip install -U numpy==1.26.4 scipy==1.11.4 -q --force-reinstall

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires google-auth==2.38.0, but you have google-auth 2.43.0 which is incompatible.
google-colab 1.0.0 requires pandas==2.2.2, but you have pandas 1.5.3 which is incompatible.
google-colab 1.0.0 requires requests==2.32.4, but you have requests 2.32.5 which is incompatible.
plotnine 0.14.5 requires pandas>=2.2.0, but you have pandas 1.5.3 which is incompatible.
peft 0.17.1 requires huggingface_hub>=0.25.0, but you have huggingface-hub 0.23.2 which is incompatible.
geopandas 1.1.1 requires pandas>=2.0.0, but you have pandas 1.5.3 which is incompatible.
cudf-cu12 25.6.0 requires pandas<2.2.4dev0,>=2.0, but you have pandas 1.5.3 which is incompatible.
datasets 4.0.0 requires fsspec[http]<=2025.3.0,>=2023.1.0, but you have fsspec 2025.10.0 which is incompatible.
datasets 4.0.0 requires huggingf

In [12]:
embedding_model = SentenceTransformerEmbeddings(model_name="sentence-transformers/paraphrase-MiniLM-L6-v2")


The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.

AttributeError: 'MessageFactory' object has no attribute 'GetPrototype'

In [13]:
embedding_1 = embedding_model.embed_query(document_chunks[0].page_content)
embedding_2 = embedding_model.embed_query(document_chunks[1].page_content)

In [14]:
print("Dimension of the embedding vector:", len(embedding_1))
print("Embeddings for two chunks are of equal length:", len(embedding_1) == len(embedding_2))

Dimension of the embedding vector: 384
Embeddings for two chunks are of equal length: True
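
Because all embeddings share the same dimension, any two of them can be compared directly. The sketch below computes cosine similarity in plain Python as one common way to do this. It is an illustration only: the function name is hypothetical, the short vectors are made-up stand-ins for the 384-dimensional embeddings, and the distance metric the vector store uses internally may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

With the notebook's variables, `cosine_similarity(embedding_1, embedding_2)` would give a score for the first two chunks.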


In [15]:
embedding_1, embedding_2

([-0.48617106676101685,
  0.7641158699989319,
  0.14471304416656494,
  -0.3361620008945465,
  -0.027019904926419258,
  -0.17651624977588654,
  0.5946176648139954,
  0.12057385593652725,
  0.1420324742794037,
  -0.03638692945241928,
  0.11862877011299133,
  -0.12103472650051117,
  -0.10951196402311325,
  -0.09885941445827484,
  0.02352144941687584,
  -0.5323572754859924,
  0.13839754462242126,
  0.1237298995256424,
  -0.2835805118083954,
  -0.12171392887830734,
  -0.19446906447410583,
  0.029745977371931076,
  -0.09721716493368149,
  -0.21461839973926544,
  -0.2042931616306305,
  -0.07821822911500931,
  -0.2664152979850769,
  0.20931805670261383,
  0.06692367047071457,
  -0.059601929038763046,
  0.3276905417442322,
  0.19109030067920685,
  0.13665489852428436,
  0.22482764720916748,
  -0.07457486540079117,
  -0.14576031267642975,
  -0.22628788650035858,
  -0.4435056746006012,
  0.1211949810385704,
  -0.12533245980739594,
  -0.08928049355745316,
  0.10264500230550766,
  -0.25088262557983

### Vector Database

In [16]:
out_dir = 'medical_db'

# Create the persistence directory if it does not already exist
os.makedirs(out_dir, exist_ok=True)

In [17]:
# Build the vector store: embed every chunk and persist the index to disk
vectorstore = Chroma.from_documents(
    documents=document_chunks,       # chunks produced by the text splitter
    embedding=embedding_model,       # SentenceTransformer embedding model
    persist_directory=out_dir        # directory where the database is stored
)


ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given


In [18]:
# Reload the persisted vector store from disk
vectorstore = Chroma(persist_directory=out_dir, embedding_function=embedding_model)

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given


In [19]:
vectorstore.embeddings

HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
), model_name='sentence-transformers/paraphrase-MiniLM-L6-v2', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False)

In [20]:
query = "What are the treatment options for appendicitis?"
results = vectorstore.similarity_search(query, k=3)

for i, res in enumerate(results, 1):
    print(f"\nResult {i}:\n{res.page_content}\n")

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event CollectionQueryEvent: capture() takes 1 positional argument but 3 were given



Result 1:
• Surgical removal
• IV fluids and antibiotics
Treatment of acute appendicitis is open or laparoscopic appendectomy; because treatment delay
increases mortality, a negative appendectomy rate of 15% is considered acceptable. The surgeon can
usually remove the appendix even if perforated. Occasionally, the appendix is difficult to locate: In these
cases, it usually lies behind the cecum or the ileum and mesentery of the right colon. A contraindication to
appendectomy is inflammatory bowel disease involving the cecum. However, in cases of terminal ileitis
and a normal cecum, the appendix should be removed.
Appendectomy should be preceded by IV antibiotics. Third-generation cephalosporins are preferred. For
nonperforated appendicitis, no further antibiotics are required. If the appendix is perforated, antibiotics
should be continued until the patient's temperature and WBC count have normalized or continued for a
fixed course, according to the surgeon's preference. If surgery is 

### Retriever

In [32]:
retriever = vectorstore.as_retriever(
    search_type='similarity',
    search_kwargs={'k': 3}  # retrieve top 3 most relevant chunks
)

In [33]:
query = "What are the treatment options for appendicitis?"  # example query
rel_docs = retriever.get_relevant_documents(query)
rel_docs

[Document(page_content="• Surgical removal\n• IV fluids and antibiotics\nTreatment of acute appendicitis is open or laparoscopic appendectomy; because treatment delay\nincreases mortality, a negative appendectomy rate of 15% is considered acceptable. The surgeon can\nusually remove the appendix even if perforated. Occasionally, the appendix is difficult to locate: In these\ncases, it usually lies behind the cecum or the ileum and mesentery of the right colon. A contraindication to\nappendectomy is inflammatory bowel disease involving the cecum. However, in cases of terminal ileitis\nand a normal cecum, the appendix should be removed.\nAppendectomy should be preceded by IV antibiotics. Third-generation cephalosporins are preferred. For\nnonperforated appendicitis, no further antibiotics are required. If the appendix is perforated, antibiotics\nshould be continued until the patient's temperature and WBC count have normalized or continued for a\nfixed course, according to the surgeon's pr

In [39]:
# Combine relevant docs into context safely
context = " ".join([doc.page_content for doc in rel_docs])

# Truncate the context so the full prompt fits in the model window
max_prompt_length = 1800  # limit in characters (not tokens); leaves room for the prompt text and output
if len(context) > max_prompt_length:
    context = context[:max_prompt_length]

# Run the model
model_output = llm(
    f"Answer the following medical question using the given context.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {query}\n\nAnswer:",
    max_tokens=256,    # allows detailed responses
    temperature=0.2    # keeps it factual & grounded
)


In [40]:
print(model_output['choices'][0]['text'])

 The primary treatment for appendicitis is surgical removal, specifically open or laparoscopic appendectomy. Antibiotics, such as third-generation cephalosporins, are administered intravenously before surgery to help combat infection. For nonperforated appendicitis, no further antibiotics are necessary after the procedure. However, if the appendix has perforated, antibiotics should be continued until the patient's temperature and white blood cell count have returned to normal or for a fixed course as determined by the surgeon. If surgery is not an option, antibiotics can still improve survival rates. In cases where a large inflammatory mass involves the appendix, terminal ileum, and cecum, resection of the entire mass and ileocolostomy may be preferable. For late cases with pericolic abscesses, the abscess is drained either by ultrasound-guided percutaneous catheter or open operation, followed by appendectomy at a later date. A Meckel's diverticulum should be removed concomitantly with
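
Slicing the joined context at a fixed character count can cut the last chunk mid-sentence. One possible alternative, sketched below under the assumption that dropping whole trailing chunks is acceptable, keeps only as many complete chunks as fit in the budget. The helper name `join_chunks_within_budget` is hypothetical and not part of the notebook.

```python
def join_chunks_within_budget(chunks, max_chars=1800, sep=" "):
    """Join whole chunks in order until adding the next one would exceed max_chars."""
    kept, used = [], 0
    for chunk in chunks:
        extra = len(chunk) + (len(sep) if kept else 0)
        if used + extra > max_chars:
            break
        kept.append(chunk)
        used += extra
    # Fallback: if even the first chunk is too long, truncate it by characters
    if not kept and chunks:
        return chunks[0][:max_chars]
    return sep.join(kept)

# Example with toy strings standing in for doc.page_content values
print(join_chunks_within_budget(["aaaa", "bbbb", "cccc"], max_chars=9))  # -> aaaa bbbb
```

In the cell above, this could be used as `context = join_chunks_within_budget([doc.page_content for doc in rel_docs])`.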

### System and User Prompt Template

In [41]:
# Define the system message — this sets the role and tone of your LLM
qna_system_message = (
    "You are a knowledgeable and reliable medical assistant. "
    "Your goal is to provide accurate, concise, and evidence-based medical information. "
    "Always explain your reasoning clearly, avoid assumptions, and do not provide diagnoses. "
    "If the context does not contain the answer, respond with 'The information is not available in the provided context.'"
)

# Define the user message template — this gives the context and the actual question
qna_user_message_template = (
    "Use the following context from a medical reference manual to answer the question. "
    "If the answer is not in the context, state that explicitly.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n\n"
    "Answer:"
)
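
A minimal sketch of how the two templates can be combined into a single prompt string follows. It assumes the `[INST] ... [/INST]` wrapper that the evaluation code later in this notebook uses; the exact instruction format depends on the model, and `build_prompt` is a hypothetical helper name.

```python
def build_prompt(system_message, user_template, context, question):
    """Fill the user template and wrap system + user text in instruction tags."""
    # str.format only interprets braces in the template, not in the inserted values
    user_message = user_template.format(context=context, question=question)
    return f"[INST] {system_message}\n\n{user_message} [/INST]"

# Toy values standing in for the real system message, context, and question
demo_template = "Context:\n{context}\n\nQuestion: {question}\n\nAnswer:"
demo_prompt = build_prompt("You are a medical assistant.", demo_template,
                           "Appendectomy is the standard treatment.",
                           "How is appendicitis treated?")
print(demo_prompt)
```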


### Response Function

In [48]:
def generate_rag_response(user_input, k=10, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    global qna_system_message, qna_user_message_template

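    # Retrieve relevant document chunks
    # Note: depending on the LangChain version, the `k` passed here may be ignored
    # in favor of the retriever's own search_kwargs (k=3, set when it was created)
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input, k=k)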
    context_list = [d.page_content for d in relevant_document_chunks]

    # Combine all chunks into one context
    context_for_query = ". ".join(context_list)

    # --- Limit context to avoid exceeding the model's context window ---
    # Keep at most 1800 characters (not tokens) of context, leaving room for the output
    max_prompt_length = 1800
    if len(context_for_query) > max_prompt_length:
        context_for_query = context_for_query[:max_prompt_length]

    # Build user + system message
    user_message = qna_user_message_template.replace('{context}', context_for_query)
    user_message = user_message.replace('{question}', user_input)
    prompt = qna_system_message + '\n' + user_message

    # Generate the response
    try:
        response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k
        )

        response = response['choices'][0]['text'].strip()
    except Exception as e:
        response = f"Sorry, I encountered the following error: \n{e}"

    return response
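
Several of the answers in the following sections stop mid-sentence, which is consistent with generation hitting the `max_tokens` limit. The sketch below shows one way to detect this. It assumes the completion dictionary carries an OpenAI-style `finish_reason` field (`"length"` when the token limit was reached), which should be checked against the installed `llama-cpp-python` version. `extract_text` is a hypothetical helper.

```python
def extract_text(response):
    """Return (answer_text, was_truncated) from a completion-style response dict."""
    choice = response["choices"][0]
    text = choice["text"].strip()
    # Assumption: finish_reason == "length" signals the max_tokens limit was hit
    truncated = choice.get("finish_reason") == "length"
    return text, truncated

# Hand-written stand-in for a model response (not real model output)
fake_response = {"choices": [{"text": " Fluids and antibiotics may also", "finish_reason": "length"}]}
answer, was_truncated = extract_text(fake_response)
if was_truncated:
    print("Answer was cut off; consider raising max_tokens.")
```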


## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [49]:
# Query 1
user_input = "What is the protocol for managing sepsis in a critical care unit?"
print(generate_rag_response(user_input, k=20))


Llama.generate: prefix-match hit


The management of sepsis in a critical care unit involves several steps. First, aggressive fluid resuscitation is initiated to maintain adequate tissue perfusion. Antibiotics are administered based on the suspected infectious source and sensitivity data. Surgical excision of infected or necrotic tissues and drainage of pus may be necessary. Supportive care includes mechanical ventilation for respiratory failure, renal replacement therapy for renal failure, and vasopressors for cardiovascular failure. Intensive control of blood glucose levels and administration of corticosteroids and activated protein C may also


### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [50]:
# Query 2
user_input_2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(generate_rag_response(user_input_2, k=20))


Llama.generate: prefix-match hit


The common symptoms of appendicitis include epigastric or periumbilical pain followed by brief nausea, vomiting, anorexia, shifting pain to the right lower quadrant, increased pain with cough and motion, direct and rebound tenderness at McBurney's point, and low-grade fever. However, these symptoms may not be present in all cases, especially in infants, children, elderly patients, and pregnant women. Appendicitis cannot be cured via medicine alone as it can lead to necrosis, gangrene, perforation, and the formation of


### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [51]:
# Query 3
user_input_3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(generate_rag_response(user_input_3, k=20))


Llama.generate: prefix-match hit


The context does not provide information on effective treatments or solutions for addressing sudden patchy hair loss. However, based on the context, possible causes include telogen effluvium, which is suggested by a positive pull test result (extracting more than 3 hairs per pull or more than 10 hairs in total). Telogen effluvium can be caused by various factors such as stress, illness, medications, or hormonal imbalances. The context also mentions the importance of evaluating causative disorders based on clinical suspicion and testing for specific conditions like endocrinologic or autoimmune disorders.


### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [52]:
# Query 4
user_input_4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(generate_rag_response(user_input_4, k=20))


Llama.generate: prefix-match hit


The initial treatment for a person with traumatic brain injury (TBI) includes ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. Surgery may be needed in cases of more severe injury to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas. In the first few days after the injury, maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium are important. Subsequently, many patients require rehabilitation. The context does not provide information on


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [53]:
# Query 5
user_input_5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(user_input_5, k=20))


Llama.generate: prefix-match hit


For a person who has fractured their leg during a hiking trip, the following precautions and treatment steps are necessary:
1. Immobilize the affected leg using a splint or cast to prevent further damage and promote healing.
2. Apply ice packs intermittently to reduce swelling and pain.
3. Ensure adequate pain management through medication as prescribed by a healthcare professional.
4. Elevate the injured leg above heart level whenever possible to minimize swelling and improve circulation.
5. Transport the person to the nearest medical facility for proper diagnosis and treatment.
6. During transportation,


### Fine-tuning Generation Parameters

In [54]:
# Query 1
user_input = "What is the protocol for managing sepsis in a critical care unit?"
generate_rag_response(user_input, temperature=0.5)


Llama.generate: prefix-match hit


"The management of sepsis in a critical care unit involves several interventions aimed at addressing the underlying infection, restoring tissue perfusion, and providing supportive care. According to the context provided, these interventions include aggressive fluid resuscitation, administration of antibiotics, surgical excision or drainage of infected or necrotic tissues, supportive care, intensive control of blood glucose, and in some cases, corticosteroids and activated protein C. The goal is to address the critical reduction in tissue perfusion that occurs in severe sepsis and septic shock. However, it's important"

In [55]:
# Query 2
user_input_2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
generate_rag_response(user_input_2, temperature=0.5)


Llama.generate: prefix-match hit


"The common symptoms of appendicitis include epigastric or periumbilical pain followed by brief nausea, vomiting, anorexia, shifting pain to the right lower quadrant, increased pain with cough and motion, direct and rebound tenderness at McBurney's point, possible Rovsing sign (pain in left lower quadrant), psoas sign (increased pain from passive extension of right hip joint), or obturator sign (pain caused by passive internal rotation of flexed thigh). A low-grade fever is also common. However, these classic findings appear"

In [56]:
# Query 3
user_input_3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
generate_rag_response(user_input_3, temperature=0.5)


Llama.generate: prefix-match hit


'The context provided does not specifically address treatments or solutions for sudden patchy hair loss. However, based on the information given, it mentions that signs of hair loss in patterns other than male-pattern or female-pattern are nondiagnostic and may require microscopic hair examination or scalp biopsy for definitive diagnosis. Therefore, if someone is experiencing sudden patchy hair loss, they should consult a healthcare professional for further evaluation to determine the underlying cause.\n\nPossible causes of sudden patchy hair loss include but are not limited to: autoimmune disorders such as alopecia areata or lic'

In [57]:
# Query 4
user_input_4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
generate_rag_response(user_input_4, temperature=0.5)


Llama.generate: prefix-match hit


'The initial treatment for a person with a traumatic brain injury (TBI) includes ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. Surgery may be needed in cases of more severe injury to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas. In the first few days after the injury, maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium are important. Subsequently, many patients require rehabilitation. In the US and much of'

In [58]:
# Query 5
user_input_5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
generate_rag_response(user_input_5, temperature=0.5)


Llama.generate: prefix-match hit


'A fractured leg requires immobilization to prevent further damage and promote healing. The person should be transported to a healthcare facility as soon as possible. During transportation, the leg should be kept elevated and supported with a splint or cast if available. Once at the hospital, the fracture will be assessed, and appropriate treatment will be determined based on the severity of the injury.\n\nIn terms of care and recovery, the following precautions and steps should be considered:\n1. Immobilization: The leg must remain immobilized to allow the bone to heal properly. Hospital orders should emphasize the'

## Output Evaluation

Let us now use the LLM-as-a-judge method to check the quality of the RAG system on two parameters: retrieval and generation. We illustrate this evaluation using the answers generated for the questions from the previous section.

- We use the same Mistral model for evaluation, so the LLM is effectively rating its own performance on the task.

In [59]:
# --- Define the system messages for evaluation ---

# Groundedness rater: checks if the answer is supported by retrieved context
groundedness_rater_system_message = """
You are a medical expert evaluating the groundedness of an AI-generated answer.
Given a question, the context retrieved from a medical document, and the model's answer,
rate how well the answer is supported by the provided context.
Return one of the following scores:
- 1: Poor grounding (answer not supported or contradicts context)
- 2: Partial grounding (somewhat supported, some missing or inconsistent info)
- 3: Fully grounded (fully supported by context, no contradictions)
Then provide a one-sentence justification for your score.
"""



In [60]:
# Relevance rater: checks if the retrieved context and answer align with the query
relevance_rater_system_message = """
You are evaluating the relevance of a retrieved medical context and its generated answer
with respect to the user’s question. Assess whether the information retrieved and the
answer provided are relevant to the question.
Return one of the following scores:
- 1: Irrelevant (not related to the question)
- 2: Partially relevant (somewhat related, but incomplete or off-topic)
- 3: Highly relevant (directly and fully answers the question)
Then provide a one-sentence justification for your score.
"""



In [61]:
# --- Template for question/context/answer ---
user_message_template = """
###Question
{question}

###Context
{context}

###Answer
{answer}
"""


In [64]:
def generate_ground_relevance_response(user_input, k=3, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    global qna_system_message, qna_user_message_template

    # Retrieve relevant document chunks
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input, k=k)
    context_list = [d.page_content for d in relevant_document_chunks]

    # Combine and truncate context safely
    context_for_query = ". ".join(context_list)
    max_prompt_length = 1800   # limit in characters (not tokens); leaves room for model output
    if len(context_for_query) > max_prompt_length:
        context_for_query = context_for_query[:max_prompt_length]

    # Combine user prompt and system message
    prompt = f"""[INST]{qna_system_message}\n
                {'user'}: {qna_user_message_template.format(context=context_for_query, question=user_input)}
                [/INST]"""

    # Generate answer
    response = llm(
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        stop=['INST'],
    )
    answer = response["choices"][0]["text"]

    # Build groundedness evaluation prompt
    groundedness_prompt = f"""[INST]{groundedness_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    # Build relevance evaluation prompt
    relevance_prompt = f"""[INST]{relevance_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    # Generate both evaluation scores
    response_1 = llm(
        prompt=groundedness_prompt,
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        stop=['INST'],
    )

    response_2 = llm(
        prompt=relevance_prompt,
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        stop=['INST'],
    )

    return response_1['choices'][0]['text'], response_2['choices'][0]['text']
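
The rater outputs are free text, so comparing queries requires pulling out the numeric score. The sketch below assumes the rater writes `Score: N` somewhere in its reply, as the rating prompts request; `parse_score` is a hypothetical helper and returns `None` when no score is found.

```python
import re

def parse_score(rater_output):
    """Extract the 1-3 score from a rater reply, or None if it is missing."""
    match = re.search(r"Score:\s*([1-3])", rater_output)
    return int(match.group(1)) if match else None

# Sample reply in the format the rating prompts request
sample = " Score: 3 (Fully grounded)\nJustification: The answer matches the context."
print(parse_score(sample))  # -> 3
```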


### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [65]:
# Query 1
ground, rel = generate_ground_relevance_response(
    user_input="What is the protocol for managing sepsis in a critical care unit?",
    max_tokens=370
)
print(ground, end="\n\n")
print(rel)


Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Score: 3 (Fully grounded)
Justification: The context provides detailed information about the causes, symptoms, and treatment options for sepsis, severe sepsis, and septic shock. The answer accurately identifies and summarizes the mentioned interventions for managing these conditions in a critical care unit.

 Score: 2 (Partially relevant)

Justification: The context provides an overview of sepsis, its causes, symptoms, and treatment options. However, it does not detail an exact protocol for managing sepsis in a critical care unit as the user's question specifically asks for. The answer acknowledges this limitation but summarizes the mentioned interventions that could be part of such a protocol.


### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [66]:
# Query 2
ground, rel = generate_ground_relevance_response(
    user_input="What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?",
    max_tokens=370
)
print(ground, end="\n\n")
print(rel)


Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Score: 3 (Fully grounded)

Justification: The model's answer accurately summarizes the symptoms and signs mentioned in the context, emphasizing that they are not always present in a consistent manner. Additionally, the model correctly states that appendicitis cannot be cured via medicine alone and requires surgical intervention, which aligns with the information provided in the context about the potential complications if left untreated.

 Score: 3
Justification: The retrieved context and generated answer fully address the user's question by providing a comprehensive list of common symptoms for appendicitis and confirming that it cannot be cured via medicine alone. Additionally, the answer explains the standard surgical treatment for appendicitis, which is an appendectomy.


### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [67]:
# Query 3
ground, rel = generate_ground_relevance_response(
    user_input="What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?",
    max_tokens=370
)
print(ground, end="\n\n")
print(rel)


Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Score: 2 (Partial grounding)

Justification: The answer acknowledges that there is no explicit mention of effective treatments or solutions for sudden patchy hair loss in the context, but it does provide some information about possible causes and diagnostic tests that may be necessary to determine the underlying cause. However, the answer could be more specific about the potential treatments for each identified cause.

 Score: 2 (Partially relevant)

Justification: The context provides some information about possible causes of sudden patchy hair loss, but it does not explicitly mention effective treatments or solutions for this condition. Therefore, while the context is related to the question, it does not fully answer it.


### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [68]:
# Query 4
ground, rel = generate_ground_relevance_response(
    user_input="What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?",
    max_tokens=370
)
print(ground, end="\n\n")
print(rel)


Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Score: 3 (Fully grounded)

Justification: The answer accurately summarizes the initial treatment recommendations for a person with traumatic brain injury as stated in the context, including ensuring a reliable airway, maintaining adequate ventilation, oxygenation, and blood pressure, and the potential need for surgery to manage intracranial pressure or hematomas. The answer also correctly mentions the importance of preventing complications and rehabilitation in the days following the injury.

 Score: 2 (Partially relevant)

Justification: The answer provides some relevant information about the initial treatment for traumatic brain injury, including ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. However, it does not directly address the user's question about specific treatments recommended for a person with brain tissue injury, and it does not mention any medications or other therapies beyond general recommendations.


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [69]:
# Query 5
ground, rel = generate_ground_relevance_response(
    user_input="What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?",
    max_tokens=370
)
print(ground, end="\n\n")
print(rel)


Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Score: 2 (Partial grounding)

The context does provide some information related to the question, such as the need for activity, prevention of falls, and rehabilitation. However, it does not directly address the specifics of fracturing a leg during a hiking trip or the necessary precautions and treatment steps for that particular injury. The answer provided goes beyond the context by including general information about fractures and their care and recovery.

 Score: 2 (Partially relevant)

The context provides valuable information on preventing falls, incontinence, and other potential issues for elderly patients in a hospital setting. However, it does not directly address the specific question about necessary precautions and treatment steps for someone who has fractured their leg during a hiking trip. The answer provided includes general medical knowledge related to the question but is not directly taken from the context.
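
To summarize the five evaluations above, the scores can be collected and averaged. The values below are copied by hand from the rater outputs printed in this section; the dictionary layout and the name `average_scores` are illustrative choices.

```python
from statistics import mean

# Scores transcribed from the rater outputs for Queries 1-5
scores = {
    "groundedness": [3, 3, 2, 3, 2],
    "relevance": [2, 3, 2, 2, 2],
}

def average_scores(score_table):
    """Return the mean score per criterion."""
    return {name: mean(values) for name, values in score_table.items()}

summary = average_scores(scores)
print(summary)  # -> {'groundedness': 2.6, 'relevance': 2.2}
```

With only five queries and the model rating its own answers, these averages are a rough indicator rather than a benchmark.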


## Actionable Insights and Business Recommendations

**1. Broaden the dataset**
  * Some responses were only partially grounded because the system didn’t have enough coverage in certain topics. Expanding the document set with more domain-specific and detailed materials would strengthen future responses and improve factual grounding.

**2. Use stronger embeddings**
  * While the current embedding model performs well, using a more advanced or domain-adapted model, such as all-mpnet-base-v2 or BioClinicalBERT, could help capture subtle context and improve retrieval precision.

**3. Combine retrieval methods**
  * The evaluation revealed that certain answers drifted slightly off topic. A hybrid approach that combines semantic search with keyword-based retrieval (like BM25) could make responses more focused and contextually accurate.

**4. Optimize chunk sizes**
  * Some errors occurred because the retrieved context or individual chunks were too long for the model's context window. Adjusting the chunk size dynamically depending on the content type can reduce overflow issues and improve retrieval quality.

**5. Maintain automated evaluation**
  * The LLM-as-a-judge evaluation method proved effective for assessing both groundedness and relevance. Keeping this evaluation mechanism active will provide continuous feedback on performance and guide future improvements.
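
For recommendation 3, one common way to merge a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is a generic illustration, not part of this notebook's pipeline. The chunk IDs are made up, and the constant `k=60` is a commonly used default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical rankings from a vector search and a keyword (e.g. BM25) search
semantic_ranking = ["c1", "c2", "c3"]
keyword_ranking = ["c3", "c1", "c4"]
print(reciprocal_rank_fusion([semantic_ranking, keyword_ranking]))
# -> ['c1', 'c3', 'c2', 'c4']
```

Chunks that rank well in both lists rise to the top, which is the behavior a hybrid retriever is meant to provide.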
  
