## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to create accurate diagnoses and treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

**Common Questions to Answer**

**1. Diagnostic Assistance**: "What are the common symptoms and treatments for pulmonary embolism?"

**2. Drug Information**: "Can you provide the trade names of medications used for treating hypertension?"

**3. Treatment Plans**: "What are the first-line options and alternatives for managing rheumatoid arthritis?"

**4. Specialty Knowledge**: "What are the diagnostic steps for suspected endocrine disorders?"

**5. Critical Care Protocols**: "What is the protocol for managing sepsis in a critical care unit?"

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co. They cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [1]:
# Installation for GPU llama-cpp-python
# Run the following line when the runtime is connected to a GPU
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q
# Alternative GPU install with a newer release (uncomment to use instead of the line above)
#!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.28 --force-reinstall --upgrade --no-cache-dir -q
# Installation for CPU llama-cpp-python
# Uncomment and run the following line instead when no GPU is available
#!CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

In [1]:
# For installing the libraries & downloading models from HF Hub
!pip install huggingface_hub==0.23.2 pandas==1.5.3 tiktoken==0.6.0 pymupdf==1.25.1 langchain==0.1.1 langchain-community==0.0.13 chromadb==0.4.22 sentence-transformers==2.3.1 numpy==1.25.2 -q

# Alternative: reinstall the latest versions of everything (uncomment to use instead of the pinned install)
#!pip install --upgrade huggingface_hub pandas tiktoken pymupdf langchain langchain-community chromadb sentence-transformers numpy torch --force-reinstall --no-cache-dir -q

In [1]:
# Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

#Libraries for downloading and loading the llm
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Question Answering using LLM

#### Downloading and Loading the model

In [2]:
#Download the model from Hugging Face
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [3]:
# Initialize the Llama model
# This configuration assumes a GPU runtime; set n_gpu_layers=0 when running on CPU.
llm = Llama(
    model_path=model_path,
    n_ctx=15000,            # context window size: prompt tokens plus generated tokens must fit
    n_gpu_layers=38,        # layers to offload for GPU processing
    n_batch=512             # prompt tokens processed per batch (larger can improve throughput)
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Response

In [4]:
#Define a function to take an input prompt and call LLM
def response(query,max_tokens=128,temperature=0,top_p=0.95,top_k=50):
    model_output = llm(
      prompt=query,
      max_tokens=max_tokens,
      temperature=temperature,
      top_p=top_p,
      top_k=top_k
    )

    return model_output['choices'][0]['text']

In [5]:
#Test the response function by prompting and outputting response
response("What are the common symptoms for pulmonary embolism?")


'\n\nPulmonary embolism is a serious condition that occurs when a blood clot or other foreign substance travels to the lungs and blocks one or more of the arteries. This can cause several symptoms, including:\n\n1. Shortness of breath: One of the most common symptoms of pulmonary embolism is shortness of breath, which may be sudden and severe or gradual in onset.\n2. Chest pain: Pulmonary embolism can also cause chest pain, which may be sharp, stabbing, or dull. The pain may worsen with deep breathing or'

##### Observation

The LLM gave a relevant response about pulmonary embolism, although the answer is cut off by the default `max_tokens=128`.

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [8]:
#Prompt LLM on query 1
user_input_1 = "What is the protocol for managing sepsis in a critical care unit?"
raw_llm_response_1 = response(user_input_1)
raw_llm_response_1

Llama.generate: prefix-match hit


'\n\nSepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:\n\n1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.\n2. Resusc'

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [9]:
#Prompt LLM on query 2
user_input_2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
raw_llm_response_2 = response(user_input_2)
raw_llm_response_2

Llama.generate: prefix-match hit


'\n\nAppendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:\n\n1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing'

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [10]:
#Prompt LLM on query 3
user_input_3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
raw_llm_response_3 = response(user_input_3)
raw_llm_response_3

Llama.generate: prefix-match hit


"\n\nSudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.\n\nThe exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications."

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [11]:
#Prompt LLM on query 4
user_input_4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
raw_llm_response_4 = response(user_input_4)
raw_llm_response_4

Llama.generate: prefix-match hit


"\n\nA person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:\n\n1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.\n2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as"

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [12]:
#Prompt LLM on query 5
user_input_5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
raw_llm_response_5 = response(user_input_5)
raw_llm_response_5

Llama.generate: prefix-match hit


"\n\nFirst and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:\n\n1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.\n2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.\n3. Immobilize the leg: Use a splint, sl"

##### Observations

Overall, the LLM delivered responses that were on-target and contextually appropriate. Most responses are cut off mid-sentence, but this stems from the `max_tokens=128` default in `response` rather than any issue with the model's reasoning or content generation.
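
The truncation noted above can also be detected programmatically instead of by eye. The sketch below assumes the completion dictionary returned by `llm(...)` carries an OpenAI-style `finish_reason` field on each choice, with `"length"` meaning the `max_tokens` limit was reached. The helper name `completion_text_and_status` and the hand-written `sample_output` dictionary are illustrative and not part of the notebook.

```python
def completion_text_and_status(model_output):
    """Return (text, was_truncated) from a llama-cpp-python style completion dict."""
    choice = model_output["choices"][0]
    # Assumption: "length" marks a stop caused by max_tokens,
    # while "stop" marks an end-of-sequence token or stop string.
    was_truncated = choice.get("finish_reason") == "length"
    return choice["text"], was_truncated


# Hand-written stand-in for the dict returned by llm(...), for illustration only.
sample_output = {
    "choices": [
        {"text": "\n\n1. Early recognition ...\n2. Resusc", "finish_reason": "length"}
    ]
}

text, was_truncated = completion_text_and_status(sample_output)
if was_truncated:
    print("Response hit the max_tokens limit; consider raising max_tokens.")
```

If the flag is set, calling `response(query, max_tokens=...)` with a larger value is the simplest remedy, at the cost of longer generation time.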


## Question Answering using LLM with Prompt Engineering

In [13]:
# Define system prompt
system_prompt = """You are a knowledgeable and experienced medical professional.
Use the retrieved context provided to respond accurately and concisely to the user's medical question.
If the information is insufficient or unclear, respond with 'I don't know' rather than speculating."""


### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [14]:
#Prompt the LLM on query 1, with system prompt
user_input = system_prompt+"\n"+ user_input_1
sysprompt_llm_1 = response(user_input)

separator = "=" * 50
print(f"{separator}\nRaw LLM response:\n{raw_llm_response_1}\n\n{separator}\nLLM with system prompt:\n{sysprompt_llm_1}")


Llama.generate: prefix-match hit


Raw LLM response:


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Resusc

LLM with system prompt:

Based on current guidelines from reputable sources such as the Surviving Sepsis Campaign and the National Institute for Health and Care Excellence (NICE), the following are key steps in managing sepsis in a critical care unit:
1. Early recognition and assessment using the Sequential Organ Failure Assessment (SOFA) score or Quick Sequential Organ Failure Assessment (qSOFA) score.
2. Immediate administration of br

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [15]:
#Prompt the LLM on query 2, with system prompt
user_input = system_prompt+"\n"+ user_input_2
sysprompt_llm_2 = response(user_input)

separator = "=" * 50
print(f"{separator}\nRaw LLM response:\n{raw_llm_response_2}\n\n{separator}\nLLM with system prompt:\n{sysprompt_llm_2}")


Llama.generate: prefix-match hit


Raw LLM response:


Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:

1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing

LLM with system prompt:


Appendicitis is a medical condition characterized by inflammation of the appendix, a small pouch that extends from the cecum in the large intestine. The common symptoms for appendicitis include:
1. Sudden pain in the lower right abdomen, which may be mild at first but becomes sharp and severe over time.
2. Loss of appetite and feeling sick to your stomach (nausea).
3. Vomiting.
4. Fever, which may s

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [16]:
#Prompt the LLM on query 3, with system prompt
user_input = system_prompt+"\n"+ user_input_3
sysprompt_llm_3 = response(user_input)

separator = "=" * 50
print(f"{separator}\nRaw LLM response:\n{raw_llm_response_3}\n\n{separator}\nLLM with system prompt:\n{sysprompt_llm_3}")


Llama.generate: prefix-match hit


Raw LLM response:


Sudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.

The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications.

LLM with system prompt:

Based on the context provided, sudden patchy hair loss, also known as alopecia areata, is an autoimmune disorder that results in the sudden loss of hair in small patches on the scalp or other areas of the body. The exact cause of alopecia areata is unknown, but it's believed to be related to a problem with the immune system.
Effective treatments for addressing sudden patchy hair loss include:
1. Corticosteroids: These medications can help reduc

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [17]:
#Prompt the LLM on query 4, with system prompt
user_input = system_prompt+"\n"+ user_input_4
sysprompt_llm_4 = response(user_input)

separator = "=" * 50
print(f"{separator}\nRaw LLM response:\n{raw_llm_response_4}\n\n{separator}\nLLM with system prompt:\n{sysprompt_llm_4}")


Llama.generate: prefix-match hit


Raw LLM response:


A person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:

1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.
2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as

LLM with system prompt:

Based on the context provided, I cannot definitively answer this question without additional information such as the severity and location of the brain injury, the specific symptoms experienced by the individual, and any underlying health conditions they may have. However, I can provide some general information about common treatments for brain injuries.

For mild to mo

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [18]:
#Prompt the LLM on query 5, with system prompt
user_input = system_prompt+"\n"+ user_input_5
sysprompt_llm_5 = response(user_input)

separator = "=" * 50
print(f"{separator}\nRaw LLM response:\n{raw_llm_response_5}\n\n{separator}\nLLM with system prompt:\n{sysprompt_llm_5}")


Llama.generate: prefix-match hit


Raw LLM response:


First and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:

1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.
2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.
3. Immobilize the leg: Use a splint, sl

LLM with system prompt:

Based on the context provided, I cannot directly determine the specific type or severity of the leg fracture. However, I can provide general guidelines for the necessary precautions and treatment steps for someone who has sustained a leg fracture during a hiking trip.
1. Immediate Care: The first priority is to ensure the safety and stability of the injured person. If possible, try to immobilize the affected leg us

##### Observations
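- Adding a system prompt and a medical-professional persona makes the LLM's answers noticeably more focused and domain-specific instead of generic.
- The cutoff in each answer is purely the `max_tokens` ceiling, not a sign that the model drifted off-topic or lost accuracy.
- Several answers open with "Based on the context provided" even though no context is supplied at this stage; the system prompt refers to retrieved context, which only becomes available once retrieval is added.
- Overall, the prompt and persona steer the model toward appropriately framed clinical guidance that speaks directly to each question.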
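
The cells above join the system prompt and the question with a plain newline. Mistral instruct models are usually prompted with an `[INST] ... [/INST]` wrapper, so a small helper like the one below could be tried as a variation. This is a sketch under assumptions: the helper name `build_instruct_prompt` is hypothetical, the beginning-of-sequence token is assumed to be added by the tokenizer, and whether the wrapper improves these particular answers has not been tested here.

```python
def build_instruct_prompt(system_prompt, question):
    """Wrap a system prompt and a question in Mistral-style [INST] tags."""
    # There is no separate system role in this template, so the system text
    # is simply placed ahead of the question inside the same instruction.
    instruction = f"{system_prompt.strip()}\n\n{question.strip()}"
    return f"[INST] {instruction} [/INST]"


example_prompt = build_instruct_prompt(
    "You are a knowledgeable and experienced medical professional.",
    "What is the protocol for managing sepsis in a critical care unit?",
)
print(example_prompt)
```

The resulting string can be passed to `response(...)` in place of the newline-joined prompt.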

## Data Preparation for RAG

In [19]:
# Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

### Loading the Data

In [21]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [22]:
manual_pdf_path = "/content/drive/MyDrive/Chinmay MS/medical_diagnosis_manual.pdf"

pdf_loader = PyMuPDFLoader(manual_pdf_path)
manual = pdf_loader.load()

### Data Overview

#### Checking the first 5 pages

In [23]:
for i in range(5):
    print(f"Page Number : {i+1}")
    print(manual[i].page_content)

Page Number : 1
chinmay.rozekar@gmail.com
IT4RPVSUBM
nt for personal use by chinmay.rozekar@
shing the contents in part or full is liable 

Page Number : 2
chinmay.rozekar@gmail.com
IT4RPVSUBM
This file is meant for personal use by chinmay.rozekar@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.

Page Number : 3
Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    .............................................................................................................................................................................

#### Checking the number of pages

In [24]:
# Check the # of pages in the manual
len(manual)

4114
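
The page previews show that every page carries watermark lines (an e-mail address, an ID string, and a usage notice), and these end up inside the chunks that are embedded later. One optional cleaning step, not performed in this notebook, is to drop those lines before chunking. The sketch below is an assumption-laden illustration: the function name `strip_watermark`, the regular expressions, and the placeholder sample text are all hypothetical and would need checking against the real pages.

```python
import re

# Patterns guessed from the watermark lines visible in the page previews.
WATERMARK_PATTERNS = [
    re.compile(r"^\S+@\S*$"),                                  # line that is only an e-mail address (possibly cut off)
    re.compile(r"personal use by", re.IGNORECASE),             # usage notice, first line
    re.compile(r"in part or full is liable", re.IGNORECASE),   # usage notice, second line
]


def strip_watermark(page_text, extra_tokens=()):
    """Remove watermark lines from one page of text, keeping everything else."""
    kept = []
    for line in page_text.splitlines():
        stripped = line.strip()
        if stripped in extra_tokens:
            continue  # exact-match tokens such as a per-user ID string
        if any(pattern.search(stripped) for pattern in WATERMARK_PATTERNS):
            continue
        kept.append(line)
    return "\n".join(kept)


# Placeholder page text shaped like the previews above.
sample_page = (
    "user@example.com\n"
    "ABC123\n"
    "This file is meant for personal use by user@example.com only.\n"
    "Sharing or publishing the contents in part or full is liable for legal action.\n"
    "Table of Contents"
)
cleaned = strip_watermark(sample_page, extra_tokens={"ABC123"})
print(cleaned)
```

Applied to each `page_content` before splitting, this would keep the watermark out of the embeddings; note that it would also drop any legitimate line consisting solely of an e-mail address.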

### Data Chunking

In [25]:
#Break PDF down into smaller chunks, with sizes measured in cl100k_base tokens
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=400,               # maximum tokens per chunk
    chunk_overlap=80              # tokens shared between consecutive chunks
)

document_chunks = pdf_loader.load_and_split(text_splitter)

In [26]:
# No. of chunks created
len(document_chunks)

11410

In [27]:
# Preview 1st chunk
document_chunks[0].page_content

'chinmay.rozekar@gmail.com\nIT4RPVSUBM\nnt for personal use by chinmay.rozekar@\nshing the contents in part or full is liable'

In [28]:
# Preview 3rd chunk (index 2)
document_chunks[2].page_content

'Table of Contents\n1\nFront    ................................................................................................................................................................................................................\n1\nCover    .......................................................................................................................................................................................................\n2\nFront Matter    ...........................................................................................................................................................................................\n53\n1 - Nutritional Disorders    ...............................................................................................................................................................\n53\nChapter 1. Nutrition: General Considerations    ...........................................................................................

In [29]:
# Preview 4th chunk (index 3)
document_chunks[3].page_content

'206\nChapter 16. Gastroenteritis    ......................................................................................................................................................\n213\nChapter 17. Malabsorption Syndromes    ..............................................................................................................................\n225\nChapter 18. Irritable Bowel Syndrome    ................................................................................................................................\n229\nChapter 19. Inflammatory Bowel Disease    .........................................................................................................................\n241\nChapter 20. Diverticular Disease    ...........................................................................................................................................\n246\nChapter 21. Anorectal Disorders    ........................................................................

In [30]:
# Preview second-to-last chunk
document_chunks[-2].page_content

"Z\nZafirlukast 1879\nZalcitabine 1451\nin children 2854\nZaleplon 1709\nZanamivir 1407\nin influenza 1407, 1929\nZAP-70 (zeta-associated protein 70) deficiency 1092, 1108\nZavanelli maneuver 2680\nZellweger syndrome 2383, 3023\nZenker's diverticulum 125\nZidovudine 1451, 1453\nin children 2854\nZileuton 1881\nin asthma 1880\nZinc 49, 55, 3431-3432\nin common cold 1405\ndeficiency of 11, 49, 55\nin dermatophytoses 705\npoisoning with 3328, 3353\nrecommended dietary allowances for 50\nreference values for 3499\ntoxicity of 49, 55\ncopper deficiency and 49\nin Wilson's disease 52\nZinc oxide 2233\ngelatin formulation of 646, 672\nZinc pyrithione 647\nZinc shakes 55\nZipper injury 3239, 3240\nZiprasidone\nin agitation 1492\nin bipolar disorder 3059\npoisoning with 3347\nin schizophrenia 1566\nZoledronate 359, 361, 848\nZollinger-Ellison syndrome 95, 199, 200-201, 910\nmastocytosis vs 1125\nMenetrier's disease vs 132\npeptic ulcer disease vs 134\nZolmitriptan 1721\nZolpidem 1709, 3103\nZon

In [31]:
# Preview last chunk
document_chunks[-1].page_content

"Zollinger-Ellison syndrome 95, 199, 200-201, 910\nmastocytosis vs 1125\nMenetrier's disease vs 132\npeptic ulcer disease vs 134\nZolmitriptan 1721\nZolpidem 1709, 3103\nZonisamide 1701\nZoonotic diseases, cutaneous 718\nZoophobia 1498\nZoster (see Herpes zoster virus infection)\nZygomycosis 1332\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nZ\n4104\nchinmay.rozekar@gmail.com\nIT4RPVSUBM\nThis file is meant for personal use by chinmay.rozekar@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action."

##### Observations:
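Consecutive chunks overlap, as expected with `chunk_overlap=80`: the last chunk begins with index entries (starting at "Zollinger-Ellison syndrome") that also appear at the end of the second-to-last chunk.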
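
The overlap between consecutive chunks can be measured rather than eyeballed. The sketch below finds the longest suffix of one chunk that is also a prefix of the next. The function name `shared_overlap` is hypothetical, and the two sample strings are short excerpts in the style of the index entries above, not full chunks. On real chunks the splitter may trim whitespace at the boundaries, so an exact suffix and prefix match is an assumption that might not always hold.

```python
def shared_overlap(previous_chunk, next_chunk):
    """Return the longest suffix of previous_chunk that is also a prefix of next_chunk."""
    max_len = min(len(previous_chunk), len(next_chunk))
    # Try the longest candidate first so the first hit is the maximal overlap.
    for size in range(max_len, 0, -1):
        if previous_chunk.endswith(next_chunk[:size]):
            return next_chunk[:size]
    return ""


# Short illustrative excerpts, not the full chunks from the notebook.
previous_tail = "Zolmitriptan 1721\nZolpidem 1709, 3103"
next_head = "Zolpidem 1709, 3103\nZonisamide 1701"

overlap = shared_overlap(previous_tail, next_head)
print(repr(overlap))
```

With the notebook's data, the same function could be called as `shared_overlap(document_chunks[-2].page_content, document_chunks[-1].page_content)`.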

### Embedding

In [32]:
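#Initialize the embedding module (all-MiniLM-L6-v2 truncates inputs beyond 256 word-piece tokens, so long chunks are only partially embedded)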
embedding_model = SentenceTransformerEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")


In [33]:
#Try embedding model on a few chunks
embedding_1 = embedding_model.embed_query(document_chunks[0].page_content)
embedding_2 = embedding_model.embed_query(document_chunks[1].page_content)

In [34]:
#Check the dimension of a chunk, confirm that embedded chunk length is consistent
print("Dimension of the embedding vector ",len(embedding_1))
len(embedding_1)==len(embedding_2)

Dimension of the embedding vector  384


True
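
Matching dimensions matter because embeddings are compared coordinate by coordinate. A common comparison is cosine similarity, sketched below in plain Python on toy 3-dimensional vectors. The function name `cosine_similarity` and the toy vectors are illustrative; the real embeddings have 384 dimensions.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two equal-length vectors."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings.
toy_a = [1.0, 0.0, 1.0]
toy_b = [1.0, 1.0, 0.0]
similarity = cosine_similarity(toy_a, toy_b)
print(similarity)
```

The same function accepts the lists returned by `embed_query`, for example `cosine_similarity(embedding_1, embedding_2)`.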

In [35]:
#Output and confirm embedding process is working
embedding_1,embedding_2

([-0.08878480643033981,
  0.10947953909635544,
  0.0058256653137505054,
  -0.009392620995640755,
  0.08232270181179047,
  -0.016214923933148384,
  0.06115225329995155,
  0.08534809947013855,
  -0.008117127232253551,
  -0.0264132022857666,
  0.07644364982843399,
  -0.0677272379398346,
  0.023500271141529083,
  -0.0563480406999588,
  -0.05763360857963562,
  -0.0416007824242115,
  -0.04238284006714821,
  0.01218160055577755,
  -0.05716867372393608,
  -0.009484992362558842,
  -0.056975167244672775,
  -0.025044124573469162,
  0.0026125202421098948,
  -0.03280925378203392,
  0.011630745604634285,
  -0.0025313650257885456,
  -0.04570190981030464,
  0.019783711060881615,
  0.014762265607714653,
  -0.08363518863916397,
  0.037159234285354614,
  0.021040787920355797,
  0.06211840733885765,
  0.023200012743473053,
  0.04378737509250641,
  0.047307614237070084,
  -0.03182739019393921,
  -0.0410759374499321,
  -0.006279019173234701,
  -0.011165548115968704,
  -0.023045819252729416,
  0.019026871770

### Vector Database

In [36]:
#Create a directory to store the vector database (no error if it already exists)
out_dir = 'medical_db'

os.makedirs(out_dir, exist_ok=True)

In [37]:
#Create a Chroma vector store for document chunks
vectorstore = Chroma.from_documents(
    document_chunks,                  # PDF chunks
    embedding_model,                  # sentence transformer embedding model
    persist_directory=out_dir         # directory to store vector database
)



In [38]:
#Re-initialize the vectorstore, reload from vector DB
vectorstore = Chroma(persist_directory=out_dir,embedding_function=embedding_model)



In [39]:
#Confirm the vector store's embedding model
vectorstore.embeddings

HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
  (2): Normalize()
), model_name='sentence-transformers/all-MiniLM-L6-v2', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False)

In [40]:
#Perform a test similarity search on the vector store and view the matched chunks (3)
vectorstore.similarity_search("What are the common symptoms for pulmonary embolism?", k=3)



[Document(page_content='Chapter 194. Pulmonary Embolism\nIntroduction\nPulmonary embolism (PE) is the occlusion of ≥ 1 pulmonary arteries by thrombi that originate\nelsewhere, typically in the large veins of the lower extremities or pelvis. Risk factors are\nconditions that impair venous return, conditions that cause endothelial injury or dysfunction,\nand underlying hypercoagulable states. Symptoms are nonspecific and include dyspnea,\npleuritic chest pain, cough, and, in severe cases, syncope or cardiorespiratory arrest. Signs are\nalso nonspecific and may include tachypnea, tachycardia, hypotension, and a loud pulmonic\ncomponent of the 2nd heart sound. Diagnosis is based on a CT angiogram, ventilation/perfusion\nscan, or a pulmonary arteriogram. Treatment is with anticoagulants and, sometimes, clot\ndissolution with thrombolytics or surgical removal. Preventive measures include anticoagulants\nand sometimes insertion of an inferior vena caval filter.\nPE affects an estimated 117 pe

##### Observations:
The chunks retrieved from the vector store closely align with the query: the top match is the manual's pulmonary embolism chapter, which lists the symptoms directly.
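
Conceptually, a similarity search ranks stored chunk vectors by how close they are to the query vector and returns the top `k`. The sketch below illustrates that idea with a brute-force cosine ranking over toy vectors. It is only a stand-in: Chroma uses its own index and distance setting, and the helper name `top_k_by_cosine` and the data are hypothetical.

```python
import math

def top_k_by_cosine(query_vec, doc_vecs, k=3):
    """Return indices of the k document vectors most similar to the query."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    # Score every document, then sort from most to least similar
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]

# Toy 2-dimensional vectors standing in for chunk embeddings
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
toy_query = [1.0, 0.0]
print(top_k_by_cosine(toy_query, toy_docs, k=2))
```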

### Retriever

In [41]:
#Initialize a retriever to fetch relevant context from the vector store based on the query
retriever = vectorstore.as_retriever(
    search_type='similarity',
    search_kwargs={'k': 3}
)

In [42]:
#Display the retrieved context for a sample query or prompt
rel_docs = retriever.get_relevant_documents("What are the common symptoms for pulmonary embolism?")
rel_docs

[Document(page_content='Chapter 194. Pulmonary Embolism\nIntroduction\nPulmonary embolism (PE) is the occlusion of ≥ 1 pulmonary arteries by thrombi that originate\nelsewhere, typically in the large veins of the lower extremities or pelvis. Risk factors are\nconditions that impair venous return, conditions that cause endothelial injury or dysfunction,\nand underlying hypercoagulable states. Symptoms are nonspecific and include dyspnea,\npleuritic chest pain, cough, and, in severe cases, syncope or cardiorespiratory arrest. Signs are\nalso nonspecific and may include tachypnea, tachycardia, hypotension, and a loud pulmonic\ncomponent of the 2nd heart sound. Diagnosis is based on a CT angiogram, ventilation/perfusion\nscan, or a pulmonary arteriogram. Treatment is with anticoagulants and, sometimes, clot\ndissolution with thrombolytics or surgical removal. Preventive measures include anticoagulants\nand sometimes insertion of an inferior vena caval filter.\nPE affects an estimated 117 pe

In [43]:
#Generate a response from the LLM using only the prompt, without any RAG-based context
model_output = llm(
    "What are the common symptoms for pulmonary embolism?",
    max_tokens=256,
    temperature=0.7,
)

model_output['choices'][0]['text']

Llama.generate: prefix-match hit


"\n\nPulmonary embolism is a serious condition that occurs when a blood clot or other foreign substance travels to the lungs and blocks one or more of the arteries in the lungs. Some of the common symptoms include:\n\n1. Shortness of breath - This is often the first symptom, especially with large clots. The individual may feel like they can't catch their breath or that they are gasping for air.\n2. Chest pain or discomfort - This can be described as a sharp stabbing pain in the chest, which may worsen with deep breathing or physical activity. Some people may also feel pressure or heaviness in the chest.\n3. Rapid heart rate - The individual's heart may beat faster than normal to try and compensate for the reduced blood flow to the lungs.\n4. Rapid breathing - The body may respond to the lack of oxygen by increasing the breathing rate.\n5. Coughing up blood or blood-streaked sputum - This is less common but can be a sign of a more severe pulmonary embolism.\n6. Swelling in the legs, ank

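##### Observations:
The standalone LLM output is fluent and broadly plausible, but it is generated without any source context. The retrieved chunks, by contrast, come directly from the manual's pulmonary embolism chapter, which motivates supplementing the LLM with retrieved context in the sections that follow.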

### System and User Prompt Template

In [44]:
# Define a structured prompt template combining system-level guidance and user input.
# This template instructs the LLM to use only the retrieved context when answering the user's question.
# It helps ensure factual, grounded responses and avoids hallucination.

from langchain.prompts import ChatPromptTemplate

template = """You are a trusted and experienced medical professional.
Use only the context provided below to answer the user's question accurately and clearly.
If the context does not contain sufficient information, respond with 'I don't know' and do not make assumptions.

Question: {question}
Context: {context}
Answer:"""

prompt = ChatPromptTemplate.from_template(template)

prompt


ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are a trusted and experienced medical professional.\nUse only the context provided below to answer the user's question accurately and clearly.\nIf the context does not contain sufficient information, respond with 'I don't know' and do not make assumptions.\n\nQuestion: {question}\nContext: {context}\nAnswer:"))])
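
To see what the LLM actually receives, the template string can be filled with a question and a context using plain `str.format`, which is what the response function does later on. The sketch below repeats the template so that it runs on its own. The context value is a single sentence taken from the retrieved chunk shown earlier and is used here only as a placeholder.

```python
template = """You are a trusted and experienced medical professional.
Use only the context provided below to answer the user's question accurately and clearly.
If the context does not contain sufficient information, respond with 'I don't know' and do not make assumptions.

Question: {question}
Context: {context}
Answer:"""

# Placeholder inputs: in the notebook the context comes from the retriever
filled_prompt = template.format(
    question="What are the common symptoms for pulmonary embolism?",
    context=(
        "Symptoms are nonspecific and include dyspnea, pleuritic chest pain, cough, "
        "and, in severe cases, syncope or cardiorespiratory arrest."
    ),
)
print(filled_prompt)
```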

### Response Function

In [61]:
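# Function: generate_rag_response
# Purpose: Retrieve relevant chunks from the vector store and generate an LLM answer conditioned on that context.
def generate_rag_response(
    user_input,
    k=3,
    max_tokens=128,
    temperature=0,
    top_p=0.95,
    top_k=50
):
    # Step 1: Retrieve the top-k most relevant document chunks for the query.
    # The vector store is queried directly so that k is always honored; depending on the
    # LangChain version, extra keyword arguments passed to retriever.get_relevant_documents()
    # may be ignored, leaving the retriever's own search_kwargs (k=3) in effect.
    relevant_document_chunks = vectorstore.similarity_search(user_input, k=k)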

    # Extract page content from each retrieved document
    context_list = [d.page_content for d in relevant_document_chunks]

    # Step 2: Combine the document chunks into a single context string
    context_for_query = ". ".join(context_list)

    # Step 3: Format the prompt using the predefined template
    prompt = template.format(context=context_for_query, question=user_input)

    # Step 4: Generate a response from the LLM using the constructed prompt
    try:
        response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k
        )

        # Extract the generated text from the response
        response = response['choices'][0]['text'].strip()

    except Exception as e:
        # Handle any errors during generation
        response = f"Sorry, I encountered the following error:\n{e}"

    return response
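
The function above joins every retrieved chunk into the prompt, so a larger `k` produces a longer prompt that may not fit the model's context window. One possible safeguard, sketched here under assumptions, is to drop exact duplicate chunks and stop once a size budget is reached. The helper `build_context`, its `max_chars` parameter, and the toy chunks are hypothetical, and a character count is only a rough proxy for tokens.

```python
def build_context(chunks, max_chars=4000, separator="\n\n"):
    """Join chunk texts in rank order, skipping exact duplicates and stopping at a character budget."""
    seen = set()
    parts = []
    used = 0
    for text in chunks:
        text = text.strip()
        if not text or text in seen:
            continue  # skip empty or repeated chunks
        cost = len(text) + (len(separator) if parts else 0)
        if used + cost > max_chars:
            break  # budget exhausted; lower-ranked chunks are dropped
        parts.append(text)
        seen.add(text)
        used += cost
    return separator.join(parts)

# Toy chunk texts standing in for d.page_content values
toy_chunks = ["Chunk A about sepsis.", "Chunk A about sepsis.", "Chunk B about fluids."]
print(build_context(toy_chunks, max_chars=50))
```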


## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [62]:
# Generate a RAG-based response for Query 1
rag_response_1 = generate_rag_response(user_input_1, k=3, top_k=20)

# Show all three outputs: plain LLM, system-prompted LLM, and the RAG-augmented answer
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_1}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_1}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_1}"
)



Llama.generate: prefix-match hit


Raw LLM response:


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Resusc

LLM with system prompt:

Based on current guidelines from reputable sources such as the Surviving Sepsis Campaign and the National Institute for Health and Care Excellence (NICE), the following are key steps in managing sepsis in a critical care unit:
1. Early recognition and assessment using the Sequential Organ Failure Assessment (SOFA) score or Quick Sequential Organ Failure Assessment (qSOFA) score.
2. Immediate administration of br

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [64]:
# Generate a RAG-based response for Query 2
rag_response_2 = generate_rag_response(user_input_2, k=3, top_k=20)

# Show all three outputs: plain LLM, system-prompted LLM, and the RAG-augmented answer
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_2}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_2}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_2}"
)

Llama.generate: prefix-match hit


Raw LLM response:


Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:

1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing

LLM with system prompt:


Appendicitis is a medical condition characterized by inflammation of the appendix, a small pouch that extends from the cecum in the large intestine. The common symptoms for appendicitis include:
1. Sudden pain in the lower right abdomen, which may be mild at first but becomes sharp and severe over time.
2. Loss of appetite and feeling sick to your stomach (nausea).
3. Vomiting.
4. Fever, which may s

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [65]:
# Generate a RAG-based response for Query 3
rag_response_3 = generate_rag_response(user_input_3, k=3, top_k=20)

# Show all three outputs: plain LLM, system-prompted LLM, and the RAG-augmented answer
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_3}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_3}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_3}"
)

Llama.generate: prefix-match hit


Raw LLM response:


Sudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.

The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications.

LLM with system prompt:

Based on the context provided, sudden patchy hair loss, also known as alopecia areata, is an autoimmune disorder that results in the sudden loss of hair in small patches on the scalp or other areas of the body. The exact cause of alopecia areata is unknown, but it's believed to be related to a problem with the immune system.
Effective treatments for addressing sudden patchy hair loss include:
1. Corticosteroids: These medications can help reduc

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [66]:
# Generate a RAG-based response for Query 4
rag_response_4 = generate_rag_response(user_input_4, k=3, top_k=20)

# Show all three outputs: plain LLM, system-prompted LLM, and the RAG-augmented answer
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_4}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_4}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_4}"
)

Llama.generate: prefix-match hit


Raw LLM response:


A person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:

1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.
2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as

LLM with system prompt:

Based on the context provided, I cannot definitively answer this question without additional information such as the severity and location of the brain injury, the specific symptoms experienced by the individual, and any underlying health conditions they may have. However, I can provide some general information about common treatments for brain injuries.

For mild to mo

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [67]:
# Generate a RAG-based response for Query 5
rag_response_5 = generate_rag_response(user_input_5, k=3, top_k=20)

# Show all three outputs: plain LLM, system-prompted LLM, and the RAG-augmented answer
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_5}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_5}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_5}"
)

Llama.generate: prefix-match hit


Raw LLM response:


First and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:

1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.
2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.
3. Immobilize the leg: Use a splint, sl

LLM with system prompt:

Based on the context provided, I cannot directly determine the specific type or severity of the leg fracture. However, I can provide general guidelines for the necessary precautions and treatment steps for someone who has sustained a leg fracture during a hiking trip.
1. Immediate Care: The first priority is to ensure the safety and stability of the injured person. If possible, try to immobilize the affected leg us

#### Observations
**Raw LLM Responses**
- Provide overly generic recommendations with limited clinical relevance

- Frequently omit important medical protocols and considerations

- Lack situational awareness, resulting in incomplete or potentially unsafe advice

- Do not account for context-specific challenges such as remote settings or evacuation needs

**LLM + System Prompt Responses**
- Exhibit improved structure and adopt a professional tone

- Offer more focused guidance but still miss key clinical interventions

- Fail to fully address risk mitigation strategies and emergency protocols

- Remain limited in adapting responses to specific use-case scenarios

**RAG-Supplemented Responses**
- Leverage context effectively to deliver precise and relevant medical guidance

- Incorporate accurate clinical terminology and reference evidence-based practices

- Demonstrate an understanding of complications and appropriate escalation steps

- Adapt to the situation at hand, such as considering environmental and logistical factors

- Provide clear, prioritized, and actionable recommendations aligned with medical standards
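
The same three-way comparison is printed for every query, so the repeated `print` block could be factored into a small helper. This is a minimal sketch: the name `format_comparison` and the placeholder texts are assumptions, and in the notebook the values would be variables such as `raw_llm_response_1`, `sysprompt_llm_1`, and `rag_response_1`.

```python
def format_comparison(responses, width=50):
    """Build one printable report from an ordered mapping of label -> response text."""
    separator = "=" * width
    sections = [f"{separator}\n{label}:\n{text}" for label, text in responses.items()]
    return "\n\n".join(sections)

# Placeholder texts standing in for the actual model outputs
example = {
    "Raw LLM response": "placeholder text A",
    "LLM with system prompt": "placeholder text B",
    "RAG response": "placeholder text C",
}
print(format_comparison(example))
```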



### Fine-tuning Retrieval and Generation Parameters

#### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [70]:
# Generate and compare LLM outputs: base model, system-guided, RAG-enhanced, and RAG with adjusted temperature
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_1}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_1}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_1}\n\n"
    f"{separator}\n"
    f"RAG response with adjusted temperature (0.5):\n"
    f"{generate_rag_response(user_input_1, temperature=0.5)}"
)



Llama.generate: prefix-match hit


Raw LLM response:


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Resusc

LLM with system prompt:

Based on current guidelines from reputable sources such as the Surviving Sepsis Campaign and the National Institute for Health and Care Excellence (NICE), the following are key steps in managing sepsis in a critical care unit:
1. Early recognition and assessment using the Sequential Organ Failure Assessment (SOFA) score or Quick Sequential Organ Failure Assessment (qSOFA) score.
2. Immediate administration of br

##### Observations
- Increasing the temperature from 0 to 0.5 led to noticeable changes in phrasing and expression.

- The core meaning of the response remained consistent despite the variation in wording.

- Overall, this adjustment had a low impact on content quality, affecting language style rather than informational value.
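
Why temperature mainly affects wording can be seen from how it rescales the model's token probabilities before sampling. The sketch below applies a temperature-scaled softmax to toy logits. The function name and the logit values are illustrative assumptions, not values taken from the Mistral model. A temperature of exactly 0 is commonly handled as greedy decoding instead, so the function rejects it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

# Toy logits for three candidate tokens
toy_logits = [2.0, 1.0, 0.5]
for temp in (0.1, 0.5, 1.0):
    probs = softmax_with_temperature(toy_logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token, which keeps outputs nearly deterministic. Higher temperatures spread probability across alternatives, which shows up as varied phrasing.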

#### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [76]:
# Compare LLM outputs: baseline, system-prompted, RAG-based, and RAG with increased chunk count (k=5) and temp=0.7
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_2}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_2}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_2}\n\n"
    f"{separator}\n"
    f"RAG response with increased chunk count (k=5) and temperature 0.7:\n"
    f"{generate_rag_response(user_input_2, temperature=0.7, k=5)}"
)

Llama.generate: prefix-match hit


Raw LLM response:


Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:

1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing

LLM with system prompt:


Appendicitis is a medical condition characterized by inflammation of the appendix, a small pouch that extends from the cecum in the large intestine. The common symptoms for appendicitis include:
1. Sudden pain in the lower right abdomen, which may be mild at first but becomes sharp and severe over time.
2. Loss of appetite and feeling sick to your stomach (nausea).
3. Vomiting.
4. Fever, which may s

##### Observations
- The raw LLM response provides a basic clinical description but lacks completeness and does not include specific diagnostic signs.

- The system-prompted LLM improves slightly, listing more symptoms but still omits clinical examination findings and medical nuance.

- The RAG response introduces detailed, clinically relevant content, including diagnostic signs (e.g., McBurney’s point, Rovsing sign), showing clear value from contextual grounding.

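- Increasing k (number of retrieved chunks) from 3 to 5, with temperature raised to 0.7 in the same run, yielded a nearly identical response, with only slight elaboration on certain signs.

- Overall, the change had minimal impact on output quality, suggesting diminishing returns beyond a certain number of chunks for this query. Because two parameters changed at once, the effect of k alone cannot be isolated from this run.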
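
Since the tuned run for this query changes both `k` and `temperature`, one way to attribute an effect to a single parameter is to vary one setting at a time against a fixed baseline. The generator below is a sketch of that idea. Its name, the baseline, and the candidate values are assumptions, and each resulting config could be passed to `generate_rag_response(query, **config)` in the notebook.

```python
def one_at_a_time_configs(baseline, variations):
    """Yield (label, config) pairs that change exactly one parameter from the baseline."""
    yield "baseline", dict(baseline)
    for name, values in variations.items():
        for value in values:
            if value == baseline.get(name):
                continue  # identical to the baseline, nothing to compare
            config = dict(baseline)
            config[name] = value
            yield f"{name}={value}", config

# Illustrative baseline and candidate values
baseline = {"k": 3, "temperature": 0}
variations = {"k": [5, 7], "temperature": [0.5, 0.8]}
for label, config in one_at_a_time_configs(baseline, variations):
    print(label, config)
```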

#### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [81]:
# Compare LLM outputs: baseline, system-prompted, RAG-based, and RAG with increased chunk count (k=7) and temp=0.65
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_3}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_3}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_3}\n\n"
    f"{separator}\n"
    f"RAG response with increased chunk count (k=7) and temperature 0.65:\n"
    f"{generate_rag_response(user_input_3, temperature=0.65, k=7)}"
)

Llama.generate: prefix-match hit


Raw LLM response:


Sudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.

The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications.

LLM with system prompt:

Based on the context provided, sudden patchy hair loss, also known as alopecia areata, is an autoimmune disorder that results in the sudden loss of hair in small patches on the scalp or other areas of the body. The exact cause of alopecia areata is unknown, but it's believed to be related to a problem with the immune system.
Effective treatments for addressing sudden patchy hair loss include:
1. Corticosteroids: These medications can help reduc

##### Observations

- The raw LLM and system-prompted responses both correctly identify alopecia areata and provide basic context, but they remain limited in treatment detail and are truncated before delivering complete guidance.

- The RAG response introduces more comprehensive treatment options and acknowledges the autoimmune nature of the condition, showing effective use of contextual knowledge from source documents.

- Increasing k to 7 (with temperature set to 0.65) slightly enhanced the completeness of the treatment list and added disclaimers regarding treatment variability and underlying conditions. This suggests a moderate improvement in response quality without introducing irrelevant content.

#### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [82]:
# Compare LLM outputs: baseline, system-prompted, RAG-based, and RAG with increased chunk count (k=8) and temp=0.8
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_4}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_4}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_4}\n\n"
    f"{separator}\n"
    f"RAG response with increased chunk count (k=8) and temperature 0.8:\n"
    f"{generate_rag_response(user_input_4, temperature=0.8, k=8)}"
)

Llama.generate: prefix-match hit


Raw LLM response:


A person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:

1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.
2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as

LLM with system prompt:

Based on the context provided, I cannot definitively answer this question without additional information such as the severity and location of the brain injury, the specific symptoms experienced by the individual, and any underlying health conditions they may have. However, I can provide some general information about common treatments for brain injuries.

For mild to mo

##### Observations
- The raw LLM and system-prompted responses provide a high-level overview of traumatic brain injury (TBI) management, but both are truncated and lack depth in clinical intervention and long-term care strategies.

- The RAG response introduces medically relevant and context-grounded detail, including acute stabilization, ICP management, cognitive rehabilitation, and pain control, demonstrating strong value from contextual retrieval.

- Increasing k to 8 (with temperature set to 0.8) resulted in a more detailed and complete version of the RAG response, reinforcing key clinical considerations and extending coverage to long-term monitoring and neurological assessments. This shows a moderate improvement in completeness and precision without topic drift.

#### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [83]:
# Compare LLM outputs: baseline, system-prompted, RAG-based, and RAG with temp=0.8
separator = "=" * 50
print(
    f"{separator}\n"
    f"Raw LLM response:\n{raw_llm_response_5}\n\n"
    f"{separator}\n"
    f"LLM with system prompt:\n{sysprompt_llm_5}\n\n"
    f"{separator}\n"
    f"RAG response:\n{rag_response_5}\n\n"
    f"{separator}\n"
    f"RAG response with temp = 0.8:\n"
    f"{generate_rag_response(user_input_5, temperature=0.8)}"
)

Llama.generate: prefix-match hit


Raw LLM response:


First and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:

1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.
2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.
3. Immobilize the leg: Use a splint, sl

LLM with system prompt:

Based on the context provided, I cannot directly determine the specific type or severity of the leg fracture. However, I can provide general guidelines for the necessary precautions and treatment steps for someone who has sustained a leg fracture during a hiking trip.
1. Immediate Care: The first priority is to ensure the safety and stability of the injured person. If possible, try to immobilize the affected leg us

##### Observations
- The raw LLM and system-prompted responses provide a general first-aid overview of managing a leg fracture but are cut off before delivering complete or medically actionable steps. They focus more on stabilization than on clinical treatment or escalation protocols.

- The RAG response offers greater clinical depth, including assessment of neurovascular injuries, use of sedation during reduction, and differentiation between closed and open treatment methods. It also references the RICE protocol, indicating improved contextual accuracy.

- Increasing the temperature to 0.8 led to a more expansive and assertive response that preserved medical relevance while introducing slightly broader detail. In this case, the higher temperature produced a richer but still accurate output, suggesting a positive impact on expression without sacrificing content integrity.

## Output Evaluation

We now use the LLM-as-a-judge method to check the quality of the RAG system on two aspects, retrieval and generation, by rating each response for groundedness and relevance.

The same Mistral model is used for evaluation, so the LLM is effectively rating its own output; the scores are best read as a rough self-check rather than an independent assessment.

In [84]:
# Define a stricter evaluation prompt to assess how well a response is grounded in the provided source material
groundedness_rater_system_message = (
    "You are an assistant assigned to critically evaluate the groundedness of a response. "
    "Your task is to rigorously assess whether the response is explicitly supported by the provided source documents. "
    "Pay close attention to factual alignment, completeness, and the absence of unsupported claims or assumptions. "
    "Only rate the response favorably if it closely adheres to the information in the sources."
)

In [85]:
# Define a prompt for evaluating how directly and effectively a response addresses the original query
relevance_rater_system_message = (
    "You are an assistant tasked with critically evaluating the relevance of a response to a specific user query. "
    "Assess how directly, completely, and specifically the response addresses the intent and details of the question. "
    "Do not reward generic or partially aligned answers. "
    "Rate the response on a scale from 1 (not relevant) to 5 (highly relevant), and provide a clear justification for your rating."
)
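
The relevance prompt asks the judge for a rating from 1 to 5 together with a justification, so the reply is free text and the score has to be extracted before it can be compared across queries. The following is a heuristic sketch, assuming the judge writes the score as a plain integer. The function name `extract_rating` and the sample replies are hypothetical, and real replies may need a stricter output format.

```python
import re

def extract_rating(judge_text, low=1, high=5):
    """Return the first plausible integer rating in [low, high], or None if none is found."""
    # Prefer a number that follows a keyword such as "Rating" or "score"
    keyword = re.search(
        r"\b(?:rating|rate|score)\b\D{0,30}?(\d+)", judge_text, flags=re.IGNORECASE
    )
    candidates = [keyword.group(1)] if keyword else []
    # Fall back to any standalone integer in the text
    candidates += re.findall(r"\b\d+\b", judge_text)
    for raw in candidates:
        value = int(raw)
        if low <= value <= high:
            return value
    return None

# Hypothetical judge replies
print(extract_rating("Rating: 4/5. The answer addresses the question directly."))
print(extract_rating("No numeric score was given."))
```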

In [86]:
# Define a prompt template that provides question, context, and answer for groundedness and relevance evaluation
user_message_template = """
You will be provided with a question, the context used to answer it, and the response itself.
Evaluate the response based on its relevance to the question and its groundedness in the context.

### Question:
{question}

### Context:
{context}

### Answer:
{answer}
"""


In [91]:
# Generate a RAG response, then evaluate it for groundedness and relevance.
# The LLM is called three times: once for the answer, once to judge groundedness, and once to judge relevance.
def generate_ground_relevance_response(user_input, k=3, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    """Return the judge's (groundedness, relevance) evaluations of the RAG answer to `user_input`."""
    # Retrieve relevant document chunks.
    # Note: whether this per-call `k` overrides the retriever's configured search settings
    # depends on the installed LangChain version.
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input, k=k)
    context_list = [d.page_content for d in relevant_document_chunks]
    context_for_query = ". ".join(context_list)

    # Sampling settings shared by all three calls. Because `max_tokens` also caps the
    # judge outputs, a long evaluation can be cut off before it states a rating.
    generation_kwargs = dict(
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        stop=['INST'],
        echo=False,
    )

    # Use the template to construct the prompt for the main response
    qna_prompt = template.format(context=context_for_query, question=user_input)
    response = llm(prompt=qna_prompt, **generation_kwargs)
    answer = response["choices"][0]["text"]

    # Fill the shared user message once; both judge prompts reuse it
    user_message = user_message_template.format(
        context=context_for_query, question=user_input, answer=answer
    )

    # Combine the system message and user message into the groundedness prompt
    groundedness_prompt = f"""[INST]{groundedness_rater_system_message}\n
                user: {user_message}
                [/INST]"""

    # Combine the system message and user message into the relevance prompt
    relevance_prompt = f"""[INST]{relevance_rater_system_message}\n
                user: {user_message}
                [/INST]"""

    response_1 = llm(prompt=groundedness_prompt, **generation_kwargs)
    response_2 = llm(prompt=relevance_prompt, **generation_kwargs)

    return response_1['choices'][0]['text'], response_2['choices'][0]['text']
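
Because `max_tokens` is shared by the answer call and both judge calls, a long evaluation can be cut off before it reaches a verdict. The following is a minimal sketch, not part of the original notebook, of one way to call the model with a separate budget and flag truncated output. The helper `call_llm` and the stub `fake_llm` are hypothetical names. The sketch assumes the model callable returns a llama-cpp-style completion dict whose choices carry a `finish_reason` of `"length"` when the token limit was reached.

```python
def call_llm(llm, prompt, max_tokens, temperature=0, top_p=0.95, top_k=50):
    """Call a llama-cpp-style completion callable and report whether the output was cut off.

    Returns (text, truncated). `truncated` is True when the completion stopped
    because it hit `max_tokens` (finish_reason == "length").
    """
    response = llm(
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        stop=["INST"],
        echo=False,
    )
    choice = response["choices"][0]
    # Assumption: the backend reports "length" when the token budget ran out.
    truncated = choice.get("finish_reason") == "length"
    return choice["text"], truncated


def fake_llm(prompt, max_tokens, **_):
    """Hypothetical stand-in for the real model: a fixed reply, cut after `max_tokens` words."""
    words = "Rating: 5 (highly relevant) Justification: the answer addresses the query".split()
    kept = words[:max_tokens]
    finish_reason = "length" if len(words) > max_tokens else "stop"
    return {"choices": [{"text": " ".join(kept), "finish_reason": finish_reason}]}


# A generous budget lets the stub finish; a tight one cuts it off and sets the flag.
full_text, full_truncated = call_llm(fake_llm, "judge prompt", max_tokens=512)
cut_text, cut_truncated = call_llm(fake_llm, "judge prompt", max_tokens=4)
print(full_truncated, cut_truncated)
```

With the real model, the same helper could be called with a larger `max_tokens` for the two judge prompts than for the answer, and any evaluation flagged as truncated could be rerun or excluded.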

#### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [94]:
#Evaluate query 1 response for groundedness and relevance
ground,rel = generate_ground_relevance_response(user_input_1,k=1,max_tokens=370)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 The user's response is grounded in the context provided, as it accurately summarizes the protocol for managing sepsis in a critical care unit based on the information given in the text. The response includes details about fluid resuscitation, oxygen therapy, antibiotics, drainage of abscesses and necrotic tissue, normalization of blood glucose levels, replacement-dose corticosteroids, monitoring requirements, and diagnosis procedures for sepsis patients. Therefore, the response is favorable.

 Rating: 5 (highly relevant)

Justification: The response directly addresses the user's question about managing sepsis in a critical care unit by providing a detailed list of interventions and monitoring requirements. The information is grounded in the context provided, which discusses the importance of supportive care for ICU patients, including prevention of infection and monitoring of vital signs and laboratory values. The response also mentions the potential complications of sepsis, such as s
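
The relevance prompt asks for a rating from 1 to 5, and the judge returns it as free text (`Rating: 5 (highly relevant)` in the output above). As a minimal sketch, not part of the original notebook, the score could be pulled out with a regular expression so it can be compared across queries. The function name `extract_rating` is hypothetical, and the pattern assumes the judge writes the score in the `Rating: N` form seen in this output; any other phrasing returns `None`.

```python
import re

# Matches "Rating: N" (case-insensitive) where N is a single digit from 1 to 5.
_RATING_PATTERN = re.compile(r"rating\s*:\s*([1-5])\b", re.IGNORECASE)


def extract_rating(judge_text):
    """Return the 1-5 rating found in the judge's output, or None if none is present."""
    match = _RATING_PATTERN.search(judge_text)
    return int(match.group(1)) if match else None


# Sample strings copied from the style of the judge outputs in this section.
with_rating = " Rating: 5 (highly relevant)\n\nJustification: The response directly addresses the question."
without_rating = " The user's response is grounded in the context provided."

print(extract_rating(with_rating))     # 5
print(extract_rating(without_rating))  # None
```

Returning `None` instead of raising keeps the missing-rating case explicit, which matters here because the groundedness prompt does not request a numeric score.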

#### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [93]:
#Evaluate query 2 response for groundedness and relevance
ground,rel = generate_ground_relevance_response(user_input_2,max_tokens=370)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 The user's response is grounded in the context provided. The response accurately summarizes the common symptoms of appendicitis as described in the context, including epigastric or periumbilical pain, nausea, vomiting, anorexia, and right lower quadrant pain that increases with cough and motion. The response also mentions classic signs such as tenderness at McBurney's point, Rovsing sign, psoas sign, and obturator sign, as well as the presence of a low-grade fever.

The user's response correctly states that treatment for appendicitis is surgical removal via open or laparoscopic appendectomy. The response also mentions that antibiotics may be given before or after surgery to prevent infection and that if an abscess or inflammatory mass has formed, the procedure may be limited to drainage of the abscess.

The user's response is largely complete, accurately summarizing the information provided in the context regarding appendicitis symptoms, signs, and treatment. There are no unsupported 

#### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [95]:
#Evaluate query 3 response for groundedness and relevance
ground,rel = generate_ground_relevance_response(user_input_3,max_tokens=370)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 The user's response is grounded in the context provided, as it accurately identifies alopecia areata as a possible cause of sudden patchy hair loss and lists several treatment options for this condition based on the information in the context. The response also emphasizes the importance of identifying and treating any underlying disorder causing the hair loss. Therefore, I would rate the response favorably.

 Rating: 5 (highly relevant)

Justification: The response directly addresses the user query by identifying alopecia areata as a possible cause for sudden patchy hair loss and suggesting various treatment options for this condition. The response is also grounded in the context provided, which mentions alopecia areata as a type of hair disorder that can result in sudden patchy hair loss with no obvious underlying skin or systemic disorder. The response also emphasizes the importance of identifying and treating any underlying disorders if they are causing the hair loss. Overall, the 

#### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [96]:
#Evaluate query 4 response for groundedness and relevance
ground,rel = generate_ground_relevance_response(user_input_4,max_tokens=370)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 The user's response is largely grounded in the context provided, as it accurately summarizes various aspects of the treatment and management of brain injuries discussed in the text. The response mentions optimization of ventilation, oxygenation, and brain perfusion for moderate to severe injuries, which is explicitly stated in the context. It also acknowledges the importance of treating complications such as increased intracranial pressure, seizures, and hematomas, which are mentioned in the context.

The response further discusses rehabilitation for cognitive deficits and personality changes, which are common causes of disability in social relations and employment according to the context. It also mentions proper immobilization to protect the spinal cord and blood vessels, securing a clear airway at the injury scene, and relieving pain with a short-acting opioid, all of which are procedures mentioned in the context.

The response also correctly states that the cornerstone of manageme

#### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [97]:
#Evaluate query 5 response for groundedness and relevance
ground,rel = generate_ground_relevance_response(user_input_5,max_tokens=370)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 The user's response is grounded in the context provided, as it accurately summarizes the necessary precautions and treatment steps for a person who has fractured their leg, including assessing for life-threatening injuries, splinting or immobilizing the leg, definitive treatment (reduction), RICE, pain management, and rehabilitation. The response also mentions prevention measures such as assessing risk for DVT and using preventative treatments for those at higher risk.

However, it's important to note that while the context does mention some specific treatments (such as arteriography for suspected arterial injuries and nerve conduction studies for nerve injuries), the user's response goes beyond the information provided in the context by discussing various treatment options for fractures (closed vs. open reduction, use of surgical hardware) and preventative measures for DVT (use of different anticoagulants and compression devices or stockings). While these additional details are gener


##### Observations

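- The groundedness evaluations were favorable for all five responses. They are qualitative judgments rather than scores, because the groundedness prompt does not request a numeric rating. For Query 5, the judge also noted that the response goes beyond the retrieved context by adding detail on fracture reduction options and DVT prevention.

- Queries 1 and 3 received an explicit relevance rating of 5 (highly relevant), with justifications stating that the answers address the queries directly. For Queries 2, 4, and 5, no relevance rating is visible in the captured output, and several evaluations appear to be cut off by the `max_tokens=370` limit.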
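
To back observations like these with numbers, the per-query ratings could be collected and summarized. The sketch below is not part of the original notebook, and the function name `summarize_ratings` is hypothetical. It records only the relevance ratings that are visible in the printed outputs above (5 for Queries 1 and 3) and marks the others as `None` instead of guessing a value.

```python
from statistics import mean


def summarize_ratings(ratings):
    """Summarize judge ratings keyed by query label; None marks a missing rating."""
    scored = {query: r for query, r in ratings.items() if r is not None}
    missing = [query for query, r in ratings.items() if r is None]
    return {
        "n_scored": len(scored),
        "mean_rating": mean(list(scored.values())) if scored else None,
        "missing": missing,
    }


# Relevance ratings as they appear in the captured outputs; None = no rating visible.
relevance_ratings = {
    "Query 1": 5,
    "Query 2": None,
    "Query 3": 5,
    "Query 4": None,
    "Query 5": None,
}

summary = summarize_ratings(relevance_ratings)
print(summary)
```

Reporting the missing queries alongside the mean makes it clear how many evaluations the average is actually based on.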

## Actionable Insights and Business Recommendations

## Final Observations
Moving from a raw LLM to a system-prompted LLM, and finally to a RAG-enhanced LLM with a system prompt, produced a clear progression in response quality. Each stage adds control, depth, and reliability, which matters most in high-stakes fields like medicine, where accuracy and contextual relevance are critical. Combining RAG with well-structured prompts and high-quality sources proves essential for producing trustworthy, domain-aware, and up-to-date answers.

**Raw LLM**
- Accuracy: Moderate

- Relevance: Inconsistent

- Consistency: Variable

- Hallucinations: Frequently observed

- Domain-Specific Knowledge: Limited and shallow

**LLM + System Prompt**
- Accuracy: Moderate but more stable

- Relevance: Noticeably improved

- Consistency: Improved with structured output

- Hallucinations: Reduced, but still present

- Domain-Specific Knowledge: Somewhat enhanced with controlled framing

**RAG + System Prompt**
- Accuracy: High, with domain-specific grounding

- Relevance: High, with strong contextual alignment

- Consistency: High and reliable

- Hallucinations: Rare

- Domain-Specific Knowledge: Strong and well-supported by source material
