<a href="https://colab.research.google.com/github/phfrebelo/aiml-portfolio/blob/main/NLP_RAG_Project_Notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to reach accurate diagnoses and build treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

**Common Questions to Answer**

**1. Diagnostic Assistance**: "What are the common symptoms and treatments for pulmonary embolism?"

**2. Drug Information**: "Can you provide the trade names of medications used for treating hypertension?"

**3. Treatment Plans**: "What are the first-line options and alternatives for managing rheumatoid arthritis?"

**4. Specialty Knowledge**: "What are the diagnostic steps for suspected endocrine disorders?"

**5. Critical Care Protocols**: "What is the protocol for managing sepsis in a critical care unit?"

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co. They cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [None]:
# Installation for GPU llama-cpp-python
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.28 --force-reinstall --no-cache-dir -q


**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- When the above cell runs, you might see a warning about package dependencies. It can be ignored, because the cell installs the library versions needed to run the code in ***this notebook***.

In [None]:
# For installing the libraries & downloading models from HF Hub
!pip install -q \
  huggingface_hub==0.35.3 pandas==2.2.2 tiktoken==0.12.0 pymupdf==1.26.5 \
  langchain==0.3.27 langchain-community==0.3.31 chromadb==1.1.1 \
  sentence-transformers==5.1.1 numpy==2.3.3


**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- When the above cell runs, you might see a warning about package dependencies. It can be ignored, because the cell installs the library versions needed to run the code in ***this notebook***.

In [None]:
# Libraries for downloading and loading the LLM
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Question Answering using LLM

#### Downloading and Loading the model

In [None]:
model_path = hf_hub_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    filename="mistral-7b-instruct-v0.2.Q5_K_M.gguf",
    resume_download=True
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


mistral-7b-instruct-v0.2.Q5_K_M.gguf:   0%|          | 0.00/5.13G [00:00<?, ?B/s]

In [None]:
llm = Llama(model_path=model_path, n_ctx=8192, n_gpu_layers=-1, n_batch=512, verbose=True)

AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Response

In [None]:
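def response(query, max_tokens=256, temperature=0.2, top_p=0.95, top_k=50):
    """Send a single-turn query to the model and return the generated text."""
    # Mistral-Instruct expects the user turn wrapped in [INST] ... [/INST].
    prompt = f"[INST] {query} [/INST]"
    out = llm(prompt=prompt, max_tokens=max_tokens, temperature=temperature, top_p=top_p, top_k=top_k)
    # The completion text is in the first (and only) choice.
    return out["choices"][0]["text"].strip()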

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
response(user_input)

'Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:\n\n1. Early Recognition: Identify sepsis early by recognizing the signs and symptoms, such as fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis. Use validated scoring systems like Sequential Organ Failure Assessment (SOFA) score or Quick Sequential [Sepsis-related] Organ Failure Assessment (qSOFA) to help identify patients at risk of sepsis.\n\n2. Immediate Fluid Resuscitation: Administer intravenous fluids to maintain adequate tissue perfusion and blood pressure. The goal is to achieve a mean arterial pressure (MAP) of 65 mmHg or higher, but avoid over-resuscitation that can lead to fluid overload.\n\n3. Antibiotics: Administer broad-spectrum antibiotics as soon as possible based on the suspected so

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
response(user_input2)

Llama.generate: prefix-match hit


'Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right abdomen. The common symptoms of appendicitis include:\n\n1. Abdominal pain: The pain usually starts around the navel and then shifts to the lower right side of the abdomen. It may be mild at first but can worsen over time.\n2. Loss of appetite: People with appendicitis often lose their appetite due to abdominal pain and discomfort.\n3. Nausea and vomiting: Vomiting is a common symptom in appendicitis, and it can occur before or after the onset of abdominal pain.\n4. Fever: A fever above 100.4°F (38°C) is often present in people with appendicitis.\n5. Constipation or diarrhea: Both constipation and diarrhea can occur in appendicitis, depending on the stage of the disease.\n6. Inability to pass gas: The inability to pass gas can be a sign of appendicitis, especially if it is accompanied by other symptoms.\n\nAppend'

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_input3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
response(user_input3)

Llama.generate: prefix-match hit


"Sudden patchy hair loss, also known as alopecia areata, is an autoimmune condition that causes hair loss in small patches on the scalp or other parts of the body. The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Here are some effective treatments and possible causes:\n\n1. Corticosteroids: These medications can help stimulate hair growth by reducing inflammation in the affected areas. They can be applied topically or taken orally.\n2. Immunotherapy: This treatment involves using substances that alter the immune response to promote hair regrowth. One common immunotherapy drug used for alopecia areata is anthralin.\n3. Minoxidil: This medication, which is available over-the-counter, can help slow down hair loss and promote new growth in some cases of alopecia areata.\n4. Hair transplantation: In severe cases where there is extensive hair loss, hair transplantation may be an option to restore hair.\n5. Nutritional defic

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
response(user_input4)

Llama.generate: prefix-match hit


'Treatment for a physical injury to the brain tissue depends on the severity and location of the injury. Here are some common treatments that may be recommended:\n\n1. Emergency care: If the injury is recent, the person may require emergency care to address any life-threatening conditions such as airway obstruction, breathing difficulties, or excessive bleeding.\n2. Surgery: In some cases, surgery may be necessary to remove blood clots, repair skull fractures, or relieve pressure on the brain.\n3. Medications: Depending on the symptoms, various medications may be prescribed to manage conditions such as seizures, pain, or infections.\n4. Rehabilitation: Rehabilitation is an essential part of treatment for brain injuries. This may include physical therapy to help with mobility and strength, occupational therapy to improve fine motor skills and daily living activities, speech therapy to address communication difficulties, and cognitive rehabilitation to improve memory, attention, and prob

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
response(user_input5)

Llama.generate: prefix-match hit


'If a person has fractured their leg during a hiking trip, it is important that they receive proper medical attention as soon as possible. Here are some necessary precautions and treatment steps:\n\n1. Assess the situation: Check the severity of the injury and ensure the safety of both the injured person and any companions. If the fracture is open or the person is in severe pain, immobilize the leg with a splint or a makeshift sling and call for emergency medical help.\n\n2. Provide first aid: Apply pressure to control bleeding if necessary, and elevate the injured leg to reduce swelling and discomfort. Use a sterile dressing to cover any wounds.\n\n3. Immobilize the leg: Use a splint, sling, or other immobilizing device to keep the leg stable and prevent further damage. Make sure it is securely fastened but not too tight.\n\n4. Transport the person: If possible, carry the person to a safe location for further medical assistance. If transporting them is difficult, consider calling for 

### Observations

- The LLM provides coherent and medically plausible responses based on its pretrained knowledge.
- However, answers are generated without referencing any external medical source, which may lead to hallucinations or outdated recommendations.
- For protocol-based questions (e.g., sepsis management), responses are high-level and may lack institution-specific steps.
- This highlights the limitation of using an LLM alone for clinical decision support and motivates the need for Retrieval-Augmented Generation (RAG).
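
Several of the raw outputs above stop mid-sentence; the Query 2 answer, for instance, ends at "Append", which suggests the `max_tokens=256` limit was reached. The following is a minimal sketch of how that could be detected, assuming the completion dictionary returned by `llm(...)` carries an OpenAI-style `finish_reason` field in each choice (`"length"` when the token limit was hit). The helper name `response_with_status` and the stand-in `fake_llm` are hypothetical and exist only so the example runs without the model:

```python
def response_with_status(llm, query, max_tokens=256, temperature=0.2, top_p=0.95, top_k=50):
    """Return (text, truncated) for a single completion.

    `llm` is any callable with the same keyword interface as the Llama
    object used above; `truncated` is True when generation stopped because
    the max_tokens limit was reached.
    """
    prompt = f"[INST] {query} [/INST]"
    out = llm(prompt=prompt, max_tokens=max_tokens, temperature=temperature,
              top_p=top_p, top_k=top_k)
    choice = out["choices"][0]
    # Assumption: finish_reason is "length" on truncation and "stop" otherwise.
    truncated = choice.get("finish_reason") == "length"
    return choice["text"].strip(), truncated


def fake_llm(prompt, max_tokens, **_):
    """Hypothetical stand-in for the model: emits 300 words, cut at max_tokens."""
    words = ["token"] * 300
    cut = words[:max_tokens]
    reason = "length" if len(cut) < len(words) else "stop"
    return {"choices": [{"text": " ".join(cut), "finish_reason": reason}]}


text, truncated = response_with_status(fake_llm, "example question", max_tokens=256)
print(truncated)
```

With the real model, passing `llm` in place of `fake_llm` and raising `max_tokens` whenever `truncated` is true would be one way to avoid cut-off answers.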

## Question Answering using LLM with Prompt Engineering

In [None]:
system_prompt = """
You are a medical knowledge assistant helping healthcare professionals quickly summarize information.
You must follow these rules:

1) Do NOT invent facts. If you are uncertain or the question requires specifics you do not have, say:
   "Insufficient information to answer with certainty."
2) Do NOT provide individualized medical advice. Provide general clinical information only.
3) Keep the answer structured and concise, optimized for clinical scanning.
4) If medications are discussed, avoid exact dosing unless explicitly asked AND you are confident.
5) Use clear headings and bullet points.
"""

answer_format = """
Return your answer in this exact structure:

1) Summary (2-3 bullets)
2) Key clinical features / symptoms
3) Diagnostic approach
4) Management / treatment
5) Red flags / when to escalate
6) Notes / limitations (what you are assuming or what is missing)
"""

In [None]:
param_grid = [
    {"temperature":0.0, "max_tokens":300, "top_p":0.95, "top_k":50},
    {"temperature":0.1, "max_tokens":350, "top_p":0.95, "top_k":50},
    {"temperature":0.2, "max_tokens":400, "top_p":0.90, "top_k":40},
    {"temperature":0.3, "max_tokens":400, "top_p":0.95, "top_k":20},
    {"temperature":0.5, "max_tokens":450, "top_p":0.98, "top_k":50},
]

user_question = "What is the protocol for managing sepsis in a critical care unit?"

# The prompt does not depend on the sampling parameters, so build it once.
prompt = system_prompt + "\n\n" + answer_format + f"\n\nUser question:\n{user_question}\n\nAssistant answer:\n"

for i, p in enumerate(param_grid, 1):
    ans = response(prompt, **p)
    print(f"\n--- Combo {i}: {p} ---\n{ans}\n")

Llama.generate: prefix-match hit



--- Combo 1: {'temperature': 0.0, 'max_tokens': 300, 'top_p': 0.95, 'top_k': 50} ---
1) Summary:
   - Sepsis is a life-threatening condition caused by a dysregulated response to infection.
   - Septic shock occurs when severe sepsis progresses and leads to tissue hypoperfusion and organ dysfunction.
   - Early recognition and intervention are crucial for improving outcomes in sepsis patients.

2) Key clinical features / symptoms:
   - Fever or hypothermia
   - Tachycardia (heart rate >90 bpm) or bradycardia (<60 bpm)
   - Respiratory distress (tachypnea, dyspnea, or respiratory failure)
   - Altered mental status (confusion or decreased level of consciousness)
   - Decreased urine output

3) Diagnostic approach:
   - Suspect sepsis in patients with suspected or confirmed infection and signs/symptoms of organ dysfunction.
   - Laboratory tests: Complete blood count, lactate levels, procalcitonin, and microbiological cultures.
   - Imaging studies may be used to identify the source of i

Llama.generate: prefix-match hit



--- Combo 2: {'temperature': 0.1, 'max_tokens': 350, 'top_p': 0.95, 'top_k': 50} ---
1) Summary:
   - Sepsis is a life-threatening condition caused by a dysregulated host response to infection.
   - Septic shock occurs when severe sepsis progresses to circulatory, cellular, and metabolic abnormalities.
   - Early recognition and intervention are crucial for improving outcomes.

2) Key clinical features / symptoms:
   - Fever or hypothermia
   - Tachycardia or bradycardia
   - Respiratory distress (tachypnea, altered breath sounds)
   - Altered mental status
   - Decreased urine output
   - Hypotension or lactic acidosis

3) Diagnostic approach:
   - Suspect sepsis based on clinical suspicion and laboratory results (leukocytosis, leukopenia, thrombocytopenia, elevated CRP, lactate).
   - Confirm infection source.
   - Obtain blood cultures before administering antibiotics.

4) Management / treatment:
   - Initiate broad-spectrum antibiotics based on suspected infection and local microb

Llama.generate: prefix-match hit



--- Combo 3: {'temperature': 0.2, 'max_tokens': 400, 'top_p': 0.9, 'top_k': 40} ---
1) Summary:
- Sepsis is a life-threatening condition caused by a dysregulated response to infection.
- Septic shock occurs when severe sepsis progresses and leads to tissue hypoperfusion and organ dysfunction.
- Early recognition and intervention are crucial for improving outcomes in sepsis patients.

2) Key clinical features / symptoms:
- Fever or hypothermia
- Tachycardia (heart rate >90 bpm) or bradycardia (<60 bpm)
- Respiratory distress (tachypnea, dyspnea, or ARDS)
- Altered mental status
- Decreased urine output
- Hypotension or lactic acidosis

3) Diagnostic approach:
- Suspect sepsis in any patient with suspected infection and organ dysfunction.
- Laboratory tests: CBC, BUN/creatinine, glucose, electrolytes, LFTs, coagulation panel, lactate, blood cultures, and microbiological cultures.
- Imaging studies: Chest X-ray, ultrasound, or CT scan for source identification.

4) Management / treatment

Llama.generate: prefix-match hit



--- Combo 4: {'temperature': 0.3, 'max_tokens': 400, 'top_p': 0.95, 'top_k': 20} ---
1) Summary:
   - Sepsis is a life-threatening condition caused by a dysregulated host response to infection.
   - Septic shock occurs when severe sepsis progresses and leads to tissue hypoperfusion and organ dysfunction.
   - Early recognition, appropriate antibiotic therapy, and supportive measures are crucial in managing sepsis in a critical care unit.

2) Key clinical features / symptoms:
   - Fever or hypothermia
   - Tachycardia or bradycardia
   - Respiratory distress (tachypnea, dyspnea, or ARDS)
   - Altered mental status
   - Decreased urine output
   - Hypotension or lactic acidosis

3) Diagnostic approach:
   - Suspect sepsis based on clinical suspicion and laboratory findings (leukocytosis, leukopenia, thrombocytopenia, elevated CRP, lactate levels).
   - Confirm infection source.
   - Obtain blood cultures before administering antibiotics if possible.
   - Monitor hemodynamic status and o

Llama.generate: prefix-match hit



--- Combo 5: {'temperature': 0.5, 'max_tokens': 450, 'top_p': 0.98, 'top_k': 50} ---
1) Summary:
- Sepsis is a life-threatening condition caused by a dysregulated response to infection.
- Septic shock occurs when severe sepsis progresses and causes circulatory, cellular, and metabolic abnormalities.
- Early recognition and management are crucial for improving outcomes.

2) Key clinical features / symptoms:
- Fever or hypothermia
- Tachycardia (heart rate > 90 bpm) or bradycardia (<60 bpm)
- Respiratory distress, including tachypnea and altered breath sounds
- Altered mental status
- Decreased urine output
- Hypotension or lactic acidosis

3) Diagnostic approach:
- Suspect sepsis in any patient with suspected infection and organ dysfunction.
- Laboratory tests: Complete blood count, metabolic panel, lactate level, coagulation studies, microbiological cultures (blood, urine, sputum, etc.)
- Imaging studies: Chest X-ray, abdominal ultrasound or CT scan if source of infection is unclear.


### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_question = "What is the protocol for managing sepsis in a critical care unit?"
user_input = (system_prompt + "\n\n" + answer_format + "\n\nUser question:\n" + user_question + "\n\nAssistant answer:\n")
response(user_input)

Llama.generate: prefix-match hit


'1) Summary:\n- Sepsis is a life-threatening condition caused by a dysregulated response to infection.\n- Septic shock occurs when severe sepsis progresses and leads to tissue hypoperfusion and organ failure.\n- Early recognition and intervention are crucial for improving outcomes in sepsis patients.\n\n2) Key clinical features / symptoms:\n- Fever or hypothermia\n- Tachycardia (heart rate >90 bpm) or bradycardia (<60 bpm)\n- Respiratory distress (tachypnea, dyspnea, or altered respiratory status)\n- Decreased urine output (less than 0.5 ml/kg/h for adults)\n- Altered mental status (confusion, disorientation, or decreased level of consciousness)\n- Hypotension (systolic blood pressure <90 mmHg or mean arterial pressure <65 mmHg)\n\n3) Diagnostic approach:\n- Suspect sepsis in patients with suspected infection and organ dysfunction.\n- Obtain blood cultures before starting antibiotics, if possible.'
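
The query cells in this section assemble the same prompt by string concatenation each time. A small helper along the following lines could keep that assembly in one place. `build_prompt` is a hypothetical name, the short strings in the usage lines are placeholders rather than the `system_prompt` and `answer_format` defined earlier, and unlike the inline concatenation this sketch also strips surrounding whitespace from each part:

```python
def build_prompt(system_prompt: str, answer_format: str, question: str) -> str:
    """Join the instructions, the answer template, and the user question."""
    return (
        system_prompt.strip()
        + "\n\n"
        + answer_format.strip()
        + f"\n\nUser question:\n{question.strip()}\n\nAssistant answer:\n"
    )


# Placeholder inputs, used only to show the resulting layout.
demo = build_prompt("You are a medical knowledge assistant.", "1) Summary", "What is sepsis?")
print(demo)
```

Each query cell could then reduce to something like `response(build_prompt(system_prompt, answer_format, user_question2))`.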

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_question2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
user_input2 = (system_prompt + "\n\n" + answer_format + "\n\nUser question:\n" + user_question2 + "\n\nAssistant answer:\n")
response(user_input2)

Llama.generate: prefix-match hit


"1) Summary:\n- Appendicitis is a common inflammatory condition of the appendix.\n- It typically begins as an infection in the appendix and can lead to perforation and peritonitis if left untreated.\n- The standard treatment for appendicitis is surgical removal of the appendix (appendectomy).\n\n2) Key clinical features / symptoms:\n- Right lower quadrant abdominal pain, often localized around McBurney's point.\n- Anorexia and nausea with or without vomiting.\n- Fever and leukocytosis are common but not always present.\n\n3) Diagnostic approach:\n- Clinical examination, including palpation of the abdomen.\n- Laboratory tests such as complete blood count (CBC), urinalysis, and imaging studies like ultrasound or CT scan.\n\n4) Management / treatment:\n- Appendectomy is the definitive treatment for appendicitis.\n- Antibiotics may be used preoperatively to manage infection but are not a substitute for surgery.\n\n5) Red flags / when to escalate:\n- Severe"

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_question3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
user_input3 = (system_prompt + "\n\n" + answer_format + "\n\nUser question:\n" + user_question3 + "\n\nAssistant answer:\n")
response(user_input3)

Llama.generate: prefix-match hit


'1) Summary:\n- Sudden patchy hair loss refers to the appearance of localized bald spots on the scalp.\n- Common causes include alopecia areata, telogen effluvium, and traction alopecia.\n- Treatment options include topical treatments, systemic medications, and hair restoration techniques.\n\n2) Key clinical features / symptoms:\n- Sudden onset of well-defined circular or oval bald patches on the scalp.\n- Hair loss may involve single or multiple patches.\n- Alopecia areata is often associated with itching or tingling sensation in the affected area.\n\n3) Diagnostic approach:\n- Clinical evaluation based on the distribution and pattern of hair loss.\n- Exclusion of other causes such as thyroid disorders, nutritional deficiencies, or medications.\n- Dermoscopic examination may be helpful for confirming the diagnosis.\n\n4) Management / treatment:\n- Topical treatments: Corticosteroids (injections, creams, or solutions) to reduce inflammation and promote regrowth.\n- Systemic medications

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_question4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
user_input4 = (system_prompt + "\n\n" + answer_format + "\n\nUser question:\n" + user_question4 + "\n\nAssistant answer:\n")
response(user_input4)

Llama.generate: prefix-match hit


'1) Summary:\n- Brain injuries can result in various degrees of impairment.\n- Treatment depends on the severity and location of the injury.\n- Rehabilitation is a crucial component of management.\n\n2) Key clinical features / symptoms:\n- Depending on the area of the brain affected, symptoms may include: motor deficits, sensory disturbances, speech difficulties, cognitive impairment, emotional instability, or changes in behavior.\n\n3) Diagnostic approach:\n- Neurological examination is essential to assess the extent and location of damage.\n- Imaging studies (CT or MRI) can help confirm the diagnosis and rule out other conditions.\n\n4) Management / treatment:\n- Acute phase management may involve surgery, intensive care support, and medication for symptoms such as seizures, pain, or increased intracranial pressure.\n- Rehabilitation is essential to restore function and improve quality of life. This can include physical therapy, occupational therapy, speech therapy, and cognitive reh

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_question5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
user_input5 = (system_prompt + "\n\n" + answer_format + "\n\nUser question:\n" + user_question5 + "\n\nAssistant answer:\n")
response(user_input5)

Llama.generate: prefix-match hit


'1) Summary:\n- A leg fracture is a common injury during hiking due to falls or twisting motions.\n- Immediate first aid includes RICE (Rest, Ice, Compression, Elevation).\n- Seek medical attention for proper diagnosis and treatment.\n\n2) Key clinical features / symptoms:\n- Swelling, bruising, and pain in the affected leg.\n- Difficulty weight-bearing or walking.\n- Deformity or misalignment of the bone (in some cases).\n\n3) Diagnostic approach:\n- Physical examination to assess the severity and location of the fracture.\n- X-rays for proper diagnosis and evaluation of alignment and displacement.\n\n4) Management / treatment:\n- Immobilization with a cast or brace to maintain alignment and promote healing.\n- Pain management through medications, ice packs, or other methods.\n- Regular follow-up appointments to monitor progress and adjust the treatment plan as needed.\n\n5) Red flags / when to escalate:\n- Significant swelling, discoloration, or coolness in the affected limb.\n- Inab

### Observations (Prompt Engineering)

- Adding a system prompt and structured answer format noticeably improves readability and clinical scannability.
- Lower temperature values (0.0–0.2) produce more deterministic and protocol-like answers, which are preferable for healthcare use cases.
- Higher temperature values introduce more variability and narrative detail but may reduce precision.
- Explicitly instructing the model to avoid guessing is intended to reduce hallucinated medical details, although this notebook does not measure that effect.
- Prompt engineering alone improves response quality but still does not guarantee factual grounding.
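
One way to make the temperature observations measurable is to generate the same prompt several times per setting and compare the outputs. The sketch below is only an illustration under stated assumptions: `mean_pairwise_similarity` is a hypothetical helper that uses `difflib.SequenceMatcher` from the standard library as a rough text-similarity score, and the sample strings are invented placeholders, not model outputs.

```python
from difflib import SequenceMatcher
from itertools import combinations


def mean_pairwise_similarity(outputs):
    """Average SequenceMatcher ratio over all pairs of outputs (1.0 = identical)."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0  # zero or one output: nothing to disagree with
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)


# Placeholder strings standing in for repeated generations at two settings.
low_temp_runs = ["Give fluids and antibiotics."] * 3
high_temp_runs = [
    "Give fluids and antibiotics.",
    "Start broad-spectrum antibiotics early.",
    "Resuscitate with IV fluids first.",
]

print(mean_pairwise_similarity(low_temp_runs))   # identical strings
print(mean_pairwise_similarity(high_temp_runs))  # lower than the line above
```

With the real model, each list would hold repeated `response(prompt, **p)` outputs for one parameter combination. Because the grid above changes `max_tokens`, `top_p`, and `top_k` together with `temperature`, holding those fixed would be needed to attribute any difference to temperature alone.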

## Data Preparation for RAG

In [None]:
# Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

# Libraries for loading data, chunking, embedding, and vector databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

### Loading the Data

In [None]:
# Load the dataset from Google Drive
from google.colab import drive

# Mount Google Drive (force_remount=True remounts it even if it is already mounted)
drive.mount('/content/drive', force_remount=True)

# Full path to the PDF manual on Drive
base_path = "/content/drive/MyDrive/Colab Notebooks/NLP/medical_diagnosis_manual.pdf"

pdf_loader = PyMuPDFLoader(base_path)

manual = pdf_loader.load()

Mounted at /content/drive


### Data Overview

#### Checking the first 5 pages

In [None]:
for i in range(5):
    print(f"\n===== PAGE {i+1} =====")
    print(manual[i].page_content[:1500])


===== PAGE 1 =====
phfrebelo@gmail.com
1CJNY5XF6O
meant for personal use by phfrebelo@gm
shing the contents in part or full is liable

===== PAGE 2 =====
phfrebelo@gmail.com
1CJNY5XF6O
This file is meant for personal use by phfrebelo@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.

===== PAGE 3 =====
Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    .......................................................................................................................................................................................

#### Checking the number of pages

In [None]:
print("Total pages:", len(manual))

Total pages: 4114


### Data Chunking

In [None]:
# Token-based chunking keeps each chunk within a known token budget for the LLM context window.
# Chunks of up to 800 tokens (cl100k_base) help keep multi-step clinical protocols together,
# and a 100-token overlap reduces the chance that information spanning a chunk boundary is lost.

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=800,
    chunk_overlap=100
)
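
To make the effect of `chunk_size=800` and `chunk_overlap=100` concrete, the sketch below slides a fixed window over a list of token ids. It is a simplified illustration only: `sliding_window_chunks` is a hypothetical helper, and `RecursiveCharacterTextSplitter` additionally prefers to split on separators such as paragraph and line breaks, so real chunk boundaries will differ.

```python
def sliding_window_chunks(tokens, chunk_size=800, chunk_overlap=100):
    """Split a token sequence into windows that share `chunk_overlap` tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    stride = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks

# Toy input: 2000 integer "token ids" standing in for real tokens.
toy_tokens = list(range(2000))
toy_chunks = sliding_window_chunks(toy_tokens)
print([len(c) for c in toy_chunks])  # [800, 800, 600]
```

Each window starts 700 tokens after the previous one, so the last 100 tokens of one chunk are repeated at the start of the next.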

In [None]:
document_chunks = pdf_loader.load_and_split(text_splitter)

In [None]:
len(document_chunks)

6432

In [None]:
document_chunks[0].page_content

'phfrebelo@gmail.com\n1CJNY5XF6O\nmeant for personal use by phfrebelo@gm\nshing the contents in part or full is liable'

In [None]:
document_chunks[-2].page_content

'Y\nYaws 1266-1267\nforest 1379\nY chromosome 3373 (see also Genetic)\nabnormalities of 3005\nYeast infection (see also Fungal infection)\nvaginal 2542, 2544, 2545\nYellow fever 1400, 1429, 1437\nhepatic inflammation in 248\nvaccine against 1172, 1437, 3441\nYellow nail syndrome 732, 1995\npleural effusion in 1997\nYellow skin (see Jaundice)\nYersinia infection 1167, 1256-1257\nY. enterocolitica infection 147\nY. pestis infection 1924\nYew poisoning 3338\nYips 1762\nYo, antibodies to 1056\nYolk sac tumor 2476\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nY\n4103\nphfrebelo@gmail.com\n1CJNY5XF6O\nThis file is meant for personal use by phfrebelo@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action.'

In [None]:
document_chunks[-1].page_content

"Z\nZafirlukast 1879\nZalcitabine 1451\nin children 2854\nZaleplon 1709\nZanamivir 1407\nin influenza 1407, 1929\nZAP-70 (zeta-associated protein 70) deficiency 1092, 1108\nZavanelli maneuver 2680\nZellweger syndrome 2383, 3023\nZenker's diverticulum 125\nZidovudine 1451, 1453\nin children 2854\nZileuton 1881\nin asthma 1880\nZinc 49, 55, 3431-3432\nin common cold 1405\ndeficiency of 11, 49, 55\nin dermatophytoses 705\npoisoning with 3328, 3353\nrecommended dietary allowances for 50\nreference values for 3499\ntoxicity of 49, 55\ncopper deficiency and 49\nin Wilson's disease 52\nZinc oxide 2233\ngelatin formulation of 646, 672\nZinc pyrithione 647\nZinc shakes 55\nZipper injury 3239, 3240\nZiprasidone\nin agitation 1492\nin bipolar disorder 3059\npoisoning with 3347\nin schizophrenia 1566\nZoledronate 359, 361, 848\nZollinger-Ellison syndrome 95, 199, 200-201, 910\nmastocytosis vs 1125\nMenetrier's disease vs 132\npeptic ulcer disease vs 134\nZolmitriptan 1721\nZolpidem 1709, 3103\nZon

### Embedding

In [None]:
# We use a transformer-based embedding model to convert text chunks into dense vectors.
# These embeddings capture semantic similarity, enabling retrieval of relevant medical passages
# even when exact keywords are not present in the query.

embedding_model = SentenceTransformerEmbeddings(model_name='all-MiniLM-L6-v2')

In [None]:
embedding_1 = embedding_model.embed_query(document_chunks[0].page_content)
embedding_2 = embedding_model.embed_query(document_chunks[1].page_content)

In [None]:
print("Dimension of the embedding vector ",len(embedding_1))
len(embedding_1)==len(embedding_2)

Dimension of the embedding vector  384


True
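
Semantic similarity between two chunks is usually measured as the cosine of the angle between their embedding vectors. A minimal sketch follows, with toy 3-dimensional vectors standing in for the 384-dimensional embeddings; the helper name `cosine_similarity` and the vectors are illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Parallel vectors score close to 1, orthogonal vectors score 0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(round(same_direction, 6), round(orthogonal, 6))
```

In this notebook, `cosine_similarity(embedding_1, embedding_2)` would compare the first two chunks.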

In [None]:
embedding_1,embedding_2

([-0.08645728975534439,
  0.05498664081096649,
  0.02757289819419384,
  -0.01819179765880108,
  0.08390454202890396,
  -0.018404480069875717,
  0.0915946364402771,
  0.08510835468769073,
  -0.013366815634071827,
  -0.019132614135742188,
  0.06325002014636993,
  -0.015051125548779964,
  -0.015580685809254646,
  -0.04006664827466011,
  -0.06319946050643921,
  -0.06379593908786774,
  -0.021549884229898453,
  -0.010457342490553856,
  -0.08835150301456451,
  -0.010510949417948723,
  -0.030389299616217613,
  -0.016328884288668633,
  -0.025726694613695145,
  0.016658402979373932,
  -0.058018606156110764,
  0.014258375391364098,
  -0.049273569136857986,
  0.0807180404663086,
  0.0036870127078145742,
  -0.04669046029448509,
  0.034854330122470856,
  0.06120111793279648,
  0.05682525411248207,
  -0.012309207580983639,
  0.030158475041389465,
  0.08325738459825516,
  -0.09574104845523834,
  -0.08339398354291916,
  -0.04181931912899017,
  -0.05692315846681595,
  0.010085363872349262,
  -0.04335660

### Vector Database

In [None]:
out_dir = 'manual_db'

os.makedirs(out_dir, exist_ok=True)

In [None]:
# The vector database stores embeddings along with document metadata.
# This enables efficient semantic similarity search and allows retrieved answers
# to be traced back to specific sections of the medical manual.

vectorstore = Chroma.from_documents(
    document_chunks,
    embedding_model,
    persist_directory=out_dir
)

In [None]:
vectorstore = Chroma(persist_directory=out_dir,embedding_function=embedding_model)

In [None]:
vectorstore.embeddings

HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
), model_name='all-MiniLM-L6-v2', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False, show_progress=False)
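
The printed model configuration reports `max_seq_length: 256`, while the splitter targets chunks of up to 800 tokens. Sentence-transformers models truncate input beyond `max_seq_length`, so the tail of a long chunk may not influence its embedding. The two token counts come from different tokenizers (cl100k_base for chunking, the model's own tokenizer for embedding), so the sketch below is only a rough estimate under stated assumptions; `embedded_fraction` is a hypothetical helper.

```python
def embedded_fraction(chunk_tokens, max_seq_length=256):
    """Rough share of a chunk that fits in the embedder's input window.

    Assumes, as a simplification, that chunk_tokens is counted with the
    embedding model's tokenizer; cl100k_base counts will differ somewhat.
    """
    if chunk_tokens <= 0:
        return 1.0  # nothing to truncate
    return min(1.0, max_seq_length / chunk_tokens)

for n in (200, 256, 800):
    print(n, round(embedded_fraction(n), 2))
```

Under these assumptions a full 800-token chunk would have roughly a third of its text represented in the embedding. If retrieval appears to miss content near the end of chunks, a smaller `chunk_size` or an embedding model with a longer input window are options worth evaluating.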

In [None]:
vectorstore.similarity_search("appendicitis sepsis",k=3)

[Document(metadata={'title': 'The Merck Manual of Diagnosis & Therapy, 19th Edition', 'keywords': '', 'total_pages': 4114, 'producer': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'creator': 'Atop CHM to PDF Converter', 'creationDate': 'D:20120615054440Z', 'format': 'PDF 1.7', 'file_path': '/content/drive/MyDrive/Colab Notebooks/NLP/medical_diagnosis_manual.pdf', 'subject': '', 'modDate': 'D:20251216052241Z', 'page': 174, 'moddate': '2025-12-16T05:22:41+00:00', 'creationdate': '2012-06-15T05:44:40+00:00', 'author': '', 'trapped': '', 'source': '/content/drive/MyDrive/Colab Notebooks/NLP/medical_diagnosis_manual.pdf'}, page_content="• Surgical removal\n• IV fluids and antibiotics\nTreatment of acute appendicitis is open or laparoscopic appendectomy; because treatment delay\nincreases mortality, a negative appendectomy rate of 15% is considered acceptable. The surgeon can\nusually remove the appendix even if perforated. Occasionally, the appendix is difficult to locate: In these\ncase

### Retriever

In [None]:
DEFAULT_K = 5

retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": DEFAULT_K}
)

In [None]:
# Sanity check
retriever.get_relevant_documents("appendicitis symptoms treatment")

[Document(metadata={'creationDate': 'D:20120615054440Z', 'title': 'The Merck Manual of Diagnosis & Therapy, 19th Edition', 'trapped': '', 'modDate': 'D:20251216052241Z', 'keywords': '', 'creator': 'Atop CHM to PDF Converter', 'total_pages': 4114, 'format': 'PDF 1.7', 'moddate': '2025-12-16T05:22:41+00:00', 'subject': '', 'page': 173, 'file_path': '/content/drive/MyDrive/Colab Notebooks/NLP/medical_diagnosis_manual.pdf', 'producer': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'creationdate': '2012-06-15T05:44:40+00:00', 'source': '/content/drive/MyDrive/Colab Notebooks/NLP/medical_diagnosis_manual.pdf', 'author': ''}, page_content="Etiology\nAppendicitis is thought to result from obstruction of the appendiceal lumen, typically by lymphoid\nhyperplasia, but occasionally by a fecalith, foreign body, or even worms. The obstruction leads to\ndistention, bacterial overgrowth, ischemia, and inflammation. If untreated, necrosis, gangrene, and\nperforation occur. If the perforation is conta

In [None]:
def get_retriever(k: int):
    return vectorstore.as_retriever(
        search_type="similarity",
        search_kwargs={"k": k}
    )
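
Conceptually, a similarity retriever with `k` results scores every stored vector against the query vector and keeps the `k` best. The embedding pipeline printed earlier ends in a `Normalize()` layer, so embeddings have unit length and a plain dot product equals the cosine similarity. The toy store, vectors, and `top_k_pages` helper below are illustrative assumptions, not Chroma's implementation, which relies on an approximate nearest-neighbour index rather than a full scan.

```python
import heapq

def top_k_pages(query_vec, store, k=5):
    """Return the k (score, page) pairs with the highest dot-product score.

    `store` is a list of (page, unit_vector) pairs.
    """
    scored = (
        (sum(q * v for q, v in zip(query_vec, vec)), page)
        for page, vec in store
    )
    return heapq.nlargest(k, scored)

# Three toy unit vectors tagged with hypothetical page indices.
toy_store = [
    (173, [1.0, 0.0]),
    (174, [0.8, 0.6]),
    (900, [0.0, 1.0]),
]
results = top_k_pages([1.0, 0.0], toy_store, k=2)
print([page for _, page in results])  # [173, 174]
```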

### System and User Prompt Template

In [None]:
qna_system_message = """
You are a medical knowledge assistant.

Rules:
- Answer ONLY using the reference sections provided.
- Do NOT mention "context", "retrieval", "documents", "excerpts", "passages", or "provided text".
- Do NOT describe your process.
- If the answer is not present, respond exactly: "I don't know".
- Include citations in the answer using this exact format: (Merck Manual, p. X).
- If you make multiple factual claims, include citations for each major claim.
"""

In [None]:
qna_user_message_template = """
###Context
Source text:
{context}

###Question
{question}
"""

### Response Function

In [None]:
def page_label(d):
    p = d.metadata.get("page", None)
    if isinstance(p, int):
        return str(p + 1)  # PyMuPDFLoader uses 0-based indexing
    return "NA"
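
A quick check of the page-label logic, using a small stand-in class instead of a real LangChain `Document`. The `FakeDoc` class is an assumption made for illustration, and `page_label` is repeated so the snippet runs on its own.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDoc:
    """Minimal stand-in exposing only the `metadata` attribute."""
    metadata: dict = field(default_factory=dict)

def page_label(d):
    p = d.metadata.get("page", None)
    if isinstance(p, int):
        return str(p + 1)  # convert the 0-based index to a 1-based PDF page
    return "NA"

print(page_label(FakeDoc({"page": 173})))  # 174
print(page_label(FakeDoc({})))             # NA
```

The label is the 1-based position of the page in the PDF file, which may differ from the page number printed in the manual itself.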

In [None]:
def generate_rag_response(user_input, k=3, max_tokens=256, temperature=0.2, top_p=0.95, top_k=50):
    r = get_retriever(k)
    docs = r.get_relevant_documents(user_input)

    # Consistent, model-friendly page markers
    context_for_query = "\n\n".join(
      [f"[p. {page_label(d)}] {d.page_content}" for d in docs]
    )

    user_message = qna_user_message_template.format(context=context_for_query, question=user_input)
    prompt = qna_system_message + "\n" + user_message

    out = llm(prompt=prompt, max_tokens=max_tokens, temperature=temperature, top_p=top_p, top_k=top_k)
    return out["choices"][0]["text"].strip()

In [None]:
rag_param_grid = [
    {"k": 2, "temperature": 0.0, "max_tokens": 250},
    {"k": 3, "temperature": 0.0, "max_tokens": 300},
    {"k": 5, "temperature": 0.0, "max_tokens": 350},
    {"k": 5, "temperature": 0.2, "max_tokens": 350},
    {"k": 8, "temperature": 0.2, "max_tokens": 450},
]

q = "What is the protocol for managing sepsis in a critical care unit?"

for i, p in enumerate(rag_param_grid, 1):
    ans = generate_rag_response(q, k=p["k"], temperature=p["temperature"], max_tokens=p["max_tokens"])
    print(f"\n--- RAG Combo {i}: {p} ---\n{ans}\n")

Llama.generate: prefix-match hit



--- RAG Combo 1: {'k': 2, 'temperature': 0.0, 'max_tokens': 250} ---
Answer:
The protocol for managing sepsis in a critical care unit includes obtaining cultures of blood and any other appropriate specimens, administering empiric antibiotics after cultures are obtained, adjusting antibiotics according to culture and susceptibility testing results, surgically draining any abscesses, and removing any internal devices that may be the suspected source of bacteria (Merck Manual, p. 1308).



Llama.generate: prefix-match hit



--- RAG Combo 2: {'k': 3, 'temperature': 0.0, 'max_tokens': 300} ---
Answer:
The protocol for managing sepsis in a critical care unit includes obtaining cultures of blood and any other appropriate specimens (Merck Manual, p. 1308). Early treatment with an appropriate antimicrobial regimen is given after cultures are obtained, and therapy is adjusted according to culture and susceptibility testing, surgical drainage of abscesses, and removal of internal devices suspected as sources of bacteria (Merck Manual, p. 1308).

Citations:
(Merck Manual, p. 1308)



Llama.generate: prefix-match hit



--- RAG Combo 3: {'k': 5, 'temperature': 0.0, 'max_tokens': 350} ---
###Answer
In patients suspected of having sepsis or septic shock, cultures are obtained of blood and any other appropriate specimens (Merck Manual, p. 1308). Empiric antibiotics are given after appropriate cultures are obtained, with early treatment appearing to improve survival (Merck Manual, p. 2457). Antibiotics are adjusted according to culture and susceptibility testing, surgical drainage of abscesses is performed, and internal devices suspected of being the source of bacteria are removed (Merck Manual, p. 2457). If bacterial cultures show no growth by 48 hours and the neonate appears well, antibiotics may be stopped (Merck Manual, p. 2457). General supportive measures, including respiratory and hemodynamic management, are combined with antibiotic treatment (Merck Manual, p. 2457). In early-onset sepsis, initial therapy should include ampicillin or penicillin G plus an aminoglycoside, with cefotaxime added if me

Llama.generate: prefix-match hit



--- RAG Combo 4: {'k': 5, 'temperature': 0.2, 'max_tokens': 350} ---
###Answer
In patients with suspected bacteremia, sepsis, or septic shock, empiric antibiotics are given after appropriate cultures are obtained (Merck Manual, p. 1182). Early treatment appears to improve survival (Merck Manual, p. 1182). Antibiotics are adjusted according to culture and susceptibility testing, surgical drainage of abscesses is performed, and internal devices suspected as sources of bacteria are removed (Merck Manual, p. 1182). If bacterial cultures show no growth after 48 hours and the neonate appears well, antibiotics may be stopped (Merck Manual, p. 1309). General supportive measures, including respiratory and hemodynamic management, are combined with antibiotic treatment (Merck Manual, p. 1182).

In early-onset sepsis, initial therapy should include ampicillin or penicillin G plus an aminoglycoside, with cefotaxime added for suspected meningitis (Merck Manual, p. 1309). For late-onset sepsis, vanc

Llama.generate: prefix-match hit



--- RAG Combo 5: {'k': 8, 'temperature': 0.2, 'max_tokens': 450} ---
Answer:
The management of sepsis in a critical care unit involves prompt empiric antibiotic therapy, supportive measures such as fluid resuscitation and hemodynamic support, and surgical intervention when necessary. The specific antibiotic regimen depends on the suspected source and sensitivity patterns common to that specific unit. Once culture and sensitivity results are available, the antibiotic regimen is adjusted accordingly. Abscesses must be drained, and necrotic tissues excised to eliminate septic foci. Normalization of blood glucose improves outcome in critically ill patients. Corticosteroid therapy may also be beneficial. Activated protein C, a recombinant drug with fibrinolytic and anti-inflammatory activity, seems beneficial for severe sepsis and septic shock if initiated early but carries the risk of bleeding. (Merck Manual, p. 2457)

References:
The Merck Manual of Diagnosis & Therapy, 19th Edition. Sec

## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Answer:
The protocol for managing sepsis in a critical care unit includes obtaining cultures of blood and any other appropriate specimens if bacteremia, sepsis, or septic shock is suspected (Merck Manual, p. 1308). Early treatment with an appropriate antimicrobial regimen is given after cultures are obtained, and therapy is adjusted according to culture and susceptibility testing results (Merck Manual, p. 2391). Surgical intervention may be necessary for draining abscesses or removing internal devices that are the suspected source of bacteria (Merck Manual, p. 2391).

Citations:
(Merck Manual, p. 1308)
(Merck Manual, p. 2391)


### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(generate_rag_response(user_input2))

Llama.generate: prefix-match hit


###Answer
The common symptoms for appendicitis include epigastric or periumbilical pain followed by brief nausea, vomiting, and anorexia; after a few hours, the pain shifts to the right lower quadrant. Pain increases with cough and motion. Direct and rebound tenderness located at McBurney's point (junction of the middle and outer thirds of the line joining the umbilicus to the anterior superior spine) is a classic sign. Additional signs include pain felt in the right lower quadrant with palpation of the left lower quadrant (Rovsing sign), an increase in pain from passive extension of the right hip joint that stretches the iliopsoas muscle (psoas sign), or pain caused by passive internal rotation of the flexed thigh (obturator sign). Low-grade fever is common. However, these classic findings appear in less than 50% of patients, and many variations of symptoms and signs occur (Merck Manual, p. 174).

Appendicitis cannot be cured via medicine alone; treatment involves surgical removal of 

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_input3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(generate_rag_response(user_input3))

Llama.generate: prefix-match hit


Based on the provided text, alopecia areata is a type of sudden patchy hair loss that can affect any hairy area, including the scalp. It is thought to be an autoimmune disorder affecting genetically susceptible people exposed to unclear environmental triggers. The most common treatments for alopecia areata include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). The text also mentions that hormonal modulators such as oral contraceptives or spironolactone may be useful for female-pattern hair loss associated with hyperandrogenemia. It is important to note that the cause of alopecia areata is not always clear, but it is believed to be an autoimmune disorder. (Merck Manual, p. 859)

Here are some citations for specific treatments and causes mentioned in the text:

- Topical, intralesional, or systemic corticoster


### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(generate_rag_response(user_input4))

Llama.generate: prefix-match hit


Answer:
Initial treatment includes ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. Surgery may be needed in patients with more severe injury to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas (Merck Manual, p. 3405). In the first few days after the injury, maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium are important (Merck Manual, p. 3405). Subsequently, many patients require rehabilitation (Merck Manual, p. 3405).


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(user_input5))

Llama.generate: prefix-match hit


###Answer
The person with a fractured leg should first receive emergency treatment for any life-threatening injuries. Arterial injuries may require surgical repair (Merck Manual, p. 3391). Nerve injuries may necessitate nerve conduction studies (Merck Manual, p. 3391). Most injuries are immobilized immediately to prevent further damage and decrease pain using splints or casts (Merck Manual, p. 3391). RICE (rest, ice, compression, elevation) is recommended for soft-tissue injuries (Merck Manual, p. 3391). Pain is typically treated with opioids (Merck Manual, p. 1623). Definitive treatment often involves reduction, which may require analgesia or sedation and closed or open reduction depending on the injury severity (Merck Manual, p. 3391). Immobilization is maintained using various surgical hardware if needed (Merck Manual, p. 3391). Patients should be instructed to keep the cast dry, avoid putting objects inside it, inspect its edges and skin daily


### Observations (RAG-based Answer)

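- The responses are grounded in retrieved excerpts from the Merck Manual and carry page citations, which reduces hallucinated detail.
- Compared to the LLM-only answers, the RAG responses are more specific and can be traced back to a source page.
- The prompt instructs the model to reply "I don't know" when the retrieved context lacks the answer; none of the five queries above triggered this fallback, so it is not exercised by these runs.
- Retrieval quality strongly influences answer completeness. Increasing `k` adds coverage for complex questions, but it can also pull in passages that may not match the question's scope, such as the neonatal sepsis guidance that appears at `k=5`.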
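
Because every answer is asked to cite pages in the form `(Merck Manual, p. X)`, citations can be spot-checked against the pages that were actually retrieved. The helpers below are a hypothetical sketch (the names and the regular expression are assumptions) built only on the citation format from the system prompt.

```python
import re

# Matches "(Merck Manual, p. 1308)" and "(Merck Manual, p. 856, 859)".
CITATION_RE = re.compile(r"\(Merck Manual, p\.\s*([0-9,\s]+)\)")

def cited_pages(answer):
    """Collect every page number cited in the answer text."""
    pages = set()
    for group in CITATION_RE.findall(answer):
        pages.update(int(n) for n in re.findall(r"\d+", group))
    return pages

def unsupported_citations(answer, retrieved_pages):
    """Pages cited in the answer that were not among the retrieved pages."""
    return cited_pages(answer) - set(retrieved_pages)

toy_answer = (
    "Cultures are obtained (Merck Manual, p. 1308). "
    "Any hairy area may be involved (Merck Manual, p. 856, 859)."
)
print(sorted(cited_pages(toy_answer)))                           # [856, 859, 1308]
print(sorted(unsupported_citations(toy_answer, {1308, 859})))    # [856]
```

In the notebook, the set of retrieved pages could be built from `page_label(d)` for each retrieved document whose label is numeric.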

### Fine-tuning Generation Parameters

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
print(generate_rag_response(user_input, temperature=0.5))

Llama.generate: prefix-match hit


###Answer
For suspected bacteremia, sepsis, or septic shock, cultures are obtained of blood and any other appropriate specimens (Merck Manual, p. 1308). Empiric antibiotics are given after appropriate cultures are obtained, and early treatment improves survival (Merck Manual, p. 2401). Antibiotics should be adjusted according to culture and susceptibility testing, surgical drainage of abscesses is performed, and internal devices suspected as the source of bacteria are removed (Merck Manual, p. 2401).


### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(generate_rag_response(user_input2, temperature=0.1, max_tokens=350))

Llama.generate: prefix-match hit


###Answer
The common symptoms for appendicitis include epigastric or periumbilical pain followed by brief nausea, vomiting, and anorexia; after a few hours, the pain shifts to the right lower quadrant. Pain increases with cough and motion. Classic signs are right lower quadrant direct and rebound tenderness located at McBurney's point (junction of the middle and outer thirds of the line joining the umbilicus to the anterior superior spine). Additional signs include pain felt in the right lower quadrant with palpation of the left lower quadrant (Rovsing sign), an increase in pain from passive extension of the right hip joint that stretches the iliopsoas muscle (psoas sign), or pain caused by passive internal rotation of the flexed thigh (obturator sign). Low-grade fever (rectal temperature 37.7 to 38.3° C [100 to 101° F]) is common, but these classic findings appear in less than 50% of patients. Symptoms may not be localized, particularly in infants and children, and tenderness may be d

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_input3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(generate_rag_response(user_input3, top_p=0.98, top_k=20, max_tokens=256))

Llama.generate: prefix-match hit


Based on the provided text, alopecia areata is a common cause of sudden patchy hair loss. It is an autoimmune disorder affecting genetically susceptible people exposed to unclear environmental triggers (Merck Manual, p. 859). The scalp and beard are most frequently affected, but any hairy area may be involved, and hair loss may affect most or all of the body (alopecia universalis) (Merck Manual, p. 856, 859).

The text suggests several treatment options for alopecia areata: topical, intralesional, or systemic corticosteroids; topical minoxidil; topical anthralin; topical immunotherapy (diphencyprone or squaric acid dibutylester); or psoralen plus ultraviolet A (PUVA) (Merck Manual, p. 859). The choice of treatment depends on the extent and severity of hair loss.

It is important to note that alopecia areata may be a sign of underlying systemic disease, so a thorough evaluation is necessary for proper diagnosis and


### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(generate_rag_response(user_input4, temperature=0.5))

Llama.generate: prefix-match hit


I don't know. (Merck Manual does not provide information on treatments for brain injuries.)


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(user_input5, temperature=0.1, max_tokens=200))

Llama.generate: prefix-match hit


###Answer
The person with a fractured leg should first receive emergency treatment for any life-threatening injuries. Splinting is necessary to prevent further injury and decrease pain (Merck Manual, p. 3391). Definitive treatment may involve reduction, which requires analgesia or sedation (Merck Manual, p. 3391). Closed reduction is maintained by casting, while open reduction is maintained by surgical hardware such as pins, screws, plates, or external fixators (Merck Manual, p. 3391). RICE (rest, ice, compression, and elevation) should be applied to soft-tissue injuries (Merck Manual, p. 3391). Immobilization decreases pain and facilitates healing by preventing further injury (Merck Manual, p. 3391). The person should rest, apply ice


### Fine-tuning Insights

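- In the earlier parameter grid, larger `k` values surfaced additional material for the sepsis question (for example, general supportive measures at `k=5`). The tuned queries in this section all use the default `k=3`, so the effect of `k` on groundedness is not isolated here.
- Chunk size was fixed at 800 tokens throughout; in general, larger chunks provide better context continuity but may introduce irrelevant information.
- Lower temperature values remain preferable for healthcare use cases: at `temperature=0.5`, Query 4 returned "I don't know", although the default `temperature=0.2` run produced a cited answer to the same question.
- In these runs, retrieval and sampling parameters (`k`, `temperature`, `max_tokens`) visibly changed the answers. The LLM itself was not swapped, so its relative impact on answer quality is not measured here.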
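
One way to quantify consistency is to repeat the same query several times and measure how often the most common answer recurs. The sketch below assumes a callable that maps a question to an answer string (in this notebook, for example, `lambda q: generate_rag_response(q, temperature=0.5)`); `answer_consistency` and the stub generator are hypothetical.

```python
from collections import Counter

def answer_consistency(generate, question, n_runs=5):
    """Fraction of runs that returned the most frequent answer (1.0 = fully consistent)."""
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    answers = [generate(question).strip() for _ in range(n_runs)]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / n_runs

# Stub generator standing in for the real RAG call: returns canned answers in order.
_canned = iter(["A", "B", "A", "A"])
score = answer_consistency(lambda q: next(_canned), "toy question", n_runs=4)
print(score)  # 0.75
```

Exact string matching is a strict criterion; two answers with the same clinical content but different wording count as different here.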

## Output Evaluation

Let us now use the LLM-as-a-judge method to check the quality of the RAG system on two criteria: groundedness (is the answer supported by the retrieved context?) and relevance (does the answer address the question?). We illustrate this evaluation on the answers generated for the questions from the previous section.

- The same Mistral model is used as the judge, so the LLM is effectively rating its own output.

In [None]:
groundedness_rater_system_message = """
You are a strict Groundedness Rater for a Retrieval-Augmented Generation (RAG) system.

Goal:
Evaluate whether the ANSWER is supported by the CONTEXT excerpts provided. Groundedness means the answer’s factual claims can be traced to the context (directly stated or clearly implied). Do NOT use outside knowledge.

Instructions:
1) Read the QUESTION, CONTEXT, and ANSWER.
2) Identify the key factual claims in the ANSWER (diagnostic criteria, treatments, steps, contraindications, definitions, statistics, timelines, etc.).
3) For each key claim, check if it is:
   - SUPPORTED: explicitly stated or clearly implied by the CONTEXT.
   - UNSUPPORTED: not present in the CONTEXT.
   - CONTRADICTED: conflicts with the CONTEXT.
4) If the ANSWER includes medical dosing, exact protocols, or specific steps not present in the CONTEXT, mark those claims UNSUPPORTED.
5) If CONTEXT is insufficient, the correct behavior is for the ANSWER to say it lacks enough information.

Scoring rubric:
5 = Fully grounded: all key claims supported; no major gaps.
4 = Mostly grounded: minor unsupported details; core answer supported.
3 = Partially grounded: mixed; several key claims unsupported or vague.
2 = Not grounded: most claims unsupported; answer relies on outside knowledge.
1 = Contradicted: one or more key claims conflict with context OR answer is largely invented.
"""

In [None]:
relevance_rater_system_message = """
You are a strict Relevance Rater for a Retrieval-Augmented Generation (RAG) system.

Goal:
Evaluate whether the ANSWER directly addresses the QUESTION. Relevance measures alignment with the user's intent and task, not factual correctness or grounding.

Instructions:
1) Read the QUESTION and the ANSWER only. Ignore the CONTEXT.
2) Identify the core intent of the QUESTION (e.g., definition, protocol, symptoms, comparison, treatment steps).
3) Evaluate whether the ANSWER:
   - Directly addresses the core intent
   - Covers all major sub-parts of the question
   - Stays focused without unnecessary or tangential information
4) Penalize answers that are:
   - Vague, generic, or overly high-level
   - Missing required sub-questions
   - Off-topic or addressing a different problem
   - Overly verbose without adding relevant value

Scoring rubric:
5 = Fully relevant: directly and completely answers all parts of the question.
4 = Mostly relevant: answers the main intent but misses minor aspects.
3 = Partially relevant: addresses the question superficially or misses key parts.
2 = Not relevant: mostly off-topic or fails to answer the question.
1 = Irrelevant: does not address the question at all.
"""

In [None]:
user_message_template = """
QUESTION:
{question}

ANSWER:
{answer}

CONTEXT:
{context}

Please evaluate the ANSWER according to your role.
"""

In [None]:
# The judge calls run with temperature=0.0 so that scoring is as reproducible as possible.
# The answer being judged is still generated with gen_temperature (default 0.2), so it may vary between runs.

import json

def safe_json_loads(s: str):
    try:
        return json.loads(s)
    except Exception:
        return {"raw_text": s}

def generate_ground_relevance_response(user_input, k=3, gen_max_tokens=256, gen_temperature=0.2,
                                      judge_max_tokens=256):
    r = get_retriever(k)
    docs = r.get_relevant_documents(user_input)

    context_for_query = "\n\n".join(
      [f"[p. {page_label(d)}] {d.page_content}" for d in docs]
    )

    # Generate answer (RAG)
    user_message = qna_user_message_template.format(context=context_for_query, question=user_input)
    gen_prompt = qna_system_message + "\n" + user_message

    gen_out = llm(prompt=gen_prompt, max_tokens=gen_max_tokens, temperature=gen_temperature, top_p=0.95, top_k=50)
    answer = gen_out["choices"][0]["text"].strip()

    # Judge
    eval_user_message = user_message_template.format(question=user_input, answer=answer, context=context_for_query)

    grounded_prompt = groundedness_rater_system_message + "\n" + eval_user_message + """
    Return JSON with keys:
    - score (1-5)
    - justification (1-3 sentences)
    - unsupported_claims (list; can be empty)
    """

    relevance_prompt = relevance_rater_system_message + "\n" + eval_user_message + """
    Return JSON with keys:
    - score (1-5)
    - justification (1-3 sentences)
    - missing_parts (list; can be empty)
    """

    ground_out = llm(prompt=grounded_prompt, max_tokens=judge_max_tokens, temperature=0.0, top_p=1.0, top_k=0)
    rel_out    = llm(prompt=relevance_prompt, max_tokens=judge_max_tokens, temperature=0.0, top_p=1.0, top_k=0)

    ground_text = ground_out["choices"][0]["text"].strip()
    rel_text    = rel_out["choices"][0]["text"].strip()

    return answer, safe_json_loads(ground_text), safe_json_loads(rel_text)
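`safe_json_loads` falls back to the raw text whenever the judge wraps its JSON in extra words or a code fence. A minimal sketch of a more tolerant parser follows, assuming the judge output contains at most a little surrounding text. The helper name `extract_json_object` is hypothetical; it relies only on the standard library's `json.JSONDecoder.raw_decode`.

```python
import json

def extract_json_object(text: str) -> dict:
    """Parse the first JSON object found in text; fall back to the raw text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Try the next opening brace, if any.
        start = text.find("{", start + 1)
    return {"raw_text": text}
```

Truncated output, such as a response cut off by `judge_max_tokens`, still cannot be parsed and is returned under `raw_text`, the same fallback that `safe_json_loads` uses.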

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
answer, ground, rel = generate_ground_relevance_response("What is the protocol for managing sepsis in a critical care unit?", k=5)
print("ANSWER:\n", answer)
print("\nGROUNDEDNESS:\n", json.dumps(ground, indent=2))
print("\nRELEVANCE:\n", json.dumps(rel, indent=2))

ANSWER:
 ###Answer
The protocol for managing sepsis in a critical care unit includes obtaining cultures of blood and any other appropriate specimens, administering empiric antibiotics after culture, adjusting antibiotics according to culture results, removing internal devices that may be the source of bacteria, and providing supportive care such as fluids, antipyretics, analgesics, and oxygen for patients with hypoxemia (Merck Manual, p. 1182). Rapid empiric antibiotic therapy is recommended due to the potential devastating effects of sepsis and its non-specific clinical signs (Merck Manual, p. 1182). Antibiotics may be changed as soon as an organism is identified. General supportive measures include respiratory and hemodynamic management (Merck Manual, p. 1182). In early-onset sepsis, initial therapy should include ampicillin or penicillin G plus an aminoglycoside, with cefotaxime added if meningitis is suspected. For late-onset sepsis, initial therapy should include vancomycin for co

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
answer, ground, rel = generate_ground_relevance_response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?", k=5)
print("ANSWER:\n", answer)
print("\nGROUNDEDNESS:\n", json.dumps(ground, indent=2))
print("\nRELEVANCE:\n", json.dumps(rel, indent=2))


ANSWER:
 ###Answer
The common symptoms of appendicitis include epigastric or periumbilical pain followed by brief nausea, vomiting, and anorexia; after a few hours, the pain shifts to the right lower quadrant. Pain increases with cough and motion. Classic signs are right lower quadrant direct and rebound tenderness located at McBurney's point (junction of the middle and outer thirds of the line joining the umbilicus to the anterior superior spine). Additional signs include pain felt in the right lower quadrant with palpation of the left lower quadrant (Rovsing sign), an increase in pain from passive extension of the right hip joint that stretches the iliopsoas muscle (psoas sign), or pain caused by passive internal rotation of the flexed thigh (obturator sign). Low-grade fever (rectal temperature 37.7 to 38.3° C [100 to 101° F]) is common, but these classic findings appear in less than 50% of patients. Many variations of symptoms and signs occur, including pain that may not be localize

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
answer, ground, rel = generate_ground_relevance_response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?", k=5)
print("ANSWER:\n", answer)
print("\nGROUNDEDNESS:\n", json.dumps(ground, indent=2))
print("\nRELEVANCE:\n", json.dumps(rel, indent=2))

ANSWER:
 Based on the provided text, alopecia areata is a common cause of sudden patchy hair loss. It is an autoimmune disorder affecting genetically susceptible people exposed to unclear environmental triggers. The most effective treatments for alopecia areata include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). Hormonal modulators such as oral contraceptives or spironolactone may be useful for female-pattern hair loss associated with hyperandrogenemia. Surgical options include follicle transplant, scalp flaps, and alopecia reduction. However, few procedures have been subjected to scientific scrutiny. Patients who are self-conscious about their hair loss may consider these options. It is important to note that the effectiveness of these treatments can vary from person to person.

The possible causes behind sudden patchy hair loss incl

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
answer, ground, rel = generate_ground_relevance_response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?", k=5)
print("ANSWER:\n", answer)
print("\nGROUNDEDNESS:\n", json.dumps(ground, indent=2))
print("\nRELEVANCE:\n", json.dumps(rel, indent=2))

ANSWER:
 Answer:
The initial treatment for a person with a traumatic brain injury (TBI) includes ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. Surgery may be needed to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas. In the first few days after the injury, maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium are important. Subsequently, many patients require rehabilitation (Merck Manual, p. 3405). There are ongoing studies for treatments to promote nerve regeneration, such as injections of autologous macrophages, epidural administration of BA-210, oral administration of HP-184, and optimal timing of surgery (Merck Manual, p. 3417). Emotional care is also essential for the success of all other components of rehabilitation (Merck Manual, p. 3416).

GROUNDEDNESS:
 {
  "score": 4,
  "justi

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
answer, ground, rel = generate_ground_relevance_response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?", k=5)
print("ANSWER:\n", answer)
print("\nGROUNDEDNESS:\n", json.dumps(ground, indent=2))
print("\nRELEVANCE:\n", json.dumps(rel, indent=2))

ANSWER:
 Based on the provided text, here's the answer:

The person with a fractured leg from a hiking trip should first receive emergency treatment if there are signs of hemorrhagic shock or life-threatening injuries. For arterial injuries, arteriography may be necessary. Nerve conduction studies may be indicated for nerve injuries.

Immediate treatment includes splinting to prevent further injury and decrease pain. Most injuries require immobilization, which is maintained by casting or surgical hardware. RICE (rest, ice, compression, elevation) is beneficial for soft-tissue injuries. Pain is typically treated with opioids.

Definitive treatment often involves reduction, which requires analgesia or sedation. Closed reduction is done when possible; open reduction is necessary for some injuries. Immobilization decreases pain and facilitates healing by preventing further injury.

For long-bone fractures, splinting may prevent fat embolism. Patients should be given written instructions on

## Actionable Insights and Business Recommendations

### Key Business Insights
- A RAG-based assistant can reduce the time clinicians spend searching through large medical manuals by returning page-cited answers directly.
- Grounded responses improve trust and reliability, which is critical in healthcare settings.
- Structured outputs reduce cognitive load and support faster decision-making in emergency scenarios.
- Retrieval quality has a greater impact on answer reliability than model size alone.

### Business Recommendations
- Deploy the system as a clinical decision support tool rather than a decision-making authority.
- Integrate authoritative medical sources and enforce citation-based answers to ensure compliance and trust.
- Continuously monitor groundedness and relevance scores to identify retrieval or prompting issues.
- Customize prompts and retrieval strategies for high-impact use cases such as sepsis, trauma, and acute care.
- Invest in governance and auditing mechanisms to support regulatory and safety requirements.
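
The recommendation to monitor groundedness and relevance scores could be sketched as a small aggregation step over the judge outputs. The function `summarize_scores`, the record keys, and the threshold of 3 are assumptions for illustration; the example values are made up and are not results from this notebook.

```python
from statistics import mean

def summarize_scores(results, threshold=3):
    """Aggregate judge scores and flag queries that fall below the threshold.

    `results` is a list of dicts with hypothetical keys:
    "query", "groundedness", "relevance" (ints 1-5, or None when unparsed).
    """
    summary = {"flagged": []}
    for metric in ("groundedness", "relevance"):
        scores = [r[metric] for r in results if r.get(metric) is not None]
        summary[f"mean_{metric}"] = mean(scores) if scores else None
    for r in results:
        g, rel = r.get("groundedness"), r.get("relevance")
        # Unparsed scores (None) are flagged too, since the judge's verdict is unknown.
        if g is None or rel is None or g < threshold or rel < threshold:
            summary["flagged"].append(r["query"])
    return summary

example = [  # illustrative values only
    {"query": "q1", "groundedness": 5, "relevance": 4},
    {"query": "q2", "groundedness": 2, "relevance": 4},
    {"query": "q3", "groundedness": None, "relevance": 5},
]
print(summarize_scores(example))
```

A low groundedness score with high relevance tends to point at retrieval or citation problems, while the reverse suggests the prompt is not keeping the answer on the question.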

Overall, a RAG-based medical knowledge assistant can enhance operational efficiency,
standardize care guidance, and support better clinical outcomes when used responsibly.
