## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to create accurate diagnoses and treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

**Common Questions to Answer**

**1. Diagnostic Assistance**: "What are the common symptoms and treatments for pulmonary embolism?"

**2. Drug Information**: "Can you provide the trade names of medications used for treating hypertension?"

**3. Treatment Plans**: "What are the first-line options and alternatives for managing rheumatoid arthritis?"

**4. Specialty Knowledge**: "What are the diagnostic steps for suspected endocrine disorders?"

**5. Critical Care Protocols**: "What is the protocol for managing sepsis in a critical care unit?"

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co. They cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [None]:
# Reinstalling llama-cpp-python with CUDA (CUBLAS) support enabled for faster GPU performance
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q


In [None]:
#Installing essential NLP and embedding libraries for LangChain and Hugging Face integration. Installed dependencies without version pins to avoid compatibility issues
!pip install huggingface_hub pandas tiktoken pymupdf langchain langchain-community chromadb sentence-transformers numpy -q

**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- On executing the above line of code, you might see a warning regarding package dependencies. This warning can be ignored, as the above code ensures that all libraries and dependencies needed to run ***this notebook*** are installed.

In [None]:
#Import the libraries needed for the RAG-based AI solution: data processing, document loading, chunking, embedding, vector databases, and downloading/loading the LLM.
#Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

#Libraries for downloading and loading the llm
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Question Answering using LLM

#### Downloading and Loading the model

In [None]:
## Model configuration & download the model from Hugging face hub
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
    )

In [None]:
#Initializing the Llama model with optimized GPU & batch settings
llm = Llama(
    model_path=model_path, #Path to the GGUF model file
    n_ctx=4000, #Context window size in tokens (prompt plus generated output)
    n_gpu_layers=38, #Loads model layers onto GPU for faster inference (set to 0 for CPU-only)
    n_batch=512, #Number of tokens processed at once
    #max_tokens=128,
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Response

In [None]:
#Helper function to generate model response
def response(query,max_tokens=256,temperature=0,top_p=0.95,top_k=50):
    model_output = llm(
      prompt=query,
      max_tokens=max_tokens,
      temperature=temperature,
      top_p=top_p,
      top_k=top_k
    )

    return model_output['choices'][0]['text']

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
response("What is the protocol for managing sepsis in a critical care unit?")

'\n\nSepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:\n\n1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.\n2. Resuscitation: Provide adequate fluid resuscitation to maintain adequate tissue perfusion. The goal is to achieve a mean arterial pressure (MAP) of at least 65 mmHg and a central venous oxygen saturation (ScvO2) of greater than 70%.\n3. Antibiotics: Administer broad-spectrum antibiotics as soon as possible based on the suspected source of infection and local microbiology data.\n4. Source control: Identify and address the source of infection, if possible. T

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?")

Llama.generate: prefix-match hit


'\n\nAppendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:\n\n1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing.\n2. Loss of appetite: People with appendicitis may lose their appetite due to abdominal pain and discomfort.\n3. Nausea and vomiting: Vomiting is a common symptom of appendicitis, especially in the later stages of the condition.\n4. Fever: A fever of 100.4°F (38°C) or higher may be present in some cases of appendicitis.\n5. Constipation or diarrhea: Both constipation and diarrhea can occur with appendicitis, depending on the location and

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?")

Llama.generate: prefix-match hit


"\n\nSudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.\n\nThe exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications.\n\nThere are several treatments that have been shown to be effective in addressing sudden patchy hair loss:\n\n1. Corticosteroids: These are anti-inflammatory drugs that can help reduce inflammation and suppress the immune system's attack on the hair follicles. They can be applied topically or taken orally, depending on the severity of the condition.\n2. Minoxidil: This is a medication that has been shown to promote hair growth in some people with alopecia areata. It works by increa

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?")

Llama.generate: prefix-match hit


"\n\nA person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:\n\n1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.\n2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as swelling, seizures, pain, or infections. For example, corticosteroids may be used to reduce brain swelling, and anticonvulsants may be given to prevent seizures.\n3. Rehabilitation: Rehabilitation is an essential part of the recovery process for TBI patients. Rehabilitation may include physical therapy to help with mobility and strength, occupational therapy to help with daily living skills, speech therapy 

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?")

Llama.generate: prefix-match hit


"\n\nFirst and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:\n\n1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.\n2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.\n3. Immobilize the leg: Use a splint, sling, or other available materials to immobilize the leg and prevent movement. Be sure not to apply too much pressure on the injury site.\n4. Provide pain relief: Offer over-the-counter pain medication, such as acetaminophen or ibuprofen, to help manage pain.\n5. Seek medical attention: If the fracture is severe or if you suspect that there may be other injuries, seek medical help as soon as possible.\n\nOnce you've ensured the person's safety and stabi

**Observations on Initial LLM Responses**
Here are some observations on the initial responses generated by the LLM for each query, without any RAG or specific prompt engineering:

**Query 1**: What is the protocol for managing sepsis in a critical care unit?

The model started strong, outlining key steps such as early recognition, fluid resuscitation, antibiotics, and source control. However, the answer appears to stop partway through the protocol, most likely because the `response` helper caps generation at `max_tokens=256`, so a complete picture of the protocol is never delivered.

**Query 2**: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

The response did a decent job of listing common symptoms. It also started to address the treatment, mentioning surgery. But similar to the first query, it felt like the answer wasn't fully concluded and might have broken off mid-explanation of the surgical procedure.

**Query 3**: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

This response seemed to provide a relatively more complete answer compared to the others. It identified the condition and listed several potential causes and treatments. While it might not be exhaustive, it felt less like it cut off abruptly and provided a decent overview based on general knowledge.

**Query 4**: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

The model gave some relevant points about emergency care and rehabilitation, but the response felt quite general and seemed to stop before fully elaborating on the different types of treatments and long-term care considerations. It felt incomplete.

**Query 5**: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

The response offered some good initial advice on what to do right after the injury, like keeping still and immobilizing the leg. However, it felt like it only covered the immediate steps and didn't fully delve into the subsequent treatment and recovery process in detail. It seemed to end before giving a complete picture of the care needed.
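Several of the responses above appear to stop mid-sentence. One likely cause is the `max_tokens=256` cap in the `response` helper rather than the model having finished its answer. A minimal sketch for checking this follows. It assumes the completion dictionary uses the OpenAI-style layout seen in llama-cpp-python output, where each entry in `choices` carries a `finish_reason` of `"length"` when the token cap is hit and `"stop"` otherwise. `was_truncated` is a hypothetical helper name, and the dictionaries below are toy stand-ins, not real model output.

```python
def was_truncated(model_output: dict) -> bool:
    """Return True if the first choice ended because it hit the token cap."""
    choices = model_output.get("choices", [])
    if not choices:
        return False
    # Assumption: "length" marks a completion cut off by max_tokens.
    return choices[0].get("finish_reason") == "length"


# Toy dictionaries standing in for real completions (illustrative values only).
capped = {"choices": [{"text": "1. Early recognition: ...", "finish_reason": "length"}]}
finished = {"choices": [{"text": "Done.", "finish_reason": "stop"}]}

print(was_truncated(capped))
print(was_truncated(finished))
```

If a check like this flags a response, raising `max_tokens` in the helper is the simplest thing to try before concluding that the model's knowledge is incomplete.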

## Question Answering using LLM with Prompt Engineering

**Prompt Engineering**

In [None]:
# function to generate, process, and return the response from the LLM
def generate_response(user_prompt):

    # System message
    system_message = """
    [INST]<<SYS>> You are a helpful medical professional. Your task is to answer the user's question based ONLY on the provided context. If the answer is not found in the context, please state that you cannot answer the question based on the provided information.
    <</SYS>>[/INST]
    """

    # Combine user_prompt and system_message to create the prompt
    prompt = f"{system_message}\n{user_prompt}"

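    # Generate a response from the Mistral model loaded through llama-cpp-python
    # (named model_output so it does not shadow the earlier response() helper)
    model_output = llm(
        prompt=prompt,
        max_tokens=1024,
        temperature=0.04,
        top_p=0.95,
        repeat_penalty=1.2,
        top_k=50,
        echo=False
    )

    # Print the raw completion dictionary for inspection, then return only the text
    response_text = model_output["choices"][0]["text"]
    print(model_output)
    return response_text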
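
The system message above uses Llama-2-style `<<SYS>>` tags and places the user question after the closing `[/INST]`. The Mistral-7B-Instruct model card describes a simpler template in which the instruction sits between `[INST]` and `[/INST]`. A minimal sketch of a prompt builder along those lines follows. `build_mistral_prompt` is a hypothetical helper, and folding the system text into the instruction is an assumption, since that template has no dedicated system role. The outputs recorded in this notebook were produced with the original prompt, not with this one.

```python
def build_mistral_prompt(system_message: str, user_prompt: str) -> str:
    """Wrap the system text and the user question in one [INST] ... [/INST] block."""
    # Assumption: the system text is simply prepended to the user question.
    instruction = f"{system_message.strip()}\n\n{user_prompt.strip()}"
    return f"[INST] {instruction} [/INST]"


prompt = build_mistral_prompt(
    "You are a helpful medical professional.",
    "What is the protocol for managing sepsis in a critical care unit?",
)
print(prompt)
```

The resulting string could be passed as the `prompt` argument of `llm(...)` in place of the concatenated system and user text.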

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
generate_response("What is the protocol for managing sepsis in a critical care unit?")

Llama.generate: prefix-match hit


{'id': 'cmpl-e9ec36cf-78d2-4e4e-b59e-b8aa8705a445', 'object': 'text_completion', 'created': 1760539059, 'model': '/root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q6_K.gguf', 'choices': [{'text': ' I cannot provide an exact answer without additional context as the management of sepsis can vary depending on individual patient factors, severity of illness, and institutional guidelines. However, generally speaking, early recognition and prompt initiation of appropriate antibiotics are key components of sepsis management. Other interventions may include fluid resuscitation, vasopressor support for blood pressure, mechanical ventilation if needed, and correction of metabolic derangements such as electrolyte imbalances or acid-base disturbances. Close monitoring of vital signs, organ function, and laboratory values is also essential to assess response to treatment and identify any complica

' I cannot provide an exact answer without additional context as the management of sepsis can vary depending on individual patient factors, severity of illness, and institutional guidelines. However, generally speaking, early recognition and prompt initiation of appropriate antibiotics are key components of sepsis management. Other interventions may include fluid resuscitation, vasopressor support for blood pressure, mechanical ventilation if needed, and correction of metabolic derangements such as electrolyte imbalances or acid-base disturbances. Close monitoring of vital signs, organ function, and laboratory values is also essential to assess response to treatment and identify any complications promptly.'

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
generate_response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?")

Llama.generate: prefix-match hit


{'id': 'cmpl-59868ed1-591e-40b5-8b09-1725b9e7a0e2', 'object': 'text_completion', 'created': 1760539065, 'model': '/root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q6_K.gguf', 'choices': [{'text': "\n\nBased on the context provided, I cannot directly answer whether or not a person in the text has been diagnosed with appendicitis. However, I can provide information about common symptoms of appendicitis and treatment options based on medical knowledge.\n\nCommon symptoms of appendicitis include:\n- Sudden pain that starts near your navel and moves lower to your right side\n- Loss of appetite\n- Nausea and vomiting\n- Abdominal swelling\n- Fever, which may be low grade at first but can rise as high as 103 degrees Fahrenheit (39.4 degrees Celsius)\n- Diarrhea or constipation\n- Inability to pass gas or have a bowel movement\n\nAppendicitis cannot typically be cured with medicine alone, an

"\n\nBased on the context provided, I cannot directly answer whether or not a person in the text has been diagnosed with appendicitis. However, I can provide information about common symptoms of appendicitis and treatment options based on medical knowledge.\n\nCommon symptoms of appendicitis include:\n- Sudden pain that starts near your navel and moves lower to your right side\n- Loss of appetite\n- Nausea and vomiting\n- Abdominal swelling\n- Fever, which may be low grade at first but can rise as high as 103 degrees Fahrenheit (39.4 degrees Celsius)\n- Diarrhea or constipation\n- Inability to pass gas or have a bowel movement\n\nAppendicitis cannot typically be cured with medicine alone, and surgery is usually required to remove the inflamed appendix before it ruptures and causes further complications. The most common surgical procedure for removing an inflamed appendix is called an appendectomy. This can be performed as an open or laparoscopic procedure depending on the individual ca

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
generate_response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?")

Llama.generate: prefix-match hit


{'id': 'cmpl-f111cc99-8dd5-4c14-af94-75c09b47cfb6', 'object': 'text_completion', 'created': 1760539076, 'model': '/root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q6_K.gguf', 'choices': [{'text': ' Based on the context you have not provided any specific information regarding treatments or causes of sudden patchy hair loss. Therefore, I cannot answer your question based on the given context. However, some common causes include stress, autoimmune disorders such as alopecia areata, nutritional deficiencies, and certain medications. Treatments may vary depending on the underlying cause but can include topical treatments, oral medications, or in severe cases, hair transplantation. It is recommended to consult with a healthcare professional for an accurate diagnosis and treatment plan.', 'index': 0, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 109, 'completion_to

' Based on the context you have not provided any specific information regarding treatments or causes of sudden patchy hair loss. Therefore, I cannot answer your question based on the given context. However, some common causes include stress, autoimmune disorders such as alopecia areata, nutritional deficiencies, and certain medications. Treatments may vary depending on the underlying cause but can include topical treatments, oral medications, or in severe cases, hair transplantation. It is recommended to consult with a healthcare professional for an accurate diagnosis and treatment plan.'

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
generate_response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?")

Llama.generate: prefix-match hit


{'id': 'cmpl-e44332d6-a735-44f8-866d-6e1daa28b177', 'object': 'text_completion', 'created': 1760539081, 'model': '/root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q6_K.gguf', 'choices': [{'text': ' I cannot provide an accurate answer without more specific details about the nature and extent of the brain injury. However, some common treatments may include medication to manage symptoms such as pain, seizures, or depression; rehabilitation therapies like physical therapy, occupational therapy, speech-language therapy, and cognitive rehabilitation; and in severe cases, surgical intervention or assistive devices. It is important for the person with a brain injury to receive proper medical evaluation and care from healthcare professionals experienced in treating such conditions.', 'index': 0, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 101, 'completion_tokens': 

' I cannot provide an accurate answer without more specific details about the nature and extent of the brain injury. However, some common treatments may include medication to manage symptoms such as pain, seizures, or depression; rehabilitation therapies like physical therapy, occupational therapy, speech-language therapy, and cognitive rehabilitation; and in severe cases, surgical intervention or assistive devices. It is important for the person with a brain injury to receive proper medical evaluation and care from healthcare professionals experienced in treating such conditions.'

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
generate_response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?")

Llama.generate: prefix-match hit


{'id': 'cmpl-62550892-ddbf-4816-89a2-82450e9ff447', 'object': 'text_completion', 'created': 1760539085, 'model': '/root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q6_K.gguf', 'choices': [{'text': " I'm assuming that you have already assessed the severity of the injury as a fracture. Based on the context provided, here is some information that may help:\n\n1. Immobilize the affected limb using a splint or a cast to prevent further damage and promote healing. This will also provide comfort and reduce pain.\n2. Apply ice packs intermittently for 15-20 minutes at a time, several times a day, to help minimize swelling and inflammation. Be sure not to apply the ice directly on the skin but rather wrap it in a cloth or protective layer.\n3. Elevate the injured leg above heart level whenever possible to reduce swelling and promote proper blood flow.\n4. Monitor for signs of infection, such a

" I'm assuming that you have already assessed the severity of the injury as a fracture. Based on the context provided, here is some information that may help:\n\n1. Immobilize the affected limb using a splint or a cast to prevent further damage and promote healing. This will also provide comfort and reduce pain.\n2. Apply ice packs intermittently for 15-20 minutes at a time, several times a day, to help minimize swelling and inflammation. Be sure not to apply the ice directly on the skin but rather wrap it in a cloth or protective layer.\n3. Elevate the injured leg above heart level whenever possible to reduce swelling and promote proper blood flow.\n4. Monitor for signs of infection, such as increased pain, redness, warmth, or pus at the injury site. If any of these symptoms are present, seek medical attention immediately.\n5. Encourage adequate hydration and a balanced diet to support healing and overall health.\n6. Assist with mobility needs, such as crutches or a wheelchair, if nec

## Observations on Prompt Engineering Responses vs. Initial LLM Responses

Here are some observations comparing the responses generated with prompt engineering to the initial responses from the LLM:

**Query 1: What is the protocol for managing sepsis in a critical care unit?**
*   With prompt engineering, the model's response for sepsis management was quite different. Instead of giving a partial protocol like before, it explicitly stated it couldn't give an exact answer without more context but then provided general key components. This shows the prompt engineering successfully guided the model to acknowledge limitations and stick to general knowledge when specific context isn't provided. It's more cautious and upfront about what it doesn't know from the "context."

**Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?**
*   The prompt-engineered response for appendicitis was much clearer about its source of information. It stated it couldn't confirm a diagnosis from the text but could provide general medical knowledge. It then gave a good list of symptoms and correctly explained that surgery (appendectomy) is typically required, not just medicine. This was a more structured and informative answer compared to the initial one, which felt incomplete.

**Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?**
*   For patchy hair loss, the prompt-engineered response also emphasized the lack of specific context in the "provided information." It then defaulted to providing general knowledge about causes and treatments for alopecia areata. While the initial LLM response was already relatively decent for this query, the prompt-engineered version clearly highlights that its answer is based on general knowledge because the specific context wasn't present.

**Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?**
*   The prompt-engineered response for brain injury treatments again started by saying it couldn't give an accurate answer without more specific details from the context. It then listed some common treatments like medication, rehabilitation therapies, and in severe cases, surgery or assistive devices. This response, while general, was more structured and clearly indicated that it was pulling from its broader knowledge base due to the lack of specific context in the input provided to the model.

**Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?**
*   The prompt-engineered response for the fractured leg scenario also started with an assumption ("I'm assuming that you have already assessed the severity...") and then provided general information based on what it knows medically. It gave a good list of steps, including RICE, immobilization, pain relief, and seeking medical attention. Compared to the initial response, this felt more like a comprehensive first-aid and initial care guide, clearly drawing from its general medical knowledge when the context wasn't specific.

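Overall, the prompt-engineered responses were more structured and more upfront about their limitations than the first set of responses, even though no supporting context was actually supplied to the model at this stage.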

## Data Preparation for RAG

### Loading the Data

In [None]:
#Mounting Gdrive to access the files
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
#Assign the Medical data to a variable
data = '/content/drive/MyDrive/AIML_Apr_2025/Projects/Project5:Medical Assistant/medical_diagnosis_manual.pdf'

In [None]:
#Load the PDF file using PyMuPDF
pdf_loader = PyMuPDFLoader(data)
pages = pdf_loader.load_and_split()

In [None]:
#Load the complete PDF content into a variable called merck_data
merck_data = pdf_loader.load()

### Data Overview

#### Checking the first 5 pages

In [None]:
#Iterate through the pages and display first 5 pages
for i in range(5):
    print(f"Page Number : {i+1}",end="\n")
    print(merck_data[i].page_content,end="\n")


Page Number : 1
praveen.kumar.pno@gmail.com
MLONAPRA19005
for personal use by praveen.kumar.pno@
shing the contents in part or full is liable
Page Number : 2
praveen.kumar.pno@gmail.com
MLONAPRA19005
This file is meant for personal use by praveen.kumar.pno@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.
Page Number : 3
Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    .....................................................................................................................................................................

#### Checking the number of pages

In [None]:
print(f"Number of pages : {len(merck_data)}")

Number of pages : 4114


### Data Chunking

In [None]:
#Text splitter to chunk the documents with some overlap. Tried a couple of overlap values and settled on 100
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=512,
    chunk_overlap=100,
    encoding_name='cl100k_base'
    #length_function=len,
)

In [None]:
#Splitting the full PDF content into smaller token-based chunks
document_chunks = text_splitter.split_documents(merck_data)
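
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap over a list of tokens. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and sentences); `chunk_tokens` is a hypothetical helper used only for illustration.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=100):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the token list
        if start + chunk_size >= len(tokens):
            break
    return chunks

# Toy example: 10 "tokens", windows of 4 with 1 token shared between neighbours
toy_tokens = list(range(10))
toy_chunks = chunk_tokens(toy_tokens, chunk_size=4, chunk_overlap=1)
print(toy_chunks)
```

With the settings used in this notebook, each new window would advance by 512 - 100 = 412 tokens, so the last 100 tokens of one chunk reappear at the start of the next.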

### Embedding

In [None]:
# Create an embedding function using SentenceTransformer
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")


In [None]:
#Generate the embedding for the 2nd document chunk using the embedding function
embedding1 = embedding_function.embed_query(document_chunks[1].page_content)

In [None]:
#Generate the embedding for the 3rd document chunk using the embedding function
embedding2 = embedding_function.embed_query(document_chunks[2].page_content)

In [None]:
#Verify that the embeddings of different chunks have the same dimension
print("Dimension of the embedding vector ",len(embedding1))
print("Dimension of the embedding vector ",len(embedding2))
print(embedding1[:5])
print(embedding2[:5])
print("Dimension of the embedding vector ",len(embedding_function.embed_query(document_chunks[-1].page_content)))

#Observation: every embedding checked has a dimension of 384

Dimension of the embedding vector  384
Dimension of the embedding vector  384
[-0.07298078387975693, 0.05817607045173645, -0.024621667340397835, -0.07538803666830063, 0.06229318678379059]
[-0.03384838253259659, 0.0026916114147752523, -0.042815208435058594, -0.08248289674520493, 0.007532679010182619]
Dimension of the embedding vector  384


In [None]:
#Total number of chunks produced by the text splitter
len(document_chunks)

9145
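
Semantic search works by comparing these 384-dimensional vectors. As a minimal sketch of one common comparison, the function below computes cosine similarity in plain Python. `cosine_similarity` and the toy vectors are illustrative assumptions, and the vector store used later may rely on a different but related distance measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)

# Toy vectors: identical direction versus orthogonal direction
toy_a = [1.0, 0.0, 1.0]
toy_b = [1.0, 0.0, 1.0]
toy_c = [0.0, 1.0, 0.0]
print(cosine_similarity(toy_a, toy_b), cosine_similarity(toy_a, toy_c))
```

In this notebook the same function could be called as `cosine_similarity(embedding1, embedding2)` to see how close the 2nd and 3rd chunks are to each other.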

### Vector Database

In [None]:
#Create an output directory
out_dir = 'merck_db'
if not os.path.exists(out_dir):
    os.makedirs(out_dir)

In [None]:
#create a chroma vector store from the document chunks & embeddings
VectorStore = Chroma.from_documents(
    documents=document_chunks,
    embedding=embedding_function,
    persist_directory=out_dir
)

In [None]:
VectorStore.embeddings

HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
), model_name='all-MiniLM-L6-v2', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False, show_progress=False)

In [None]:
#Semantic search in the vector store to find the top 3 document chunks
VectorStore.similarity_search("Apendicitis", k=3)

[Document(metadata={'file_path': '/content/drive/MyDrive/AIML_Apr_2025/Projects/Project5:Medical Assistant/medical_diagnosis_manual.pdf', 'producer': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'trapped': '', 'format': 'PDF 1.7', 'keywords': '', 'total_pages': 4114, 'modDate': 'D:20251004135101Z', 'source': '/content/drive/MyDrive/AIML_Apr_2025/Projects/Project5:Medical Assistant/medical_diagnosis_manual.pdf', 'moddate': '2025-10-04T13:51:01+00:00', 'subject': '', 'author': '', 'creator': 'Atop CHM to PDF Converter', 'page': 1435, 'title': 'The Merck Manual of Diagnosis & Therapy, 19th Edition', 'creationDate': 'D:20120615054440Z', 'creationdate': '2012-06-15T05:44:40+00:00'}, page_content='progeny. These ticks are the natural reservoirs. Dermacentor andersoni  (wood tick) is the principal vector\nin the western US. D. variabilis (dog tick) is the vector in the eastern and southern US. RMSF is probably\nnot transmitted directly from person to person.\nPathophysiology\nSmall blood v

### Retriever

In [None]:
# Create a retriever from the vector database
# This will be used to retrieve relevant document chunks based on a query
retriever = VectorStore.as_retriever(search_kwargs={"k": 4})

In [None]:
rel_docs = retriever.get_relevant_documents(query="Apendicitis")
print(rel_docs)

[Document(metadata={'creationdate': '2012-06-15T05:44:40+00:00', 'source': '/content/drive/MyDrive/AIML_Apr_2025/Projects/Project5:Medical Assistant/medical_diagnosis_manual.pdf', 'file_path': '/content/drive/MyDrive/AIML_Apr_2025/Projects/Project5:Medical Assistant/medical_diagnosis_manual.pdf', 'trapped': '', 'creationDate': 'D:20120615054440Z', 'creator': 'Atop CHM to PDF Converter', 'subject': '', 'author': '', 'producer': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'page': 1435, 'total_pages': 4114, 'format': 'PDF 1.7', 'title': 'The Merck Manual of Diagnosis & Therapy, 19th Edition', 'keywords': '', 'moddate': '2025-10-04T13:51:01+00:00', 'modDate': 'D:20251004135101Z'}, page_content='progeny. These ticks are the natural reservoirs. Dermacentor andersoni  (wood tick) is the principal vector\nin the western US. D. variabilis (dog tick) is the vector in the eastern and southern US. RMSF is probably\nnot transmitted directly from person to person.\nPathophysiology\nSmall blood v



### System and User Prompt Template

In [None]:
# Define a System message for the RAG model
qna_system_message = """
You are an assistant whose work is to review the report and provide the appropriate answers from the context.
User input will have the context required by you to answer user questions.
This context will begin with the token: ###Context.
The context contains references to specific portions of a document relevant to the user query.

User questions will begin with the token: ###Question.

Please answer only using the context provided in the input. Do not mention anything about the context in your final answer.

If the answer is not found in the context, respond "I can't find the answer, sorry".
"""

# Define a User message for the RAG model
qna_user_message= """
###Context
{context}

###Question
{question}
Answer:
"""

### Response Function

In [None]:
def generate_rag_response(user_input,k=3,max_tokens=512,temperature=0,top_p=0.95,top_k=50):
    # Retrieve the document chunks most relevant to the query
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input,k=k)
    context_list = [d.page_content for d in relevant_document_chunks]

    # Combine document chunks into a single context
    context_for_query = ". ".join(context_list)

    user_message = qna_user_message.replace('{context}', context_for_query)
    user_message = user_message.replace('{question}', user_input)

    prompt = qna_system_message + '\n' + user_message

    # Generate the response
    try:
        response = llm(
                  prompt=prompt,
                  max_tokens=max_tokens,
                  temperature=temperature,
                  top_p=top_p,
                  top_k=top_k
                  )

        # Extract the generated text from the model's response
        response = response['choices'][0]['text'].strip()
    except Exception as e:
        response = f'Sorry, I encountered the following error: \n {e}'

    return response
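
The function above fills the user template with two chained `.replace` calls. As a hedged alternative sketch, the template can be filled in a single pass with `str.format`, so that retrieved text which happens to contain the literal string `{question}` is left untouched. `TOY_TEMPLATE` and `fill_prompt` are hypothetical names for illustration; this approach assumes the template itself contains no other curly braces.

```python
TOY_TEMPLATE = """
###Context
{context}

###Question
{question}
Answer:
"""

def fill_prompt(template, context, question):
    """Substitute both placeholders in one pass; values are not re-scanned for braces."""
    return template.format(context=context, question=question)

toy_prompt = fill_prompt(TOY_TEMPLATE, "Low-grade fever is common.", "Is fever common?")
print(toy_prompt)
```

Either approach produces the same prompt for ordinary context text; the difference only shows up when the context contains placeholder-like strings.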

## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
generate_rag_response("What is the protocol for managing sepsis in a critical care unit?")

Llama.generate: prefix-match hit


'The protocol involves obtaining cultures of blood and any other appropriate specimens if bacteremia, sepsis, or septic shock is suspected. Empiric antibiotics are given after cultures are obtained, and treatment is adjusted according to culture and susceptibility testing results. Surgical drainage of abscesses and removal of internal devices that may be the source of bacteria are also part of the treatment.'

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
generate_rag_response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?")

Llama.generate: prefix-match hit


"The common symptoms for appendicitis include epigastric or periumbilical pain followed by brief nausea, vomiting, and anorexia; after a few hours, the pain shifts to the right lower quadrant. Pain increases with cough and motion. Classic signs are right lower quadrant direct and rebound tenderness located at McBurney's point. Additional signs include pain felt in the right lower quadrant with palpation of the left lower quadrant (Rovsing sign), an increase in pain from passive extension of the right hip joint that stretches the iliopsoas muscle (psoas sign), or pain caused by passive internal rotation of the flexed thigh (obturator sign). Low-grade fever is common. Unfortunately, these classic findings appear in less than 50% of patients, and many variations of symptoms and signs occur. Pain may not be localized, particularly in infants and children. Tenderness may be diffuse or absent. Bowel movements are usually less frequent or absent; if diarrhea is a sign, a retrocecal appendix s

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
generate_rag_response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?")

Llama.generate: prefix-match hit


'Sudden patchy hair loss, also known as alopecia areata, can be treated with various options including topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). The cause of alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers. It is important to note that the efficacy of these treatments may vary, and some may have adverse effects such as decreased libido, erectile and ejaculatory dysfunction, hypersensitivity reactions, gynecomastia, myopathy, and a decrease in prostate-specific antigen levels in older men. Common practice is to continue treatment for as long as positive results persist, but once treatment is stopped, hair loss returns to previous levels. Finasteride is not indicated for women and is contraindicated in pregnant women due to its teratogenic

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
generate_rag_response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?")

Llama.generate: prefix-match hit


'Physical and occupational therapy may be recommended to improve functioning and make the environment safer. There is no specific medical treatment for brain injuries, and drugs that slow the symptomatic progression of dementia do not appear beneficial. Supportive care includes preventing systemic complications due to immobilization, providing good nutrition, and preventing pressure ulcers. The prognosis depends on the severity and cause of the injury, with most patients becoming dependent and requiring help with activities of daily living and at least some degree of supervision. Recovery from a vegetative state is unlikely after 1 month if brain damage is nontraumatic and after 12 months if it is traumatic. Even if some recovery occurs after these intervals, most patients are severely disabled. Most patients in a persistent vegetative state die within 6 months of the original brain damage. The cause is usually pulmonary infection, UTI, or multiple organ failure, or death may be sudden

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
generate_rag_response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?")

Llama.generate: prefix-match hit


"The person with a fractured leg should rest the injured leg to prevent further injury and speed up healing. Ice and compression can minimize swelling and pain. The leg should be elevated to help reduce swelling and improve blood flow. If necessary, crutches or other assistive devices may be required for mobility. Pain can be managed with opioids as needed. Definitive treatment, such as reduction or surgery, may be required depending on the severity of the fracture. Immobilization through casting or splinting is often necessary to prevent further injury and promote healing. After surgery, additional measures such as low-dose UFH, LMWH, warfarin, newer anticoagulants, compression devices, or stockings may be required to prevent DVT. It's important for the person to move their legs periodically to prevent blood clots and promote circulation. Patients at higher risk of DVT should receive thrombosis prophylaxis. After recovery, avoiding sitting in chairs and elevating the legs can help pre

Observations

### Fine-tuning the Generation and Retrieval Parameters
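
The cells in this section vary `temperature`, `top_p`, `top_k`, and the number of retrieved chunks `k`. The sketch below illustrates, in simplified form, what `temperature` and `top_p` do to a next-token distribution. It is an illustration under stated assumptions, not the sampler the model library actually runs; the function names and toy logits are hypothetical, and a temperature of 0 (commonly treated as greedy decoding) is not modeled here.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.95):
    """Return indices of the smallest high-probability set whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept

toy_logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(toy_logits, temperature=0.5)
flat = softmax_with_temperature(toy_logits, temperature=2.0)
print(sharp, flat, top_p_filter([0.5, 0.3, 0.2], top_p=0.75))
```

Higher temperatures spread probability over more tokens, while a lower `top_p` discards the low-probability tail before sampling, which is why the responses below differ in wording from the earlier runs.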

In [None]:
generate_rag_response("What is the protocol for managing sepsis in a critical care unit?",temperature=0.5,top_p=.85)

Llama.generate: prefix-match hit


"The protocol for managing sepsis in a critical care unit involves obtaining cultures of blood and any other appropriate specimens, giving empiric antibiotics after appropriate cultures are obtained, adjusting antibiotics according to culture and susceptibility testing results, surgically draining any abscesses, removing internal devices that are the suspected source of bacteria, and continuing therapy until the patient's condition improves."

In [None]:
generate_rag_response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?", k=5)

Llama.generate: prefix-match hit


"The common symptoms for appendicitis include epigastric or periumbilical pain followed by brief nausea, vomiting, and anorexia; after a few hours, the pain shifts to the right lower quadrant. Pain increases with cough and motion. Classic signs are right lower quadrant direct and rebound tenderness located at McBurney's point (junction of the middle and outer thirds of the line joining the umbilicus to the anterior superior spine). Additional signs include pain felt in the right lower quadrant with palpation of the left lower quadrant (Rovsing sign), an increase in pain from passive extension of the right hip joint that stretches the iliopsoas muscle (psoas sign), or pain caused by passive internal rotation of the flexed thigh (obturator sign). Low-grade fever is common. Unfortunately, these classic findings appear in less than 50% of patients, and many variations of symptoms and signs occur. Pain may not be localized, particularly in infants and children. Tenderness may be diffuse or 

In [None]:
generate_rag_response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?",temperature=.75,top_p=.75, k=3)

Llama.generate: prefix-match hit


'Alopecia areata is a common cause of sudden patchy hair loss. It is an autoimmune disorder affecting genetically susceptible people exposed to unclear environmental triggers. The most effective treatments for alopecia areata include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). The efficacy is usually evident within 6 to 8 months of treatment. Common practice is to continue treatment for as long as positive results persist, and once treatment is stopped, hair loss returns to previous levels. Other causes of sudden patchy hair loss include fungal infections such as tinea capitis, which can be treated with topical or oral antifungals. Traction alopecia, caused by physical stress on the scalp, can be addressed by eliminating the physical traction or stress to the scalp.'

In [None]:
generate_rag_response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?",temperature=.8, k=4)

Llama.generate: prefix-match hit


'Physical and occupational therapy is recommended for improving functioning and making the environment safer. However, there is no specific medical treatment for reversing brain damage. Care should be taken to prevent complications such as pneumonia, pressure ulcers, and other complications. Patients with severe cognitive dysfunction may require extensive cognitive therapy immediately after injury and continued for months or years. Imaging techniques like CT or MRI are required to diagnose and characterize the central lesion. Depending on the level and extent of the injury, supportive care such as preventing systemic complications due to immobilization, providing good nutrition, and preventing pressure ulcers is essential.'

In [None]:
generate_rag_response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?",temperature=.9,top_k=50, k=3)

Llama.generate: prefix-match hit


"The person should first receive rest, ice, compression, and elevation (RICE) to prevent further injury and decrease pain. Ice and compression can be applied using a thigh sleeve and elastic bandage, respectively. If walking is painful, crutches may be required initially. Splinting or casting might be necessary for definitive treatment of the fracture, depending on its severity. Open reduction (with skin incision) may also be needed in some cases. Pain should be managed with opioids. For longer bone fractures, splinting can prevent fat embolism. It's essential to identify if the patient is at a higher risk for Deep Vein Thrombosis (DVT), and if so, additional preventive treatment may be required, such as thrombosis prophylaxis with low-dose UFH or LMWH, compression devices or stockings, or a combination. After surgery, patients should elevate their legs, avoid sitting in chairs that impede venous return, and consider further treatments based on their risk level, type of surgery, projec

**Comparison of Responses**:

Initial LLM vs. Prompt Engineering vs. RAG (with Fine-tuning)

Here's a brief comparison of the responses from each stage:

**Initial LLM**: Responses were often incomplete and sometimes cut off, lacking specific medical detail. Correctness varied, relying solely on general knowledge.

**Prompt Engineering**: Responses were more structured and acknowledged limitations when context was missing, but still relied on general knowledge and lacked the specific detail needed for medical queries. Completeness was better due to the explicit structure but still not ideal without relevant context.

**RAG (with Fine-tuning)**: This approach yielded significantly better results. Responses were much more complete, detailed, and medically accurate, drawing specific information directly from the Merck Manual. Fine-tuning here means adjusting the generation and retrieval parameters (temperature, top_p, top_k, and the number of retrieved chunks k), not retraining the model; these adjustments helped refine the answers further. This method provided the most correct and comprehensive answers for the medical queries.

## Output Evaluation

Let us now use the LLM-as-a-judge method to check the quality of the RAG system on two parameters: retrieval and generation. We illustrate this evaluation using the answers generated for the questions from the previous section.

- We are using the same Mistral model for evaluation, so here the LLM is effectively rating how well it has itself performed on the task.

In [None]:
groundedness_rater_system_message = """
You are an evaluator tasked with assessing the *groundedness* of AI-generated answers.

Groundedness means:
→ The answer must be fully derived from and supported by the provided context — without introducing external information, speculation, or assumptions.

You will be presented with:
- ###Question — the user’s question
- ###Context — the reference information used by the AI
- ###Answer — the AI-generated response

---

### Evaluation Task:
Judge how well the answer is grounded in the context using the following scale:

1 – Not grounded at all (Answer introduces unrelated or incorrect information)
2 – Grounded to a limited extent (Some parts reflect the context, but mostly unsupported)
3 – Grounded to a good extent (Mostly supported, with minor extrapolations or gaps)
4 – Mostly grounded (Nearly all content supported, very limited speculation)
5 – Completely grounded (All statements clearly and directly supported by the context)

---

### Evaluation Steps:
1. Read the Question to understand what is being asked.
2. Review the Context to identify all relevant information available to the AI.
3. Compare the Answer against the Context:
   - Check if every key statement is backed by the context.
   - Flag any external, speculative, or fabricated details.
4. Explain your reasoning step-by-step — noting which parts of the answer are supported or unsupported.
5. Assign a final rating (1–5) with a concise justification summarizing how grounded the answer is.

---

### Output Format:
Your response should include:
1. Step-by-step evaluation (how you assessed the grounding)
2. Final Rating (1–5)
3. Overall Explanation (2–3 sentences) summarizing your judgment
"""


In [None]:
relevance_rater_system_message = """
You are an evaluator tasked with assessing the *relevance* of AI-generated answers.

Relevance means:
→ The answer should directly and appropriately address the key aspects of the question, based on the information in the provided context.

You will be presented with:
- ###Question — the user’s question
- ###Context — the information available to the AI
- ###Answer — the AI-generated response

---

### Evaluation Task:
Judge how well the answer is relevant to the question using the following scale:

1 – Not relevant at all (Answer does not address the question)
2 – Slightly relevant (Touches on the topic but misses most key aspects)
3 – Moderately relevant (Addresses some important points but omits others)
4 – Mostly relevant (Covers most key aspects with minor gaps)
5 – Fully relevant (Directly and completely addresses all key aspects of the question)

---

### Evaluation Steps:
1. Read the **Question** carefully to identify its key aspects or intent.
2. Review the **Context** to understand what information is available for answering.
3. Compare the **Answer** to the **Question**:
   - Check if the answer focuses on the main topic.
   - Note whether it addresses all important aspects asked.
   - Identify if any irrelevant or off-topic details are included.
4. Explain your reasoning step-by-step — describing which parts of the answer are relevant or missing.
5. Assign a final rating (1–5) with a concise justification summarizing how relevant the answer is.

---

### Output Format:
Your response should include:
1. Step-by-step evaluation (how you assessed the relevance)
2. Final Rating (1–5)
3. Overall Explanation (2–3 sentences) summarizing your reasoning
"""


In [None]:
user_message_template = """
###Question
{question}

###Context
{context}

###Answer
{answer}
"""

In [None]:
def generate_ground_relevance_response(user_input,k=3,max_tokens=128,temperature=0,top_p=0.95,top_k=50):
    # Retrieve relevant document chunks (use the k argument instead of a hard-coded 3)
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input,k=k)
    context_list = [d.page_content for d in relevant_document_chunks]
    context_for_query = ". ".join(context_list)

    # Build the question-answering prompt from the system message and the filled user template
    prompt = f"""[INST]{qna_system_message}\n
                {'user'}: {qna_user_message.format(context=context_for_query, question=user_input)}
                [/INST]"""

    response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    answer =  response["choices"][0]["text"]

    # Build the groundedness-rating prompt from the rater system message and the question, context, and answer
    groundedness_prompt = f"""[INST]{groundedness_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    # Build the relevance-rating prompt from the rater system message and the question, context, and answer
    relevance_prompt = f"""[INST]{relevance_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    response_1 = llm(
            prompt=groundedness_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    response_2 = llm(
            prompt=relevance_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    return response_1['choices'][0]['text'],response_2['choices'][0]['text']
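
The rater prompts ask for a "Final Rating (1–5)", but the function returns free text. As a minimal sketch, the rating could be pulled out with a regular expression so that scores can be compared across queries. `extract_rating` is a hypothetical helper, and the pattern assumes the judge writes the rating as `Final Rating: N`; when the response is cut off by `max_tokens` before the rating appears, it returns `None`.

```python
import re

def extract_rating(judge_output):
    """Return the 1-5 rating found after 'Final Rating:', or None if it is missing."""
    match = re.search(
        r"Final Rating\**\s*:\s*\**\s*([1-5])\b",
        judge_output,
        flags=re.IGNORECASE,
    )
    return int(match.group(1)) if match else None

# A complete judge response versus one truncated before the rating
complete = "### Final Rating: 5\n\n### Overall Explanation:\nThe answer is grounded."
truncated = "3. Compared the answer against the context:\n   - The answer correctly"
print(extract_rating(complete), extract_rating(truncated))
```

A `None` result is a useful signal that `max_tokens` was too small for the judge to finish its evaluation.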

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=350)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 ### Step-by-step evaluation:
1. Understood the question to be about the protocol for managing sepsis in a critical care unit based on the context provided.
2. Identified relevant information from the context regarding sepsis diagnosis and treatment, including obtaining cultures, administering empiric antibiotics, adjusting antibiotics according to culture and susceptibility testing results, surgical intervention, and continuing therapy.
3. Compared the answer against the context:
   - The key statements in the answer are supported by the context (obtaining cultures, administering empiric antibiotics, adjusting antibiotics based on culture and susceptibility testing results, surgical intervention, and continuing therapy).
   - No external, speculative, or fabricated details were identified.

### Final Rating: 5

### Overall Explanation:
The answer is completely grounded in the context provided, as it accurately reflects the information presented regarding sepsis diagnosis and treatment

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=400)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 ###Step-by-step evaluation:
1. Understood the question to be about the common symptoms for appendicitis and whether it can be treated with medicine or if surgery is required.
2. Reviewed the context, which provided detailed information on the etiology, symptoms, diagnosis, and treatment of appendicitis.
3. Compared the answer against the context:
   - The answer correctly identified common symptoms such as epigastric or periumbilical pain, nausea, vomiting, shifting pain to the right lower quadrant, increased pain with cough and motion, direct and rebound tenderness at McBurney's point, and additional signs like Rovsing sign, psoas sign, or obturator sign. It also mentioned fever as a common symptom.
   - The answer correctly stated that appendicitis cannot be cured via medicine alone and that surgical removal is the treatment of choice. It also mentioned open or laparoscopic appendectomy as options for treatment.
   - The answer correctly stated that treatment should be initiated as 

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_input = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=300)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 ### Step-by-step evaluation:
1. Understood the question as asking for causes and treatments of sudden patchy hair loss, specifically alopecia areata.
2. Identified relevant information in the context regarding alopecia areata, its causes (autoimmune disorder), and treatment options.
3. Compared the answer against the context:
   - The key statement "Alopecia areata is a common cause of sudden patchy hair loss" is directly supported by the context.
   - The statement "Treatment options include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA)" is also directly supported by the context.
   - The statement "The cause is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers" is inferred from the context but is a valid assumption based on the information provided.
4. No exte

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
ground, rel = generate_ground_relevance_response(user_input, max_tokens=400)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 ### Step-by-step evaluation:
1. Understood the question to be about recommended treatments for a person with a brain injury and resulting impairment.
2. Reviewed the context, which discussed various aspects of rehabilitation for different types of injuries (head, spinal cord), focusing on physical therapy, occupational therapy, supportive care, and early intervention.
3. Compared the answer against the context:
   - The key statements in the answer are all supported by the context (physical and occupational therapy, no specific medical treatment for brain injury itself, supportive care including preventing systemic complications, providing good nutrition, and preventing pressure ulcers, early intervention by rehabilitation specialists, and brain imaging for diagnosis).
   - No external, speculative, or fabricated details were identified.

### Final Rating: 5

### Overall Explanation:
The answer is completely grounded in the context as it accurately reflects all the relevant informatio

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
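# Note: top_p=6 lies outside the usual (0, 1] range for nucleus sampling,
# assuming the value is passed through to the sampler unchanged.
ground, rel = generate_ground_relevance_response(user_input, max_tokens=500, top_p=6)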

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 ### Step-by-step evaluation:
1. Understood the question to be asking for necessary precautions and treatment steps for a person who has fractured their leg during hiking, as well as considerations for their care and recovery.
2. Reviewed the context which provided information on various treatments for injuries, including fractures, and discussed the importance of RICE (rest, ice, compression, elevation) and immobilization. The context also mentioned potential complications such as deep vein thrombosis (DVT) and the need for additional preventive treatment.
3. Compared the answer against the context:
   - Every key statement in the answer is backed by the context. The answer mentions RICE, immobilization, pain management with opioids, and definitive treatment through closed or open reduction, all of which are mentioned in the context. Additionally, the answer discusses the importance of preventing DVT and the need for additional preventive treatment, which is also mentioned in the cont

Based on the groundedness and relevance ratings provided by the LLM acting as a judge, several observations can be made:

**Groundedness**: The RAG responses consistently received high groundedness scores (mostly 5s). This indicates that the answers generated by the RAG system were well-supported by the provided context from the Merck Manual. The model, in its evaluator role, confirmed that it could trace the information back to the source documents, which is a key strength of the RAG approach.

**Relevance**: Similarly, the RAG responses scored high on relevance (mostly 5s). This suggests that the RAG system was effective in retrieving document chunks that were not only relevant to the query but also contained sufficient information to provide a comprehensive answer addressing all aspects of the user's question. The evaluator found the answers to be direct and fully aligned with the user's intent.

Overall, the LLM-as-a-judge evaluation reinforces the effectiveness of the RAG system in this medical context. The high scores for both groundedness and relevance demonstrate that the RAG approach successfully enables the model to generate answers that are both well-supported by the source material and directly relevant to the user's medical queries.
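The judge outputs above are free text, and several of them are cut off by `max_tokens` before the rating line appears. The following is a minimal sketch of how the ratings could be collected programmatically, assuming the judge keeps printing a line of the form `### Final Rating: 5` as in Query 4. The helper names `extract_rating` and `summarize_ratings` are hypothetical and are not part of the notebook.

```python
import re
from statistics import mean
from typing import List, Optional

# Matches the "Final Rating: <1-5>" line seen in the judge output for Query 4.
RATING_PATTERN = re.compile(r"Final Rating:\s*([1-5])\b")


def extract_rating(judge_output: str) -> Optional[int]:
    """Return the 1-5 rating, or None if the output ended before the rating line."""
    match = RATING_PATTERN.search(judge_output)
    return int(match.group(1)) if match else None


def summarize_ratings(judge_outputs: List[str]) -> dict:
    """Count scored vs. truncated outputs and average the ratings that were found."""
    ratings = [extract_rating(text) for text in judge_outputs]
    scored = [r for r in ratings if r is not None]
    return {
        "scored": len(scored),
        "unscored": len(ratings) - len(scored),
        "mean_rating": mean(scored) if scored else None,
    }
```

Reporting the `unscored` count alongside the mean makes it visible when an average rests on only a few complete judge outputs; raising `max_tokens` for the judge call is one way to reduce that count.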

# **Actionable Insights and Business Recommendations**

Based on the evaluation of the different approaches (Initial LLM, Prompt Engineering, and RAG with Fine-tuning) and the observations, here are some actionable insights and business recommendations for implementing a RAG-based AI solution using medical manuals in a healthcare setting:

**Actionable Insights**:

**RAG is Crucial for Medical Accuracy:** The RAG approach, leveraging the Merck Manual, significantly improved the accuracy, completeness, and specificity of responses compared to using the base LLM or prompt engineering alone. This highlights that for domain-specific, factual information like medical knowledge, grounding the LLM with relevant external data is essential.


**Prompt Engineering Enhances Structure and Awareness:** While not sufficient on its own for detailed medical answers, prompt engineering helped structure the responses and made the model aware of the importance of using the provided context. This is valuable for controlling the model's behavior and ensuring it adheres to instructions (like only using the provided context).


**Base LLM Limitations for Specialized Domains:** The initial LLM responses, without external knowledge, were often general, incomplete, and sometimes inaccurate for medical queries. This reinforces that relying solely on a general-purpose LLM is insufficient for tasks requiring deep domain expertise.


**Chunking and Embedding Effectiveness:** The process of chunking the PDF and creating embeddings was effective in preparing the data for retrieval. The high relevance scores in the RAG evaluation indicate that the embedding model and retrieval mechanism successfully identified relevant document sections for the given queries.


**LLM-as-a-Judge Validation:** Using an LLM as a judge for groundedness and relevance proved to be a valuable method for evaluating the RAG system's performance objectively. This approach can be incorporated into a continuous evaluation pipeline for monitoring the system's quality.
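To make the chunking step credited above more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: it splits by characters, and the function name `chunk_text` and the default sizes are assumptions rather than the splitter or settings the notebook actually used.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, which tends to help retrieval when an answer spans two adjacent passages.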




**Business Recommendations:**

**Prioritize RAG Implementation:** Given the significant improvement in accuracy and completeness, healthcare organizations should prioritize implementing a RAG-based system for accessing medical knowledge. This directly addresses the challenge of information overload and supports better decision-making.

**Invest in High-Quality Knowledge:** The success of the RAG system is heavily reliant on the quality and comprehensiveness of the external knowledge base (like the Merck Manual). Continuously updating and expanding the medical knowledge base is crucial for maintaining the system's effectiveness.


**Develop a User-Friendly Interface:** A user-friendly interface is essential for healthcare professionals to easily pose questions and interpret the AI-generated responses. The interface should clearly present the source of the information (e.g., section of the Merck Manual) to build trust and allow for verification.


**Consider Fine-tuning for Specific Sub-domains:** While the general Merck Manual is comprehensive, fine-tuning the system on more specialized medical sub-domains or internal protocols could further enhance accuracy and relevance for specific departments or use cases.

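**Address Data Privacy and Security:** Data privacy and security are critical in a healthcare setting and should be treated as core requirements of any deployment, not as an afterthought.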
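The recommendation to show the source of each answer could be sketched as follows. This assumes each retrieved chunk carries `document` and `page` metadata; the function name `format_answer_with_sources` and the field names are hypothetical, and the real metadata depends on how the PDF was loaded.

```python
from typing import Dict, List


def format_answer_with_sources(answer: str, sources: List[Dict[str, object]]) -> str:
    """Append a de-duplicated list of source references to the generated answer."""
    lines = [answer.strip(), "", "Sources:"]
    seen = set()
    for src in sources:
        key = (src.get("document"), src.get("page"))
        if key in seen:
            continue  # the same page can be retrieved more than once
        seen.add(key)
        document = src.get("document", "unknown document")
        page = src.get("page", "n/a")
        lines.append(f"- {document}, page {page}")
    if not seen:
        lines.append("- none retrieved")
    return "\n".join(lines)
```

Showing the references directly under the answer lets a clinician verify a claim against the manual before acting on it.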

In [1]:
!jupyter nbconvert --ClearMetadataPreprocessor.enabled=True --inplace /content/*.ipynb

