## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to create accurate diagnoses and treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Questions to Answer

•	What is the protocol for managing sepsis in a critical care unit?

•	What are the common symptoms of appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

•	What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

•	What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

•	What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?


### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co. They cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [None]:
# Installation for GPU llama-cpp-python
# Run the following line when a GPU is being used (active by default)
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

# Installation for CPU llama-cpp-python
# If no GPU is being used, comment out the line above, then uncomment and run the following line
# !CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done

In [None]:
# Uninstall existing versions of pandas and numpy
!pip uninstall pandas numpy -y


Found existing installation: pandas 1.5.3
Uninstalling pandas-1.5.3:
  Successfully uninstalled pandas-1.5.3
Found existing installation: numpy 2.2.6
Uninstalling numpy-2.2.6:
  Successfully uninstalled numpy-2.2.6


In [None]:
# For installing the libraries & downloading models from HF Hub
!pip install huggingface_hub==0.23.2 pandas==1.5.3 tiktoken==0.6.0 pymupdf==1.25.1 langchain==0.1.1 langchain-community==0.0.13 chromadb==0.4.22 sentence-transformers==2.3.1 numpy==1.25.2 -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas==2.2.2, but you have pandas 1.5.3 which is incompatible.
cudf-cu12 25.2.1 requires pandas<2.2.4dev0,>=2.0, but you have pandas 1.5.3 which is incompatible.
thinc 8.3.6 requires numpy<3.0.0,>=2.0.0, but you have numpy 1.25.2 which is incompatible.
dask-expr 1.1.21 requires pandas>=2, but you have pandas 1.5.3 which is incompatible.
mizani 0.13.5 requires pandas>=2.2.0, but you have pandas 1.5.3 which is incompatible.
db-dtypes 1.4.3 requires packaging>=24.2.0, but you have packaging 23.2 which is incompatible.
plotnine 0.14.5 requires pandas>=2.2.0, but you have pandas 1.5.3 which is incompatible.
blosc2 3.3.2 requires numpy>=1.26, but you have numpy 1.25.2 which is incompatible.
dask-cudf-cu12 25.2.2 requires pandas<2.2.4dev0,>=2.0, but you have pandas 1.5.3 which is incompa

In [None]:
# Restart the session first (Runtime menu -> Restart session), then run the code below.
# Without a restart, you may get a NumPy datatype error or other errors.
# Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

from huggingface_hub import hf_hub_download
from llama_cpp import Llama
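
Because the install cell reports dependency conflicts and the session has to be restarted, it can help to confirm which versions are actually active before going further. The following is a minimal sketch using only the standard library; the `EXPECTED` dictionary and the `check_versions` helper are illustrative names, and the pins are copied from the `pip install` cell above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip install cell above (a subset, for illustration)
EXPECTED = {
    "numpy": "1.25.2",
    "pandas": "1.5.3",
    "langchain": "0.1.1",
    "chromadb": "0.4.22",
}

def check_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches

# An empty result means every pinned package is installed at the pinned version
print(check_versions(EXPECTED) or "All pinned versions match.")
```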

## Question Answering using LLM

#### Downloading and Loading the model (Mistral)

In [None]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [None]:
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [None]:
llm = Llama(
    model_path=model_path,
    n_ctx=1024,
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Defining Model Response Parameters (Mistral)

In [None]:
def response(query,max_tokens=128,temperature=0,top_p=0.95,top_k=50):
    model_output = llm(
      prompt=query,
      max_tokens=max_tokens,
      temperature=temperature,
      top_p=top_p,
      top_k=top_k
    )

    return model_output['choices'][0]['text']
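
To make the roles of `temperature`, `top_k`, and `top_p` in `response` more concrete, here is a simplified, pure-Python sketch of how these parameters narrow the set of candidate next tokens. It is a conceptual illustration only, not llama.cpp's actual sampler: the function name `filter_candidates`, the toy logits, and the order in which the filters are applied are all assumptions.

```python
import math

def filter_candidates(logits, temperature=1.0, top_k=50, top_p=0.95):
    """Return {token: probability} after temperature, top-k, and top-p filtering."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: only the most likely token survives
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Temperature scaling followed by a numerically stable softmax
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda pair: pair[1], reverse=True)

    # top-k: keep at most the k most likely tokens
    ranked = ranked[:max(1, top_k)]

    # top-p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}

# Toy logits for three hypothetical tokens
toy_logits = {"sepsis": 2.0, "fever": 1.0, "rash": 0.0}
print(filter_candidates(toy_logits, temperature=0))
print(filter_candidates(toy_logits, temperature=1.0, top_k=2, top_p=1.0))
```

With `temperature=0`, as used by default in `response`, only one candidate remains, so `top_p` and `top_k` have no effect on the result in this sketch.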

#### Downloading and Loading the model (Llama)

In [None]:
model_name_or_path = "TheBloke/Llama-2-13B-chat-GGUF"
model_basename = "llama-2-13b-chat.Q5_K_M.gguf" # the model is in gguf format

In [None]:
# Using hf_hub_download to download a model from the Hugging Face model hub
# The repo_id parameter specifies the model name or path in the Hugging Face repository
# The filename parameter specifies the name of the file to download
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)

In [None]:
lcpp_llm = Llama(
    model_path=model_path,
    n_threads=2,  # CPU cores
    n_batch=512,  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
    n_gpu_layers=43,  # Change this value based on your model and your GPU VRAM pool.
    n_ctx=4096,  # Context window
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Defining Model Response Parameters (Llama)

In [None]:
def generate_llama_response(query):

    # Generate a response from the LLaMA model
    model_output = lcpp_llm(
        prompt=query,
        max_tokens=1024,
        temperature=0,
        top_p=0.95,
        repeat_penalty=1.2,
        top_k=50,
        echo=False

    )


    return  model_output["choices"][0]["text"]
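
Both response functions above pass the raw question to the model. Instruction-tuned models are usually trained with a specific prompt template, so wrapping the question may change the quality of the answers. The helpers below are a hedged sketch of the commonly documented templates for Mistral-7B-Instruct and Llama-2-chat; the function names are illustrative, and whether the templates improve results for these questions is not something this notebook tests.

```python
def mistral_instruct_prompt(user_message):
    """Wrap a question in the [INST] ... [/INST] markers used by Mistral-7B-Instruct."""
    return f"[INST] {user_message.strip()} [/INST]"

def llama2_chat_prompt(user_message, system_message=None):
    """Wrap a question in the Llama-2-chat template, with an optional system message."""
    user_message = user_message.strip()
    if system_message:
        return (
            f"[INST] <<SYS>>\n{system_message.strip()}\n<</SYS>>\n\n"
            f"{user_message} [/INST]"
        )
    return f"[INST] {user_message} [/INST]"

question = "What is the protocol for managing sepsis in a critical care unit?"
print(mistral_instruct_prompt(question))
print(llama2_chat_prompt(question, system_message="You are a careful medical assistant."))
```

The resulting strings could be passed as `query` to `response` or `generate_llama_response` in place of the bare question.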


### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
# Ask a question using (Mistral)
query = "What is the protocol for managing sepsis in a critical care unit?"
answer = response(query)
print("Answer:", answer)

Answer: 

Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Source control


The explanation is mostly clear and medically sound. It correctly emphasizes early recognition and aggressive management of sepsis.

However, the answer is incomplete: it ends abruptly at "2. Source control" without elaboration, most likely because the default `max_tokens=128` in `response` was reached.
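
An abrupt ending like this can also be detected programmatically instead of by reading the text. The completion dictionary returned by llama-cpp-python follows the OpenAI-style layout, where each choice carries a `finish_reason` of `"length"` when the token limit stopped generation. The sketch below assumes that layout; `completion_text_and_status` and the example dictionary are hypothetical and stand in for a real model call.

```python
def completion_text_and_status(model_output):
    """Return (text, truncated) from an OpenAI-style completion dictionary."""
    choice = model_output["choices"][0]
    # "length" means generation stopped because max_tokens was reached
    truncated = choice.get("finish_reason") == "length"
    return choice["text"], truncated

# Hypothetical completion dictionary, shaped like the value returned by llm(...)
example_output = {
    "choices": [{"text": "2. Source control", "finish_reason": "length"}]
}

text, truncated = completion_text_and_status(example_output)
if truncated:
    print("Answer was cut off at the token limit; consider raising max_tokens.")
```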

In [None]:
# Ask a question using (Llama)
query = "What is the protocol for managing sepsis in a critical care unit?"
answer = generate_llama_response(query)
print("Answer:", answer)

Answer: 
Sepsis is a life-threatening condition that can arise from an infection, and it is important to have a well-defined protocol for its management in a critical care unit. Here are some key components of a sepsis protocol:

1. Early recognition and identification: The first step in managing sepsis is to recognize the signs and symptoms early on, such as fever, tachycardia, tachypnea, and confusion. The patient's medical history, including any recent surgeries or hospitalizations, should also be reviewed for potential sources of infection.
2. Rapid laboratory testing: Blood cultures and other relevant lab tests (e.g., white blood cell count, serum lactate) should be performed promptly to identify the source of infection and assess the severity of sepsis.
3. Administration of antibiotics: Broad-spectrum antibiotics should be administered as soon as possible, ideally within the first hour of recognition of sepsis. The choice of antibiotic should be guided by the suspected source of 

Excellent quality response — informative, well-structured, and clinically appropriate. Minor enhancements (mentioning Sepsis-3, MAP target) would make it even more robust. Rating: 9/10 for clarity, completeness, and clinical usefulness.

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
# Ask a question using (Mistral)
query = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
answer = response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 

Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:

1. Abdominal pain: The pain is typically sharp, localized in the lower right quadrant of the abdomen, and worsens with movement or pressure on the area.
2. Loss of appetite: People with appendicitis may lose their appetite due to abdominal discomfort.
3. Nausea and


Partially useful but incomplete response. Starts strong with symptom explanation, but fails to answer the full question and abruptly ends mid-sentence.

In [None]:
# Ask a question using (Llama)
query = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
answer = generate_llama_response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 

Answer: Appendicitis is a medical emergency that requires prompt treatment. The most common symptoms of appendicitis include:

1. Severe pain in the abdomen, usually starting near the belly button and then moving to the lower right side of the abdomen.
2. Nausea and vomiting.
3. Loss of appetite.
4. Fever.
5. Abdominal tenderness and guarding (muscle tension).
6. Abdominal swelling.
7. Diarrhea or constipation.

If you suspect that you or someone else may have appendicitis, it is essential to seek medical attention immediately. Appendicitis can be cured via surgery, but not with medicine alone. In fact, delaying treatment can lead to serious complications, such as the appendix rupturing and spreading infection throughout the abdominal cavity (peritonitis).

The surgical procedure used to treat appendicitis is called an appendectomy. There are two types of appendectomies: open and laparoscopic.

1. Open Appendectomy: This is the traditional method, where a single incision is m

Thorough, clear, and informative answer that fully addresses the user's question with accurate medical content and good structure.

Rating: 9.5/10

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
# Ask a question using (Mistral)
query = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
answer = response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 

Sudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, and eyelashes.

The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications.


Well-structured and accurate start that covers the causes of sudden patchy hair loss effectively. However, the lack of treatment details makes the answer feel incomplete.

In [None]:
# Ask a question using (Llama)
query = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
answer = generate_llama_response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 
Sudden patchy hair loss, also known as alopecia areata, can have various underlying causes. Here are some of the most effective treatments or solutions for addressing this condition:
1. Corticosteroid injections: These injections can help suppress the immune system and promote hair growth. They are usually administered every 4-6 weeks.
2. Topical corticosteroids: Over-the-counter or prescription creams, gels, or ointments containing corticosteroids can be applied directly to the affected area to reduce inflammation and promote hair growth.
3. Minoxidil (Rogaine): This is a solution that you apply to your scalp, which can help stimulate hair growth and slow down hair loss. It's available over-the-counter.
4. Anthralin: This medication is applied to the skin to reduce inflammation and promote hair growth. It may be used in combination with corticosteroids or minoxidil.
5. Phototherapy: Exposure to specific wavelengths of light, such as ultraviolet B (UVB) or narrowband UVB, can 

This is a strong and informative response that provides a broad range of effective treatments and a comprehensive list of causes for sudden patchy hair loss. It fulfills the query requirements thoroughly and offers actionable advice with medical accuracy.

Rating: 9/10

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
# Ask a question using (Mistral)
query = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
answer = response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 

A person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:

1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.
2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as


The answer starts well but is incomplete and limited in scope, making it insufficient for someone seeking a full understanding of TBI treatment.

Expanding the response to include rehabilitation, long-term care, and complications would significantly improve its usefulness.

In [None]:
# Ask a question using (Llama)
query = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
answer = generate_llama_response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 
There are several treatment options available for individuals who have suffered a physical injury to brain tissue, depending on the severity and location of the injury. Here are some common treatments recommended for such cases:
1. Medications: To manage symptoms such as pain, inflammation, seizures, and anxiety, medications may be prescribed. These can include analgesics, anti-inflammatory drugs, anticonvulsants, and anxiolytics.
2. Rehabilitation therapy: Physical, occupational, and speech therapies can help improve cognitive, motor, and communication skills that may have been affected by the injury. These therapies can be conducted individually or in a group setting.
3. Surgery: In some cases, surgery may be necessary to relieve pressure on the brain caused by swelling or bleeding, or to repair damaged blood vessels. The type of surgical procedure will depend on the location and severity of the injury.
4. Stem cell therapy: This is a relatively new treatment option that inv

The Llama response is informative, wide-ranging, and balanced, offering a strong summary of standard and supplementary treatments for brain injury recovery.

With slight improvement in precision regarding treatment evidence and injury severity context, this would be excellent for both general and semi-clinical audiences.

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
# Ask a question using (Mistral)
query = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
answer = response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 

First and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:

1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.
2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or low blood pressure. If you notice these symptoms, seek medical help immediately.
3. Immobilize the leg: Use a splint,


The Mistral response demonstrates strong initial emergency response knowledge and is helpful for on-site first aid during a hiking injury.

However, it is incomplete and would benefit from expanded content on medical treatment and recovery after initial stabilization.

In [None]:
# Ask a question using (Llama)
query = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
answer = generate_llama_response(query)
print("Answer:", answer)

Llama.generate: prefix-match hit


Answer: 
A person who has fractured their leg while hiking may require immediate medical attention to ensure proper diagnosis, treatment, and management of the injury. The following are some necessary precautions and treatment steps that should be taken:

1. Stop activity and rest: If you suspect a leg fracture, stop all physical activity and rest as soon as possible to avoid exacerbating the injury.
2. Seek medical attention: It is essential to seek immediate medical attention from a healthcare professional for proper diagnosis and treatment. A doctor may perform an X-ray or CT scan to confirm the fracture and determine its severity.
3. Immobilize the affected limb: To prevent further damage and promote healing, the affected leg should be immobilized using a splint, cast, or brace. This will help keep the bone in place and reduce pain.
4. Manage pain: Pain management is crucial to ensure comfort and mobility during recovery. Your healthcare provider may prescribe medication to manage 

This is a strong, well-rounded medical response that appropriately blends emergency care, clinical procedures, rehabilitation, and prevention.

It offers practical insights for both hikers and general readers, with only minor gaps in wilderness-specific emergency handling.

Rating: 8.5/10 — Detailed and informative with room to improve wilderness-specific emergency action steps.

## Question Answering using LLM with Prompt Engineering

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
# Ask Mistral the question with 5 parameter combinations (zero-shot prompt)
prompt = "What is the protocol for managing sepsis in a critical care unit?"

combinations = [
    {"tokens": 128, "temp": 0.2, "top_p": 0.8, "top_k": 20},
    {"tokens": 100, "temp": 0.0, "top_p": 0.9, "top_k": 40},
    {"tokens": 150, "temp": 0.3, "top_p": 0.7, "top_k": 30},
    {"tokens": 120, "temp": 0.1, "top_p": 0.95, "top_k": 10},
    {"tokens": 80,  "temp": 0.0, "top_p": 1.0, "top_k": 5}
]

for i, combo in enumerate(combinations, 1):
    result = response(prompt, combo["tokens"], combo["temp"], combo["top_p"], combo["top_k"])
    print(f"--- Answer for Combination C{i} ---\n{result}\n")

Llama.generate: prefix-match hit


--- Answer for Combination C1 ---


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition and suspicion: Septic patients may present with non-specific symptoms such as fever, chills, tachycardia, tachypnea, altered mental status, or lactic acidosis. It is essential to have a high index of suspicion for sepsis in patients with known or suspected infections.
2. Initial assessment and res



Llama.generate: prefix-match hit


--- Answer for Combination C2 ---


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycard



Llama.generate: prefix-match hit


--- Answer for Combination C3 ---


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Resuscitation: Provide adequate fluid resuscitation to maintain adequate tissue perfusion. The goal is to achieve a



Llama.generate: prefix-match hit


--- Answer for Combination C4 ---


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition and suspicion: Septic patients may present with non-specific symptoms such as fever, chills, tachycardia, tachypnea, altered mental status, or lactic acidosis. It is essential to have a high index of suspicion for sepsis in any patient who has an infection or



Llama.generate: prefix-match hit


--- Answer for Combination C5 ---


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Seps



### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
# Ask Mistral the question with custom parameters (zero-shot prompt)
prompt = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
answer = response(prompt, max_tokens=300,temperature=0,top_p=0.95,top_k=50)
print("Answer:", answer)


Llama.generate: prefix-match hit


Answer: 

Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:

1. Abdominal pain: The pain is typically sharp, localized in the lower right quadrant of the abdomen, and worsens with movement or pressure on the area.
2. Loss of appetite: People with appendicitis may lose their appetite due to abdominal discomfort.
3. Nausea and vomiting: Vomiting is a common symptom of appendicitis, especially in more advanced cases.
4. Fever: A fever of 100.4°F (38°C) or higher may be present.
5. Constipation or diarrhea: Some people with appendicitis experience constipation, while others have diarrhea.
6. Inability to pass gas: Flatulence is often difficult or impossible to pass due to the inflammation and pressure on the area.
7. Rebound tenderness: When the doctor presses on the abdomen, there may be pain 

•	Medically Accurate Start: The description of appendicitis and its common symptoms is factually correct and medically sound.

•	Incomplete Answer: The response is cut off mid-sentence (it ends with "there may be pain"). It does not address whether appendicitis can be cured with medicine, or the surgical treatment (e.g., appendectomy), which was explicitly asked about.

•	Good Structure: Symptoms are well-organized in a numbered list, aiding readability.

•	Relevant Content: The content is highly relevant to the question but incomplete, likely because the output was truncated at the token limit.

•	Improvement Needed: Ensure the full question is answered, increase max_tokens if truncation caused the cutoff, and add a brief follow-up on treatment options.
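
Raising `max_tokens` only helps up to a point, because the prompt and the generated answer share the model's context window (`n_ctx=1024` for the Mistral model loaded above). The helper below is a small sketch of that arithmetic; `remaining_token_budget` is an illustrative name, and the prompt token count is assumed to come from the model's own tokenizer. How llama-cpp-python reacts when the budget is exceeded may depend on the installed version.

```python
def remaining_token_budget(n_ctx, prompt_tokens, requested_max_tokens):
    """Return how many tokens can actually be generated within the context window."""
    if prompt_tokens >= n_ctx:
        raise ValueError("The prompt alone fills the context window.")
    # The answer can use only what the prompt leaves free
    return min(requested_max_tokens, n_ctx - prompt_tokens)

# Illustrative numbers: a 40-token prompt leaves room for the full 300 tokens,
# while a 900-token prompt (for example, a question plus retrieved context) does not
print(remaining_token_budget(n_ctx=1024, prompt_tokens=40, requested_max_tokens=300))
print(remaining_token_budget(n_ctx=1024, prompt_tokens=900, requested_max_tokens=300))
```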


### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
# Ask Mistral the question with custom parameters (zero-shot prompt)
prompt = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"

answer = response(prompt, max_tokens=400,temperature=0.45,top_p=0.9,top_k=40)
print("Answer:", answer)


Llama.generate: prefix-match hit


Answer: 

Sudden patchy hair loss, also known as alopecia areata, is a common condition that affects both men and women. It is characterized by round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes. The exact cause of alopecia areata is not known, but it is believed to be an autoimmune condition in which the immune system attacks the hair follicles.

There are several treatments that have been shown to be effective for addressing sudden patchy hair loss:

1. Corticosteroids: These are anti-inflammatory medications that can help reduce inflammation and suppress the immune response against the hair follicles. They can be applied topically as creams or ointments, injected directly into the bald patches, or taken orally in pill form.
2. Minoxidil: This is a medication that has been shown to stimulate hair growth by increasing blood flow to the hair follicles and extending the anagen phase of the hair growth 

The answer is accurate and well-structured, covering both causes and treatments of patchy hair loss (alopecia areata).

It lists effective treatments clearly but ends abruptly, likely due to token limits.

Slight adjustment to max tokens is recommended for completeness.

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
# Ask Mistral the question with custom parameters (zero-shot prompt)
prompt = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
answer = response(prompt, max_tokens=512,temperature=0.7,top_p=0.8,top_k=100)
print("Answer:", answer)

Llama.generate: prefix-match hit


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
# Ask Mistral the question using a zero-shot prompt and custom sampling parameters
prompt = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
answer = response(prompt, max_tokens=256, temperature=1.0, top_p=0.7, top_k=20)
print("Answer:", answer)

## Data Preparation for RAG

### Loading the Data

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
mediref_pdf_path = "/content/drive/My Drive/Python Course/medical_diagnosis_manual.pdf"

In [None]:
pdf_loader = PyMuPDFLoader(mediref_pdf_path)

In [None]:
mediref = pdf_loader.load()

### Data Overview

#### Checking the first 5 pages

In [None]:
for i in range(5):
    print(f"Page Number : {i+1}",end="\n")
    print(mediref[i].page_content,end="\n")

Page Number : 1
sureshsharma4747@gmail.com
2UJ6134R8W
for personal use by sureshsharma4747@
shing the contents in part or full is liable 

Page Number : 2
sureshsharma4747@gmail.com
2UJ6134R8W
This file is meant for personal use by sureshsharma4747@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.

Page Number : 3
Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    ............................................................................................................................................................................

Above are the first five pages of the document.

The document contains shapes, text, and other elements.
Let's see how the text is extracted.

In [None]:
mediref[4].page_content

'921\nChapter 94. Adrenal Disorders    ................................................................................................................................................\n936\nChapter 95. Polyglandular Deficiency Syndromes    ........................................................................................................\n939\nChapter 96. Porphyrias    ..............................................................................................................................................................\n949\nChapter 97. Fluid & Electrolyte Metabolism    .....................................................................................................................\n987\nChapter 98. Acid-Base Regulation & Disorders    ..............................................................................................................\n1001\nChapter 99. Diabetes Mellitus & Disorders of Carbohydrate Metabolism    ..................................................

#### Checking the number of pages

In [None]:
len(mediref)

4114

### Data Chunking

In [None]:
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=512,
    chunk_overlap=20
)

In [None]:
document_chunks = pdf_loader.load_and_split(text_splitter)

In [None]:
len(document_chunks)

8503

In [None]:
document_chunks[0].page_content

'sureshsharma4747@gmail.com\n2UJ6134R8W\nfor personal use by sureshsharma4747@\nshing the contents in part or full is liable'

In [None]:
document_chunks[-4].page_content

"Wood's screw maneuver 2679\nWood trimmer's disease 1957\nWoodworker's lung 1957\nWord recognition score 434\nWorms 1336, 1341, 1342-1355\nCNS infection with 1728-1729\nWounds\nbite 3307\ncleansing of 3195\nclosure of 3196, 3198, 3199, 3200\ndebridement of 3196\nevaluation of 3193-3195\nhealing of 3193, 3194\ninfection of 3200, 3307 (see also specific organisms)\nirrigation of 3195\nmyiasis of 710\npostoperative 3451\nWright's stain 1166\nWrist\navascular necrosis of 390-391\nexercises for 3301, 3302\nfracture of 3210\nganglion cyst of 388\nmedian nerve compression at 391, 392\npain in 3203\nrange of motion of 3454\nsplints for 389, 392\nulnar nerve compression at 392\nWriter's cramp 1762\nWT1 gene 2400\nWuchereria bancrofti infection 1337, 1346-1347\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nW\n4101\nsureshsharma4747@gmail.com\n2UJ6134R8W\nThis file is meant for personal use by sureshsharma4747@gmail.com only.\nSharing or publishing the contents in part or full is liable 

In [None]:
document_chunks[-3].page_content

"X\nXanthelasma 579\nXanthine oxidase deficiency 3026\nXanthines 2064\nXanthochromia 1595\nXanthogranulomatous pyelonephritis 2379\nXanthomas 634, 645, 895\nXanthomatosis, cerebrotendinous 894\nX chromosome 2599, 3373 (see also Genetic)\nabnormalities of 3002-3005, 3376-3377, 3377\ninactivation of 3002, 3378\n47,XXX syndrome 3005\n48,XXXX syndrome 3005\n49,XXXXX syndrome 3005\n47,XYY syndrome 3005\nXenograft 1126 (see also Transplantation)\nXeroderma (xerosis) 661\nXerophthalmia 594\nvitamin A deficiency and 34\nXerosis 636\nXerostomia 501, 513-516\nradiation-induced 488, 505\nin Sjogren's syndrome 304\nX-linked adrenoleukodystrophy 3024\nX-linked agammaglobulinemia 1092, 1097, 1107-1108\nX-linked lymphoproliferative syndrome 1092, 1097, 1108\nX-rays 3252 (see also Radiation; Radiation therapy; Radiography)\nXylene poisoning 3348\nXylitol 518\nXylose absorption test 155, 3494, 3500\nXylose breath test 156, 157\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nX\n4102\nsureshsharm

In [None]:
document_chunks[-2].page_content

'Y\nYaws 1266-1267\nforest 1379\nY chromosome 3373 (see also Genetic)\nabnormalities of 3005\nYeast infection (see also Fungal infection)\nvaginal 2542, 2544, 2545\nYellow fever 1400, 1429, 1437\nhepatic inflammation in 248\nvaccine against 1172, 1437, 3441\nYellow nail syndrome 732, 1995\npleural effusion in 1997\nYellow skin (see Jaundice)\nYersinia infection 1167, 1256-1257\nY. enterocolitica infection 147\nY. pestis infection 1924\nYew poisoning 3338\nYips 1762\nYo, antibodies to 1056\nYolk sac tumor 2476\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nY\n4103\nsureshsharma4747@gmail.com\n2UJ6134R8W\nThis file is meant for personal use by sureshsharma4747@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action.'

In [None]:
document_chunks[-1].page_content

"Z\nZafirlukast 1879\nZalcitabine 1451\nin children 2854\nZaleplon 1709\nZanamivir 1407\nin influenza 1407, 1929\nZAP-70 (zeta-associated protein 70) deficiency 1092, 1108\nZavanelli maneuver 2680\nZellweger syndrome 2383, 3023\nZenker's diverticulum 125\nZidovudine 1451, 1453\nin children 2854\nZileuton 1881\nin asthma 1880\nZinc 49, 55, 3431-3432\nin common cold 1405\ndeficiency of 11, 49, 55\nin dermatophytoses 705\npoisoning with 3328, 3353\nrecommended dietary allowances for 50\nreference values for 3499\ntoxicity of 49, 55\ncopper deficiency and 49\nin Wilson's disease 52\nZinc oxide 2233\ngelatin formulation of 646, 672\nZinc pyrithione 647\nZinc shakes 55\nZipper injury 3239, 3240\nZiprasidone\nin agitation 1492\nin bipolar disorder 3059\npoisoning with 3347\nin schizophrenia 1566\nZoledronate 359, 361, 848\nZollinger-Ellison syndrome 95, 199, 200-201, 910\nmastocytosis vs 1125\nMenetrier's disease vs 132\npeptic ulcer disease vs 134\nZolmitriptan 1721\nZolpidem 1709, 3103\nZon

In [None]:
document_chunks[119].page_content

'Table 3-1).\nAssessing Response to Nutritional Support\nThere is no gold standard to assess response. Clinicians commonly use indicators of lean body mass\nsuch as the following:\n• Body mass index (BMI)\n• Body composition analysis\n• Body fat distribution (see pp.\n11 and 58)\nNitrogen balance, response to skin antigens, muscle strength measurement, and indirect calorimetry can\nalso be used.\n[Table 3-1. Estimated Adult Daily Protein Requirement]\nNitrogen balance, which reflects the balance between protein needs and supplies, is the difference\nbetween amount of nitrogen ingested and amount lost. A positive balance (ie, more ingested than lost)\nimplies adequate intake. Precise measurement is impractical, but estimates help assess response to\nnutritional support. Nitrogen intake is estimated from protein intake: nitrogen (g) equals protein (g)/6.25.\nEstimated nitrogen losses consist of urinary nitrogen losses (estimated by measuring urea nitrogen\ncontent of an accurately obtain

In [None]:
document_chunks[120].page_content

'Specific indications for enteral nutrition include the following:\n• Prolonged anorexia\n• Severe protein-energy undernutrition\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nChapter 3. Nutritional Support\n70\nsureshsharma4747@gmail.com\n2UJ6134R8W\nThis file is meant for personal use by sureshsharma4747@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action.'

As expected, consecutive chunks overlap:

The text '**Specific indications for enteral nutrition include the following:\n• Prolonged anorexia**' appears in both chunks.
Increasing chunk_overlap lengthens the text shared between consecutive chunks.
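
To illustrate the effect of chunk_overlap, here is a simplified sketch of a sliding window over a token list. It is not how RecursiveCharacterTextSplitter works internally (that splitter breaks on separators and then merges pieces); `chunk_with_overlap` is a hypothetical helper used only to show how consecutive windows share tokens.

```python
def chunk_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows that share `chunk_overlap` tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


tokens = list(range(10))  # stand-in for token ids
chunks = chunk_with_overlap(tokens, chunk_size=4, chunk_overlap=2)
for chunk in chunks:
    print(chunk)
```

With a larger chunk_overlap the step shrinks, so each window repeats more of the previous one and the total number of chunks grows.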

### Embedding

In [None]:
embedding_model = SentenceTransformerEmbeddings(model_name='thenlper/gte-large')


In [None]:
embedding_1 = embedding_model.embed_query(document_chunks[0].page_content)
embedding_2 = embedding_model.embed_query(document_chunks[1].page_content)

In [None]:
print("Dimension of the embedding vector ",len(embedding_1))
len(embedding_1)==len(embedding_2)

Dimension of the embedding vector  1024


True

•	The embedding model produces a fixed-length vector (1024 dimensions here) for every chunk, regardless of the chunk's length.

•	This is necessary because vectors must have the same dimension to be compared for similarity.
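
As a rough illustration of why equal-length vectors matter, the sketch below compares toy 3-dimensional vectors with cosine similarity and ranks them against a query. Cosine similarity is one common choice of measure; the metric a given vector store uses by default may differ. The vectors and the helper names `cosine_similarity` and `top_k_indices` are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_indices(query_vec, doc_vecs, k):
    """Indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors standing in for real 1024-dimensional embeddings.
query = [1.0, 0.0, 0.0]
docs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
print(top_k_indices(query, docs, k=2))
```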


### Vector Database

In [None]:
out_dir = 'medical_db'

# Create the persistence directory if it does not already exist
os.makedirs(out_dir, exist_ok=True)

In [None]:
vectorstore = Chroma.from_documents(
    document_chunks,
    embedding_model,
    persist_directory=out_dir
)

In [None]:
vectorstore = Chroma(persist_directory=out_dir,embedding_function=embedding_model)

In [None]:
vectorstore.embeddings

HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
  (2): Normalize()
), model_name='thenlper/gte-large', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False)

In [None]:
# vectorstore.similarity_search("The Merck Manual of Diagnosis & Therapy, 19th Edition ",k=5)
vectorstore.similarity_search("What is the protocol for managing sepsis in a critical care unit?",k=5)


[Document(page_content="16 - Critical Care Medicine\nChapter 222. Approach to the Critically Ill Patient\nIntroduction\nCritical care medicine specializes in caring for the most seriously ill patients. These patients are best\ntreated in an ICU staffed by experienced personnel. Some hospitals maintain separate units for special\npopulations (eg, cardiac, surgical, neurologic, pediatric, or neonatal patients). ICUs have a high\nnurse:patient ratio to provide the necessary high intensity of service, including treatment and monitoring\nof physiologic parameters.\nSupportive care for the ICU patient includes provision of adequate nutrition (see p. 21) and prevention of\ninfection, stress ulcers and gastritis (see p. 131), and pulmonary embolism (see p. 1920). Because 15 to\n25% of patients admitted to ICUs die there, physicians should know how to minimize suffering and help\ndying patients maintain dignity (see p. 3480).\nPatient Monitoring and Testing\nSome monitoring is manual (ie, by di

•	From the retrieved chunks, we observe that all of them relate to the key terms 'sepsis' and 'critical care'.

### Retriever

In [None]:
retriever = vectorstore.as_retriever(
    search_type='similarity',
    search_kwargs={'k': 2}
)

In [None]:
rel_docs = retriever.get_relevant_documents("What is the protocol for managing sepsis in a critical care unit?")
rel_docs

[Document(page_content="16 - Critical Care Medicine\nChapter 222. Approach to the Critically Ill Patient\nIntroduction\nCritical care medicine specializes in caring for the most seriously ill patients. These patients are best\ntreated in an ICU staffed by experienced personnel. Some hospitals maintain separate units for special\npopulations (eg, cardiac, surgical, neurologic, pediatric, or neonatal patients). ICUs have a high\nnurse:patient ratio to provide the necessary high intensity of service, including treatment and monitoring\nof physiologic parameters.\nSupportive care for the ICU patient includes provision of adequate nutrition (see p. 21) and prevention of\ninfection, stress ulcers and gastritis (see p. 131), and pulmonary embolism (see p. 1920). Because 15 to\n25% of patients admitted to ICUs die there, physicians should know how to minimize suffering and help\ndying patients maintain dignity (see p. 3480).\nPatient Monitoring and Testing\nSome monitoring is manual (ie, by di

•	The two retrieved chunks contain the answer to the query.

•	Increasing k retrieves more chunks, which may surface additional relevant passages but also lengthens the prompt.

•	k is therefore a hyperparameter that we need to tune to get the best context.
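
The choice of k also interacts with the model's context window: the retrieved chunks, the prompt text, and the generated answer all have to fit inside it. The sketch below is a back-of-the-envelope budget, assuming each chunk uses up to chunk_size=512 tokens, a context size of n_ctx=2300 (the value used when the model is loaded in this notebook), and a hypothetical prompt overhead of 150 tokens. The chunks are measured with a different tokenizer than the model's own, so the numbers are only an estimate.

```python
def fits_context(k, max_tokens, chunk_size=512, n_ctx=2300, prompt_overhead=150):
    """Rough check that k chunks, the prompt text and the answer fit in the context window.

    prompt_overhead is an assumed allowance for the system message and template text.
    """
    needed = k * chunk_size + prompt_overhead + max_tokens
    return needed <= n_ctx, needed


for k, max_tokens in [(2, 128), (3, 128), (3, 512), (4, 400)]:
    ok, needed = fits_context(k, max_tokens)
    print(f"k={k}, max_tokens={max_tokens}: ~{needed} tokens needed, fits={ok}")
```

Under these assumptions, larger k values leave less room for the answer, so k and max_tokens should be tuned together.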


### Downloading and Loading the Model

In [None]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [None]:
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)

mistral-7b-instruct-v0.2.Q6_K.gguf:   0%|          | 0.00/5.94G [00:00<?, ?B/s]

In [None]:
# Load the model with GPU offloading (n_gpu_layers); this assumes the runtime is connected to a GPU.
llm = Llama(
    model_path=model_path,
    n_ctx=2300,
    n_gpu_layers=38,
    n_batch=512
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


In [None]:
llm("What is the protocol for managing sepsis in a critical care unit?")['choices'][0]['text']

'\n\nSepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following steps outline the general approach to managing sepsis in a critical care setting:\n\n1. Early recognition: Suspect sepsis in any patient with suspected or confirmed infection who is exhibiting signs of organ dysfunction, such as altered mental status, decreased urine output, respiratory distress, or hypotension. Use the Sequential Organ Failure Assessment (SOFA) score to assess organ dysfunction and monitor for progression'

The response is generic: it is drawn from the model's pre-training knowledge rather than from our medical manual. Let's provide our own context and align the response with our needs.

### System and User Prompt Template

Prompts guide the model to generate accurate responses. Here, we define two parts:

1. The system message describing the assistant's role.
2. A user message template including context and the question.

In [None]:
qna_system_message = """
You are an assistant whose work is to review the report and provide the appropriate answers from the context.
User input will have the context required by you to answer user questions.
This context will begin with the token: ###Context.
The context contains references to specific portions of a document relevant to the user query.

User questions will begin with the token: ###Question.

Please answer only using the context provided in the input. Do not mention anything about the context in your final answer.

If the answer is not found in the context, respond "I don't know".
"""

In [None]:
qna_user_message_template = """
###Context
Here are some documents that are relevant to the question mentioned below.
{context}

###Question
{question}
"""

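To make the role of the two placeholders concrete, the sketch below fills a shortened stand-in template with toy chunks and a toy question. `build_prompt`, the shortened messages, and the sample strings are hypothetical and only illustrate the substitution step.

```python
# Shortened stand-ins for the system message and user template (not the full originals).
system_message = "Answer only from the context."
user_template = """
###Context
{context}

###Question
{question}
"""


def build_prompt(system_message, user_template, chunks, question):
    """Join the chunks into one context string and substitute both placeholders."""
    context = ". ".join(chunks)
    user_message = user_template.replace("{context}", context)
    user_message = user_message.replace("{question}", question)
    return system_message + "\n" + user_message


prompt = build_prompt(
    system_message,
    user_template,
    ["Chunk one text", "Chunk two text"],
    "What is sepsis?",
)
print(prompt)
```

Using `str.replace` rather than `str.format` means that stray curly braces inside a retrieved chunk do not raise an error during substitution.
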
### Response Function

In [None]:
def generate_rag_response(user_input, k=3, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    # Retrieve the k most similar chunks. Querying the vector store directly ensures
    # that k is honored; the retriever above was built with its own search_kwargs (k=2)
    # and may ignore a k passed at call time.
    relevant_document_chunks = vectorstore.similarity_search(user_input, k=k)
    context_list = [d.page_content for d in relevant_document_chunks]

    # Combine document chunks into a single context
    context_for_query = ". ".join(context_list)

    # Fill the user message template with the context and the question
    user_message = qna_user_message_template.replace('{context}', context_for_query)
    user_message = user_message.replace('{question}', user_input)

    prompt = qna_system_message + '\n' + user_message

    # Generate the response
    try:
        response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k
        )

        # Extract the generated text from the model's response
        response = response['choices'][0]['text'].strip()
    except Exception as e:
        response = f'Sorry, I encountered the following error: \n {e}'

    return response

## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, the protocol for managing sepsis in a critical care unit includes:
1. Administering parenteral antibiotics after taking specimens for Gram stain and culture.
2. Starting very prompt empiric therapy as soon as sepsis is suspected.
3. Selecting an antibiotic regimen based on the suspected source, clinical setting, knowledge or suspicion of causative organisms and sensitivity patterns common to that specific inpatient unit, and previous culture results.
4. Adding vancomycin if resistant staphylococci or enter


•	The answer is grounded in the retrieved context and well-structured, but it is cut off mid-sentence at the default max_tokens=128.

•	For queries like this, we expect a context-based response of this nature, ideally a complete one.


### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


###Answer
The common symptoms for appendicitis include abdominal pain, anorexia, and abdominal tenderness. Appendicitis cannot be cured via medicine alone; surgery, specifically a surgical removal of the appendix (appendectomy), is required for treatment.


•	The answer is clear, concise, and focused, without any unnecessary information.


### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_input = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, the condition being described is Alopecia Areata. The effective treatments for this condition include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). The possible cause behind sudden patchy hair loss in this context is an autoimmune disorder.


•	The answer is clear, concise, and focused, without any unnecessary information.


### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, the recommended treatments for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function include ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. Surgery may be needed in patients with more severe injury to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas. In the first few days after the injury, maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium are important


•	The answer is clear and focused, but it ends without finishing its last sentence because it reaches the default max_tokens=128.


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, here is the answer:

The person with a fractured leg should elevate the injured limb above heart level for the first 2 days to minimize swelling. After 48 hours, they can apply warmth using a heating pad for 15 to 20 minutes to relieve pain and speed healing. Immobilization is necessary to prevent further injury and facilitate healing. Joints proximal and distal to the injury should be immobilized using either a cast or a splint. A cast is usually used for fractures that require weeks of immobilization,


•	The answer is clear and focused, but it is cut off mid-sentence at the default max_tokens=128.

### Parameter Tuning

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
print(generate_rag_response(user_input,k=3,max_tokens=300,temperature=0,top_p=0.95,top_k=30))

Llama.generate: prefix-match hit


Based on the context provided, the protocol for managing sepsis in a critical care unit includes:
1. Administering parenteral antibiotics after taking specimens for Gram stain and culture.
2. Starting very prompt empiric therapy as soon as sepsis is suspected.
3. Selecting an antibiotic regimen based on the suspected source, clinical setting, knowledge or suspicion of causative organisms and sensitivity patterns common to that specific inpatient unit, and previous culture results.
4. Adding vancomycin if resistant staphylococci or enterococci are suspected.
5. Including a drug effective against anaerobes if there is an abdominal source.
6. Changing the antibiotic regimen based on culture and sensitivity results.
7. Continuing antibiotics for at least 5 days after shock resolves and evidence of infection subsides.
8. Draining abscesses and excising necrotic tissues to eliminate septic foci.
9. Normalizing blood glucose levels between 80 to 110 mg/dL (4.4 to 6.1 mmol/L) using a continuou

•	The answer with tuned parameters is clearly better.

•	It is more complete, detailed, and well-structured, covering the essential steps in sepsis management, including antibiotic selection, source control, and glucose management.

•	The first answer (default parameters) is cut off after the fourth step and lacks crucial information, making it less useful.

•	The gain comes mainly from raising max_tokens from 128 to 300; with temperature=0, decoding is effectively greedy, so top_p and top_k have little influence. The answer is still cut off at step 9, so an even larger max_tokens would be needed for a complete response.


### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(generate_rag_response(user_input,k=4,max_tokens=400,temperature=0.45,top_p=0.9,top_k=40))

Llama.generate: prefix-match hit


###Answer
The common symptoms for appendicitis include abdominal pain, anorexia, and abdominal tenderness. Appendicitis cannot be cured via medicine alone; surgery, specifically a surgical removal of the appendix called appendectomy, is required for treatment.


•	Both answers, with default and with tuned parameters, correctly identify common appendicitis symptoms and state that it is not curable by medicine alone.

•	Both specify appendectomy as the surgical treatment.


•	The two answers are nearly identical; they differ only in wording ("(appendectomy)" versus "called appendectomy").

•	Both answers appear complete and directly address the user's query based on the context, despite different max_tokens settings, as the required information is relatively short.


### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_input = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(generate_rag_response(user_input,k=3,max_tokens=512,temperature=0.7,top_p=0.8,top_k=100))

Llama.generate: prefix-match hit


The context suggests that alopecia areata is a condition characterized by sudden patchy hair loss in people with no obvious skin or systemic disorder. The causes of this condition are not clear but it's thought to be an autoimmune disorder affecting genetically susceptible people exposed to unclear environmental triggers. The treatment options for alopecia areata include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). In severe cases, hormonal modulators such as oral contraceptives or spironolactone may be useful for female-pattern hair loss associated with hyperandrogenemia. Surgical options include follicle transplant, scalp flaps, and alopecia reduction. Patients who are self-conscious about their hair loss may consider these procedures. It's important to note that the underlying disorders are treated if they are causing the sudden pat

•	Both answers identify Alopecia Areata and list medical treatments.

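•	The answer with tuning (higher max_tokens, different temperature/top_p/top_k) gives a more detailed explanation of causes and adds surgical options, but it also mixes in treatments for other forms of hair loss (for example, hormonal modulators for female-pattern hair loss) and is still cut off at the end.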


•	The answer without tuning is more concise and focuses on the primary cause identified.

•	Tuning parameters, especially max_tokens, influenced the level of detail in the response.


### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(generate_rag_response(user_input,k=3,max_tokens=256,temperature=1.0,top_p=0.7,top_k=20))

Llama.generate: prefix-match hit


Based on the context provided, the recommended treatments for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function include ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. Surgery may be needed in patients with more severe injury to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas. In the first few days after the injury, maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium are important. Subsequently, many patients require rehabilitation. There is no specific treatment for traumatic brain injury (TBI), but supportive care should include preventing systemic complications due to immobilization, providing good nutrition, and preventing pressure ulcers.


•	With tuning parameters, the response is more detailed and holistic, capturing both acute care and long-term rehabilitation.

•	The first part of the answer is identical to the default-parameter answer; the additional material (rehabilitation, nutrition, pressure ulcer prevention) appears because max_tokens was raised from 128 to 256.

•	temperature=1.0 makes token sampling more diverse, while top_p=0.7 and top_k=20 narrow the pool of candidate tokens.

•	k=3 is the same as the default, so it does not add retrieved context compared with the earlier run.

•	With the default parameters, the output is still useful but is cut off and covers only early-stage treatment.
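
To show how temperature, top_k, and top_p shape the candidate tokens, here is a minimal sketch over a toy next-token distribution. The logits and token names are invented, `filter_distribution` is a hypothetical helper, and the order of operations (temperature, then top-k, then top-p) is one common ordering that is not necessarily the one llama.cpp applies. It assumes temperature > 0; a temperature of 0 corresponds to greedy decoding and is not handled here.

```python
import math


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return renormalised probabilities of the tokens that survive the filters."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in probs:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


# Invented logits for four candidate next tokens.
toy_logits = {"rest": 2.0, "surgery": 1.0, "ice": 0.0, "yoga": -1.0}
print(filter_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.7))
print(filter_distribution(toy_logits, temperature=0.4, top_k=3, top_p=0.7))
```

In this toy case the lower temperature concentrates probability on the top token, so fewer candidates survive the top_p cut.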


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(user_input,k=3,max_tokens=450, temperature=0.4,top_p=0.85,top_k=10))

Llama.generate: prefix-match hit


The context provides information about fractures, their symptoms, diagnosis, treatment, and necessary precautions. According to the text, a person with a fractured leg should:
1. Elevate the injured limb above heart level for the first 2 days to minimize swelling.
2. Apply warmth for pain relief after 48 hours using a heating pad or other methods.
3. Immobilize the joints proximal and distal to the injury to prevent further damage.
4. Use a cast or splint for immobilization, depending on the severity of the injury and the length of recovery time required.
5. Keep the cast dry and avoid putting objects inside it.
6. Inspect the skin around the cast daily and apply lotion to any red or sore areas.
7. Pad any rough edges with soft material to prevent irritation.
8. Seek medical care if an odor emanates from within the cast, a fever develops, or there is a risk of infection.
9. Maintain good hygiene to avoid skin injuries that could lead to infection.
10. For prolonged immobilization (more

•	The answer with tuning is significantly better: it gives a fuller, clearer response that covers treatment and care after a leg fracture, although the output shown above stops mid-sentence at item 10, most likely because it reached the max_tokens limit.

•	The answer without parameters is too brief and incomplete for practical use.



## Output Evaluation

In [None]:
groundedness_rater_system_message = """
You are tasked with rating AI generated answers to questions posed by users.
You will be presented a question, context used by the AI system to generate the answer and an AI generated answer to the question.
In the input, the question will begin with ###Question, the context will begin with ###Context while the AI generated answer will begin with ###Answer.

Evaluation criteria:
The task is to judge the extent to which the metric is followed by the answer.
1 - The metric is not followed at all
2 - The metric is followed only to a limited extent
3 - The metric is followed to a good extent
4 - The metric is followed mostly
5 - The metric is followed completely

Metric:
The answer should be derived only from the information presented in the context

Instructions:
1. First write down the steps that are needed to evaluate the answer as per the metric.
2. Give a step-by-step explanation if the answer adheres to the metric considering the question and context as the input.
3. Next, evaluate the extent to which the metric is followed.
4. Use the previous information to rate the answer using the evaluation criteria and assign a score.
"""

In [None]:
relevance_rater_system_message = """
You are tasked with rating AI generated answers to questions posed by users.
You will be presented a question, context used by the AI system to generate the answer and an AI generated answer to the question.
In the input, the question will begin with ###Question, the context will begin with ###Context while the AI generated answer will begin with ###Answer.

Evaluation criteria:
The task is to judge the extent to which the metric is followed by the answer.
1 - The metric is not followed at all
2 - The metric is followed only to a limited extent
3 - The metric is followed to a good extent
4 - The metric is followed mostly
5 - The metric is followed completely

Metric:
Relevance measures how well the answer addresses the main aspects of the question, based on the context.
Consider whether all and only the important aspects are contained in the answer when evaluating relevance.

Instructions:
1. First write down the steps that are needed to evaluate the context as per the metric.
2. Give a step-by-step explanation if the context adheres to the metric considering the question as the input.
3. Next, evaluate the extent to which the metric is followed.
4. Use the previous information to rate the context using the evaluation criteria and assign a score.
"""

In [None]:
user_message_template = """
###Question
{question}

###Context
{context}

###Answer
{answer}
"""

In [None]:
def generate_ground_relevance_response(user_input,k=3,max_tokens=128,temperature=0,top_p=0.95,top_k=50):
    """Answer user_input with the RAG pipeline, then return the (groundedness, relevance) rater outputs for that answer."""
    global qna_system_message,qna_user_message_template
    # Retrieve the k most relevant document chunks
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input,k=k)
    context_list = [d.page_content for d in relevant_document_chunks]
    context_for_query = ". ".join(context_list)

    # Combine user_prompt and system_message to create the prompt
    prompt = f"""[INST]{qna_system_message}\n
                {'user'}: {qna_user_message_template.format(context=context_for_query, question=user_input)}
                [/INST]"""

    response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    answer =  response["choices"][0]["text"]

    # Build the groundedness rater prompt from the question, retrieved context, and generated answer
    groundedness_prompt = f"""[INST]{groundedness_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    # Build the relevance rater prompt from the same question, context, and answer
    relevance_prompt = f"""[INST]{relevance_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    response_1 = llm(
            prompt=groundedness_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    response_2 = llm(
            prompt=relevance_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    return response_1['choices'][0]['text'],response_2['choices'][0]['text']
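
The raters return free text, so tracking scores across queries needs a numeric value pulled out of each response. The helper below is a minimal sketch under that assumption: `extract_rating` and `RUBRIC_PHRASES` are hypothetical names that are not part of the notebook, and the pattern only covers phrasings similar to those in the rater outputs shown in this section. A response cut off by `max_tokens` before it reaches a rating yields `None`.

```python
import re

# Phrases copied from the 1-5 rubric in the rater system messages.
RUBRIC_PHRASES = {
    "not followed at all": 1,
    "followed only to a limited extent": 2,
    "followed to a good extent": 3,
    "followed mostly": 4,
    "followed completely": 5,
}

def extract_rating(rater_output):
    """Return the 1-5 score found in a rater response, or None if absent."""
    # Prefer an explicit number such as "rate the answer as a 5" or "Score: 4".
    match = re.search(r"\b(?:rate|rating|score)\b[^.\n]*?\b([1-5])\b",
                      rater_output, flags=re.IGNORECASE)
    if match:
        return int(match.group(1))
    # Fall back to the rubric wording when no digit is given.
    lowered = rater_output.lower()
    for phrase, score in RUBRIC_PHRASES.items():
        if phrase in lowered:
            return score
    return None

sample = ("Rating:\nBased on the evaluation criteria, I would rate the answer "
          "as a 5 because the metric is followed completely.")
print(extract_rating(sample))
```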

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=350)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information related to managing sepsis in a critical care unit from the context.
2. Compare the identified information with the AI generated answer to check if the answer is derived only from the context.
3. Evaluate the extent to which the metric is followed.

Explanation:
The context provides detailed information about managing critically ill patients in an ICU, including supportive care and patient monitoring. Among these details, there are specific instructions for managing sepsis, such as administering antibiotics, draining abscesses, and normalizing blood glucose. The AI generated answer summarizes these instructions accurately, making it clear that the answer is derived only from the context.

Evaluation:
The metric is followed completely.

Rating:
Based on the evaluation criteria, I would rate the answer as a 5 because the metric is followed completely.

 Steps to evaluate the context as per the relevance metric:
1. Identify th

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input_2 = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
ground,rel = generate_ground_relevance_response(user_input_2,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information in the context related to appendicitis and its treatment.
2. Determine if the symptoms mentioned in the question are present in the context.
3. Check if the answer mentions only the symptoms and treatment options derived from the context.
4. Verify that the answer does not include any additional or incorrect information.

Explanation:
The answer adheres to the metric as it mentions the common symptoms for appendicitis (abdominal pain, anorexia, and abdominal tenderness) directly from the context and states that appendicitis cannot be cured via medicine alone but requires surgical removal of the appendix. This information is derived solely from the context provided.

Rating:
Based on the evaluation criteria, I would rate this answer as a 5 because it follows the metric completely by deriving all the information in the answer directly from the context without any additional or incorrect information.

 Steps to evaluate the co

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [None]:
user_input_3 = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
ground,rel = generate_ground_relevance_response(user_input_3,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the main question and the specific information being asked for in the question.
2. Read through the context provided to understand the key points related to the topic of hair loss, including causes and treatments.
3. Determine if the AI generated answer is derived solely from the information presented in the context.

Explanation:
The main question asks about effective treatments or solutions for sudden patchy hair loss (alopecia areata) and possible causes behind it. The context provides detailed information on various types of hair loss, including alopecia areata, and mentions several treatment options for this condition.

The AI generated answer correctly identifies the condition as alopecia areata and lists several treatment options, which are all mentioned in the context. Therefore, the answer is derived solely from the information presented in the context.

Evaluation:
Based on the evaluation criteria, the metric is followed completely. 

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input_4 = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
ground,rel = generate_ground_relevance_response(user_input_4,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information in the context related to treatments for traumatic brain injury (TBI).
2. Determine if the AI generated answer is derived only from the information presented in the context.
3. Explain how the answer adheres to the metric considering the question and context as the input.

Explanation:
The AI generated answer mentions that there is no specific treatment for TBI, but supportive care is important. This includes preventing systemic complications due to immobilization, providing good nutrition, and preventing pressure ulcers. The answer also mentions that in severe cases, surgery may be needed. These details are all directly taken from the context.

The answer adheres to the metric as it is derived only from the information presented in the context. The question asked for treatments recommended for a person with brain injury and the context provided detailed information about the supportive care and surgical interventions that 

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input_5 = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
ground,rel = generate_ground_relevance_response(user_input_5,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the information in the context related to precautions and treatment steps for a person with a fractured leg.
2. Compare the information identified in step 1 with the AI generated answer to ensure that all elements of the answer are derived from the context.
3. Evaluate the extent to which the metric is followed by checking if there is any additional or irrelevant information in the answer that is not present in the context.

Explanation:
The AI generated answer includes all the necessary precautions and treatment steps for a person with a fractured leg as presented in the context. The answer also mentions some additional information such as "avoiding sharing or publishing the contents of the document" which is not present in the context but it does not affect the accuracy of the answer as per the metric.

Evaluation:
The metric is followed mostly as the AI generated answer includes all the necessary information from the context and only has so

## Actionable Insights and Business Recommendations

### Vector database creation time increases with the number of pages in the PDF document.

o	Actionable Insight: For larger medical manuals or a collection of documents, building the vector database will be a time-consuming process.

o	Business Recommendation: Plan for sufficient infrastructure and time for the initial setup. For ongoing updates, consider incremental updates to the vector database rather than rebuilding it entirely. This will minimize downtime and resource usage. Evaluate cloud-based vector database solutions for scalability and performance.
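
One way to approach incremental updates is to fingerprint each chunk by its content and re-embed only the chunks whose fingerprints are new. The sketch below assumes the index can store and look up such fingerprints; `chunk_fingerprint` and `plan_incremental_update` are hypothetical helpers, not functions of any vector database library.

```python
import hashlib

def chunk_fingerprint(text):
    """Stable identifier for a chunk, derived from its content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_incremental_update(indexed_fingerprints, current_chunks):
    """Compare fingerprints already in the index with the current chunks.

    Returns (chunks_to_embed, fingerprints_to_delete).
    """
    current = {chunk_fingerprint(chunk): chunk for chunk in current_chunks}
    to_embed = [chunk for fp, chunk in current.items()
                if fp not in indexed_fingerprints]
    to_delete = sorted(set(indexed_fingerprints) - set(current))
    return to_embed, to_delete

# Placeholder chunk texts: one chunk is unchanged, one is replaced.
old_chunks = ["chunk about sepsis", "chunk about fractures"]
new_chunks = ["chunk about sepsis", "revised chunk about fractures"]
indexed = {chunk_fingerprint(c) for c in old_chunks}
to_embed, to_delete = plan_incremental_update(indexed, new_chunks)
print(to_embed, len(to_delete))
```

Only the revised chunk is scheduled for embedding, and the fingerprint of the chunk it replaces is scheduled for deletion.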


### Retrieval parameter k is critical as the answer can be spread across multiple contexts.

o	Actionable Insight: The quality of the retrieved context directly impacts the language model's ability to answer the question. A poorly chosen k can lead to incomplete or inaccurate answers.

o	Business Recommendation: Implement a strategy for tuning k based on the type of questions being asked. Consider dynamic adjustment of k based on initial retrieval results or the complexity of the query. Monitor the performance of the system with different k values and use evaluation metrics to determine the optimal setting for different scenarios.
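
A dynamic k could, for example, be derived from how close the retrieved similarity scores are to the best one. The following is a minimal sketch that assumes the vector store can return a similarity score with each chunk (higher meaning more similar); `select_dynamic_k` and its default thresholds are illustrative and would need tuning against the evaluation metrics.

```python
def select_dynamic_k(scored_chunks, min_k=2, max_k=6, margin=0.1):
    """Keep chunks whose similarity is within `margin` of the best score,
    but never fewer than min_k or more than max_k.

    scored_chunks: list of (chunk_text, similarity) pairs, higher is better.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    if not ranked:
        return []
    best = ranked[0][1]
    close = [pair for pair in ranked if best - pair[1] <= margin]
    k = max(min_k, min(len(close), max_k))
    return ranked[:k]

# Made-up scores: three chunks are nearly as relevant as the best one.
spread_answer = [("a", 0.90), ("b", 0.85), ("c", 0.82), ("d", 0.50)]
# Made-up scores: one chunk clearly dominates.
single_answer = [("a", 0.90), ("b", 0.40), ("c", 0.30)]
print(len(select_dynamic_k(spread_answer)), len(select_dynamic_k(single_answer)))
```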


### chunk_overlap ensures coherence, especially when context spans across chunks.

o	Actionable Insight: Overlapping chunks helps maintain the flow of information and prevents fragmentation of sentences or concepts that span across chunk boundaries.

o	Business Recommendation: Experiment with different chunk_overlap values to find the optimal balance between capturing complete information and avoiding excessive redundancy. This can improve the quality of the retrieved context and, consequently, the accuracy of the generated answers.
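
The effect of overlap on boundary-spanning text can be seen with a simplified character-window splitter. This sketch is not the splitter used to build the vector database; `split_with_overlap` is a hypothetical helper and the sample sentence is only for illustration.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that share
    `chunk_overlap` characters with the previous window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

text = "Elevate the limb. Keep the cast dry."
no_overlap = split_with_overlap(text, chunk_size=20, chunk_overlap=0)
with_overlap = split_with_overlap(text, chunk_size=20, chunk_overlap=8)

phrase = "Keep the cast"
print(any(phrase in chunk for chunk in no_overlap))
print(any(phrase in chunk for chunk in with_overlap))
```

Without overlap the phrase is cut at the chunk boundary and appears in no single chunk. With overlap, one chunk contains it whole, at the cost of an extra, partly redundant chunk.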


### max_tokens should match query complexity: higher limits allow detailed responses, while simple queries still produce concise outputs under large limits because of prompt design and zero temperature.

o	Actionable Insight: The max_tokens parameter controls the maximum length of the generated response. While setting a high limit allows for detailed answers, a well-designed prompt and low temperature can still produce concise responses for simple queries.

o	Business Recommendation: Implement a system that can dynamically adjust max_tokens based on the perceived complexity of the query. For complex medical questions, allow for longer responses. For simple definitional questions, a lower max_tokens might be sufficient. This optimizes resource usage and provides more appropriate responses.
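
A very rough way to estimate query complexity is to count sub-questions. The heuristic below is only a sketch: `choose_max_tokens` is a hypothetical helper, and the base, increment, and ceiling values are illustrative assumptions rather than tuned settings.

```python
def choose_max_tokens(query, base=128, per_part=128, ceiling=512):
    """Heuristic budget: a base allowance plus an increment for each extra
    sub-question, capped at `ceiling`. All thresholds are illustrative."""
    lowered = query.lower()
    # Count sub-questions by question marks and by ", and" joins.
    parts = max(1, lowered.count("?")) + lowered.count(", and ")
    return min(ceiling, base + per_part * (parts - 1))

simple_query = "What is sepsis?"
compound_query = ("What are the common symptoms of appendicitis, and can it be "
                  "cured via medicine? If not, what surgical procedure should "
                  "be followed to treat it?")
print(choose_max_tokens(simple_query), choose_max_tokens(compound_query))
```

A production system would more likely use a classifier or observed answer lengths per query type, but the same interface (query in, token budget out) would apply.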


### Refine prompt design and temperature settings to control response length and creativity.

o	Actionable Insight: The prompt provided to the language model and the temperature setting significantly influence the style, length, and creativity of the generated answer.

o	Business Recommendation: Invest time in crafting clear and effective prompts that guide the language model to generate the desired type of response (e.g., concise summary, detailed explanation). Experiment with different temperature settings to find the right balance between factual accuracy (lower temperature) and potentially more creative or varied phrasing (higher temperature).


### Continuously adjust RAG parameters based on specific use cases for optimal performance.

o	Actionable Insight: There is no single set of optimal parameters for all types of medical questions. The best parameters will vary depending on the query's nature, the user's needs, and the desired output characteristics.

o	Business Recommendation: Implement a system for ongoing monitoring and evaluation of the RAG system's performance. Establish a process for gathering feedback from healthcare professionals and use this feedback to inform parameter adjustments. This continuous improvement loop is essential for maintaining high performance and user satisfaction.


### Prioritize groundedness and relevance in evaluations to ensure reliable and contextually accurate outputs.

o	Actionable Insight: Groundedness and relevance are critical metrics for evaluating the trustworthiness and usefulness of the generated medical information.

o	Business Recommendation: Make groundedness and relevance key performance indicators (KPIs) for the RAG system. Regularly evaluate the system's performance against these metrics and use the results to identify areas for improvement. Implement automated evaluation tools and leverage human experts to assess the quality of the responses.
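
Once each response has a 1-5 rating, the KPIs can be reported as a mean score and a pass rate. The sketch below assumes ratings have already been collected as integers, with `None` for responses where no score could be determined. `summarize_scores`, the pass threshold of 4, and the sample values are illustrative and are not results from this notebook.

```python
from statistics import mean

def summarize_scores(scores, threshold=4):
    """Summarize 1-5 ratings; None marks a response with no parsable score."""
    valid = [s for s in scores if s is not None]
    if not valid:
        return {"rated": 0, "unrated": len(scores), "mean": None, "pass_rate": None}
    return {
        "rated": len(valid),
        "unrated": len(scores) - len(valid),
        "mean": mean(valid),
        # Share of rated responses at or above the pass threshold.
        "pass_rate": sum(s >= threshold for s in valid) / len(valid),
    }

# Placeholder ratings for illustration only.
example_scores = [5, 5, 5, None, 4]
print(summarize_scores(example_scores))
```

Reporting the unrated count alongside the mean keeps truncated or unparsable rater outputs visible instead of silently dropping them.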


### Establish a feedback loop to fine-tune parameters, improving performance for diverse query types.

o	Actionable Insight: A structured feedback mechanism from users is invaluable for identifying areas where the RAG system is underperforming for specific types of queries.

o	Business Recommendation: Implement a user feedback system that allows healthcare professionals to easily report issues or provide suggestions. Use this feedback to prioritize areas for parameter tuning and model improvements. This will help ensure that the system effectively meets the diverse needs of its users.

