## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to create accurate diagnoses and treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

**Common Questions to Answer**

**1. Diagnostic Assistance**: "What are the common symptoms and treatments for pulmonary embolism?"

**2. Drug Information**: "Can you provide the trade names of medications used for treating hypertension?"

**3. Treatment Plans**: "What are the first-line options and alternatives for managing rheumatoid arthritis?"

**4. Specialty Knowledge**: "What are the diagnostic steps for suspected endocrine disorders?"

**5. Critical Care Protocols**: "What is the protocol for managing sepsis in a critical care unit?"

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co., that cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [1]:
# Installation for GPU llama-cpp-python
# uncomment and run the following code in case GPU is being used
#!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

# Installation for CPU llama-cpp-python
# uncomment and run the following code in case GPU is not being used
# !CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

In [2]:
# For installing the libraries & downloading models from HF Hub
#!pip install huggingface_hub==0.23.2 pandas==1.5.3 tiktoken==0.6.0 pymupdf==1.25.1 langchain==0.1.1 langchain-community==0.0.13 chromadb==0.4.22 sentence-transformers==2.3.1 numpy==1.25.2 -q

In [3]:
#Libraries for processing dataframes,text
import json,os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

#Libraries for downloading and loading the llm
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Question Answering using LLM

In this project, we will use Meta/Facebook's Llama as a LLM with Mistral-7B-Instruct-v0.2 with GGUF format. Llama is designed to be open-source and accessible and a popular choice for research and development. Furthermore, we will use llama.cpp (a c++) software to integrate, run this model.


**High level description:**

*llama.cpp* - is a high-performance C++ inference library designed to run large language models efficiently on various hardware, including CPUs and GPUs, with a focus on local execution. It provides the core functionality for loading and running models in the GGUF format.

*Mistral-7B-Instruct-v0.2-GGUF* - is a powerful and versatile model that's made even more accessible and efficient through the GGUF format, making it a popular choice for both hobbyists and professionals. This model has 7B parameters and in GGUF format which is quantized and feasible to run in consumer grade or low resource (GPU) format.  



#### Downloading and Loading the model

In [4]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [5]:
model_path = hf_hub_download(
    repo_id= model_name_or_path,
    filename= model_basename
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


mistral-7b-instruct-v0.2.Q6_K.gguf:   0%|          | 0.00/5.94G [00:00<?, ?B/s]

In [6]:
#uncomment the below snippet of code if the runtime is connected to GPU.
llm = Llama(
    model_path=model_path,
    n_ctx=4000,  #6000,  #2300,
    n_gpu_layers=38,
    n_batch=512
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


### Observation
* In llama.cpp, n_ctx specifically sets the maximum length of the combined prompt and generated output in tokens.
* The dafault n_ctx of llama is 2048. Unfortunately, we increase the n_ctx to accomodate more context for this application as it requires a response with lot of technical/medical jargon
* After trial and error, we set the value of n_ctx = 4000
* n_batch refers to the maximum batch size for processing the prompt tokens. its default value is 512.
* n_gpu_layers specifies the number of layers you want to process on the GPU. Hight value will give faster processing but higher cost associates with it.
* model_path is the path of the model.

#### Response

In [7]:
def response(query,max_tokens=128,temperature=0,top_p=0.95,top_k=40):
    # Generate the response
    try:
        response = llm(
                  prompt=query,
                  max_tokens=max_tokens,
                  temperature=temperature,
                  top_p=top_p,
                  top_k=top_k
                  )

        # Extract and print the model's response
        response = response['choices'][0]['text'].strip()
    except Exception as e:
        response = f'Sorry, I encountered the following error: \n {e}'
    return response



### Observation

llm takes the following argument for response
Arguments:

1. query - user_input
  * Query/Questions from user.

2. max_tokens
  * Sets a limit on the number of tokens (words or characters) the model can generate in a single response. In llama cpp, its default value is 128

3. top_k
  * Limits the model's choices to the top k most probable tokens at each step of text generation. Lower value leads to more predictable and focused outputs, reducing the likelihood of irrelevant tokens. Higher value allows for greater diversity and creativity, enabling the model to consider a wider range of possibilities. In llama, the value of top_k 1-10 is consider low and 20-50+ is consider high. Its default value is 40.

4. top_p
  * Selects tokens from a probability distribution, considering tokens whose cumulative probability is above a specified threshold p. lower value of it, restrict the model with focus and deterministic and higher value gives more flexibilty. Its range is 0 - 1.

5. temperature
  * influences the overall randomness of token selection. Lower temperatures lead to more deterministic outputs and higher temperature gives more deterministic output. Its range is 0-1 in llama model.

Lets ask a common query: What treatment options are available for managing hypertension?

In [8]:
print(response("What treatment options are available for managing hypertension?"))

Hypertension, or high blood pressure, is a common condition that can increase the risk of various health problems such as heart disease, stroke, and kidney damage. The good news is that there are several effective treatment options available to help manage hypertension and reduce the risk of complications. Here are some of the most commonly used treatments:

1. Lifestyle modifications: Making lifestyle changes is often the first line of defense against hypertension. This may include eating a healthy diet rich in fruits, vegetables, whole grains, and lean proteins; limiting sodium intake; getting regular physical activity


### Observation
* The response seems generic and provides very high level information.
* The response contains only limited information/incomplete.

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [9]:
print(response("What is the protocol for managing sepsis in a critical care unit?"))

Llama.generate: prefix-match hit


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Resusc


In [10]:
print(response("What is the protocol for managing sepsis in a critical care unit?", max_tokens=512))

Llama.generate: prefix-match hit


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Source control: Identify and address the source of infection as quickly as possible. This may involve surgical intervention, such as drainage of an abscess or debridement of necrotic tissue.
3. Fluid resuscitation: Administer intravenous fluids to maintain adequate blood pressure and organ perfusion. The goal is to achieve a mean arterial pressure (MAP) of at least 65 mmHg and a central venous oxygen saturation (ScvO2) of greater than 70%.
4. Vasopressor support: 

In [11]:
print(response("What is the protocol for managing sepsis in a critical care unit?", max_tokens=1024))

Llama.generate: prefix-match hit


Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Source control: Identify and address the source of infection as quickly as possible. This may involve surgical intervention, such as drainage of an abscess or debridement of necrotic tissue.
3. Fluid resuscitation: Administer intravenous fluids to maintain adequate blood pressure and organ perfusion. The goal is to achieve a mean arterial pressure (MAP) of at least 65 mmHg and a central venous oxygen saturation (ScvO2) of greater than 70%.
4. Vasopressor support: 

### Observation

* Query 1's response is also generic. Increasing max_tokens size to 1024 helps to capture the entire response.

**As max_tokens = 1024 helps to capture the response, we will use it for the future queries (only LLM and LLM with Prompt) in order to avoid truncation.**

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [12]:
print(response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?", max_tokens=1024))

Llama.generate: prefix-match hit


Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:

1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing.
2. Loss of appetite: People with appendicitis may lose their appetite due to abdominal pain and discomfort.
3. Nausea and vomiting: Vomiting is a common symptom of appendicitis, especially in the later stages of the condition.
4. Fever: A fever of 100.4°F (38°C) or higher may be present in some cases of appendicitis.
5. Constipation or diarrhea: Both constipation and diarrhea can occur with appendicitis, depending on the location and severity o

* Query 2's response is also generic. Increasing max_tokens size helps to capture the entire response.

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [13]:
print(response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?", max_tokens=1024))

Llama.generate: prefix-match hit


Sudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.

The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications.

There are several treatments that have been shown to be effective in addressing sudden patchy hair loss:

1. Corticosteroids: These are anti-inflammatory drugs that can help reduce inflammation and suppress the immune system's attack on the hair follicles. They can be applied topically or taken orally, depending on the severity of the condition.
2. Minoxidil: This is a medication that has been shown to promote hair growth in some people with alopecia areata. It works by increasing blood f

* Query 3's response is also generic. Increasing max_tokens size helps to capture the entire response.

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [14]:
print(response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?", max_tokens=1024))

Llama.generate: prefix-match hit


A person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:

1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.
2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as swelling, seizures, pain, or infections. For example, corticosteroids may be used to reduce brain swelling, and anticonvulsants may be given to prevent seizures.
3. Rehabilitation: Rehabilitation is an essential part of the recovery process for TBI patients. Rehabilitation may include physical therapy to help with mobility and strength, occupational therapy to help with daily living skills, speech therapy to improv

* Query 4's response is also generic. Increasing max_tokens size helps to capture the entire response.

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [15]:
print(response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?", max_tokens=1024))

Llama.generate: prefix-match hit


First and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:

1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.
2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.
3. Immobilize the leg: Use a splint, sling, or other available materials to immobilize the leg and prevent movement. Be sure not to apply too much pressure on the injury site.
4. Provide pain relief: Offer over-the-counter pain medication, such as acetaminophen or ibuprofen, to help manage pain.
5. Seek medical attention: If the fracture is severe or if you suspect that there may be other injuries, seek medical help as soon as possible.

Once you've ensured the person's safety and stability, conside

* Query 5's response is also generic. Increasing max_tokens size helps to capture the entire response.

### Observation

* All queries to llm without prompt are generic with additional information.
* Prompt engineering tune this queries and extracts specific information


#### QUerying with Prompt Engineering

* System Prompt - System prompt guids AI its behavior and responses thoughout its interaction.

* In this project, we definied a system prompt and instruct AI to help medical professional to find comprehensive, accurate medical information and treatment plan.

In [16]:
system_prompt = """
You are a highly knowledgeable and reliable medical assistant, designed to support healthcare professionals by providing comprehensive, evidence-based, and up-to-date medical information and treatment recommendations.

Your responses must reflect current clinical guidelines, medical best practices, and trustworthy sources. Always prioritize accuracy, patient safety, and clarity in your answers.

Response Requirements:

As an expert, address the following areas as needed, with precision, clinical relevance, and clear structure:

1. Diagnostic Assistance
Summarize the common symptoms, risk factors, diagnostic criteria, treatments, and potential complications for various diseases and conditions.

2. Drug Information
Provide a list of relevant medications, including their trade names, mechanisms of action, therapeutic uses, and notable side effects.

3. Treatment Planning
Outline first-line treatments and alternative options, highlighting clinical considerations such as comorbidities, age, or contraindications that may influence treatment choice.

4. Specialty Knowledge
Describe diagnostic pathways for evaluating specific disorders, including necessary laboratory tests, imaging studies, and clinical assessments.

5. Critical Care Protocols
Detail protocols for managing acute and life-threatening conditions (e.g., sepsis), covering key aspects like initial stabilization, pharmacologic interventions, monitoring, and escalation of care.

6. Clarity and Communication
Use clear, professional language suitable for clinicians. Avoid unnecessary medical jargon that may obscure understanding.

7. Limitations and Clinical Judgment
Acknowledge limitations in available evidence where applicable. Provide recommendations for when specialist referral or further investigation is warranted.

Important Guidelines:

1. Base your responses on authoritative sources such as WHO, CDC, NICE, UpToDate, or peer-reviewed medical literature.

2. Clearly cite or reference evidence-based guidelines when appropriate.

3. Do not make personal diagnoses or suggest treatments that could be unsafe.

4. If no reliable information is available, respond with: “I don’t know.”

Your goal is to assist healthcare professionals in making informed clinical decisions with clarity, safety, and confidence.
"""

In [17]:
def response_with_prompt(query,max_tokens=128,temperature=0,top_p=0.95,top_k=40):
    query = system_prompt + '\n' + query
    return response(query=query,max_tokens=max_tokens,temperature=temperature,top_p=top_p,top_k=top_k)

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [18]:
print(response_with_prompt("What is the protocol for managing sepsis in a critical care unit?", max_tokens=1024))

Llama.generate: prefix-match hit


Sepsis is a life-threatening condition caused by a dysregulated host response to infection. Early recognition and prompt management are crucial to improve outcomes. Here's an outline of the sepsis management protocol in a critical care unit:

1. Initial stabilization:
   - Secure airway, provide oxygen support, and initiate mechanical ventilation if necessary.
   - Establish intravenous access and administer fluids to maintain adequate blood pressure and organ perfusion.
   - Monitor vital signs, electrolytes, glucose levels, lactate, urine output, and arterial blood gases.
   - Administer broad-spectrum antibiotics based on suspected infection site and microbiological culture results.

2. Pharmacologic interventions:
   - Administer vasopressors to maintain mean arterial pressure (MAP) ≥65 mmHg if needed.
   - Initiate corticosteroids in cases of septic shock or refractory hypotension.
   - Consider administering intravenous immunoglobulin (IVIG) for severe sepsis or septic shock, esp

* Query respond is more focused than before and guided by system prompt.
* Generic and not domain specific.

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [19]:
print(response_with_prompt("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?", max_tokens=1024))

Llama.generate: prefix-match hit


Appendicitis is a common inflammatory condition characterized by the swelling and eventual rupture of the appendix, a small pouch located in the lower right abdomen. The following are common symptoms associated with appendicitis:

1. Abdominal pain, typically starting around the navel area but migrating to the right lower quadrant over several hours.
2. Loss of appetite and/or anorexia.
3. Nausea and vomiting.
4. Fever (often low-grade initially, but can rise significantly if appendicitis progresses).
5. Constipation or diarrhea.
6. Rebound tenderness when the abdomen is palpated.
7. Guarding (involuntary contraction of abdominal muscles) and rigidity in response to pressure on the affected area.
8. Inability to pass gas or have a bowel movement.

Appendicitis cannot be cured via medicine alone, as the condition involves inflammation and eventual rupture of the appendix. Once the appendix has ruptured, peritonitis (inflammation of the peritoneum) can occur, leading to a potentially lif

* Query respond is more focused than before and guided by system prompt.
* Generic and not domain specific.

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [20]:
print(response_with_prompt("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?", max_tokens=1024))

Llama.generate: prefix-match hit


Answer:

Sudden patchy hair loss, also known as alopecia areata, is an autoimmune disorder that results in hair loss from some or all areas of the scalp. The exact cause of alopecia areata is unknown, but it's believed to be related to a problem with the immune system.

Symptoms:
- Round or oval bald patches on the scalp, beard, or other areas of the body
- Hair loss may occur suddenly and can affect one or more areas of the body
- In some cases, hair regrowth may occur within several months to a year

Risk factors:
- Family history of alopecia areata or other autoimmune diseases
- Stress, anxiety, or emotional trauma
- Viral infections
- Certain medications (e.g., antithyroid drugs, retinoids)

Diagnostic criteria:
A definitive diagnosis is typically made based on the appearance of the hair loss and ruling out other causes such as fungal infections, nutritional deficiencies, or hormonal imbalances. A scalp biopsy may be performed to confirm the diagnosis.

Treatments:
1. Topical corti

* Query respond is more focused than before and guided by system prompt.
* Generic and not domain specific.

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [21]:
print(response_with_prompt("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?", max_tokens=1024))

Llama.generate: prefix-match hit


1. Diagnostic Assistance:
Brain injuries can be classified as traumatic (resulting from external force) or non-traumatic (caused by internal factors like stroke or infection). Common symptoms include headache, confusion, memory loss, difficulty speaking or understanding speech, seizures, and paralysis. Diagnostic criteria involve a thorough neurological examination, imaging studies (CT or MRI), and possibly cognitive assessments to determine the extent of damage.

2. Drug Information:
Medications used in brain injury treatment include analgesics for pain relief (acetaminophen, opioids), anti-inflammatory drugs (nonsteroidal anti-inflammatory drugs or corticosteroids), anticonvulsants to prevent seizures (valproate, levetiracetam), and medications to manage symptoms like depression or anxiety.

3. Treatment Planning:
First-line treatments include restoring blood flow to the brain if there is a vascular cause, controlling intracranial pressure, preventing infection, and managing symptoms

* Query respond is more focused than before and guided by system prompt.
* Generic and not domain specific.

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [22]:
print(response_with_prompt  ("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?", max_tokens=1024))

Llama.generate: prefix-match hit


1. Diagnostic Assistance:
A suspected leg fracture is typically characterized by intense pain, swelling, deformity, inability to bear weight, and possibly an audible snap or crunching sound at the time of injury. Risk factors include hiking on uneven terrain, tripping over rocks or roots, and falling from a height. Diagnostic criteria involve physical examination, including palpation for deformities and tenderness, as well as imaging studies like X-rays to confirm the fracture location and severity.

2. Drug Information:
Pain management is crucial in the initial stages of leg fracture care. Commonly used medications include acetaminophen, nonsteroidal anti-inflammatory drugs (NSAIDs), or opioids for severe pain. Tetanus prophylaxis may be necessary if the injury occurred due to a puncture wound or contamination with dirt.

3. Treatment Planning:
Initial treatment involves immobilizing the affected limb using a splint, sling, or cast to prevent further damage and promote healing. Depend

* Query respond is more focused than before and guided by system prompt.
* Generic and not domain specific.

## Data Preparation for RAG

### Observation
* So far, we observed that LLM retures generic response with prompt.
* RAG embeds domain knowledge into LLM.
* The steps to build a RAG are as follows:
    * A domain specific reference document has been feed to model. First, we chunk the documents into small pieces, converted to a vector by Transformer embedding, and stored it to a vector database.
    * When a user query, it is encoded with the same transformer and applied with the same embedding to convert it to vector. Later, top k similar chunk/document have been extracted as context from vector database using similarity metric (cosine similarity) and add them into the prompt along with the system prompt and submitted to llm.
    * llm responds queries relevant to the sysmtem prompts and user context accordingly.

In [23]:
#Libraries for processing dataframes,text
import json,os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

### Loading the Data

In [25]:
from google.colab import drive
drive.mount('/content/drive')
file_name = "/content/drive/MyDrive/Colab Notebooks/medical_diagnosis_manual.pdf"
pdf_loader = PyMuPDFLoader(file_name)
manual = pdf_loader.load()


Mounted at /content/drive


In [26]:
manual[0].metadata

{'source': '/content/drive/MyDrive/Colab Notebooks/medical_diagnosis_manual.pdf',
 'file_path': '/content/drive/MyDrive/Colab Notebooks/medical_diagnosis_manual.pdf',
 'page': 0,
 'total_pages': 4114,
 'format': 'PDF 1.7',
 'title': 'The Merck Manual of Diagnosis & Therapy, 19th Edition',
 'author': '',
 'subject': '',
 'keywords': '',
 'creator': 'Atop CHM to PDF Converter',
 'producer': 'pdf-lib (https://github.com/Hopding/pdf-lib)',
 'creationDate': 'D:20120615054440Z',
 'modDate': 'D:20250603172640Z',
 'trapped': ''}

### Observation
* Loading pdf using PyMuPDFLoader.
* sample metadata is observed for the first page with pdf infomation.

### Data Overview

#### Checking the first 5 pages

In [27]:
for i in range(5):
    print(f"Page Number : {i+1}",end="\n")
    print(manual[i].page_content,end="\n")


Page Number : 1
rakib114@gmail.com
ERQ18432WM
meant for personal use by rakib114@gma
shing the contents in part or full is liable 

Page Number : 2
rakib114@gmail.com
ERQ18432WM
This file is meant for personal use by rakib114@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.

Page Number : 3
Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    ...........................................................................................................................................................................................
53
1 - 

#### Checking the number of pages

In [28]:
len(manual)

4114

### Data Chunking

In [29]:
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=512,
    chunk_overlap=16
)

In [30]:
document_chunks = pdf_loader.load_and_split(text_splitter)

In [31]:
len(document_chunks)

8423

In [32]:
document_chunks[0].page_content

'rakib114@gmail.com\nERQ18432WM\nmeant for personal use by rakib114@gma\nshing the contents in part or full is liable'

In [33]:
document_chunks[1].page_content

'rakib114@gmail.com\nERQ18432WM\nThis file is meant for personal use by rakib114@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action.'

In [34]:
document_chunks[2].page_content

'Table of Contents\n1\nFront    ................................................................................................................................................................................................................\n1\nCover    .......................................................................................................................................................................................................\n2\nFront Matter    ...........................................................................................................................................................................................\n53\n1 - Nutritional Disorders    ...............................................................................................................................................................\n53\nChapter 1. Nutrition: General Considerations    ...........................................................................................

In [35]:
document_chunks[3].page_content

"305\nChapter 25. Drugs & the Liver    ................................................................................................................................................\n308\nChapter 26. Alcoholic Liver Disease    ....................................................................................................................................\n314\nChapter 27. Fibrosis & Cirrhosis    ............................................................................................................................................\n322\nChapter 28. Hepatitis    ..................................................................................................................................................................\n333\nChapter 29. Vascular Disorders of the Liver    .....................................................................................................................\n341\nChapter 30. Liver Masses & Granulomas    ...........................................

In [36]:
document_chunks[4].page_content

'491\nChapter 44. Foot & Ankle Disorders    .....................................................................................................................................\n502\nChapter 45. Tumors of Bones & Joints    ...............................................................................................................................\n510\n5 - Ear, Nose, Throat & Dental Disorders    ..................................................................................................................\n510\nChapter 46. Approach to the Patient With Ear Problems    ...........................................................................................\n523\nChapter 47. Hearing Loss    .........................................................................................................................................................\n535\nChapter 48. Inner Ear Disorders    ...................................................................................................

In [37]:
document_chunks[5].page_content

'742\n7 - Dermatologic Disorders    ....................................................................................................................................................\n742\nChapter 71. Approach to the Dermatologic Patient    .......................................................................................................\n755\nChapter 72. Principles of Topical Dermatologic Therapy    ............................................................................................\n760\nChapter 73. Acne & Related Disorders    ...............................................................................................................................\n766\nChapter 74. Bullous Diseases    .................................................................................................................................................\n771\nChapter 75. Cornification Disorders    .............................................................................................

In [38]:
document_chunks[-3].page_content

"X\nXanthelasma 579\nXanthine oxidase deficiency 3026\nXanthines 2064\nXanthochromia 1595\nXanthogranulomatous pyelonephritis 2379\nXanthomas 634, 645, 895\nXanthomatosis, cerebrotendinous 894\nX chromosome 2599, 3373 (see also Genetic)\nabnormalities of 3002-3005, 3376-3377, 3377\ninactivation of 3002, 3378\n47,XXX syndrome 3005\n48,XXXX syndrome 3005\n49,XXXXX syndrome 3005\n47,XYY syndrome 3005\nXenograft 1126 (see also Transplantation)\nXeroderma (xerosis) 661\nXerophthalmia 594\nvitamin A deficiency and 34\nXerosis 636\nXerostomia 501, 513-516\nradiation-induced 488, 505\nin Sjogren's syndrome 304\nX-linked adrenoleukodystrophy 3024\nX-linked agammaglobulinemia 1092, 1097, 1107-1108\nX-linked lymphoproliferative syndrome 1092, 1097, 1108\nX-rays 3252 (see also Radiation; Radiation therapy; Radiography)\nXylene poisoning 3348\nXylitol 518\nXylose absorption test 155, 3494, 3500\nXylose breath test 156, 157\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nX\n4102\nrakib114@gm

In [39]:
document_chunks[-2].page_content

'Y\nYaws 1266-1267\nforest 1379\nY chromosome 3373 (see also Genetic)\nabnormalities of 3005\nYeast infection (see also Fungal infection)\nvaginal 2542, 2544, 2545\nYellow fever 1400, 1429, 1437\nhepatic inflammation in 248\nvaccine against 1172, 1437, 3441\nYellow nail syndrome 732, 1995\npleural effusion in 1997\nYellow skin (see Jaundice)\nYersinia infection 1167, 1256-1257\nY. enterocolitica infection 147\nY. pestis infection 1924\nYew poisoning 3338\nYips 1762\nYo, antibodies to 1056\nYolk sac tumor 2476\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nY\n4103\nrakib114@gmail.com\nERQ18432WM\nThis file is meant for personal use by rakib114@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action.'

### Observation
* Chunking break downs pdf into multiple small pieces that will be stored into vector database for context extraction.
* RecursiveCharacterTextSplitter has been used for chunking pdf.
* cl100k_base encoded has been used during chunking with chunk size 512.
* To extract context with continuty, 16 tokens overlap has been used.
* The length of the pdf is 4114
* Total chunk is 8423.
* The length of each chunk is not fixed as pdf has text with index, text, tables, figure, etc.
* Some sample chunk with contents have been observed and overlap has been found. For example, thrid pages ends with \n305 and fourth page starts with this.


### Embedding

In [40]:
embedding_model = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/612 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/90.9M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/350 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

In [41]:
embedding_1 = embedding_model.embed_query(document_chunks[0].page_content)
embedding_2 = embedding_model.embed_query(document_chunks[1].page_content)
print("Dimension of the embedding vector ",len(embedding_1))
len(embedding_1)==len(embedding_2)

Dimension of the embedding vector  384


True

In [42]:
print("Embedding for chunks[0]")
print(embedding_1)
print("Embedding for chunks[1]")
print(embedding_2)

Embedding for chunks[0]
[-0.09146303683519363, 0.05849295109510422, 0.029007835313677788, -0.022360781207680702, 0.09978065639734268, -0.03650221601128578, 0.08245579153299332, 0.04067248851060867, 0.017448972910642624, -0.021561605855822563, 0.07259844988584518, 0.01681794598698616, 0.057213980704545975, -0.06978625059127808, -0.04927937313914299, -0.013840477913618088, -0.02740032598376274, -0.04260037839412689, -0.10983483493328094, 0.008797114714980125, -0.04679570719599724, -0.022740844637155533, -0.030446060001850128, -0.007289224769920111, -0.006617100443691015, -0.005346018821001053, 0.00032810421544127166, 0.06624624133110046, -0.015210751444101334, -0.06257704645395279, 0.06762471795082092, 0.01683788374066353, 0.0253400020301342, 0.006672594230622053, 0.033839933574199677, 0.038182806223630905, -0.05627843737602234, -0.05664138123393059, -0.031023193150758743, -0.07766245305538177, -0.012832779437303543, -0.05757516622543335, -0.03073018603026867, 0.06104166433215141, 0.0225

### Observation
* Sentence Transformer with all-MiniLM-L6-v2 model has been used for vector embedding.
* The length of each vector is 384.
* The same embedding will be used for context extraction during user query.

### Vector Database

In [43]:
out_dir = 'medical_db3'

if not os.path.exists(out_dir):
  os.makedirs(out_dir)

vectorstore = Chroma.from_documents(
    documents=document_chunks[2:], # First two pages hold nothing
    embedding=embedding_model,
    persist_directory=out_dir
)

vectorstore = Chroma(persist_directory=out_dir,embedding_function=embedding_model)
print(vectorstore.embeddings)
print(vectorstore.similarity_search("what is the treatment of hypertention?",k=5))

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event CollectionQueryEvent: capture() takes 1 positional argument but 3 were given


client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
  (2): Normalize()
) model_name='all-MiniLM-L6-v2' cache_folder=None model_kwargs={} encode_kwargs={} multi_process=False
[Document(page_content='chloride, including glutaraldehyde, formaldehyde, and tannic acid, are effective but can cause contact\ndermatitis and skin discoloration. A solution of methenamine also may help.\nTap water iontophoresis, in which salt ions are introduced into the skin using electric current, is an option\nfor patients unresponsive to topical treatments. The affected areas (typically palms or soles) are placed in\n2 tap water basins each containing an electrode ac

### Observation
* Vector database Chroma has been used to storing vector embedding of chunks.

### Retriever

In [44]:
retriever = vectorstore.as_retriever(
    search_type='similarity',
    search_kwargs={'k': 5}
)

rel_docs = retriever.get_relevant_documents("what is the treatment of diabetic?")
doc_count = 1
for doc in rel_docs:
    print(f"\nDocument {doc_count} :\n")
    doc_count += 1
    print(doc.page_content)



Document 1 :

Chapter 99. Diabetes Mellitus and Disorders of Carbohydrate Metabolism
Introduction
Diabetes mellitus and its complications (diabetic ketoacidosis, nonketotic hyperosmolar syndrome) are the
most common disorders of carbohydrate metabolism, but alcoholic ketoacidosis and hypoglycemia are
also important.
Diabetes Mellitus
Diabetes mellitus (DM) is impaired insulin secretion and variable degrees of peripheral insulin
resistance leading to hyperglycemia. Early symptoms are related to hyperglycemia and include
polydipsia, polyphagia, and polyuria. Later complications include vascular disease, peripheral
neuropathy, and predisposition to infection. Diagnosis is by measuring plasma glucose.
Treatment is diet, exercise, and drugs that reduce glucose levels, including insulin and oral
antihyperglycemic drugs. Prognosis varies with degree of glucose control.
There are 2 main categories of DM—type 1 and type 2, which can be distinguished by a combination of
features (see
Table 99-1

### Observation
* Retriever retrieves context from the user query.
* Same transformer model has been used to embeded user query to embedding.
* Top k contexts has been retrieved based on the similary metric.
* We observed that, top 5 context has been extracted from vector databases for the query ""what is the treatment of diabetic?"".

In [45]:
model_output = llm(
      "what is the treatment of diabetic?",
      max_tokens=512,
      temperature=0,
    )

print(model_output['choices'][0]['text'])

Llama.generate: prefix-match hit




Diabetes is a chronic condition that requires ongoing management to keep blood sugar levels within a healthy range. The primary treatments for diabetes include:

1. Diet and exercise: Eating a healthy diet and getting regular physical activity can help manage blood sugar levels and improve overall health. A healthcare provider or registered dietitian can provide guidance on creating a meal plan that meets individual nutritional needs.
2. Medications: Depending on the type and severity of diabetes, medications may be necessary to help manage blood sugar levels. There are several types of diabetes medications, including insulin, oral medications, and injectable medications.
3. Insulin therapy: For people with type 1 diabetes or certain types of type 2 diabetes, insulin therapy may be necessary to provide the body with the insulin it needs to function properly. This can be administered through injections or an insulin pump.
4. Blood sugar monitoring: Regular blood sugar monitoring is im

### System and User Prompt Template

In [46]:
qna_system_message = system_prompt
qna_user_message_template = """Answer the question based on the following context:

{context}

Question: {question}
"""


### Response Function

In [47]:
def generate_rag_response(user_input,k=3,max_tokens=128,temperature=0,top_p=0.95,top_k=40):
    global qna_system_message, qna_user_message_template
    # Retrieve relevant document chunks
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input,k=k)
    context_list = [d.page_content for d in relevant_document_chunks]

    # Combine document chunks into a single context
    context_for_query = ". ".join(context_list)

    user_message = qna_user_message_template.replace('{context}', context_for_query)
    user_message = user_message.replace('{question}', user_input)

    prompt = qna_system_message + '\n' + user_message

    # Generate the response
    try:
        response = llm(
                  prompt=prompt,
                  max_tokens=max_tokens,
                  temperature=temperature,
                  top_p=top_p,
                  top_k=top_k
                  )

        # Extract and print the model's response
        response = response['choices'][0]['text'].strip()
    except Exception as e:
        response = f'Sorry, I encountered the following error: \n {e}'

    return response

## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [48]:
print(generate_rag_response("What is the protocol for managing sepsis in a critical care unit?"))

Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves prompt recognition, immediate intervention, and ongoing supportive care.

1. Recognition:
   - Suspect sepsis in patients with fever (≥ 38°C or ≥ 100.4°F), tachycardia (heart rate ≥ 125 beats/min), respiratory rate > 30 breaths per minute, altered mental status, and low blood pressure un


* Query response is domain specific (e.g., specific temperature, heart rate, respiratory rate, etc.) and reflect information from Merck Manual that is provided to RAG.
* RAG works as per its design.

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [49]:
print(generate_rag_response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"))

Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by epigastric or periumbilical pain that shifts to the right lower quadrant, anorexia, nausea, vomiting, and abdominal tenderness. The classic symptoms include right lower quadrant direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. However, these signs are present in less than 50% of patients, especially in children, the elderly, and pregnant women. The diagnosis is primarily clinical, but imaging studies like CT or ul


* Query response is domain specific and reflect information from Merck Manual that is provided to RAG as mentioned in the Sources.
* RAG works as per its design.

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [50]:
print(generate_rag_response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scal


* Query response is domain specific and reflect information from Merck Manual that is provided to RAG.
* We observed in the previous query without rag, returns extra information but RAG provides specific response like, it is a auto immune disease and specific treatment plan, e.g., Minoxidil with 2% or 5% solution.
* RAG works as per its design.

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [51]:
print(generate_rag_response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"))

Llama.generate: prefix-match hit


Answer: For a person with a brain injury leading to temporary or permanent impairment of brain function, the primary focus is on supportive care. This includes preventing systemic complications such as pneumonia, urinary tract infections (UTIs), and thromboembolic disease; providing good nutrition; and preventing pressure ulcers. Additionally, physical therapy may be necessary to prevent limb contractures.

In cases of traumatic brain injury, posttraumatic seizures may occur, and drugs are given to prevent seizures during the first week if there is significant structural injury or a Glasgow Com


* Query response is domain specific and reflect information from Merck Manual that is provided to RAG.
* Response is very succint and specific
* RAG works as per its design.

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [52]:
print(generate_rag_response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"))

Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

For most fractures, the first steps include immobilizing the affected limb with a splint to prevent further injury and decrease pain. Pain is typically treated with opioids. Definitive


* Query response is domain specific and reflect information from Merck Manual that is provided to RAG.
* Previous query with prompt only a lot of information whereas RAG returns specific, e.g., "treatment plan: treated with opioids", etc.
* RAG works as per its design.

### Fine-tuning

generate_rag_response(user_input,k=3,max_tokens=128,temperature=0,top_p=0.95,top_k=40)

Fine Tuning parameter.

1. max_tokens
  * Sets a limit on the number of tokens (words or characters) the model can generate in a single response.
  default value: 128
  * Increasing max_tokens is costly as it requires llm to more time to process with more price.

2. k refers to the number of top-ranked items (documents, chunks, or passages) that are retrieved from a knowledge base in response to a user's query. Higher value will be increase the accuracy but lower the precision.
default value: 3

3. temperature
  * influences the overall randomness of token selection.
  * Lower temperatures lead to more deterministic outputs and higher temperature gives more deterministic output. Its range is 0-1 in llama model.  
  default value: 0 (focused/specific)
  * We will vary this parameter here to observed query response.
  
4. top_k
  * Limits the model's choices to the top k most probable tokens at each step of text generation.
  * Lower value leads to more predictable and focused outputs, reducing the likelihood of irrelevant tokens. Higher value allows for greater diversity and creativity. In llama, the value of top_k 1-10 is consider low and 20-50+ is consider high.
  default value: 40.

5. top_p
  * Selects tokens from a probability distribution, considering tokens whose cumulative probability is above a specified threshold p. lower value of it, restrict the model with focus and deterministic and higher value gives more flexibilty. Its range is 0 - 1.
  default value: .95 (95% probablity)



Query 1: What is the protocol for managing sepsis in a critical care unit?

In [53]:
query = "What is the protocol for managing sepsis in a critical care unit?"

In [54]:
print(generate_rag_response(query))

Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves prompt recognition, immediate intervention, and ongoing supportive care. Here's an outline of the protocol for managing sepsis in a critical care unit based on the provided context:

1. Recognition: Suspect sepsis if the patient has signs of infection (fever, chills, or leukocytosis) and systemic inflammatory response syndrome (SIRS), which includes a heart rate ≥ 1


In [55]:
print(generate_rag_response(query,max_tokens=500))


Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves prompt recognition, immediate intervention, and ongoing supportive care. Here's an outline of the protocol for managing sepsis in a critical care unit based on the provided context:

1. Recognition: Suspect sepsis if the patient has signs of infection (fever, chills, or known source) along with systemic inflammatory response syndrome (SIRS) criteria: temperature >38°C or <36°C, heart rate >90 beats per minute, respiratory rate >20 breaths per minute, or mean arterial pressure <70 mmHg.

2. Diagnostic testing: Obtain blood cultures and other appropriate specimens for culture and sensitivity testing. Consider point-of-care testing for lactate levels, which can help identify sepsis earlier.

3. Initial stabilization: Provide supplemental oxygen, establish intravenous access, and administer fluids to maintain adequate circulating volum

In [56]:
print(generate_rag_response(query,k=10, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves prompt recognition, immediate intervention, and ongoing supportive care. Here's an outline of the protocol for managing sepsis in a critical care unit based on the provided context:

1. Recognition: Suspect sepsis if the patient has signs of infection (fever, chills, or known source) along with systemic inflammatory response syndrome (SIRS) criteria: temperature >38°C or <36°C, heart rate >90 beats per minute, respiratory rate >20 breaths per minute, or mean arterial pressure <70 mmHg.

2. Diagnostic testing: Obtain blood cultures and other appropriate specimens for culture and sensitivity testing. Consider point-of-care testing for lactate levels, which can help identify sepsis earlier.

3. Initial stabilization: Provide supplemental oxygen, establish intravenous access, and administer fluids to maintain adequate circulating volum

In [57]:
print(generate_rag_response(query,temperature=.5, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves immediate supportive measures and antimicrobial therapy. Here's an outline of the protocol:

1. Initial assessment: Check vital signs, including temperature, blood pressure, pulse rate, respiratory rate, and oxygen saturation. Evaluate consciousness level using tools like the Glasgow Coma Scale. Look for signs of infection such as shaking chills, persistent fever, altered sensorium, hypotension, and GI symptoms (abdominal pain, nausea, vomiting, diarrhea).

2. Diagnostic tests: If sepsis or septic shock is suspected, obtain cultures from blood and any other appropriate specimens. Rapid diagnostic tests like procalcitonin or lactate levels can help confirm the diagnosis.

3. Fluid resuscitation: Administer crystalloids (0.9% saline or Ringer's lactate) to maintain adequate circulating volume and vital signs. The initial bolus is typ

In [58]:
print(generate_rag_response(query,top_p=0.80, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves prompt recognition, immediate intervention, and ongoing supportive care. Here's an outline of the protocol for managing sepsis in a critical care unit based on the provided context:

1. Recognition: Suspect sepsis if the patient has signs of infection (fever, chills, or known source) along with systemic inflammatory response syndrome (SIRS) criteria: temperature >38°C or <36°C, heart rate >90 beats per minute, respiratory rate >20 breaths per minute, or mean arterial pressure <70 mmHg.

2. Diagnostic testing: Obtain blood cultures and other appropriate specimens for culture and sensitivity testing. Consider point-of-care testing for lactate levels, which can help identify sepsis earlier.

3. Initial stabilization: Provide supplemental oxygen, establish intravenous access, and administer fluids to maintain adequate circulating volum

In [59]:
print(generate_rag_response(query,top_k=100, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves prompt recognition, immediate intervention, and ongoing supportive care. Here's an outline of the protocol for managing sepsis in a critical care unit based on the provided context:

1. Recognition: Suspect sepsis if the patient has signs of infection (fever, chills, or known source) along with systemic inflammatory response syndrome (SIRS) criteria: temperature >38°C or <36°C, heart rate >90 beats per minute, respiratory rate >20 breaths per minute, or mean arterial pressure <70 mmHg.

2. Diagnostic testing: Obtain blood cultures and other appropriate specimens for culture and sensitivity testing. Consider point-of-care testing for lactate levels, which can help identify sepsis earlier.

3. Initial stabilization: Provide supplemental oxygen, establish intravenous access, and administer fluids to maintain adequate circulating volum

In [60]:
print(generate_rag_response(query,k=10, max_tokens=500, temperature=.1, top_p=.5, top_k=50))

Llama.generate: prefix-match hit


Answer: Sepsis is a life-threatening condition characterized by a dysregulated host response to infection. In a critical care unit, managing sepsis involves prompt recognition, immediate intervention, and ongoing supportive care.

1. Recognition:
   - Suspect sepsis in patients with fever (≥ 38°C or ≥ 100.4°F), tachycardia (heart rate ≥ 125 beats/min), respiratory rate > 30 breaths per minute, altered mental status, and low blood pressure unresponsive to volume resuscitation.
   - Consider sepsis if patients have signs of infection along with any of the above symptoms.
   - Septic shock is a severe form of sepsis, characterized by hypotension (systolic BP < 90 mm Hg) that is unresponsive to volume resuscitation.

2. Initial Management:
   - Keep the patient warm and provide supplemental oxygen if needed.
   - Begin antibiotics as soon as possible, ideally within 1 hour of recognition. Empiric antibiotic therapy should be based on likely pathogens and severity of illness.
   - Administe

### Observation
* Initial response of the query with default parameters clearly is truncated.
* Increasing k, increases the accuracy as the response is more specific to the query.
* Increasing max token = 500 covers the response
* increasing temperature increases the creativity,
* decreasing top_p gives more focused response, and increasing top_k have almost the same query.
* Combining k=10, max_tokens=500, temperature=.1, top_p=.5, top_k=50 results creativity and incompete response.
* As it is important to recommend to accurate, concise and human safe treatment plan and diagnosis, we choose max_token = 500, temperature = 0, k = 3, top_p = .95, and top_k =40.

Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [61]:
query = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(generate_rag_response(query))

Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by epigastric or periumbilical pain that shifts to the right lower quadrant, anorexia, nausea, vomiting, and abdominal tenderness. The classic symptoms include right lower quadrant direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. However, these signs are present in less than 50% of patients, especially in children, the elderly, and pregnant women. The diagnosis is primarily clinical, but imaging studies like CT or ul


In [62]:
print(generate_rag_response(query,max_tokens=500))


Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by epigastric or periumbilical pain that shifts to the right lower quadrant, anorexia, nausea, vomiting, and abdominal tenderness. The classic symptoms include right lower quadrant direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. However, these signs are present in less than 50% of patients, especially in children, the elderly, and pregnant women. The diagnosis is primarily clinical, but imaging studies like CT or ultrasound can be used when symptoms are atypical or equivocal.

Appendicitis cannot be cured via medicine alone; surgical removal of the appendix (appendectomy) is the standard treatment. Antibiotics may be administered before surgery to prevent infection spread. If the appendix ruptures, an abscess or peritonitis may form, requiring drainage and subsequent appendectomy. In late cases with a pericolic abscess, the abscess is drained first, followed by appendectomy at a later date. A Meckel's

In [63]:
print(generate_rag_response(query,k=10, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by epigastric or periumbilical pain that shifts to the right lower quadrant, anorexia, nausea, vomiting, and abdominal tenderness. The classic symptoms include right lower quadrant direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. However, these signs are present in less than 50% of patients, especially in children, the elderly, and pregnant women. The diagnosis is primarily clinical, but imaging studies like CT or ultrasound can be used when symptoms are atypical or equivocal.

Appendicitis cannot be cured via medicine alone; surgical removal of the appendix (appendectomy) is the standard treatment. Antibiotics may be administered before surgery to prevent infection spread. If the appendix ruptures, an abscess or peritonitis may form, requiring drainage and subsequent appendectomy. In late cases with a pericolic abscess, the abscess is drained first, followed by appendectomy at a later date. A Meckel's

In [64]:
print(generate_rag_response(query,temperature=.5, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by symptoms such as epigastric or periumbilical pain followed by brief nausea, vomiting, and anorexia. After a few hours, the pain shifts to the right lower quadrant, increasing with cough and motion. Classic signs include direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. Low-grade fever is common. However, these classic findings appear in less than 50% of patients, and variations in symptoms and signs occur, particularly among elderly patients and pregnant women.

Appendicitis cannot be cured via medicine alone as it requires surgical intervention to remove the inflamed appendix. The standard treatment for appendicitis is open or laparoscopic appendectomy. Antibiotics are administered intravenously before surgery to prevent infection spread. In cases where the appendix is difficult to locate, antibiotics may be used as a temporary measure until surgery can be performed. If the appendix ruptures and for

In [65]:
print(generate_rag_response(query,top_p=0.80, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by epigastric or periumbilical pain that shifts to the right lower quadrant, anorexia, nausea, vomiting, and abdominal tenderness. The classic symptoms include right lower quadrant direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. However, these signs are present in less than 50% of patients, especially in children, the elderly, and pregnant women. The diagnosis is primarily clinical, but imaging studies like CT or ultrasound can be used when symptoms are atypical or equivocal.

Appendicitis cannot be cured via medicine alone; surgical removal of the appendix (appendectomy) is the standard treatment. Antibiotics may be administered before surgery to prevent infection spread. If the appendix ruptures, an abscess or peritonitis may form, requiring drainage and subsequent appendectomy. In late cases with a pericolic abscess, the abscess is drained first, followed by appendectomy at a later date. A Meckel's

In [66]:
print(generate_rag_response(query,top_k=100, max_tokens=500))

Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by epigastric or periumbilical pain that shifts to the right lower quadrant, anorexia, nausea, vomiting, and abdominal tenderness. The classic symptoms include right lower quadrant direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. However, these signs are present in less than 50% of patients, especially in children, the elderly, and pregnant women. The diagnosis is primarily clinical, but imaging studies like CT or ultrasound can be used when symptoms are atypical or equivocal.

Appendicitis cannot be cured via medicine alone; surgical removal of the appendix (appendectomy) is the standard treatment. Antibiotics may be administered before surgery to prevent infection spread. If the appendix ruptures, an abscess or peritonitis may form, requiring drainage and subsequent appendectomy. In late cases with a pericolic abscess, the abscess is drained first, followed by appendectomy at a later date. A Meckel's

In [67]:
print(generate_rag_response(query,k=10, max_tokens=500, temperature=.1, top_p=.5, top_k=50))

Llama.generate: prefix-match hit


Answer: Appendicitis is characterized by symptoms such as epigastric or periumbilical pain followed by brief nausea, vomiting, and anorexia. After a few hours, the pain shifts to the right lower quadrant, which increases with cough and motion. Classic signs include direct and rebound tenderness at McBurney's point, Rovsing sign, psoas sign, or obturator sign. Low-grade fever is common. However, these classic findings appear in less than 50% of patients, and many variations of symptoms and signs occur. Pain may not be localized, particularly in infants and children, and tenderness may be diffuse or absent.

Appendicitis cannot be cured via medicine alone; surgical removal is the treatment of choice. This can be done through open or laparoscopic appendectomy. The surgeon should precede the procedure with IV antibiotics, typically third-generation cephalosporins. If the appendix is perforated, antibiotics should be continued until the patient's temperature and WBC count have normalized. I

### Observation
* Initial response of the query with default parameters clearly is truncated.
* Increasing max token = 500 covers the response
* Increasing k/top_k/decreasing top_p gave the same response.
* increasing temperature increases the creativity,
* Combining k=10, max_tokens=500, temperature=.1, top_p=.5, top_k=50 results creativity and incompete response.
* As it is important to recommend to accurate, concise and human safe treatment plan and diagnosis, we choose max_token = 500, temperature = 0, k = 3, top_p = .95, and top_k =40.

Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [68]:
query = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(generate_rag_response(query))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scal


In [69]:
print(generate_rag_response(query,max_tokens=500))


Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scalp biopsy may be required for definitive diagnosis. Scalp biopsy can differentiate scarring from nonscarring forms and help identify underlying conditions like lupus erythematosus, lichen planopilaris, or systemic sclerosis.

The treatment options for alopecia areata include:

1. Topical corticosteroids: These medications can be applied directly to the affected area to reduce inflammation and promote hair regrowth.
2. 

In [70]:
print(generate_rag_response(query,max_tokens=800))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scalp biopsy may be required for definitive diagnosis. Scalp biopsy can differentiate scarring from nonscarring forms and help identify underlying conditions like lupus erythematosus, lichen planopilaris, or systemic sclerosis.

The treatment options for alopecia areata include:

1. Topical corticosteroids: These medications can be applied directly to the affected area to reduce inflammation and promote hair regrowth.
2. 

In [71]:
print(generate_rag_response(query,k=10, max_tokens=800))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scalp biopsy may be required for definitive diagnosis. Scalp biopsy can differentiate scarring from nonscarring forms and help identify underlying conditions like lupus erythematosus, lichen planopilaris, or systemic sclerosis.

The treatment options for alopecia areata include:

1. Topical corticosteroids: These medications can be applied directly to the affected area to reduce inflammation and promote hair regrowth.
2. 

In [72]:
print(generate_rag_response(query,temperature=.5, max_tokens=800))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in otherwise healthy individuals. The exact cause of alopecia areata remains unclear, but it's believed to be an autoimmune disorder affecting genetically susceptible people exposed to unknown environmental triggers.

The treatment options for alopecia areata include:

1. Topical treatments: Minoxidil 2% or 5%, applied once daily to the affected area, can help stimulate hair regrowth in some cases. However, it may take several months before any improvement is noticeable.

2. Intralesional corticosteroid injections: These are typically administered every 4-6 weeks and can promote hair re-growth by reducing inflammation around the affected follicles. The response to treatment varies greatly from person to person, and side effects include skin atrophy, telangiectasia, and localized infections.

3. Topical immunotherapies: Diphencyprone or squaric acid dibutylester can be used to 

In [73]:
print(generate_rag_response(query,top_p=0.80, max_tokens=800))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scalp biopsy may be required for definitive diagnosis. Scalp biopsy can differentiate scarring from nonscarring forms and help identify underlying conditions like lupus erythematosus, lichen planopilaris, or systemic sclerosis.

The treatment options for alopecia areata include:

1. Topical corticosteroids: These medications can be applied directly to the affected area to reduce inflammation and promote hair regrowth.
2. 

In [74]:
print(generate_rag_response(query,top_k=100, max_tokens=800))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is believed to be an autoimmune disorder affecting genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scalp biopsy may be required for definitive diagnosis. Scalp biopsy can differentiate scarring from nonscarring forms and help identify underlying conditions like lupus erythematosus, lichen planopilaris, or systemic sclerosis.

The treatment options for alopecia areata include:

1. Topical corticosteroids: These medications can be applied directly to the affected area to reduce inflammation and promote hair regrowth.
2. 

In [75]:
print(generate_rag_response(query,k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50))

Llama.generate: prefix-match hit


Answer: The condition you're referring to is called alopecia areata. It presents with sudden, patchy hair loss in people without any obvious skin or systemic disorders. The most frequently affected areas are the scalp and beard. In severe cases, all body hair may be involved (alopecia universalis). Alopecia areata is considered an autoimmune disorder that affects genetically susceptible individuals exposed to unclear environmental triggers.

The diagnosis of alopecia areata is typically made based on the clinical presentation. However, microscopic hair examination or scalp biopsy may be required for definitive diagnosis. Scalp biopsy can differentiate scarring from nonscarring forms and help identify underlying conditions like lupus erythematosus, lichen planopilaris, or systemic sclerosis.

The treatment options for alopecia areata include:

1. Topical corticosteroids: These medications can be applied directly to the affected area to reduce inflammation and promote hair regrowth.
2. M

### Observation
* Initial response of the query with default parameters clearly is truncated.
* Increasing max token = 500 doesn't covers the response. So, we increase it to 800
* increasing temperature increases the creativity,
* increasing k/top_k/decreasing top_p gives almost the same response as as only max_tokens = 800.
* Combining k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50 results almost give the better sumerised response.
* As it is important to recommend to accurate, concise and human safe treatment plan and diagnosis, we choose k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50.

Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [76]:
query = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(generate_rag_response(query))

Llama.generate: prefix-match hit


Answer: For a person with a brain injury leading to temporary or permanent impairment of brain function, the primary focus is on supportive care. This includes preventing systemic complications such as pneumonia, urinary tract infections (UTIs), and thromboembolic disease; providing good nutrition; and preventing pressure ulcers. Additionally, physical therapy may be necessary to prevent limb contractures.

In cases of traumatic brain injury, posttraumatic seizures may occur, and drugs are given to prevent seizures during the first week if there is significant structural injury or a Glasgow Com


In [77]:
print(generate_rag_response(query,max_tokens=500))


Llama.generate: prefix-match hit


Answer: For individuals with a brain injury leading to temporary or permanent impairment of brain function, supportive care is the primary treatment approach. This includes preventing systemic complications such as pneumonia, urinary tract infections (UTIs), and thromboembolic diseases; providing good nutrition; and preventing pressure ulcers. Additionally, physical therapy can help prevent limb contractures.

In cases of traumatic brain injury, posttraumatic seizures may occur, and drugs are given to prevent seizures during the first week if there is significant structural injury or a Glasgow Coma Scale (GCS) score below 10. However, these drugs do not prevent permanent posttraumatic epilepsy months or years later. If seizures begin more than one week after head injury, long-term treatment with drugs is required.

The principles of long-term treatment involve finding the right drug or combination of drugs that effectively controls seizures while minimizing adverse effects. Some drugs 

In [78]:
print(generate_rag_response(query,max_tokens=800))


Llama.generate: prefix-match hit


Answer: For individuals with a brain injury leading to temporary or permanent impairment of brain function, supportive care is the primary treatment approach. This includes preventing systemic complications such as pneumonia, urinary tract infections (UTIs), and thromboembolic diseases; providing good nutrition; and preventing pressure ulcers. Additionally, physical therapy can help prevent limb contractures.

In cases of traumatic brain injury, posttraumatic seizures may occur, and drugs are given to prevent seizures during the first week if there is significant structural injury or a Glasgow Coma Scale (GCS) score below 10. However, these drugs do not prevent permanent posttraumatic epilepsy months or years later. If seizures begin more than one week after head injury, long-term treatment with drugs is required.

The principles of long-term treatment involve finding the right drug or combination of drugs that effectively controls seizures while minimizing adverse effects. Some drugs 

In [79]:
print(generate_rag_response(query,k=10, max_tokens=800))

Llama.generate: prefix-match hit


Answer: For individuals with a brain injury leading to temporary or permanent impairment of brain function, supportive care is the primary treatment approach. This includes preventing systemic complications such as pneumonia, urinary tract infections (UTIs), and thromboembolic diseases; providing good nutrition; and preventing pressure ulcers. Additionally, physical therapy can help prevent limb contractures.

In cases of traumatic brain injury, posttraumatic seizures may occur, and drugs are given to prevent seizures during the first week if there is significant structural injury or a Glasgow Coma Scale (GCS) score below 10. However, these drugs do not prevent permanent posttraumatic epilepsy months or years later. If seizures begin more than one week after head injury, long-term treatment with drugs is required.

The principles of long-term treatment involve finding the right drug or combination of drugs that effectively controls seizures while minimizing adverse effects. Some drugs 

In [80]:
print(generate_rag_response(query,temperature=.5, max_tokens=800))

Llama.generate: prefix-match hit


Answer: The treatment for a person with a brain injury primarily involves supportive care to prevent complications and promote recovery. This includes:

1. Preventing systemic complications due to immobilization such as pneumonia, UTI, and thromboembolic disease.
2. Providing good nutrition.
3. Preventing pressure ulcers.
4. Providing physical therapy to prevent limb contractures.
5. Managing seizures with anticonvulsant medication if necessary.

The Merck Manual suggests that for patients in a coma or impaired consciousness state, supportive care is the mainstay of treatment. No specific treatment exists for brain injuries, but addressing potential complications and providing rehabilitation services early can improve outcomes.

In cases where there is significant structural injury to the brain or a Glasgow Coma Scale (GCS) score of less than 10, anticonvulsant medication may be given to prevent seizures during the first week after injury. If seizures occur after one week, long-term tr

In [81]:
print(generate_rag_response(query,top_p=0.80, max_tokens=800))

Llama.generate: prefix-match hit


Answer: For individuals with a brain injury leading to temporary or permanent impairment of brain function, supportive care is the primary treatment approach. This includes preventing systemic complications such as pneumonia, urinary tract infections (UTIs), and thromboembolic diseases; providing good nutrition; and preventing pressure ulcers. Additionally, physical therapy can help prevent limb contractures.

In cases of traumatic brain injury, posttraumatic seizures may occur, and drugs are given to prevent seizures during the first week if there is significant structural injury or a Glasgow Coma Scale (GCS) score below 10. However, these drugs do not prevent permanent posttraumatic epilepsy months or years later. If seizures begin more than one week after head injury, long-term treatment with drugs is required.

The principles of long-term treatment involve finding the right drug or combination of drugs that effectively controls seizures while minimizing adverse effects. Some drugs 

In [82]:
print(generate_rag_response(query,top_k=100, max_tokens=800))

Llama.generate: prefix-match hit


Answer: For individuals with a brain injury leading to temporary or permanent impairment of brain function, supportive care is the primary treatment approach. This includes preventing systemic complications such as pneumonia, urinary tract infections (UTIs), and thromboembolic diseases; providing good nutrition; and preventing pressure ulcers. Additionally, physical therapy can help prevent limb contractures.

In cases of traumatic brain injury, posttraumatic seizures may occur, and drugs are given to prevent seizures during the first week if there is significant structural injury or a Glasgow Coma Scale (GCS) score below 10. However, these drugs do not prevent permanent posttraumatic epilepsy months or years later. If seizures begin more than one week after head injury, long-term treatment with drugs is required.

The principles of long-term treatment involve finding the right drug or combination of drugs that effectively controls seizures while minimizing adverse effects. Some drugs 

In [83]:
print(generate_rag_response(query,k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50))

Llama.generate: prefix-match hit


Answer: For individuals with brain injuries leading to temporary or permanent impairments, supportive care is the primary treatment approach. This includes preventing systemic complications such as pneumonia, UTI, and thromboembolic disease; providing good nutrition; and preventing pressure ulcers. In some cases, physical therapy may be necessary to prevent limb contractures.

Additionally, for patients with traumatic brain injuries who have significant structural injury or a Glasgow Coma Scale (GCS) score of less than 10, drugs are given to prevent seizures during the first week after injury. These medications reduce the risk of seizures but do not prevent permanent post-traumatic epilepsy months or years later.

Long-term treatment for epileptic seizures involves finding the most effective drug or combination of drugs for each patient, as no single drug controls all types of seizures. Some drugs reach therapeutic levels quickly, while others require gradual dosage increases over seve

### Observation
* Initial response of the query with default parameters clearly is truncated.
* Increasing max token = 500 doesn't covers the response. So, we increase it to 800
* increasing temperature increases the creativity but answer is not informative,
* increasing k/top_k/decreasing top_p gives almost the same response as as only max_tokens = 800.
* Combining k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50 results better response.
* As it is important to recommend to accurate, concise and human safe treatment plan and diagnosis, we choose k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50

Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [84]:
query = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(query))

Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

For most fractures, the first steps include immobilizing the affected limb with a splint to prevent further injury and decrease pain. Pain is typically treated with opioids. Definitive


In [85]:
print(generate_rag_response(query,max_tokens=500))


Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

For most fractures, the first steps include immobilizing the affected limb with a splint to prevent further injury and decrease pain. Pain is typically treated with opioids. Definitive treatment often involves reduction, which can be closed (without skin incision) or open (with skin incision). Closed reductions are maintained by casting, while open reductions require various surgical hardware.

RICE (rest, ice, compression, and elevation) is recommended for soft-tissue injuries. Patients should rest to prevent further injury and speed healing, apply ice and compression to minimize swelling and pain, and el

In [86]:
print(generate_rag_response(query,max_tokens=800))


Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

For most fractures, the first steps include immobilizing the affected limb with a splint to prevent further injury and decrease pain. Pain is typically treated with opioids. Definitive treatment often involves reduction, which can be closed (without skin incision) or open (with skin incision). Closed reductions are maintained by casting, while open reductions require various surgical hardware.

RICE (rest, ice, compression, and elevation) is recommended for soft-tissue injuries. Patients should rest to prevent further injury and speed healing, apply ice and compression to minimize swelling and pain, and el

In [87]:
print(generate_rag_response(query,k=10,max_tokens=800))

Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

For most fractures, the first steps include immobilizing the affected limb with a splint to prevent further injury and decrease pain. Pain is typically treated with opioids. Definitive treatment often involves reduction, which can be closed (without skin incision) or open (with skin incision). Closed reductions are maintained by casting, while open reductions require various surgical hardware.

RICE (rest, ice, compression, and elevation) is recommended for soft-tissue injuries. Patients should rest to prevent further injury and speed healing, apply ice and compression to minimize swelling and pain, and el

In [88]:
print(generate_rag_response(query,temperature=.5, max_tokens=800))

Llama.generate: prefix-match hit


Answer: A leg fracture sustained while hiking requires prompt assessment and appropriate intervention to ensure the best possible outcome. The first priority is to assess for any life-threatening injuries or complications, such as hemorrhagic shock due to significant blood loss. If there are no signs of severe injury or shock, the following steps should be taken:

1. Immobilize the injured leg: Use a non-rigid splint or sling to prevent further damage and decrease pain. This initial immobilization will help maintain stability and prevent additional injury while transporting the patient to a healthcare facility.

2. Pain management: Administer pain medication as needed, typically opioids, to make the patient comfortable during transportation and initial evaluation.

3. Definitive treatment: Depending on the type and severity of the fracture, definitive treatment may include closed or open reduction followed by casting or surgical hardware (e.g., pins, screws, plates). Closed reduction i

In [89]:
print(generate_rag_response(query,top_p=0.80, max_tokens=800))

Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

For most fractures, the first steps include immobilizing the affected limb with a splint to prevent further injury and decrease pain. Pain is typically treated with opioids. Definitive treatment often involves reduction, which can be closed (without skin incision) or open (with skin incision). Closed reductions are maintained by casting, while open reductions require various surgical hardware.

RICE (rest, ice, compression, and elevation) is recommended for soft-tissue injuries. Patients should rest to prevent further injury and speed healing, apply ice and compression to minimize swelling and pain, and el

In [90]:
print(generate_rag_response(query,top_k=100, max_tokens=800))

Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

For most fractures, the first steps include immobilizing the affected limb with a splint to prevent further injury and decrease pain. Pain is typically treated with opioids. Definitive treatment often involves reduction, which can be closed (without skin incision) or open (with skin incision). Closed reductions are maintained by casting, while open reductions require various surgical hardware.

RICE (rest, ice, compression, and elevation) is recommended for soft-tissue injuries. Patients should rest to prevent further injury and speed healing, apply ice and compression to minimize swelling and pain, and el

In [91]:
print(generate_rag_response(query,k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50))

Llama.generate: prefix-match hit


Answer: A fractured leg is a serious injury that requires prompt medical attention. In the initial assessment, evaluate the patient for signs of hemorrhagic shock due to potential blood loss. If there are symptoms or signs of ischemia (absent pulses, marked pallor, coolness distal to the injury, severe pain), immediate evaluation and treatment for arterial injuries may be necessary.

1. Precautions:
   a. Immobilize the affected limb using a splint or sling to prevent further injury and decrease pain.
   b. Ensure proper alignment of the limb to avoid complications such as flexion contracture or malunion.
   c. Provide analgesia or sedation for definitive treatment, which may involve reduction through closed or open methods.
   d. Administer RICE (rest, ice, compression, elevation) therapy for soft-tissue injuries.
   e. For patients with a lower-limb prosthesis, maintain body alignment and address any issues related to the interface fit.

2. Treatment:
   a. In the emergency departmen

### Observation
* Initial response of the query with default parameters clearly is truncated.
* Increasing max token = 500 doesn't covers the response. So, we increase it to 800
* increasing temperature increases the creativity but answer is not informative,
* increasing k/top_k/decreasing top_p gives almost the same response as as only max_tokens = 800.
* Combining k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50 results the same response.
* As it is important to recommend to accurate, concise and human safe treatment plan and diagnosis, we choose k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50

## Output Evaluation

Let us now use the LLM-as-a-judge method to check the quality of the RAG system on two parameters - retrieval and generation. We illustrate this evaluation based on the answeres generated to the question from the previous section.

* We are using the same Mistral model for evaluation, so basically here the llm is rating itself on how well he has performed in the task.

* **Groundedness**  - measures how well the generated text (often an answer to a question) is supported by the retrieved information from a knowledge base or external source. It is a measure of faithfulness

* **Relevance** - refers to how well the retrieved information from a knowledge base aligns with and directly answers a user's query or fulfills their need.

In [92]:
groundedness_rater_system_message = """
You are acting as an impartial evaluator tasked with judging how well an AI-generated answer adheres to a specific metric. Keep the response brief and to the point and keeping the token count below 1000.
Alway rate score between 1 and 5 in "Score:"

You will be given:
1. A question (introduced with ###Question)
2. A context used by the AI system to generate the answer (introduced with ###Context)
3. The AI-generated answer (introduced with ###Answer)

Evaluation Metric:
The answer should be derived only from the information presented in the context.

Scoring Scale:
Score	Description
1.	The metric is not followed at all
2.	The metric is followed only to a limited extent
3.	The metric is followed to a good extent
4.	The metric is mostly followed
5.	The metric is completely followed

Evaluation Instructions:
1. Identify the key information provided in the ###Context.
2. Compare the ###Answer against the ###Context to determine whether all claims or statements are directly supported.
3. ALWAYS Assign a final score between 1 and 5, based on your explanation and the scoring scale in "Score:".

Explain step by step:
1. Which parts of the answer are supported by the context
2. Which (if any) are not supported or go beyond the given context
3. Determine the overall degree of adherence to the metric.
4. ALWAYS Assign a final score between 1 and 5, based on your explanation and the scoring scale in "Score:".
"""

In [93]:
relevance_rater_system_message = """
You are acting as an impartial evaluator tasked with judging how well an AI-generated answer adheres to a specific metric. Keep the response brief and to the point and keeping the token count below 1000.
Alway rate score between 1 and 5 in "Score:"

For each evaluation, you will be given:

1. A question starting with ###Question.
2. A context used by the AI system to generate the answer, starting with ###Context.
3. An AI-generated answer, starting with ###Answer.

Evaluation Metric:

You will evaluate the AI-generated answer based on Relevance, which measures how well the answer addresses the main aspects of the question, given the context. Keep the response brief and to the point and keeping the token count below 1000. ALWAYS provide the score between 1 and 5.
The key points for this evaluation are:

1. Does the answer address all important aspects of the question?
2. Does the answer avoid including irrelevant or unnecessary information?
3. Does the answer align with the key elements from the context provided?

Rating Criteria:
1: The answer is completely irrelevant to the question and context.
2: The answer addresses the question but only partially or with significant gaps in relevance.
3: The answer is mostly relevant and covers most aspects of the question but may lack detail or have minor off-topic information.
4: The answer is highly relevant and addresses nearly all aspects of the question with minimal irrelevant content.
5: The answer is entirely relevant, comprehensive, and fully aligned with both the question and the context.

Instructions:
Step 1: Analyze the question, context, and AI-generated answer.
      - What is the main purpose of the question?
      - What are the key details and objectives the context highlights?
      - Does the AI answer all those objectives or details?

Step 2: Check the relevance of the answer.
      - Does the answer fully align with the key details of the context?
      - Does it answer the question in a clear and concise manner?
      - Are there any irrelevant or superfluous details in the answer?

Step 3: Rate the relevance between 1 and 5 according to the rating criteria above.

Step 4: Provide a brief explanation of the score you assigned, highlighting how well the answer adheres to the relevance metric.
"""

In [94]:
user_message_template = """
###Question
{question}

###Context
{context}

###Answer
{answer}

"Score:"
"""

In [95]:
def generate_ground_relevance_response_details(user_input,k=3,max_tokens=128,temperature=0,top_p=0.95,top_k=40):
    global qna_system_message,qna_user_message_template
    # Retrieve relevant document chunks
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input,k=3)
    context_list = [d.page_content for d in relevant_document_chunks]
    context_for_query = ". ".join(context_list)

    # Combine user_prompt and system_message to create the prompt
    prompt = f"""[INST]{qna_system_message}\n
                {'user'}: {qna_user_message_template.format(context=context_for_query, question=user_input)}
                [/INST]"""

    response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            echo=False
            )

    answer =  response["choices"][0]["text"]

    # Combine user_prompt and system_message to create the prompt
    groundedness_prompt = f"""[INST]{groundedness_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    # Combine user_prompt and system_message to create the prompt
    relevance_prompt = f"""[INST]{relevance_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    response_1 = llm(
            prompt=groundedness_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            echo=False
            )

    response_2 = llm(
            prompt=relevance_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            echo=False
            )

    return response_1['choices'][0]['text'],response_2['choices'][0]['text']


### Running a sample, randomized query

In [96]:
# We used the tuned parameter k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50. So setting this as default when we access to the main generate_ground_relevance_response_details function
def generate_ground_relevance_response(user_input, k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50):
  ground,rel = generate_ground_relevance_response_details(user_input,k,max_tokens,temperature,top_p,top_k)

  BOLD = "\033[1m"
  END = "\033[0m"
  print(BOLD + "Grounded:" + END)
  print(ground,end="\n\n")
  print(BOLD + "Relevance:"  + END)
  print(rel)


generate_ground_relevance_response(user_input="What is the treatment of kidney stone?", k=10, max_tokens=800, temperature=.1, top_p=.5, top_k=50)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


[1mGrounded:[0m
 Score: 5

Explanation:
The answer adheres to the metric as it is derived solely from the context provided. The context discusses various treatments for kidney stones based on their size, location, and composition. The AI-generated answer summarizes these treatment options accurately without introducing any new or unsupported information.

[1mRelevance:[0m
 Score: 5

Explanation: The AI-generated answer fully addresses all important aspects of the question regarding kidney stone treatment, including natural passage, interventions like ESWL and endoscopic procedures, medications, pain management, and long-term management strategies. It also avoids irrelevant or unnecessary information and aligns well with the context provided, which


### Observation

* LLM rates 5/5 for groundness and 5/5 for relevance.
* It also provide explanation.

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [97]:
print(generate_ground_relevance_response("What is the protocol for managing sepsis in a critical care unit?"))

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


[1mGrounded:[0m
 Score: 5

The answer adheres completely to the metric as it is derived solely from the information provided in the context. The steps outlined in the answer are directly supported by the text, and no additional or unsupported claims are made.

[1mRelevance:[0m
 Score: 5

Explanation: The AI-generated answer fully addresses the main aspects of the question by outlining the key steps for managing sepsis in a critical care unit based on the context provided. It avoids irrelevant information and aligns with the key elements from the context, making it entirely relevant and comprehensive.
None


### Observation

* LLM rantes 5/5 for groundness and 5/5 for relevance with explanation.
* It proves that the contexts are relevant to the query, and the anwer is relevant to query with context.

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [98]:
print(generate_ground_relevance_response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"))

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


[1mGrounded:[0m
 Score: 5

Explanation:
The answer adheres to the metric as it is derived solely from the context provided, accurately summarizing the symptoms, diagnosis, treatment, and limitations of appendicitis. The answer does not introduce any new or unsupported information.

[1mRelevance:[0m
 Score: 5

Explanation: The AI-generated answer fully addresses all aspects of the question, providing an accurate description of the symptoms for appendicitis and stating that it cannot be cured via medicine alone but requires surgical removal. The answer also mentions the standard treatment procedure (appendectomy with antibiotics) and possible complications (abscess formation). Additionally, the AI-generated answer distinguishes between appendicitis and hernias of the abdominal wall, ensuring relevance to both parts of the question.
None


### Observation

* LLM rantes 5/5 for groundness and 5/5 for relevance with explanation.
* It proves that the contexts are relevant to the query, and the anwer is relevant to query with context.

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [99]:
print(generate_ground_relevance_response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"))

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


[1mGrounded:[0m
 Score: 5

Explanation:
The answer adheres to the metric as it is derived entirely from the context provided. The answer accurately identifies Alopecia Areata as a type of nonscarring alopecia and lists the treatment options directly from the Merck Manual mentioned in the context. Additionally, the possible causes are also mentioned from the context.

[1mRelevance:[0m
 Score: 5

Explanation: The AI-generated answer fully addresses both the question and context by providing a clear explanation of Alopecia Areata, its causes, and effective treatments. It also aligns with the key elements from the context, such as nonscarring alopecia, autoimmune disorder, and various treatment options. The answer is comprehensive, relevant, and free of irrelevant information.
None


### Observation

* LLM rantes 5/5 for groundness and 5/5 for relevance with explanation.
* It proves that the contexts are relevant to the query, and the anwer is relevant to query with context.

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [100]:
print(generate_ground_relevance_response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"))

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


[1mGrounded:[0m
 Score: 5

Explanation:
The answer adheres to the metric as it is derived directly from the context provided in the Merck Manual of Diagnosis & Therapy, 19th Edition. The answer includes all recommended treatments for a person who has sustained a physical injury to brain tissue, based on the information presented in the context.

[1mRelevance:[0m
 Score: 5

Explanation: The AI-generated answer fully addresses all important aspects of the question by listing specific treatments recommended for a person with brain tissue injury, including supportive care, post-traumatic seizure treatment, long-term epilepsy treatment, and rehabilitation services. It also aligns with the key elements from the context provided in The Merck Manual of Diagnosis & Therapy by mentioning various treatments and their purposes. The answer is clear, concise, and free of irrelevant information.
None


### Observation

* LLM rantes 5/5 for groundness and 5/5 for relevance with explanation.
* It proves that the contexts are relevant to the query, and the anwer is relevant to query with context.

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [101]:
print(generate_ground_relevance_response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"))

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


[1mGrounded:[0m
 Score: 5

The answer adheres completely to the metric as it is derived solely from the context provided and does not go beyond the given information. The steps outlined in the answer are directly supported by the context, which discusses various aspects of fracture treatment, including assessment, immobilization, pain management, definitive treatment, RICE protocol, rehabilitation, and prosthetic considerations.

[1mRelevance:[0m
 Score: 5

Explanation: The AI-generated answer fully addresses all important aspects of the question and context by outlining necessary precautions and treatment steps for a person with a fractured leg during hiking. It avoids irrelevant information and aligns well with the key elements from the context provided, such as immobilization, pain management, definitive treatment, RICE protocol, rehabilitation, and prosthetic considerations.
None


### Observation

* LLM rantes 5/5 for groundness and 5/5 for relevance.
* It proves that RAG performed very well responding queries.

**Overall, our RAG's performance is very good according to the query input that we have**

## Actionable Insights and Business Recommendations

**Actionable Insight:**

1. Meta's llama model family with llama.cpp, and Mistral-7B-Instruct-v0.2-GGUF model has been used in this RAG application.
2. Sentence transformer, and vector database (Chroma) have been used for embedding PDF medical document.
3. overlapped chunking has been used to ensure context contunity, coherence.
4. model response has been tuned with max_tokens, temperature, top_k, and top_p. Higher max_tokens help us to avoid query truncation, capture detail response but with higher complexity, cost.
5. higher temperature, large top_k, and lower top_p increase query response randomness, creativity, etc.
* Evaluated by llm-as-a-judge using two metrics groundness and relevance.

**Business Recommendation**
1. This RAG should be integrated into company's system to increase operational efficieny, and acurate medical diagnosis, and treatment plan.
2. Establish a feedback loop to fine-tune parameters, improving performance for diverse query types.




<font size=6 color='blue'>Power Ahead</font>
___