## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to create accurate diagnoses and treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

**Common Questions to Answer**

**1. Diagnostic Assistance**: "What are the common symptoms and treatments for pulmonary embolism?"

**2. Drug Information**: "Can you provide the trade names of medications used for treating hypertension?"

**3. Treatment Plans**: "What are the first-line options and alternatives for managing rheumatoid arthritis?"

**4. Specialty Knowledge**: "What are the diagnostic steps for suspected endocrine disorders?"

**5. Critical Care Protocols**: "What is the protocol for managing sepsis in a critical care unit?"

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co., that cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [2]:
# Installation for GPU llama-cpp-python
# uncomment and run the following code in case GPU is being used
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

# Installation for CPU llama-cpp-python
# uncomment and run the following code in case GPU is not being used
# !CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/1.8 MB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.8/1.8 MB[0m [31m74.4 MB/s[0m eta [36m0:00:00[0m
[?25h  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m62.1/62.1 kB[0m [31m162.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m45.5/45.5 kB[0m [31m161.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m16.9/16.9 MB[0m [31m259.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m43.9/43.9 kB[0m [31m249.8 MB/s[0m eta [36m0:00:00[0m
[?25h  Building wheel for llama-cpp-python (pyproject.toml) ... [?25l[?25hdone
[31mERROR: pip's dependency

**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in ***this notebook***.

In [3]:
# For installing the libraries & downloading models from HF Hub
!pip install huggingface_hub==0.23.2 pandas==1.5.3 tiktoken==0.6.0 pymupdf==1.25.1 langchain==0.1.1 langchain-community==0.0.13 chromadb==0.4.22 sentence-transformers==2.3.1 numpy==1.25.2 -q

[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas==2.2.2, but you have pandas 1.5.3 which is incompatible.
peft 0.17.0 requires huggingface_hub>=0.25.0, but you have huggingface-hub 0.23.2 which is incompatible.
cudf-cu12 25.6.0 requires pandas<2.2.4dev0,>=2.0, but you have pandas 1.5.3 which is incompatible.
gradio 5.39.0 requires huggingface-hub<1.0,>=0.33.5, but you have huggingface-hub 0.23.2 which is incompatible.
db-dtypes 1.4.3 requires packaging>=24.2.0, but you have packaging 23.2 which is incompatible.
opencv-contrib-python 4.12.0.88 requires numpy<2.3.0,>=2; python_version >= "3.9", but you have numpy 1.25.2 which is incompatible.
diffusers 0.34.0 requires huggingface-hub>=0.27.0, but you have huggingface-hub 0.23.2 which is incompatible.
opencv-python 4.12.0.88 requires numpy<2.3.0,>=2; python_version >= "3.9", 

**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in ***this notebook***.

In [4]:
#Libraries for processing dataframes,text
import json,os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

#Libraries for downloading and loading the llm
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Question Answering using LLM

#### Downloading and Loading the model

In [5]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [6]:
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


mistral-7b-instruct-v0.2.Q6_K.gguf:   0%|          | 0.00/5.94G [00:00<?, ?B/s]

In [11]:
llm = Llama(
    model_path=model_path,
    n_ctx=4096,
    n_gpu_layers=38,
    n_batch=512
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Response

In [8]:
def response(query,max_tokens=128,temperature=0,top_p=0.95,top_k=50):
    model_output = llm(
      prompt=query,
      max_tokens=max_tokens,
      temperature=temperature,
      top_p=top_p,
      top_k=top_k
    )

    return model_output['choices'][0]['text']

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [111]:
query = "What is the protocol for managing sepsis in a critical care unit?"
print(response(query))

Llama.generate: prefix-match hit




Sepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:

1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.
2. Source control


### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [110]:
query = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(response(query))

Llama.generate: prefix-match hit




Appendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:

1. Abdominal pain: The pain is typically located in the lower right quadrant of the abdomen and may start as a mild discomfort that gradually worsens over time. The pain may be constant or intermittent and may be aggravated by movement, deep breathing, or coughing.
2. Loss of appetite


### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [109]:
query = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(response(query))

Llama.generate: prefix-match hit




Sudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.

The exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications.


### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [108]:
query = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(response(query))

Llama.generate: prefix-match hit




A person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:

1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.
2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [107]:
query = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(response(query))

Llama.generate: prefix-match hit




First and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:

1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.
2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.
3. Immobilize the leg: Use a splint, sl


## Question Answering using LLM with Prompt Engineering

In [95]:
def build_prompt(query: str) -> str:

    system_role = (
        "You are a seasoned clinician with over 10 years of experience in critical care."
    )

    instructions = ("""
        Please answer the following question clearly and briefly, using simple language.
Focus only on the key points relevant to the question without extra details.
    """
    )

    final_prompt = f"{system_role}\n\n{instructions}\nQuestion: {query}"
    return final_prompt

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [106]:
query = "What is the protocol for managing sepsis in a critical care unit?"
prompt_query = build_prompt(query)
print(response(prompt_query, max_tokens= 200 ))

Llama.generate: prefix-match hit




Answer: 1. Early recognition and quick response: Identify sepsis early based on clinical signs, lab results (fever, tachycardia, respiratory distress, lactate level), and use of scoring systems like Sequential Organ Failure Assessment (SOFA) or Quick Sequential Organ Failure Assessment (qSOFA). Initiate treatment promptly.

2. Fluid resuscitation: Administer intravenous fluids to maintain adequate blood pressure and organ perfusion, aiming for a mean arterial pressure (MAP) of 65-70 mmHg. Use crystalloids initially, then consider colloids or blood products if fluid requirements are high.

3. Antibiotics: Start broad-spectrum antibiotics as soon as possible based on suspected infection source and local microbial resistance patterns. Adjust therapy based on culture results and clinical response.



### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [105]:
query = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
prompt_query = build_prompt(query)
print(response(prompt_query, max_tokens= 500, temperature=0.5 ))

Llama.generate: prefix-match hit




Answer: Appendicitis is characterized by the following symptoms: 1) Abdominal pain usually starting around the navel area, then shifting to the right lower quadrant; 2) Loss of appetite; 3) Nausea and vomiting; 4) Fever; 5) Constipation or diarrhea. Appendicitis cannot be cured with medicine alone as it requires surgical removal of the inflamed appendix, known as an appendectomy. This procedure can be performed laparoscopically (minimally invasive) or through a larger incision (open surgery), depending on the severity and individual circumstances.


### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [104]:
query = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
prompt_query = build_prompt(query)
print(response(prompt_query, max_tokens= 300, top_p = 0.9 ))

Llama.generate: prefix-match hit




Answer: Sudden patchy hair loss, also known as alopecia areata, is an autoimmune condition where the immune system attacks hair follicles. There's no definitive cure, but several treatments can help stimulate hair regrowth or prevent further loss.

1. Corticosteroids: Topical or injected corticosteroids are often used to reduce inflammation and suppress the immune response. They can be effective in promoting hair regrowth.
2. Immunotherapy: Minoxidil, a medication that stimulates hair growth, can be applied topically. Diphencyprone (DPCP) or squaric acid dibutyl ester (SADBE), which trigger an allergic reaction and stimulate the immune system to promote hair regrowth, are other options.
3. Hair transplant: In severe cases where there's extensive hair loss, a hair transplant may be considered as a last resort.
4. Lifestyle modifications: Stress management techniques, such as meditation or yoga, can help reduce stress and improve overall health. A balanced diet, rich in vitamins and mi

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [103]:
query = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
prompt_query = build_prompt(query)
print(response(prompt_query, max_tokens= 200, top_p = 0.95, top_k=10 ))

Llama.generate: prefix-match hit




Answer: For a person with a brain injury, treatment depends on the severity and location of the injury. Common treatments include:
1. Emergency care: This may involve surgery to remove hematomas (blood clots) or repair skull fractures.
2. Medications: To manage symptoms such as seizures, pain, or swelling in the brain.
3. Rehabilitation: Physical therapy, occupational therapy, speech therapy, and cognitive rehabilitation can help improve function and independence.
4. Assistive devices: Devices like wheelchairs, walkers, or communication aids may be necessary for mobility and communication.
5. Lifestyle modifications: These may include dietary changes, stress management techniques, and regular exercise to promote overall health and well-being.
6. Support groups: Joining support groups can provide emotional and social support for individuals with brain injuries and their families.
7. Voc


### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [102]:
query = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
prompt_query = build_prompt(query)
print(response(prompt_query, max_tokens= 1000, temperature=0.7, top_k=20))

Llama.generate: prefix-match hit




Answer: 1. Secure the area: Apply a splint or immobilize the fractured leg using a well-padded sling to prevent movement and further injury.
2. Assess vital signs: Check the person's breathing, pulse, blood pressure, and level of consciousness. Monitor for signs of shock such as weak pulse, rapid heartbeat, or clammy skin.
3. Administer pain relief: Provide appropriate pain relief using over-the-counter medication like acetaminophen or prescription painkillers if necessary.
4. Seek medical help: Encourage the person to seek medical attention as soon as possible to evaluate the extent of the injury and receive proper treatment, which may include surgery.
5. Provide hydration and nutrition: Ensure the person stays hydrated by drinking plenty of fluids. If they are unable to eat solid foods, provide nutritional supplements or a liquid diet.
6. Encourage movement: Once medical care has been received and the fracture is stabilized, encourage gentle exercises to maintain muscle strength an

## Data Preparation for RAG

### Loading the Data

In [26]:
pdf_file_path = "/content/medical_diagnosis_manual.pdf"
pdf_loader = PyMuPDFLoader(pdf_file_path)

In [27]:
medical_diag_manual = pdf_loader.load()

### Data Overview

#### Checking the first 5 pages

In [28]:
for i in range(5):
    print(f"Page Number : {i+1}",end="\n")
    print(medical_diag_manual[i].page_content,end="\n")

Page Number : 1
toktokq@gmail.com
F1XT27MIPG
meant for personal use by toktokq@gma
shing the contents in part or full is liable 

Page Number : 2
toktokq@gmail.com
F1XT27MIPG
This file is meant for personal use by toktokq@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.

Page Number : 3
Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    ...........................................................................................................................................................................................
53
1 - Nutr

#### Checking the number of pages

In [30]:
len(medical_diag_manual)

4114

### Data Chunking

In [31]:
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=512,
    chunk_overlap= 20
)

In [32]:
document_chunks = pdf_loader.load_and_split(text_splitter)

In [33]:
len(document_chunks)

8453

In [36]:
document_chunks[0].page_content

'toktokq@gmail.com\nF1XT27MIPG\nmeant for personal use by toktokq@gma\nshing the contents in part or full is liable'

In [34]:
document_chunks[-2].page_content

'Y\nYaws 1266-1267\nforest 1379\nY chromosome 3373 (see also Genetic)\nabnormalities of 3005\nYeast infection (see also Fungal infection)\nvaginal 2542, 2544, 2545\nYellow fever 1400, 1429, 1437\nhepatic inflammation in 248\nvaccine against 1172, 1437, 3441\nYellow nail syndrome 732, 1995\npleural effusion in 1997\nYellow skin (see Jaundice)\nYersinia infection 1167, 1256-1257\nY. enterocolitica infection 147\nY. pestis infection 1924\nYew poisoning 3338\nYips 1762\nYo, antibodies to 1056\nYolk sac tumor 2476\nThe Merck Manual of Diagnosis & Therapy, 19th Edition\nY\n4103\ntoktokq@gmail.com\nF1XT27MIPG\nThis file is meant for personal use by toktokq@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action.'

In [35]:
document_chunks[-1].page_content

"Z\nZafirlukast 1879\nZalcitabine 1451\nin children 2854\nZaleplon 1709\nZanamivir 1407\nin influenza 1407, 1929\nZAP-70 (zeta-associated protein 70) deficiency 1092, 1108\nZavanelli maneuver 2680\nZellweger syndrome 2383, 3023\nZenker's diverticulum 125\nZidovudine 1451, 1453\nin children 2854\nZileuton 1881\nin asthma 1880\nZinc 49, 55, 3431-3432\nin common cold 1405\ndeficiency of 11, 49, 55\nin dermatophytoses 705\npoisoning with 3328, 3353\nrecommended dietary allowances for 50\nreference values for 3499\ntoxicity of 49, 55\ncopper deficiency and 49\nin Wilson's disease 52\nZinc oxide 2233\ngelatin formulation of 646, 672\nZinc pyrithione 647\nZinc shakes 55\nZipper injury 3239, 3240\nZiprasidone\nin agitation 1492\nin bipolar disorder 3059\npoisoning with 3347\nin schizophrenia 1566\nZoledronate 359, 361, 848\nZollinger-Ellison syndrome 95, 199, 200-201, 910\nmastocytosis vs 1125\nMenetrier's disease vs 132\npeptic ulcer disease vs 134\nZolmitriptan 1721\nZolpidem 1709, 3103\nZon

### Embedding

In [37]:
embedding_model = SentenceTransformerEmbeddings(model_name='thenlper/gte-large')

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

modules.json:   0%|          | 0.00/385 [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/57.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/619 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/670M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/342 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/125 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/191 [00:00<?, ?B/s]

In [38]:
embedding_1 = embedding_model.embed_query(document_chunks[0].page_content)
embedding_2 = embedding_model.embed_query(document_chunks[1].page_content)

In [44]:
print("Dimension of the embedding vector ",len(embedding_1))
len(embedding_1) == len(embedding_2)

Dimension of the embedding vector  1024


True

In [42]:
embedding_l_1 = embedding_model.embed_query(document_chunks[-1].page_content)
embedding_l_2 = embedding_model.embed_query(document_chunks[-2].page_content)

In [45]:
print("Dimension of the embedding vector ",len(embedding_l_1))
len(embedding_l_1) == len(embedding_l_2)

Dimension of the embedding vector  1024


True

* The embedding model provides a fixed-length vector for any number of chunks.
* This is necessary because we want to compare them for similarity.

### Vector Database

In [59]:
out_dir = '/content/medical_db1'

if not os.path.exists(out_dir):
  os.makedirs(out_dir)

In [61]:
os.environ["CHROMA_TELEMETRY_ENABLED"] = "false"

vectorstore = Chroma.from_documents(
    document_chunks,
    embedding_model,
    persist_directory=out_dir
)

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given


In [65]:
vectorstore = Chroma(persist_directory=out_dir,embedding_function=embedding_model)

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given


In [67]:
vectorstore.embeddings

HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
  (2): Normalize()
), model_name='thenlper/gte-large', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False)

In [69]:
os.environ["CHROMA_TELEMETRY_ENABLED"] = "false"
vectorstore.similarity_search("What are the symptoms of sepsis?" )

[Document(page_content='overwhelming, severe infections, profound neutropenia is often a poor prognostic sign. Characteristic\nmorphologic changes in the neutrophils of septic patients include Dohle bodies, toxic granulations, and\nvacuolization.\nAnemia can develop despite adequate tissue iron stores. If anemia is chronic, plasma iron and total iron-\nbinding capacity may be decreased. Serious infection, particularly with gram-negative organisms, may\ncause disseminated intravascular coagulation (DIC—see p. 976).\nOther organ systems: Pulmonary compliance may decrease, progressing to acute respiratory distress\nsyndrome (ARDS) and respiratory muscle failure.\nRenal manifestations range from minimal proteinuria to acute renal failure, which can result from shock\nand acute tubular necrosis, glomerulonephritis, or tubulointerstitial disease.\nHepatic dysfunction, including cholestatic jaundice (often a poor prognostic sign) or hepatocellular\ndysfunction, occurs with many infections, ev

* The word 'sepsis' can be found in the page_content of the received message.

### Retriever

In [70]:
retriever = vectorstore.as_retriever(
    search_type='similarity',
    search_kwargs={'k': 2}
)

In [71]:
rel_docs = retriever.get_relevant_documents("What is the protocol for managing sepsis in a critical care unit?")
rel_docs

[Document(page_content="16 - Critical Care Medicine\nChapter 222. Approach to the Critically Ill Patient\nIntroduction\nCritical care medicine specializes in caring for the most seriously ill patients. These patients are best\ntreated in an ICU staffed by experienced personnel. Some hospitals maintain separate units for special\npopulations (eg, cardiac, surgical, neurologic, pediatric, or neonatal patients). ICUs have a high\nnurse:patient ratio to provide the necessary high intensity of service, including treatment and monitoring\nof physiologic parameters.\nSupportive care for the ICU patient includes provision of adequate nutrition (see p. 21) and prevention of\ninfection, stress ulcers and gastritis (see p. 131), and pulmonary embolism (see p. 1920). Because 15 to\n25% of patients admitted to ICUs die there, physicians should know how to minimize suffering and help\ndying patients maintain dignity (see p. 3480).\nPatient Monitoring and Testing\nSome monitoring is manual (ie, by di

* We can observe that the two relevant chunks contain the answer to the query.
* If we increase the k value, there is a chance that we might find the answer in even more chunks.

### System and User Prompt Template

In [72]:
qna_system_message = """
You are a seasoned clinician with over 10 years of experience in critical care.
User input will have the context required by you to answer user questions.
This context will begin with the token: ###Context.
The context contains references to specific portions of a document relevant to the user query.

User questions will begin with the token: ###Question.

Please answer only using the context provided in the input. Do not mention anything about the context in your final answer.

If the answer is not found in the context, respond "I don't know".
"""

In [73]:
qna_user_message_template = """
###Context
Here are some documents that are relevant to the question mentioned below.
{context}

###Question
{question}
"""

### Response Function

In [74]:
def generate_rag_response(user_input,k=3,max_tokens=128,temperature=0,top_p=0.95,top_k=50):
    global qna_system_message,qna_user_message_template
    # Retrieve relevant document chunks
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input,k=k)
    context_list = [d.page_content for d in relevant_document_chunks]

    # Combine document chunks into a single context
    context_for_query = ". ".join(context_list)

    user_message = qna_user_message_template.replace('{context}', context_for_query)
    user_message = user_message.replace('{question}', user_input)

    prompt = qna_system_message + '\n' + user_message

    # Generate the response
    try:
        response = llm(
                  prompt=prompt,
                  max_tokens=max_tokens,
                  temperature=temperature,
                  top_p=top_p,
                  top_k=top_k
                  )

        # Extract and print the model's response
        response = response['choices'][0]['text'].strip()
    except Exception as e:
        response = f'Sorry, I encountered the following error: \n {e}'

    return response

## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [75]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, the protocol for managing sepsis in a critical care unit includes:
1. Administering parenteral antibiotics after taking specimens for Gram stain and culture.
2. Starting empiric therapy immediately upon suspecting sepsis.
3. Selecting antibiotics based on suspected source, clinical setting, knowledge or suspicion of causative organisms, sensitivity patterns common to that specific inpatient unit, and previous culture results.
4. Adding vancomycin if resistant staphylococci or enterococci are suspected.


* The above response shows a very clear, concise, and professional answer.

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [76]:
user_input = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, the common symptoms for appendicitis include abdominal pain, anorexia (loss of appetite), and abdominal tenderness. Appendicitis cannot be cured via medicine alone; surgery is required for treatment. The surgical procedure for treating appendicitis is appendectomy.


* The above response shows a very clear, concise, and professional answer.

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [77]:
user_input = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, the condition being described is likely Alopecia Areata. The effective treatments for this condition include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). The possible cause behind sudden patchy hair loss in this context is an autoimmune disorder.


* The above response shows a very clear, concise, and professional answer.

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [78]:
user_input = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, the recommended treatments for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function include ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure. Surgery may be needed in patients with more severe injury to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas. In the first few days after the injury, maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium are important


* It is a concise and professional response.

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [79]:
user_input = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(user_input))

Llama.generate: prefix-match hit


Based on the context provided, here is the answer:

The person with a fractured leg should follow these steps:
1. Elevate the injured limb above heart level for the first 2 days to minimize swelling.
2. Apply elastic bandages (batting and layers 2 and 4) to the injured area.
3. Immobilize the joints proximal and distal to the injury using a cast or splint.
   - If severe swelling is likely, the cast may be bivalved.
   - Keep the cast dry and never put an


* The sentence was cut off midway, so I will proceed again with some fine-tuning.

### Fine-tuning

In [85]:
user_input = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
print(generate_rag_response(user_input, max_tokens=300, top_p=0.98, top_k=20))

Llama.generate: prefix-match hit


Based on the context provided, here is the answer:

The person with a fractured leg should follow these steps:
1. Elevate the injured limb above heart level for the first 2 days to minimize swelling.
2. Apply elastic bandages (batting and layers 2 and 4) to the injured area.
3. Immobilize the joints proximal and distal to the injury using a cast or splint.
   - If severe swelling is likely, the cast may be bivalved.
   - Keep the cast dry and never put an object inside it.
   - Inspect the cast's edges and skin around it daily for redness or soreness.
   - Apply lotion to any red or sore areas and pad rough edges with soft material.
   - Seek medical care if an odor emanates from within the cast, a fever develops, or there is an infection.
4. After 48 hours, apply warmth (heating pad) for 15 to 20 minutes to relieve pain and speed healing.
5. For fractures requiring prolonged immobilization (more than 3-4 weeks), consider early mobilization to minimize contractures and muscle atrophy.


* By tuning max_tokens, top_p, and top_k, I aim to obtain more detailed and varied answers.
* The response appears complete and uninterrupted.

In [87]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
print(generate_rag_response(user_input, max_tokens=500, top_p=0.98, top_k=20))

Llama.generate: prefix-match hit


Based on the context provided, the protocol for managing sepsis in a critical care unit includes:
1. Administering parenteral antibiotics after taking specimens for Gram stain and culture.
2. Starting empiric therapy immediately upon suspecting sepsis.
3. Selecting antibiotics based on suspected source, clinical setting, knowledge or suspicion of causative organisms, sensitivity patterns common to that specific inpatient unit, and previous culture results.
4. Adding vancomycin if resistant staphylococci or enterococci are suspected.
5. Including a drug effective against anaerobes if there is an abdominal source.
6. Changing the antibiotic regimen based on culture and sensitivity results.
7. Continuing antibiotics for at least 5 days after shock resolves and evidence of infection subsides.
8. Draining abscesses and excising necrotic tissues to eliminate septic foci.
9. Normalizing blood glucose levels between 80 to 110 mg/dL (4.4 to 6.1 mmol/L) using a continuous IV insulin infusion.


* It provides a more detailed response than the one shown for the first question above.

## Output Evaluation

Let us now use the LLM-as-a-judge method to check the quality of the RAG system on two parameters - retrieval and generation. We illustrate this evaluation based on the answeres generated to the question from the previous section.

- We are using the same Mistral model for evaluation, so basically here the llm is rating itself on how well he has performed in the task.

In [112]:
groundedness_rater_system_message  = """
You are tasked with rating AI generated answers to questions posed by users.
You will be presented a question, context used by the AI system to generate the answer and an AI generated answer to the question.
In the input, the question will begin with ###Question, the context will begin with ###Context while the AI generated answer will begin with ###Answer.

Evaluation criteria:
The task is to judge the extent to which the metric is followed by the answer.
1 - The metric is not followed at all
2 - The metric is followed only to a limited extent
3 - The metric is followed to a good extent
4 - The metric is followed mostly
5 - The metric is followed completely

Metric:
The answer should be derived only from the information presented in the context

Instructions:
1. First write down the steps that are needed to evaluate the answer as per the metric.
2. Give a step-by-step explanation if the answer adheres to the metric considering the question and context as the input.
3. Next, evaluate the extent to which the metric is followed.
4. Use the previous information to rate the answer using the evaluaton criteria and assign a score.
"""

In [113]:
relevance_rater_system_message = """
You are tasked with rating AI generated answers to questions posed by users.
You will be presented a question, context used by the AI system to generate the answer and an AI generated answer to the question.
In the input, the question will begin with ###Question, the context will begin with ###Context while the AI generated answer will begin with ###Answer.

Evaluation criteria:
The task is to judge the extent to which the metric is followed by the answer.
1 - The metric is not followed at all
2 - The metric is followed only to a limited extent
3 - The metric is followed to a good extent
4 - The metric is followed mostly
5 - The metric is followed completely

Metric:
Relevance measures how well the answer addresses the main aspects of the question, based on the context.
Consider whether all and only the important aspects are contained in the answer when evaluating relevance.

Instructions:
1. First write down the steps that are needed to evaluate the context as per the metric.
2. Give a step-by-step explanation if the context adheres to the metric considering the question as the input.
3. Next, evaluate the extent to which the metric is followed.
4. Use the previous information to rate the context using the evaluaton criteria and assign a score.
"""

In [114]:
user_message_template = """
###Question
{question}

###Context
{context}

###Answer
{answer}
"""

In [115]:
def generate_ground_relevance_response(user_input,k=3,max_tokens=128,temperature=0,top_p=0.95,top_k=50):
    global qna_system_message,qna_user_message_template
    # Retrieve relevant document chunks
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input,k=3)
    context_list = [d.page_content for d in relevant_document_chunks]
    context_for_query = ". ".join(context_list)

    # Combine user_prompt and system_message to create the prompt
    prompt = f"""[INST]{qna_system_message}\n
                {'user'}: {qna_user_message_template.format(context=context_for_query, question=user_input)}
                [/INST]"""

    response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    answer =  response["choices"][0]["text"]

    # Combine user_prompt and system_message to create the prompt
    groundedness_prompt = f"""[INST]{groundedness_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    # Combine user_prompt and system_message to create the prompt
    relevance_prompt = f"""[INST]{relevance_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    response_1 = llm(
            prompt=groundedness_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    response_2 = llm(
            prompt=relevance_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    return response_1['choices'][0]['text'],response_2['choices'][0]['text']

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [117]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information related to managing sepsis in a critical care unit from the context.
2. Compare the identified information with the AI generated answer to check if the answer is derived only from the context.
3. Evaluate the extent to which the metric is followed.

Explanation:
The context provides detailed information about managing critically ill patients in an ICU, including supportive care and patient monitoring. Among various aspects of critical care, the context specifically mentions sepsis management with antibiotics, draining abscesses, and normalizing blood glucose. The AI generated answer includes these exact steps for managing sepsis in a critical care unit.

Since the AI generated answer is derived directly from the information provided in the context, it follows the metric completely.

Rating:
Based on the evaluation criteria, the answer receives a score of 5 as it follows the metric completely.

 Steps to evaluate the context

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [118]:
user_input = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information related to the question in the context.
2. Determine if the symptoms mentioned in the question are present in the context.
3. Check if the context provides any information about curing appendicitis with medicine.
4. Verify if the surgery mentioned in the answer is consistent with the context.

Explanation:
The answer adheres to the metric as it derives the common symptoms for appendicitis directly from the context and states that appendicitis cannot be cured via medicine based on the information provided in the context. The surgery mentioned in the answer, open or laparoscopic appendectomy, is also consistent with the surgical treatment mentioned in the context.

Evaluation:
The metric is followed mostly as the symptoms are directly derived from the context and the statement about curing appendicitis with medicine is based on accurate information from the context.

Rating:
Based on the evaluation criteria, I would rate this

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [119]:
user_input = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the main topic of the question and context.
2. Understand the specific information being asked for in the question.
3. Determine if the AI generated answer is derived only from the context provided.
4. Check if any additional or irrelevant information is included in the answer.

Explanation:
The main topic of the question is about effective treatments and possible causes for sudden patchy hair loss. The context provides detailed information about various types of hair loss, including alopecia areata, and the different treatment options available for each type.

The AI generated answer correctly identifies the condition being discussed as alopecia areata and lists several treatment options for it. These treatment options are directly taken from the context, making it clear that the answer is derived only from the information presented in the context. There is no additional or irrelevant information included in the answer.

Evaluation:
Based on 

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [120]:
user_input = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information in the context related to treatments for TBI.
2. Determine if the AI generated answer is derived solely from the information presented in the context.

Explanation:
The AI generated answer is a direct summary of the information provided in the context regarding the treatments for TBI. The answer mentions that there is no specific treatment for TBI, but supportive care should include preventing systemic complications, providing good nutrition, and preventing pressure ulcers. This information is explicitly stated in the context.

Evaluation:
The metric is followed completely as the AI generated answer is derived solely from the information presented in the context.

Rating:
Based on the evaluation criteria, I would rate this answer a 5 for following the metric completely.

 Steps to evaluate the context as per the relevance metric:
1. Identify the main aspects of the question: The question asks about treatments for a person w

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [121]:
user_input = "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=500)

print(ground,end="\n\n")
print(rel)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the information in the context related to precautions and treatment for a person with a fractured leg.
2. Compare the information in the context to the information in the AI generated answer.
3. Determine if the AI generated answer is derived only from the information presented in the context.

Explanation:
The AI generated answer includes all the necessary precautions and treatment steps for a person with a fractured leg as mentioned in the context. The answer also includes additional information about complications, symptoms, diagnosis, and treatment which are also discussed in the context. However, the answer does not include any new or incorrect information.

Therefore, the metric is followed mostly because the AI generated answer is derived primarily from the information presented in the context with some additional relevant information that is also discussed in the context.

Rating:
Based on the evaluation criteria and the extent to whic

## Actionable Insights and Business Recommendations

1. The vector database creation time took nearly over 10 minutes. This was because the size of the PDF file was large.

2. For concise answers, such as symptoms of a certain disease, I found that even when increasing the number of max_tokens, there was a limit to the detailed explanation, and the answers were almost the same as previous ones. This is probably because only the context defined is referenced.

3. For parameters like temperature, I mostly set it to 0 to encourage answers based strictly on the context.
It's a parameter used to obtain creative or more diverse responses.

4. When the answers were cut off in the middle, I adjusted the max_tokens parameter to prevent any truncation.

5. I discovered that the quality of responses varies significantly depending on how the questions are phrased through prompt engineering.

<font size=6 color='blue'>Power Ahead</font>
___