## Problem Statement

### Business Context

The healthcare industry is rapidly evolving, with professionals facing increasing challenges in managing vast volumes of medical data while delivering accurate and timely diagnoses. The need for quick access to comprehensive, reliable, and up-to-date medical knowledge is critical for improving patient outcomes and ensuring informed decision-making in a fast-paced environment.

Healthcare professionals often encounter information overload, struggling to sift through extensive research and data to create accurate diagnoses and treatment plans. This challenge is amplified by the need for efficiency, particularly in emergencies, where time-sensitive decisions are vital. Furthermore, access to trusted, current medical information from renowned manuals and research papers is essential for maintaining high standards of care.

To address these challenges, healthcare centers can focus on integrating systems that streamline access to medical knowledge, provide tools to support quick decision-making, and enhance efficiency. Leveraging centralized knowledge platforms and ensuring healthcare providers have continuous access to reliable resources can significantly improve patient care and operational effectiveness.

**Common Questions to Answer**

**1. Diagnostic Assistance**: "What are the common symptoms and treatments for pulmonary embolism?"

**2. Drug Information**: "Can you provide the trade names of medications used for treating hypertension?"

**3. Treatment Plans**: "What are the first-line options and alternatives for managing rheumatoid arthritis?"

**4. Specialty Knowledge**: "What are the diagnostic steps for suspected endocrine disorders?"

**5. Critical Care Protocols**: "What is the protocol for managing sepsis in a critical care unit?"

### Objective

As an AI specialist, your task is to develop a RAG-based AI solution using renowned medical manuals to address healthcare challenges. The objective is to **understand** issues like information overload, **apply** AI techniques to streamline decision-making, **analyze** its impact on diagnostics and patient outcomes, **evaluate** its potential to standardize care practices, and **create** a functional prototype demonstrating its feasibility and effectiveness.

### Data Description

The **Merck Manuals** are medical references published by the American pharmaceutical company Merck & Co., that cover a wide range of medical topics, including disorders, tests, diagnoses, and drugs. The manuals have been published since 1899, when Merck & Co. was still a subsidiary of the German company Merck.

The manual is provided as a PDF with over 4,000 pages divided into 23 sections.

## Installing and Importing Necessary Libraries and Dependencies

In [7]:
# Installation of llama-cpp-python with GPU (cuBLAS) support
# Run the following line if a GPU is being used

!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q
# Installation of llama-cpp-python for CPU only
# Uncomment and run the following line instead if no GPU is being used
# !CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q


**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- Running the cell above may print a warning about package dependencies. This warning can be ignored: the pinned versions installed here are the ones needed to run the code in ***this notebook***.

In [1]:
#Install required libraries & packages
!pip install huggingface_hub==0.23.2 pandas==1.5.3 tiktoken==0.6.0 pymupdf==1.25.1 langchain==0.1.1 langchain-community==0.0.13 chromadb==0.4.22 sentence-transformers==2.3.1 numpy==1.25.2 -q


**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- Running the cell above may print a warning about package dependencies. This warning can be ignored: the pinned versions installed here are the ones needed to run the code in ***this notebook***.
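After the restart, it can be worth confirming that the pinned versions are the ones actually installed. The snippet below is a minimal sketch using only the standard library; `EXPECTED` and `check_versions` are hypothetical names introduced here, and the version numbers are copied from the two install cells above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install cells above.
EXPECTED = {
    "llama-cpp-python": "0.1.85",
    "huggingface_hub": "0.23.2",
    "pandas": "1.5.3",
    "tiktoken": "0.6.0",
    "pymupdf": "1.25.1",
    "langchain": "0.1.1",
    "langchain-community": "0.0.13",
    "chromadb": "0.4.22",
    "sentence-transformers": "2.3.1",
    "numpy": "1.25.2",
}


def check_versions(expected):
    """Return {package: (expected, installed)} for every missing or mismatched package."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


# Print one line per problem; no output means every pin matches.
for name, (wanted, found) in check_versions(EXPECTED).items():
    print(f"{name}: expected {wanted}, found {found}")
```

An empty result suggests the environment matches the pins; any printed line points to a package worth reinstalling before continuing.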

In [1]:
# Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

#Libraries for downloading and loading the llm
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Question Answering using LLM

#### Downloading and Loading the model

In [2]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"
# The model is a quantized (Q6_K) GGUF build of Mistral-7B-Instruct-v0.2, published on Hugging Face by the user TheBloke

In [3]:
from huggingface_hub import hf_hub_download

In [4]:
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)
# Download the model file from the Hugging Face Hub

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [5]:
from llama_cpp import Llama

In [6]:
llm = Llama(
    model_path=model_path,
    n_ctx=2300,
    n_gpu_layers=38,
    n_batch=512
)
# Load the model with llama-cpp-python: 2300-token context window, 38 layers offloaded to the GPU

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


#### Response

In [7]:
def response(query, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    # Send the raw query to the model and return only the generated text
    model_output = llm(
        prompt=query,
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k
    )

    return model_output['choices'][0]['text']

In [8]:
response("What is the AI and ML program from UT Austin?")
# This response comes only from the model's pretraining data. As we can see, it hallucinates details such as psychology and cognitive sciences.

'\n\nThe Artificial Intelligence (AI) and Machine Learning (ML) Program at The University of Texas at Austin offers a multidisciplinary approach to the study of AI and ML, combining computer science, mathematics, statistics, psychology, and cognitive sciences. The program provides students with a strong foundation in theoretical and practical aspects of AI and ML, preparing them for careers in industry, academia, or government.\n\nThe program offers both a Master of Science (MS) and a Doctor of Philosophy (PhD) degree in Computer Science with a specialization in AI and ML. Students can also pursue a graduate certificate'

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [9]:
response("What is the protocol for managing sepsis in a critical care unit?")
# This response comes only from the model's pretraining data. As we can see, it is generic and does not describe an actual protocol.

Llama.generate: prefix-match hit


'\n\nSepsis is a life-threatening condition that can arise from an infection, and it requires prompt recognition and aggressive management in a critical care unit. The following are general steps for managing sepsis in a critical care unit:\n\n1. Early recognition: Recognize the signs and symptoms of sepsis early and initiate treatment as soon as possible. Sepsis can present with various clinical features, including fever or hypothermia, tachycardia or bradycardia, altered mental status, respiratory distress, and lactic acidosis.\n2. Resusc'
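The answer above ends mid-word ("Resusc"), most likely because generation stopped at `max_tokens=128`. llama-cpp-python returns an OpenAI-style completion dictionary whose first choice carries a `finish_reason` field, with `"length"` indicating that the token limit was reached. The sketch below shows one way to surface that; `extract_text` is a hypothetical helper, and `example_output` is a hand-written stand-in, not real model output.

```python
def extract_text(model_output):
    """Return (text, truncated) from a llama-cpp-python completion dictionary."""
    choice = model_output["choices"][0]
    # "length" means generation stopped because max_tokens was reached.
    truncated = choice.get("finish_reason") == "length"
    return choice["text"].strip(), truncated


# Hand-written stand-in for the dictionary returned by llm(...).
example_output = {
    "choices": [
        {"text": "\n\n1. Early recognition ... 2. Resusc", "finish_reason": "length"}
    ]
}

text, truncated = extract_text(example_output)
if truncated:
    print("The answer hit the token limit; consider raising max_tokens.")
```

Checking this flag makes it easier to tell a cut-off answer apart from one the model chose to end.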

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [10]:
response("What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?")
# This response comes only from the model's pretraining data. It is a very generic description of appendicitis and says nothing about surgical procedures.

Llama.generate: prefix-match hit


'\n\nAppendicitis is a medical condition characterized by inflammation of the appendix, a small tube-shaped organ located in the lower right side of the abdomen. The symptoms of appendicitis can vary from person to person, but some common signs include:\n\n1. Abdominal pain: The pain may start as a mild discomfort around the navel or in the lower right abdomen, which then gradually moves to the right lower quadrant and becomes more severe over time. The pain may be constant or intermittent and is often worsened by movement, coughing, or deep breathing'

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [11]:
response("What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?")
# This response comes only from the model's pretraining data. Here the model does try to address the user's question.

Llama.generate: prefix-match hit


"\n\nSudden patchy hair loss, also known as alopecia areata, is a common autoimmune disorder that affects the hair follicles. It can result in round or oval bald patches on the scalp, but it can also occur on other parts of the body such as the beard area, eyebrows, or eyelashes.\n\nThe exact cause of alopecia areata is not known, but it's believed to be related to a problem with the immune system. Some possible triggers for this condition include stress, genetics, viral infections, and certain medications."

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [12]:
response("What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?")
# Likewise, this answer comes from the model alone, without any RAG involved

Llama.generate: prefix-match hit


"\n\nA person who has sustained a physical injury to brain tissue, also known as a traumatic brain injury (TBI), may require various treatments depending on the severity and location of the injury. Here are some common treatments recommended for TBIs:\n\n1. Emergency care: The first priority is to ensure the person's airway is clear, they are breathing, and their heart is beating normally. In severe cases, emergency surgery may be required to remove hematomas or other obstructions.\n2. Medications: Depending on the symptoms, medications may be prescribed to manage conditions such as"

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [13]:
response("What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?")
# A fairly naive response for someone who has fractured a leg

Llama.generate: prefix-match hit


"\n\nFirst and foremost, if you suspect that someone has fractured their leg while hiking, it's essential to ensure their safety and prevent further injury. Here are some necessary precautions:\n\n1. Keep the person calm and still: Encourage them to remain as still as possible to minimize pain and prevent worsening the injury.\n2. Assess the situation: Check for any signs of shock, such as pale skin, rapid heartbeat, or shallow breathing. If you notice these symptoms, seek medical help immediately.\n3. Immobilize the leg: Use a splint, sl"

In [14]:
qna_system_message = """You are a helpful AI assistant.

You must answer user questions using only the information provided in the context.
- If the answer is not in the context, reply with "I don't know".
- Do not mention the context or refer to it in your response.
- Keep the answer short and clear."""
# The system message tells the model to answer clearly and not to hallucinate

In [15]:
qna_user_message = """###Context
The AI and Machine Learning Program at UT Austin is an online postgraduate certificate program offered in collaboration with Great Learning. It is designed for working professionals and includes topics like deep learning, natural language processing, and machine learning operations. The program typically lasts for 6 to 9 months.

###Question
What is the duration of the AI and ML program at UT Austin?"""
# An example of what the user prompt looks like once the context is added
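The system and user messages above are plain strings that get joined into a single prompt. Mistral instruct models are trained on prompts wrapped in `[INST] ... [/INST]` tags, which this notebook does not use. The sketch below shows both variants; `build_prompt` and its `use_inst_tags` flag are hypothetical names introduced for illustration, and the sample strings are shortened placeholders.

```python
def build_prompt(system_message, user_message, use_inst_tags=False):
    """Join a system message and a user message into one prompt string."""
    prompt = system_message.strip() + "\n\n" + user_message.strip()
    if use_inst_tags:
        # Instruction wrapper described for Mistral instruct models.
        prompt = f"[INST] {prompt} [/INST]"
    return prompt


# Shortened placeholder messages, for illustration only.
sample_system = "Answer using only the information provided in the context."
sample_user = "###Context\nThe program lasts 6 to 9 months.\n\n###Question\nHow long is the program?"

print(build_prompt(sample_system, sample_user))
print(build_prompt(sample_system, sample_user, use_inst_tags=True))
```

Whether the tagged variant improves answers for this model and quantization is something to verify experimentally rather than assume.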

## Question Answering using LLM with Prompt Engineering

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [16]:
user_input = qna_system_message+"\n"+ "What is the protocol for managing sepsis in a critical care unit?"
response(user_input)
# Definitely improved by prompt engineering, but no relevant context is supplied, so the model could still hallucinate a little

Llama.generate: prefix-match hit


'\n\nSepsis management in a critical care unit typically involves:\n1. Early recognition and diagnosis.\n2. Immediate administration of antibiotics.\n3. Fluid resuscitation to maintain adequate blood pressure and organ perfusion.\n4. Oxygen therapy to maintain adequate oxygenation.\n5. Close monitoring of vital signs, laboratory values, and organ function.\n6. Adjustment of treatments based on response to therapy and clinical progression.\n7. Supportive measures such as mechanical ventilation or renal replacement therapy if necessary.'

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [17]:
user_input = qna_system_message+"\n"+ " What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
response(user_input)
# The main improvement is that the answer now fits within the token limit, and it does address whether appendicitis can be cured with medicine

Llama.generate: prefix-match hit


'\n\nAppendicitis is characterized by abdominal pain, usually located in the lower right area of the belly. Other common symptoms include loss of appetite, nausea, vomiting, fever, and a feeling of being sick or unwell. Appendicitis cannot be cured with medicine alone; surgery is required to remove the inflamed appendix. The most common surgical procedure for treating appendicitis is an appendectomy.'

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [18]:
user_input = qna_system_message+"\n"+ " What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
response(user_input)
# The same pretrained model with an improved prompt yields a better result: the answer is more concise and stays on the topic of hair loss

Llama.generate: prefix-match hit


'\n\nSudden patchy hair loss can have various causes, including stress, autoimmune conditions like alopecia areata, nutritional deficiencies, or certain medications. Effective treatments depend on the underlying cause:\n- Alopecia areata: Topical corticosteroids, immunotherapy (minoxidil), or systemic steroids may be prescribed by a healthcare professional.\n- Nutritional deficiencies: Consuming a balanced diet rich in essential nutrients like iron, zinc, and biotin can help promote hair growth.\n- Stress'

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [23]:
user_input = qna_system_message+"\n"+ "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
response(user_input)
# This answer is not accurate: it does not describe any treatment for the injury in detail, and the lifestyle modifications it lists are not relevant

Llama.generate: prefix-match hit


'\n\nThe treatment for a brain injury depends on the severity and location of the injury. Common treatments include:\n1. Medications to manage symptoms such as pain, swelling, and seizures.\n2. Rehabilitation therapy, including physical therapy, occupational therapy, speech therapy, and cognitive rehabilitation.\n3. Surgery to remove hematomas or other obstructions.\n4. Assistive devices, such as braces or wheelchairs, to help with mobility and daily activities.\n5. Lifestyle modifications, such as a healthy diet and regular exercise, to promote brain health'

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [22]:
user_input = qna_system_message+"\n"+ "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
response(user_input)
# Still hallucinating, but the response for the fractured leg is better than before

Llama.generate: prefix-match hit


"\n\nFor a fractured leg during hiking:\n1. Immobilize the leg using a splint or sling to prevent movement and further injury.\n2. Apply ice packs to reduce swelling and pain.\n3. Elevate the injured leg above heart level whenever possible.\n4. Seek medical attention as soon as possible, especially if there is severe pain, deformity, or inability to walk.\n5. Follow the doctor's instructions for proper care and recovery, which may include:\n   a. Rest and avoiding weight-bearing activities.\n   b. Cont"

## Data Preparation for RAG

In [21]:
# Libraries for processing dataframes and text
import json
import os
import tiktoken
import pandas as pd

#Libraries for Loading Data, Chunking, Embedding, and Vector Databases
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma

### Loading the Data

In [24]:
from google.colab import drive
drive.mount('/content/drive')
# Mount Google Drive

Mounted at /content/drive


In [28]:
pdf_path = "/content/drive/MyDrive/ColabNotebooks/medical_diagnosis_manual.pdf"
# Path to the PDF on Google Drive

In [26]:
from langchain_community.document_loaders import PyMuPDFLoader


In [29]:
pdf_loader = PyMuPDFLoader(pdf_path)
# Create a PyMuPDFLoader for the PDF

In [35]:
data_manual = pdf_loader.load()
# The loaded content is stored in data_manual

### Data Overview

#### Checking the first 5 pages

In [36]:
# Print the content of the first 5 pages
for i in range(5):
    print(data_manual[i].page_content, end='\n')

visucasukhela@gmail.com
WM1E5NG069
ant for personal use by visucasukhela@g
shing the contents in part or full is liable 

visucasukhela@gmail.com
WM1E5NG069
This file is meant for personal use by visucasukhela@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.

Table of Contents
1
Front    ................................................................................................................................................................................................................
1
Cover    .......................................................................................................................................................................................................
2
Front Matter    ...........................................................................................................................................................................................
53
1 - Nutritional Disorders    .......

#### Checking the number of pages

In [37]:
len(data_manual)
# The manual has 4114 pages in total

4114

### Data Chunking

In [30]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [38]:
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name='cl100k_base',
    chunk_size=512,
    chunk_overlap=20
)
# Chunk sizes are measured in cl100k_base tokens: at most 512 tokens per chunk, with a 20-token overlap
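To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter first breaks text on separators such as paragraphs and lines, then merges the pieces); `chunk_tokens` is a hypothetical helper, and a list of integers stands in for tiktoken token IDs.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


# Ten stand-in token IDs, windows of 4 tokens sharing 1 token with their neighbor.
toy_tokens = list(range(10))
toy_chunks = chunk_tokens(toy_tokens, chunk_size=4, chunk_overlap=1)
print(toy_chunks)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, provided it is shorter than the overlap. With only 20 tokens of overlap out of 512, that protection covers short sentences only.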

In [32]:
document_chunks = pdf_loader.load_and_split(text_splitter)
# Load the PDF and split it into chunks, stored in document_chunks

In [33]:
len(document_chunks)
# The 4114-page manual is now split into 8469 chunks

8469

In [None]:
document_chunks[0].page_content
#reading the first chunk content

'visucasukhela@gmail.com\nWM1E5NG069\nant for personal use by visucasukhela@g\nshing the contents in part or full is liable'

In [None]:
document_chunks[1].page_content
#reading the second chunk content

'visucasukhela@gmail.com\nWM1E5NG069\nThis file is meant for personal use by visucasukhela@gmail.com only.\nSharing or publishing the contents in part or full is liable for legal action.'

In [None]:
document_chunks[2].page_content
#reading the third chunk content

'Table of Contents\n1\nFront    ................................................................................................................................................................................................................\n1\nCover    .......................................................................................................................................................................................................\n2\nFront Matter    ...........................................................................................................................................................................................\n53\n1 - Nutritional Disorders    ...............................................................................................................................................................\n53\nChapter 1. Nutrition: General Considerations    ...........................................................................................

In [None]:
document_chunks[3].page_content
#reading the fourth chunk content

"Chapter 24. Testing for Hepatic & Biliary Disorders    ......................................................................................................\n305\nChapter 25. Drugs & the Liver    ................................................................................................................................................\n308\nChapter 26. Alcoholic Liver Disease    ....................................................................................................................................\n314\nChapter 27. Fibrosis & Cirrhosis    ............................................................................................................................................\n322\nChapter 28. Hepatitis    ..................................................................................................................................................................\n333\nChapter 29. Vascular Disorders of the Liver    .................................................

### Embedding

In [39]:
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings

In [40]:
embedding_model = SentenceTransformerEmbeddings(model_name='thenlper/gte-large')


In [41]:
embedding_1 = embedding_model.embed_query(document_chunks[0].page_content)
embedding_2 = embedding_model.embed_query(document_chunks[1].page_content)
# Embed the first two chunks using the gte-large model

In [None]:
print("Dimension of the embedding vector ",len(embedding_1))
len(embedding_1)==len(embedding_2)
# The embeddings of chunk 0 and chunk 1 have the same length

Dimension of the embedding vector  1024


True
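Two embeddings of equal length can be compared directly. The sketch below assumes the vectors are unit length (the model printout later in this notebook lists a `Normalize()` stage) and illustrates that, in that case, squared L2 distance and cosine similarity are tied by `||a - b||² = 2 - 2·cos(a, b)`, so both rank neighbors identically. The three-dimensional toy vectors and the helper names are illustrative assumptions, not values from the manual.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy stand-ins for the 1024-dimensional embeddings.
vec_a = normalize([1.0, 2.0, 3.0])
vec_b = normalize([2.0, 0.5, 1.0])

print("cosine similarity:", cosine_similarity(vec_a, vec_b))
print("squared L2 distance:", squared_l2(vec_a, vec_b))
```

This matters when reading similarity scores: a smaller distance and a larger cosine similarity both mean "closer", depending on which metric the vector store is configured to use.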

### Vector Database

In [42]:
out_dir = 'medical_db'

# Create the output directory if it does not already exist
os.makedirs(out_dir, exist_ok=True)

In [43]:
vectorstore = Chroma.from_documents(
    document_chunks,
    embedding_model,
    persist_directory=out_dir
)
# Embed the chunks and store the embeddings in the Chroma vector database

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given


In [None]:
vectorstore = Chroma(persist_directory=out_dir,embedding_function=embedding_model)

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given
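Embedding all 8469 chunks is the slowest step here, and the two cells above show both ways of obtaining the store: building it with `Chroma.from_documents` and reopening it with `Chroma(persist_directory=...)`. A small guard can choose between them. This is a sketch under the assumption that a non-empty directory means a usable index, which is not guaranteed if an earlier build was interrupted; `needs_rebuild` is a hypothetical helper.

```python
import os


def needs_rebuild(persist_dir):
    """Return True when the directory is missing or empty, i.e. nothing was persisted."""
    return not os.path.isdir(persist_dir) or not os.listdir(persist_dir)


# Intended use in this notebook, left as comments because it needs the
# objects defined in the cells above:
# if needs_rebuild(out_dir):
#     vectorstore = Chroma.from_documents(document_chunks, embedding_model, persist_directory=out_dir)
# else:
#     vectorstore = Chroma(persist_directory=out_dir, embedding_function=embedding_model)
print(needs_rebuild("medical_db"))
```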


In [None]:
vectorstore.embeddings

HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
  (2): Normalize()
), model_name='thenlper/gte-large', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False)

In [44]:
vectorstore.similarity_search("Appendicitis surgery required",k=3)
# The similarity search returns the top 3 chunks for the query

ERROR:chromadb.telemetry.product.posthog:Failed to send telemetry event CollectionQueryEvent: capture() takes 1 positional argument but 3 were given


[Document(page_content="• Surgical removal\n• IV fluids and antibiotics\nTreatment of acute appendicitis is open or laparoscopic appendectomy; because treatment delay\nincreases mortality, a negative appendectomy rate of 15% is considered acceptable. The surgeon can\nusually remove the appendix even if perforated. Occasionally, the appendix is difficult to locate: In these\ncases, it usually lies behind the cecum or the ileum and mesentery of the right colon. A contraindication to\nappendectomy is inflammatory bowel disease involving the cecum. However, in cases of terminal ileitis\nand a normal cecum, the appendix should be removed.\nAppendectomy should be preceded by IV antibiotics. Third-generation cephalosporins are preferred. For\nnonperforated appendicitis, no further antibiotics are required. If the appendix is perforated, antibiotics\nshould be continued until the patient's temperature and WBC count have normalized or continued for a\nfixed course, according to the surgeon's pr

### Retriever

In [46]:
retriever = vectorstore.as_retriever(
    search_type='similarity',
    search_kwargs={'k': 2}
)
# The retriever searches the vector store and returns the most relevant chunks, which are passed to the LLM as context
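Conceptually, a similarity retriever scores every stored chunk against the query embedding and keeps the `k` best. The brute-force sketch below illustrates that idea only: Chroma uses an approximate index rather than a full scan, and `top_k`, the two-dimensional vectors, and the chunk labels are all made up for the example.

```python
def top_k(query_vec, items, k=2):
    """Return the texts of the k items whose vectors have the highest dot product with the query.

    items is a list of (text, vector) pairs.
    """
    def score(item):
        return sum(q * x for q, x in zip(query_vec, item[1]))

    ranked = sorted(items, key=score, reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up chunks with toy two-dimensional embeddings.
toy_store = [
    ("appendectomy procedure", [1.0, 0.0]),
    ("hair loss causes", [0.0, 1.0]),
    ("appendicitis antibiotics", [0.9, 0.1]),
]

print(top_k([1.0, 0.0], toy_store, k=2))
```

A larger `k` gives the model more material to draw on but also consumes more of the context window.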

In [47]:
rel_docs = retriever.get_relevant_documents("How does the appendicitis surgery happen?")
rel_docs
# The retrieved chunks clearly cover appendicitis

[Document(page_content="• Surgical removal\n• IV fluids and antibiotics\nTreatment of acute appendicitis is open or laparoscopic appendectomy; because treatment delay\nincreases mortality, a negative appendectomy rate of 15% is considered acceptable. The surgeon can\nusually remove the appendix even if perforated. Occasionally, the appendix is difficult to locate: In these\ncases, it usually lies behind the cecum or the ileum and mesentery of the right colon. A contraindication to\nappendectomy is inflammatory bowel disease involving the cecum. However, in cases of terminal ileitis\nand a normal cecum, the appendix should be removed.\nAppendectomy should be preceded by IV antibiotics. Third-generation cephalosporins are preferred. For\nnonperforated appendicitis, no further antibiotics are required. If the appendix is perforated, antibiotics\nshould be continued until the patient's temperature and WBC count have normalized or continued for a\nfixed course, according to the surgeon's pr

### System and User Prompt Template

In [48]:
qna_system_message = """
You are an assistant whose work is to review the report and provide the appropriate answers from the context.
User input will have the context required by you to answer user questions.
This context will begin with the token: ###Context.
The context contains references to specific portions of a document relevant to the user query.

User questions will begin with the token: ###Question.

Please answer only using the context provided in the input. Do not mention anything about the context in your final answer.

If the answer is not found in the context, respond "I don't know".
"""

In [49]:
qna_user_message_template = """
###Context
Here are some documents that are relevant to the question mentioned below.
{context}

###Question
{question}
"""
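
As a quick illustration of how the two placeholders are filled, the sketch below renders the user template with made-up chunks and a made-up question. The `build_user_message` helper and the placeholder strings are assumptions for this example; the chunks are joined with `". "` to mirror the response function in the next section.

```python
qna_user_message_template = """
###Context
Here are some documents that are relevant to the question mentioned below.
{context}

###Question
{question}
"""

def build_user_message(template, chunks, question):
    # Join the retrieved chunks into one context string, then fill both placeholders.
    context = ". ".join(chunks)
    return template.format(context=context, question=question)

# Placeholder inputs, not taken from the medical manual.
example_chunks = ["Placeholder chunk A", "Placeholder chunk B"]
message = build_user_message(qna_user_message_template, example_chunks, "Placeholder question?")
print(message)
```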

### Response Function

In [50]:
def generate_rag_response(user_input, k=3, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    # Retrieve relevant document chunks.
    # Note: the retriever was built with search_kwargs={'k': 2}; whether the per-call
    # `k` below overrides that setting depends on the installed LangChain version.
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input, k=k)
    context_list = [d.page_content for d in relevant_document_chunks]

    # Combine document chunks into a single context
    context_for_query = ". ".join(context_list)

    user_message = qna_user_message_template.replace('{context}', context_for_query)
    user_message = user_message.replace('{question}', user_input)

    prompt = qna_system_message + '\n' + user_message

    # Generate the response
    try:
        response = llm(
                  prompt=prompt,
                  max_tokens=max_tokens,
                  temperature=temperature,
                  top_p=top_p,
                  top_k=top_k
                  )

        # Extract the generated text from the model's response
        response = response['choices'][0]['text'].strip()
    except Exception as e:
        response = f'Sorry, I encountered the following error: \n {e}'

    return response
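
The function above depends on two external objects, `retriever` and `llm`. The sketch below swaps both for hypothetical stand-ins (`fake_retrieve`, `fake_llm`, and `broken_llm` are invented here and belong to no library) so the control flow can be run without a model: retrieve, assemble the prompt, call the model, read `response['choices'][0]['text']`, and fall back to an error message on failure. The response dictionary shape is taken from the function above; everything else is an assumption.

```python
def fake_retrieve(query, k=3):
    # Hypothetical stand-in for the retriever: returns k placeholder chunks.
    return [f"placeholder chunk {i} for '{query}'" for i in range(k)]

def fake_llm(prompt, max_tokens=128):
    # Hypothetical stand-in for the model; mimics the response shape used above.
    return {"choices": [{"text": f"  (stub answer, prompt had {len(prompt)} characters)  "}]}

def broken_llm(prompt, max_tokens=128):
    # Simulates a failing model call to exercise the error path.
    raise RuntimeError("model not loaded")

def rag_answer(question, retrieve, llm, k=3, max_tokens=128):
    chunks = retrieve(question, k=k)
    context = ". ".join(chunks)
    prompt = f"###Context\n{context}\n\n###Question\n{question}\n"
    try:
        response = llm(prompt=prompt, max_tokens=max_tokens)
        return response["choices"][0]["text"].strip()
    except Exception as e:
        return f"Sorry, I encountered the following error: \n {e}"

print(rag_answer("What is sepsis?", fake_retrieve, fake_llm))
print(rag_answer("What is sepsis?", fake_retrieve, broken_llm))
```

Passing the retriever and model in as arguments, rather than reading them from the enclosing scope, is a design choice of this sketch that makes the flow easy to test in isolation.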

## Question Answering using RAG

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [51]:
user_input = "What is the protocol for managing sepsis in a critical care unit?"
generate_rag_response(user_input)
# Concise and on-topic, without the extra information the pre-trained model added earlier; the answer is cut off at the 128-token limit

Llama.generate: prefix-match hit


'Based on the context provided, the protocol for managing sepsis in a critical care unit includes:\n1. Administering parenteral antibiotics after taking specimens for Gram stain and culture.\n2. Starting very prompt empiric therapy as soon as sepsis is suspected.\n3. Selecting an antibiotic regimen based on the suspected source, clinical setting, knowledge or suspicion of causative organisms and sensitivity patterns common to that specific inpatient unit, and previous culture results.\n4. Adding vancomycin if resistant staphylococci or enter'

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [52]:
user_input = "What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
generate_rag_response(user_input)
# Retrieved the relevant passage on symptoms; the answer is complete and ends before the 128-token limit is reached

Llama.generate: prefix-match hit


'###Answer\nThe common symptoms for appendicitis include abdominal pain, anorexia, and abdominal tenderness. Appendicitis cannot be cured via medicine alone; surgery, specifically a surgical removal of the appendix (appendectomy), is required for treatment.'

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [53]:
user_input = "What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
generate_rag_response(user_input)
# The response is drawn straight from the manual: the retrieved context gave the LLM accurate treatment details for hair loss

Llama.generate: prefix-match hit


'Based on the context provided, the condition being described is Alopecia Areata. The effective treatments for this condition include topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), or psoralen plus ultraviolet A (PUVA). The possible cause behind sudden patchy hair loss in this context is an autoimmune disorder.'

### Query 4:  What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [54]:
user_input = "What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
generate_rag_response(user_input)
# The answer used up the full max_tokens budget and is cut off, but it still covers brain-injury treatment in detail

Llama.generate: prefix-match hit


'Based on the context provided, the recommended treatments for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function include:\n1. Ensuring a reliable airway and maintaining adequate ventilation, oxygenation, and blood pressure.\n2. Surgery to place monitors to track and treat intracranial pressure, decompress the brain if intracranial pressure is increased, or remove intracranial hematomas.\n3. Maintaining adequate brain perfusion and oxygenation and preventing complications of altered sensorium in the first few days after'

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [55]:
user_input= "What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
generate_rag_response(user_input)
# Clearly describes the treatment steps for a fracture injury

Llama.generate: prefix-match hit


'Based on the context provided, here is the answer:\n\nThe person with a fractured leg should elevate the injured limb above heart level for the first 2 days to minimize swelling. After 48 hours, they can apply warmth using a heating pad for 15 to 20 minutes to relieve pain and speed healing. Immobilization is necessary to prevent further injury and facilitate healing. Joints proximal and distal to the injury should be immobilized using either a cast or a splint. A cast is usually used for fractures that require weeks of immobilization,'

## Output Evaluation

Let us now use the LLM-as-a-judge method to check the quality of the RAG system on two parameters: retrieval and generation. We illustrate this evaluation using the answers generated for the questions in the previous section.

- We are using the same Mistral model for evaluation, so the LLM is effectively rating its own performance on the task.

In [56]:
groundedness_rater_system_message  = """
You are tasked with rating AI generated answers to questions posed by users.
You will be presented a question, context used by the AI system to generate the answer and an AI generated answer to the question.
In the input, the question will begin with ###Question, the context will begin with ###Context while the AI generated answer will begin with ###Answer.

Evaluation criteria:
The task is to judge the extent to which the metric is followed by the answer.
1 - The metric is not followed at all
2 - The metric is followed only to a limited extent
3 - The metric is followed to a good extent
4 - The metric is followed mostly
5 - The metric is followed completely

Metric:
The answer should be derived only from the information presented in the context

Instructions:
1. First write down the steps that are needed to evaluate the answer as per the metric.
2. Give a step-by-step explanation if the answer adheres to the metric considering the question and context as the input.
3. Next, evaluate the extent to which the metric is followed.
4. Use the previous information to rate the answer using the evaluation criteria and assign a score.
"""

In [57]:
relevance_rater_system_message = """
You are tasked with rating AI generated answers to questions posed by users.
You will be presented a question, context used by the AI system to generate the answer and an AI generated answer to the question.
In the input, the question will begin with ###Question, the context will begin with ###Context while the AI generated answer will begin with ###Answer.

Evaluation criteria:
The task is to judge the extent to which the metric is followed by the answer.
1 - The metric is not followed at all
2 - The metric is followed only to a limited extent
3 - The metric is followed to a good extent
4 - The metric is followed mostly
5 - The metric is followed completely

Metric:
Relevance measures how well the answer addresses the main aspects of the question, based on the context.
Consider whether all and only the important aspects are contained in the answer when evaluating relevance.

Instructions:
1. First write down the steps that are needed to evaluate the context as per the metric.
2. Give a step-by-step explanation if the context adheres to the metric considering the question as the input.
3. Next, evaluate the extent to which the metric is followed.
4. Use the previous information to rate the context using the evaluation criteria and assign a score.
"""

In [60]:
user_message_template = """
###Question
{question}

###Context
{context}

###Answer
{answer}
"""

In [58]:
def generate_ground_relevance_response(user_input, k=3, max_tokens=128, temperature=0, top_p=0.95, top_k=50):
    # Retrieve the k most relevant document chunks
    relevant_document_chunks = retriever.get_relevant_documents(query=user_input, k=k)
    context_list = [d.page_content for d in relevant_document_chunks]
    context_for_query = ". ".join(context_list)

    # Combine user_prompt and system_message to create the prompt
    prompt = f"""[INST]{qna_system_message}\n
                {'user'}: {qna_user_message_template.format(context=context_for_query, question=user_input)}
                [/INST]"""

    response = llm(
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    answer =  response["choices"][0]["text"]

    # Combine user_prompt and system_message to create the prompt
    groundedness_prompt = f"""[INST]{groundedness_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    # Combine user_prompt and system_message to create the prompt
    relevance_prompt = f"""[INST]{relevance_rater_system_message}\n
                {'user'}: {user_message_template.format(context=context_for_query, question=user_input, answer=answer)}
                [/INST]"""

    response_1 = llm(
            prompt=groundedness_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    response_2 = llm(
            prompt=relevance_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            top_k=top_k,
            stop=['INST'],
            )

    return response_1['choices'][0]['text'],response_2['choices'][0]['text']
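
The judge returns free text, so comparing queries requires pulling the numeric score out of it. Below is a heuristic sketch, not part of the notebook: the `extract_rating` helper and its regular expression are assumptions, and it simply looks for the first digit from 1 to 5 that follows the word "rate", "rating", or "score". When the judge output is cut off before a rating appears, it returns `None`.

```python
import re

# Whole-word "rate", "rating", or "score", followed by the nearest standalone digit 1-5.
_RATING_PATTERN = re.compile(r"\b(?:rat(?:e|ing)|score)\b[^0-9]*([1-5])\b", re.IGNORECASE)

def extract_rating(judge_text):
    """Return the first 1-5 score found after a rating keyword, or None if absent."""
    match = _RATING_PATTERN.search(judge_text)
    return int(match.group(1)) if match else None

complete_output = (
    "Rating:\nBased on the evaluation criteria, I would rate the answer "
    "as a 5 (The metric is followed completely)."
)
truncated_output = "Evaluation:\nThe metric is followed"

print(extract_rating(complete_output))
print(extract_rating(truncated_output))
```

A `None` result is a hint that `max_tokens` was too small for the judge to finish its reasoning and reach the rating step.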

### Query 1: What is the protocol for managing sepsis in a critical care unit?

In [None]:
user_input="What is the protocol for managing sepsis in a critical care unit?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=350)

print(ground,end="\n\n")
print(rel)
# Rated 5 for grounding in the retrieved context and 4 for the response under the evaluation criteria, so the RAG answer is based on the context

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information related to managing sepsis in a critical care unit from the context.
2. Compare the identified information with the AI generated answer to check if the answer is derived only from the context.
3. Evaluate the extent to which the metric is followed.

Explanation:
The context provides detailed information about managing critically ill patients in an ICU, including supportive care and patient monitoring. Among these details, there are specific instructions for managing sepsis, such as taking specimens for Gram stain and culture before administering parenteral antibiotics, starting empiric therapy immediately, and adjusting the antibiotic regimen based on culture and sensitivity results.

The AI generated answer includes all of these steps, verbatim from the context, with some minor rephrasings for clarity. Therefore, the answer is derived solely from the information presented in the context.

Evaluation:
The metric is followed

### Query 2: What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?

In [None]:
user_input="What are the common symptoms for appendicitis, and can it be cured via medicine? If not, what surgical procedure should be followed to treat it?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=350)

print(ground,end="\n\n")
print(rel)
# Same as above (5 and 4): all the requested information about appendicitis is provided

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information in the context related to appendicitis and its treatment.
2. Determine if the symptoms mentioned in the question are present in the context.
3. Check if the answer mentions only the symptoms and treatment options derived from the context.
4. Verify that the answer does not include any additional or incorrect information.

Explanation:
The answer adheres to the metric as it mentions the common symptoms for appendicitis (abdominal pain, anorexia, and abdominal tenderness) and states that it cannot be cured via medicine alone but requires surgical removal of the appendix. Both of these pieces of information are directly derived from the context.

Evaluation:
The metric is followed completely.

Rating:
Based on the evaluation criteria, I would rate the answer as a 5 (The metric is followed completely).

 Steps to evaluate the context as per the relevance metric:
1. Identify the main aspects of the question: common symptoms for 

### Query 3: What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?

In [61]:
user_input="What are the effective treatments or solutions for addressing sudden patchy hair loss, commonly seen as localized bald spots on the scalp, and what could be the possible causes behind it?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=350)

print(ground,end="\n\n")
print(rel)
# The hair-loss treatments were retrieved and generated in line with the user input, hence the rating of 4

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the main question and the specific information being asked for in the question.
2. Read through the context provided to understand the key points related to the topic of sudden patchy hair loss, including possible causes and effective treatments.
3. Determine if the AI generated answer is derived only from the information presented in the context.

Explanation:
The AI generated answer adheres to the metric as it mentions the specific treatments for addressing sudden patchy hair loss (alopecia areata) that were discussed in the context, including topical, intralesional, or systemic corticosteroids, topical minoxidil, topical anthralin, topical immunotherapy (diphencyprone or squaric acid dibutylester), and psoralen plus ultraviolet A (PUVA). The possible causes mentioned in the answer are also consistent with what was stated in the context as an autoimmune disorder affecting genetically susceptible people exposed to unclear environmental trigge

### Query 4: What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?

In [None]:
user_input="What treatments are recommended for a person who has sustained a physical injury to brain tissue, resulting in temporary or permanent impairment of brain function?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=350)

print(ground,end="\n\n")
print(rel)
# It provided additional relevant information about brain injuries along with the treatments

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the key information in the context related to treatments for traumatic brain injuries (TBI).
2. Compare the identified information with the AI generated answer to check if the answer is derived only from the context.
3. Evaluate the extent to which the metric is followed.

Explanation:
The context provides information about the initial treatment and subsequent care for TBI patients, including supportive care and potential surgical interventions. The AI generated answer acknowledges that there is no specific treatment for TBI but emphasizes the importance of supportive care, which aligns with the information in the context. Therefore, the answer is derived from the context.

Evaluation:
The metric is followed completely as the answer is based solely on the information provided in the context.

Rating:
Based on the evaluation criteria, I would rate the answer a 5 for following the metric completely.

 Steps to evaluate context as per relevance m

### Query 5: What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?

In [None]:
user_input="What are the necessary precautions and treatment steps for a person who has fractured their leg during a hiking trip, and what should be considered for their care and recovery?"
ground,rel = generate_ground_relevance_response(user_input,max_tokens=350)

print(ground,end="\n\n")
print(rel)
# The response to this one is weaker: it added information about other injuries, which is out of context

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


 Steps to evaluate the answer:
1. Identify the information in the context related to precautions and treatment for a fractured leg.
2. Compare each point in the AI generated answer with the corresponding information in the context.
3. Determine if the AI generated answer is derived only from the information presented in the context.

Explanation:
The AI generated answer includes all the necessary precautions and treatment steps for a person with a fractured leg as mentioned in the context. The answer also adds "avoiding sharing or publishing the contents of the manual" which is not present in the context but it does not affect the metric as it is an additional information that does not alter the given information.

Therefore, the AI generated answer adheres to the metric as it is derived only from the information presented in the context.

Rating:
Based on the evaluation criteria, I would rate the answer as 5 - The metric is followed completely.

 Steps to evaluate context as per relev

## Actionable Insights and Business Recommendations

1. Vector database creation time increases with the number of pages in the PDF document.
2. The retrieval parameter k is critical because the answer can be spread across multiple chunks. A larger k raises the chance of retrieving every relevant passage, but it also lengthens the prompt and can pull in less relevant chunks.
3. chunk_overlap preserves coherence when context spans chunk boundaries. However, a larger overlap produces more chunks, which makes indexing more computationally expensive.
4. Refine prompt design, max_tokens, and temperature settings to control response length and creativity.
5. Continuously adjust RAG parameters based on specific use cases for optimal performance.
6. Establish a feedback loop to fine-tune parameters, improving performance for diverse query types.
7. A higher temperature makes the model's output more varied and creative; a lower temperature makes it more deterministic.
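
Two of the parameters above can be illustrated with a small, self-contained sketch. `split_with_overlap` is a simplified character-level sliding window, not the splitter used in this notebook, and `softmax_with_temperature` shows the usual way temperature rescales token scores before sampling; both helper names and the toy inputs are assumptions for illustration. The softmax helper requires a temperature above 0; a setting of 0, as used in this notebook, is typically handled as greedy decoding.

```python
import math

def split_with_overlap(text, chunk_size, chunk_overlap):
    """Sliding window: each chunk starts `chunk_size - chunk_overlap` characters after the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# More overlap means more chunks to embed and store for the same text.
print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=0))
print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))

# Lower temperature concentrates probability on the top-scoring token.
toy_logits = [2.0, 1.0, 0.0]
print(softmax_with_temperature(toy_logits, temperature=0.5))
print(softmax_with_temperature(toy_logits, temperature=2.0))
```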


<font size=6 color='blue'>Power Ahead</font>
___