# Medicine Information Chatbot

## Overview
This Jupyter notebook documents the training process of a chatbot designed to analyze doctor prescriptions. The chatbot uses an Optical Character Recognition (OCR) model to extract medicine names from prescriptions. Once a medicine name is extracted, the chatbot provides information on the following aspects:
- **Medicine Use**: Describes the purpose or use of the medicine.
- **Contraindications**: Specifies who should not take the medicine.
- **Dosage**: Provides dosage information for the medicine.
- **General Information**: Offers general insights about the medicine to help users understand it and resolve their doubts.

## OCR Model
The OCR model is employed to accurately extract medicine names from scanned doctor prescriptions. This ensures precise identification of medicines for further analysis.
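
OCR output on handwritten prescriptions is rarely letter-perfect, so the recognized words usually need to be matched against a list of known medicine names. The sketch below shows one way to do this with the standard library; `KNOWN_MEDICINES`, `match_medicines`, and the 0.8 cutoff are illustrative assumptions, not part of the notebook's pipeline.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; in practice this would come from the knowledge base.
KNOWN_MEDICINES = ["PARACETAMOL", "ABCIXIMAB", "ZOCLAR", "GESTAKIND"]


def match_medicines(ocr_words, vocabulary=KNOWN_MEDICINES, cutoff=0.8):
    """Map noisy OCR words to known medicine names, skipping words with no close match."""
    matches = []
    for word in ocr_words:
        candidates = get_close_matches(word.upper(), vocabulary, n=1, cutoff=cutoff)
        if candidates and candidates[0] not in matches:
            matches.append(candidates[0])
    return matches


# Example: words as an OCR engine might return them for one prescription line.
print(match_medicines(["Paracetamo1", "500mg", "Zoclar", "twice"]))
```

A stricter cutoff reduces false matches between similar brand names at the cost of missing heavily distorted words, so the value is worth tuning on real scans.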

## Chatbot Functionality
The chatbot is designed to respond to user queries regarding medicines identified from prescriptions. It provides comprehensive information including the medicine's purpose, contraindications, dosage guidelines, and additional details to facilitate users in understanding their medications better.
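
One way to cover the four aspects is to keep one question template per aspect and fill in the extracted medicine name. The templates below are the ones used in the testing cell of this notebook; the `PROMPT_TEMPLATES` dictionary and the `build_prompts` helper are illustrative names, not existing code.

```python
# Templates taken from the testing cell; the dictionary keys are illustrative labels.
PROMPT_TEMPLATES = {
    "use": "What is {query} used for?",
    "contraindications": "Who should not take {query}?",
    "dosage": "How should I take this {query}?",
    "general": "General answer to a particular medicine {query}?",
}


def build_prompts(medicine_name):
    """Return one filled-in prompt per information aspect for a medicine name."""
    name = medicine_name.strip()
    if not name:
        raise ValueError("medicine_name must not be empty")
    return {aspect: template.format(query=name) for aspect, template in PROMPT_TEMPLATES.items()}


for aspect, prompt in build_prompts("Paracetamol").items():
    print(f"{aspect}: {prompt}")
```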

## Data Sources
The chatbot's knowledge base is constructed from reliable medical sources and databases to ensure accurate and up-to-date information.

## Disclaimer
This chatbot is intended for informational purposes only and should not substitute professional medical advice. Users are advised to consult healthcare professionals for personalized medical guidance and treatment recommendations.

# Step 1: Setting Up the Environment 🛠️
First things first, let's get our environment ready! We'll install all the necessary packages, including the Hugging Face transformers library, datasets for easy data loading, wandb for experiment tracking, and a few others. 📦

In [None]:
import sys
import site
import os

# Install the required packages
!{sys.executable} -m pip install --upgrade "transformers>=4.38.0"
!{sys.executable} -m pip install --upgrade "datasets>=2.18.0"
!{sys.executable} -m pip install --upgrade "wandb>=0.16.0"
!{sys.executable} -m pip install --upgrade "trl>=0.7.11"
!{sys.executable} -m pip install --upgrade "peft>=0.9.0"
!{sys.executable} -m pip install --upgrade "accelerate>=0.28.0"

# Get the site-packages directory
site_packages_dir = site.getsitepackages()[0]

# Add the site-packages directory where these packages are installed to the top of sys.path
if not os.access(site_packages_dir, os.W_OK):
    user_site_packages_dir = site.getusersitepackages()
    if user_site_packages_dir in sys.path:
        sys.path.remove(user_site_packages_dir)
    sys.path.insert(0, user_site_packages_dir)
else:
    if site_packages_dir in sys.path:
        sys.path.remove(site_packages_dir)
    sys.path.insert(0, site_packages_dir)
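
After installing, it can help to confirm which versions were actually picked up, since a kernel that was already running may still import older copies. A minimal sketch using only the standard library follows; `installed_versions` is a hypothetical helper, and the package list mirrors the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names mirror the pip install commands in the cell above.
REQUIRED_PACKAGES = ["transformers", "datasets", "wandb", "trl", "peft", "accelerate"]


def installed_versions(packages=REQUIRED_PACKAGES):
    """Return {package: version string, or None when the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```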

We'll now make sure to optimize our environment for the Intel GPU by setting the appropriate environment variables and configuring the number of cores and threads. This will ensure we get the best performance out of our hardware! ⚡

In [None]:
import warnings
warnings.filterwarnings("ignore")
import os
import psutil
num_physical_cores = psutil.cpu_count(logical=False)
# Assumes a dual-socket machine; adjust the divisor for other topologies
num_cores_per_socket = num_physical_cores // 2
os.environ["TOKENIZERS_PARALLELISM"] = "0"
#HF_TOKEN = os.environ["HF_TOKEN"]

# Set the LD_PRELOAD environment variable
ld_preload = os.environ.get("LD_PRELOAD", "")
conda_prefix = os.environ.get("CONDA_PREFIX", "")
# Improve memory allocation performance; if tcmalloc is not available, comment this line out
os.environ["LD_PRELOAD"] = f"{ld_preload}:{conda_prefix}/lib/libtcmalloc.so"
# Reduce the overhead of submitting commands to the GPU
os.environ["SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS"] = "1"
# reducing memory accesses by fusing SDP ops
os.environ["ENABLE_SDP_FUSION"] = "1"
# set openMP threads to number of physical cores
os.environ["OMP_NUM_THREADS"] = str(num_physical_cores)
# Set the thread affinity policy
os.environ["OMP_PROC_BIND"] = "close"
# Set the places for thread pinning
os.environ["OMP_PLACES"] = "cores"

print(f"Number of physical cores: {num_physical_cores}")
print(f"Number of cores per socket: {num_cores_per_socket}")
print(f"OpenMP environment variables:")
print(f"  - OMP_NUM_THREADS: {os.environ['OMP_NUM_THREADS']}")
print(f"  - OMP_PROC_BIND: {os.environ['OMP_PROC_BIND']}")
print(f"  - OMP_PLACES: {os.environ['OMP_PLACES']}")
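
The `LD_PRELOAD` line in the cell above appends the tcmalloc path unconditionally. A slightly more defensive variant, sketched below, only adds the library when the file exists and avoids a leading colon when `LD_PRELOAD` was empty. `build_ld_preload` is a hypothetical helper, and the sketch only prints the value rather than applying it. Keep in mind that the dynamic loader reads `LD_PRELOAD` when a process starts, so the setting generally affects processes launched afterwards rather than the kernel that is already running.

```python
import os


def build_ld_preload(current, library_path):
    """Append library_path to a colon-separated LD_PRELOAD value if the file exists."""
    entries = [entry for entry in current.split(":") if entry]
    if os.path.isfile(library_path) and library_path not in entries:
        entries.append(library_path)
    return ":".join(entries)


conda_prefix = os.environ.get("CONDA_PREFIX", "")
tcmalloc_path = os.path.join(conda_prefix, "lib", "libtcmalloc.so")
new_value = build_ld_preload(os.environ.get("LD_PRELOAD", ""), tcmalloc_path)
# Assign new_value to os.environ["LD_PRELOAD"] to apply it to child processes.
print(f"LD_PRELOAD would be: {new_value!r}")
```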

# Step 2: Initializing the XPU and Monitoring GPU Memory in Real Time 🎮
Next, we'll initialize the Intel Max 1550 GPU, which is referred to as an XPU. We'll use the intel_extension_for_pytorch library to integrate the XPU namespace with PyTorch. 🤝

👀 GPU Memory Monitoring 👀
To keep track of the Intel Max 1550 GPU (XPU) memory usage throughout this notebook, please refer to the cell below. It displays the current memory usage and updates every 5 seconds, providing you with real-time information about the GPU's memory consumption. 📊

The memory monitoring cell displays the following information:

- **XPU Device Name**: The name of the Intel Max 1550 GPU being used.
- **Reserved Memory**: The amount of memory currently reserved by the GPU.
- **Allocated Memory**: The amount of memory currently allocated by the GPU.
- **Max Reserved Memory**: The maximum amount of memory that has been reserved by the GPU.
- **Max Allocated Memory**: The maximum amount of memory that has been allocated by the GPU.

Keep an eye on this cell to monitor the GPU memory usage as you progress through the notebook. If you need to check the current memory usage at any point, simply scroll down to the memory monitoring cell for a quick reference. 👇

In [None]:
import asyncio
import threading
import torch
from IPython.display import display, HTML

import intel_extension_for_pytorch as ipex

if torch.xpu.is_available():
    torch.xpu.empty_cache()
    
    def get_memory_usage():
        memory_reserved = round(torch.xpu.memory_reserved() / 1024**3, 3)
        memory_allocated = round(torch.xpu.memory_allocated() / 1024**3, 3)
        max_memory_reserved = round(torch.xpu.max_memory_reserved() / 1024**3, 3)
        max_memory_allocated = round(torch.xpu.max_memory_allocated() / 1024**3, 3)
        return memory_reserved, memory_allocated, max_memory_reserved, max_memory_allocated
   
    def print_memory_usage():
        device_name = torch.xpu.get_device_name()
        print(f"XPU Name: {device_name}")
        memory_reserved, memory_allocated, max_memory_reserved, max_memory_allocated = get_memory_usage()
        memory_usage_text = f"XPU Memory: Reserved={memory_reserved} GB, Allocated={memory_allocated} GB, Max Reserved={max_memory_reserved} GB, Max Allocated={max_memory_allocated} GB"
        print(f"\r{memory_usage_text}", end="", flush=True)
    
    async def display_memory_usage(output):
        device_name = torch.xpu.get_device_name()
        output.update(HTML(f"<p>XPU Name: {device_name}</p>"))
        while True:
            memory_reserved, memory_allocated, max_memory_reserved, max_memory_allocated = get_memory_usage()
            memory_usage_text = f"XPU ({device_name}) :: Memory: Reserved={memory_reserved} GB, Allocated={memory_allocated} GB, Max Reserved={max_memory_reserved} GB, Max Allocated={max_memory_allocated} GB"
            output.update(HTML(f"<p>{memory_usage_text}</p>"))
            await asyncio.sleep(5)
    
    def start_memory_monitor(output):
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        loop.create_task(display_memory_usage(output))
        thread = threading.Thread(target=loop.run_forever)
        thread.start()    
    output = display(display_id=True)
    start_memory_monitor(output)
else:
    print("XPU device not available.")
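
The monitor above reports values in GB by dividing byte counts by 1024**3. The conversion and formatting can be factored into a small pure function, which also makes it easy to check without an XPU. `format_memory_stats` is an illustrative name, and the byte counts in the example are made up.

```python
GIB = 1024 ** 3  # bytes per GiB, the same divisor used by the monitoring cell


def format_memory_stats(reserved, allocated, max_reserved, max_allocated):
    """Format four byte counts as a one-line summary in GB, rounded to 3 decimals."""
    values = [round(n / GIB, 3) for n in (reserved, allocated, max_reserved, max_allocated)]
    labels = ["Reserved", "Allocated", "Max Reserved", "Max Allocated"]
    return ", ".join(f"{label}={value} GB" for label, value in zip(labels, values))


# Made-up byte counts; on an XPU these would come from torch.xpu.memory_reserved() and friends.
print(format_memory_stats(2 * GIB, GIB + GIB // 2, 4 * GIB, 3 * GIB))
```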

In [None]:
%pip install intel-extension-for-pytorch


# Step 3: Configuring the LoRA Settings 🎛️
To finetune our model efficiently, we'll use the LoRA (Low-Rank Adaptation) technique.

LoRA allows us to adapt the model to our specific task by training only a small set of additional parameters. This greatly reduces the training time and memory requirements! ⏰

We'll define the LoRA configuration, specifying the rank (r) and the target modules we want to adapt. 🎯

In [None]:
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    # could use only the q, v and o projections and comment out the rest
    target_modules=["q_proj", "o_proj", 
                    "v_proj", "k_proj", 
                    "gate_proj", "up_proj",
                    "down_proj"],
    task_type="CAUSAL_LM")
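
To see why LoRA is cheap, recall that for a weight matrix of shape (d_out, d_in) it trains two small matrices of shapes (r, d_in) and (d_out, r) instead of the full matrix, and scales their product by lora_alpha / r. The sketch below counts the trainable parameters for a single projection; the 4096 x 4096 shape is an assumed example, not a value read from the model.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds for one (d_out x d_in) weight: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


def lora_scaling(lora_alpha, r):
    """Factor applied to the low-rank update B @ A before it is added to the frozen weight."""
    return lora_alpha / r


d_in = d_out = 4096  # assumed projection size, for illustration only
full = d_in * d_out
added = lora_param_count(d_in, d_out, r=32)
print(f"full matrix: {full:,} params")
print(f"LoRA (r=32): {added:,} params ({100 * added / full:.2f}% of full)")
print(f"scaling with lora_alpha=16, r=32: {lora_scaling(16, 32)}")
```

Doubling `r` doubles the number of added parameters, while the scaling factor halves unless `lora_alpha` is raised with it.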

# Step 4: Loading the Model 🤖
Now, let's load the model (`Intel/neural-chat-7b-v3-1` in the cells below) using the Hugging Face AutoModelForCausalLM class. We'll also load the corresponding tokenizer to preprocess our input data. The model will be moved to the XPU for efficient training. 💪

Note: Before running this notebook, please ensure you have read and agreed to the model's terms of use. You'll need to visit the model card on the Hugging Face Hub, accept any usage terms, and generate an access token with write permissions. This token will be required to load the model and push your finetuned version back to the Hub.

To create an access token:

1. Go to your Hugging Face account settings.
2. Click on "Access Tokens" in the left sidebar.
3. Click on the "New token" button.
4. Give your token a name, select the desired permissions (make sure to include write access), and click "Generate".
5. Copy the generated token and keep it secure. You'll use this token to authenticate when loading the model.

Make sure to follow these steps to comply with the terms of use and ensure a smooth finetuning experience. If you have any questions or concerns, please refer to the model's documentation on the Hugging Face Hub or reach out to the Hugging Face community for assistance.

In [None]:
from huggingface_hub import notebook_login

notebook_login()
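
In non-interactive runs, where the `notebook_login()` widget is not available, the token can instead be read from an environment variable. The sketch below assumes the token is stored in `HF_TOKEN` (the name used in the commented-out line of the environment cell); `read_hf_token` and `login_from_env` are hypothetical helpers.

```python
import os


def read_hf_token(environ=None, var_name="HF_TOKEN"):
    """Return the access token from the environment, or raise if it is missing or blank."""
    environ = os.environ if environ is None else environ
    token = environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"{var_name} is not set; create a token and export it first")
    return token


def login_from_env():
    """Authenticate with the Hugging Face Hub using the token from the environment."""
    from huggingface_hub import login  # imported lazily so read_hf_token has no dependencies

    login(token=read_hf_token())
```

Calling `login_from_env()` in place of `notebook_login()` keeps the token out of the notebook itself.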

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

USE_CPU = False
device = "xpu:0" if torch.xpu.is_available() else "cpu"
if USE_CPU:
    device = "cpu"
print(f"using device: {device}")

model_id = "Intel/neural-chat-7b-v3-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
# Set padding side to the right to ensure proper attention masking during fine-tuning
tokenizer.padding_side = "right"
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
# Disable caching mechanism to reduce memory usage during fine-tuning
model.config.use_cache = False
# Configure the model's pre-training tensor parallelism degree to match the fine-tuning setup
model.config.pretraining_tp = 1 

OCR

# Step 5: Testing the Model 🧪
Before we start finetuning, let's test the base model on sample inputs to see how it performs out-of-the-box. We'll generate responses based on the medicine names in the query_list defined below. 🌿

In [None]:
import torch
from datasets import load_dataset, Dataset
from peft import LoraConfig, AutoPeftModelForCausalLM, prepare_model_for_kbit_training, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig, TrainingArguments
from trl import SFTTrainer
import os

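# Load the CSV with the "csv" builder; the raw string keeps the Windows backslashes literal
data = load_dataset("csv", data_files=r"E:\Anokha\finalhyper.csv", split="train")
data_df = data.to_pandas()

# Build one training string per row; the column names must match the CSV header exactly
data_df["text"] = data_df[["medicine", "uses", "side effect"]].apply(lambda x: "###Human: " + x["medicine"] + " " + x["uses"] + " ###Assistant: " + x["side effect"], axis=1)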
data = Dataset.from_pandas(data_df)


tokenizer = AutoTokenizer.from_pretrained("Intel/neural-chat-7b-v3-1")
tokenizer.pad_token = tokenizer.eos_token


quantization_config_loading = GPTQConfig(bits=4, disable_exllama=True, tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
                            "Intel/neural-chat-7b-v3-1",
                            quantization_config=quantization_config_loading,
                            device_map="auto"
                        )


model.config.use_cache=False
model.config.pretraining_tp=1
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)


peft_config = LoraConfig(
    r=16, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM", target_modules=["q_proj", "v_proj"]
)
model = get_peft_model(model, peft_config)


training_arguments = TrainingArguments(
        output_dir="mistral-finetuned-alpaca",
        per_device_train_batch_size=8,
        gradient_accumulation_steps=1,
        optim="paged_adamw_32bit",
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        save_strategy="epoch",
        logging_steps=100,
        num_train_epochs=1,
        max_steps=250,  # when positive, max_steps overrides num_train_epochs
        fp16=True,
        push_to_hub=True
)


trainer = SFTTrainer(
        model=model,
        train_dataset=data,
        peft_config=peft_config,
        dataset_text_field="text",
        args=training_arguments,
        tokenizer=tokenizer,
        packing=False,
        max_seq_length=512
)


trainer.train()
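
The `text` column built above follows a simple `###Human: ... ###Assistant: ...` layout. The same formatting can be written as a standalone function, which makes it easier to check a single row before training. `format_example` is an illustrative helper, the sample row is made up, and the column names are assumed to match the CSV.

```python
def format_example(row):
    """Build one training string: medicine and uses as the prompt, side effects as the answer."""
    prompt = f"{row['medicine']} {row['uses']}"
    return f"###Human: {prompt} ###Assistant: {row['side effect']}"


# Made-up row for illustration; real rows come from the CSV.
sample = {"medicine": "ExampleMed", "uses": "pain relief", "side effect": "nausea"}
print(format_example(sample))
```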

In [None]:
import torch
import intel_extension_for_pytorch as ipex
import argparse
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
import keras_ocr
import pandas as pd
import matplotlib.pyplot as plt
import spacy


def generate_response(model, query, prompt):
    input_text = prompt.format(query=query)
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
    outputs = model.generate(input_ids, max_new_tokens=100, eos_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


def test_model(model, query_list):
    """Test the model using the provided query list and prompt templates."""
    for query in query_list:
        print("__" * 25)
        for prompt_template in [template1, template2, template3, template4]:
            prompt = prompt_template.format(query=query)
            generated_response = generate_response(model,  query, prompt_template)
            print(f"Prompt: {prompt}")
            print(f"Generated Answer: {generated_response}\n")
            print("__" * 25)

if __name__ == "__main__":


    parser = argparse.ArgumentParser("Generation script (fp32/bf16 path)", add_help=False)
    parser.add_argument("--dtype", type=str, choices=["float32", "bfloat16"], default="float32", help="choose the weight dtype and whether to enable auto mixed precision or not")
    parser.add_argument("--max-new-tokens", default=32, type=int, help="output max new tokens")
    parser.add_argument("--prompt", default="What are we having for dinner?", type=str, help="input prompt")
    parser.add_argument("--greedy", action="store_true")
    parser.add_argument("--batch-size", default=1, type=int, help="batch size")
    # parse_known_args ignores the extra arguments that Jupyter passes to the kernel
    args, _ = parser.parse_known_args()
    print(args)

    amp_enabled = True if args.dtype != "float32" else False


    model_id = "Intel/neural-chat-7b-v3-1"  # same checkpoint as in the loading cell
    config = AutoConfig.from_pretrained(model_id, torchscript=True, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32, config=config, low_cpu_mem_usage=True, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = model.eval()
    model = model.to(memory_format=torch.channels_last)

    model = ipex.llm.optimize(model, dtype=torch.float32, inplace=True, deployment_mode=True)

    query_list = ['Paracetamol', 'ABCIXIMAB', 'ZOCLAR', 'GESTAKIND']
    template1 = """What is {query} used for?"""
    template2 = """Who should not take {query}?"""
    template3 = """How should I take this {query}?"""
    template4 = """General answer to a particular medicine {query}?"""

    print("Testing the model before fine-tuning:")
    test_model(model, query_list)
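
Because `tokenizer.decode(outputs[0], ...)` decodes the prompt tokens along with the newly generated ones, each generated answer begins with a copy of the question. One way to return only the continuation is to drop the prompt tokens before decoding. The sketch below demonstrates the slicing on plain lists of token ids so that it runs without a model; `strip_prompt_tokens` is a hypothetical helper and the ids are made up.

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Return only the newly generated ids from a sequence that starts with the prompt ids."""
    if prompt_length > len(output_ids):
        raise ValueError("prompt_length exceeds the length of output_ids")
    return output_ids[prompt_length:]


prompt_ids = [101, 2054, 2003]               # made-up ids standing in for the tokenized prompt
output_ids = prompt_ids + [7632, 2088, 102]  # generate() returns the prompt followed by new tokens
print(strip_prompt_tokens(output_ids, len(prompt_ids)))
```

With the tensors in `generate_response`, the equivalent slice would be `outputs[0][input_ids.shape[-1]:]`, passed to `tokenizer.decode`.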

# Output

Prompt: What is Paracetamol used for?
Generated Answer: What is Paracetamol used for?

Paracetamol is used to relieve mild to moderate pain and to reduce fever. It is used to relieve pain from headaches, menstrual periods, toothaches, backaches, osteoarthritis, and other conditions.

How does Paracetamol work?

Paracetamol works by blocking the production of certain chemicals in the body that cause pain and

Prompt: Who should not take Paracetamol?
Generated Answer: Who should not take Paracetamol?

Paracetamol is not suitable for some people. Tell your doctor or pharmacist if you:

- have had an allergic reaction to paracetamol or any other medicines in the past
- have liver problems
- have a serious blood disorder called haemolytic anaemia
- have a serious kidney disorder
- have a serious heart condition
- have a serious lung condition
- have a serious stomac

Prompt: How should I take this Paracetamol?
Generated Answer: How should I take this Paracetamol?

Paracetamol is available in many different forms, including tablets, capsules, soluble tablets, and liquid.

The usual dose for adults and children aged 16 years and over is 1g (two 500mg tablets) taken up to four times a day, with a maximum dose of 4g (eight 500mg tablets) in 24 hours.

For children 

Prompt: General answer to a particular medicine Paracetamol?
Generated Answer: General answer to a particular medicine Paracetamol?

Paracetamol is a medicine that is used to relieve pain and reduce fever. It is available in many different forms, including tablets, capsules, and liquids. Paracetamol is generally considered to be safe and effective when used as directed. However, it is important to follow the instructions on the label and not to take more than the recommended dose. Taking too much paracetamol can cause serious liver damag

Prompt: What is ABCIXIMAB used for?
Generated Answer: What is ABCIXIMAB used for?

ABCIXIMAB is used to prevent blood clots from forming in people with certain heart or blood vessel conditions. It is also used to prevent blood clots from forming during or after heart bypass surgery.

ABCIXIMAB is a monoclonal antibody that works by blocking the action of a protein in the body called platelet glycoprotein IIb/IIIa. This helps prevent platelets from sticking together to form 

Prompt: Who should not take ABCIXIMAB?
Generated Answer: Who should not take ABCIXIMAB?

Do not take ABCIXIMAB if you:

- are allergic to abciximab or any of the ingredients in ABCIXIMAB. See the end of this leaflet for a complete list of ingredients in ABCIXIMAB.
- have a history of bleeding problems
- have had a stroke or a transient ischemic attack (TIA or “mini-stroke”)
- have had a recent serious injury or

Prompt: How should I take this ABCIXIMAB?
Generated Answer: How should I take this ABCIXIMAB?

Take this medication exactly as prescribed by your doctor. Follow the directions on your prescription label.

This medication is given as an infusion into a vein. It is usually given 18 to 24 hours before your scheduled procedure, and again just before the procedure.

Your doctor will perform blood tests to make sure you do not have conditions that would prevent you from safely using ABC

Prompt: What is ZOCLAR used for?
Generated Answer: What is ZOCLAR used for?

ZOCLAR is used to treat the symptoms of Parkinson's disease and other similar conditions, by increasing the levels of a natural substance called dopamine in the brain.

How does ZOCLAR work?

Parkinson's disease is caused by a lack of dopamine in the brain. Dopamine is a substance called a neurotransmitter, and it is involved in the control of voluntary movements. It works by stimulating the moto

Prompt: Who should not take ZOCLAR?
Generated Answer: Who should not take ZOCLAR?

Do not take ZOCLAR if you:

- are allergic to zolpidem tartrate or any of the ingredients in ZOCLAR. See the end of this leaflet for a complete list of ingredients in ZOCLAR.
- have had an allergic reaction to drugs containing zolpidem, such as Ambien, Ambien CR, Edluar, Intermezzo, or Zolp

Prompt: How should I take this ZOCLAR?
Generated Answer: How should I take this ZOCLAR?

ZOCLAR is usually taken once a day, in the morning or in the evening, with or without food.

Take ZOCLAR exactly as directed by your doctor. Do not take more or less of it or take it more often than prescribed by your doctor.

Your doctor may start you on a low dose of ZOCLAR and gradually increase your dose.

ZOCLAR controls high blood pressure but does not

Prompt: General answer to a particular medicine ZOCLAR?
Generated Answer: General answer to a particular medicine ZOCLAR?

Zoclar is a brand name for the drug loratadine, which is an antihistamine used to treat symptoms of allergies such as sneezing, runny nose, and itchy, watery eyes. It works by blocking the action of histamine, a substance in the body that causes allergic symptoms. Zoclar is available in tablet and syrup form and is usually taken once a day. It is generally considered safe and effecti
Prompt: What is GESTAKIND used for?
Generated Answer: What is GESTAKIND used for?

GESTAKIND is used to treat nausea and vomiting associated with pregnancy.

How does GESTAKIND work?

GESTAKIND contains dimenhydrinate, which is an antihistamine. Antihistamines work by blocking the action of histamine, a substance in the body that causes allergic symptoms. Dimenhydrinate also has anticholinergic properties, which means it can help to reduce the ac
Prompt: How should I take this GESTAKIND?
Generated Answer: How should I take this GESTAKIND?

Take this medicine by mouth with a glass of water. Follow the directions on the prescription label. You can take it with or without food. If it upsets your stomach, take it with food. Take your medicine at regular intervals. Do not take your medicine more often than directed.

Talk to your pediatrician regarding the use of this medicine in children. Special care may be needed.

What should I tell my health care provider before I take this medicine?

__________________________________________________
Prompt: General answer to a particular medicine GESTAKIND?
Generated Answer: General answer to a particular medicine GESTAKIND?

The medicine GESTAKIND is a hormonal drug that is used to treat various gynecological diseases. It contains two active ingredients: gestodene and ethinylestradiol. These hormones work together to prevent ovulation, thicken cervical mucus, and thin the lining of the uterus.

# Results
- **Accurate Medicine Extraction**: The OCR model successfully identifies and extracts medicine names from scanned prescriptions with high accuracy.
- **Comprehensive Medicine Information**: The chatbot provides detailed information about each medicine, including its purpose, contraindications, dosage guidelines, and general insights.
- **User-Friendly Interface**: The chatbot offers a user-friendly interface, allowing users to input prescriptions and receive instant medicine information.

# Conclusion
The developed chatbot, equipped with an OCR model, facilitates efficient extraction and analysis of medicine names from doctor prescriptions. It serves as a valuable tool for users seeking comprehensive information about their medications, ultimately enhancing healthcare accessibility and awareness.

# Future Directions
- **Enhanced OCR Accuracy**: Continuously refine the OCR model to improve accuracy and robustness in medicine name extraction.
- **Expansion of Knowledge Base**: Regularly update and expand the knowledge base to incorporate new medicines and updated information.
- **Integration with Electronic Health Records**: Explore integration with electronic health record systems to streamline medicine information retrieval for healthcare professionals.
- **User Feedback Integration**: Incorporate user feedback mechanisms to further enhance the chatbot's performance and usability.