<a href="https://colab.research.google.com/github/fatemafaria142/Impact-of-Instruction-Tuning-in-Biomedical-Language-Processing-Using-Different-LLMs/blob/main/Biomedical_Language_Processing_using_TinyLlama.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install accelerate peft bitsandbytes transformers trl datasets

Collecting accelerate
  Downloading accelerate-0.26.1-py3-none-any.whl (270 kB)
Collecting peft
  Downloading peft-0.7.1-py3-none-any.whl (168 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.42.0-py3-none-any.whl (105.0 MB)
Collecting trl
  Downloading trl-0.7.9-py3-none-any.whl (141 kB)
Collecting datasets
  Downloading datasets-2.16.1-py3-none-any.whl (507 kB)

# **Load the required packages**

In [2]:
import torch
from datasets import load_dataset, Dataset
from peft import LoraConfig, AutoPeftModelForCausalLM
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer
import os

### **Dataset Link:** https://huggingface.co/datasets/nlpie/Llama2-MedTuned-Instructions?row=0

In [3]:
dataset="nlpie/Llama2-MedTuned-Instructions"
model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0"
output_model="tinyllama-Instructionse-v1"

# **Dataset preparation**

In [4]:
def prepare_train_data(data_id, num_samples=1000):
    data = load_dataset(data_id, split="train")

    # Keep only the first `num_samples` rows (or the whole split if it is smaller)
    data = data.select(range(min(num_samples, len(data))))

    # Build one training string per row from the instruction, input, and expected output
    data_df = data.to_pandas()
    data_df["text"] = data_df[["instruction", "input", "output"]].apply(
        lambda x: "user\n" + x["instruction"] + "  \n input\n" + x["input"] + "\nassistant\n" + x["output"] + "\n",
        axis=1,
    )
    data = Dataset.from_pandas(data_df)

    return data
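
The template applied in `prepare_train_data` can be checked on a single record without pandas. This is a minimal sketch: `build_text` is a hypothetical helper that mirrors the lambda above, and the record is a made-up example rather than a row from the dataset.

```python
def build_text(row: dict) -> str:
    """Concatenate one record into the same training string as the lambda above."""
    return (
        "user\n" + row["instruction"]
        + "  \n input\n" + row["input"]
        + "\nassistant\n" + row["output"] + "\n"
    )


# Hypothetical toy record, used only to show the resulting layout.
example = {
    "instruction": "Classify the sentence.",
    "input": "The patient has a fever.",
    "output": "Neutral",
}
text = build_text(example)
print(repr(text))
```

Printing with `repr` makes the newlines and the two trailing spaces after the instruction visible, which is useful when comparing the training layout with the prompt used at inference time.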



In [5]:
data = prepare_train_data(dataset, num_samples=1000)


Generating train split:   0%|          | 0/205048 [00:00<?, ? examples/s]

In [6]:
data

Dataset({
    features: ['instruction', 'input', 'output', 'text'],
    num_rows: 1000
})

In [7]:
data[0]

{'instruction': 'Your goal is to determine the relationship between the two provided clinical sentences and classify them into one of the following categories:\nContradiction: If the two sentences contradict each other.\nNeutral: If the two sentences are unrelated to each other.\nEntailment: If one of the sentences logically entails the other.',
 'input': 'Sentence 1: For his hypotension, autonomic testing confirmed orthostatic hypotension.\nSentence 2: the patient has orthostatic hypotension',
 'output': 'Entailment',
 'text': 'user\nYour goal is to determine the relationship between the two provided clinical sentences and classify them into one of the following categories:\nContradiction: If the two sentences contradict each other.\nNeutral: If the two sentences are unrelated to each other.\nEntailment: If one of the sentences logically entails the other.  \n input\nSentence 1: For his hypotension, autonomic testing confirmed orthostatic hypotension.\nSentence 2: the patient has orth

In [8]:
data[2]

{'instruction': 'If you have expertise in healthcare, assist users by addressing their medical questions and concerns.',
 'input': 'i am having acute gas problem which i am feeling  and my bp has raised to 130/90, During  this time i have palpitation and i feel there is a missing pulse . Kindly guide me to overcome this problem.On the other hand if there is no gas in my stomach the aforesaid problems are not occuring.',
 'output': 'Dear-thanks for using our service and will try to help you with my medical advice. You might be experiencing intestinal gas from food that you are eating as some vegetables like broccoli, cauliflower and that can make your heart rate faster too. However, the blood pressure is high, which can be the primary problem or secondary to other things like salt in your diet. I recommend you to watch what your eat, control salt, Chat Doctor.  That will help your blood pressure and the intestinal movements too. Avoid soda and caffeine too. I hope that my advice has bee

In [9]:
data[24]

{'instruction': 'Your task is to spot mentions of diseases in the text and apply the BIO labeling scheme. For each disease-related phrase, label the initial word as B (Begin), and label the rest of the words in the phrase as I (Inner). Any words unrelated to diseases should receive an O label.',
 'input': 'Clinical manifestation of DM may be caused by the expanded CTG - repeat compromising the ( alternative ) expression of DM - kinase or DMR - N9 proteins . .',
 'output': 'Clinical : O\nmanifestation : O\nof : O\nDM : B\nmay : O\nbe : O\ncaused : O\nby : O\nthe : O\nexpanded : O\nCTG : O\n- : O\nrepeat : O\ncompromising : O\nthe : O\n( : O\nalternative : O\n) : O\nexpression : O\nof : O\nDM : O\n- : O\nkinase : O\nor : O\nDMR : O\n- : O\nN9 : O\nproteins : O\n. : O\n. : O',
 'text': 'user\nYour task is to spot mentions of diseases in the text and apply the BIO labeling scheme. For each disease-related phrase, label the initial word as B (Begin), and label the rest of the words in the p

## **Load the chat model (not the base version)**

In [11]:
def get_model_and_tokenizer(model_id):
    # Pad with the EOS token
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    tokenizer.pad_token = tokenizer.eos_token

    # 4-bit NF4 quantization with double quantization; computation runs in float16
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype="float16", bnb_4bit_use_double_quant=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto"
    )
    # The KV cache only helps during generation, so it is disabled for training
    model.config.use_cache = False
    model.config.pretraining_tp = 1
    return model, tokenizer
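
To get a feel for what the 4-bit settings buy, the weight memory can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: the parameter count of 1.1 billion is taken from the model name, the block sizes (64 weights per block, 256 constants per second-level block) follow the QLoRA paper, and layers that stay unquantized are ignored.

```python
# Assumption: parameter count read off the model name ("1.1B").
n_params = 1.1e9

fp16_gb = n_params * 2 / 1e9    # 2 bytes per weight in float16
nf4_gb = n_params * 0.5 / 1e9   # 4 bits per weight in NF4

# Overhead of the quantization constants, in bits per weight.
block = 64                                          # assumed weights per quantization block
plain_overhead = 32 / block                         # one float32 constant per block
double_overhead = 8 / block + 32 / (block * 256)    # 8-bit constants plus float32 per 256 blocks

print(f"float16 weights: ~{fp16_gb:.2f} GB")
print(f"NF4 weights:     ~{nf4_gb:.2f} GB")
print(f"constant overhead: {plain_overhead:.3f} -> {double_overhead:.3f} bits per weight")
```

The real footprint will differ somewhat, since the estimate leaves out activations, optimizer state, and any modules that are not quantized.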

In [12]:
model, tokenizer = get_model_and_tokenizer(model_id)


# **Setting up the LoRA**

In [13]:
peft_config = LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM"
    )
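
The number of trainable LoRA weights follows directly from the rank and the shapes of the adapted layers. The sketch below is an estimate under assumptions: `LoraConfig` is given no `target_modules`, so it assumes PEFT's default targets for Llama-style models (`q_proj` and `v_proj`), and it takes the layer shapes from the model printout shown later in this notebook.

```python
r, lora_alpha = 8, 16
n_layers = 22

# (in_features, out_features) of the assumed target modules.
targets = {"q_proj": (2048, 2048), "v_proj": (2048, 256)}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


per_layer = sum(lora_params(d_in, d_out, r) for d_in, d_out in targets.values())
trainable = per_layer * n_layers
scaling = lora_alpha / r  # factor applied to the low-rank update

print(f"trainable LoRA weights: {trainable:,}")
print(f"scaling factor: {scaling}")
```

Under these assumptions the adapter trains roughly 0.1% of the model's weights.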

In [14]:
training_arguments = TrainingArguments(
        output_dir=output_model,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,  # effective batch size of 4 per device
        optim="paged_adamw_32bit",
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        save_strategy="epoch",
        logging_steps=10,
        num_train_epochs=3,  # overridden by max_steps below
        max_steps=250,  # takes precedence over num_train_epochs
        fp16=True,
        # push_to_hub=True
    )
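
The interaction between batch size, gradient accumulation, and `max_steps` is easier to see with the numbers written out. The following sketch assumes a single GPU and no warmup steps (neither is set explicitly above); `cosine_lr` is a hypothetical helper that reproduces the usual cosine decay formula, not a call into the library.

```python
import math

num_samples = 1000
per_device_train_batch_size = 1
gradient_accumulation_steps = 4
num_devices = 1  # assumption: single GPU
max_steps = 250
learning_rate = 2e-4

effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(num_samples / effective_batch)
epochs_covered = max_steps / steps_per_epoch


def cosine_lr(step: int, total_steps: int = max_steps, base_lr: float = learning_rate) -> float:
    """Cosine decay from base_lr down to 0, assuming no warmup."""
    progress = step / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"epochs covered by max_steps: {epochs_covered}")
for step in (0, 125, 250):
    print(f"step {step:3d}: lr = {cosine_lr(step):.2e}")
```

With these settings `max_steps=250` corresponds to exactly one pass over the 1000 samples, so the run stops long before the three epochs requested by `num_train_epochs`.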

In [15]:
trainer = SFTTrainer(
        model=model,
        train_dataset=data,
        peft_config=peft_config,
        dataset_text_field="text",
        args=training_arguments,
        tokenizer=tokenizer,
        packing=False,
        max_seq_length=1024
    )

Map:   0%|          | 0/1000 [00:00<?, ? examples/s]

In [16]:
trainer.train()

You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


| Step | Training Loss |
|------|---------------|
| 10   | 2.3141        |
| 20   | 2.3074        |
| 30   | 2.1562        |
| 40   | 2.0085        |
| 50   | 2.2781        |
| 60   | 1.774         |
| 70   | 1.7305        |
| 80   | 1.818         |
| 90   | 1.6338        |
| 100  | 1.6117        |


TrainOutput(global_step=250, training_loss=1.6778079528808594, metrics={'train_runtime': 284.5453, 'train_samples_per_second': 3.514, 'train_steps_per_second': 0.879, 'total_flos': 1764082249334784.0, 'train_loss': 1.6778079528808594, 'epoch': 1.0})
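
The throughput figures in `TrainOutput` can be cross-checked against the step count and runtime. This is a small sanity check, assuming a single device so that each optimizer step consumes `1 * 4` samples.

```python
train_runtime = 284.5453  # seconds, from TrainOutput
global_step = 250
samples_per_step = 1 * 4  # per-device batch size times gradient accumulation (single device assumed)

samples_seen = global_step * samples_per_step
samples_per_second = round(samples_seen / train_runtime, 3)
steps_per_second = round(global_step / train_runtime, 3)

print(f"samples seen: {samples_seen}")
print(f"samples per second: {samples_per_second}")
print(f"steps per second: {steps_per_second}")
```

Both rates agree with the reported `train_samples_per_second` and `train_steps_per_second`, and the 1000 samples seen match the reported `'epoch': 1.0`.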

# **Merging the LoRA with the base model**

In [17]:
from peft import PeftModel
from transformers import AutoModelForCausalLM
import torch

# Reload the base model in float16 (not quantized) so the adapter can be merged into its weights
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16,
                                             device_map="auto",
                                             trust_remote_code=True)

# Checkpoint saved at the end of training (step 250)
model_path = "/content/tinyllama-Instructionse-v1/checkpoint-250"

# Attach the trained LoRA adapter to the base model
peft_model = PeftModel.from_pretrained(model, model_path, device_map="auto")

# Fold the adapter weights into the base weights and drop the PEFT wrappers
model = peft_model.merge_and_unload()
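
Conceptually, merging replaces each adapted weight matrix `W` with `W + (lora_alpha / r) * B @ A`, so the merged layer gives the same output as the base layer plus the adapter path. The toy example below illustrates that identity with plain Python lists. All matrices and the scaling value are made-up illustration values, and this is not how the library implements the merge internally.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# Toy shapes: W is (d_out=2 x d_in=3), A is (r=1 x 3), B is (2 x r=1).
W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
A = [[0.5, -1.0, 0.0]]
B = [[1.0], [2.0]]
scaling = 2.0  # plays the role of lora_alpha / r

# Merge: fold the low-rank update into the base weight.
delta = matmul(B, A)
merged = [[w + scaling * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

# Compare the merged layer with "base output + scaled adapter output".
x = [1.0, 1.0, 1.0]
y_adapter = [base + scaling * lora for base, lora in zip(matvec(W, x), matvec(B, matvec(A, x)))]
y_merged = matvec(merged, x)

print(merged)
print(y_adapter, y_merged)
```

Because the two paths agree, the merged model needs no PEFT code at inference time and runs as an ordinary `transformers` model.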

In [18]:
model

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 2048)
    (layers): ModuleList(
      (0-21): 22 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=2048, out_features=2048, bias=False)
          (k_proj): Linear(in_features=2048, out_features=256, bias=False)
          (v_proj): Linear(in_features=2048, out_features=256, bias=False)
          (o_proj): Linear(in_features=2048, out_features=2048, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=2048, out_features=5632, bias=False)
          (up_proj): Linear(in_features=2048, out_features=5632, bias=False)
          (down_proj): Linear(in_features=5632, out_features=2048, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head)
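
The layer shapes in this printout are enough to recount the model's parameters by hand. The sketch below makes two assumptions: the truncated `lm_head` is a bias-free linear layer from 2048 to 32000, and each `LlamaRMSNorm` holds one weight vector of size 2048. The rotary embedding is treated as having no trainable weights.

```python
hidden, vocab, n_layers = 2048, 32000, 22

# Linear layers inside one decoder layer, as (in_features, out_features); all are bias-free.
layer_linears = {
    "q_proj": (2048, 2048),
    "k_proj": (2048, 256),
    "v_proj": (2048, 256),
    "o_proj": (2048, 2048),
    "gate_proj": (2048, 5632),
    "up_proj": (2048, 5632),
    "down_proj": (5632, 2048),
}

per_layer = sum(d_in * d_out for d_in, d_out in layer_linears.values())
per_layer += 2 * hidden  # input_layernorm and post_attention_layernorm (assumed one vector each)

embed_tokens = vocab * hidden
lm_head = hidden * vocab  # assumption: the printout is cut off before the lm_head shape
final_norm = hidden

total = embed_tokens + n_layers * per_layer + final_norm + lm_head
print(f"parameters per decoder layer: {per_layer:,}")
print(f"total parameters: {total:,}")
```

The total lands at about 1.1 billion, which matches the size in the model's name.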

# **Inference from the LLM for some examples**

In [19]:
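def formatted_prompt(instruction, user_input) -> str:
    # ChatML-style tags; the training text built in prepare_train_data used plain "user" / "input" / "assistant" markers instead
    return f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>input\n{user_input}<|im_end|>\n<|im_start|>assistant:"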

In [20]:
from transformers import GenerationConfig
from time import perf_counter

def generate_response(instruction, user_input):

    prompt = formatted_prompt(instruction, user_input)

    # Top-k sampling at a low temperature, with a penalty on repeated tokens
    generation_config = GenerationConfig(
        penalty_alpha=0.6,  # contrastive-search setting; has no effect while do_sample=True
        do_sample=True,
        top_k=5,
        temperature=0.5,
        repetition_penalty=1.2,
        max_new_tokens=120,
        pad_token_id=tokenizer.eos_token_id
    )
    start_time = perf_counter()

    # Tokenize once and move the tensors to the device that holds the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(**inputs, generation_config=generation_config)
    decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True).strip()  # Remove leading and trailing whitespace
    print(decoded_output)
    output_time = perf_counter() - start_time
    print(f"Time taken for inference: {round(output_time, 2)} seconds")
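
The `top_k` and `temperature` settings control how each next token is drawn. The sketch below shows the idea on a toy list of logits. It is an illustration only: `sample_top_k` is a hypothetical helper, the logits are made up, and the repetition penalty is left out.

```python
import math
import random


def sample_top_k(logits, top_k=5, temperature=0.5, rng=random):
    """Draw one index from the top_k highest logits after temperature scaling."""
    # Keep only the indices of the top_k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # A temperature below 1 sharpens the distribution toward the best candidates.
    scaled = [logits[i] / temperature for i in top]
    # Softmax weights, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


# Made-up logits over a 7-token vocabulary.
toy_logits = [0.1, 2.0, -1.0, 3.5, 0.7, 1.2, -0.3]
rng = random.Random(0)
draws = [sample_top_k(toy_logits, rng=rng) for _ in range(20)]
print(draws)
```

With `top_k=5`, the two lowest-scoring tokens can never be chosen, and the temperature of 0.5 makes the highest-scoring token dominate the draws.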

#### **Example No:1**

In [21]:
instruction = 'If you have medical expertise, assist the user by responding to their healthcare-related questions.'
user_input = 'Hi I am 33 year old male , 200 lbs . Sunday I had a sever panic attack , was dizzy , naseuas , almost fainted . Went to the emergency room , had a ekg and blood work all came back normal . I do suffer from anxiety but as of late I have chest pains . Sharp pains in diffrent spots of my chest . They come and go . I also have a strange feeling I m my head and pressure under my rib cage . My blood pressure is always normal , but have a high pulse rate . Ranges from 85 - 111 bpm evertime I check it . Are all of these related to stress , what can be done . A very scared man , Mike'
generate_response(instruction=instruction, user_input=user_input)

<|im_start|>user
If you have medical expertise, assist the user by responding to their healthcare-related questions.<|im_end|>
<|im_start|>input
Hi I am 33 year old male , 200 lbs . Sunday I had a sever panic attack , was dizzy , naseuas , almost fainted . Went to the emergency room , had a ekg and blood work all came back normal . I do suffer from anxiety but as of late I have chest pains . Sharp pains in diffrent spots of my chest . They come and go . I also have a strange feeling I m my head and pressure under my rib cage . My blood pressure is always normal , but have a high pulse rate . Ranges from 85 - 111 bpm evertime I check it . Are all of these related to stress , what can be done . A very scared man , Mike<|im_end|>
<|im_start|>assistant: Hi there. Thank you for your question. Stress has been linked to some of the symptoms that you describe. Anxiety and panic attacks are common among people who experience them. It's difficult to say exactly why this happens without more info

#### **Example No:2**

In [22]:
instruction = 'As a medical chatbot, your responsibility is to provide information and guidance on medical matters to users.'
user_input = 'Can i give my 6 month old 18lb baby anythi g for allergies shes having clear boogars that were green but now are clear again her eyes are watery and she keeps sneezing she cant breathe and drink her bottle and the pedi is not awnsering called 6 times already'
generate_response(instruction=instruction, user_input=user_input)

<|im_start|>user
As a medical chatbot, your responsibility is to provide information and guidance on medical matters to users.<|im_end|>
<|im_start|>input
Can i give my 6 month old 18lb baby anythi g for allergies shes having clear boogars that were green but now are clear again her eyes are watery and she keeps sneezing she cant breathe and drink her bottle and the pedi is not awnsering called 6 times already<|im_end|>
<|im_start|>assistant: hi there. thank you for reaching out to us. can you please tell me more about your daughter's condition? what kind of allergy does she have? how has it been affecting her? do you know if her breathing problems are due to this allergy or something else? I would like to help you with this matter as much as possible. Can you provide some details regarding the doctor's consultation? did they recommend any tests or treatments? thanks in advance. Have a nice day! Best regards!
Time taken for inference: 5.25 seconds


#### **Example No:3**

In [24]:
instruction =  '''In your capacity as a doctor, it is expected that you answer the medical questions relying on the patient's description.
Analyze the question given its context. Give both long answer and yes/no decision.'''
user_input = '''###Question: Does systemic inflammation response syndrome score predict the mortality in multiple trauma patients? '''
generate_response(instruction=instruction, user_input=user_input)

<|im_start|>user
In your capacity as a doctor, it is expected that you answer the medical questions relying on the patient's description. 
Analyze the question given its context. Give both long answer and yes/no decision.<|im_end|>
<|im_start|>input
###Question: Does systemic inflammation response syndrome score predict the mortality in multiple trauma patients? <|im_end|>
<|im_start|>assistant:
Thanks for your input. Yes, I think so. The score was developed by using various parameters to assess the severity of injury. It has been shown to be an independent prognostic marker for all-cause mortality after blunt trauma . In this study, we found that the score was associated with increased mortality (hazard ratio 1.62; P = .03) , but not with other outcomes such as intensive care unit admission or mechanical ventilation. This suggests that the score may have some specificity for severe injuries
Time taken for inference: 3.8 seconds
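
As the outputs above show, the decoded text echoes the full prompt before the answer. One way to keep only the answer is to split on the assistant tag. This is a minimal sketch: `extract_reply` is a hypothetical helper that is not part of the notebook, and the sample string is a shortened, made-up output.

```python
ASSISTANT_MARKER = "<|im_start|>assistant:"


def extract_reply(decoded: str, marker: str = ASSISTANT_MARKER) -> str:
    """Return the text after the last assistant tag, or the whole text if the tag is missing."""
    # rpartition yields ("", "", decoded) when the marker is absent, so the fallback is automatic.
    _, _, reply = decoded.rpartition(marker)
    return reply.strip()


# Shortened, made-up example of a decoded generation.
sample = (
    "<|im_start|>user\nAnswer the question.<|im_end|>\n"
    "<|im_start|>input\nIs rest helpful?<|im_end|>\n"
    "<|im_start|>assistant: Hi there. Thank you for your question."
)
print(extract_reply(sample))
```

Splitting on the last occurrence of the tag keeps the helper robust if the user's own text happens to contain the same marker.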
