### LLaMA Supervised Fine-Tuning

This document takes GPT-4o's answers on the Kabatubare Medical Dataset and fine-tunes the LLaMA model on those answers.

The purpose of this exercise is to test whether fine-tuning lets LLaMA distill the knowledge of GPT-4o and improve its performance on open-ended healthcare question answering.

In [1]:
import os

In [2]:
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

In [3]:
import pandas as pd
import json
import torch
import pickle
from unsloth import FastLanguageModel
from datasets import Dataset
from tqdm  import tqdm
import evaluate

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.




🦥 Unsloth Zoo will now patch everything to make training faster!


#### Reading the Question and Answer Pairs from Test Dataset Phase 2

In [4]:
ques_list = []
ans_list = []
llama_resp_list = []
gpt_resp_list = []

# Each line of the JSONL file is one JSON object with 'question' and 'answer' keys
with open('phase2_data_kabatubare/test_kabatubare.jsonl', 'r', encoding='utf-8') as file:
    for line in file:
        json_object = json.loads(line)
        ques_list.append(json_object['question'])
        ans_list.append(json_object['answer'])
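
The loading step above can also be wrapped in a small reusable helper. The following is a minimal sketch, not part of the original notebook: the function name `load_qa_pairs` is hypothetical, and it assumes every non-blank line is a JSON object with `question` and `answer` keys, as in the cell above.

```python
import json


def load_qa_pairs(path):
    """Read a JSONL file and return parallel lists of questions and answers.

    Hypothetical helper: assumes each non-blank line is a JSON object
    containing 'question' and 'answer' keys.
    """
    questions, answers = [], []
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            record = json.loads(line)
            questions.append(record["question"])
            answers.append(record["answer"])
    return questions, answers
```

Calling `load_qa_pairs('phase2_data_kabatubare/test_kabatubare.jsonl')` would then return the same two lists that the cell above builds by hand.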

In [5]:
test_dataset = pd.DataFrame({'question': ques_list,
                             'answer': ans_list})
test_dataset

Unnamed: 0,question,answer
0,i don't have periods due to taking nuvaring. h...,you really shouldn't worry about getting pregn...
1,when you can't digest food. my daughter is 38 ...,i'm sorry your daughter is having a hard time....
2,8 wks post-hemorrhoidectomy. knife-like pain o...,i'm sorry you're going through this. hemorrhoi...
3,i sometimes feel a mild discomfort under my le...,after reading your full statement i would also...
4,i had sex about 12 hours ago and i'm noticing ...,hi ok if your shaved and this is on your pubic...
...,...,...
4683,bright blood in stool for a while now burning ...,hi but as your not really well now i still thi...
4684,i feel fine i'm in a great mood but food taste...,well i'm happy you're in a good mood anyway! i...
4685,i have recently had unprotected sex and both m...,"you had unprotected sex? it is your ""other"" he..."
4686,is there a coated naproxen medication to preve...,all over-the-counter and prescription-strength...


### Inference

In [6]:
peft_model_path = "./llama32-sft-peft-kabatubare-phase3-distill-groundtruth/checkpoint-2000" # LoRA adapter checkpoint from fine-tuning

In [None]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = peft_model_path,
    max_seq_length = 2048,
    load_in_4bit = False, # set True for 4-bit quantization to reduce memory
    load_in_8bit = False, # set True for 8-bit quantization (a bit more accurate, uses 2x memory)
    full_finetuning = False, # a LoRA adapter is loaded, so full fine-tuning stays off
    dtype = None, # None auto-detects; can also be torch.bfloat16 or torch.float16
    device_map = "auto"
)

Inference is run one sample at a time. Batched inference did not work well with the fine-tuned adapter: it produced degenerate responses such as `P P P P`.

In [7]:
def get_llama_response_ft(question_input: str) -> str:
    """Generate the fine-tuned model's answer to a single question."""
    llama_input = [{"role": "system", "content": "You are a medical knowledge assistant trained to provide information and guidance on various health-related topics."},
                   {"role": "user", "content": question_input}]

    # Render the chat messages into the model's prompt format
    prompt = tokenizer.apply_chat_template(llama_input, tokenize=False, add_generation_prompt=True)

    inputs = tokenizer(prompt, padding=True, truncation=True, return_tensors="pt").to(model.device)
    # Decoded prompt text, used below to strip the prompt from the generated output
    temp_resp = tokenizer.decode(inputs['input_ids'][0], skip_special_tokens=True)

    outputs = model.generate(
        **inputs,
        max_new_tokens=2048,
        num_return_sequences=1
    )

    resp = tokenizer.decode(outputs[0], skip_special_tokens=True)
    resp = resp[len(temp_resp):] # keep only the assistant's response

    return resp
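
The function above strips the prompt by comparing decoded strings, which relies on the decoded output starting with exactly the decoded prompt. An alternative is to cut at the token level before decoding. The sketch below illustrates the idea on plain lists of token ids so it runs without a model; `completion_token_ids` is a hypothetical helper, not something the notebook defines.

```python
def completion_token_ids(output_ids, prompt_ids):
    """Return the generated token ids that follow the prompt.

    Hypothetical helper: assumes a decoder-only model whose output
    sequence begins with the prompt tokens, followed by new tokens.
    """
    prompt_len = len(prompt_ids)
    if list(output_ids[:prompt_len]) != list(prompt_ids):
        raise ValueError("output does not start with the prompt tokens")
    return list(output_ids[prompt_len:])


# Toy example: ids 1-3 are the prompt, 4-5 are newly generated
new_ids = completion_token_ids([1, 2, 3, 4, 5], [1, 2, 3])
```

With the tensors from the function above, the equivalent slice would be `outputs[0][inputs['input_ids'].shape[1]:]`, decoded with `tokenizer.decode(..., skip_special_tokens=True)`. The saved responses in this notebook were produced with the string-based version, so treat this as an option for future runs.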

In [8]:
# # Implementing the Unsloth Fast Inference
# FastLanguageModel.for_inference(model)

# llama_responses_distill_groundtruth_ft = []
# for index, row in tqdm(test_dataset.iterrows(), total=len(test_dataset)):
#     question_input = row['question']
#     llama_resp = get_llama_response_ft(question_input)
#     llama_responses_distill_groundtruth_ft.append(llama_resp)

# with open('phase3_kabatubare_medical/llama_responses_distill_groundtruth_ft.pkl', 'wb') as file:
#     pickle.dump(llama_responses_distill_groundtruth_ft, file)

In [9]:
with open('phase3_kabatubare_medical/llama_responses_distill_groundtruth_ft.pkl', 'rb') as file:
    llama_responses_distill_groundtruth_ft = pickle.load(file)
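
Because sample-by-sample generation over the whole test set is slow, it can help to save partial results periodically rather than only at the end. The following is a minimal sketch under stated assumptions: `generate_with_checkpoints` and its `save_every` parameter are hypothetical, and `generate_fn` stands in for `get_llama_response_ft`.

```python
import os
import pickle


def generate_with_checkpoints(questions, generate_fn, path, save_every=100):
    """Generate one response per question, resuming from `path` if it exists.

    Hypothetical helper: `generate_fn` maps a question string to a response
    string; partial results are pickled every `save_every` samples.
    """
    responses = []
    if os.path.exists(path):
        # Resume: reuse the responses generated by an earlier run
        with open(path, "rb") as fh:
            responses = pickle.load(fh)

    for i in range(len(responses), len(questions)):
        responses.append(generate_fn(questions[i]))
        # Persist periodically and once more after the final sample
        if (i + 1) % save_every == 0 or i + 1 == len(questions):
            with open(path, "wb") as fh:
                pickle.dump(responses, fh)
    return responses
```

If a run is interrupted, calling the function again with the same `path` continues from the last saved sample instead of starting over.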

### Adding the Fine-Tuned LLaMA Responses to the Complete DataFrame

In [10]:
test_dataset['llama_responses_distill_groundtruth_ft'] = llama_responses_distill_groundtruth_ft
test_dataset

Unnamed: 0,question,answer,llama_responses_distill_groundtruth_ft
0,i don't have periods due to taking nuvaring. h...,you really shouldn't worry about getting pregn...,hi if your not having periods you could be pre...
1,when you can't digest food. my daughter is 38 ...,i'm sorry your daughter is having a hard time....,i have had this problem for years and have bee...
2,8 wks post-hemorrhoidectomy. knife-like pain o...,i'm sorry you're going through this. hemorrhoi...,hi if your still having problems then your bes...
3,i sometimes feel a mild discomfort under my le...,after reading your full statement i would also...,hi sorry forgot to answer your question. it co...
4,i had sex about 12 hours ago and i'm noticing ...,hi ok if your shaved and this is on your pubic...,hi if your worried just go get it checked out ...
...,...,...,...
4683,bright blood in stool for a while now burning ...,hi but as your not really well now i still thi...,hi if you have been having these symptoms for ...
4684,i feel fine i'm in a great mood but food taste...,well i'm happy you're in a good mood anyway! i...,"it sounds like you're experiencing a case of ""..."
4685,i have recently had unprotected sex and both m...,"you had unprotected sex? it is your ""other"" he...",hi this is what happens when you get into sex ...
4686,is there a coated naproxen medication to preve...,all over-the-counter and prescription-strength...,yes. there are many different formulations of ...


### Calculating the BLEU Results for Phase 3

Fine-tuned LLaMA responses (`llama_responses_distill_groundtruth_ft`) scored against the ground-truth answers

In [11]:
bleu_eval = evaluate.load("bleu")
bleu_results = bleu_eval.compute(predictions=test_dataset['llama_responses_distill_groundtruth_ft'].to_list(), references=test_dataset['answer'].to_list())
bleu_results

{'bleu': 0.02003878102002895,
 'precisions': [0.10717995680139855,
  0.0247424530135614,
  0.010157962679026206,
  0.005985800702720325],
 'brevity_penalty': 1.0,
 'length_ratio': 1.9634869273342515,
 'translation_length': 1007903,
 'reference_length': 513323}
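
As a sanity check on the numbers above, the overall score can be recomputed from the reported components. This sketch assumes the standard BLEU definition with uniform weights over 1- to 4-grams, that is, the brevity penalty times the geometric mean of the n-gram precisions; the function name `bleu_from_components` is hypothetical.

```python
import math


def bleu_from_components(precisions, brevity_penalty):
    """Combine n-gram precisions into BLEU (uniform weights assumed)."""
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty * math.exp(log_mean)


# Values copied from the output above
precisions = [0.10717995680139855, 0.0247424530135614,
              0.010157962679026206, 0.005985800702720325]
bleu = bleu_from_components(precisions, brevity_penalty=1.0)

# Ratio of generated tokens to reference tokens
length_ratio = 1007903 / 513323
```

The length ratio of about 1.96 shows that the generated answers are almost twice as long as the references, which is why the brevity penalty is 1.0 and the low score comes entirely from the n-gram precisions.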

### Calculating the ROUGE Results for Phase 3

Fine-tuned LLaMA responses (`llama_responses_distill_groundtruth_ft`) scored against the ground-truth answers

In [12]:
rouge_eval = evaluate.load("rouge")
rouge_results = rouge_eval.compute(predictions=test_dataset['llama_responses_distill_groundtruth_ft'].to_list(), references=test_dataset['answer'].to_list())
rouge_results

{'rouge1': np.float64(0.21473686373113143),
 'rouge2': np.float64(0.05231480772730421),
 'rougeL': np.float64(0.14287341226798428),
 'rougeLsum': np.float64(0.14274116785127566)}
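
To make the `rouge1` figure more concrete, the sketch below computes a unigram-overlap F1 score for a single prediction/reference pair. It is a simplified illustration, not the `evaluate` implementation: it assumes lowercase whitespace tokenization with no stemming, and `rouge1_f1` is a hypothetical name.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Unigram-overlap F1 between two strings (simplified ROUGE-1 sketch)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as in both texts
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Two shared words out of 3 predicted and 4 reference words
score = rouge1_f1("the cat sat", "the cat ran fast")
```

Scores from this sketch will not match the library's values exactly, since the tokenization differs.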