### LLaMA Supervised Fine-Tuning

This notebook takes GPT-4o's answers on the Kabatubare Medical Dataset and fine-tunes the LLaMA model on those answers.

The goal is to test whether fine-tuning LLaMA on these answers can distill GPT-4o's knowledge and improve performance on open-ended healthcare question answering.

In [1]:
import os

In [2]:
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

In [3]:
import pandas as pd
import json
import torch
import pickle
from unsloth import FastLanguageModel, is_bfloat16_supported, train_on_responses_only
from transformers import TrainingArguments, DataCollatorForSeq2Seq
from datasets import Dataset, DatasetDict
from trl import SFTTrainer

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.




🦥 Unsloth Zoo will now patch everything to make training faster!


#### Reading the Question and Answer Pairs from Phase 1 of GPT-4o

In [10]:
gpt_inf_data_phase1 = pd.DataFrame()
ques_list = []
gpt_resp_list = []

with open('phase1_kabatubare_medical/kabatubare_medical_gpt4omini_qa_pairs.jsonl', 'rb') as file:
    for line in file:
        json_object = json.loads(line)
        ques_list.append(json_object['Question'])
        gpt_resp_list.append(json_object['Answer'])

gpt_inf_data_phase1['question'] = ques_list
gpt_inf_data_phase1['gpt_response_base'] = gpt_resp_list
gpt_inf_data_phase1

| | question | gpt_response_base |
|---|---|---|
| 0 | my 5 1/2-year-old son displays adhd symptoms f... | It’s important to remember that only a qualifi... |
| 1 | my son has add and mild autism. he has been su... | Weight management can be a concern for childre... |
| 2 | my son is 13 and is depressed. he has been tak... | I'm really sorry to hear that your son is feel... |
| 3 | my 17-year-old has stopped taking concerta aft... | When a person, especially a teenager, stops ta... |
| 4 | i've been taking respa-ar for allergies. i can... | Resp-A-R is a combination medication commonly ... |
| ... | ... | ... |
| 23432 | how can accidental of acetaminophen overdose b... | Accidental acetaminophen overdose is a signifi... |
| 23433 | what should i do if i take an overdose of maxalt? | If you suspect that you have taken an overdose... |
| 23434 | what do i do in case of an overdose of relpax? | If you suspect an overdose of Relpax (eletript... |
| 23435 | is overdose with acetaminophen usually acciden... | Overdoses of acetaminophen (also known as para... |
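The loading loop above assumes that every line is valid JSON and carries both keys. A minimal sketch of a more defensive reader follows, using only the standard library. The helper name `read_qa_pairs` and the in-memory sample are illustrative and not part of the original notebook.

```python
import io
import json

def read_qa_pairs(lines):
    """Yield (question, answer) tuples from an iterable of JSONL lines.

    Blank lines and records missing either key are skipped. This is an
    assumption made for the sketch; the original loop would raise instead.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        record = json.loads(raw)
        if "Question" in record and "Answer" in record:
            yield record["Question"], record["Answer"]

# Hypothetical sample standing in for the real .jsonl file
sample = io.StringIO(
    '{"Question": "q1", "Answer": "a1"}\n'
    "\n"
    '{"Question": "q2", "Answer": "a2"}\n'
)
pairs = list(read_qa_pairs(sample))
```

The resulting list can be turned into the same frame with `pd.DataFrame(pairs, columns=["question", "gpt_response_base"])`.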


Create the Hugging Face `Dataset` from the pandas DataFrame

In [5]:
dataset = Dataset.from_pandas(gpt_inf_data_phase1)
dataset = dataset.train_test_split(test_size=0.1)
dataset

DatasetDict({
    train: Dataset({
        features: ['question', 'gpt_response'],
        num_rows: 21093
    })
    test: Dataset({
        features: ['question', 'gpt_response'],
        num_rows: 2344
    })
})
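The split above is unseeded, so re-running the cell after a kernel restart yields a different train/test partition. `train_test_split` accepts a `seed` argument that makes it reproducible. The sketch below reproduces the row counts in plain Python, assuming the test size is rounded up (which matches the 21093/2344 output shown). `seeded_split` is an illustrative helper, not a `datasets` API.

```python
import math
import random

def seeded_split(n_rows, test_size, seed):
    """Return (train_indices, test_indices) for a reproducible random split."""
    n_test = math.ceil(n_rows * test_size)  # assumed rounding: test size rounded up
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)    # same seed -> same permutation
    return indices[n_test:], indices[:n_test]

n_rows = 21093 + 2344                       # total rows reported by the split output
train_idx, test_idx = seeded_split(n_rows, test_size=0.1, seed=42)
print(len(train_idx), len(test_idx))
```

Fixing the seed matters here because the inference section reads `dataset['test']` again; with an unseeded split, a fresh session could place training questions in the evaluation set.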

### Fine-Tuning Code

#### Loading the model and tokenizer

In [None]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-3B-Instruct",
    max_seq_length = 4096,
    load_in_4bit = False, # set True for 4-bit quantization (lowest memory use)
    load_in_8bit = True, # 8-bit quantization: a bit more accurate than 4-bit, uses about 2x its memory
    full_finetuning = False, # keep False for LoRA; True would train all model weights
    dtype=None, #None for auto-detection. Can be torch.bfloat16 or torch.float16 (will be automatically detected)
    device_map="auto"
)

#### Setting up the PEFT settings for the model

https://huggingface.co/blog/damjan-k/rslora\
https://medium.com/@fartypantsham/what-rank-r-and-alpha-to-use-in-lora-in-llm-1b4f025fd133

In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 64, #max_full_rank=64 by default in FastLanguageModel
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 64, # with use_rslora=True the adapter scaling is lora_alpha / sqrt(r) = 64 / 8 = 8; plain LoRA would use lora_alpha / r = 1
    lora_dropout = 0.1,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    use_rslora = True,
    loftq_config = None,
)
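To make the effect of `use_rslora` concrete, the sketch below computes the factor by which the adapter update is scaled under plain LoRA (`lora_alpha / r`) and under rank-stabilized LoRA (`lora_alpha / sqrt(r)`). The helper `lora_scaling` is illustrative only and is not a PEFT or Unsloth function.

```python
import math

def lora_scaling(lora_alpha, r, use_rslora=False):
    """Scaling factor applied to the low-rank update B @ A."""
    if use_rslora:
        return lora_alpha / math.sqrt(r)  # rank-stabilized LoRA
    return lora_alpha / r                 # plain LoRA

plain = lora_scaling(lora_alpha=64, r=64)
rank_stabilized = lora_scaling(lora_alpha=64, r=64, use_rslora=True)
print(plain, rank_stabilized)
```

With `r = 64` and `lora_alpha = 64`, the rank-stabilized factor is eight times the plain one, so a learning rate tuned for plain LoRA may need revisiting.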

#### Forming the chat template

In [None]:
# Define a function to apply the chat template
def format_chat_template(example):
        
    messages = [
        {"role": "system", "content": "You are a medical knowledge assistant trained to provide information and guidance on various health-related topics."},
        {"role": "user", "content": example['question']},
        {"role": "assistant", "content": example['gpt_response_base']}
    ]
    
    prompt = tokenizer.apply_chat_template(messages, tokenize=False)

    return {"text": prompt}

In [None]:
dataset_formatted = dataset.map(format_chat_template)

In [None]:
print(dataset_formatted['train']['text'][0])
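For orientation, the text that `apply_chat_template` produces for Llama 3 style models looks roughly like the output of the hand-written formatter below. This is a simplified approximation that assumes the standard Llama 3 header tokens. The real template lives in the tokenizer and may add extra lines to the system block, so `render_llama3_chat` is illustrative only.

```python
def render_llama3_chat(messages, add_generation_prompt=False):
    """Approximate the Llama 3 chat layout for a list of role/content dicts."""
    text = "<|begin_of_text|>"
    for message in messages:
        text += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its answer
        text += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return text

example = render_llama3_chat(
    [
        {"role": "system", "content": "You are a medical knowledge assistant."},
        {"role": "user", "content": "what is a migraine?"},
        {"role": "assistant", "content": "A migraine is a type of headache."},
    ]
)
print(example)
```

The assistant header string in this layout is the same marker that is later passed as `response_part` to `train_on_responses_only`.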

#### Initializing the TRL SFTTrainer and related Arguments

In [None]:
# full_model_path = "./llama32-sft-full-kabatubare" #use for full finetuning
peft_model_path = "./llama32-sft-peft-kabatubare" #use for LoRA based fine-tuning

training_args = TrainingArguments(
        output_dir=peft_model_path,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        # gradient_accumulation_steps=4,
        eval_strategy="steps",
        eval_steps=50,
        logging_strategy="steps",
        logging_steps=50,
        save_strategy="steps",
        save_steps=1000,
        warmup_steps = 5,
        num_train_epochs = 3,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        load_best_model_at_end=True,
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        seed = 42,
        report_to = "none",
    )

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset=dataset_formatted["train"],
    eval_dataset=dataset_formatted["test"],
    dataset_text_field = "text",
    max_seq_length = 4096,
    data_collator = DataCollatorForSeq2Seq(tokenizer = tokenizer), #only use when using train_on_responses_only()
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = training_args)
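A quick back-of-the-envelope check shows what these arguments imply for run length. The sketch assumes a single GPU, no gradient accumulation (it is commented out above), and the 21093 training rows reported by the split. `training_schedule` is a hypothetical helper, not part of `transformers`.

```python
import math

def training_schedule(n_train, batch_size, epochs, eval_steps, save_steps):
    """Estimate optimizer steps, evaluations, and periodic checkpoints."""
    # With load_best_model_at_end=True, save_steps must be a multiple of eval_steps
    if save_steps % eval_steps != 0:
        raise ValueError("save_steps must be a round multiple of eval_steps")
    steps_per_epoch = math.ceil(n_train / batch_size)
    total_steps = steps_per_epoch * epochs
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "evaluations": total_steps // eval_steps,
        "periodic_checkpoints": total_steps // save_steps,
    }

schedule = training_schedule(
    n_train=21093, batch_size=8, epochs=3, eval_steps=50, save_steps=1000
)
print(schedule)
```

Under these assumptions, `warmup_steps = 5` covers only a tiny fraction of the run, and each evaluation pass goes over all 2344 test rows, which can add noticeably to wall-clock time.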

In [None]:
trainer.train_dataset

In [None]:
print(tokenizer.decode(trainer.train_dataset['input_ids'][0]))

#### Compute the loss only on the response part

In [None]:
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<|start_header_id|>user<|end_header_id|>\n\n",
    response_part = "<|start_header_id|>assistant<|end_header_id|>\n\n",
)

In [None]:
trainer.train_dataset

In [None]:
# The labels keep only the response tokens. Every prompt (system/user) position is set to -100 so it is ignored in the loss; padding added later by the collator is also labelled -100
trainer.train_dataset['labels'][0]
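The masking idea can be illustrated without any library: find the token ids of the assistant header inside `input_ids` and set every label up to and including it to -100. This is a single-turn sketch with made-up token ids. Unsloth's actual `train_on_responses_only` handles multi-turn conversations and differs in detail, so `mask_prompt_tokens` is illustrative only.

```python
IGNORE_INDEX = -100  # label value skipped by the cross-entropy loss

def find_subsequence(sequence, pattern):
    """Return the start index of the first occurrence of pattern, or -1."""
    for start in range(len(sequence) - len(pattern) + 1):
        if sequence[start:start + len(pattern)] == pattern:
            return start
    return -1

def mask_prompt_tokens(input_ids, response_marker):
    """Copy input_ids into labels, masking everything before the response."""
    start = find_subsequence(input_ids, response_marker)
    if start == -1:
        # No response found: nothing contributes to the loss
        return [IGNORE_INDEX] * len(input_ids)
    cut = start + len(response_marker)
    return [IGNORE_INDEX] * cut + input_ids[cut:]

# Made-up ids: [prompt tokens..., assistant header (90, 91), response tokens...]
input_ids = [1, 10, 11, 12, 90, 91, 20, 21, 2]
labels = mask_prompt_tokens(input_ids, response_marker=[90, 91])
print(labels)
```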

#### Train the model

In [None]:
trainer_stats = trainer.train()

#### Saving the model and tokenizer

Just save the LoRA Adapters without merging with base model

In [None]:
peft_model_path = "./llama32-sft-peft-kabatubare" #use for LoRA based fine-tuning

# Save only the LoRA adapter weights and the tokenizer files
model.save_pretrained(peft_model_path)
tokenizer.save_pretrained(peft_model_path)
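Because only the adapters are saved, the checkpoint size scales with the number of LoRA parameters rather than with the base model. The estimate below counts `r * (d_in + d_out)` parameters per adapted projection. The layer dimensions are assumptions about Llama-3.2-3B that are not stated in this notebook, so the result should be checked against the trainable-parameter count Unsloth prints at training time.

```python
def lora_param_count(r, shapes, n_layers):
    """Each adapted linear layer adds A (r x d_in) and B (d_out x r)."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * n_layers

# ASSUMED model dimensions (hypothetical until verified against the model config)
hidden, intermediate, kv_dim, n_layers = 3072, 8192, 1024, 28

shapes = {  # module name -> (input features, output features)
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

n_params = lora_param_count(r=64, shapes=shapes, n_layers=n_layers)
size_mb_fp32 = n_params * 4 / 1e6  # 4 bytes per parameter if stored in float32
print(n_params, round(size_mb_fp32))
```

Under these assumptions the adapter holds on the order of 10^8 parameters, so the saved directory is a few hundred megabytes rather than several gigabytes.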

### Inference

In [6]:
# full_model_path = "./llama32-sft-full-kabatubare"
peft_model_path = "./llama32-sft-peft-kabatubare" #use for LoRA based fine-tuning

In [7]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = peft_model_path,
    max_seq_length = 4096,
    load_in_4bit = False, # set True for 4-bit quantization (lowest memory use)
    load_in_8bit = False, # set True for 8-bit quantization (more accurate than 4-bit, about 2x its memory)
    full_finetuning = False, # keep False when loading LoRA adapters
    dtype=None, #None for auto-detection. Can be torch.bfloat16 or torch.float16 (will be automatically detected)
    device_map="auto"
)

==((====))==  Unsloth 2025.3.19: Fast Llama patching. Transformers: 4.50.2.
   \\   /|    NVIDIA RTX A6000. Num GPUs = 1. Max memory: 47.413 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.4.1+cu124. CUDA: 8.6. CUDA Toolkit: 12.4. Triton: 3.0.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.28.post1. FA2 = True]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Unsloth 2025.3.19 patched 28 layers with 0 QKV layers, 0 O layers and 0 MLP layers.


In [8]:
dataset['test']

Dataset({
    features: ['question', 'gpt_response'],
    num_rows: 2344
})

In [None]:
FastLanguageModel.for_inference(model)

for idx in range(1,50):

    print(dataset['test']['question'][idx])

    messages = [{"role": "system", "content": "You are a medical knowledge assistant trained to provide information and guidance on various health-related topics."},
                {"role": "user", "content": dataset['test']['question'][idx]}]

    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

    inputs = tokenizer(prompt, return_tensors='pt', padding=True, truncation=True).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=4096, num_return_sequences=1)

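    # Decode only the newly generated tokens. Splitting the full text on "assistant" is fragile
    # because the system prompt itself contains that word.
    text = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

    print(text)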

    print('---------------------------------------------------')

can you get shingles only in the pubic area?


Yes, shingles (herpes zoster) can occur in the pubic area. While shingles is most commonly seen on one side of the body, it can affect any area, including the pubic region. The rash typically appears as a band or patch on the affected side and can be accompanied by pain, itching, or burning. If you suspect you have shingles, it's important to consult a healthcare professional for an accurate diagnosis and appropriate treatment.
---------------------------------------------------
can type 1 diabetics gain weight using yeast pills? . i am a maie type 1 diabetic. i am 5'4 in height and weigh 108lbs. i am trying to gain weight and my doctor says that my metabolism burns up everything i consume quickly. i was told that by taking yeast pills that i may be able to gain some weight. is this possible and what are the risks. i want to know what my weight should be and how i can attain that.


Gaining weight as a type 1 diabetic can be challenging, e