In [4]:
!pip install unsloth
# unsloth provides efficient finetuning and inference optimizations for LLMs

Collecting git+https://github.com/unslothai/unsloth.git
  Cloning https://github.com/unslothai/unsloth.git to /tmp/pip-req-build-h333bf0e
  Running command git clone --filter=blob:none --quiet https://github.com/unslothai/unsloth.git /tmp/pip-req-build-h333bf0e
  Resolved https://github.com/unslothai/unsloth.git to commit 038e6d4c8d40207a87297ab3aaf787c19b1006d1
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: unsloth
  Building wheel for unsloth (pyproject.toml) ... done
  Created wheel for unsloth: filename=unsloth-2025.1.8-py3-none-any.whl size=174982 sha256=ef3b042b9050e416065e492d2bb3a9a252cb705dd66f0be0d4f99556db9268ea
  Stored in directory: /tmp/pip-ephem-wheel-cache-amr7pe1m/wheels/d1/17/05/850ab10c33284a4763b0595cd8ea9d01fce6e221cac24b3c01
Successfully built unsloth
Installing collected packages: unsloth
 

In [5]:
# Library imports
from unsloth import FastLanguageModel
from unsloth import is_bfloat16_supported
from trl import SFTTrainer

from huggingface_hub import login
from transformers import TrainingArguments
from datasets import load_dataset

import torch

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


In [None]:
# hf setup
import os
HF_TOKEN = os.environ["HF_TOKEN"]  # read the token from the environment; never hard-code secrets
login(HF_TOKEN)

In [8]:
# wandb setup
import os
import wandb
WNB_API_KEY = os.environ["WANDB_API_KEY"]  # read the key from the environment; never hard-code secrets
wandb.login(key=WNB_API_KEY)
run = wandb.init(
    project='DeepSeek-finetune',
    job_type='training',
    anonymous='allow'
)

[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Currently logged in as: [33mtanushtm-work[0m ([33mtanushtm-work-srm-easwari-engineering-college[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


In [9]:
# fetch model from unsloth
max_seq_length = 2048
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = HF_TOKEN
)

==((====))==  Unsloth 2025.1.8: Fast Llama patching. Transformers: 4.47.1.
   \\   /|    GPU: Tesla T4. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
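
A rough back-of-the-envelope check of why `load_in_4bit = True` matters on this GPU. This is only a sketch: the 8e9 parameter count is a round-number assumption, it counts weights only (no activations, KV cache, or quantization overhead), and it treats the banner's "Max memory" figure as plain GB.

```python
# Rough weight-memory estimate. N_PARAMS is a round-number assumption,
# not a value reported anywhere in this notebook.
N_PARAMS = 8e9
GPU_MEMORY_GB = 14.741  # "Max memory" from the Unsloth banner above


def weight_memory_gb(n_params, bits_per_param):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


fp16_gb = weight_memory_gb(N_PARAMS, 16)
int4_gb = weight_memory_gb(N_PARAMS, 4)

print(f"fp16 weights : {fp16_gb:.1f} GB (fits: {fp16_gb < GPU_MEMORY_GB})")
print(f"4-bit weights: {int4_gb:.1f} GB (fits: {int4_gb < GPU_MEMORY_GB})")
```

Under these assumptions the half-precision weights alone would exceed the T4's memory, while the 4-bit weights leave room for adapters and activations.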


In [10]:
# Prompt template - inference only
prompt_style = """
  Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
  Before answering, think carefully about the question and create a step by step chain of thoughts to ensure a logical and accurate response.

  ### Instruction: You are a medical expert with advanced knowledge in clinical reasoning, diagnostics and treatment planning. Answer the following medical question.

  ### Question: {}

  ### Response: <think> {}

"""
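
To make the two `{}` slots concrete, here is a minimal sketch of how the template is filled at inference time. `demo_template` is a shortened stand-in for `prompt_style`, and `demo_question` is a hypothetical question; neither comes from the notebook.

```python
# Shortened stand-in for prompt_style; only the two {} slots matter here.
demo_template = "### Question: {}\n\n### Response: <think> {}"

demo_question = "What does a Q-tip test assess?"  # hypothetical question

# At inference time the second slot is left empty, so the prompt ends right
# after the opening <think> tag and the model continues from there.
prompt = demo_template.format(demo_question, "")
print(prompt)
```

Leaving the second slot empty is what lets the same template serve both as a reasoning prefix here and, with extra slots, as a full training example later.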

In [11]:
# Model inference before finetuning
question = """
    A 60-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and a Q-tip test.
    Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?
"""

FastLanguageModel.for_inference(model)

inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)

response = tokenizer.batch_decode(outputs)

print(response[0].split("### Response:")[1])


 <think> 

Alright, let's try to work through this medical question. So, the patient is a 60-year-old woman with a history of involuntary urine loss during activities like coughing or sneezing, but she doesn't leak at night. She undergoes a gynecological exam and a Q-tip test. The question is asking what cystometry would most likely reveal about her residual volume and detrusor contractions.

First, I need to understand the key points here. Involuntary urine loss during activities like coughing or sneezing suggests a possible issue with the lower urinary tract, specifically the bladder. Since she doesn't leak at night, that might indicate that her bladder capacity is good when it's relaxed, but perhaps it doesn't hold up during activities that increase intra-abdominal pressure, like coughing or sneezing.

Next, she underwent a gynecological exam and a Q-tip test. I think the Q-tip test is used to assess urethral function, especially in cases of incontinence. The Q-tip is a small cathet

In [12]:
# Prompt template - for finetuning
train_prompt_style = """
Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction: You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question: {}

### Response: <think> {} </think> {}
"""

In [13]:
# Load dataset from hf
dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT",
    "en",
    split = "train[0:500]",
    trust_remote_code=True)
dataset

Dataset({
    features: ['Question', 'Complex_CoT', 'Response'],
    num_rows: 500
})

In [14]:
dataset[5]

{'Question': 'Scientists are developing a new non-steroidal anti-inflammatory drug for osteoarthritis, aiming for higher potency but the same efficacy as ibuprofen to reduce gastrointestinal side effects. If ibuprofen is represented by curve C in the figure and the desired therapeutic effect is marked by the dashed line Y, what curve would represent the new drug that fulfills these criteria?',
 'Complex_CoT': "Okay, so I need to figure out the characteristics of this new drug compared to ibuprofen. Let's see, we're looking at two main things here: potency and efficacy. If I remember correctly, potency is about how much of a drug we need to get a particular effect. So if a drug is more potent, we need less of it to get the job done. Efficacy, on the other hand, is about how good the drug is at its best; it's the maximum effect it can achieve.\n\nNow, for this new drug, they want it to be more potent than ibuprofen. That means it should work at a lower dose. But they still want it to be 

In [15]:
# 500 data samples
len(dataset)

500

In [16]:
EOS_TOKEN = tokenizer.eos_token
EOS_TOKEN

'<｜end▁of▁sentence｜>'

In [17]:
# formatting dataset to the prompt
def formatting_prompts_function(examples):
    inputs = examples["Question"]
    cots = examples["Complex_CoT"]
    outputs = examples["Response"]

    texts = []

    # walk the batch row by row; names chosen to avoid shadowing the built-in input()
    for question, cot, response in zip(inputs, cots, outputs):
        text = train_prompt_style.format(question, cot, response) + EOS_TOKEN
        texts.append(text)

    return {
        "text": texts,
    }
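
The batched formatting step can be tried without loading the model or dataset. The sketch below mirrors the function above on a toy batch; `DEMO_TEMPLATE`, `DEMO_EOS`, and `toy_batch` are placeholders introduced for illustration, not values from the notebook.

```python
# Stand-ins so the sketch runs without the model or dataset loaded.
DEMO_TEMPLATE = "### Question: {}\n\n### Response: <think> {} </think> {}"
DEMO_EOS = "<eos>"  # placeholder; the notebook uses tokenizer.eos_token


def format_batch(examples):
    """Mirror of formatting_prompts_function for a column-oriented batch."""
    texts = [
        DEMO_TEMPLATE.format(q, cot, ans) + DEMO_EOS
        for q, cot, ans in zip(
            examples["Question"], examples["Complex_CoT"], examples["Response"]
        )
    ]
    return {"text": texts}


# With batched=True, Dataset.map passes a dict of equal-length lists like this.
toy_batch = {
    "Question": ["Q1", "Q2"],
    "Complex_CoT": ["reasoning 1", "reasoning 2"],
    "Response": ["A1", "A2"],
}
formatted = format_batch(toy_batch)
print(formatted["text"][0])
```

Appending the EOS token to every example is what teaches the model to stop after the final answer instead of generating indefinitely.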

In [19]:
dataset_finetune = dataset.map(formatting_prompts_function, batched = True)
dataset_finetune["text"][5]

Map:   0%|          | 0/500 [00:00<?, ? examples/s]

"\nBelow is an instruction that describes a task, paired with an input that provides further context.\nWrite a response that appropriately completes the request.\nBefore answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n### Instruction: You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.\nPlease answer the following medical question.\n\n### Question: Scientists are developing a new non-steroidal anti-inflammatory drug for osteoarthritis, aiming for higher potency but the same efficacy as ibuprofen to reduce gastrointestinal side effects. If ibuprofen is represented by curve C in the figure and the desired therapeutic effect is marked by the dashed line Y, what curve would represent the new drug that fulfills these criteria?\n\n### Response: <think> Okay, so I need to figure out the characteristics of this new drug compared to ibuprofen. Let's se

In [20]:
# load base model with lora for peft
model_lora = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank: inner dimension of the low-rank adapter matrices
    target_modules=[  # layers to apply lora to
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,  # scaling factor: adapter updates are scaled by lora_alpha / r
    lora_dropout=0,  # dropout rate for lora adapters (0 means no dropout)
    bias="none",  # no additional bias parameters are trained
    use_gradient_checkpointing="unsloth",  # saves GPU memory, reduces training speed
    random_state=3407,  # random seed
    use_rslora=False,  # rank-stabilized LoRA is disabled
    loftq_config=None,
)

Unsloth 2025.1.8 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


In [21]:
# initialize SFTTrainer from trl
trainer = SFTTrainer(
    model=model_lora,  # use lora model defined above
    tokenizer=tokenizer,
    train_dataset=dataset_finetune,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,

    # training arguments
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,  # ignored here: a positive max_steps takes precedence
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=5,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)

Map (num_proc=2):   0%|          | 0/500 [00:00<?, ? examples/s]
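
To see what `warmup_steps=5`, `max_steps=60`, and `lr_scheduler_type="linear"` imply for the learning rate, here is a minimal sketch of the textbook linear warmup and linear decay rule. It is written from scratch for illustration and is not the library's own code; `linear_schedule_lr` is a hypothetical helper name.

```python
def linear_schedule_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=60):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Print the learning rate at a few points of the 60-step run.
for step in (0, 1, 5, 30, 60):
    print(f"step {step:2d}: lr = {linear_schedule_lr(step):.2e}")
```

Under this rule the rate peaks at step 5 and reaches zero at step 60, which is consistent with the final `train/learning_rate` of 0.0 in the run summary.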

In [22]:
# start model training
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 500 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040


Step,Training Loss
5,2.1745
10,1.7847
15,1.5232
20,1.4346
25,1.4174
30,1.3933
35,1.3242
40,1.3058
45,1.3473
50,1.3547
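
The banner's trainable-parameter count can be checked by hand. Each LoRA adapter on a linear layer with `in` inputs and `out` outputs adds `r * (in + out)` parameters. The layer shapes below are assumptions taken from the published Llama 3.1 8B architecture that this distilled model is based on; they are not printed anywhere in this notebook.

```python
# Assumed layer dimensions (Llama 3.1 8B architecture); not from the notebook.
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024  # 8 key/value heads x 128 head dimension
NUM_LAYERS = 32
RANK = 16  # r from get_peft_model above

# (in_features, out_features) of each targeted projection
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# A rank-r adapter holds an (r x in) matrix and an (out x r) matrix.
per_layer = sum(RANK * (n_in + n_out) for n_in, n_out in shapes.values())
total = per_layer * NUM_LAYERS
print(f"{total:,}")
```

With these shapes the total comes to 41,943,040, matching the "Number of trainable parameters" line in the training banner.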


In [23]:
# model train stats
wandb.finish()

Run history:
train/epoch,▁▂▂▃▄▄▅▅▆▇▇██
train/global_step,▁▂▂▃▄▄▅▅▆▇▇██
train/grad_norm,█▆▂▂▂▁▁▁▂▁▁▂
train/learning_rate,█▇▇▆▅▅▄▄▃▂▂▁
train/loss,█▅▃▂▂▂▁▁▁▁▁▁

Run summary:
total_flos,1.7950738039996416e+16
train/epoch,0.96
train/global_step,60.0
train/grad_norm,0.27243
train/learning_rate,0.0
train/loss,1.3337
train_loss,1.47516
train_runtime,1263.7152
train_samples_per_second,0.38
train_steps_per_second,0.047
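
The summary figures follow from the training arguments with simple arithmetic. A short sketch, using only numbers that appear in the cells and outputs above:

```python
per_device_batch = 2
grad_accum = 4
max_steps = 60
num_examples = 500
train_runtime_s = 1263.7152  # train_runtime from the summary above

effective_batch = per_device_batch * grad_accum  # samples per optimizer step
samples_seen = effective_batch * max_steps       # samples consumed in 60 steps
epochs = samples_seen / num_examples             # fraction of the dataset seen

samples_per_s = samples_seen / train_runtime_s
steps_per_s = max_steps / train_runtime_s

print(f"effective batch size: {effective_batch}")
print(f"epochs completed    : {epochs:.2f}")
print(f"samples per second  : {samples_per_s:.2f}")
print(f"steps per second    : {steps_per_s:.3f}")
```

The run stops at 0.96 epochs because 60 steps of 8 samples cover 480 of the 500 examples, so `max_steps` ends training slightly before a full pass.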


In [24]:
# model inference after finetuning
question = """A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing
              but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings,
              what would cystometry most likely reveal about her residual volume and detrusor contractions?"""

FastLanguageModel.for_inference(model_lora)  # Unsloth has 2x faster inference!

inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model_lora.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)

response = tokenizer.batch_decode(outputs)

print(response[0].split("### Response:")[1])

 <think> 

Alright, let's think through this. This woman is 61 and has been dealing with involuntary urine loss for a while. She's coughing or sneezing, and that's causing her to lose urine involuntarily. But, interestingly, she doesn't leak at night, which is a bit unusual. 

When we look at the Q-tip test, it's designed to help figure out how much urine remains in her bladder after she coughs or sneezes. The Q-tip is placed in her urethra and then pulled out while she coughs. If it gets wet, that suggests she's leaking urine involuntarily. 

In her case, the test shows no wetting, which suggests she's not leaking urine. That's interesting because we'd expect some leakage if she's involuntarily urinating. Now, this makes me wonder: is there something else going on here? 

Now, let's consider what cystometry would reveal. Cystometry is a test where we fill the bladder and then empty it to see how much urine is left. It's usually used to measure residual volume and check for contraction
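
Since the fine-tuning template wraps the reasoning in `<think> ... </think>` before the final answer, the decoded output can be split into those two parts. The helper below is a hypothetical sketch (`split_reasoning` is not part of any library), and the `demo` string is made up for illustration.

```python
def split_reasoning(generated, eos_token="<｜end▁of▁sentence｜>"):
    """Split decoded text into (reasoning, answer) around the </think> tag.

    If the output was cut off before </think>, the answer is returned empty.
    """
    # Keep only the text after the response marker and drop the EOS token.
    body = generated.split("### Response:", 1)[-1].replace(eos_token, "")
    reasoning, sep, answer = body.partition("</think>")
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, (answer.strip() if sep else "")


demo = (
    "### Question: Q\n\n"
    "### Response: <think> step one </think> final answer<｜end▁of▁sentence｜>"
)
reasoning, answer = split_reasoning(demo)
print("reasoning:", reasoning)
print("answer   :", answer)
```

Returning an empty answer for truncated generations makes it easy to spot outputs that hit `max_new_tokens` before the model finished reasoning.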