<a href="https://colab.research.google.com/github/bananighosh/mediBuddy-deepseek-r1/blob/main/mediBuddy_fine_tuned_deepseek_r1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Requirements

In [1]:
%%capture
!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

In [2]:
from google.colab import userdata
from huggingface_hub import login
login(token=userdata.get('HF_TOKEN'))

In [3]:
from unsloth import FastLanguageModel

# Load the model

In [4]:
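max_seq_length = 2048   # maximum sequence length (in tokens) for loading and training
dtype = None            # None lets Unsloth choose float16 or bfloat16 based on the GPU
load_in_4bit = True     # load 4-bit quantized weights to reduce GPU memory use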
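
A rough back-of-the-envelope check of why 4-bit loading helps on this GPU. This is only a sketch: the parameter count of about 8 billion is an assumption taken from the model name, and the estimate covers weights only (no activations, KV cache, or optimizer state). The 14.741 GB figure is the Tesla T4 memory that Unsloth reports when the model loads.

```python
# Weights-only memory estimate.
# N_PARAMS is an ASSUMPTION (~8B, inferred from the model name), not a measured value.
N_PARAMS = 8.0e9
GPU_MEMORY_GB = 14.741  # Tesla T4 memory as reported in the Unsloth banner


def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Return the approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


fp16_gb = weight_memory_gb(N_PARAMS, 16)
int4_gb = weight_memory_gb(N_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gb:.1f} GB (fits on GPU: {fp16_gb < GPU_MEMORY_GB})")
print(f" 4-bit weights: ~{int4_gb:.1f} GB (fits on GPU: {int4_gb < GPU_MEMORY_GB})")
```

Real memory use will be higher than the 4-bit estimate, since it ignores everything except the weights themselves.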

In [5]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = userdata.get('HF_TOKEN'),
)

==((====))==  Unsloth 2025.1.8: Fast Llama patching. Transformers: 4.47.1.
   \\   /|    GPU: Tesla T4. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!





# Defining the Fine-tuned Model

In [7]:
model_lora = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config=None,
)

Unsloth 2025.1.8 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
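
To get a feel for how small the LoRA adapter is relative to the base model, the number of trainable parameters implied by `r = 16` and the seven target modules can be estimated by hand. This is a minimal sketch: the layer dimensions below are assumed values for a Llama-style 8B architecture and are not read from the notebook; only the rank, the module names, and the layer count (32) come from the cells above.

```python
# LoRA adds two matrices per adapted linear layer: A (r x d_in) and B (d_out x r),
# which gives r * (d_in + d_out) trainable parameters per layer.
# HIDDEN, INTERMEDIATE and KV_DIM are ASSUMED Llama-8B dimensions.
R = 16
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024      # assumed: 8 key/value heads x head dimension 128
NUM_LAYERS = 32    # matches the "patched 32 layers" message

# (d_in, d_out) for each target module
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, r: int = R) -> int:
    """Trainable parameters added by one LoRA adapter of rank r."""
    return r * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out) for d_in, d_out in SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total:,}")
```

Under these assumptions the adapter has about 42 million trainable parameters, well under 1% of an 8-billion-parameter base model.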


# Running Inference before Fine-tuning

In [31]:
prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question:
{}

### Response:
<think>{}"""

In [34]:
question = """A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing
              but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings,
              what would cystometry most likely reveal about her residual volume and detrusor contractions?"""




FastLanguageModel.for_inference(model)

inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids = inputs.input_ids,
    attention_mask = inputs.attention_mask,
    max_new_tokens = 1200,
    use_cache = True
)

response= tokenizer.batch_decode(outputs)


In [35]:
print(response[0].split("### Response:")[1])


<think>
Okay, so I'm trying to figure out what cystometry would show for this 61-year-old woman. She's been dealing with involuntary urine loss when she coughs or sneezes but doesn't leak at night. From what I know, this sounds like a possible case of stress urinary incontinence, especially since the leakage happens during activities that put pressure on the bladder.

First, I should think about what causes stress incontinence. Usually, it's related to the muscles supporting the bladder not being strong enough, often due to weakened pelvic muscles or issues with the urethral sphincter. But since she's been having these episodes during activities like coughing, it's probably not a case of urgency incontinence because she doesn't leak at night, which is more common with urgency.

Now, the Q-tip test was mentioned. I recall that the Q-tip test involves inserting a Q-tip catheter into the urethra and then having the patient cough. If the catheter doesn't stay inside after the cough, it su
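
Because the prompt ends with an open `<think>` tag, the decoded text mixes the chain of thought with the final answer. A small helper like the one below could separate the two. It is a sketch only: `split_reasoning` is a hypothetical name, it runs here on a made-up sample string rather than real model output, and it does not strip tokenizer special tokens.

```python
def split_reasoning(generated: str) -> tuple[str, str]:
    """Split decoded text into (reasoning, answer) using the <think> tags.

    Hypothetical helper: returns an empty answer when no closing </think>
    tag is present, e.g. when generation stopped at max_new_tokens.
    """
    # Keep only what follows the response header, if the header is present.
    body = generated.split("### Response:", 1)[-1]
    # Drop the opening tag that the prompt itself supplies.
    body = body.replace("<think>", "", 1)
    reasoning, closing_tag, answer = body.partition("</think>")
    if not closing_tag:
        return reasoning.strip(), ""
    return reasoning.strip(), answer.strip()


# Made-up sample text, not real model output.
sample = "### Question:\nQ?\n\n### Response:\n<think>\nStep one.\n</think>\nFinal answer."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```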

# Loading the Dataset

In [8]:
train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question:
{}

### Response:
<think>
{}
</think>
{}"""

In [15]:
EOS_TOKEN = tokenizer.eos_token

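def formatting_prompts_func(examples):
  # Build one training string per example: prompt + reasoning + answer + EOS.
  questions = examples["Question"]
  cots = examples["Complex_CoT"]
  responses = examples["Response"]
  texts = []
  # Loop names avoid shadowing the built-in input().
  for question, cot, response in zip(questions, cots, responses):
    # EOS_TOKEN marks where each training example ends.
    text = train_prompt_style.format(question, cot, response) + EOS_TOKEN
    texts.append(text)
  return {
      "text" : texts,
  }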
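
The formatting step can be tried without the tokenizer or the dataset. The sketch below uses stand-ins that are not part of the notebook: `EOS` is a placeholder for `tokenizer.eos_token`, `TEMPLATE` is a shortened version of the training prompt, and `toy_batch` is invented data shaped like a batched `datasets` example (a dict of column lists).

```python
# Stand-ins so the sketch runs on its own (all three are illustrative assumptions).
EOS = "<eos>"  # placeholder for tokenizer.eos_token
TEMPLATE = "### Question:\n{}\n\n### Response:\n<think>\n{}\n</think>\n{}"


def format_batch(examples: dict) -> dict:
    """Turn a batch of columns into a single 'text' column of training strings."""
    texts = [
        TEMPLATE.format(question, cot, response) + EOS
        for question, cot, response in zip(
            examples["Question"], examples["Complex_CoT"], examples["Response"]
        )
    ]
    return {"text": texts}


toy_batch = {
    "Question": ["What is 2 + 2?"],
    "Complex_CoT": ["Add the two numbers."],
    "Response": ["4"],
}
formatted = format_batch(toy_batch)
print(formatted["text"][0])
```

With `batched = True`, `dataset.map` passes each column as a list, which is why the function zips over the columns and returns a list under the new `"text"` key.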

In [21]:
from datasets import load_dataset
dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT",
    "en",
    split = "train[0:100]",
    trust_remote_code = True,
)
dataset_finetune = dataset.map(formatting_prompts_func, batched = True)

In [22]:
dataset_finetune[0]

{'Question': 'A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?',
 'Complex_CoT': "Okay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her abdominal pressure like coughing or sneezing. This sounds a lot like stress urinary incontinence to me. Now, it's interesting that she doesn't have any issues at night; she isn't experiencing leakage while sleeping. This likely means her bladder's ability to hold urine is fine when she isn't under physical stress. Hmm, that's a clue that we're dealing with something related to pressure rather than a bladder muscle problem. \n\nThe fact that she underwent a Q-tip test is intriguing too. This 

In [24]:
dataset_finetune["text"][0]

"Below is an instruction that describes a task, paired with an input that provides further context.\nWrite a response that appropriately completes the request.\nBefore answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n### Instruction:\nYou are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.\nPlease answer the following medical question.\n\n### Question:\nA 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?\n\n### Response:\n<think>\nOkay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her abdomi

# Setting the Training Arguments

In [20]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

In [29]:
trainer = SFTTrainer(
  model_lora,
  tokenizer = tokenizer,
  train_dataset = dataset_finetune,
  dataset_text_field = "text",
  max_seq_length = max_seq_length,
  dataset_num_proc = 2,


  args = TrainingArguments(
      per_device_train_batch_size = 2,
      gradient_accumulation_steps = 4,
      num_train_epochs = 1,  # overridden by max_steps below
      warmup_steps = 5,
      max_steps = 60,        # a positive max_steps takes precedence over num_train_epochs
      learning_rate = 2e-4,
      fp16=not is_bfloat16_supported(),
      bf16 = is_bfloat16_supported(),
      logging_steps = 1,
      optim="adamw_8bit",
      weight_decay=0.01,
      lr_scheduler_type="linear",
      seed=3407,
      output_dir="outputs",
  )
)

Map (num_proc=2):   0%|          | 0/100 [00:00<?, ? examples/s]
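
The batch settings above determine how much data each optimizer step sees. A quick arithmetic sketch, assuming a single GPU (`NUM_DEVICES = 1` is an assumption; the other values are copied from the training arguments and the 100-example split):

```python
PER_DEVICE_BATCH = 2   # per_device_train_batch_size
GRAD_ACCUM = 4         # gradient_accumulation_steps
MAX_STEPS = 60         # max_steps
NUM_EXAMPLES = 100     # size of the train[0:100] split
NUM_DEVICES = 1        # ASSUMPTION: one GPU

# Examples contributing to each optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
# Total examples processed over the whole run.
examples_seen = effective_batch * MAX_STEPS
# Approximate number of passes over the dataset.
passes = examples_seen / NUM_EXAMPLES

print(f"Effective batch size: {effective_batch}")
print(f"Examples processed:   {examples_seen}")
print(f"Approximate passes over the data: {passes:.1f}")
```

Under these settings the run covers the 100 examples several times, so `max_steps`, not `num_train_epochs`, decides how long training lasts.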

In [None]:
trainer_stats = trainer.train()

In [None]:
import wandb  # not imported earlier in the notebook; required before calling wandb.finish()
wandb.finish()

# Running Inference after Fine-tuning

In [30]:
question = """A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing
              but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings,
              what would cystometry most likely reveal about her residual volume and detrusor contractions?"""




FastLanguageModel.for_inference(model_lora)

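# Use prompt_style (defined in the pre-fine-tuning inference cell), which ends with an open <think> tag.
# train_prompt_style has three placeholders and cannot be formatted with only two arguments.
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model_lora.generate(
    input_ids = inputs.input_ids,
    attention_mask = inputs.attention_mask,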
    max_new_tokens = 1200,
    use_cache = True
)

response= tokenizer.batch_decode(outputs)

print(response[0].split("### Response:")[1])

NameError: name 'prompt_style' is not defined
