<a href="https://colab.research.google.com/github/IEncryptSaad/LLAMA3-MEDICAL-COT-FINETUNE/blob/main/LLAMA3_MEDICAL_COT_FINETUNE.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# STEP 1: ENVIRONMENT SETUP

!pip install -q unsloth "transformers>=4.44.0" "accelerate>=0.33.0" bitsandbytes peft datasets evaluate rouge-score wandb

import torch, os
from getpass import getpass

# Authenticate Hugging Face
os.environ["HF_TOKEN"] = getpass("Enter your Hugging Face token: ")

# Authenticate Weights & Biases
os.environ["WANDB_API_KEY"] = getpass("Enter your Weights & Biases API key: ")

# Login
!huggingface-cli login --token $HF_TOKEN
!wandb login $WANDB_API_KEY

# Verify GPU
print("GPU available:", torch.cuda.is_available())
!nvidia-smi


Enter your Hugging Face token: ··········
Enter your Weights & Biases API key: ··········
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `hf`CLI if you want to set the git credential as well.
Token is valid (permission: write).
The token `colab-access` has been saved to /root/.cache/huggingface/stored_tokens
Your token has been saved to /root/.cache/huggingface/token
Login successful.
Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: W&B API key is configured. Use [1m`wandb login --relogin`[0m to force relogin
GPU available: True
Tue Nov  4 14:40:07 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driv

In [5]:
# STEP 2 : LOAD PUBMEDQA AND PREPARE TRAIN/VAL DATASETS

from datasets import load_dataset
from sklearn.model_selection import train_test_split

# Load the labeled PubMedQA dataset (only has 'train' split)
ds = load_dataset("pubmed_qa", "pqa_labeled")

# Extract the full train data
full_data = ds["train"]

# Split manually: 100 samples for validation, rest for training
train_indices, val_indices = train_test_split(
    range(len(full_data)), test_size=100, random_state=42
)
train_raw = full_data.select(train_indices)
val_raw = full_data.select(val_indices)

def to_think_response(example):
    think = example.get("long_answer", "") or "Reasoning not provided."
    label = example.get("final_decision", "")
    question = example.get("question", "")
    response = f"Answer: {label}. Summary: Based on the evidence, the conclusion is {label} for: {question}"
    return {
        "text": f"<think>{think}</think>\n<response>{response}</response>"
    }

train_data = train_raw.map(to_think_response)
val_data   = val_raw.map(to_think_response)

print("Train samples:", len(train_data))
print("Val samples:", len(val_data))
print("\nFormatted sample:\n", train_data[0]["text"][:800])


Map:   0%|          | 0/900 [00:00<?, ? examples/s]

Map:   0%|          | 0/100 [00:00<?, ? examples/s]

Train samples: 900
Val samples: 100

Formatted sample:
 <think>Serum sample might be alternatively used in the difficult time of getting tumor tissue for analyzing the status of EGFR mutation status in patients with advanced NSCLC.</think>
<response>Answer: yes. Summary: Based on the evidence, the conclusion is yes for: Can serum be used for analyzing the EGFR mutation status in patients with advanced non-small cell lung cancer?</response>


In [7]:
!pip install -q evaluate rouge-score


  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m84.1/84.1 kB[0m [31m4.7 MB/s[0m eta [36m0:00:00[0m
[?25h  Building wheel for rouge-score (setup.py) ... [?25l[?25hdone


In [9]:
# STEP 3: BASELINE EVALUATION

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from evaluate import load
import numpy as np

# Load the base model and tokenizer (quantized for speed)
model_name = "unsloth/llama-3-3b-Instruct-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Create a lightweight text generation pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=150
)

# Use a few validation samples for quick baseline scoring
sample_prompts = [ex["text"] for ex in val_data.select(range(5))]
rouge = load("rouge")

generated_responses = []
references = []

for text in sample_prompts:
    prompt = text.split("<think>")[0]  # remove reasoning part
    output = generator(prompt, do_sample=False)[0]["generated_text"]
    generated_responses.append(output)
    references.append(text)

# Compute ROUGE-L baseline
results = rouge.compute(predictions=generated_responses, references=references)
print("Baseline ROUGE-L score:", results["rougeL"])


OSError: unsloth/llama-3-3b-Instruct-bnb-4bit is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `hf auth login` or by passing `token=<your_token>`