# 🚀 Instruction Fine-Tuning Tutorial - Google Colab Edition

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/)

## 🎯 **Quick Start Guide for Google Colab**

### **Step 1**: Enable GPU
1. Go to `Runtime` → `Change runtime type`
2. Select `T4 GPU` (free tier)
3. Click `Save`
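To confirm the GPU is actually attached after saving, a quick check like the following may help. This is a minimal sketch using only the standard library; `gpu_summary` is a hypothetical helper name, and it assumes the `nvidia-smi` tool is on the path whenever an NVIDIA GPU runtime is active.

```python
import shutil
import subprocess

def gpu_summary():
    """Return the GPU listing from nvidia-smi, or None if no GPU is visible."""
    # Hypothetical helper: nvidia-smi is only present on GPU runtimes.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None

summary = gpu_summary()
print(summary or "No GPU detected: check Runtime → Change runtime type")
```

If the message about no GPU appears, repeat Step 1 before running the training cells.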

### **Step 2**: Run all cells
- Use `Runtime` → `Run all` or
- Run cells one by one with `Shift + Enter`

### **⚠️ Important Notes:**
- ⏱️ **Runtime limit**: Free-tier Colab sessions last about 12 hours at most
- 💾 **Memory**: About 15 GB is available, so keep batch sizes small
- 🔄 **Auto-disconnect**: Idle sessions disconnect, so save your work periodically
- 📱 **Mobile-friendly**: Works on tablets/phones too!

---

## 📚 What You'll Learn:
✅ Transform a base model into an instruction-following assistant  
✅ Use LoRA for efficient fine-tuning  
✅ Evaluate model performance with BLEU scores  
✅ Practice with real code generation tasks  
✅ Compare before/after model performance  

## 🔧 **Colab Setup & Environment Check**

In [None]:
!pip install -q trl==0.9.6

In [None]:
!pip install -q evaluate==0.4.2

In [None]:
!wget https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k/resolve/main/code_alpaca_20k.json \
     -O code_alpaca_20k.json



In [None]:
import json

with open("code_alpaca_20k.json") as f:
    data = json.load(f)

print("Total examples:", len(data))
print("First example:", data[0])
# Contains keys: "instruction", "input", "output"


Total examples: 20022
First example: {'instruction': 'Create an array of length 5 which contains all even numbers between 1 and 10.', 'input': '', 'output': 'arr = [2, 4, 6, 8, 10]'}


In [None]:
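import random

# Shuffle in place before splitting. No seed is set, so the exact split
# (and the counts printed below) will vary from run to run.
random.shuffle(data)

split = int(0.8 * len(data))
# Keep only examples with an empty "input" field, because the prompt
# template used later has no slot for extra input.
train = [ex for ex in data[:split] if ex.get("input", "") == ""]
val   = [ex for ex in data[split:] if ex.get("input", "") == ""]

print("Train:", len(train), "Val:", len(val))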


Train: 7813 Val: 1951
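Because the shuffle above is unseeded, the split changes on every run. If a repeatable split is wanted, one option is a dedicated seeded generator. The sketch below is illustrative only: `split_examples`, the seed value of 42, and the toy records are all assumptions, not part of the original notebook.

```python
import random

def split_examples(examples, train_fraction=0.8, seed=42):
    """Shuffle a copy with a seeded RNG, split, and keep empty-input examples."""
    rng = random.Random(seed)          # private RNG, global state is untouched
    shuffled = list(examples)          # copy so the caller's list keeps its order
    rng.shuffle(shuffled)
    cut = int(train_fraction * len(shuffled))

    def has_no_input(ex):
        return ex.get("input", "") == ""

    train_part = [ex for ex in shuffled[:cut] if has_no_input(ex)]
    val_part = [ex for ex in shuffled[cut:] if has_no_input(ex)]
    return train_part, val_part

# Toy records standing in for the real dataset (hypothetical data).
toy = [
    {"instruction": f"task {i}", "input": "" if i % 2 == 0 else "x", "output": "..."}
    for i in range(10)
]
train_a, val_a = split_examples(toy)
train_b, val_b = split_examples(toy)
print("Same split on both calls:", train_a == train_b and val_a == val_b)
```

Calling the function twice with the same seed yields the same train and validation lists, which makes later comparisons between runs easier to interpret.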


In [None]:
with open("train.jsonl", "w") as tf:
    for ex in train:
        tf.write(json.dumps(ex) + "\n")

with open("validation.jsonl", "w") as vf:
    for ex in val:
        vf.write(json.dumps(ex) + "\n")


In [None]:
import json

def load_jsonl(filename):
    with open(filename) as f:
        return [json.loads(line) for line in f]

train_data = load_jsonl("train.jsonl")
val_data = load_jsonl("validation.jsonl")

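def format_prompt(example):
    # Build one training string per example. "</s>" is OPT's end-of-sequence
    # token, appended so the model learns where a response ends.
    return {
        "text": f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['output']} </s>"
    }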

train_formatted = list(map(format_prompt, train_data))
val_formatted = list(map(format_prompt, val_data))
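At inference time the same template is typically used without the answer, so the model continues from the `### Response:` marker. A minimal sketch of that counterpart follows; `build_inference_prompt` is a hypothetical helper name, and the example instruction is made up for illustration.

```python
def build_inference_prompt(instruction):
    """Mirror the training template, but stop right after the response header."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_inference_prompt("Write a function that reverses a string.")
print(prompt)
```

Keeping the training and inference templates identical up to the response header matters, because the fine-tuned model only sees this exact layout during training.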


In [None]:
!pip install -U transformers

Collecting transformers
  Downloading transformers-4.53.0-py3-none-any.whl.metadata (39 kB)
Downloading transformers-4.53.0-py3-none-any.whl (10.8 MB)
Installing collected packages: transformers
  Attempting uninstall: transformers
    Found existing installation: transformers 4.52.4
    Uninstalling transformers-4.52.4:
      Successfully uninstalled transformers-4.52.4
Successfully installed transformers-4.53.0


In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "facebook/opt-350m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
from peft import LoraConfig, get_peft_model, TaskType

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    bias="none",
    task_type=TaskType.CAUSAL_LM
)

model = get_peft_model(model, lora_config)
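To get a feel for how few weights LoRA trains, the adapter size can be estimated by hand: each adapted linear layer gains two small matrices of shape `r × d_in` and `d_out × r`. The sketch below is back-of-the-envelope arithmetic, not output from the notebook. It assumes OPT-350m uses 24 decoder layers with 1024-dimensional `q_proj` and `v_proj` layers; `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(d_in, d_out, r, modules_per_layer, num_layers):
    """Count LoRA weights: one (r x d_in) and one (d_out x r) matrix per module."""
    per_module = r * d_in + d_out * r
    return per_module * modules_per_layer * num_layers

# Assumed architecture values for facebook/opt-350m (not read from the model).
HIDDEN_SIZE = 1024
NUM_LAYERS = 24

trainable = lora_param_count(
    d_in=HIDDEN_SIZE,
    d_out=HIDDEN_SIZE,
    r=8,                   # matches r=8 in the LoraConfig above
    modules_per_layer=2,   # q_proj and v_proj
    num_layers=NUM_LAYERS,
)
print(f"Estimated trainable LoRA parameters: {trainable:,}")  # 786,432
```

In the notebook itself, calling `model.print_trainable_parameters()` on the PEFT-wrapped model reports the actual count, which is the number to trust if it differs from this estimate.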


In [None]:
import datasets

def tokenize_fn(example):
    # Pad or truncate every example to exactly 512 tokens.
    return tokenizer(
        example["text"],
        padding="max_length",
        truncation=True,
        max_length=512
    )

train_dataset = datasets.Dataset.from_list(train_formatted).map(tokenize_fn, batched=True)
val_dataset = datasets.Dataset.from_list(val_formatted).map(tokenize_fn, batched=True)

# Expose only the tensors the model consumes.
train_dataset.set_format(type="torch", columns=["input_ids", "attention_mask"])
val_dataset.set_format(type="torch", columns=["input_ids", "attention_mask"])



In [None]:
from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False  # we're doing causal LM
)
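With `mlm=False`, the collator builds `labels` by copying `input_ids` and replacing every padding token with `-100`, the index the loss function ignores. The model then shifts the labels internally for next-token prediction. A pure-Python sketch of that masking step is shown below; `mask_padding_labels` is a hypothetical helper, and the token ids (including a pad id of 1) are assumed values chosen for illustration.

```python
IGNORE_INDEX = -100  # label value skipped by the cross-entropy loss

def mask_padding_labels(input_ids, pad_token_id):
    """Copy input ids into labels, hiding pad positions from the loss."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]

# Made-up ids: three real tokens followed by two pad tokens.
example_ids = [2, 713, 16, 1, 1]
labels = mask_padding_labels(example_ids, pad_token_id=1)
print(labels)  # [2, 713, 16, -100, -100]
```

Since every example here is padded to 512 tokens, most label positions in a short example end up masked, so only the real tokens contribute to the training loss.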


In [None]:
!pip install -q wandb



In [None]:
import os

from transformers import TrainingArguments, Trainer

# Turn off Weights & Biases logging before the arguments and Trainer are built.
# This variable is deprecated; passing report_to="none" to TrainingArguments
# is the replacement.
os.environ["WANDB_DISABLED"] = "true"

training_args = TrainingArguments(
    output_dir="./opt350m-lora-codealpaca",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
    logging_dir="./logs",
    logging_steps=10,   # log the training loss every 10 steps
    save_steps=200,     # write a checkpoint every 200 steps
    fp16=True           # mixed precision; requires a GPU runtime
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator
)

trainer.train()


Using the `WANDB_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


| Step | Training Loss |
|------|---------------|
| 10   | 2.7363 |
| 20   | 2.4292 |
| 30   | 2.7338 |
| 40   | 2.345  |
| 50   | 2.5395 |
| 60   | 2.2915 |
| 70   | 2.2185 |
| 80   | 2.1757 |
| 90   | 2.0214 |
| 100  | 2.0221 |


TrainOutput(global_step=1954, training_loss=1.7743696756470289, metrics={'train_runtime': 567.6315, 'train_samples_per_second': 13.764, 'train_steps_per_second': 3.442, 'total_flos': 7299932381773824.0, 'train_loss': 1.7743696756470289, 'epoch': 1.0})
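The `global_step` and throughput figures in this output follow directly from the dataset size and batch size. A quick arithmetic check is sketched below, assuming a single GPU and no gradient accumulation (neither is changed in the training arguments); the variable names are illustrative.

```python
import math

num_train_examples = 7813    # from the "Train:" count printed earlier
batch_size = 4               # per_device_train_batch_size
num_epochs = 1
train_runtime_s = 567.6315   # train_runtime reported by the Trainer

# The last, partially filled batch still counts as a step, hence the ceiling.
steps_per_epoch = math.ceil(num_train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
samples_per_second = num_train_examples * num_epochs / train_runtime_s

print("Total optimizer steps:", total_steps)
print("Samples per second:", round(samples_per_second, 3))
```

Both values line up with `global_step=1954` and `train_samples_per_second` of about 13.764 in the reported output.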