# Fine-tuning Gemma 3 270M with CodeAlpaca-20k on Google Colab

This notebook provides a complete guide and runnable code to fine-tune the Gemma 3 270M model for coding tasks using the CodeAlpaca-20k dataset on Google Colab. We will leverage Hugging Face Transformers, PEFT (Parameter-Efficient Fine-tuning) with LoRA, and optionally Unsloth for optimized training.

**Important:** Before running this notebook, ensure you have accepted the terms and conditions for the `google/gemma-3-270m` model on its Hugging Face page: [https://huggingface.co/google/gemma-3-270m](https://huggingface.co/google/gemma-3-270m).

## 1. Setup Google Colab Environment

First, set up your Google Colab environment to use a GPU.

1.  Go to `Runtime > Change runtime type`.
2.  Under "Hardware accelerator," select `GPU` (e.g., T4 GPU).
3.  Ensure "Runtime shape" is set to `Standard` or `High-RAM`.

## 2. Install Dependencies

It's crucial to install `unsloth` (if you choose to use it) as the very first step after restarting the runtime, to avoid dependency conflicts. If you encounter `ResolutionImpossible` errors, restart the runtime and run this cell first.



In [None]:
# Cell 1: Install Unsloth (Optional but Recommended for Optimization)
# If you encounter dependency conflicts, restart runtime and run this cell first.
!pip install -q "unsloth[colab] @ git+https://github.com/unslothai/unsloth.git"

# Cell 2: Install other necessary libraries
!pip install -q transformers accelerate peft datasets bitsandbytes
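After installation it can help to confirm which versions were actually resolved, since dependency conflicts often show up as an unexpected or missing package. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, not part of any of the libraries above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the pip commands above.
    names = ["unsloth", "transformers", "accelerate", "peft", "datasets", "bitsandbytes"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If a package is reported as missing, restarting the runtime and rerunning the install cell is the first thing to try.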


## 3. Authenticate with Hugging Face (for Gated Models and Pushing to Hub)

If you encounter a `GatedRepoError` (401 Unauthorized) when loading the Gemma model, it means you need to accept its terms on Hugging Face. After accepting, run this cell to authenticate your Colab session.



In [None]:
# Cell 3: Hugging Face Login
from huggingface_hub import notebook_login

notebook_login()
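As an alternative to the interactive prompt, a token can be supplied through the `HF_TOKEN` environment variable, which `huggingface_hub` reads. The sketch below only checks whether such a token is present; `find_hf_token` is a hypothetical helper introduced for illustration.

```python
import os


def find_hf_token(environ=os.environ):
    """Return a non-empty token from the HF_TOKEN variable, or None if absent."""
    token = environ.get("HF_TOKEN", "").strip()
    return token or None


token = find_hf_token()
if token is None:
    print("HF_TOKEN is not set; fall back to notebook_login().")
else:
    # The token could be passed to huggingface_hub.login(token=token).
    print("HF_TOKEN found in the environment.")
```

Avoid pasting the token directly into a notebook cell, since notebooks are often shared.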


## 4. Verify GPU Setup



In [None]:
# Cell 4: Check GPU availability
import torch

print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Device name: {torch.cuda.get_device_name(0)}")


## 5. Fine-tuning Gemma 3 270M with CodeAlpaca-20k

This section provides the core code for loading the model, preparing the dataset, and fine-tuning using PEFT (LoRA).
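To see why LoRA is parameter-efficient: for a linear layer with `d_in` inputs and `d_out` outputs, LoRA trains two small matrices of shapes `r x d_in` and `d_out x r` instead of the full `d_out x d_in` weight, and scales their product by `lora_alpha / r` (the default scaling in PEFT). The sketch below works through that arithmetic; the layer dimensions are hypothetical and chosen only for illustration, not read from the Gemma configuration.

```python
def lora_extra_params(d_in, d_out, r):
    """Trainable parameters LoRA adds to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Default LoRA scaling factor applied to the low-rank update."""
    return lora_alpha / r


# Hypothetical layer size; r and lora_alpha match the LoraConfig used in this notebook.
d_in, d_out, r, lora_alpha = 640, 2048, 16, 16
full = d_in * d_out
extra = lora_extra_params(d_in, d_out, r)
print(f"Full layer parameters: {full}")
print(f"LoRA parameters:       {extra} ({100 * extra / full:.2f}% of the layer)")
print(f"Scaling factor:        {lora_scaling(lora_alpha, r)}")
```

The exact totals for the real model are reported by `model.print_trainable_parameters()` in the cell below.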



In [None]:
# Cell 5: Model and Tokenizer Loading, LoRA Configuration
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    Trainer,
    DataCollatorForLanguageModeling,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset

# Model configuration
model_id = "google/gemma-3-270m"
max_seq_length = 1024

# Load tokenizer and model
print("Loading tokenizer and model...")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # Use 4-bit quantization for memory efficiency
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Add padding token if not present
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id

# Prepare model for k-bit training
model = prepare_model_for_kbit_training(model)

# Configure LoRA
lora_config = LoraConfig(
    r=16,  # LoRA attention dimension (rank)
    lora_alpha=16,  # Alpha parameter for LoRA scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
)

# Apply LoRA to the model
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Cell 6: Load and Preprocess Dataset
print("Loading CodeAlpaca-20k dataset...")
dataset = load_dataset("sahil2801/CodeAlpaca-20k")

def format_instruction(sample):
    """Format the instruction-input-output into a single training example."""
    # Single quotes inside the f-strings keep this valid on Python versions before 3.12.
    if sample["input"] and sample["input"].strip():
        return f"### Instruction:\n{sample['instruction']}\n\n### Input:\n{sample['input']}\n\n### Response:\n{sample['output']}"
    else:
        return f"### Instruction:\n{sample['instruction']}\n\n### Response:\n{sample['output']}"

def preprocess_function(examples):
    """Tokenize a batch of formatted examples."""
    # With batched=True, `examples` maps each column name to a list of values.
    texts = [
        format_instruction({"instruction": instruction, "input": input_text, "output": output})
        for instruction, input_text, output in zip(
            examples["instruction"], examples["input"], examples["output"]
        )
    ]
    tokenized = tokenizer(
        texts,
        truncation=True,
        padding=False,
        max_length=max_seq_length,
        return_tensors=None,
    )
    return tokenized

# Apply preprocessing
print("Preprocessing dataset...")
tokenized_dataset = dataset.map(
    preprocess_function,
    batched=True,
    remove_columns=dataset["train"].column_names,
)

# Split dataset (use 90% for training, 10% for validation)
train_dataset = tokenized_dataset["train"].train_test_split(test_size=0.1)
train_data = train_dataset["train"]
eval_data = train_dataset["test"]
print(f"Training samples: {len(train_data)}")
print(f"Validation samples: {len(eval_data)}")

# Cell 7: Training Arguments and Trainer Initialization
training_args = TrainingArguments(
    output_dir="./gemma-3-270m-codealpaca",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    weight_decay=0.01,
    fp16=False,
    bf16=True,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="cosine",
    logging_steps=10,
    save_steps=500,
    save_total_limit=2,
    eval_strategy="steps",  # Named `evaluation_strategy` in older Transformers releases
    eval_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    push_to_hub=False,
    report_to="none",  # Disable wandb and other logging integrations
)

# Data collator
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False,
)

# Initialize trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    eval_dataset=eval_data,
    processing_class=tokenizer,  # Named `tokenizer` in older Transformers releases
    data_collator=data_collator,
)

# Cell 8: Start Training
print("Starting training...")
trainer.train()

# Cell 9: Save the Fine-tuned Model
print("Saving model...")
trainer.save_model("./gemma-3-270m-codealpaca-final")
tokenizer.save_pretrained("./gemma-3-270m-codealpaca-final")
print("Training completed!")
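The batch settings in the training arguments interact: with `per_device_train_batch_size=2` and `gradient_accumulation_steps=4`, each optimizer step sees 8 examples on a single GPU. The sketch below estimates how many optimizer steps a run takes. It assumes a round figure of 18,000 training samples (about 90% of a roughly 20k-example dataset) and a simple ceiling division, so the Trainer's own count may differ slightly; `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(num_samples, per_device_batch, grad_accum, epochs, num_devices=1):
    """Estimate (effective batch size, total optimizer steps) for a training run."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# 18,000 is an assumed round number, not the measured size of the split.
effective_batch, total_steps = optimizer_steps(
    num_samples=18_000, per_device_batch=2, grad_accum=4, epochs=3
)
warmup_steps = math.ceil(0.03 * total_steps)  # warmup_ratio=0.03 from the arguments above
print(f"Effective batch size: {effective_batch}")
print(f"Estimated optimizer steps: {total_steps}")
print(f"Estimated warmup steps: {warmup_steps}")
```

An estimate like this helps judge whether `save_steps=500` and `eval_steps=500` give a sensible number of checkpoints and evaluations for the run.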

## 6. Test Your Fine-tuned Model



In [None]:
# Cell 10: Test Inference
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned model for inference.
# The directory holds the LoRA adapter saved above, so `peft` must be installed
# for the base model to be loaded with the adapter attached.
model_path = "./gemma-3-270m-codealpaca-final"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

def generate_code(instruction, input_text=""):
    """Generate code based on instruction and optional input."""
    # Build the prompt in the same format used for training.
    if input_text:
        prompt = f"### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"
    else:
        prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode the full sequence and keep only the text after the response marker.
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Response:\n")[-1].strip()

# Test examples
test_instructions = [
    "Write a Python function to calculate the factorial of a number.",
    "Create a function that finds the maximum element in a list.",
    "Write a Python function to check if a string is a palindrome.",
]

for instruction in test_instructions:
    print(f"Instruction: {instruction}")
    print(f"Generated Code:\n{generate_code(instruction)}")
    print("-" * 50)
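The prompt construction and response extraction can be checked without loading the model. The sketch below mirrors the template used in this notebook; `build_prompt` and `extract_response` are hypothetical helper names, and the "generated" text is a made-up completion standing in for real model output.

```python
RESPONSE_MARKER = "### Response:\n"


def build_prompt(instruction, input_text=""):
    """Build an inference prompt in the same format as the training examples."""
    if input_text:
        return f"### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n{RESPONSE_MARKER}"
    return f"### Instruction:\n{instruction}\n\n{RESPONSE_MARKER}"


def extract_response(decoded):
    """Keep only the text after the last response marker, without surrounding whitespace."""
    return decoded.split(RESPONSE_MARKER)[-1].strip()


# Made-up completion appended to the prompt, imitating a decoded generation.
decoded = build_prompt("Add two numbers.") + "def add(a, b):\n    return a + b\n"
print(extract_response(decoded))
```

Because the split keeps the text after the last marker, a model that repeats `### Response:` in its output would have the earlier part of its answer dropped.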

## 7. Push to Hugging Face Hub (Optional)



In [None]:
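# Cell 11: Push to Hugging Face Hub
# Replace "your-username/your-model-name" with your desired repository name.
# The first argument of trainer.push_to_hub() is a commit message, not a repository
# name, so the trained adapter and tokenizer are pushed explicitly here.
repo_id = "your-username/your-model-name"
trainer.model.push_to_hub(repo_id)
tokenizer.push_to_hub(repo_id)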