
# Lab 1 — Google Colab Setup for LoRA + Unsloth (SLM Dragon Trainer)

This notebook prepares a clean, reliable Colab environment for fine‑tuning Small Language Models (SLMs) with **LoRA + Unsloth** on a GPU runtime.

**What this does**
- Verifies GPU
- Installs a *matching* PyTorch trio (torch, torchvision, torchaudio) with auto‑fallback across CUDA wheels
- Installs Unsloth + Unsloth Zoo and other LLM libs **without** breaking Torch deps
- Loads a base model and runs a quick inference smoke test

> If a later cell or library upgrades Torch and breaks compatibility, just re‑run the **Install PyTorch** and **Install LLM libs** cells.



## Step 0 — Enable GPU in Colab
- Runtime → **Change runtime type** → Hardware accelerator: **GPU** → *Save*


## Step 1 — Verify GPU

In [1]:

!nvidia-smi || echo "No GPU detected. Enable GPU under Runtime → Change runtime type."


Mon Aug 11 13:57:01 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:00:04.0 Off |                    0 |
| N/A   32C    P0             45W /  400W |       0MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
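
The same check can be done from Python, which makes it easy to stop early when no GPU is attached. This is a minimal sketch: `parse_gpu_csv` and `query_gpus` are hypothetical helper names, and it assumes the installed `nvidia-smi` accepts the `--query-gpu` and `--format=csv,noheader` options.

```python
import shutil
import subprocess

def parse_gpu_csv(text):
    """Parse 'name, memory' lines as printed by nvidia-smi's CSV mode."""
    gpus = []
    for line in text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, memory = [field.strip() for field in line.split(",", 1)]
        gpus.append({"name": name, "memory": memory})
    return gpus

def query_gpus():
    """Return a list of GPUs, or an empty list when nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []

gpus = query_gpus()
print(gpus if gpus else "No GPU detected. Enable GPU under Runtime → Change runtime type.")
```

An empty list here means the remaining steps would run on CPU, so it is a good point to fix the runtime type before installing anything.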
                                                


## Step 2 — Install a matched PyTorch build (auto‑fallback)
This cell:
- Uninstalls any existing `torch`, `torchvision`, `torchaudio`, and `fastai` packages that could conflict
- Tries official wheels in order: **cu124 → cu121 → cu118**
- Verifies import and CUDA availability


In [2]:

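# Clean potential conflicts
!pip -q uninstall -y torch torchvision torchaudio fastai
!pip -q cache purge

import subprocess
import sys

# Verify in a fresh interpreter so a torch module already imported
# in this kernel cannot mask a broken install.
CHECK = (
    "import torch; "
    "print('Torch:', torch.__version__, '| CUDA available:', torch.cuda.is_available()); "
    "raise SystemExit(0 if torch.cuda.is_available() else 1)"
)

def try_tag(tag):
    print(f"\nTrying {tag} wheels...")
    # Install all three packages from the same official index so their builds match
    install = subprocess.run(
        [sys.executable, "-m", "pip", "install", "--no-cache-dir", "--upgrade",
         "torch", "torchvision", "torchaudio",
         "--index-url", f"https://download.pytorch.org/whl/{tag}"],
        capture_output=True, text=True,
    )
    if install.returncode != 0:
        print(f"pip install failed for {tag}:", install.stderr.strip()[-500:])
        return False
    check = subprocess.run([sys.executable, "-c", CHECK], capture_output=True, text=True)
    print(check.stdout.strip() or check.stderr.strip())
    return check.returncode == 0  # True only if torch imports and sees the GPU

ok = False
for tag in ["cu124", "cu121", "cu118"]:
    if try_tag(tag):
        ok = True
        break

if not ok:
    raise SystemExit("Could not install a matching torch build. Make sure GPU is enabled, then rerun this cell.")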


Trying cu124 wheels...
Torch: 2.6.0+cu124 | CUDA available: True
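
The fallback list can also be narrowed up front. The `CUDA Version` field that `nvidia-smi` prints in Step 1 is the highest CUDA version the driver supports, so wheel tags newer than that are unlikely to be worth trying. The sketch below applies that as a conservative filter; `tag_version` and `candidate_tags` are hypothetical helpers, and it assumes tags follow the `cu` + major + single-digit minor pattern used above.

```python
def tag_version(tag):
    """Convert a wheel tag such as 'cu124' into a (major, minor) tuple like (12, 4)."""
    digits = tag[2:]  # drop the 'cu' prefix
    return int(digits[:-1]), int(digits[-1])

def candidate_tags(driver_cuda, tags=("cu124", "cu121", "cu118")):
    """Keep tags whose CUDA version does not exceed the driver's, preserving order."""
    major, minor = (int(part) for part in driver_cuda.split(".")[:2])
    return [tag for tag in tags if tag_version(tag) <= (major, minor)]

print(candidate_tags("12.4"))
print(candidate_tags("12.2"))
```

With a driver reporting 12.4, all three tags remain; with 12.2, `cu124` is dropped and the loop would start at `cu121`.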



## Step 3 — Install LLM libraries (without touching Torch)
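We pass `--no-deps` so that pip installs only the packages named on the command line and never pulls in a different Torch build as a dependency. The trade-off is that pip also skips every other missing dependency, so an `ImportError` in a later cell means that package has to be installed explicitly.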


In [6]:

# Step 3 — Install LLM libraries (without touching Torch)
!pip install -U unsloth unsloth_zoo accelerate transformers peft datasets bitsandbytes sentencepiece trl --no-deps


Collecting trl
  Downloading trl-0.21.0-py3-none-any.whl.metadata (11 kB)
Downloading trl-0.21.0-py3-none-any.whl (511 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 511.9/511.9 kB 31.5 MB/s eta 0:00:00
Installing collected packages: trl
Successfully installed trl-0.21.0
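
Because `--no-deps` skips dependency resolution, it is worth confirming afterwards which versions actually ended up installed. A minimal sketch using only the standard library follows; `installed_versions` is a hypothetical helper, and the package list simply mirrors the install command above plus `torch`.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

packages = ["torch", "unsloth", "unsloth_zoo", "accelerate", "transformers",
            "peft", "datasets", "bitsandbytes", "sentencepiece", "trl"]
for name, version in installed_versions(packages).items():
    print(f"{name:15} {version or 'NOT INSTALLED'}")
```

If the `torch` line no longer shows the version printed in Step 2, something replaced it and Step 2 should be re-run.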


## Step 4 — Sanity check Torch + GPU

In [7]:

import torch
print("Torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU")


Torch: 2.6.0+cu124
CUDA available: True
Device: NVIDIA A100-SXM4-40GB
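
Availability alone does not say which half-precision type the GPU handles natively. As a rough rule, NVIDIA GPUs with compute capability 8.0 or higher (Ampere and newer, such as the A100 above) support bfloat16, while older cards such as the T4 (7.5) should use float16. The sketch below encodes that rule; `pick_dtype_name` is a hypothetical helper, and on a live runtime the capability tuple would come from `torch.cuda.get_device_capability(0)`.

```python
def pick_dtype_name(capability):
    """Return 'bfloat16' for compute capability >= 8.0, otherwise 'float16'."""
    major, _minor = capability
    return "bfloat16" if major >= 8 else "float16"

print(pick_dtype_name((8, 0)))  # capability of an A100
print(pick_dtype_name((7, 5)))  # capability of a T4
```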



## Step 5 — Load a base model (Unsloth)
> If you hit permission errors with Llama‑2, switch to the permissive Mistral model line below.


In [8]:

import unsloth
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None  # auto-detect: bfloat16 on Ampere or newer GPUs (e.g. A100), float16 otherwise

# Choose one:
# model_name = "unsloth/llama-2-7b-bf16"        # requires HF access
model_name = "unsloth/mistral-7b-bnb-4bit"      # permissive fallback (pre-quantized 4-bit)

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = True,
)

FastLanguageModel.for_inference(model)
print("Model loaded.")



Please restructure your imports with 'import unsloth' at the top of your file.
  from unsloth import FastLanguageModel


🦥 Unsloth Zoo will now patch everything to make training faster!


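FileNotFoundError: unsloth/mistral-7b-v0.2-bf16/*.json (repository not found)

> This run failed because `unsloth/mistral-7b-v0.2-bf16` is not a repository on the Hugging Face Hub. Set `model_name` to an id that exists, for example `unsloth/mistral-7b-bnb-4bit`, and re-run the cell before continuing to Step 6.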
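
`load_in_4bit = True` matters mainly for memory. A back-of-the-envelope estimate of the weight footprint is parameters × bits per parameter ÷ 8. The sketch below applies it to a nominal 7-billion-parameter model; the parameter count is a round-number assumption, `weight_memory_gib` is a hypothetical helper, and the estimate ignores quantization metadata, activations, and the KV cache, so real usage is higher.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30

for bits in (16, 4):
    print(f"{bits:>2}-bit: {weight_memory_gib(7e9, bits):.1f} GiB")
```

The 4-bit figure is a quarter of the 16-bit one, which is what lets a 7B model fit comfortably on smaller Colab GPUs.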

## Step 6 — Quick inference smoke test

In [None]:

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
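
For a causal language model, `generate` returns the prompt tokens followed by the new tokens, so the decoded text normally repeats the question. When only the continuation is of interest, the prompt can be stripped off. A minimal sketch, where `extract_completion` is a hypothetical helper and the sample string stands in for real model output:

```python
def extract_completion(decoded, prompt):
    """Return the text after the prompt, or the whole string if the prompt is not a prefix."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded.strip()

# Stand-in for tokenizer.decode(...); not real model output
sample = "What is the capital of France?\nThe capital of France is Paris."
completion = extract_completion(sample, "What is the capital of France?")
print(completion)
assert completion, "Smoke test failed: the model generated no new text"
```

The final assertion only checks that something was generated. A base model is not instruction-tuned, so the content of the continuation may not be a direct answer.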



---

## Troubleshooting

- **No GPU detected**: Enable GPU (Step 0) and **Runtime → Restart runtime**, then re‑run from Step 1.
- **Dependency conflicts**: If a later install upgrades Torch, re‑run **Step 2** then **Step 3**.
- **`ImportError: install unsloth_zoo`**: Re‑run **Step 3** to install `unsloth_zoo`.
- **`pip` resolver warnings**: These are expected in Colab’s mixed environment. Step 2 installs the Torch trio from a single CUDA wheel index and Step 3 uses `--no-deps`, so the warnings do not change the installed Torch build.
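
A quick way to tell whether Torch has drifted is to compare the build tag (the part of the version string after `+`) with the tag chosen in Step 2. The sketch below does this on plain version strings; `cuda_tag` and `torch_drifted` are hypothetical helpers, the second version string is illustrative only, and on a live runtime the input would be `torch.__version__`.

```python
def cuda_tag(version):
    """Extract the build tag from a version string, e.g. '2.6.0+cu124' -> 'cu124'."""
    _, sep, local = version.partition("+")
    return local if sep else None

def torch_drifted(version, expected_tag):
    """True when the installed build tag differs from the one selected in Step 2."""
    return cuda_tag(version) != expected_tag

print(torch_drifted("2.6.0+cu124", "cu124"))
print(torch_drifted("2.7.1+cu126", "cu124"))
```

A `True` result is the signal to re-run Step 2 and then Step 3.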
