# Gemma-3 270M LoRA / QLoRA Quickstart (Colab)

This notebook helps you set up a Colab GPU runtime, install required packages, provide your Hugging Face token, choose a model variant, and run a small dry-run of the project's training CLI in `--mode lora` or `--mode qlora`.

Set the runtime type to `GPU` (Runtime > Change runtime type) before running the install cell.

## 1) Install dependencies (run in a GPU runtime)
Run the cell below to install the core dependencies. For QLoRA you also need a compatible CUDA + bitsandbytes build.

In [None]:
# Install core dependencies. This may take a few minutes in Colab.
!pip install --upgrade pip
!pip install transformers accelerate datasets safetensors
!pip install -r requirements-dev.txt
# peft and bitsandbytes are optional but useful for LoRA/QLoRA runs; installation can fail in some environments, so `|| true` lets the cell continue
!pip install git+https://github.com/huggingface/peft.git || true
!pip install bitsandbytes || true
print('Install step finished (errors for bitsandbytes may be expected on some Colab CUDA versions).')
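Because the optional installs above are allowed to fail silently, it can help to confirm what actually ended up importable before moving on. The following is a minimal sketch; `check_packages` is a hypothetical helper name, and the package list simply mirrors the install cell.

```python
import importlib.util

def check_packages(names):
    """Map each top-level package name to True if it can be found on the import path."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Packages taken from the install cell; adjust the list to match your setup.
status = check_packages(['transformers', 'accelerate', 'datasets', 'peft', 'bitsandbytes'])
for name, ok in status.items():
    print(f"{name:15s} {'installed' if ok else 'MISSING'}")
```

If `bitsandbytes` shows as missing, stick to `--mode lora` until the CUDA check later in the notebook has been resolved.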

## 2) Provide your Hugging Face token
Run the cell and paste your HF token when prompted. The token is stored in the environment for the notebook session only.

In [None]:
from getpass import getpass
import os
token = getpass('Enter your Hugging Face token (input hidden): ')
if token:
  os.environ['HF_TOKEN'] = token
  print('HF token set in environment for this session')
else:
  print('No token provided; some actions may be skipped')
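To confirm the token was stored without echoing it, one option is to print a masked version and, optionally, ask the Hub which account it belongs to. This is a sketch under assumptions: `mask_token` and `verify_hf_token` are hypothetical helper names, and the `whoami` call needs network access and an installed `huggingface_hub`.

```python
import os

def mask_token(token, visible=4):
    """Return a display-safe version of a secret, keeping only the last few characters."""
    if not token:
        return '<empty>'
    if len(token) <= visible:
        return '*' * len(token)
    return '*' * (len(token) - visible) + token[-visible:]

def verify_hf_token():
    """Ask the Hub who the token belongs to; return the account name or None."""
    token = os.environ.get('HF_TOKEN')
    if not token:
        return None
    try:
        # Imported lazily so the cell still runs if huggingface_hub is absent.
        from huggingface_hub import whoami
        return whoami(token=token).get('name')
    except Exception as exc:  # invalid token, no network, or package missing
        print('Token check failed:', exc)
        return None

print('HF_TOKEN:', mask_token(os.environ.get('HF_TOKEN')))
```

Calling `verify_hf_token()` is optional; a `None` result means the token is missing, invalid, or could not be checked from this runtime.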

## 3) Choose model variant and mode
Use the next cell to pick either `google/gemma-3-270m` (default) or another compatible Gemma checkpoint. Set `use_qlora` to `True` to attempt the 4-bit quantized flow (only if bitsandbytes installed successfully).

In [None]:
# Default model id (change if you have a different HF checkpoint)
model_id = 'google/gemma-3-270m'
# Set to True to attempt QLoRA (requires bitsandbytes + CUDA)
use_qlora = False
# Example override: model_id = 'google/gemma-3-270m'
print(f'Model: {model_id}, QLoRA enabled: {use_qlora}')

## 10) Push trained model to Hugging Face Hub
This cell pushes a local model directory to the Hub. Make sure `HF_TOKEN` is set in the session and `huggingface_hub` is installed. Edit `HUB_REPO_ID` to a repo you control (e.g., 'your-username/gemma3-finetuned').

In [None]:
# Push model to HF Hub (requires HF_TOKEN and huggingface_hub installed)
import os
from pathlib import Path
HUB_REPO_ID = 'your-username/gemma3-finetuned'  # EDIT_THIS
MODEL_DIR = 'outputs/gemma3-lora-run/best_checkpoint'  # set to your final model dir

if not os.environ.get('HF_TOKEN'):
    print('HF_TOKEN not set in environment; set it and re-run this cell')
elif not Path(MODEL_DIR).is_dir():
    print(f'MODEL_DIR does not exist: {MODEL_DIR}')
else:
    try:
        from huggingface_hub import HfApi
        api = HfApi(token=os.environ['HF_TOKEN'])
        print('Creating or ensuring repo exists...')
        api.create_repo(HUB_REPO_ID, private=False, exist_ok=True)
        print('Pushing to Hub... this may take a while')
        # upload_folder commits the directory over HTTP, so no local git clone is needed.
        # Files land in a subfolder named after the checkpoint directory.
        api.upload_folder(
            folder_path=MODEL_DIR,
            repo_id=HUB_REPO_ID,
            path_in_repo=Path(MODEL_DIR).name,
            commit_message='Add fine-tuned Gemma adapter',
        )
        print('Pushed to', HUB_REPO_ID)
    except Exception as e:
        print('Failed to push to Hub:', e)
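Before pushing `MODEL_DIR` to the Hub, a quick look at its contents can catch an empty or wrong directory. The sketch below assumes the training run saved a PEFT-style adapter (`adapter_config.json` plus an `adapter_model.*` weights file); that layout is an assumption about this project's output, and `check_adapter_dir` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed PEFT output names; older PEFT versions wrote adapter_model.bin instead.
ADAPTER_CONFIG = 'adapter_config.json'
ADAPTER_WEIGHTS = ('adapter_model.safetensors', 'adapter_model.bin')

def check_adapter_dir(model_dir):
    """Return a list of problems found in model_dir (an empty list means it looks fine)."""
    path = Path(model_dir)
    if not path.is_dir():
        return [f'{path} is not a directory']
    problems = []
    if not (path / ADAPTER_CONFIG).is_file():
        problems.append(f'missing {ADAPTER_CONFIG}')
    if not any((path / name).is_file() for name in ADAPTER_WEIGHTS):
        problems.append('missing adapter weights (' + ' or '.join(ADAPTER_WEIGHTS) + ')')
    return problems

# Same default path as MODEL_DIR in the push cell.
print(check_adapter_dir('outputs/gemma3-lora-run/best_checkpoint') or 'Adapter directory looks complete')
```

If the project saves full model weights rather than an adapter, the expected file names will differ and the check should be adjusted accordingly.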

## 4) Quick check: CUDA / wheel helper
This cell inspects the runtime CUDA version (if available) and prints recommended pip commands for installing PyTorch and bitsandbytes suitable for common Colab CUDA versions. Use the printed command to (re)install if needed.

In [None]:
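import re
import subprocess

def get_nvidia_smi():
    """Return the raw `nvidia-smi` output, or None if the tool is unavailable."""
    try:
        # The banner printed by plain `nvidia-smi` includes "CUDA Version: X.Y".
        out = subprocess.check_output(['nvidia-smi'])
        return out.decode().strip()
    except (OSError, subprocess.CalledProcessError):
        return None

info = get_nvidia_smi()
print('nvidia-smi info:', info)
cuda_tag = 'cpu'
if info:
    # This is the highest CUDA version the driver supports, used here as a best-effort hint.
    m = re.search(r'CUDA Version:\s*([0-9]+\.[0-9]+)', info)
    if m:
        v = m.group(1)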
        # Map common Colab CUDA to torch wheel tags (best-effort)
        if v.startswith('12.1') or v.startswith('12.0'):
            cuda_tag = 'cu121'
        elif v.startswith('11.8'):
            cuda_tag = 'cu118'
        elif v.startswith('11.7'):
            cuda_tag = 'cu117'
        else:
            cuda_tag = 'cu118'
print('Detected CUDA tag suggestion:', cuda_tag)
if cuda_tag != 'cpu':
    print('Recommended pip install (example):')
    print(f