# Mobile LLM Lab - Google Colab Training

This notebook fine-tunes Hugging Face models on a free Google Colab GPU.

## Setup Instructions
1. **Enable GPU**: Go to Runtime → Change runtime type → Select GPU
2. **Set Secrets**: Go to the key icon (🔑) on the left sidebar:
   - Add `HF_TOKEN` with your Hugging Face token
   - Add `GH_TOKEN` with your GitHub token (optional; only needed for private repos or for pushing results back)
   - Add `HF_USERNAME` with your Hugging Face username
3. **Run All Cells**: Runtime → Run all

The notebook will automatically:
- Clone your GitHub repo
- Install dependencies
- Train the model
- Push results to Hugging Face Hub (when `PUSH_TO_HUB` is `True`)
- Commit the trained model back to GitHub (when `PUSH_TO_GITHUB` is `True` and `GH_TOKEN` is set)

## Configuration
Edit these variables to customize your training:

In [None]:
# Training Configuration
GITHUB_REPO = "YOUR_GITHUB_USERNAME/mobile-llm-lab"  # Change this!
MODEL_NAME = "assistant_v1"  # Name for your fine-tuned model
BASE_MODEL = "distilgpt2"  # Base model from Hugging Face; must suit TASK_TYPE (e.g. "distilbert-base-uncased" is an encoder, so use it for classification)
DATASET = "dataset/mydata.txt"  # Path to dataset in repo
TASK_TYPE = "causal_lm"  # "causal_lm" or "classification"
EPOCHS = 3
BATCH_SIZE = 8
LEARNING_RATE = 2e-5
MAX_LENGTH = 512

# Set to True to push model to Hugging Face Hub
PUSH_TO_HUB = True

# Set to True to commit checkpoints back to GitHub
PUSH_TO_GITHUB = False

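A quick sanity check on these settings can catch typos before any GPU time is spent. The following is a minimal sketch, not part of the original notebook: `validate_config` and `ALLOWED_TASK_TYPES` are hypothetical names, and the rules are assumptions drawn from the comments in the configuration cell.

```python
# Hypothetical helper; the allowed values come from the TASK_TYPE comment above.
ALLOWED_TASK_TYPES = {"causal_lm", "classification"}


def validate_config(github_repo, task_type, epochs, batch_size, learning_rate, max_length):
    """Return a list of problems; an empty list means the settings look sane."""
    problems = []
    # The placeholder must be replaced and the value must look like "owner/repo".
    if "YOUR_GITHUB_USERNAME" in github_repo or github_repo.count("/") != 1:
        problems.append("GITHUB_REPO must look like 'owner/repo' with the placeholder replaced")
    if task_type not in ALLOWED_TASK_TYPES:
        problems.append(f"TASK_TYPE must be one of {sorted(ALLOWED_TASK_TYPES)}")
    for name, value in (("EPOCHS", epochs), ("BATCH_SIZE", batch_size), ("MAX_LENGTH", max_length)):
        if not isinstance(value, int) or value <= 0:
            problems.append(f"{name} must be a positive integer")
    if not (0 < learning_rate < 1):
        problems.append("LEARNING_RATE should be between 0 and 1")
    return problems


for problem in validate_config("alice/mobile-llm-lab", "causal_lm", 3, 8, 2e-5, 512):
    print("Config problem:", problem)
```

In the notebook, the call would take the variables from the configuration cell (`GITHUB_REPO`, `TASK_TYPE`, and so on) in place of the example values.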
## Check GPU Availability

In [None]:
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("⚠️ WARNING: No GPU detected. Training will be slow!")
    print("Go to Runtime → Change runtime type → Select GPU")

## Load Secrets

In [None]:
import os
from google.colab import userdata

# Load tokens from Colab secrets.
# userdata.get raises if the secret is missing or notebook access was not granted;
# catch Exception rather than using a bare except, which would also swallow KeyboardInterrupt.
try:
    HF_TOKEN = userdata.get('HF_TOKEN')
    print("✓ HF_TOKEN loaded")
except Exception:
    print("❌ HF_TOKEN not found! Add it in Secrets (🔑 icon on left)")
    HF_TOKEN = None

try:
    HF_USERNAME = userdata.get('HF_USERNAME')
    print(f"✓ HF_USERNAME loaded: {HF_USERNAME}")
except Exception:
    print("❌ HF_USERNAME not found! Add it in Secrets")
    HF_USERNAME = None

try:
    GH_TOKEN = userdata.get('GH_TOKEN')
    print("✓ GH_TOKEN loaded")
except Exception:
    print("⚠️ GH_TOKEN not found (optional, only needed for private repos or pushing back)")
    GH_TOKEN = None

# Set environment variables
if HF_TOKEN:
    os.environ['HF_TOKEN'] = HF_TOKEN
if GH_TOKEN:
    os.environ['GH_TOKEN'] = GH_TOKEN

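The three near-identical `try` blocks could be folded into one loop. This is a sketch under stated assumptions: `load_secrets` is a hypothetical helper, and the lookup function is passed in so the code also runs outside Colab (in the notebook it would be `userdata.get`).

```python
def load_secrets(getter, required=("HF_TOKEN", "HF_USERNAME"), optional=("GH_TOKEN",)):
    """Fetch each secret through `getter`; return (values, missing_required)."""
    values, missing = {}, []
    for name in (*required, *optional):
        try:
            value = getter(name)
        except Exception:  # secret absent, or access to it was denied
            value = None
        values[name] = value or None  # treat empty strings as missing
        if values[name] is None and name in required:
            missing.append(name)
    return values, missing


# Stand-in for Colab's secret store so the sketch runs anywhere.
fake_store = {"HF_TOKEN": "hf_example", "GH_TOKEN": ""}
values, missing = load_secrets(fake_store.__getitem__)
if missing:
    print("Missing required secrets:", ", ".join(missing))
```

Returning the missing names instead of printing inside the loop lets the caller decide whether to stop early or continue without pushing.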
## Clone Repository

In [None]:
import os

# Remove existing repo if present
!rm -rf mobile-llm-lab

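# Clone repo (the token is needed for private repos and for pushing back;
# note that it is stored as part of the remote URL in the clone's .git/config)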
if GH_TOKEN:
    repo_url = f"https://{GH_TOKEN}@github.com/{GITHUB_REPO}.git"
else:
    repo_url = f"https://github.com/{GITHUB_REPO}.git"

!git clone {repo_url} mobile-llm-lab
%cd mobile-llm-lab

# Configure git for commits (if pushing back)
!git config user.email "colab@training.ai"
!git config user.name "Colab Training Bot"

print("\n✓ Repository cloned successfully")
!ls -la

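Because the token is embedded in the clone URL, printing that URL for debugging would leak it into the notebook output. A minimal sketch of building the URL and masking the token before display; `build_clone_url` and `mask_token` are hypothetical helper names, not part of the original notebook.

```python
def build_clone_url(repo, token=None):
    """Return an HTTPS clone URL for 'owner/repo', embedding the token if one is given."""
    auth = f"{token}@" if token else ""
    return f"https://{auth}github.com/{repo}.git"


def mask_token(text, token):
    """Replace every occurrence of the token so the text is safe to print."""
    return text.replace(token, "***") if token else text


url = build_clone_url("alice/mobile-llm-lab", token="ghp_example")
print(mask_token(url, "ghp_example"))
```

The same `mask_token` call can be applied to captured git output before it is shown.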
## Install Dependencies

In [None]:
# Install requirements
!pip install -q transformers datasets torch accelerate huggingface_hub

print("\n✓ Dependencies installed")

## Start Training

In [None]:
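# Build training command (each backslash-newline joins the lines into one command)
cmd = f"""python train.py \
    --model_name {MODEL_NAME} \
    --base_model {BASE_MODEL} \
    --dataset {DATASET} \
    --task_type {TASK_TYPE} \
    --epochs {EPOCHS} \
    --batch_size {BATCH_SIZE} \
    --learning_rate {LEARNING_RATE} \
    --max_length {MAX_LENGTH}"""

# Pass the username only if it was loaded; otherwise the literal string "None" would be sent
if HF_USERNAME:
    cmd += f" --hf_username {HF_USERNAME}"

if PUSH_TO_HUB:
    cmd += " --push_to_hub"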

print("Starting training with command:")
print(cmd)
print("\n" + "="*80)

# Run training
!{cmd}

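Building the command as a list of arguments avoids quoting problems when a value such as the dataset path contains spaces. The sketch below assumes the same `train.py` flags used in the cell above; `build_train_command` is a hypothetical helper.

```python
import shlex


def build_train_command(model_name, base_model, dataset, task_type, epochs,
                        batch_size, learning_rate, max_length,
                        hf_username=None, push_to_hub=False):
    """Return the train.py invocation as a list of string arguments."""
    args = [
        "python", "train.py",
        "--model_name", model_name,
        "--base_model", base_model,
        "--dataset", dataset,
        "--task_type", task_type,
        "--epochs", epochs,
        "--batch_size", batch_size,
        "--learning_rate", learning_rate,
        "--max_length", max_length,
    ]
    if hf_username:  # skip the flag instead of sending the string "None"
        args += ["--hf_username", hf_username]
    if push_to_hub:
        args.append("--push_to_hub")
    return [str(a) for a in args]


example = build_train_command("assistant_v1", "distilgpt2", "dataset/my data.txt",
                              "causal_lm", 3, 8, 2e-5, 512, push_to_hub=True)
print(shlex.join(example))  # the path containing a space is shell-quoted
```

One option is to pass the list to `subprocess.run(example, check=True)`, which raises an exception if training exits with a non-zero status, so later cells do not run against a failed job unnoticed.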
## Push Results to GitHub (Optional)

In [None]:
if PUSH_TO_GITHUB and GH_TOKEN:
    print("Committing and pushing results to GitHub...")
    
    # Add model files
    !git add models/{MODEL_NAME}
    
    # Commit
    commit_msg = f"Training completed for {MODEL_NAME}"
    !git commit -m "{commit_msg}"
    
    # Push
    !git push
    
    print("✓ Push step finished (check the git output above for errors)")
elif PUSH_TO_GITHUB:
    print("⚠️ Cannot push to GitHub: GH_TOKEN not set")
else:
    print("Skipping GitHub push (PUSH_TO_GITHUB = False)")

## Summary

In [None]:
import json
from pathlib import Path

print("="*80)
print("TRAINING SUMMARY")
print("="*80)

# Check if training completed
model_dir = Path(f"models/{MODEL_NAME}")
if model_dir.exists():
    print(f"✓ Model saved: models/{MODEL_NAME}")
    
    # Load metrics if available
    metrics_file = model_dir / "training_metrics.json"
    if metrics_file.exists():
        with open(metrics_file) as f:
            metrics = json.load(f)
        print("\nTraining Metrics:")
        for key, value in metrics.items():
            print(f"  {key}: {value}")
    
    # List saved files (sub-directories such as checkpoint folders are skipped)
    print("\nSaved Files:")
    for file in sorted(model_dir.glob("*")):
        if not file.is_file():
            continue
        size = file.stat().st_size / (1024*1024)  # MB
        print(f"  {file.name} ({size:.2f} MB)")
    
    if PUSH_TO_HUB and HF_USERNAME:
        print(f"\nIf the push succeeded, the model is at: https://huggingface.co/{HF_USERNAME}/{MODEL_NAME}")
else:
    print("❌ Model directory not found. Training may have failed.")

print("="*80)
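
The file listing in the summary only looks at the top level of the model directory. If the training script also writes nested folders (an assumption; the layout depends on `train.py`), a recursive listing gives a fuller picture. `list_files_with_sizes` is a hypothetical helper:

```python
from pathlib import Path


def list_files_with_sizes(root):
    """Return (relative path, size in MB) for every regular file under root, sorted by path."""
    root = Path(root)
    return [
        (p.relative_to(root).as_posix(), p.stat().st_size / (1024 * 1024))
        for p in sorted(root.rglob("*"))
        if p.is_file()
    ]


# Works on any directory; in the notebook this would be the model directory.
for name, size_mb in list_files_with_sizes("."):
    print(f"  {name} ({size_mb:.2f} MB)")
```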