# SmolLMv3 + TRM Training Notebook

This notebook demonstrates training the SmolLMv3 + TRM model with:
- PyTorch Lightning for clean training loops
- Weights & Biases (wandb) for logging
- Automatic checkpointing and early stopping
- Support for local hardware (CUDA, Apple Silicon MPS, or CPU) and hosted environments such as Google Colab

## 1. Setup Environment

In [None]:
# Check if running on Colab
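try:
    import google.colab  # only importable inside a Colab runtime
    IN_COLAB = True
    print("Running on Google Colab")
except ImportError:
    # Catch only the missing-module case so unrelated errors still surface
    IN_COLAB = False
    print("Running locally")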

# Install dependencies if on Colab
if IN_COLAB:
    !pip install -q torch transformers peft pytorch-lightning wandb datasets

In [None]:
import torch
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.cuda.is_available()}")
print(f"MPS: {torch.backends.mps.is_available()}")

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
elif torch.backends.mps.is_available():
    print("Device: Apple Silicon (MPS)")
else:
    print("Device: CPU")
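
The checks in the cell above can also be folded into a small helper that returns an accelerator name. The following is a minimal sketch: `pick_accelerator` is a hypothetical function, not part of `train_lightning`, and it takes the availability flags as plain booleans so the preference order (CUDA, then MPS, then CPU) is explicit.

```python
def pick_accelerator(cuda_available: bool, mps_available: bool) -> str:
    """Return an accelerator name, preferring CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# Example with hard-coded flags; in the notebook you would pass
# torch.cuda.is_available() and torch.backends.mps.is_available().
print(pick_accelerator(cuda_available=False, mps_available=True))  # prints "mps"
```

Whether `train` accepts such a value is not shown in this notebook, so treat the return value as informational unless the training module exposes a matching argument.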

## 2. Login to Weights & Biases

In [None]:
import wandb
wandb.login()
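
For non-interactive runs, where `wandb.login()` cannot prompt for a key, one option is to fall back to offline logging. The sketch below assumes credentials are supplied through the `WANDB_API_KEY` environment variable and ignores any key stored by an earlier `wandb login`; `choose_wandb_mode` is a hypothetical helper, not part of wandb or `train_lightning`.

```python
import os
from typing import Mapping


def choose_wandb_mode(env: Mapping[str, str]) -> str:
    """Return "online" when an API key is present, otherwise "offline"."""
    return "online" if env.get("WANDB_API_KEY") else "offline"


# wandb reads WANDB_MODE when a run starts, so set it before training.
# setdefault keeps any mode the user has already chosen.
os.environ.setdefault("WANDB_MODE", choose_wandb_mode(os.environ))
```

Runs logged offline are written to the local `wandb` directory and can be uploaded later with `wandb sync`.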

## 3. Import Training Module

In [None]:
from train_lightning import train, create_sample_dataset
print("✓ Imports successful")

## 4. Configure & Train

In [None]:
# Training configuration
config = {
    "batch_size": 2,
    "num_epochs": 3,
    "learning_rate": 2e-4,
    "num_latents": 256,
    "wandb_project": "smollm-trm",
    "wandb_name": "test-run"
}

# Start training
trainer, model = train(**config)
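
A plain dict accepts any value without complaint, so a typo such as a zero batch size only surfaces once training starts. As a rough sketch, the same settings can be wrapped in a dataclass that validates them first. `TrainConfig` is a hypothetical class, not part of `train_lightning`; its fields mirror the keys of the dict above, and it assumes `train` accepts exactly those keyword arguments, as the call above suggests.

```python
from dataclasses import asdict, dataclass


@dataclass
class TrainConfig:
    # Field names and defaults mirror the keys of the config dict above.
    batch_size: int = 2
    num_epochs: int = 3
    learning_rate: float = 2e-4
    num_latents: int = 256
    wandb_project: str = "smollm-trm"
    wandb_name: str = "test-run"

    def __post_init__(self) -> None:
        # Fail fast on values that cannot describe a valid run.
        if self.batch_size < 1 or self.num_epochs < 1 or self.num_latents < 1:
            raise ValueError("batch_size, num_epochs and num_latents must be >= 1")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")


# Override only what differs from the defaults, then convert back to a dict.
config = asdict(TrainConfig(num_epochs=1, wandb_name="smoke-test"))
print(config)
```

The resulting dict can be passed to training the same way as before, with `train(**config)`.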