In [6]:
# Clone repository
!rm -rf motion_prediction/
!git clone https://github.com/kveps/motion_prediction.git

Cloning into 'motion_prediction'...
remote: Enumerating objects: 204, done.[K
remote: Counting objects: 100% (204/204), done.[K
remote: Compressing objects: 100% (129/129), done.[K
remote: Total 204 (delta 89), reused 182 (delta 68), pack-reused 0 (from 0)[K
Receiving objects: 100% (204/204), 492.92 KiB | 14.08 MiB/s, done.
Resolving deltas: 100% (89/89), done.
remote: Enumerating objects: 204, done.[K
remote: Counting objects: 100% (204/204), done.[K
remote: Compressing objects: 100% (129/129), done.[K
remote: Total 204 (delta 89), reused 182 (delta 68), pack-reused 0 (from 0)[K
Receiving objects: 100% (204/204), 492.92 KiB | 14.08 MiB/s, done.
Resolving deltas: 100% (89/89), done.


# Setup for Training Transformer on Colab GPU

This notebook clones the motion prediction repository and trains the transformer on the Waymo Open Dataset, read directly from Google Cloud Storage (GCS).

In [7]:
# Check GPU availability
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

CUDA available: True
GPU: Tesla T4
GPU Memory: 15.83 GB


In [None]:
# Authenticate for GCS access (required for Waymo dataset)
from google.colab import auth
auth.authenticate_user()
print("✓ Authenticated successfully!")

## Mount Google Drive (Optional but Recommended)

**Why mount Google Drive?**
- Model weights persist after runtime ends
- You can download models later

**If this cell hangs:**
1. Click the authorization link that appears
2. Choose your Google account and authorize
3. If still stuck after 30 seconds:
   - **Restart runtime** (Runtime → Restart runtime)
   - Or **skip this cell**: models will then be saved under `/content/`, which is temporary and is wiped when the runtime ends
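
The fallback described above (Drive if it is mounted, `/content/` otherwise) can be sketched as a small helper. This is only an illustration: the Drive path matches the `mkdir` command in this notebook, but `LOCAL_WEIGHTS` and the function name `choose_weights_dir` are hypothetical, and the training script may resolve its output directory differently.

```python
from pathlib import Path

# Drive location used by the mkdir command in this notebook.
DRIVE_ROOT = Path("/content/drive/MyDrive")
DRIVE_WEIGHTS = DRIVE_ROOT / "motion_prediction/models/trained_weights"
# Hypothetical temporary fallback; anything under /content/ is lost with the runtime.
LOCAL_WEIGHTS = Path("/content/trained_weights")


def choose_weights_dir(
    drive_root: Path = DRIVE_ROOT,
    drive_weights: Path = DRIVE_WEIGHTS,
    local_weights: Path = LOCAL_WEIGHTS,
) -> Path:
    """Return the Drive weights folder if Drive is mounted, else the local one."""
    # After a successful drive.mount(), the Drive root exists as a directory.
    target = drive_weights if drive_root.is_dir() else local_weights
    target.mkdir(parents=True, exist_ok=True)
    return target
```

In a Colab cell, `print(choose_weights_dir())` would show which location is in effect for the current session.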

In [None]:
# Optional: Mount Google Drive to save model weights persistently
from google.colab import drive
drive.mount('/content/drive')

# Create directory for saving models
!mkdir -p /content/drive/MyDrive/motion_prediction/models/trained_weights/

KeyboardInterrupt: 

In [None]:
# Run training using the unified script with --colab flag
!cd motion_prediction/ && python -m scripts.transformer_train --colab --epochs 100 --batch-size 5

Traceback (most recent call last):
  File "/content/motion_prediction/./scripts/transformer_train.py", line 1, in <module>
    from models.loss.nll_loss import NLL_Loss
ModuleNotFoundError: No module named 'models'
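
The file path in the traceback (`./scripts/transformer_train.py`) suggests this output came from launching the script by path rather than with `python -m`. When a script is started by path, Python puts the script's own directory first on `sys.path`, so top-level packages in the repository root are not importable; with `-m`, the current working directory comes first instead. The sketch below reproduces that difference in a throwaway directory. The package names `demo_models` and `demo_scripts` are placeholders, not names from the repository.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways() -> tuple:
    """Return exit codes for (run by file path, run with -m) in a demo tree."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Placeholder layout mirroring <repo>/models and <repo>/scripts.
        (root / "demo_models").mkdir()
        (root / "demo_models" / "__init__.py").write_text("VALUE = 1\n")
        (root / "demo_scripts").mkdir()
        (root / "demo_scripts" / "__init__.py").write_text("")
        (root / "demo_scripts" / "train.py").write_text("import demo_models\n")

        # By path: sys.path[0] is <root>/demo_scripts, so the import fails.
        by_path = subprocess.run(
            [sys.executable, "demo_scripts/train.py"],
            cwd=root, capture_output=True,
        )
        # With -m: sys.path[0] is the working directory, so the import succeeds.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_scripts.train"],
            cwd=root, capture_output=True,
        )
        return by_path.returncode, as_module.returncode
```

For this notebook, the practical consequence is to run the training command from inside `motion_prediction/` using the `-m` form shown in the cell above.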


## Training Notes

**Key advantages of this setup:**
- ✅ No gcsfuse mounting needed - TensorFlow reads directly from GCS
- ✅ Model weights are saved to Google Drive (when mounted), so they persist after the runtime ends
- ✅ No extra setup is needed once the authentication cell has run
- ✅ Uses Colab GPU automatically
- ✅ Same script works locally and on Colab with just a flag

**Command options:**
```bash
# Run these from the repository root (motion_prediction/).
# `python -m` keeps top-level packages such as `models` importable.

# Colab training (uses GCS paths)
python -m scripts.transformer_train --colab --epochs 100

# Local training (default, uses local paths)
python -m scripts.transformer_train --epochs 50

# Testing mode
python -m scripts.transformer_train --test --model-path path/to/model.pt

# All options
python -m scripts.transformer_train --colab --epochs 100 --batch-size 8 --lr 0.001
```

**To monitor training:**
- Watch the loss values printed every 10 batches
- Models are saved after each epoch to Google Drive (or local folder)
- You can interrupt and resume training anytime
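
Since a model is saved after each epoch, resuming comes down to finding the most recent saved file. The helper below is a minimal sketch of that lookup. It assumes a hypothetical `epoch_<N>.pt` naming scheme and a hypothetical function name `latest_checkpoint`; the file names the training script actually writes may differ, so check the weights folder first.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical naming scheme: epoch_<N>.pt (the real script may use another).
CHECKPOINT_PATTERN = re.compile(r"^epoch_(\d+)\.pt$")


def latest_checkpoint(weights_dir: Path) -> Optional[Tuple[int, Path]]:
    """Return (epoch, path) of the highest-numbered checkpoint, or None."""
    if not weights_dir.is_dir():
        return None
    found = []
    for path in weights_dir.iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Compare by epoch number so epoch_10 ranks above epoch_2.
    return max(found, key=lambda item: item[0], default=None)
```

The returned path could then be passed to the script's `--model-path` option, or loaded manually before continuing training.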