# NASA Cloud-ML Training on Google Colab

This notebook sets up remote training for the Cloud-ML project on Google Colab's free GPU tier. Colab sessions are limited to roughly 12 hours, so this setup is best suited to short experiments and ablation runs. For longer training, consider Colab Pro or another service.

## Prerequisites
- Your GitHub repo: `https://github.com/rylanmalarchick/cloudMLPublic.git`
- Data: CPL and camera/nav data (see the README), either uploaded to Google Drive or available via download links.
- A GitHub personal access token (PAT), only if the repo is private.
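
If a PAT is needed, one way to keep it out of the saved notebook is to read it at run time and build the clone URL in Python. The following is a minimal sketch; `with_token` is a hypothetical helper, not part of the repo.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_token(repo_url: str, token: str) -> str:
    """Return an HTTPS clone URL with the token embedded as the user part."""
    parts = urlsplit(repo_url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("expected an https:// clone URL")
    # Percent-encode the token so special characters cannot break the URL.
    netloc = f"{quote(token, safe='')}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Usage in Colab (assumption: you paste the token when prompted):
#   from getpass import getpass
#   url = with_token("https://github.com/rylanmalarchick/cloudMLPublic.git", getpass("PAT: "))
#   !git clone {url} /content/repo
print(with_token("https://github.com/rylanmalarchick/cloudMLPublic.git", "example-token"))
```

Git still records the token in the clone's `.git/config`, so treat `/content/repo` as sensitive for the rest of the session.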

## Steps
1. Mount Google Drive.
2. Clone the repo.
3. Install dependencies.
4. Set up data.
5. Run training.
6. Save results.
7. Aggregate ablation results.
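
Before starting, it can help to confirm that a GPU runtime is actually attached. The sketch below shells out to `nvidia-smi -L` and falls back to an empty list when the tool is missing or fails; `list_gpus` is a hypothetical helper name.

```python
import shutil
import subprocess


def list_gpus() -> list[str]:
    """Return one line per visible NVIDIA GPU, or [] if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
if gpus:
    print("GPU runtime detected:", *gpus, sep="\n  ")
else:
    print("No GPU visible; in Colab, select a GPU under Runtime > Change runtime type.")
```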

In [None]:
# Step 1: Mount Google Drive for persistent storage
from google.colab import drive
drive.mount('/content/drive')

# Create a directory for the project
!mkdir -p /content/drive/MyDrive/CloudML
%cd /content/drive/MyDrive/CloudML

In [None]:
# Step 2: Clone or update the repo
# If private, use a PAT: !git clone https://<YOUR_PAT>@github.com/rylanmalarchick/cloudMLPublic.git /content/repo
# (remove the token from the cell before saving or sharing the notebook)
import os
if not os.path.isdir('/content/repo/.git'):  # Check the clone target, not the Drive working directory
    !git clone https://github.com/rylanmalarchick/cloudMLPublic.git /content/repo
else:
    print('Repo already exists, pulling latest changes.')
    !git -C /content/repo pull origin main
# Work from the repo directory (keep comments off %cd lines: the magic treats them as part of the path)
%cd /content/repo

In [None]:
# Step 3: Install dependencies
!pip install -r requirements.txt

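# Optional: reinstall PyTorch built against CUDA 12.1 if the installed build is not GPU-enabled.
# Colab images normally ship with a CUDA build of PyTorch, so this step is often unnecessary.
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121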
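
The second install downloads PyTorch again even when a working build is already present. A quick import check, sketched below, shows which modules are actually missing; `missing_modules` and the `wanted` list are illustrative assumptions, not taken from the repo's `requirements.txt`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: these are the top-level modules the install cell is meant to provide.
wanted = ["torch", "torchvision", "torchaudio", "yaml"]
print("Missing:", missing_modules(wanted) or "none")
```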

In [None]:
# Step 4: Set up data
# Upload data to Drive or download from links
# Example: Download from HAR and CPL (replace with actual links/paths)
# !wget -P data/ <HAR_DATA_LINK>
# !wget -P data/ <CPL_DATA_LINK>

# Point the config at Drive so inputs and outputs persist across sessions.
# Note: edit whichever config file you pass to --config in Step 5.
import yaml
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)
config['data_directory'] = '/content/drive/MyDrive/CloudML/data/'
config['output_directory'] = '/content/drive/MyDrive/CloudML/plots/'
with open('config.yaml', 'w') as f:
    yaml.safe_dump(config, f, sort_keys=False)  # Keep key order; YAML comments in the file are not preserved

# Ensure data folder exists
!mkdir -p /content/drive/MyDrive/CloudML/data
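
Since `mkdir -p` succeeds even when nothing has been uploaded, it is worth checking that the data folder really contains files before training. A minimal sketch, with `summarize_data_dir` as a hypothetical helper:

```python
from pathlib import Path


def summarize_data_dir(data_dir):
    """Return (file_count, total_bytes) for all files under data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        return 0, 0
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


count, size = summarize_data_dir("/content/drive/MyDrive/CloudML/data")
print(f"{count} files, {size / 1e6:.1f} MB")
if count == 0:
    print("Data folder is empty or missing; upload the CPL and camera/nav files first.")
```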

In [None]:
# Step 5: Run training
# Example: Run a single short experiment for testing
# (config path matches the ablation loop below; adjust if your checkout stores it elsewhere)
!python main.py --config configs/bestComboConfig.yaml --epochs 10

# For ablation: Loop with command-line overrides (no extra YAML files)
ablations = [
    '--angles_mode both',  # All angles
    '--angles_mode sza_only',  # Zenith only
    '--use_spatial_attention false --use_temporal_attention false',  # No attention
    '--loss_type mae',  # Plain MAE loss
    '',  # Baseline (attention on, etc.)
    '--augment false',  # No augmentation
    '--architecture.name gnn'  # GNN architecture
]
for i, override in enumerate(ablations):
    print(f'Running ablation {i+1}: {override or "baseline"}')
    # Short epochs for testing; increase for a full run
    !python main.py --config configs/bestComboConfig.yaml {override} --epochs 5

# Snapshot of GPU state after the runs finish
!nvidia-smi
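
The `!python ... {override}` line relies on IPython interpolating the override string into a shell command, and a failed run does not stop the loop. An alternative, sketched here under the assumption that `main.py` accepts the same flags, is to build each argument list explicitly; `build_command` is a hypothetical helper.

```python
import shlex


def build_command(config_path, override, epochs):
    """Build the argument list for one training run."""
    # shlex.split handles the empty baseline override and multi-flag strings alike.
    return [
        "python", "main.py",
        "--config", config_path,
        *shlex.split(override),
        "--epochs", str(epochs),
    ]


for override in ["", "--loss_type mae", "--use_spatial_attention false --use_temporal_attention false"]:
    print(build_command("configs/bestComboConfig.yaml", override, 5))

# To execute a run and raise on failure (from the repo directory):
#   import subprocess
#   subprocess.run(build_command("configs/bestComboConfig.yaml", "--loss_type mae", 5), check=True)
```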

In [None]:
# Step 6: Save results
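# Plots are written to the Drive output_directory set in Step 4.
# Anything written under /content/repo (e.g. logs/, models/) is lost when the session ends unless copied to Drive.
# Download manually or use Colab's file browser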

# Optional: Zip and download
# !zip -r results.zip plots/ logs/ models/
# from google.colab import files
# files.download('results.zip')

print("Training complete! Check Drive for outputs.")
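
Because `/content/repo` lives on the session's temporary disk, outputs written there need to be copied to Drive to survive a disconnect. The sketch below assumes runs write `logs/` and `models/` under the repo directory (the names used in the zip example above); `backup_dirs` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def backup_dirs(src_root, dst_root, names):
    """Copy each existing directory in `names` from src_root to dst_root; return those copied."""
    copied = []
    for name in names:
        src = Path(src_root) / name
        if src.is_dir():
            # dirs_exist_ok=True (Python 3.8+) lets repeated backups overwrite in place.
            shutil.copytree(src, Path(dst_root) / name, dirs_exist_ok=True)
            copied.append(name)
    return copied


# Assumption: runs write logs/ and models/ under /content/repo.
print(backup_dirs("/content/repo", "/content/drive/MyDrive/CloudML", ["logs", "models"]))
```

Directories that do not exist are skipped, so the call is safe to run after every experiment.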

## Tips
- **Runtime Limits:** Colab disconnects idle sessions and caps total session length. Colab Pro (paid) allows longer sessions if needed.
- **Data Size:** Upload data to Drive first; Colab's local disk is limited and is wiped when the session ends.
- **Persistence:** Use Drive to save models/plots between sessions.
- **Free Alternatives:** Kaggle Notebooks or Paperspace Gradient (free tiers available).
- **Cost:** Colab is free for basic use; monitor GPU hours.

If issues arise, check the README for setup details.
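
To judge whether a batch of runs fits in one session, a rough time budget helps. The sketch below uses the seven ablations from Step 5 at five epochs each; the 120 seconds per epoch is a placeholder assumption to be replaced with a timing measured from a short test run.

```python
def estimated_hours(n_runs, epochs_per_run, seconds_per_epoch):
    """Total wall-clock estimate in hours, ignoring startup and data-loading overhead."""
    return n_runs * epochs_per_run * seconds_per_epoch / 3600


SESSION_LIMIT_HOURS = 12  # approximate free-tier session limit
hours = estimated_hours(n_runs=7, epochs_per_run=5, seconds_per_epoch=120)  # 120 s is a placeholder
print(f"Estimated {hours:.1f} h of a ~{SESSION_LIMIT_HOURS} h session")
if hours > SESSION_LIMIT_HOURS:
    print("Split the ablations across sessions or reduce the epoch count.")
```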

In [None]:
# Step 7: Aggregate ablation results
# After all ablations, run this to combine metrics into a summary CSV
!python scripts/aggregate_results.py
# Results saved to ablation_summary.csv in Drive
print("Ablation summary generated.")
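
To inspect the summary without leaving the notebook, the CSV can be read with the standard library. This sketch makes no assumption about the column names that `scripts/aggregate_results.py` writes; `load_summary` is a hypothetical helper, and the path should be adjusted to wherever the file is saved.

```python
import csv
from pathlib import Path


def load_summary(csv_path):
    """Return the CSV rows as a list of dicts, or [] if the file does not exist."""
    path = Path(csv_path)
    if not path.is_file():
        return []
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


rows = load_summary("ablation_summary.csv")
print(f"{len(rows)} rows")
for row in rows:
    print(row)
```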