# Axion-Sat Stage 1 Precompute (Colab)

This notebook runs the Stage 1 precompute on Google Colab, reading your data from Google Drive and writing the outputs back to it.

**Estimated time:** ~15-18 hours on a T4 GPU (free tier)

**Setup:**
1. Upload your `data/tiles/benv2_catalog` folder to Google Drive
2. Run all cells below
3. Download outputs from Google Drive when done

## 1. Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## 2. Check GPU

In [None]:
!nvidia-smi
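
The same check can be made from Python, so that later cells can stop early when no GPU is attached. This is a minimal sketch, assuming only that `nvidia-smi` is on the PATH whenever a GPU runtime is active. The helper name `gpu_name` is hypothetical and not part of the Axion-Sat repository.

```python
import shutil
import subprocess


def gpu_name():
    """Return the name of the first visible NVIDIA GPU, or None if there is none."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver tools not installed, e.g. on a CPU-only runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # tool is present but no usable GPU was found
    names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return names[0] if names else None


name = gpu_name()
print(f"GPU: {name}" if name else "No GPU detected. Switch the runtime type to GPU.")
```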

## 3. Clone Repository

In [None]:
!git clone https://github.com/yourusername/Axion-Sat.git /content/Axion-Sat
%cd /content/Axion-Sat
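
Re-running the clone cell in the same session fails because the destination folder already exists. One way around that is to skip the clone when a checkout is already present. This is a sketch under that assumption; `clone_once` is a hypothetical helper, not something the repository provides.

```python
import subprocess
from pathlib import Path


def clone_once(repo_url, dest):
    """Clone repo_url into dest unless dest already holds a Git checkout.

    Returns True if a clone was performed, False if it was skipped.
    """
    if (Path(dest) / ".git").is_dir():
        return False  # already cloned earlier in this session
    subprocess.run(["git", "clone", repo_url, str(dest)], check=True)
    return True
```

In the notebook this would be called as `clone_once('https://github.com/yourusername/Axion-Sat.git', '/content/Axion-Sat')`, followed by the same `%cd` line as above.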

## 4. Install Dependencies

In [None]:
!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!pip install -q terratorch albumentations tqdm numpy pillow scipy timm transformers diffusers

## 5. Set up Data Paths

**IMPORTANT:** Update these paths to match the location of your data in Google Drive.

In [None]:
# Update these paths!
DATA_DIR = '/content/drive/MyDrive/Axion-Sat-Data/tiles/benv2_catalog'
OUTPUT_DIR = '/content/drive/MyDrive/Axion-Sat-Data/stage1_outputs'

# Check that the input data exists before starting a long run
import os
if os.path.exists(DATA_DIR):
    print(f"✓ Found data at {DATA_DIR}")
    !ls -lh "{DATA_DIR}" | head -20
else:
    print(f"❌ Data not found at {DATA_DIR}")
    print("   Please upload your tiles to Google Drive first!")
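
Listing the first 20 entries shows that the folder exists, but not whether the whole upload arrived. A quick inventory by file extension can help compare the Drive copy against the local one. This is a minimal sketch; `summarize_tiles` is a hypothetical helper, and it makes no assumption about which file types the catalog contains.

```python
from collections import Counter
from pathlib import Path


def summarize_tiles(data_dir):
    """Count files under data_dir by extension and sum their sizes in bytes."""
    counts = Counter()
    total_bytes = 0
    for path in Path(data_dir).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)"
            counts[path.suffix.lower() or "(none)"] += 1
            total_bytes += path.stat().st_size
    return counts, total_bytes
```

Calling `summarize_tiles(DATA_DIR)` in the notebook returns the per-extension counts and the total size, which can be checked against the same numbers computed on the local machine.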

## 6. Copy Precompute Script

In [None]:
%%writefile /content/Axion-Sat/scripts/00_precompute_stage1_fast.py
# Paste the full precompute script from your local copy below.
# Until you do, this cell overwrites the file with only these comments,
# and step 7 will produce no outputs.

## 7. Run Stage 1 Precompute

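This takes roughly 15-18 hours on a T4 GPU (free tier). Free-tier sessions may disconnect before a run of this length finishes, so check the outputs in step 8 before assuming the job completed.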

In [None]:
!python /content/Axion-Sat/scripts/00_precompute_stage1_fast.py \
    --data-dir "{DATA_DIR}" \
    --output-dir "{OUTPUT_DIR}" \
    --batch-size 64 \
    --timesteps 3 \
    --device cuda

## 8. Verify Outputs

In [None]:
import os
# Sort so the listing is stable between runs (os.listdir order is arbitrary)
output_files = sorted(f for f in os.listdir(OUTPUT_DIR) if f.endswith('.npz'))
print(f"✓ Generated {len(output_files)} output files")
print(f"  Outputs saved to: {OUTPUT_DIR}")
print("\nFirst 10 files:")
for f in output_files[:10]:
    print(f"  - {f}")
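
Counting files does not show whether each one was written completely, which matters if a session disconnected mid-write. Because `.npz` files are ZIP archives, the standard library can run an integrity check without loading any arrays. This is a minimal sketch; `find_corrupt_npz` is a hypothetical helper, and it only verifies the archive structure and checksums, not the array contents.

```python
import zipfile
import zlib
from pathlib import Path


def find_corrupt_npz(output_dir):
    """Return the names of .npz files in output_dir that fail a ZIP integrity check."""
    bad = []
    for path in sorted(Path(output_dir).glob("*.npz")):
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the first member with a bad CRC, or None if all pass
                if archive.testzip() is not None:
                    bad.append(path.name)
        except (zipfile.BadZipFile, zlib.error, OSError):
            bad.append(path.name)  # truncated, unreadable, or not a ZIP archive
    return bad
```

An empty list from `find_corrupt_npz(OUTPUT_DIR)` means every archive passed the check. Any names it returns are candidates for recomputation.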

## 9. Done!

Your precomputed Stage 1 outputs are saved in Google Drive.

Next steps:
1. Download outputs from Google Drive to your local D: drive
2. Run Stage 2 training locally (or on Colab with another notebook)
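
For the download step, a single archive can be easier to fetch from Google Drive than many individual files. This is a sketch using `shutil.make_archive` from the standard library; `bundle_outputs` is a hypothetical helper, and the archive location in the usage note is only an example path.

```python
import shutil
from pathlib import Path


def bundle_outputs(output_dir, archive_base):
    """Zip the contents of output_dir into archive_base + '.zip' and return that path."""
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        raise FileNotFoundError(f"No such output directory: {output_dir}")
    # archive_base should point outside output_dir so the archive does not include itself
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(output_dir))
```

For example, `bundle_outputs(OUTPUT_DIR, '/content/drive/MyDrive/Axion-Sat-Data/stage1_outputs_bundle')` would write `stage1_outputs_bundle.zip` next to the output folder, assuming Drive has enough free space for a second copy.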