# 🌧️ Timer-XL Peru Rainfall Prediction - Google Colab

This notebook demonstrates the complete pipeline for training Timer-XL on Peru rainfall data.

**Steps:**
1. Set up the environment
2. Upload ERA5 data
3. Preprocess data
4. Train Timer-XL with transfer learning
5. Save the checkpoint to Google Drive
6. Evaluate results

## 1. Setup Environment

In [None]:
# Check GPU
!nvidia-smi

In [None]:
# Clone repository
!git clone https://github.com/YOUR_USERNAME/AdaptationOpenLTM.git
%cd AdaptationOpenLTM

In [None]:
# Install dependencies
!pip install -r requirements.txt

In [None]:
# Mount Google Drive (to save checkpoints)
from google.colab import drive
drive.mount('/content/drive')

# Create checkpoint directory in Drive
!mkdir -p '/content/drive/MyDrive/timer_xl_checkpoints/'

## 2. Upload ERA5 Data

**Option A**: Upload manually via Colab interface  
**Option B**: Copy from your Google Drive (mounted in step 1)

In [None]:
# Option A: Upload files
from google.colab import files

print("Upload your ERA5 .zip files (era5_peru_2023.zip, era5_peru_2024.zip)")
uploaded = files.upload()

# Move to raw_era5 directory
!mkdir -p datasets/raw_era5
!mv era5_peru_*.zip datasets/raw_era5/

In [None]:
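# Option B: Copy from Google Drive
# Create the target directory first in case the Option A cell was skipped
!mkdir -p datasets/raw_era5
!cp '/content/drive/MyDrive/ERA5_Data/era5_peru_2023.zip' datasets/raw_era5/
!cp '/content/drive/MyDrive/ERA5_Data/era5_peru_2024.zip' datasets/raw_era5/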

In [None]:
# Verify files
!ls -lh datasets/raw_era5/
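
A listing shows that the files exist but not that the upload finished cleanly. As an optional extra check, the sketch below opens each archive with the standard-library `zipfile` module and reports truncated or corrupt files. The helper name `check_archives` is illustrative and is not part of the repository.

```python
import zipfile
from pathlib import Path


def check_archives(raw_dir):
    """Return {archive name: status} for every .zip file in raw_dir."""
    report = {}
    for path in sorted(Path(raw_dir).glob("*.zip")):
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the first corrupt member, or None if all pass
                bad_member = archive.testzip()
                count = len(archive.namelist())
        except zipfile.BadZipFile:
            report[path.name] = "not a valid zip file"
            continue
        if bad_member is None:
            report[path.name] = f"ok ({count} files)"
        else:
            report[path.name] = f"corrupt member: {bad_member}"
    return report


for name, status in check_archives("datasets/raw_era5").items():
    print(f"{name}: {status}")
```

If an archive is reported as invalid, re-uploading it before preprocessing is usually quicker than debugging a failure later in the pipeline.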

## 3. Preprocess Data

In [None]:
# Run preprocessing script
!python preprocessing/preprocess_era5_peru.py \
    --input_dir datasets/raw_era5 \
    --output_dir datasets/processed \
    --years 2023,2024 \
    --target_horizon 24 \
    --threshold 0.1

In [None]:
# Load processed data for quick inspection
import pandas as pd
import json

df = pd.read_csv('datasets/processed/peru_rainfall.csv')
print(f"Dataset shape: {df.shape}")
print("\nFirst few rows:")
print(df.head())

# Load statistics
with open('datasets/processed/preprocessing_stats.json') as f:
    stats = json.load(f)
print("\nStatistics:")
print(json.dumps(stats, indent=2))
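
Since the preprocessing step takes a `--threshold` and the model is a classifier, it may be worth checking how balanced the resulting labels are before training. The sketch below counts label values with the standard library only. It assumes the processed CSV has a single label column; `rain_label` is a placeholder name, not one confirmed by the preprocessing script, so substitute the column shown by `df.head()`.

```python
import csv
from collections import Counter


def label_share(csv_path, label_column):
    """Return {label value: fraction of rows} for one column of a CSV file."""
    with open(csv_path, newline="") as handle:
        counts = Counter(row[label_column] for row in csv.DictReader(handle))
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# 'rain_label' is a hypothetical column name; replace it with the real one.
# shares = label_share('datasets/processed/peru_rainfall.csv', 'rain_label')
# print(shares)
```

A strongly skewed split is one reason to keep an imbalance-aware loss enabled during training.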

## 4. Train Timer-XL

In [None]:
# Check for the pre-trained Timer-XL checkpoint (manual download required)
import os

checkpoint_path = 'checkpoints/timer_xl/checkpoint.pth'

if not os.path.exists(checkpoint_path):
    os.makedirs('checkpoints/timer_xl', exist_ok=True)

    # The checkpoint is hosted on Tsinghua Cloud and is not fetched automatically:
    # download it in a browser, then upload it to the directory created above.
    print("Pre-trained Timer-XL checkpoint not found.")
    print("Please download checkpoint.pth from:")
    print("https://cloud.tsinghua.edu.cn/f/01c35ca13f474176be7b/")
    print("And upload it to checkpoints/timer_xl/")
else:
    print(f"✅ Checkpoint found: {checkpoint_path}")

In [None]:
# Train model
# Note: Reduce batch_size if you get out-of-memory (OOM) errors

!python run.py \
  --task_name forecast \
  --is_training 1 \
  --root_path datasets/processed/ \
  --data_path peru_rainfall.csv \
  --model_id peru_rainfall_transfer_learning \
  --model timer_xl_classifier \
  --data PeruRainfall \
  --seq_len 1440 \
  --input_token_len 96 \
  --output_token_len 96 \
  --test_seq_len 1440 \
  --test_pred_len 2 \
  --e_layers 8 \
  --d_model 1024 \
  --d_ff 2048 \
  --n_heads 8 \
  --dropout 0.1 \
  --activation relu \
  --batch_size 128 \
  --learning_rate 1e-5 \
  --train_epochs 50 \
  --patience 10 \
  --gpu 0 \
  --cosine \
  --tmax 50 \
  --use_norm \
  --adaptation \
  --pretrain_model_path checkpoints/timer_xl/checkpoint.pth \
  --loss CE \
  --checkpoints results/peru_rainfall/ \
  --use_focal_loss \
  --itr 1
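
As a rough guide to the length flags above: assuming, as in the Timer-XL design, that the context window is split into non-overlapping patches of `input_token_len` points, these settings give 1440 / 96 = 15 tokens per series. The sketch below only does that arithmetic. The helper name is illustrative, the divisibility check is an assumption about how the data loader tokenizes, and the reading in days assumes the processed data keeps ERA5's hourly resolution.

```python
def context_summary(seq_len, input_token_len, steps_per_day=24):
    """Summarize how a context window splits into fixed-length tokens."""
    if seq_len % input_token_len != 0:
        raise ValueError("seq_len should be a multiple of input_token_len")
    return {
        "tokens": seq_len // input_token_len,
        "days_of_context": seq_len / steps_per_day,
    }


print(context_summary(seq_len=1440, input_token_len=96))
# {'tokens': 15, 'days_of_context': 60.0}
```

When trying other context lengths, keeping `--seq_len` a multiple of `--input_token_len` avoids a partial final token under this assumption.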

## 5. Save Checkpoint to Drive

Colab's local storage is wiped when the runtime disconnects, so copy the trained model to Drive to avoid losing it.

In [None]:
# Copy best checkpoint to Google Drive
!cp results/peru_rainfall/peru_rainfall_transfer_learning/checkpoint.pth \
   '/content/drive/MyDrive/timer_xl_checkpoints/peru_rainfall_best.pth'

print("✅ Checkpoint saved to Google Drive!")
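
The success message above prints even if the `cp` command failed, for example when training did not produce a checkpoint. As an optional safeguard, the sketch below compares SHA-256 hashes of the source and the Drive copy. Both helper names are illustrative; the paths are the ones used in the copy command.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source, destination):
    """Return True when both files exist and have identical contents."""
    try:
        return sha256_of(source) == sha256_of(destination)
    except FileNotFoundError:
        return False


ok = copies_match(
    'results/peru_rainfall/peru_rainfall_transfer_learning/checkpoint.pth',
    '/content/drive/MyDrive/timer_xl_checkpoints/peru_rainfall_best.pth',
)
print("Copy verified" if ok else "Copy missing or different")
```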

## 6. Quick Evaluation

In [None]:
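# Load test results (if available)
import os
import json

results_dir = 'results/peru_rainfall/peru_rainfall_transfer_learning/'

# Find metrics files; the directory only exists after training has run
metric_files = []
if os.path.isdir(results_dir):
    metric_files = sorted(
        name for name in os.listdir(results_dir)
        if name.startswith('test_metrics') and name.endswith('.json')
    )

for name in metric_files:
    with open(os.path.join(results_dir, name)) as f:
        metrics = json.load(f)
    print(f"Test Metrics ({name}):")
    print(json.dumps(metrics, indent=2))

if not metric_files:
    print(f"No test_metrics*.json files found in {results_dir}")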

## 🎉 Training Complete!

**Next steps:**
1. Download results from `results/peru_rainfall/`
2. Analyze confusion matrix and classification report
3. Try different context lengths (`--seq_len`)
4. Experiment with different hyperparameters
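
For the confusion-matrix step, it can help to derive precision, recall, and F1 directly from the four counts, because accuracy alone can look high when rain events are rare. The sketch below is a minimal, dependency-free version for the binary case. The function name is illustrative, and the counts in the example call are made up for demonstration; they are not results from this notebook.

```python
def binary_report(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy from binary confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up counts for illustration only; they are not results from this run.
print(binary_report(tp=30, fp=10, fn=20, tn=140))
```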