This notebook compares model architectures, hyperparameters, and inputs by their training and validation error.

In [1]:
import xarray as xr
import numpy as np
from src.crossval import run_crossval

In [2]:
input_dir = "/glade/work/milesep/convective_outlook_ml"
target_dir = "data/processed_data"
stats_dir = "data/processed_data"
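
Each model in the comparison below takes roughly an hour to cross-validate (five folds of about 12 minutes each in the logs), so it may be worth confirming that the expected files exist before starting. The following is a minimal sketch of such a check. The helper names `expected_paths` and `missing_paths` are hypothetical and are not part of `src.crossval`; the file-name patterns are copied from the training loop in this notebook.

```python
from pathlib import Path


def expected_paths(input_dir, target_dir, stats_dir, level):
    """Return the three files the training loop opens for a given data level."""
    # Same naming rule as the training loop: "slgt*" levels read the "_slgt" targets
    suffix = "_slgt" if level.startswith("slgt") else ""
    return {
        "inputs": Path(input_dir) / f"train_inputs_{level}.zarr",
        "targets": Path(target_dir) / f"train_targets{suffix}.nc",
        "stats": Path(stats_dir) / f"daily_input_stats_{level}.nc",
    }


def missing_paths(paths):
    """List the entries of `paths` that do not exist on disk."""
    return [name for name, path in paths.items() if not path.exists()]


# Directories as set in the cell above; "full" is the only level used in this run
paths = expected_paths(
    "/glade/work/milesep/convective_outlook_ml",
    "data/processed_data",
    "data/processed_data",
    "full",
)
for name, path in paths.items():
    print(f"{name}: {path}")
print("missing:", missing_paths(paths))
```

An empty `missing` list suggests the loop can at least open its inputs; it does not validate the contents of the files.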

In [4]:
# Compare all models and save the training and validation curves along with each model's name, data level, and model/training hyperparameters
model_names = ["cnn3d_3_layer_big", "cnn3d_fewer_channels", "cnn3d_3_layer_fewer_channels"]

levels = ["full", "full", "full"]

lrs = [1e-3, 1e-3, 1e-3]

batch_sizes = [8, 8, 8]

epochs = [50] * len(model_names)
# restarts = [True, False, False, False, False, False, False, False, False, False, False]
# Skip linear regression on the full dataset: it has too many parameters
for name, level, lr, batch_size, epoch in zip(model_names, levels, lrs, batch_sizes, epochs):
    # Levels starting with "slgt" read the "_slgt" target file; all others read the default targets
    slgt_mod_str = '_slgt' if level.startswith('slgt') else ''
    print(name, level, lr, batch_size)
    inputs = xr.open_zarr(f"{input_dir}/train_inputs_{level}.zarr")
    targets = xr.open_dataset(f"{target_dir}/train_targets{slgt_mod_str}.nc")
    stats = xr.open_dataset(f"{stats_dir}/daily_input_stats_{level}.nc")
    scores, train_counts, val_counts = run_crossval(
        inputs, targets, stats, name,
        batch_size=batch_size, lr=lr, epochs=epoch, level=level, restart=True,
    )
    # The mean is weighted by per-fold validation counts; the spread is the unweighted std across folds
    print(f"{name}: {np.average(scores, weights=val_counts):.3f} ± {np.std(scores):.3f}")

cnn3d_3_layer_big full 0.001 8
Using device: cuda

Fold 0:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:41<00:00, 14.03s/epoch, train_loss=0.8414, val_loss=1.0289]



Fold 1:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:38<00:00, 13.97s/epoch, train_loss=0.8030, val_loss=1.1571]



Fold 2:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:25<00:00, 13.71s/epoch, train_loss=0.8575, val_loss=1.0144]



Fold 3:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [12:02<00:00, 14.46s/epoch, train_loss=0.8823, val_loss=0.9088]



Fold 4:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:38<00:00, 13.97s/epoch, train_loss=0.8613, val_loss=0.8661]


Logging average losses...
Reading fold 0
Reading fold 1
Reading fold 2
Reading fold 3
Reading fold 4
cnn3d_3_layer_big: 0.945 ± 0.113
cnn3d_fewer_channels full 0.001 8
Using device: cuda

Fold 0:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:39<00:00, 13.99s/epoch, train_loss=0.9445, val_loss=0.9634]



Fold 1:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:37<00:00, 13.95s/epoch, train_loss=0.8908, val_loss=1.2080]



Fold 2:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:39<00:00, 13.98s/epoch, train_loss=0.9381, val_loss=1.0020]



Fold 3:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [12:07<00:00, 14.54s/epoch, train_loss=0.9598, val_loss=0.8944]



Fold 4:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:42<00:00, 14.05s/epoch, train_loss=0.9819, val_loss=0.8258]


Logging average losses...
Reading fold 0
Reading fold 1
Reading fold 2
Reading fold 3
Reading fold 4
cnn3d_fewer_channels: 0.966 ± 0.132
cnn3d_3_layer_fewer_channels full 0.001 8
Using device: cuda

Fold 0:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:42<00:00, 14.06s/epoch, train_loss=0.9194, val_loss=1.0003]



Fold 1:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:41<00:00, 14.03s/epoch, train_loss=0.8711, val_loss=1.2028]



Fold 2:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:47<00:00, 14.14s/epoch, train_loss=0.9131, val_loss=0.9853]



Fold 3:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:30<00:00, 13.80s/epoch, train_loss=0.9542, val_loss=0.8761]



Fold 4:
Loading data...
Standardizing data...
Setting up datasets...
Model device: cuda:0
Starting training from scratch


Training: 100%|██████████| 50/50 [11:39<00:00, 13.99s/epoch, train_loss=0.9767, val_loss=0.8286]


Logging average losses...
Reading fold 0
Reading fold 1
Reading fold 2
Reading fold 3
Reading fold 4
cnn3d_3_layer_fewer_channels: 0.957 ± 0.128
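
The summary lines above take their mean from `np.average(scores, weights = val_counts)` and their spread from `np.std(scores)`, so the mean is weighted by validation count while the spread is not. If a spread consistent with the weighted mean is wanted, one option is a weighted standard deviation, sketched below. The function name `weighted_mean_std` and the example numbers are hypothetical; they are not the notebook's actual scores or fold sizes.

```python
import numpy as np


def weighted_mean_std(scores, weights):
    """Weighted mean and weighted (population) standard deviation of per-fold scores."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = np.average(scores, weights=weights)
    # The variance reuses the weights that produced the mean
    var = np.average((scores - mean) ** 2, weights=weights)
    return mean, np.sqrt(var)


# Hypothetical per-fold scores and validation counts, for illustration only
example_scores = [1.0, 1.2, 0.9, 0.8, 1.1]
example_val_counts = [100, 80, 120, 100, 100]

mean, std = weighted_mean_std(example_scores, example_val_counts)
print(f"weighted: {mean:.3f} ± {std:.3f}")
print(f"unweighted std for comparison: {np.std(example_scores):.3f}")
```

With equal fold sizes the two spreads coincide, so the distinction only matters when validation counts differ noticeably between folds.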
