### Having a look at the SHERPA results
Unfortunately, the SHERPA dashboard itself cannot load these results, so they are read directly from the `results.csv` files.

In [27]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

In [62]:
def browse_load_dir(basepath: Path, nbest: int):
    results = {}
    result_paths = basepath.glob('tg*/results.csv')
    for path in result_paths:
        full = pd.read_csv(path,header = 0)
        parts = path.parts[-2].split('_')
        try:
            extremity = parts[-5]
            timeagg = parts[-6]
            results.update({(timeagg,extremity):get_best_trials(full, nbest = nbest)})
        except IndexError:
            results.update({('',''):get_best_trials(full, nbest = nbest)})
    results = pd.concat(results, axis = 0)
    results.index.names = ['timeagg','extremity','rownr'] # Last level is a bit redundant
    return results

def get_best_trials(sherpa_results: pd.DataFrame, nbest:int = 11):
    """
    In the sherpa results I average after each iteration, so the final iteration carries the mean score.
    We need to disregard the 'complete' row registered by sherpa (as that takes the minimum of the iterations).
    """
    max_iter = sherpa_results['Iteration'].max()
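    # Keep only the last intermediate row of each trial (it holds the mean score), not the 'COMPLETED' row
    final = sherpa_results.loc[np.logical_and(sherpa_results['Iteration'] == max_iter, sherpa_results['Status'] == 'INTERMEDIATE'),:]
    best = final.iloc[final['Objective'].rank().values <= nbest,:] # Ranks are ascending, and a lower objective is better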
    return best
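
To make the selection logic above concrete, here is a minimal sketch on a synthetic results table. The toy values and the name `select_best_trials` are assumptions for illustration only. It swaps the rank comparison for `nsmallest`, which gives the same rows when there are no ties in `Objective` but returns exactly `nbest` rows when there are.

```python
import pandas as pd

def select_best_trials(sherpa_results: pd.DataFrame, nbest: int = 11) -> pd.DataFrame:
    """Hypothetical variant of get_best_trials, using nsmallest instead of rank."""
    max_iter = sherpa_results['Iteration'].max()
    # Final-iteration rows that are still INTERMEDIATE carry the mean score
    is_final = (sherpa_results['Iteration'] == max_iter) & (sherpa_results['Status'] == 'INTERMEDIATE')
    final = sherpa_results.loc[is_final, :]
    # Lower objective is better
    return final.nsmallest(nbest, 'Objective')

# Synthetic table: 3 trials, 2 iterations each, plus one COMPLETED row per trial
toy = pd.DataFrame({
    'Trial-ID': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'Status': ['INTERMEDIATE', 'INTERMEDIATE', 'COMPLETED'] * 3,
    'Iteration': [1, 2, 2] * 3,
    'Objective': [0.5, 0.4, 0.3, 0.6, 0.2, 0.1, 0.9, 0.8, 0.7],
})

best = select_best_trials(toy, nbest=2)
print(best)
```

The COMPLETED rows hold the lowest objectives in this toy table, yet they are excluded, which is the behaviour the docstring asks for.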

### All combinations
With their four best sequential forward predictors, selected for a model with:

|param|value|
|---|---|
|batch size|32|
|patience|7|
|epochs|40|
|lr|0.0014|
|nhidden layers | 1|
|hidden nodes|4|

In [54]:
params = ['batch_size','earlystop_patience','lr','n_hidden_layers','n_hiddenlayer_nodes']

In [51]:
combs = browse_load_dir(Path('/nobackup/users/straaten/hyperparams/'), nbest = 11)
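# numeric_only skips non-numeric columns such as 'Status', which newer pandas versions no longer drop silently
med_of_best = combs.groupby(['timeagg','extremity']).median(numeric_only = True)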
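
The stacking and grouping step can be sketched on toy data. The values below are made up, and only the index level names follow the notebook. Concatenating a dict with tuple keys yields one index level per tuple element plus the original row number, after which the median is taken per combination.

```python
import pandas as pd

# Toy frames keyed by (timeagg, extremity), mimicking the dict built in browse_load_dir
frames = {
    ('21D-roll-mean', 'q0.5'): pd.DataFrame({'lr': [0.001, 0.002, 0.003],
                                             'Status': ['INTERMEDIATE'] * 3}),
    ('31D-roll-mean', 'q0.9'): pd.DataFrame({'lr': [0.004, 0.006],
                                             'Status': ['INTERMEDIATE'] * 2}),
}

# Tuple keys become the first two index levels, the original row index the third
stacked = pd.concat(frames, axis=0)
stacked.index.names = ['timeagg', 'extremity', 'rownr']

# Median per combination, restricted to numeric columns
medians = stacked.groupby(['timeagg', 'extremity']).median(numeric_only=True)
print(medians)
```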

In [69]:
med_of_best

timeagg,extremity,Trial-ID,Iteration,batch_size,earlystop_patience,lr,n_hidden_layers,n_hiddenlayer_nodes,Objective,earliest_stop,latest_stop
21D-roll-mean,q0.5,64.0,7.0,32.0,8.0,0.001392,2.0,5.0,0.38529,11.0,37.0
21D-roll-mean,q0.66,126.0,7.0,64.0,10.0,0.00105,2.0,5.0,0.305637,14.0,102.0
21D-roll-mean,q0.75,106.0,7.0,16.0,9.0,0.000708,2.0,6.0,0.222335,22.0,32.0
21D-roll-mean,q0.9,111.0,7.0,32.0,8.0,0.001164,2.0,5.0,0.091315,22.0,33.0
31D-roll-mean,q0.5,74.0,7.0,32.0,8.0,0.001484,2.0,6.0,0.352266,11.0,21.0
31D-roll-mean,q0.66,123.0,7.0,16.0,8.0,0.001253,2.0,7.0,0.259604,12.0,25.0
31D-roll-mean,q0.75,77.0,7.0,32.0,9.0,0.001239,2.0,2.0,0.205782,12.0,45.0
31D-roll-mean,q0.9,67.0,7.0,64.0,8.0,0.00128,2.0,6.0,0.100925,10.0,33.0
tg-ex-q0.75-21D,ge11D,115.0,7.0,16.0,9.0,0.000997,2.0,6.0,0.092871,13.0,56.0
tg-ex-q0.75-21D,ge5D,121.0,7.0,64.0,8.0,0.001457,2.0,6.0,0.367933,15.0,18.0


In [56]:
med_of_best[params].median()

batch_size             32.000000
earlystop_patience      8.500000
lr                      0.001246
n_hidden_layers         2.000000
n_hiddenlayer_nodes     6.000000
dtype: float64

There does not seem to be a strong dependence on the predictand and its extremity. Overall the values tend towards more complexity than under the J-measure selection. However, the predictors here were selected specifically for performance on the same train-validation set, so this may be too optimistic.
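
One way to put a number on the lack of dependence is to look at the spread of the per-combination medians. The sketch below re-enters three columns from the `med_of_best` table by hand. The `max_over_min` ratio is an illustrative choice, not a measure used elsewhere in this notebook.

```python
import pandas as pd

# Values copied from the med_of_best table (one entry per timeagg/extremity combination)
med = pd.DataFrame({
    'lr': [0.001392, 0.00105, 0.000708, 0.001164, 0.001484,
           0.001253, 0.001239, 0.00128, 0.000997, 0.001457],
    'n_hiddenlayer_nodes': [5, 5, 6, 5, 6, 7, 2, 6, 6, 6],
    'batch_size': [32, 64, 16, 32, 32, 16, 32, 64, 16, 64],
})

# Summarise each hyperparameter across the combinations
spread = med.agg(['min', 'median', 'max']).T
spread['max_over_min'] = spread['max'] / spread['min']
print(spread)
```

The learning rate stays within roughly a factor of two across combinations, and the batch size and node count vary only over a few discrete values.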

### J-measure (8) + dynamical
Predicting the (old / non-smooth) tgex in 21D > 5, with the 8 most important J-measure predictors. The preference here seems to be for a less deep network.

In [66]:
browse_load_dir(Path('/nobackup/users/straaten/hyperparams/nonsmooth/'), nbest = 11)[params].median()

batch_size             32.000000
earlystop_patience      7.000000
lr                      0.001549
n_hidden_layers         1.000000
n_hiddenlayer_nodes     5.000000
dtype: float64

### 21D > q0.5, lead 12-15
With the 4 most important multipass predictors.
This is a batch of new predictor-selection runs made after estimating the smooth climatologies, but without full access to the 21-day and 31-day dynamic predictors.

In [67]:
browse_load_dir(Path('/nobackup/users/straaten/hyperparams/swvlshortscale/'), nbest = 11)[params].median()

batch_size             16.000000
earlystop_patience      6.000000
lr                      0.001579
n_hidden_layers         2.000000
n_hiddenlayer_nodes     5.000000
dtype: float64

### Final choice


|param|value|
|---|---|
|batch size|32|
|early stop patience|7|
|epochs|200|
|learning rate|0.0014|
|n hidden layers | 1|
|n hidden nodes|4|
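
As a quick sanity check, the final choice can be placed next to the medians reported in the three experiments above. The column labels are hypothetical names for those experiments, and the numbers are copied by hand from the outputs in this notebook.

```python
import pandas as pd

index = ['batch_size', 'earlystop_patience', 'lr', 'n_hidden_layers', 'n_hiddenlayer_nodes']

# Medians of the best trials, copied from the three outputs above
experiments = pd.DataFrame({
    'all_combinations': [32, 8.5, 0.001246, 2, 6],
    'nonsmooth_jmeasure': [32, 7, 0.001549, 1, 5],
    'swvlshortscale': [16, 6, 0.001579, 2, 5],
}, index=index)

# Final choice from the table above (epochs omitted, as it was not tuned here)
final_choice = pd.Series([32, 7, 0.0014, 1, 4], index=index)

comparison = experiments.assign(final=final_choice)
# Does the final value lie between the smallest and largest experiment median?
comparison['within_range'] = final_choice.between(experiments.min(axis=1),
                                                  experiments.max(axis=1))
print(comparison)
```

Every final value except the number of hidden nodes falls inside the range spanned by the experiments. The node count of 4 is slightly below all three medians, so the final choice leans towards the simpler model.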