## Cross validation example

Cross validation can provide a better estimate of performance than a single split of your dataset. We have often observed that running Glimr with a single split produces a configuration that is highly overfit to this validation dataset, and that generalizes poorly to independent testing data. Glimr provides tools to perform cross validation to address this.

When performing a cross validation, each model configuration is run in multiple trials with different cross-validation folds. Post experiment analysis can be used to identify the model configuration with the best average performance, or to build ensembles of models trained on different portions of the data.

Revisiting the MNIST example, we demonstrate the formulation of cross validation dataloaders and the experiment analysis tools.

In [1]:
!pip install ../../glimr

Processing /Users/lac5440/Desktop/glimr
  Installing build dependencies ... [?25ldone
[?25h  Getting requirements to build wheel ... [?25ldone
[?25h  Installing backend dependencies ... [?25ldone
[?25h  Preparing metadata (pyproject.toml) ... [?25ldone
Building wheels for collected packages: glimr
  Building wheel for glimr (pyproject.toml) ... [?25ldone
[?25h  Created wheel for glimr: filename=glimr-0.1.dev154+g25953a5.d20231214-py3-none-any.whl size=25100 sha256=9be33169ce38a692821313178ddc4d3576fb6fc28044df6f3b774fa14721ce4f
  Stored in directory: /private/var/folders/tz/qttd962d27n1g_l3f83f9s95byzb9n/T/pip-ephem-wheel-cache-_y_ao52w/wheels/17/71/17/3520291f6e42aef9e08bebc18ab3d238ca66e5490443565920
Successfully built glimr
Installing collected packages: glimr
  Attempting uninstall: glimr
    Found existing installation: glimr 0.1.dev154+g25953a5.d20231214
    Uninstalling glimr-0.1.dev154+g25953a5.d20231214:
      Successfully uninstalled glimr-0.1.dev154+g25953a5.d202312

# Create a cross validation data loader

Cross validation requires a dataloader that accepts `cv_index` and `cv_folds` arguments that represent the fold index and number of folds. The `Search` class will populate your data search space with these arguments automatically.

This data loader below uses stratified k-fold cross validation to build class-balanced folds. Since each trial will run a separate fold, random arguments like the split seed must be fixed across trials. 

In [1]:
import numpy as np
from sklearn.model_selection import StratifiedKFold


def cv_dataloader(batch_size, random_brightness, max_delta, cv_index, cv_folds):
    """Cross-validation MNIST data loader.

    Parameters
    ----------
    batch_size : int
        The number of samples to batch.
    random_brightness : bool
        Whether to apply random brightness augmentation.
    max_delta : float
        The random brightness augmentation parameter.
    cv_index : int
        The index of the requested fold.
    cv_folds : int
        The number of folds in the cross validation.

    Returns
    -------
    train_ds : tf.data.Dataset
        A batched training set for fold `cv_index` used to build models.
    validation_ds : tf.data.Dataset.
        A batched validation set for fold `cv_index` used to evaluate models.
    """

    # load mnist data
    train, validation = tf.keras.datasets.mnist.load_data(path="mnist.npz")

    # combine training, validation sets
    merged = (
        np.concatenate((train[0], validation[0]), axis=0),
        np.concatenate((train[1], validation[1]), axis=0),
    )

    # flattening function
    def mnist_flat(features):
        return features.reshape(
            features.shape[0], features.shape[1] * features.shape[2]
        )

    # stratified k-fold cross validation
    skf = StratifiedKFold(n_splits=cv_folds, shuffle=True, random_state=0)
    train_index, validation_index = [
        (i, o) for (i, o) in skf.split(merged[0], merged[1])
    ][cv_index]

    # extract features, labels
    train_features = tf.cast(mnist_flat(merged[0][train_index]), tf.float32) / 255.0
    train_labels = merged[1][train_index]
    validation_features = (
        tf.cast(mnist_flat(merged[0][validation_index]), tf.float32) / 255.0
    )
    validation_labels = merged[1][validation_index]

    # build datasets
    train_ds = tf.data.Dataset.from_tensor_slices(
        (train_features, {"mnist": tf.one_hot(train_labels, 10)})
    )
    validation_ds = tf.data.Dataset.from_tensor_slices(
        (validation_features, {"mnist": tf.one_hot(validation_labels, 10)})
    )

    # batch
    train_ds = train_ds.shuffle(len(train_labels), reshuffle_each_iteration=True)
    train_ds = train_ds.batch(batch_size)
    validation_ds = validation_ds.batch(batch_size)

    # apply augmentation
    if random_brightness:
        train_ds = train_ds.map(
            lambda x, y: (tf.image.random_brightness(x, max_delta), y)
        )

    return train_ds, validation_ds

# Setting up the search space and model building funciton

The search space and model building function are not impacted by the choice to use cross validation. Reuse everything from the starter example.

In [2]:
from glimr.optimization import optimization_space
from pprint import pprint
from ray import tune
import tensorflow as tf

# define the possible layer activations
activations = tune.choice(
    ["elu", "gelu", "linear", "relu", "selu", "sigmoid", "softplus"]
)

# define the layer 1 hyperparameters
layer1 = {
    "activation": activations,
    "dropout": tune.quniform(0.0, 0.2, 0.05),
    "units": tune.choice([64, 48, 32, 16]),
}

# set the loss as a hyperparameter
loss = tune.choice(
    [
        {"name": "categorical_hinge", "loss": tf.keras.losses.CategoricalHinge},
        {"name": "categorical_crossentropy", "loss": tf.keras.losses.CategoricalCrossentropy},
    ]
)

# use a fixed loss weight
loss_weight = (1.0,)

# set fixed metrics for reporting to Ray Tune
metrics = {
    "name": "auc",
    "metric": tf.keras.metrics.AUC,
    "kwargs": {"from_logits": True},
}

# define the task
task = {
    "activation": activations,
    "dropout": tune.quniform(0.0, 0.2, 0.05),
    "units": 10,
    "loss": loss,
    "loss_weight": loss_weight,
    "metrics": metrics,
}

# optimizer search space
optimization = optimization_space()

# data loader keyword arguments to control loading, augmentation, and batching
data = {
    "batch_size": tune.choice([32, 64, 128]),
    "random_brightness": tune.choice(
        [True, False]
    ),  # whether to perform random brightness transformation
    "max_delta": tune.quniform(0.01, 0.15, 0.01),
}


from glimr.keras import keras_losses, keras_metrics


def builder(config):
    # a helper function for building layers
    def _build_layer(x, units, activation, dropout, name):
        # dense layer
        x = tf.keras.layers.Dense(units, activation=activation, name=name)(x)

        # add dropout if necessary
        if dropout > 0.0:
            x = tf.keras.layers.Dropout(dropout)(x)

        return x

    # create input layer
    input_layer = tf.keras.Input([784], name="input")

    # build layer 1
    x = _build_layer(
        input_layer,
        config["layer1"]["units"],
        config["layer1"]["activation"],
        config["layer1"]["dropout"],
        "layer1",
    )

    # build output / task layer
    task_name = list(config["tasks"].keys())[0]
    output = _build_layer(
        input_layer,
        config["tasks"][task_name]["units"],
        config["tasks"][task_name]["activation"],
        config["tasks"][task_name]["dropout"],
        task_name,
    )

    # build named output dict
    named = {f"{task_name}": output}

    # create model
    model = tf.keras.Model(inputs=input_layer, outputs=named)

    # create a loss dictionary
    losses, loss_weights = keras_losses(config)

    # create a metric dictionary
    metrics = keras_metrics(config)

    return model, losses, loss_weights, metrics

In [3]:
# put it all together
space = {
    "layer1": layer1,
    "optimization": optimization_space(),
    "tasks": {"mnist": task},
    "data": data,
}

# display search space
pprint(space, indent=4)

{   'data': {   'batch_size': <ray.tune.search.sample.Categorical object at 0x7f0468f7c910>,
                'max_delta': <ray.tune.search.sample.Float object at 0x7f0468f7c8b0>,
                'random_brightness': <ray.tune.search.sample.Categorical object at 0x7f0468f7ca00>},
    'layer1': {   'activation': <ray.tune.search.sample.Categorical object at 0x7f04f48a1d60>,
                  'dropout': <ray.tune.search.sample.Float object at 0x7f04f48a1c70>,
                  'units': <ray.tune.search.sample.Categorical object at 0x7f0469039eb0>},
    'optimization': {   'beta_1': <ray.tune.search.sample.Float object at 0x7f046a35e2e0>,
                        'beta_2': <ray.tune.search.sample.Float object at 0x7f046a35e280>,
                        'ema_momentum': <ray.tune.search.sample.Float object at 0x7f0469039bb0>,
                        'ema_overwrite_frequency': <ray.tune.search.sample.Categorical object at 0x7f0469039f70>,
                        'epochs': 100,
                

# Using Search with `cv_folds`

Creating a `Search` instance with the `cv_folds` argument is all that is needed to instruct `ray.tune` to perform a cross validation.

Since `cv_folds` trials will be run for each configuration, the total number of trials will be `cv_folds` * `num_samples`.

In [4]:
import contextlib
from glimr import Search
import os
import tempfile

# pass `cv_folds` parameter to Search for cross validation
tuner = Search(space, builder, cv_dataloader, "mnist_auc", cv_folds=5)

# make a temporary directory to store outputs - cleanup at end
temp_dir = tempfile.TemporaryDirectory()

# run trials using default settings
with contextlib.redirect_stderr(open(os.devnull, "w")):
    results = tuner.experiment(local_dir=temp_dir.name, name="default", num_samples=10)

0,1
Current time:,2023-12-14 18:27:16
Running for:,00:01:30.67
Memory:,13.9/251.4 GiB

Trial name,status,loc,data/batch_size,data/cv_index,data/max_delta,data/random_brightne ss,layer1/activation,layer1/dropout,layer1/units,optimization/beta_1,optimization/beta_2,optimization/ema_mom entum,optimization/ema_ove rwrite_frequency,optimization/learnin g_rate,optimization/method,optimization/momentu m,optimization/rho,optimization/use_ema,tasks/mnist/activati on,tasks/mnist/dropout,tasks/mnist/loss,iter,total time (s),mnist_auc,mnist_loss
trainable_2ec17_00000,TERMINATED,172.17.0.2:71991,32,0,0.07,False,gelu,0.2,48,0.94,0.61,0.99,3.0,0.00545,adam,0.03,0.97,False,softplus,0.2,{'name': 'categ_0380,4,33.3222,0.511286,0.97757
trainable_2ec17_00001,TERMINATED,172.17.0.2:71992,32,1,0.07,False,gelu,0.2,48,0.94,0.61,0.99,3.0,0.00545,adam,0.03,0.97,False,softplus,0.2,{'name': 'categ_66c0,4,32.3612,0.5,1.0
trainable_2ec17_00002,TERMINATED,172.17.0.2:72057,32,2,0.07,False,gelu,0.2,48,0.94,0.61,0.99,3.0,0.00545,adam,0.03,0.97,False,softplus,0.2,{'name': 'categ_c640,4,32.4649,0.500214,0.999571
trainable_2ec17_00003,TERMINATED,172.17.0.2:72090,32,3,0.07,False,gelu,0.2,48,0.94,0.61,0.99,3.0,0.00545,adam,0.03,0.97,False,softplus,0.2,{'name': 'categ_c5c0,4,31.6427,0.5,1.0
trainable_2ec17_00004,TERMINATED,172.17.0.2:72427,32,4,0.07,False,gelu,0.2,48,0.94,0.61,0.99,3.0,0.00545,adam,0.03,0.97,False,softplus,0.2,{'name': 'categ_c900,4,35.2889,0.5,1.0
trainable_2ec17_00005,TERMINATED,172.17.0.2:72428,128,0,0.11,True,selu,0.1,16,0.62,0.65,0.94,4.0,0.00846,adadelta,0.03,0.95,False,gelu,0.2,{'name': 'categ_4b00,11,44.491,0.94405,0.614504
trainable_2ec17_00006,TERMINATED,172.17.0.2:72429,128,1,0.11,True,selu,0.1,16,0.62,0.65,0.94,4.0,0.00846,adadelta,0.03,0.95,False,gelu,0.2,{'name': 'categ_cc00,12,40.4898,0.941704,0.575526
trainable_2ec17_00007,TERMINATED,172.17.0.2:72430,128,2,0.11,True,selu,0.1,16,0.62,0.65,0.94,4.0,0.00846,adadelta,0.03,0.95,False,gelu,0.2,{'name': 'categ_05c0,10,40.9527,0.908735,0.707902
trainable_2ec17_00008,TERMINATED,172.17.0.2:72433,128,3,0.11,True,selu,0.1,16,0.62,0.65,0.94,4.0,0.00846,adadelta,0.03,0.95,False,gelu,0.2,{'name': 'categ_cfc0,10,42.7386,0.944839,0.613343
trainable_2ec17_00009,TERMINATED,172.17.0.2:72435,128,4,0.11,True,selu,0.1,16,0.62,0.65,0.94,4.0,0.00846,adadelta,0.03,0.95,False,gelu,0.2,{'name': 'categ_13c0,10,40.2089,0.901613,0.725342


2023-12-14 18:27:16,708	INFO tune.py:1148 -- Total run time: 90.78 seconds (90.66 seconds for the tuning loop).


In [223]:
from glimr.analysis import _parse_experiment, _checkpoints, _filter_checkpoints

exp_dir = temp_dir.name + "/default"
metric = "mnist_auc"
df = _parse_experiment(exp_dir)
rates = [c["optimization"]["learning_rate"] for c in list(df["config"])]
print(len(set(rates)))
print(len(rates))

# add column where configurations are enumerated
from copy import deepcopy
import json


def _enumerate_configs(df):
    cleaned = [deepcopy(c) for c in list(df["config"])]
    for clean in cleaned:
        del clean["data"]["cv_index"]
    mapping = {}
    for clean in cleaned:
        if json.dumps(clean) not in mapping.keys():
            mapping[json.dumps(clean)] = len(mapping) + 1
    df["config_enum"] = [mapping[json.dumps(clean)] for clean in cleaned]
    return df

10
271


In [226]:
from glimr.analysis import top_cv_trials

# all trials
selected_models = top_cv_trials(exp_dir , metric='mnist_auc', mode="max", model_selection=None)
selected_models = selected_models.sort_values(by=['trial_id']).reset_index(drop=True)
selected_models.head(5)

Unnamed: 0,cv_index,trial_id,training_iteration,checkpoint_path,config,mnist_auc
0,0,2ec17_00000,1,/tmp/tmp2p1cv0h3/default/trainable_2ec17_00000...,"{'layer1': {'activation': 'gelu', 'dropout': 0...",0.520635
1,1,2ec17_00001,4,/tmp/tmp2p1cv0h3/default/trainable_2ec17_00001...,"{'layer1': {'activation': 'gelu', 'dropout': 0...",0.5
2,2,2ec17_00002,4,/tmp/tmp2p1cv0h3/default/trainable_2ec17_00002...,"{'layer1': {'activation': 'gelu', 'dropout': 0...",0.500214
3,3,2ec17_00003,4,/tmp/tmp2p1cv0h3/default/trainable_2ec17_00003...,"{'layer1': {'activation': 'gelu', 'dropout': 0...",0.5
4,4,2ec17_00004,4,/tmp/tmp2p1cv0h3/default/trainable_2ec17_00004...,"{'layer1': {'activation': 'gelu', 'dropout': 0...",0.5


In [238]:
from glimr.analysis import _sortGroup

# group based on cv index to find the best fold
gdf = _sortGroup(selected_models, gb='cv_index', metric='mnist_auc', agr="max")
gdf

Unnamed: 0,cv_index,mnist_auc,config
0,0,0.977419,"{'layer1': {'activation': 'gelu', 'dropout': 0.2, 'units': 48}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.005450000000000001, 'rho': 0.97, 'momentum': 0.03, 'beta_1': 0.9400000000000001, 'beta_2': 0.61, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'softplus', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': False, 'max_delta': 0.07, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
1,1,0.977871,"{'layer1': {'activation': 'gelu', 'dropout': 0.2, 'units': 48}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.005450000000000001, 'rho': 0.97, 'momentum': 0.03, 'beta_1': 0.9400000000000001, 'beta_2': 0.61, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'softplus', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': False, 'max_delta': 0.07, 'cv_index': 1, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
2,2,0.978333,"{'layer1': {'activation': 'gelu', 'dropout': 0.2, 'units': 48}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.005450000000000001, 'rho': 0.97, 'momentum': 0.03, 'beta_1': 0.9400000000000001, 'beta_2': 0.61, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'softplus', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': False, 'max_delta': 0.07, 'cv_index': 2, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
3,3,0.978321,"{'layer1': {'activation': 'gelu', 'dropout': 0.2, 'units': 48}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.005450000000000001, 'rho': 0.97, 'momentum': 0.03, 'beta_1': 0.9400000000000001, 'beta_2': 0.61, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'softplus', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': False, 'max_delta': 0.07, 'cv_index': 3, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
4,4,0.978091,"{'layer1': {'activation': 'gelu', 'dropout': 0.2, 'units': 48}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.005450000000000001, 'rho': 0.97, 'momentum': 0.03, 'beta_1': 0.9400000000000001, 'beta_2': 0.61, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'softplus', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': False, 'max_delta': 0.07, 'cv_index': 4, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"


In [239]:
# group based on configs to find the best config
gdf = _sortGroup(selected_models, gb='config', metric='mnist_auc', agr="max")
gdf

Unnamed: 0,mnist_auc,config
0,0.975418,"{'layer1': {'activation': 'gelu', 'dropout': 0.15000000000000002, 'units': 64}, 'optimization': {'epochs': 100, 'method': 'rms', 'learning_rate': 0.005920000000000001, 'rho': 0.73, 'momentum': 0.06, 'beta_1': 0.56, 'beta_2': 0.81, 'use_ema': True, 'ema_momentum': 0.96, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'softplus', 'dropout': 0.15000000000000002, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 128, 'random_brightness': True, 'max_delta': 0.1, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
1,0.520635,"{'layer1': {'activation': 'gelu', 'dropout': 0.2, 'units': 48}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.005450000000000001, 'rho': 0.97, 'momentum': 0.03, 'beta_1': 0.9400000000000001, 'beta_2': 0.61, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'softplus', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': False, 'max_delta': 0.07, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
2,0.833717,"{'layer1': {'activation': 'relu', 'dropout': 0.05, 'units': 32}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.0031300000000000004, 'rho': 0.56, 'momentum': 0.04, 'beta_1': 0.78, 'beta_2': 0.86, 'use_ema': True, 'ema_momentum': 0.93, 'ema_overwrite_frequency': 2}, 'tasks': {'mnist': {'activation': 'gelu', 'dropout': 0.05, 'units': 10, 'loss': {'name': 'categorical_crossentropy', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': True, 'max_delta': 0.08, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
3,0.755112,"{'layer1': {'activation': 'relu', 'dropout': 0.05, 'units': 32}, 'optimization': {'epochs': 100, 'method': 'rms', 'learning_rate': 0.00901, 'rho': 0.72, 'momentum': 0.1, 'beta_1': 0.78, 'beta_2': 0.72, 'use_ema': True, 'ema_momentum': 0.91, 'ema_overwrite_frequency': None}, 'tasks': {'mnist': {'activation': 'relu', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 128, 'random_brightness': False, 'max_delta': 0.02, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
4,0.826314,"{'layer1': {'activation': 'relu', 'dropout': 0.0, 'units': 16}, 'optimization': {'epochs': 100, 'method': 'rms', 'learning_rate': 0.0022, 'rho': 0.56, 'momentum': 0.1, 'beta_1': 0.75, 'beta_2': 0.56, 'use_ema': True, 'ema_momentum': 0.98, 'ema_overwrite_frequency': 4}, 'tasks': {'mnist': {'activation': 'relu', 'dropout': 0.15000000000000002, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 32, 'random_brightness': False, 'max_delta': 0.1, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
5,0.944839,"{'layer1': {'activation': 'selu', 'dropout': 0.1, 'units': 16}, 'optimization': {'epochs': 100, 'method': 'adadelta', 'learning_rate': 0.00846, 'rho': 0.9500000000000001, 'momentum': 0.03, 'beta_1': 0.62, 'beta_2': 0.65, 'use_ema': False, 'ema_momentum': 0.9400000000000001, 'ema_overwrite_frequency': 4}, 'tasks': {'mnist': {'activation': 'gelu', 'dropout': 0.2, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 128, 'random_brightness': True, 'max_delta': 0.11, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
6,0.978333,"{'layer1': {'activation': 'sigmoid', 'dropout': 0.0, 'units': 64}, 'optimization': {'epochs': 100, 'method': 'sgd', 'learning_rate': 0.00233, 'rho': 0.89, 'momentum': 0.1, 'beta_1': 0.54, 'beta_2': 0.78, 'use_ema': False, 'ema_momentum': 0.97, 'ema_overwrite_frequency': 4}, 'tasks': {'mnist': {'activation': 'selu', 'dropout': 0.15000000000000002, 'units': 10, 'loss': {'name': 'categorical_hinge', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 64, 'random_brightness': False, 'max_delta': 0.01, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
7,0.5,"{'layer1': {'activation': 'softplus', 'dropout': 0.05, 'units': 64}, 'optimization': {'epochs': 100, 'method': 'adagrad', 'learning_rate': 0.00334, 'rho': 0.91, 'momentum': 0.05, 'beta_1': 0.8300000000000001, 'beta_2': 0.93, 'use_ema': False, 'ema_momentum': 0.93, 'ema_overwrite_frequency': 4}, 'tasks': {'mnist': {'activation': 'relu', 'dropout': 0.1, 'units': 10, 'loss': {'name': 'categorical_crossentropy', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 128, 'random_brightness': False, 'max_delta': 0.13, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
8,0.580728,"{'layer1': {'activation': 'softplus', 'dropout': 0.15000000000000002, 'units': 16}, 'optimization': {'epochs': 100, 'method': 'adadelta', 'learning_rate': 0.006940000000000001, 'rho': 0.5700000000000001, 'momentum': 0.02, 'beta_1': 0.66, 'beta_2': 0.78, 'use_ema': False, 'ema_momentum': 0.91, 'ema_overwrite_frequency': 2}, 'tasks': {'mnist': {'activation': 'linear', 'dropout': 0.15000000000000002, 'units': 10, 'loss': {'name': 'categorical_crossentropy', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 64, 'random_brightness': True, 'max_delta': 0.15, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
9,0.608201,"{'layer1': {'activation': 'softplus', 'dropout': 0.1, 'units': 32}, 'optimization': {'epochs': 100, 'method': 'adam', 'learning_rate': 0.00068, 'rho': 0.66, 'momentum': 0.01, 'beta_1': 0.54, 'beta_2': 0.78, 'use_ema': True, 'ema_momentum': 0.9500000000000001, 'ema_overwrite_frequency': 3}, 'tasks': {'mnist': {'activation': 'elu', 'dropout': 0.05, 'units': 10, 'loss': {'name': 'categorical_crossentropy', 'loss': """"}, 'loss_weight': [1.0], 'metrics': {'name': 'auc', 'metric': """", 'kwargs': {'from_logits': True}}}}, 'data': {'batch_size': 64, 'random_brightness': True, 'max_delta': 0.11, 'cv_index': 0, 'cv_folds': 5}, 'builder': '', 'fit_kwargs': {}, 'loader': ''}"
