## Data downloading and loading

The code below downloads the data through the Kaggle API if the files are
not already present in the data directory (see `/src/utilities/download_data.sh`).
It also sets up the global variables needed by this notebook:

In [1]:
import pathlib
import pandas as pd

# Where data will be stored if not present
DATA_PATH = "../input"
! ./utilities/download_data.sh "$DATA_PATH"

PREPROCESSING_PATH = pathlib.Path("../preprocessing")

# Where predictions are/will be stored
PREDICTIONS_PATH = pathlib.Path("../predictions")
PREDICTIONS_PATH.mkdir(parents=True, exist_ok=True)

Downloading data to: ../input
test_data.csv.zip: Skipping, found more recently modified local copy (use --force to force download)
train_data.csv.zip: Skipping, found more recently modified local copy (use --force to force download)
train_labels.csv: Skipping, found more recently modified local copy (use --force to force download)
sample_submission.csv: Skipping, found more recently modified local copy (use --force to force download)
/home/vyz/projects/Kaggle1NN2019/src
Script ran successfully
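
The "download only if missing" check described above can be sketched in Python as follows. This is a minimal illustration, not the content of `download_data.sh`: the helper name `missing_files` is hypothetical, and the file names are simply the ones that appear in the download log above.

```python
from pathlib import Path

# File names as they appear in the download log above.
EXPECTED_FILES = (
    "test_data.csv.zip",
    "train_data.csv.zip",
    "train_labels.csv",
    "sample_submission.csv",
)


def missing_files(data_path, expected=EXPECTED_FILES):
    """Return the expected file names that are not yet present in data_path."""
    root = Path(data_path)
    return [name for name in expected if not (root / name).is_file()]
```

A download step would then only need to run when `missing_files("../input")` returns a non-empty list.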

The data loaded below was constructed in the `preprocessing` notebook:

In [2]:
from utilities.general import train_data, test_data

_, y = train_data(pathlib.Path(DATA_PATH))
X_test = test_data(pathlib.Path(DATA_PATH))

X_train = pd.read_csv(PREPROCESSING_PATH / "algorithmic_data.csv", index_col=0)
X_test = X_test[X_train.columns]

variance_dataset = pd.read_csv(PREPROCESSING_PATH / "variance_data.csv", index_col=0)

The settings below achieved the highest score on the public leaderboard.

Various alternatives were tried (more or fewer ensemble datasets, more models, a different seed each time). Those attempts are not documented in this repository (though they are saved if anyone wants to see them).

In [3]:
import numpy as np

HOW_MANY_MODELS = 80
HOW_MANY_DATASETS = 1

MAIN_SEED = 42

## Create N models

Using the utilities, we can easily create many different model configurations.

`generate_configs` (called below) creates `HOW_MANY_MODELS` model configurations. These configurations are independent of the `input` and `output` sizes, so they can be used with a varying number of features (see `HOW_MANY_DATASETS`).

For information about how those networks are created, consult the `utilities.generator.create_configs` function.

In [4]:
from utilities.training import generate_configs

model_configs = generate_configs(
    max_layers=5,
    max_width=900,
    min_width=100,
    how_many=HOW_MANY_MODELS,
    seed=MAIN_SEED,
)
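
To make the idea of input/output-independent configurations concrete, here is a minimal sketch of a random configuration generator. It is not the repository's `generate_configs`: the function name `sketch_generate_configs`, the dictionary keys, and the sampling rules are assumptions. Only the argument names and the general shape of a configuration (equal-width layers, optional batch normalization, `ReLU` or `SELU` activation) are taken from the call above and from the file names in the training log.

```python
import random


def sketch_generate_configs(max_layers, max_width, min_width, how_many, seed):
    """Draw `how_many` random architecture descriptions (hypothetical format)."""
    rng = random.Random(seed)  # seeded so the same configs are drawn every run
    configs = []
    for _ in range(how_many):
        depth = rng.randint(1, max_layers)          # number of hidden layers
        width = rng.randint(min_width, max_width)   # same width for every layer
        configs.append(
            {
                "layers": [width] * depth,
                "batchnorm": rng.random() < 0.5,
                "activation": rng.choice(["ReLU", "SELU"]),
            }
        )
    return configs
```

Because no input or output size is stored in a configuration, the same list can be reused for datasets with different feature counts.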

`ensemble_datasets` creates a generator of datasets with a varying number of features.

`split_feature` sets how many of the best features (as ranked by the analysis) are used in every "ensemble dataset". In addition, a random subset of the remaining `96 - split_feature` less important features is added. With the example below, `25` features might be drawn from the less important features ranked `30-96`:

In [5]:
from utilities.training import ensemble_datasets

datasets = ensemble_datasets(
    X_train,
    X_test,
    split_feature=30,
    how_many_models=HOW_MANY_MODELS,
    how_many_datasets=HOW_MANY_DATASETS,
    seed=MAIN_SEED,
)
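
The feature selection described above can be sketched as follows. This is a simplified illustration, not the repository's `ensemble_datasets`: the name `sketch_feature_subsets` is hypothetical, it works on column names only, and it assumes the columns are already ordered from most to least important and that the number of extra features is drawn uniformly.

```python
import random


def sketch_feature_subsets(columns, split_feature, how_many_datasets, seed):
    """Yield column lists: the top `split_feature` columns plus a random extra subset."""
    rng = random.Random(seed)
    best = list(columns[:split_feature])   # always kept
    rest = list(columns[split_feature:])   # less important candidates
    for _ in range(how_many_datasets):
        extra_count = rng.randint(0, len(rest))      # assumed: uniform count
        chosen = set(rng.sample(rest, extra_count))
        # Preserve the original column order among the chosen extras.
        yield best + [name for name in rest if name in chosen]
```

Each yielded list could then be used to select the same columns from both `X_train` and `X_test`.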

Once the configurations of random models are created (see the appropriate functions in `utilities`) and the dataset generators are set up, we can train and predict using the third-party library [`skorch`](https://github.com/skorch-dev/skorch), which nicely reduces the mental load.

Characteristics of training loop applied to every generated neural network:

  - Train each neural network for at most 40 epochs
  - Optimizer: `Adam` with the default learning rate
  - Batch size of 64
  - Stratified validation set made of 20% of the training data
  - Training data shuffled after each epoch
  - Early stopping if validation accuracy has not improved after `8` epochs
  - Learning rate multiplied by `0.6` if validation accuracy has not improved after 2 epochs
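
The interaction between early stopping and learning rate reduction in the list above can be sketched in plain Python. This is a simplified simulation, not the `skorch` callbacks themselves: the function name `simulate_schedule` is hypothetical, improvement thresholds are ignored, and the starting rate of `1e-3` is assumed to be `Adam`'s default. The learning rate is cut once the number of non-improving epochs exceeds the patience of 2, which matches the training log, where reductions appear on the third consecutive epoch without improvement.

```python
def simulate_schedule(val_accuracies, lr=1e-3, factor=0.6,
                      lr_patience=2, stop_patience=8, max_epochs=40):
    """Replay validation accuracies; return (best accuracy, [(epoch, lr), ...])."""
    best = float("-inf")
    bad_for_lr = 0    # non-improving epochs since last improvement or LR cut
    bad_for_stop = 0  # non-improving epochs since last improvement
    history = []
    for epoch, acc in enumerate(val_accuracies[:max_epochs], start=1):
        if acc > best:
            best = acc
            bad_for_lr = 0
            bad_for_stop = 0
        else:
            bad_for_lr += 1
            bad_for_stop += 1
        if bad_for_lr > lr_patience:
            lr *= factor       # multiply the learning rate by 0.6
            bad_for_lr = 0
        history.append((epoch, lr))
        if bad_for_stop >= stop_patience:
            break              # early stopping
    return best, history
```

Feeding it one good epoch followed by a plateau shows the rate dropping after three flat epochs and training stopping after eight.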

Additionally, no models are saved. Instead, each model makes a prediction on the test set whenever it reaches a new best validation accuracy, so the final prediction comes from the state with the best validation accuracy.

The prediction logits are saved for each model, together with its descriptive name and the validation accuracy achieved (see the `predictions` directory), so we can try different ensembling techniques if we so wish (e.g. thresholding on accuracy and using only the best models).
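
As an illustration of the thresholding idea, the sketch below keeps only the models whose validation accuracy (the prefix of the saved file name, as seen in the training log) reaches a threshold, sums their logits, and takes the arg-max per sample. The helper names and the in-memory dictionary are hypothetical, and the logit values in the usage note are toy numbers, not results from this notebook.

```python
def accuracy_from_name(filename):
    """Parse the validation accuracy that prefixes a prediction file name."""
    return float(filename.split("_", 1)[0])


def ensemble_predict(logits_by_file, min_accuracy):
    """Sum the logits of sufficiently accurate models; return arg-max classes."""
    selected = [
        logits
        for name, logits in logits_by_file.items()
        if accuracy_from_name(name) >= min_accuracy
    ]
    if not selected:
        raise ValueError("no model reaches the requested accuracy")
    predictions = []
    for rows in zip(*selected):                    # one row per model, same sample
        summed = [sum(values) for values in zip(*rows)]
        predictions.append(max(range(len(summed)), key=summed.__getitem__))
    return predictions
```

For example, with two toy models named `0.86_0_A.csv` and `0.84_1_B.csv`, a threshold of `0.85` uses only the first one, while a threshold of `0.0` combines both.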

In [6]:
from utilities.training import predict_with_models

predict_with_models(
    PREDICTIONS_PATH,
    y,
    X_test,
    model_configs,
    datasets,
    10,
    MAIN_SEED,
)

Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6802        0.5876                 0.7774  7.4093
Current best validation, making test prediction
      2        0.5529        0.5225                 0.7979  5.6897
Current best validation, making test prediction
      3        0.5098        0.5210                 0.8033  5.5162
Current best validation, making test prediction
      4        0.4793        0.4585                 0.8235  4.9231
Current best validation, making test prediction
      5        0.4582        0.4558                 0.8291  5.6262
      6        0.4372        0.4701                 0.8196  5.0378
      7        0.4192        0.4601                 0.8265  5.3469
Curr

Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8546492111446794
Saving predictions in ../predictions/0.8546492111446794_0_Layers=[869, 869, 869, 869],Batchnorm,Activation=ReLU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6008        0.5116                 0.8000  7.9101
Current best validation, making test prediction
      2        0.4800        0.4491                 0.8288  7.5377
      3        0.4314        0.4604                 0.8269  7.3128
Current best validation, making test prediction
      4        0.3943        0.4192                 0.8411  6.8846
Current best validation, making test prediction
      5        0.3644        0.4334                 0.8458  4.3207
      6        0.

Current best validation, making test prediction
     13        0.3468        0.4557                 0.8395  3.3737
     14        0.3375        0.4836                 0.8345  3.9352
     15        0.3280        0.4867                 0.8357  3.9125
     16        0.3222        0.4722                 0.8394  4.0333
Epoch    16: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     17        0.2756        0.4540                 0.8442  4.1669
Current best validation, making test prediction
     18        0.2653        0.4547                 0.8464  3.3019
     19        0.2576        0.4667                 0.8422  4.0161
     20        0.2507        0.4718                 0.8436  4.0608
     21        0.2438        0.4884                 0.8395  4.0491
Epoch    21: reducing learning rate of group 0 to 3.6000e-04.
Current best

     17        0.0363        1.0047                 0.8513  4.8450
Epoch    17: reducing learning rate of group 0 to 3.6000e-04.
     18        0.0149        1.0359                 0.8520  4.6645
Current best validation, making test prediction
     19        0.0079        1.1403                 0.8538  4.6736
     20        0.0086        1.1815                 0.8512  4.8061
     21        0.0113        1.1939                 0.8537  4.7502
     22        0.0145        1.2033                 0.8500  4.7950
Epoch    22: reducing learning rate of group 0 to 2.1600e-04.
     23        0.0046        1.2458                 0.8535  4.7341
Current best validation, making test prediction
     24        0.0009        1.3063                 0.8560  4.3728
Current best validation, making test prediction
     25        0.0004        1.3731                 0.8561  4.0033
     26        0.0002        1.4344

     25        0.1699        0.4891                 0.8504  3.8813
     26        0.1668        0.4911                 0.8504  3.7455
     27        0.1627        0.4967                 0.8491  3.2811
Epoch    27: reducing learning rate of group 0 to 1.2960e-04.
     28        0.1529        0.4985                 0.8507  3.9473
     29        0.1502        0.5024                 0.8517  4.0516
     30        0.1474        0.5034                 0.8495  3.8373
Epoch    30: reducing learning rate of group 0 to 7.7760e-05.
     31        0.1417        0.5060                 0.8494  4.0630
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8528029540114133
Saving predictions in ../predictions/0.8528029540114133_0_Layers=[341, 341],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---

Current best validation, making test prediction
      4        0.4638        0.4775                 0.8212  4.0158
Current best validation, making test prediction
      5        0.4465        0.4723                 0.8226  3.3397
Current best validation, making test prediction
      6        0.4304        0.4643                 0.8283  3.7282
Current best validation, making test prediction
      7        0.4188        0.4590                 0.8284  4.0186
Current best validation, making test prediction
      8        0.4079        0.4487                 0.8344  3.9767
      9        0.3980        0.4536                 0.8312  3.3256
     10        0.3882        0.4466                 0.8327  3.9718
Current best validation, making test prediction
     11        0.3775        0.4459                 0.8348

     24        0.0352        0.8058                 0.8427  6.1017
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8487747566297416
Saving predictions in ../predictions/0.8487747566297416_1_Layers=[501, 501, 501],Dropout:[0, 0, 0],Batchnorm,Activation=ReLU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.5892        0.5168                 0.8011  5.8405
Current best validation, making test prediction
      2        0.4624        0.4584                 0.8273  5.7761
Current best validation, making test prediction
      3        0.4076        0.4264                 0.8375  6.3558
      4        0.3690        0.4322                 0.8374  5.1626
Current best validation, making test prediction
      5

Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.7138        0.6440                 0.7549  3.8151
Current best validation, making test prediction
      2        0.5928        0.5690                 0.7863  3.7852
Current best validation, making test prediction
      3        0.5438        0.5489                 0.7957  3.8869
Current best validation, making test prediction
      4        0.5113        0.5116                 0.8113  3.1420
      5        0.4889        0.5193                 0.8100  3.6709
Current best validation, making test prediction
      6        0.4730        0.4798                 0.8233  3.5452
      7        0.4564        0.4876                 0.8195  2.2031
Curr

Current best validation, making test prediction
      3        0.5134        0.5200                 0.8035  3.5890
Current best validation, making test prediction
      4        0.4816        0.5036                 0.8103  3.8742
Current best validation, making test prediction
      5        0.4573        0.4675                 0.8283  3.7960
Current best validation, making test prediction
      6        0.4421        0.4644                 0.8287  3.6148
      7        0.4217        0.4866                 0.8202  3.7537
      8        0.4070        0.5148                 0.8077  3.1837
      9        0.3929        0.4785                 0.8265  3.9052
Epoch     9: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     10        0.3412        0.4346                 0.8414  3.6393


     21        0.1526        0.5090                 0.8486  4.9582
     22        0.1394        0.5243                 0.8510  3.6267
     23        0.1309        0.5464                 0.8480  4.1348
Epoch    23: reducing learning rate of group 0 to 2.1600e-04.
     24        0.1075        0.5629                 0.8503  5.3996
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8520476670023498
Saving predictions in ../predictions/0.8520476670023498_1_Layers=[827, 827, 827, 827],Batchnorm,Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.7086        0.5805                 0.7811  6.6457
Current best validation, making test prediction
      2        0.5609        0.5207                 0.8118  7.0897
Current

     30        0.2580        0.4308                 0.8509  4.9793
     31        0.2511        0.4455                 0.8499  6.7116
Epoch    31: reducing learning rate of group 0 to 2.1600e-04.
     32        0.2402        0.4370                 0.8529  7.6012
     33        0.2361        0.4378                 0.8537  6.1952
     34        0.2324        0.4446                 0.8557  4.9077
Epoch    34: reducing learning rate of group 0 to 1.2960e-04.
     35        0.2250        0.4427                 0.8541  7.5322
Current best validation, making test prediction
     36        0.2235        0.4453                 0.8567  5.9048
     37        0.2210        0.4453                 0.8548  5.4106
     38        0.2188        0.4531                 0.8534  5.9328
     39        0.2155        0.4567                 0.8519  6.5165
Epoch    39: reducing learning rate of group 0 to 7.7760e-0

Epoch    23: reducing learning rate of group 0 to 1.2960e-04.
     24        0.1598        0.5175                 0.8406  3.7823
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8434038267875126
Saving predictions in ../predictions/0.8434038267875126_1_Layers=[833, 833],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6857        0.5965                 0.7788  4.0402
Current best validation, making test prediction
      2        0.5687        0.5207                 0.8063  3.9568
Current best validation, making test prediction
      3        0.5147        0.5043                 0.8165  3.8861
      4        0.4788        0.4936                 0.8144  3.2335
Current best validation, mak

     28        0.2587        0.4437                 0.8460  4.4715
     29        0.2553        0.4447                 0.8477  4.2402
Epoch    29: reducing learning rate of group 0 to 2.1600e-04.
Current best validation, making test prediction
     30        0.2416        0.4396                 0.8508  4.0202
     31        0.2389        0.4407                 0.8491  2.7908
     32        0.2369        0.4422                 0.8496  4.0163
     33        0.2345        0.4436                 0.8475  3.9933
Epoch    33: reducing learning rate of group 0 to 1.2960e-04.
     34        0.2263        0.4425                 0.8494  3.0563
     35        0.2243        0.4425                 0.8499  3.7937
     36        0.2226        0.4441                 0.8499  2.4284
Epoch    36: reducing learning rate of group 0 to 7.7760e-05.
Current best validation, making test prediction
     37        0.217

     22        0.3584        0.4340                 0.8405  4.7444
Current best validation, making test prediction
     23        0.3536        0.4294                 0.8473  3.8423
     24        0.3470        0.4310                 0.8437  5.0963
     25        0.3451        0.4348                 0.8413  3.8896
     26        0.3425        0.4280                 0.8448  5.1781
Epoch    26: reducing learning rate of group 0 to 6.0000e-04.
     27        0.3242        0.4195                 0.8463  4.6278
     28        0.3219        0.4247                 0.8473  4.2697
Current best validation, making test prediction
     29        0.3170        0.4219                 0.8483  3.8650
     30        0.3135        0.4259                 0.8481  4.4515
     31        0.3141        0.4282                 0.8470  2.9622
     32        0.3097        0.4258                 0.8

     23        0.0268        0.7225                 0.8472  4.5585
Epoch    23: reducing learning rate of group 0 to 1.2960e-04.
Current best validation, making test prediction
     24        0.0202        0.7251                 0.8509  4.5116
     25        0.0185        0.7347                 0.8495  4.6439
     26        0.0174        0.7415                 0.8507  4.7566
     27        0.0160        0.7503                 0.8505  4.2205
Epoch    27: reducing learning rate of group 0 to 7.7760e-05.
     28        0.0134        0.7542                 0.8490  4.8354
     29        0.0127        0.7668                 0.8482  4.0189
     30        0.0113        0.7787                 0.8482  4.5351
Epoch    30: reducing learning rate of group 0 to 4.6656e-05.
     31        0.0103        0.7829                 0.8489  3.9635
Stopping since validation_accuracy has not improved in the last 8 epochs.

Current best validation, making test prediction
      3        0.4989        0.5112                 0.8112  3.6702
Current best validation, making test prediction
      4        0.4657        0.4919                 0.8224  3.4140
      5        0.4413        0.4870                 0.8190  2.6651
Current best validation, making test prediction
      6        0.4211        0.4609                 0.8301  2.7936
Current best validation, making test prediction
      7        0.4034        0.4575                 0.8310  3.6970
Current best validation, making test prediction
      8        0.3893        0.4489                 0.8339  3.9266
Current best validation, making test prediction
      9        0.3719        0.4585                 0.8355  3.5516
Current best validation, making test prediction
     10        0.35

      9        0.3656        0.4907                 0.8322  4.5453
     10        0.3481        0.4996                 0.8320  4.7070
Epoch    10: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     11        0.2832        0.4585                 0.8476  4.3032
     12        0.2662        0.4761                 0.8440  4.2917
     13        0.2536        0.4805                 0.8403  4.3013
     14        0.2415        0.4899                 0.8436  4.1206
Epoch    14: reducing learning rate of group 0 to 3.6000e-04.
Current best validation, making test prediction
     15        0.2030        0.4777                 0.8490  4.2029
     16        0.1910        0.4877                 0.8468  4.2940
     17        0.1837        0.5012                 0.8466  4.1847
     18        0.1753        0.5218                 0.8421

Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.7026        0.6500                 0.7679  4.9002
Current best validation, making test prediction
      2        0.5847        0.5712                 0.7954  4.9458
      3        0.5463        0.5591                 0.7909  5.0411
Current best validation, making test prediction
      4        0.5272        0.4941                 0.8212  5.9696
      5        0.5015        0.5191                 0.8132  5.9453
      6        0.4782        0.5191                 0.8112  5.9742
      7        0.4692        0.5571                 0.7975  5.0219
Epoch     7: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
      8

     13        0.3519        0.4606                 0.8332  4.4644
     14        0.3388        0.4650                 0.8321  4.5753
Current best validation, making test prediction
     15        0.3281        0.4603                 0.8349  4.6450
Current best validation, making test prediction
     16        0.3201        0.4479                 0.8400  4.5881
     17        0.3073        0.4584                 0.8374  2.0024
     18        0.2986        0.4702                 0.8379  4.2212
     19        0.2850        0.4846                 0.8345  4.5365
Epoch    19: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     20        0.2385        0.4615                 0.8458  4.5824
     21        0.2254        0.4746                 0.8433  4.7616
     22        0.2168        0.4890                 0.8447  4.8363
     23

     22        0.0426        0.8326                 0.8457  5.2371
     23        0.0372        0.8680                 0.8463  4.7092
     24        0.0327        0.9266                 0.8430  4.4829
Epoch    24: reducing learning rate of group 0 to 1.2960e-04.
     25        0.0198        0.9754                 0.8442  4.5776
     26        0.0154        0.9961                 0.8440  5.0729
     27        0.0130        1.0362                 0.8453  5.1324
Epoch    27: reducing learning rate of group 0 to 7.7760e-05.
     28        0.0088        1.0640                 0.8444  4.9022
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8489425981873112
Saving predictions in ../predictions/0.8489425981873112_1_Layers=[445, 445, 445, 445],Dropout:[0, 0, 0, 0],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  

     31        0.2058        0.5106                 0.8427  3.4921
Epoch    31: reducing learning rate of group 0 to 1.2960e-04.
     32        0.1950        0.5077                 0.8415  4.1310
     33        0.1926        0.5132                 0.8411  3.6146
     34        0.1906        0.5188                 0.8404  3.8499
Epoch    34: reducing learning rate of group 0 to 7.7760e-05.
     35        0.1845        0.5135                 0.8421  3.4105
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.844159113796576
Saving predictions in ../predictions/0.844159113796576_1_Layers=[572, 572],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6149        0.4801                 0.8187  4.6518
Current best validation, making t

     11        0.3455        0.5360                 0.8249  4.5778
     12        0.3318        0.5417                 0.8352  4.5397
Epoch    12: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     13        0.2394        0.5226                 0.8444  4.5590
     14        0.2028        0.5589                 0.8338  4.4781
Current best validation, making test prediction
     15        0.1876        0.5732                 0.8451  4.5906
     16        0.1702        0.6185                 0.8405  5.5315
     17        0.1568        0.6515                 0.8400  5.5364
     18        0.1407        0.7125                 0.8409  5.5327
Epoch    18: reducing learning rate of group 0 to 3.6000e-04.
Current best validation, making test prediction
     19        0.0851        0.7087                 0.8494  5.5709
Current best validati

Epoch    21: reducing learning rate of group 0 to 2.1600e-04.
     22        0.1439        0.4731                 0.8567  5.6636
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8601040617656932
Saving predictions in ../predictions/0.8601040617656932_0_Layers=[491, 491, 491],Dropout:[0, 0, 0],Batchnorm,Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.7032        0.6247                 0.7653  2.9691
Current best validation, making test prediction
      2        0.5718        0.5588                 0.7858  5.5179
Current best validation, making test prediction
      3        0.5199        0.5084                 0.8095  5.8841
Current best validation, making test prediction
      4        0.4834

Current best validation, making test prediction
      5        0.4634        0.4700                 0.8233  6.8318
Current best validation, making test prediction
      6        0.4395        0.4488                 0.8329  5.5005
Current best validation, making test prediction
      7        0.4195        0.4360                 0.8364  4.3363
      8        0.4038        0.4305                 0.8355  3.3572
      9        0.3932        0.4473                 0.8322  3.7508
Current best validation, making test prediction
     10        0.3766        0.4257                 0.8418  6.0151
Current best validation, making test prediction
     11        0.3643        0.4270                 0.8420  3.3976
Current best validation, making test prediction
     12        0.3501        0.4197                 0.8451  4.8

     15        0.3233        0.4910                 0.8343  3.4500
     16        0.3152        0.4920                 0.8346  3.7135
Epoch    16: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     17        0.2597        0.4643                 0.8430  2.9387
Current best validation, making test prediction
     18        0.2496        0.4645                 0.8445  3.8421
     19        0.2391        0.4890                 0.8437  3.7821
     20        0.2339        0.4926                 0.8428  3.8703
     21        0.2253        0.5003                 0.8409  3.1940
Epoch    21: reducing learning rate of group 0 to 3.6000e-04.
Current best validation, making test prediction
     22        0.1900        0.4877                 0.8482  3.4614
     23        0.1823        0.5004                 0.8445  3.5918
     24        0

Epoch     9: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     10        0.2476        0.4443                 0.8490  3.6409
     11        0.2308        0.4646                 0.8447  4.2779
     12        0.2170        0.4900                 0.8442  4.0920
     13        0.2057        0.4891                 0.8444  3.9682
Epoch    13: reducing learning rate of group 0 to 3.6000e-04.
     14        0.1781        0.5093                 0.8474  4.1541
     15        0.1663        0.5305                 0.8447  3.5214
     16        0.1579        0.5412                 0.8437  4.2900
Epoch    16: reducing learning rate of group 0 to 2.1600e-04.
     17        0.1393        0.5570                 0.8456  4.3187
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.849026518966096
Saving predictions in ../predictions/0.849026518966096_0_

      5        0.4602        0.5051                 0.8133  2.9083
Current best validation, making test prediction
      6        0.4400        0.4763                 0.8232  1.6162
Current best validation, making test prediction
      7        0.4239        0.4748                 0.8233  3.0599
Current best validation, making test prediction
      8        0.4081        0.4699                 0.8302  3.5880
Current best validation, making test prediction
      9        0.4014        0.4574                 0.8343  3.5981
     10        0.3873        0.5087                 0.8155  3.5869
     11        0.3782        0.4624                 0.8315  3.7439
     12        0.3645        0.4907                 0.8278  3.6282
Epoch    12: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     13

      9        0.3763        0.4732                 0.8331  3.2617
Epoch     9: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     10        0.3276        0.4363                 0.8454  4.1171
     11        0.3130        0.4429                 0.8446  3.4309
     12        0.3017        0.4361                 0.8446  4.3039
     13        0.2924        0.4472                 0.8446  3.5125
Epoch    13: reducing learning rate of group 0 to 3.6000e-04.
Current best validation, making test prediction
     14        0.2565        0.4429                 0.8485  3.7692
Current best validation, making test prediction
     15        0.2471        0.4417                 0.8538  2.8920
     16        0.2381        0.4538                 0.8522  3.5180
     17        0.2302        0.4556                 0.8494  4.2251
   

     28        0.0095        0.8912                 0.8531  3.7598
     29        0.0084        0.9230                 0.8536  4.1947
Epoch    29: reducing learning rate of group 0 to 7.7760e-05.
Current best validation, making test prediction
     30        0.0057        0.9324                 0.8548  4.4705
     31        0.0044        0.9594                 0.8531  2.6440
     32        0.0038        0.9809                 0.8532  4.1659
     33        0.0033        1.0010                 0.8524  3.7003
Epoch    33: reducing learning rate of group 0 to 4.6656e-05.
     34        0.0025        1.0192                 0.8545  3.8546
     35        0.0021        1.0309                 0.8528  3.9291
     36        0.0019        1.0503                 0.8528  4.1436
Epoch    36: reducing learning rate of group 0 to 2.7994e-05.
     37        0.0016        1.0656                 0.8530  3.72

     24        0.1261        0.5087                 0.8509  3.8069
     25        0.1224        0.5157                 0.8537  3.7266
Epoch    25: reducing learning rate of group 0 to 1.2960e-04.
     26        0.1090        0.5185                 0.8517  3.5811
     27        0.1053        0.5195                 0.8515  3.4402
     28        0.1026        0.5279                 0.8509  3.7113
Epoch    28: reducing learning rate of group 0 to 7.7760e-05.
     29        0.0949        0.5293                 0.8534  3.4747
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.856327626720376
Saving predictions in ../predictions/0.856327626720376_0_Layers=[669, 669],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6855        0.5706

Current best validation, making test prediction
      5        0.4549        0.4755                 0.8273  3.6762
Current best validation, making test prediction
      6        0.4385        0.4767                 0.8326  2.6068
      7        0.4199        0.4701                 0.8326  3.2246
Current best validation, making test prediction
      8        0.4053        0.4755                 0.8336  3.5764
      9        0.3936        0.4825                 0.8305  3.2874
     10        0.3753        0.4763                 0.8301  3.7183
Current best validation, making test prediction
     11        0.3644        0.4852                 0.8381  3.7325
     12        0.3516        0.5194                 0.8260  3.6251
     13        0.3392        0.5021                 0.8312  3.1827
Current best validation, making test prediction
     14        0.

     10        0.1934        0.5411                 0.8463  4.1677
     11        0.1681        0.5956                 0.8431  3.3851
     12        0.1461        0.6152                 0.8403  4.3675
Epoch    12: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     13        0.0893        0.7322                 0.8468  3.8617
     14        0.0637        0.8245                 0.8464  4.0683
     15        0.0592        0.8810                 0.8405  4.0565
     16        0.0518        0.9640                 0.8435  2.9788
Epoch    16: reducing learning rate of group 0 to 3.6000e-04.
     17        0.0267        1.0381                 0.8424  4.3409
     18        0.0149        1.1584                 0.8447  4.0433
Current best validation, making test prediction
     19        0.0147        1.2037                 0.8469  3.7291
Ep

Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6824        0.5997                 0.7770  4.2157
Current best validation, making test prediction
      2        0.5691        0.5371                 0.8030  2.8907
Current best validation, making test prediction
      3        0.5189        0.5130                 0.8132  1.8783
Current best validation, making test prediction
      4        0.4823        0.4814                 0.8247  2.5538
      5        0.4573        0.4859                 0.8242  2.5706
Current best validation, making test prediction
      6        0.4374        0.4644                 0.8331  3.2837
Current best validation, making test prediction
      7        0.4214

     37        0.2455        0.4319                 0.8511  3.3036
     38        0.2442        0.4324                 0.8500  3.2077
     39        0.2433        0.4346                 0.8499  2.9907
Epoch    39: reducing learning rate of group 0 to 7.7760e-05.
     40        0.2384        0.4315                 0.8515  3.7206
0.8516280631084256
Saving predictions in ../predictions/0.8516280631084256_1_Layers=[226, 226],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6824        0.6318                 0.7654  3.5082
Current best validation, making test prediction
      2        0.5673        0.5587                 0.7949  3.8918
Current best validation, making test prediction
      3        0.5190        0.5259

Current best validation, making test prediction
     30        0.2471        0.4434                 0.8479  3.7088
Current best validation, making test prediction
     31        0.2455        0.4415                 0.8486  3.5444
     32        0.2435        0.4442                 0.8475  3.2818
Current best validation, making test prediction
     33        0.2404        0.4495                 0.8491  3.8576
     34        0.2387        0.4488                 0.8472  3.5731
     35        0.2367        0.4513                 0.8484  3.6192
     36        0.2341        0.4519                 0.8483  2.1765
Epoch    36: reducing learning rate of group 0 to 1.2960e-04.
     37        0.2271        0.4485                 0.8469  3.4075
     38        0.2249        0.4525                 0.8474  3.2200
Current best validation, making test prediction
     39        0.2235

      9        0.3951        0.4170                 0.8441  4.4541
     10        0.3820        0.4167                 0.8438  5.1628
     11        0.3707        0.4119                 0.8431  6.5630
Epoch    11: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     12        0.3372        0.4035                 0.8492  5.7382
Current best validation, making test prediction
     13        0.3259        0.3922                 0.8548  6.7447
     14        0.3182        0.3944                 0.8533  6.2672
Current best validation, making test prediction
     15        0.3109        0.3958                 0.8572  6.0618
     16        0.3029        0.3988                 0.8541  6.4380
Current best validation, making test prediction
     17        0.2919        0.3958                 0.8583  5.9992

Current best validation, making test prediction
     12        0.3845        0.4079                 0.8489  4.0571
Current best validation, making test prediction
     13        0.3747        0.4069                 0.8494  5.8405
Current best validation, making test prediction
     14        0.3665        0.4080                 0.8527  5.7270
     15        0.3576        0.4157                 0.8490  5.7025
Current best validation, making test prediction
     16        0.3513        0.4049                 0.8551  4.4507
     17        0.3410        0.4098                 0.8500  4.0702
     18        0.3357        0.4090                 0.8526  5.9005
     19        0.3263        0.4128                 0.8517  4.9341
Epoch    19: reducing learning rate of group 0 to 6.0000e-04.
     20        0.3029        0.4067                 0.8548  4.2285

Current best validation, making test prediction
      2        0.5648        0.5535                 0.8002  4.1409
      3        0.5166        0.5611                 0.7944  4.6408
Current best validation, making test prediction
      4        0.4929        0.5099                 0.8157  4.2911
Current best validation, making test prediction
      5        0.4656        0.4923                 0.8177  4.9464
Current best validation, making test prediction
      6        0.4468        0.4690                 0.8249  4.7024
Current best validation, making test prediction
      7        0.4299        0.4853                 0.8303  4.8430
      8        0.4075        0.4761                 0.8263  5.0557
Current best validation, making test prediction
      9        0.3948        0.4719                 0.8320  5.0139
Current best v

     21        0.1873        0.4740                 0.8506  5.2352
Epoch    21: reducing learning rate of group 0 to 3.6000e-04.
     22        0.1649        0.4664                 0.8553  6.0339
     23        0.1525        0.4827                 0.8513  5.3274
     24        0.1451        0.4886                 0.8537  4.1901
Epoch    24: reducing learning rate of group 0 to 2.1600e-04.
     25        0.1292        0.4999                 0.8520  5.2700
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8569150721718698
Saving predictions in ../predictions/0.8569150721718698_0_Layers=[424, 424, 424],Dropout:[0, 0, 0],Batchnorm,Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.7101        0.6015                 0.7707  2.919

Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8475998657267539
Saving predictions in ../predictions/0.8475998657267539_0_Layers=[644, 644, 644],Dropout:[0, 0, 0],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6971        0.5887                 0.7763  4.3319
      2        0.5714        0.6033                 0.7706  4.3645
Current best validation, making test prediction
      3        0.5365        0.5150                 0.8045  3.4333
      4        0.5093        0.5246                 0.7928  3.6771
Current best validation, making test prediction
      5        0.4871        0.5075                 0.8179  3.9313
Current best validation, making test prediction
      6        0.4729

Current best validation, making test prediction
     13        0.3225        0.4597                 0.8388  3.6935
     14        0.3141        0.4747                 0.8387  3.4056
     15        0.3020        0.4941                 0.8316  2.6771
Current best validation, making test prediction
     16        0.2953        0.4733                 0.8424  3.6796
     17        0.2838        0.4869                 0.8351  3.6588
     18        0.2783        0.4977                 0.8379  2.9660
     19        0.2674        0.5059                 0.8378  3.4440
Epoch    19: reducing learning rate of group 0 to 3.6000e-04.
Current best validation, making test prediction
     20        0.2325        0.4772                 0.8449  3.4430
     21        0.2223        0.4900                 0.8430  3.5089
     22        0.2174        0.4952                 0.8418  3.686

Epoch    15: reducing learning rate of group 0 to 3.6000e-04.
     16        0.2453        0.4275                 0.8544  5.7915
     17        0.2315        0.4394                 0.8524  6.2580
Current best validation, making test prediction
     18        0.2219        0.4368                 0.8562  7.0436
     19        0.2098        0.4642                 0.8507  6.2701
     20        0.1973        0.4686                 0.8550  5.9205
     21        0.1895        0.4857                 0.8501  6.7185
Epoch    21: reducing learning rate of group 0 to 2.1600e-04.
     22        0.1617        0.4835                 0.8552  7.6017
     23        0.1536        0.5070                 0.8510  5.5085
     24        0.1478        0.5183                 0.8536  6.1731
Epoch    24: reducing learning rate of group 0 to 1.2960e-04.
     25        0.1311        0.5356                 0.8518  4.79

Current best validation, making test prediction
      6        0.4340        0.4667                 0.8253  3.8361
Current best validation, making test prediction
      7        0.4133        0.4569                 0.8296  3.7779
      8        0.3990        0.4505                 0.8296  3.0162
Current best validation, making test prediction
      9        0.3844        0.4464                 0.8369  3.2116
     10        0.3709        0.4548                 0.8295  2.9744
     11        0.3599        0.4505                 0.8365  3.8413
Current best validation, making test prediction
     12        0.3477        0.4397                 0.8398  1.7833
Current best validation, making test prediction
     13        0.3366        0.4431                 0.8435  3.4748
     14        0.3264        0.4385

Current best validation, making test prediction
      3        0.3758        0.4339                 0.8379  3.5530
Current best validation, making test prediction
      4        0.3335        0.4206                 0.8457  3.3128
Current best validation, making test prediction
      5        0.2923        0.4189                 0.8495  3.6605
      6        0.2584        0.4381                 0.8449  3.8918
Current best validation, making test prediction
      7        0.2253        0.4501                 0.8518  3.8767
      8        0.1949        0.4710                 0.8489  3.8088
      9        0.1695        0.5163                 0.8469  4.0781
     10        0.1409        0.5351                 0.8483  3.0661
Epoch    10: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     11        0.0859

      4        0.5069        0.4847                 0.8097  8.3064
Current best validation, making test prediction
      5        0.4836        0.4636                 0.8159  4.8952
Current best validation, making test prediction
      6        0.4635        0.4572                 0.8270  4.4751
      7        0.4480        0.4565                 0.8207  7.5756
Current best validation, making test prediction
      8        0.4323        0.4409                 0.8312  6.4141
Current best validation, making test prediction
      9        0.4230        0.4331                 0.8341  6.0995
     10        0.4095        0.4343                 0.8328  6.3909
     11        0.3983        0.4362                 0.8329  6.0073
Current best validation, making test prediction
     12        0.3899        0.4268

     11        0.3209        0.4465                 0.8422  3.8591
     12        0.3111        0.4450                 0.8453  2.8529
     13        0.3003        0.4492                 0.8445  3.3364
Epoch    13: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     14        0.2727        0.4467                 0.8459  3.8577
     15        0.2630        0.4567                 0.8446  3.7182
Current best validation, making test prediction
     16        0.2570        0.4569                 0.8466  3.6621
     17        0.2508        0.4637                 0.8436  3.4182
     18        0.2444        0.4687                 0.8466  3.6040
     19        0.2374        0.4697                 0.8437  3.5435
Epoch    19: reducing learning rate of group 0 to 3.6000e-04.
     20        0.2189        0.4762                 0.8460  2.7370
  

     29        0.2104        0.5145                 0.8397  4.1317
Epoch    29: reducing learning rate of group 0 to 2.1600e-04.
     30        0.1925        0.5060                 0.8443  4.4880
Current best validation, making test prediction
     31        0.1885        0.5104                 0.8459  2.1497
     32        0.1853        0.5177                 0.8451  1.7114
     33        0.1827        0.5261                 0.8429  1.7560
     34        0.1805        0.5296                 0.8425  1.7373
Epoch    34: reducing learning rate of group 0 to 1.2960e-04.
     35        0.1693        0.5261                 0.8436  1.8111
     36        0.1669        0.5292                 0.8429  2.8828
     37        0.1653        0.5297                 0.8456  3.3774
Epoch    37: reducing learning rate of group 0 to 7.7760e-05.
     38        0.1583        0.5296                 0.8447  2.54

     15        0.1997        0.5928                 0.8416  4.3330
     16        0.1719        0.6705                 0.8393  3.0314
Epoch    16: reducing learning rate of group 0 to 3.6000e-04.
     17        0.1130        0.6447                 0.8470  3.1040
Current best validation, making test prediction
     18        0.0888        0.6974                 0.8507  4.4778
     19        0.0759        0.7294                 0.8451  4.1836
     20        0.0704        0.7866                 0.8434  4.2699
     21        0.0640        0.8308                 0.8447  2.8402
Epoch    21: reducing learning rate of group 0 to 2.1600e-04.
     22        0.0334        0.8400                 0.8499  3.8141
     23        0.0224        0.8875                 0.8487  3.6566
     24        0.0189        0.9102                 0.8499  4.3427
Epoch    24: reducing learning rate of group 0 to 1.2960e-0

Epoch    12: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     13        0.2792        0.4160                 0.8533  4.8849
     14        0.2645        0.4269                 0.8510  3.6293
     15        0.2564        0.4237                 0.8524  3.2735
     16        0.2465        0.4241                 0.8516  4.8370
Epoch    16: reducing learning rate of group 0 to 3.6000e-04.
     17        0.2208        0.4274                 0.8530  4.0587
Current best validation, making test prediction
     18        0.2131        0.4259                 0.8543  3.9428
Current best validation, making test prediction
     19        0.2062        0.4285                 0.8562  3.6899
     20        0.2000        0.4371                 0.8545  3.0510
     21        0.1931        0.4409                 0.8548  4.3171
     22        0

     17        0.1322        0.5926                 0.8481  4.2583
     18        0.1194        0.6095                 0.8505  4.6806
Epoch    18: reducing learning rate of group 0 to 2.1600e-04.
     19        0.0862        0.6338                 0.8502  2.9277
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8505370929842229
Saving predictions in ../predictions/0.8505370929842229_0_Layers=[531, 531, 531],Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6873        0.5932                 0.7851  3.4967
Current best validation, making test prediction
      2        0.5658        0.5643                 0.7852  4.8102
Current best validation, making test prediction
      3        0.5214        0.5341

Current best validation, making test prediction
     16        0.2099        0.4871                 0.8446  6.3848
     17        0.2042        0.5066                 0.8418  7.3903
     18        0.1960        0.5092                 0.8411  7.0161
     19        0.1876        0.5126                 0.8421  5.2695
Epoch    19: reducing learning rate of group 0 to 3.6000e-04.
     20        0.1630        0.5319                 0.8409  4.5783
     21        0.1563        0.5455                 0.8413  5.5558
     22        0.1511        0.5515                 0.8399  6.1716
Epoch    22: reducing learning rate of group 0 to 2.1600e-04.
     23        0.1367        0.5544                 0.8423  4.1569
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8445787176905002
Saving predictions in ../predictions/0.8445787176905002_1_Layers=[157, 157, 157],Dropout:[0, 0, 0],Batchnorm,Activatio

Current best validation, making test prediction
     32        0.1508        0.5261                 0.8492  4.0261
     33        0.1480        0.5301                 0.8473  4.6042
     34        0.1462        0.5347                 0.8455  4.8108
     35        0.1444        0.5343                 0.8469  4.7096
Epoch    35: reducing learning rate of group 0 to 7.7760e-05.
     36        0.1374        0.5384                 0.8474  3.8982
     37        0.1362        0.5403                 0.8465  4.6041
     38        0.1346        0.5451                 0.8473  4.6215
Epoch    38: reducing learning rate of group 0 to 4.6656e-05.
     39        0.1307        0.5432                 0.8483  5.4680
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8491943605236657
Saving predictions in ../predictions/0.8491943605236657_1_Layers=[468, 468],Activation=SELU.csv
Current best validatio

Current best validation, making test prediction
      2        0.5584        0.5262                 0.8033  4.1751
Current best validation, making test prediction
      3        0.5077        0.5010                 0.8145  3.9920
Current best validation, making test prediction
      4        0.4759        0.4780                 0.8215  3.9471
Current best validation, making test prediction
      5        0.4475        0.4610                 0.8345  3.9622
Current best validation, making test prediction
      6        0.4281        0.4504                 0.8359  3.4133
      7        0.4064        0.4502                 0.8340  3.6748
Current best validation, making test prediction
      8        0.3933        0.4464                 0.8363  3.9332
Current best validation, making test prediction
      9        

      5        0.4654        0.5197                 0.8118  4.2019
Current best validation, making test prediction
      6        0.4436        0.4801                 0.8263  3.2033
Current best validation, making test prediction
      7        0.4209        0.4758                 0.8292  4.7491
      8        0.4004        0.4999                 0.8268  4.4670
Current best validation, making test prediction
      9        0.3861        0.5005                 0.8311  3.7848
Current best validation, making test prediction
     10        0.3645        0.4948                 0.8355  4.4789
     11        0.3528        0.4987                 0.8355  3.5154
     12        0.3294        0.4987                 0.8319  3.8904
     13        0.3134        0.5361                 0.8335  3.9506
Epoch    13: reducing learning rate of group 0 to 6.0000e-04.

     11        0.3406        0.4676                 0.8345  3.0659
Current best validation, making test prediction
     12        0.3292        0.4520                 0.8430  3.7110
     13        0.3205        0.4669                 0.8391  3.7018
     14        0.3053        0.4697                 0.8381  3.3232
     15        0.2936        0.4701                 0.8429  3.1942
Epoch    15: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     16        0.2488        0.4541                 0.8458  3.2551
     17        0.2341        0.4765                 0.8435  3.1635
     18        0.2250        0.4805                 0.8422  3.2296
     19        0.2174        0.4893                 0.8420  3.5032
Epoch    19: reducing learning rate of group 0 to 3.6000e-04.
     20        0.1884        0.4876                 0.8454  3.0033
Cu

     23        0.1373        0.4789                 0.8546  5.0336
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8557401812688822
Saving predictions in ../predictions/0.8557401812688822_0_Layers=[803, 803],Batchnorm,Activation=SELU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6959        0.5827                 0.7797  3.3107
Current best validation, making test prediction
      2        0.5634        0.5156                 0.8035  3.8658
Current best validation, making test prediction
      3        0.5112        0.5116                 0.8077  4.5443
Current best validation, making test prediction
      4        0.4829        0.4776                 0.8201  2.0739
Current best validation

Current best validation, making test prediction
      9        0.2688        0.4387                 0.8509  3.8196
Current best validation, making test prediction
     10        0.2525        0.4485                 0.8510  4.1383
     11        0.2306        0.4667                 0.8463  4.2595
     12        0.2128        0.4704                 0.8472  4.1122
Epoch    12: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     13        0.1633        0.4686                 0.8524  4.7840
Current best validation, making test prediction
     14        0.1369        0.4943                 0.8548  3.7855
     15        0.1264        0.5238                 0.8492  4.5902
     16        0.1165        0.5407                 0.8523  4.8527
     17        0.1061        0.5664                 0.8484  4.8497
Epoch    17: reducing lear

Epoch    25: reducing learning rate of group 0 to 1.2960e-04.
     26        0.1833        0.5189                 0.8435  5.5433
     27        0.1784        0.5307                 0.8414  6.7619
     28        0.1791        0.5337                 0.8416  6.2394
Epoch    28: reducing learning rate of group 0 to 7.7760e-05.
     29        0.1741        0.5301                 0.8417  4.0542
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8475159449479691
Saving predictions in ../predictions/0.8475159449479691_1_Layers=[109, 109, 109, 109],Dropout:[0, 0, 0, 0],Batchnorm,Activation=ReLU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6732        0.5780                 0.7829  5.9040
Current best validation, making test prediction
      2        0.5304

Current best validation, making test prediction
      2        0.4387        0.4126                 0.8486  3.8575
Current best validation, making test prediction
      3        0.3847        0.3998                 0.8526  4.5406
      4        0.3428        0.4016                 0.8517  5.1173
Current best validation, making test prediction
      5        0.3119        0.3962                 0.8546  4.9067
Current best validation, making test prediction
      6        0.2779        0.4135                 0.8548  4.6703
      7        0.2512        0.4172                 0.8548  4.7574
      8        0.2218        0.4360                 0.8540  5.4558
      9        0.1956        0.4401                 0.8543  4.8872
Epoch     9: reducing learning rate of group 0 to 6.0000e-04.
Current best validation, making test prediction
     10        0.1371

     27        0.1724        0.5356                 0.8406  4.6930
Stopping since validation_accuracy has not improved in the last 8 epochs.
0.8442430345753609
Saving predictions in ../predictions/0.8442430345753609_0_Layers=[100, 100, 100],Batchnorm,Activation=ReLU.csv
Current best validation, making test prediction
  epoch    train_loss    valid_loss    validation_accuracy     dur
-------  ------------  ------------  ---------------------  ------
      1        0.6848        0.5290                 0.7958  5.4796
Current best validation, making test prediction
      2        0.5175        0.4805                 0.8196  5.2238
      3        0.4752        0.4698                 0.8191  4.9527
Current best validation, making test prediction
      4        0.4504        0.4588                 0.8251  3.6591
Current best validation, making test prediction
      5        0.4295

      6        0.4766        0.5104                 0.8060  3.6263
Current best validation, making test prediction
      7        0.4614        0.4916                 0.8149  2.9128
Current best validation, making test prediction
      8        0.4510        0.4864                 0.8156  3.6618
      9        0.4414        0.4880                 0.8151  3.9509
     10        0.4308        0.4934                 0.8135  3.3836
Current best validation, making test prediction
     11        0.4235        0.4646                 0.8228  3.8426
     12        0.4158        0.4755                 0.8217  3.7638
Current best validation, making test prediction
     13        0.4075        0.4678                 0.8254  3.4980
     14        0.4014        0.4652                 0.8239  2.8003
Current best validation, making test prediction
     15      

      5        0.4529        0.4531                 0.8218  5.7111
Current best validation, making test prediction
      6        0.4339        0.4504                 0.8279  6.1758
Current best validation, making test prediction
      7        0.4107        0.4466                 0.8329  4.7494
Current best validation, making test prediction
      8        0.3959        0.4460                 0.8367  5.4413
Current best validation, making test prediction
      9        0.3787        0.4347                 0.8387  4.1743
     10        0.3612        0.4434                 0.8384  5.1916
Current best validation, making test prediction
     11        0.3459        0.4300                 0.8428  6.1082
     12        0.3318        0.4439                 0.8372  5.5866
     13        0.3154        0.4468

     17        0.2371        0.4979                 0.8408  3.8573
     18        0.2295        0.5208                 0.8350  3.8323
     19        0.2203        0.5419                 0.8360  4.0304
Epoch    19: reducing learning rate of group 0 to 3.6000e-04.
     20        0.1802        0.5095                 0.8472  3.9780
     21        0.1677        0.5204                 0.8431  4.5302
     22        0.1618        0.5362                 0.8440  4.9604
Epoch    22: reducing learning rate of group 0 to 2.1600e-04.
     23        0.1386        0.5240                 0.8476  4.3862
Current best validation, making test prediction
     24        0.1316        0.5291                 0.8494  4.6855
     25        0.1268        0.5453                 0.8479  2.0811
     26        0.1221        0.5527                 0.8488  2.1749
     27        0.1173        0.5604

Current best validation, making test prediction
      2        0.5402        0.5175                 0.8026  4.3603
Current best validation, making test prediction
      3        0.4891        0.4810                 0.8182  4.1627
Current best validation, making test prediction
      4        0.4625        0.4731                 0.8241  4.5553
Current best validation, making test prediction
      5        0.4364        0.4588                 0.8281  4.3679
Current best validation, making test prediction
      6        0.4145        0.4453                 0.8321  3.9356
Current best validation, making test prediction
      7        0.3977        0.4445                 0.8344  4.7269
Current best validation, making test prediction
      8        0.3826        0.4378                 0.8356  3.4714
      

     15        0.1990        0.5252                 0.8452  3.9088
     16        0.1853        0.5618                 0.8379  3.3465
     17        0.1707        0.5874                 0.8468  3.3995
Epoch    17: reducing learning rate of group 0 to 3.6000e-04.
     18        0.1216        0.5840                 0.8478  3.0948
Current best validation, making test prediction
     19        0.1030        0.5992                 0.8489  4.2925
     20        0.0925        0.6473                 0.8459  4.0530
     21        0.0844        0.6897                 0.8432  3.9999
     22        0.0742        0.7210                 0.8464  2.4604
Epoch    22: reducing learning rate of group 0 to 2.1600e-04.
     23        0.0501        0.7229                 0.8484  4.0173
     24        0.0411        0.7540                 0.8489  2.9395
     25        0.0362        0.7809

Current best validation, making test prediction
      5        0.4549        0.4486                 0.8284  5.3750
Current best validation, making test prediction
      6        0.4298        0.4302                 0.8384  7.2278
      7        0.4147        0.4259                 0.8366  7.2891
      8        0.3952        0.4397                 0.8343  5.6236
Current best validation, making test prediction
      9        0.3797        0.4224                 0.8419  6.6514
     10        0.3636        0.4496                 0.8306  5.1183
Current best validation, making test prediction
     11        0.3524        0.4244                 0.8429  5.7897
Current best validation, making test prediction
     12        0.3389        0.4135                 0.8463  5.7401
     13        0.3265        0.4257                 0

Current best validation, making test prediction
      9        0.4220        0.4746                 0.8246  4.9744
Current best validation, making test prediction
     10        0.4123        0.4621                 0.8254  3.4829
     11        0.4014        0.4754                 0.8218  5.3965
Current best validation, making test prediction
     12        0.3900        0.4571                 0.8313  5.3640
     13        0.3816        0.4743                 0.8249  5.1570
Current best validation, making test prediction
     14        0.3697        0.4568                 0.8320  5.3808
     15        0.3608        0.4650                 0.8314  5.8588
Current best validation, making test prediction
     16        0.3523        0.4493                 0.8345  5.3837
     17        0.3438        0.4598                 0

Current best validation, making test prediction
      4        0.5024        0.4922                 0.8113  4.1462
Current best validation, making test prediction
      5        0.4799        0.4705                 0.8180  6.0030
Current best validation, making test prediction
      6        0.4560        0.4726                 0.8208  3.7645
Current best validation, making test prediction
      7        0.4384        0.4441                 0.8315  5.5632
      8        0.4294        0.4430                 0.8304  5.5944
Current best validation, making test prediction
      9        0.4168        0.4346                 0.8367  4.4495
     10        0.4028        0.4283                 0.8364  4.0771
     11        0.3952        0.4324                 0.8367  3.8740
Current best validation, making test prediction
     