## Sprint 9 - updated version

**-------Fixes for sprint 9-------**

After receiving feedback, we applied the suggested fixes to the deep learning part of sprint 9:

1.   Selected all the data (instead of a portion of it)
2.   Scaled the data
3.   Ran experiments with more than one activation function (added SELU and tanh)
4.   Tried out different batch sizes
5.   Avoided larger networks, as they do not seem to improve the performance of the model
6.   Included batch normalization

After these changes were applied, we executed experiments again. The following parameters and options were tested:

* batch_sizes - 16, 32, 64, 128
* epochs - 100, 500, 1000
* activations - 'relu', 'selu', 'tanh'
* architecture (neurons in each hidden layer) - 64; 128; 256; (64, 32); (128, 64)
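
As a rough illustration of how many runs these options imply, the grid can be enumerated as below. This is a minimal sketch: the variable names and the dictionary layout are assumptions for illustration, not the code used for the automated tests.

```python
from itertools import product

# Values copied from the list above.
batch_sizes = [16, 32, 64, 128]
epochs_options = [100, 500, 1000]
activations = ["relu", "selu", "tanh"]
architectures = [(64,), (128,), (256,), (64, 32), (128, 64)]

# One entry per combination of the four options (one training run each).
grid = [
    {"batch_size": b, "epochs": e, "activation": a, "hidden_layers": h}
    for b, e, a, h in product(batch_sizes, epochs_options, activations, architectures)
]

print(len(grid))  # 4 * 3 * 3 * 5 = 180 runs per optimizer
```

This matches the 180 rows in the "After Feedback" Adam results table further down.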

**Deep Learning Steps**

*1. Load Kinect movement data: Kinect frame sequences*

All the provided Kinect frame sequences were loaded into a single dataset, which was then shuffled. All X and Y coordinates were selected as features and all Z coordinates as labels. Features and labels were then scaled.


*2. Define a Deep Learning network (model)*

Our deep learning network consists of an input layer with an input shape of 26 (the number of features) and an output layer with 13 neurons (one per predicted Z coordinate). In addition, we experimented with adding one or more hidden layers and varying their number of neurons to see whether the model performs better.
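
To get a feel for how the model size changes across the tested architectures, the parameter count of such a fully connected network can be worked out by hand. The sketch below is only arithmetic: `count_parameters` is an illustrative helper, and it assumes every hidden layer is followed by a batch normalization layer that stores 4 values per unit (scale, offset, moving mean, moving variance).

```python
def count_parameters(n_inputs, hidden_layers, n_outputs, batch_norm=True):
    """Count the parameters of a fully connected network.

    Assumption: each hidden layer is followed by batch normalization,
    which stores 4 values per unit.
    """
    total = 0
    previous = n_inputs
    for units in hidden_layers:
        total += previous * units + units  # weights + biases
        if batch_norm:
            total += 4 * units
        previous = units
    total += previous * n_outputs + n_outputs  # output layer
    return total


# 26 inputs (X and Y coordinates), 13 outputs (Z coordinates).
for hidden in [(64,), (128,), (256,), (64, 32), (128, 64)]:
    print(hidden, count_parameters(26, hidden, 13))
```

Even the largest architecture in this list, (128, 64), stays at roughly 13 thousand parameters, which is small compared with the number of available frames.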

*3. Compile the DL model*

We used mean absolute error (MAE) as the loss function and experimented with the Adam, stochastic gradient descent (SGD) and Nadam optimizers. The metrics used to evaluate the model were mean absolute error and mean squared error (MSE).
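
The difference between the two metrics is easiest to see on a tiny example. The sketch below uses plain Python and made-up numbers (not values from the Kinect data): both prediction sets have the same MAE, but the one with a single large error has a higher MSE.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up scaled targets and two hypothetical sets of predictions.
y_true = [0.0, 0.5, 1.0, 0.5]
one_big_error = [0.0, 0.5, 1.0, 1.0]            # a single error of 0.5
four_small_errors = [0.125, 0.625, 0.875, 0.375]  # four errors of 0.125

for name, y_pred in [("one big error", one_big_error),
                     ("four small errors", four_small_errors)]:
    print(name,
          "mae:", mean_absolute_error(y_true, y_pred),
          "mse:", mean_squared_error(y_true, y_pred))
```

Because MAE weighs all errors linearly, using it as the loss makes training less sensitive to occasional large deviations than an MSE loss would be.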

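*4. Split training and testing sets*

The data was split into 3 parts: about 70% training set, 24% validation set and 6% testing set. The split was done in two stages: 30% of the data was held out first, and this held-out part was then divided 80/20 into validation and testing sets.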
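
The notebook code further down produces this split with two chained train_test_split calls (test_size=0.3, then test_size=0.2 on the held-out part). The resulting set sizes can be checked with simple arithmetic. This sketch assumes the test part of each split is rounded up, which agrees with the shapes printed in the notebook; `split_sizes` is an illustrative helper, not a library function.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (first part, second part) sizes for a fractional test_size.

    Assumption: the second (test) part is rounded up and the rest goes
    to the first part.
    """
    n_second = math.ceil(test_size * n_samples)
    return n_samples - n_second, n_second


n_total = 24005  # rows in the combined dataset
n_train, n_holdout = split_sizes(n_total, 0.3)
n_val, n_test = split_sizes(n_holdout, 0.2)

for name, n in [("train", n_train), ("validation", n_val), ("test", n_test)]:
    print(f"{name}: {n} rows ({n / n_total:.1%})")
```

For comparison, a second-stage fraction of 1/3 instead of 0.2 would give roughly 20% validation and 10% test data.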

*5. Train the DL model*

For training the model, we used early stopping based on the validation set, which helps keep the model from overfitting the training set. On top of that, we tested 100, 500 and 1000 as the maximum number of epochs for all optimizers. All tests were performed with batch sizes of 16, 32, 64 and 128 and with the activation functions ReLU, SELU and tanh.
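
The idea behind patience-based early stopping can be sketched in a few lines of plain Python. This is a simplified illustration, not the callback's actual implementation: the function name and the loss values are made up, and details such as a minimum improvement threshold or restoring the best weights are left out.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it runs to the end.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses: improvement stalls after epoch 3.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.65, 0.5]

print(early_stopping_epoch(losses, patience=3))  # stops at epoch 6
print(early_stopping_epoch(losses, patience=6))  # None: runs all 7 epochs
```

With early stopping in place, the epoch settings act as upper limits, so a run configured for 1000 epochs may end much earlier.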

*6. Evaluate DL model predictions*

The evaluate() method was used to evaluate the model. We ran it on both the test and training sets to check whether the model performed much worse on the test set.


### Adam optimizer
 
The best result with the Adam optimizer was a model with 2 hidden layers (128 neurons in the first and 64 in the second), the ReLU activation function and a learning rate of 1e-4, trained with a batch size of 128 and 500 epochs. This resulted in a mean absolute error of 0.0382 and a mean squared error of 0.0028.



### Stochastic gradient descent optimizer

The best result with the SGD optimizer was a model with 2 hidden layers (128 neurons in the first and 64 in the second), the tanh activation function and a learning rate of 1e-4, trained with a batch size of 64 and 1000 epochs. This resulted in a mean absolute error of 0.0446 and a mean squared error of 0.0040.

### Nesterov Adam (Nadam) optimizer

The table at the end of the document shows the results of all tests. The best result with the Nadam optimizer was a model with 2 hidden layers (128 neurons in the first and 64 in the second), the ReLU activation function and a learning rate of 1e-4, trained with a batch size of 32 and 1000 epochs. This resulted in a mean absolute error of 0.0384 and a mean squared error of 0.0028.

From the automated tests, the best architecture and parameters for the model appear to be:
- Adam optimizer (*Adam(learning_rate = 1e-4)*)
- fitted with batch size of 128, 500 epochs
- 2 hidden layers, 128 and 64 neurons with the ReLU activation function


In [None]:
import tensorflow as tf
import numpy as np
import pandas as pd
import glob
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

In [None]:
# Load every Kinect CSV file in the working directory into one DataFrame
frames = [pd.read_csv(path) for path in glob.glob('*_kinect.csv')]
data = pd.concat(frames, axis=0)
data.shape

(24005, 40)

In [None]:
data.head()

Unnamed: 0,FrameNo,head_x,head_y,head_z,left_shoulder_x,left_shoulder_y,left_shoulder_z,left_elbow_x,left_elbow_y,left_elbow_z,right_shoulder_x,right_shoulder_y,right_shoulder_z,right_elbow_x,right_elbow_y,right_elbow_z,left_hand_x,left_hand_y,left_hand_z,right_hand_x,right_hand_y,right_hand_z,left_hip_x,left_hip_y,left_hip_z,right_hip_x,right_hip_y,right_hip_z,left_knee_x,left_knee_y,left_knee_z,right_knee_x,right_knee_y,right_knee_z,left_foot_x,left_foot_y,left_foot_z,right_foot_x,right_foot_y,right_foot_z
0,54,-0.000362,0.77182,0.038403,-0.14051,0.55397,0.019,-0.22268,0.7611,-0.041707,0.14028,0.53996,0.013771,0.21296,0.75192,-0.045553,-0.26133,0.97581,-0.11709,0.25002,0.96937,-0.11229,-0.070782,0.058053,-0.036604,0.073053,0.052,-0.030421,-0.12288,-0.35631,-0.04834,0.11515,-0.38198,-0.027392,-0.12338,-0.65977,-0.050811,0.12554,-0.68488,-0.057272
1,55,-0.000749,0.77233,0.037489,-0.14042,0.55524,0.018899,-0.22249,0.76242,-0.041701,0.14023,0.53996,0.012913,0.21235,0.75195,-0.045572,-0.26108,0.97718,-0.11708,0.24824,0.96951,-0.11235,-0.07095,0.059132,-0.036741,0.072963,0.053121,-0.030419,-0.12286,-0.3563,-0.048592,0.11517,-0.38089,-0.027927,-0.12338,-0.65973,-0.050477,0.12547,-0.68383,-0.057571
2,56,-0.001093,0.77294,0.036558,-0.14035,0.55651,0.018697,-0.2223,0.76374,-0.041695,0.14025,0.53996,0.012135,0.21187,0.75195,-0.045587,-0.26079,0.97849,-0.11708,0.24661,0.96951,-0.1124,-0.071111,0.060158,-0.036746,0.072893,0.054178,-0.030421,-0.12281,-0.35627,-0.048846,0.11518,-0.37993,-0.028485,-0.12338,-0.6597,-0.050198,0.12542,-0.68291,-0.057858
3,57,-0.001415,0.7731,0.036013,-0.14035,0.55701,0.018478,-0.22208,0.76445,-0.041688,0.14026,0.53982,0.011722,0.21146,0.75203,-0.0456,-0.26043,0.97905,-0.11706,0.24499,0.96951,-0.11245,-0.071132,0.06048,-0.036747,0.072892,0.054551,-0.03041,-0.12278,-0.35605,-0.049004,0.1152,-0.37956,-0.029059,-0.12339,-0.65947,-0.050053,0.12543,-0.68254,-0.058122
4,58,-0.001701,0.77322,0.035327,-0.14034,0.55748,0.018235,-0.22194,0.76476,-0.043394,0.14027,0.53953,0.011416,0.21115,0.752,-0.046141,-0.25977,0.97929,-0.11945,0.24358,0.96951,-0.11303,-0.071132,0.060772,-0.036747,0.072892,0.054903,-0.030408,-0.12277,-0.35581,-0.049096,0.11526,-0.37926,-0.029677,-0.12339,-0.65924,-0.049983,0.12544,-0.68226,-0.058466


In [None]:
# Shuffle data once, with a fixed seed so that the order is reproducible
shuffled_data = data.sample(frac=1, random_state=42).reset_index(drop=True)
# Drop 'FrameNo' column, because it's not needed
shuffled_data = shuffled_data.drop('FrameNo', axis=1)

In [None]:
# Split into features and labels
X = shuffled_data.filter(regex='_x|_y')
y = shuffled_data.filter(regex='_z')

In [None]:
# Scale features and labels
feature_scaler = MinMaxScaler()
label_scaler = MinMaxScaler()
features = feature_scaler.fit_transform(X)
labels = label_scaler.fit_transform(y)

In [None]:
# Split into train (70%) and a held-out part (30%), then split the held-out
# part into validation (80% of it) and test (20% of it)
x_train, x_holdout, y_train, y_holdout = train_test_split(features, labels, test_size=0.3, random_state=42)
x_val, x_test, y_val, y_test = train_test_split(x_holdout, y_holdout, test_size=0.2, random_state=42)

In [None]:
print(X.shape)
print(x_train.shape)
print(x_val.shape)
print(x_test.shape)

(24005, 26)
(16803, 26)
(5761, 26)
(1441, 26)


In [None]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=(26,)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dense(13))

In [None]:
model.compile(loss='mean_absolute_error', optimizer=tf.keras.optimizers.Adam(learning_rate = 1e-4), metrics = ['mse'])

In [None]:
es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=6)
model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=500, verbose=0, batch_size=128, callbacks=[es])

<tensorflow.python.keras.callbacks.History at 0x7f1c25a47f60>

In [None]:
model.evaluate(x_test, y_test, batch_size=128, verbose=0)

[0.04554207995533943, 0.004041305743157864]
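
Because the labels were min-max scaled before training, the loss values above are expressed in scaled units rather than in the original Kinect coordinate units. For a single column, an MAE in scaled units can be converted back by multiplying it by that column's range. The sketch below shows this in plain Python with made-up Z values; `to_original_units` and `scale` are illustrative helpers.

```python
def to_original_units(mae_scaled, column_min, column_max):
    """Convert an MAE measured on a min-max scaled column back to the column's own units."""
    return mae_scaled * (column_max - column_min)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up Z values for a single joint (illustrative, not from the dataset).
z_true = [-0.10, 0.00, 0.10, 0.30]
z_pred = [-0.06, 0.02, 0.10, 0.26]

lo, hi = min(z_true), max(z_true)


def scale(values):
    """Min-max scale to [0, 1] using the range of the true values."""
    return [(v - lo) / (hi - lo) for v in values]


mae_original = mean_absolute_error(z_true, z_pred)
mae_scaled = mean_absolute_error(scale(z_true), scale(z_pred))

# The converted value agrees with the MAE computed directly on the raw values.
print(mae_scaled, to_original_units(mae_scaled, lo, hi), mae_original)
```

The reported MAE is an average over 13 label columns with different ranges, so a single multiplication does not apply to it directly. With scikit-learn's MinMaxScaler, the per-column ranges are available as `label_scaler.data_range_`, or predictions can be passed through `label_scaler.inverse_transform` before computing the error.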

## Software Development part

1. Designed a deep learning pipeline to simplify model training, testing and deployment for inference. The pipeline consists of the following modules:
* Data creator - Reads the data from the given folder and processes it into separate datasets for training and testing the model.
* Model - Serves as a wrapper for the model architecture to simplify model compilation, saving and serving.
* Trainer - Defines the pipeline for model training. Integrates the Model and the Data creator with the configured hyperparameters to start the training. Includes an early stopping callback that stops the training session if the loss has not decreased within the given N epochs. Also contains the TensorBoard callback for interactive visualization of the optimization process and for saving logs.
* Estimator - Serves the model and runs inference on a given set of data.
* Config - Contains a multi-level dictionary for project configuration.
2. Implemented a wrapper class for PoseNet. This keeps the code clean and flexible, as the wrapper will serve as one of the key components during end-product service integration. The model and session components were also optimized.
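
To illustrate what a multi-level configuration dictionary like the one in the Config module might look like, here is a minimal sketch. Every key, value and the `get_option` helper are hypothetical and chosen for illustration only; the project's actual configuration is not shown in this document.

```python
# Hypothetical configuration; keys and values are illustrative only.
CONFIG = {
    "data": {"folder": "data/", "file_pattern": "*_kinect.csv"},
    "model": {"hidden_layers": [128, 64], "activation": "relu"},
    "training": {
        "batch_size": 128,
        "epochs": 500,
        "early_stopping": {"monitor": "val_loss", "patience": 6},
    },
}


def get_option(config, dotted_key, default=None):
    """Look up a nested option such as 'training.early_stopping.patience'."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


print(get_option(CONFIG, "training.early_stopping.patience"))
print(get_option(CONFIG, "training.learning_rate", default=1e-4))
```

Keeping all options in one nested structure lets modules such as the Trainer read their settings from a single place instead of hard-coding them.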

### Adam optimizer results

- Before Feedback:  

| Optimizer |	Hidden layer neurons (in sequence) |	Train set |	Test set |
|---------------|---------------|---------------------------|---------------------------|
| Adam, lr=3e-4 | 128            | mse: 0.0028 - mae: 0.0360     | mse: 0.0027 - mae: 0.0358     |
| Adam, lr=3e-5 | 128            | mse: 0.0029 - mae: 0.0366     | mse: 0.0028 - mae: 0.0363     |
| Adam, lr=3e-6 | 128            | mse: 0.0039 - mae: 0.0433     | mse: 0.0037 - mae: 0.0427     |
| Adam, lr=3e-4 | 512, 256, 128  | mse: 0.0009 - mae: 0.0207     | mse: 0.0011 - mae: 0.0232     |
| Adam, lr=3e-5 | 512, 256, 128  | mse: 0.0015 - mae: 0.0261     | mse: 0.0018 - mae: 0.0281     |
| Adam, lr=3e-6 | 512, 256, 128  | mse: 0.0020 - mae: 0.0301     | mse: 0.0023 - mae: 0.0320     |
|**Adam, lr=3e-4** | 1024, 512, 128 | mse: 4.5195e-04 - mae: 0.0146 | **mse: 6.1760e-04 - mae: 0.0167** |
| Adam, lr=3e-5 | 1024, 512, 128 | mse: 0.0014 - mae: 0.0248     | mse: 0.0017 - mae: 0.0272     |
| Adam, lr=3e-6 | 1024, 512, 128 | mse: 0.0023 - mae: 0.0317     | mse: 0.0022 - mae: 0.0317     |


- After Feedback:  
Adam(learning_rate = 1e-4) 

| Neurons | Activation function | Epochs | Batch size | Results on test set|
|---------|---------------------|--------|------------|--------------------|
|64, None|relu | 100 | 16 |  mae (loss): 0.0537, mse: 0.0054 |
|128, None|relu | 100 | 16 |  mae (loss): 0.0498, mse: 0.0046 |
|256, None|relu | 100 | 16 |  mae (loss): 0.0515, mse: 0.0048 |
|64, 32|relu | 100 | 16 |  mae (loss): 0.0471, mse: 0.0042 |
|128, 64|relu | 100 | 16 |  mae (loss): 0.0390, mse: 0.0028 |
|64, None|selu | 100 | 16 |  mae (loss): 0.0541, mse: 0.0057 |
|128, None|selu | 100 | 16 |  mae (loss): 0.0607, mse: 0.0064 |
|256, None|selu | 100 | 16 |  mae (loss): 0.0898, mse: 0.0131 |
|64, 32|selu | 100 | 16 |  mae (loss): 0.0524, mse: 0.0052 |
|128, 64|selu | 100 | 16 |  mae (loss): 0.0465, mse: 0.0041 |
|64, None|tanh | 100 | 16 |  mae (loss): 0.0577, mse: 0.0063 |
|128, None|tanh | 100 | 16 |  mae (loss): 0.0682, mse: 0.0081 |
|256, None|tanh | 100 | 16 |  mae (loss): 0.0698, mse: 0.0085 |
|64, 32|tanh | 100 | 16 |  mae (loss): 0.0511, mse: 0.0047 |
|128, 64|tanh | 100 | 16 |  mae (loss): 0.0517, mse: 0.0048 |
|64, None|relu | 500 | 16 |  mae (loss): 0.0536, mse: 0.0055 |
|128, None|relu | 500 | 16 |  mae (loss): 0.0574, mse: 0.0061 |
|256, None|relu | 500 | 16 |  mae (loss): 0.0485, mse: 0.0043 |
|64, 32|relu | 500 | 16 |  mae (loss): 0.0448, mse: 0.0041 |
|128, 64|relu | 500 | 16 |  mae (loss): 0.0422, mse: 0.0033 |
|64, None|selu | 500 | 16 |  mae (loss): 0.0610, mse: 0.0069 |
|128, None|selu | 500 | 16 |  mae (loss): 0.0607, mse: 0.0067 |
|256, None|selu | 500 | 16 |  mae (loss): 0.0743, mse: 0.0092 |
|64, 32|selu | 500 | 16 |  mae (loss): 0.0504, mse: 0.0048 |
|128, 64|selu | 500 | 16 |  mae (loss): 0.0488, mse: 0.0045 |
|64, None|tanh | 500 | 16 |  mae (loss): 0.0578, mse: 0.0062 |
|128, None|tanh | 500 | 16 |  mae (loss): 0.0741, mse: 0.0095 |
|256, None|tanh | 500 | 16 |  mae (loss): 0.0700, mse: 0.0088 |
|64, 32|tanh | 500 | 16 |  mae (loss): 0.0527, mse: 0.0051 |
|128, 64|tanh | 500 | 16 |  mae (loss): 0.0589, mse: 0.0061 |
|64, None|relu | 1000 | 16 |  mae (loss): 0.0525, mse: 0.0053 |
|128, None|relu | 1000 | 16 |  mae (loss): 0.0549, mse: 0.0055 |
|256, None|relu | 1000 | 16 |  mae (loss): 0.0510, mse: 0.0045 |
|64, 32|relu | 1000 | 16 |  mae (loss): 0.0508, mse: 0.0050 |
|128, 64|relu | 1000 | 16 |  mae (loss): 0.0432, mse: 0.0035 |
|64, None|selu | 1000 | 16 |  mae (loss): 0.0594, mse: 0.0065 |
|128, None|selu | 1000 | 16 |  mae (loss): 0.0624, mse: 0.0070 |
|256, None|selu | 1000 | 16 |  mae (loss): 0.0717, mse: 0.0085 |
|64, 32|selu | 1000 | 16 |  mae (loss): 0.0529, mse: 0.0051 |
|128, 64|selu | 1000 | 16 |  mae (loss): 0.0462, mse: 0.0040 |
|64, None|tanh | 1000 | 16 |  mae (loss): 0.0612, mse: 0.0070 |
|128, None|tanh | 1000 | 16 |  mae (loss): 0.0574, mse: 0.0059 |
|256, None|tanh | 1000 | 16 |  mae (loss): 0.0708, mse: 0.0091 |
|64, 32|tanh | 1000 | 16 |  mae (loss): 0.0531, mse: 0.0053 |
|128, 64|tanh | 1000 | 16 |  mae (loss): 0.0547, mse: 0.0052 |
|64, None|relu | 100 | 32 |  mae (loss): 0.0532, mse: 0.0053 |
|128, None|relu | 100 | 32 |  mae (loss): 0.0506, mse: 0.0048 |
|256, None|relu | 100 | 32 |  mae (loss): 0.0480, mse: 0.0041 |
|64, 32|relu | 100 | 32 |  mae (loss): 0.0463, mse: 0.0042 |
|128, 64|relu | 100 | 32 |  mae (loss): 0.0433, mse: 0.0034 |
|64, None|selu | 100 | 32 |  mae (loss): 0.0550, mse: 0.0059 |
|128, None|selu | 100 | 32 |  mae (loss): 0.0584, mse: 0.0062 |
|256, None|selu | 100 | 32 |  mae (loss): 0.0628, mse: 0.0069 |
|64, 32|selu | 100 | 32 |  mae (loss): 0.0555, mse: 0.0056 |
|128, 64|selu | 100 | 32 |  mae (loss): 0.0504, mse: 0.0045 |
|64, None|tanh | 100 | 32 |  mae (loss): 0.0555, mse: 0.0060 |
|128, None|tanh | 100 | 32 |  mae (loss): 0.0623, mse: 0.0071 |
|256, None|tanh | 100 | 32 |  mae (loss): 0.0885, mse: 0.0129 |
|64, 32|tanh | 100 | 32 |  mae (loss): 0.0483, mse: 0.0043 |
|128, 64|tanh | 100 | 32 |  mae (loss): 0.0524, mse: 0.0049 |
|64, None|relu | 500 | 32 |  mae (loss): 0.0510, mse: 0.0050 |
|128, None|relu | 500 | 32 |  mae (loss): 0.0481, mse: 0.0044 |
|256, None|relu | 500 | 32 |  mae (loss): 0.0520, mse: 0.0047 |
|64, 32|relu | 500 | 32 |  mae (loss): 0.0494, mse: 0.0046 |
|128, 64|relu | 500 | 32 |  mae (loss): 0.0444, mse: 0.0036 |
|64, None|selu | 500 | 32 |  mae (loss): 0.0591, mse: 0.0066 |
|128, None|selu | 500 | 32 |  mae (loss): 0.0576, mse: 0.0063 |
|256, None|selu | 500 | 32 |  mae (loss): 0.0702, mse: 0.0083 |
|64, 32|selu | 500 | 32 |  mae (loss): 0.0511, mse: 0.0051 |
|128, 64|selu | 500 | 32 |  mae (loss): 0.0479, mse: 0.0043 |
|64, None|tanh | 500 | 32 |  mae (loss): 0.0562, mse: 0.0061 |
|128, None|tanh | 500 | 32 |  mae (loss): 0.0574, mse: 0.0060 |
|256, None|tanh | 500 | 32 |  mae (loss): 0.0680, mse: 0.0082 |
|64, 32|tanh | 500 | 32 |  mae (loss): 0.0456, mse: 0.0040 |
|128, 64|tanh | 500 | 32 |  mae (loss): 0.0530, mse: 0.0052 |
|64, None|relu | 1000 | 32 |  mae (loss): 0.0503, mse: 0.0048 |
|128, None|relu | 1000 | 32 |  mae (loss): 0.0503, mse: 0.0048 |
|256, None|relu | 1000 | 32 |  mae (loss): 0.0490, mse: 0.0044 |
|64, 32|relu | 1000 | 32 |  mae (loss): 0.0467, mse: 0.0042 |
|128, 64|relu | 1000 | 32 |  mae (loss): 0.0403, mse: 0.0031 |
|64, None|selu | 1000 | 32 |  mae (loss): 0.0539, mse: 0.0056 |
|128, None|selu | 1000 | 32 |  mae (loss): 0.0563, mse: 0.0060 |
|256, None|selu | 1000 | 32 |  mae (loss): 0.0584, mse: 0.0060 |
|64, 32|selu | 1000 | 32 |  mae (loss): 0.0498, mse: 0.0047 |
|128, 64|selu | 1000 | 32 |  mae (loss): 0.0528, mse: 0.0049 |
|64, None|tanh | 1000 | 32 |  mae (loss): 0.0562, mse: 0.0060 |
|128, None|tanh | 1000 | 32 |  mae (loss): 0.0573, mse: 0.0062 |
|256, None|tanh | 1000 | 32 |  mae (loss): 0.0697, mse: 0.0084 |
|64, 32|tanh | 1000 | 32 |  mae (loss): 0.0506, mse: 0.0047 |
|128, 64|tanh | 1000 | 32 |  mae (loss): 0.0530, mse: 0.0051 |
|64, None|relu | 100 | 64 |  mae (loss): 0.0542, mse: 0.0058 |
|128, None|relu | 100 | 64 |  mae (loss): 0.0542, mse: 0.0055 |
|256, None|relu | 100 | 64 |  mae (loss): 0.0478, mse: 0.0041 |
|64, 32|relu | 100 | 64 |  mae (loss): 0.0448, mse: 0.0040 |
|128, 64|relu | 100 | 64 |  mae (loss): 0.0418, mse: 0.0033 |
|64, None|selu | 100 | 64 |  mae (loss): 0.0653, mse: 0.0078 |
|128, None|selu | 100 | 64 |  mae (loss): 0.0587, mse: 0.0065 |
|256, None|selu | 100 | 64 |  mae (loss): 0.0677, mse: 0.0076 |
|64, 32|selu | 100 | 64 |  mae (loss): 0.0519, mse: 0.0051 |
|128, 64|selu | 100 | 64 |  mae (loss): 0.0449, mse: 0.0039 |
|64, None|tanh | 100 | 64 |  mae (loss): 0.0582, mse: 0.0065 |
|128, None|tanh | 100 | 64 |  mae (loss): 0.0593, mse: 0.0064 |
|256, None|tanh | 100 | 64 |  mae (loss): 0.0865, mse: 0.0121 |
|64, 32|tanh | 100 | 64 |  mae (loss): 0.0541, mse: 0.0052 |
|128, 64|tanh | 100 | 64 |  mae (loss): 0.0552, mse: 0.0055 |
|64, None|relu | 500 | 64 |  mae (loss): 0.0639, mse: 0.0078 |
|128, None|relu | 500 | 64 |  mae (loss): 0.0585, mse: 0.0061 |
|256, None|relu | 500 | 64 |  mae (loss): 0.0511, mse: 0.0046 |
|64, 32|relu | 500 | 64 |  mae (loss): 0.0477, mse: 0.0043 |
|128, 64|relu | 500 | 64 |  mae (loss): 0.0414, mse: 0.0032 |
|64, None|selu | 500 | 64 |  mae (loss): 0.0555, mse: 0.0060 |
|128, None|selu | 500 | 64 |  mae (loss): 0.0594, mse: 0.0064 |
|256, None|selu | 500 | 64 |  mae (loss): 0.0625, mse: 0.0066 |
|64, 32|selu | 500 | 64 |  mae (loss): 0.0503, mse: 0.0049 |
|128, 64|selu | 500 | 64 |  mae (loss): 0.0502, mse: 0.0047 |
|64, None|tanh | 500 | 64 |  mae (loss): 0.0603, mse: 0.0067 |
|128, None|tanh | 500 | 64 |  mae (loss): 0.0637, mse: 0.0072 |
|256, None|tanh | 500 | 64 |  mae (loss): 0.0660, mse: 0.0081 |
|64, 32|tanh | 500 | 64 |  mae (loss): 0.0530, mse: 0.0053 |
|128, 64|tanh | 500 | 64 |  mae (loss): 0.0506, mse: 0.0046 |
|64, None|relu | 1000 | 64 |  mae (loss): 0.0560, mse: 0.0059 |
|128, None|relu | 1000 | 64 |  mae (loss): 0.0550, mse: 0.0056 |
|256, None|relu | 1000 | 64 |  mae (loss): 0.0518, mse: 0.0047 |
|64, 32|relu | 1000 | 64 |  mae (loss): 0.0456, mse: 0.0040 |
|128, 64|relu | 1000 | 64 |  mae (loss): 0.0411, mse: 0.0030 |
|64, None|selu | 1000 | 64 |  mae (loss): 0.0536, mse: 0.0057 |
|128, None|selu | 1000 | 64 |  mae (loss): 0.0600, mse: 0.0062 |
|256, None|selu | 1000 | 64 |  mae (loss): 0.0682, mse: 0.0079 |
|64, 32|selu | 1000 | 64 |  mae (loss): 0.0491, mse: 0.0047 |
|128, 64|selu | 1000 | 64 |  mae (loss): 0.0450, mse: 0.0037 |
|64, None|tanh | 1000 | 64 |  mae (loss): 0.0564, mse: 0.0062 |
|128, None|tanh | 1000 | 64 |  mae (loss): 0.0654, mse: 0.0072 |
|256, None|tanh | 1000 | 64 |  mae (loss): 0.0620, mse: 0.0068 |
|64, 32|tanh | 1000 | 64 |  mae (loss): 0.0462, mse: 0.0041 |
|128, 64|tanh | 1000 | 64 |  mae (loss): 0.0477, mse: 0.0041 |
|64, None|relu | 100 | 128 |  mae (loss): 0.0592, mse: 0.0066 |
|128, None|relu | 100 | 128 |  mae (loss): 0.0566, mse: 0.0060 |
|256, None|relu | 100 | 128 |  mae (loss): 0.0507, mse: 0.0046 |
|64, 32|relu | 100 | 128 |  mae (loss): 0.0468, mse: 0.0042 |
|128, 64|relu | 100 | 128 |  mae (loss): 0.0431, mse: 0.0036 |
|64, None|selu | 100 | 128 |  mae (loss): 0.0604, mse: 0.0070 |
|128, None|selu | 100 | 128 |  mae (loss): 0.0661, mse: 0.0078 |
|256, None|selu | 100 | 128 |  mae (loss): 0.0585, mse: 0.0061 |
|64, 32|selu | 100 | 128 |  mae (loss): 0.0497, mse: 0.0047 |
|128, 64|selu | 100 | 128 |  mae (loss): 0.0498, mse: 0.0045 |
|64, None|tanh | 100 | 128 |  mae (loss): 0.0559, mse: 0.0060 |
|128, None|tanh | 100 | 128 |  mae (loss): 0.0631, mse: 0.0072 |
|256, None|tanh | 100 | 128 |  mae (loss): 0.0708, mse: 0.0086 |
|64, 32|tanh | 100 | 128 |  mae (loss): 0.0479, mse: 0.0044 |
|128, 64|tanh | 100 | 128 |  mae (loss): 0.0573, mse: 0.0057 |
|64, None|relu | 500 | 128 |  mae (loss): 0.0553, mse: 0.0060 |
|128, None|relu | 500 | 128 |  mae (loss): 0.0511, mse: 0.0048 |
|256, None|relu | 500 | 128 |  mae (loss): 0.0574, mse: 0.0059 |
|64, 32|relu | 500 | 128 |  mae (loss): 0.0462, mse: 0.0043 |
|128, 64|relu | 500 | 128 |  mae (loss): 0.0382, mse: 0.0028 |
|64, None|selu | 500 | 128 |  mae (loss): 0.0630, mse: 0.0074 |
|128, None|selu | 500 | 128 |  mae (loss): 0.0648, mse: 0.0077 |
|256, None|selu | 500 | 128 |  mae (loss): 0.0673, mse: 0.0078 |
|64, 32|selu | 500 | 128 |  mae (loss): 0.0493, mse: 0.0047 |
|128, 64|selu | 500 | 128 |  mae (loss): 0.0462, mse: 0.0040 |
|64, None|tanh | 500 | 128 |  mae (loss): 0.0584, mse: 0.0066 |
|128, None|tanh | 500 | 128 |  mae (loss): 0.0645, mse: 0.0075 |
|256, None|tanh | 500 | 128 |  mae (loss): 0.0686, mse: 0.0083 |
|64, 32|tanh | 500 | 128 |  mae (loss): 0.0516, mse: 0.0051 |
|128, 64|tanh | 500 | 128 |  mae (loss): 0.0497, mse: 0.0046 |
|64, None|relu | 1000 | 128 |  mae (loss): 0.0572, mse: 0.0064 |
|128, None|relu | 1000 | 128 |  mae (loss): 0.0555, mse: 0.0055 |
|256, None|relu | 1000 | 128 |  mae (loss): 0.0492, mse: 0.0043 |
|64, 32|relu | 1000 | 128 |  mae (loss): 0.0444, mse: 0.0039 |
|128, 64|relu | 1000 | 128 |  mae (loss): 0.0441, mse: 0.0038 |
|64, None|selu | 1000 | 128 |  mae (loss): 0.0564, mse: 0.0060 |
|128, None|selu | 1000 | 128 |  mae (loss): 0.0603, mse: 0.0068 |
|256, None|selu | 1000 | 128 |  mae (loss): 0.0688, mse: 0.0083 |
|64, 32|selu | 1000 | 128 |  mae (loss): 0.0458, mse: 0.0041 |
|128, 64|selu | 1000 | 128 |  mae (loss): 0.0498, mse: 0.0046 |
|64, None|tanh | 1000 | 128 |  mae (loss): 0.0580, mse: 0.0066 |
|128, None|tanh | 1000 | 128 |  mae (loss): 0.0738, mse: 0.0095 |
|256, None|tanh | 1000 | 128 |  mae (loss): 0.0633, mse: 0.0072 |
|64, 32|tanh | 1000 | 128 |  mae (loss): 0.0533, mse: 0.0053 |
|128, 64|tanh | 1000 | 128 |  mae (loss): 0.0529, mse: 0.0052 |


Best result: neurons=128, 64, activation=relu, batch_size=128, epochs=500, MAE=0.0382, MSE=0.0028
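
Picking the best row out of a results table like the one above can be automated. The sketch below parses a few rows copied from the Adam table and selects the one with the lowest MAE. `parse_row` is an illustrative helper that assumes the row layout used in this document.

```python
def parse_row(line):
    """Parse one results row into a dictionary.

    Assumes the layout 'neurons | activation | epochs | batch size | mae (loss): X, mse: Y',
    with or without leading and trailing pipes.
    """
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    neurons, activation, epochs, batch_size, result = cells
    mae, mse = (float(part.split(":")[1]) for part in result.split(","))
    return {
        "neurons": neurons,
        "activation": activation,
        "epochs": int(epochs),
        "batch_size": int(batch_size),
        "mae": mae,
        "mse": mse,
    }


# A few rows copied from the Adam "After Feedback" table.
lines = [
    "|128, 64|relu | 100 | 16 |  mae (loss): 0.0390, mse: 0.0028 |",
    "|128, 64|relu | 1000 | 32 |  mae (loss): 0.0403, mse: 0.0031 |",
    "|128, 64|relu | 500 | 128 |  mae (loss): 0.0382, mse: 0.0028 |",
    "|64, 32|relu | 1000 | 128 |  mae (loss): 0.0444, mse: 0.0039 |",
]

best = min((parse_row(line) for line in lines), key=lambda row: row["mae"])
print(best)
```

Run over the full table, the same approach would rank all 180 configurations by test MAE.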

### SGD optimizer results

- Before Feedback:  

| Optimizer |	Hidden layer neurons (in sequence) |	Train set |	Test set |
|---------------|---------------|---------------------------|---------------------------|
| **SGD, lr=3e-4** | 128            | mse: 0.0046 - mae: 0.0478     | **mse: 0.0042 - mae: 0.0466**     |
| **SGD, lr=3e-5** | 128            | mse: 0.0046 - mae: 0.0478     | **mse: 0.0042 - mae: 0.0466**     |
| **SGD, lr=3e-6** | 128            | mse: 0.0046 - mae: 0.0478     | **mse: 0.0042 - mae: 0.0466**     |
| SGD, lr=3e-4 | 512, 256, 128  | mse: 0.0074 - mae: 0.0632     | mse: 0.0070 - mae: 0.0624     |
| SGD, lr=3e-5 | 512, 256, 128  | mse: 0.0157 - mae: 0.0907     | mse: 0.0151 - mae: 0.0899     |
| SGD, lr=3e-6 | 512, 256, 128  | mse: 0.0297 - mae: 0.1292     | mse: 0.0290 - mae: 0.1281     |
|SGD, lr=3e-4 | 1024, 512, 256 | mse: 0.0070 - mae: 0.0617 | mse: 0.0067 - mae: 0.0610 |
| SGD, lr=3e-5 | 1024, 512, 256 | mse: 0.0143 - mae: 0.0861     | mse: 0.0138 - mae: 0.0857     |
| SGD, lr=3e-6 | 1024, 512, 256 | mse: 0.0258 - mae: 0.1168     | mse: 0.0252 - mae: 0.1159     |



- After Feedback:  
SGD(learning_rate = 1e-4, decay=1e-7, momentum=0.9, nesterov=True)

| Neurons | Activation function | Epochs | Batch size | Results on test set|
|---------|---------------------|--------|------------|--------------------|
64, None | relu | 100 | 16 |  mae (loss): 0.0586, mse: 0.0066 |
128, None | relu | 100 | 16 |  mae (loss): 0.0522, mse: 0.0056 |
256, None | relu | 100 | 16 |  mae (loss): 0.0512, mse: 0.0051 |
64, 32 | relu | 100 | 16 |  mae (loss): 0.0580, mse: 0.0068 |
128, 64 | relu | 100 | 16 |  mae (loss): 0.0520, mse: 0.0056 |
64, None | selu | 100 | 16 |  mae (loss): 0.0572, mse: 0.0063 |
128, None | selu | 100 | 16 |  mae (loss): 0.0553, mse: 0.0061 |
256, None | selu | 100 | 16 |  mae (loss): 0.0522, mse: 0.0052 |
64, 32 | selu | 100 | 16 |  mae (loss): 0.0568, mse: 0.0064 |
128, 64 | selu | 100 | 16 |  mae (loss): 0.0522, mse: 0.0056 |
64, None | tanh | 100 | 16 |  mae (loss): 0.0587, mse: 0.0068 |
128, None | tanh | 100 | 16 |  mae (loss): 0.0569, mse: 0.0064 |
256, None | tanh | 100 | 16 |  mae (loss): 0.0616, mse: 0.0074 |
64, 32 | tanh | 100 | 16 |  mae (loss): 0.0615, mse: 0.0073 |
128, 64 | tanh | 100 | 16 |  mae (loss): 0.0570, mse: 0.0064 |
64, None | relu | 500 | 16 |  mae (loss): 0.0561, mse: 0.0061 |
128, None | relu | 500 | 16 |  mae (loss): 0.0517, mse: 0.0052 |
256, None | relu | 500 | 16 |  mae (loss): 0.0487, mse: 0.0047 |
64, 32 | relu | 500 | 16 |  mae (loss): 0.0517, mse: 0.0054 |
128, 64 | relu | 500 | 16 |  mae (loss): 0.0488, mse: 0.0048 |
64, None | selu | 500 | 16 |  mae (loss): 0.0525, mse: 0.0056 |
128, None | selu | 500 | 16 |  mae (loss): 0.0552, mse: 0.0061 |
256, None | selu | 500 | 16 |  mae (loss): 0.0567, mse: 0.0061 |
64, 32 | selu | 500 | 16 |  mae (loss): 0.0523, mse: 0.0054 |
128, 64 | selu | 500 | 16 |  mae (loss): 0.0498, mse: 0.0050 |
64, None | tanh | 500 | 16 |  mae (loss): 0.0554, mse: 0.0061 |
128, None | tanh | 500 | 16 |  mae (loss): 0.0571, mse: 0.0064 |
256, None | tanh | 500 | 16 |  mae (loss): 0.0607, mse: 0.0071 |
64, 32 | tanh | 500 | 16 |  mae (loss): 0.0549, mse: 0.0058 |
128, 64 | tanh | 500 | 16 |  mae (loss): 0.0508, mse: 0.0051 |
64, None | relu | 1000 | 16 |  mae (loss): 0.0498, mse: 0.0050 |
128, None | relu | 1000 | 16 |  mae (loss): 0.0534, mse: 0.0056 |
256, None | relu | 1000 | 16 |  mae (loss): 0.0462, mse: 0.0041 |
64, 32 | relu | 1000 | 16 |  mae (loss): 0.0522, mse: 0.0055 |
128, 64 | relu | 1000 | 16 |  mae (loss): 0.0486, mse: 0.0048 |
64, None | selu | 1000 | 16 |  mae (loss): 0.0558, mse: 0.0063 |
128, None | selu | 1000 | 16 |  mae (loss): 0.0539, mse: 0.0060 |
256, None | selu | 1000 | 16 |  mae (loss): 0.0564, mse: 0.0063 |
64, 32 | selu | 1000 | 16 |  mae (loss): 0.0522, mse: 0.0055 |
128, 64 | selu | 1000 | 16 |  mae (loss): 0.0509, mse: 0.0052 |
64, None | tanh | 1000 | 16 |  mae (loss): 0.0551, mse: 0.0062 |
128, None | tanh | 1000 | 16 |  mae (loss): 0.0560, mse: 0.0063 |
256, None | tanh | 1000 | 16 |  mae (loss): 0.0578, mse: 0.0066 |
64, 32 | tanh | 1000 | 16 |  mae (loss): 0.0543, mse: 0.0056 |
128, 64 | tanh | 1000 | 16 |  mae (loss): 0.0509, mse: 0.0051 |
64, None | relu | 100 | 32 |  mae (loss): 0.0637, mse: 0.0079 |
128, None | relu | 100 | 32 |  mae (loss): 0.0586, mse: 0.0067 |
256, None | relu | 100 | 32 |  mae (loss): 0.0542, mse: 0.0060 |
64, 32 | relu | 100 | 32 |  mae (loss): 0.0623, mse: 0.0076 |
128, 64 | relu | 100 | 32 |  mae (loss): 0.0546, mse: 0.0063 |
64, None | selu | 100 | 32 |  mae (loss): 0.0627, mse: 0.0078 |
128, None | selu | 100 | 32 |  mae (loss): 0.0573, mse: 0.0067 |
256, None | selu | 100 | 32 |  mae (loss): 0.0561, mse: 0.0063 |
64, 32 | selu | 100 | 32 |  mae (loss): 0.0622, mse: 0.0075 |
128, 64 | selu | 100 | 32 |  mae (loss): 0.0548, mse: 0.0060 |
64, None | tanh | 100 | 32 |  mae (loss): 0.0603, mse: 0.0073 |
128, None | tanh | 100 | 32 |  mae (loss): 0.0587, mse: 0.0067 |
256, None | tanh | 100 | 32 |  mae (loss): 0.0607, mse: 0.0072 |
64, 32 | tanh | 100 | 32 |  mae (loss): 0.0688, mse: 0.0093 |
128, 64 | tanh | 100 | 32 |  mae (loss): 0.0590, mse: 0.0068 |
64, None | relu | 500 | 32 |  mae (loss): 0.0561, mse: 0.0064 |
128, None | relu | 500 | 32 |  mae (loss): 0.0533, mse: 0.0057 |
256, None | relu | 500 | 32 |  mae (loss): 0.0462, mse: 0.0043 |
64, 32 | relu | 500 | 32 |  mae (loss): 0.0522, mse: 0.0056 |
128, 64 | relu | 500 | 32 |  mae (loss): 0.0492, mse: 0.0049 |
64, None | selu | 500 | 32 |  mae (loss): 0.0557, mse: 0.0063 |
128, None | selu | 500 | 32 |  mae (loss): 0.0535, mse: 0.0058 |
256, None | selu | 500 | 32 |  mae (loss): 0.0543, mse: 0.0057 |
64, 32 | selu | 500 | 32 |  mae (loss): 0.0506, mse: 0.0053 |
128, 64 | selu | 500 | 32 |  mae (loss): 0.0502, mse: 0.0051 |
64, None | tanh | 500 | 32 |  mae (loss): 0.0541, mse: 0.0059 |
128, None | tanh | 500 | 32 |  mae (loss): 0.0565, mse: 0.0065 |
256, None | tanh | 500 | 32 |  mae (loss): 0.0585, mse: 0.0067 |
64, 32 | tanh | 500 | 32 |  mae (loss): 0.0517, mse: 0.0053 |
128, 64 | tanh | 500 | 32 |  mae (loss): 0.0474, mse: 0.0044 |
64, None | relu | 1000 | 32 |  mae (loss): 0.0544, mse: 0.0058 |
128, None | relu | 1000 | 32 |  mae (loss): 0.0519, mse: 0.0053 |
256, None | relu | 1000 | 32 |  mae (loss): 0.0467, mse: 0.0043 |
64, 32 | relu | 1000 | 32 |  mae (loss): 0.0480, mse: 0.0048 |
128, 64 | relu | 1000 | 32 |  mae (loss): 0.0474, mse: 0.0046 |
64, None | selu | 1000 | 32 |  mae (loss): 0.0564, mse: 0.0064 |
128, None | selu | 1000 | 32 |  mae (loss): 0.0533, mse: 0.0058 |
256, None | selu | 1000 | 32 |  mae (loss): 0.0511, mse: 0.0054 |
64, 32 | selu | 1000 | 32 |  mae (loss): 0.0496, mse: 0.0050 |
128, 64 | selu | 1000 | 32 |  mae (loss): 0.0487, mse: 0.0049 |
64, None | tanh | 1000 | 32 |  mae (loss): 0.0542, mse: 0.0060 |
128, None | tanh | 1000 | 32 |  mae (loss): 0.0567, mse: 0.0064 |
256, None | tanh | 1000 | 32 |  mae (loss): 0.0596, mse: 0.0070 |
64, 32 | tanh | 1000 | 32 |  mae (loss): 0.0528, mse: 0.0056 |
128, 64 | tanh | 1000 | 32 |  mae (loss): 0.0502, mse: 0.0050 |
64, None | relu | 100 | 64 |  mae (loss): 0.0702, mse: 0.0092 |
128, None | relu | 100 | 64 |  mae (loss): 0.0626, mse: 0.0076 |
256, None | relu | 100 | 64 |  mae (loss): 0.0573, mse: 0.0063 |
64, 32 | relu | 100 | 64 |  mae (loss): 0.0707, mse: 0.0096 |
128, 64 | relu | 100 | 64 |  mae (loss): 0.0652, mse: 0.0082 |
64, None | selu | 100 | 64 |  mae (loss): 0.0677, mse: 0.0092 |
128, None | selu | 100 | 64 |  mae (loss): 0.0605, mse: 0.0071 |
256, None | selu | 100 | 64 |  mae (loss): 0.0588, mse: 0.0069 |
64, 32 | selu | 100 | 64 |  mae (loss): 0.0704, mse: 0.0095 |
128, 64 | selu | 100 | 64 |  mae (loss): 0.0635, mse: 0.0079 |
64, None | tanh | 100 | 64 |  mae (loss): 0.0678, mse: 0.0090 |
128, None | tanh | 100 | 64 |  mae (loss): 0.0612, mse: 0.0073 |
256, None | tanh | 100 | 64 |  mae (loss): 0.0613, mse: 0.0073 |
64, 32 | tanh | 100 | 64 |  mae (loss): 0.0733, mse: 0.0106 |
128, 64 | tanh | 100 | 64 |  mae (loss): 0.0659, mse: 0.0086 |
64, None | relu | 500 | 64 |  mae (loss): 0.0586, mse: 0.0069 |
128, None | relu | 500 | 64 |  mae (loss): 0.0512, mse: 0.0053 |
256, None | relu | 500 | 64 |  mae (loss): 0.0495, mse: 0.0049 |
64, 32 | relu | 500 | 64 |  mae (loss): 0.0561, mse: 0.0065 |
128, 64 | relu | 500 | 64 |  mae (loss): 0.0493, mse: 0.0050 |
64, None | selu | 500 | 64 |  mae (loss): 0.0567, mse: 0.0065 |
128, None | selu | 500 | 64 |  mae (loss): 0.0584, mse: 0.0068 |
256, None | selu | 500 | 64 |  mae (loss): 0.0593, mse: 0.0066 |
64, 32 | selu | 500 | 64 |  mae (loss): 0.0523, mse: 0.0057 |
128, 64 | selu | 500 | 64 |  mae (loss): 0.0469, mse: 0.0045 |
64, None | tanh | 500 | 64 |  mae (loss): 0.0546, mse: 0.0061 |
128, None | tanh | 500 | 64 |  mae (loss): 0.0569, mse: 0.0065 |
256, None | tanh | 500 | 64 |  mae (loss): 0.0639, mse: 0.0080 |
64, 32 | tanh | 500 | 64 |  mae (loss): 0.0593, mse: 0.0070 |
128, 64 | tanh | 500 | 64 |  mae (loss): 0.0500, mse: 0.0052 |
64, None | relu | 1000 | 64 |  mae (loss): 0.0549, mse: 0.0062 |
128, None | relu | 1000 | 64 |  mae (loss): 0.0523, mse: 0.0055 |
256, None | relu | 1000 | 64 |  mae (loss): 0.0490, mse: 0.0048 |
64, 32 | relu | 1000 | 64 |  mae (loss): 0.4887, mse: 0.4384 |
128, 64 | relu | 1000 | 64 |  mae (loss): 0.0475, mse: 0.0046 |
64, None | selu | 1000 | 64 |  mae (loss): 0.0542, mse: 0.0059 |
128, None | selu | 1000 | 64 |  mae (loss): 0.0584, mse: 0.0068 |
256, None | selu | 1000 | 64 |  mae (loss): 0.0550, mse: 0.0060 |
64, 32 | selu | 1000 | 64 |  mae (loss): 0.0497, mse: 0.0051 |
128, 64 | selu | 1000 | 64 |  mae (loss): 0.0479, mse: 0.0046 |
64, None | tanh | 1000 | 64 |  mae (loss): 0.0543, mse: 0.0061 |
128, None | tanh | 1000 | 64 |  mae (loss): 0.0562, mse: 0.0064 |
256, None | tanh | 1000 | 64 |  mae (loss): 0.0603, mse: 0.0071 |
64, 32 | tanh | 1000 | 64 |  mae (loss): 0.0502, mse: 0.0051 |
128, 64 | tanh | 1000 | 64 |  mae (loss): 0.0446, mse: 0.0040 |
64, None | relu | 100 | 128 |  mae (loss): 0.0811, mse: 0.0126 |
128, None | relu | 100 | 128 |  mae (loss): 0.0716, mse: 0.0100 |
256, None | relu | 100 | 128 |  mae (loss): 0.0650, mse: 0.0090 |
64, 32 | relu | 100 | 128 |  mae (loss): 0.6550, mse: 0.7704 |
128, 64 | relu | 100 | 128 |  mae (loss): 0.0945, mse: 0.0167 |
64, None | selu | 100 | 128 |  mae (loss): 0.0849, mse: 0.0145 |
128, None | selu | 100 | 128 |  mae (loss): 0.0693, mse: 0.0093 |
256, None | selu | 100 | 128 |  mae (loss): 0.0616, mse: 0.0074 |
64, 32 | selu | 100 | 128 |  mae (loss): 0.0870, mse: 0.0156 |
128, 64 | selu | 100 | 128 |  mae (loss): 0.0777, mse: 0.0118 |
64, None | tanh | 100 | 128 |  mae (loss): 0.0811, mse: 0.0126 |
128, None | tanh | 100 | 128 |  mae (loss): 0.5399, mse: 0.4633 |
256, None | tanh | 100 | 128 |  mae (loss): 0.0645, mse: 0.0081 |
64, 32 | tanh | 100 | 128 |  mae (loss): 0.0874, mse: 0.0155 |
128, 64 | tanh | 100 | 128 |  mae (loss): 0.0735, mse: 0.0107 |
64, None | relu | 500 | 128 |  mae (loss): 0.0602, mse: 0.0072 |
128, None | relu | 500 | 128 |  mae (loss): 0.0527, mse: 0.0057 |
256, None | relu | 500 | 128 |  mae (loss): 0.0485, mse: 0.0049 |
64, 32 | relu | 500 | 128 |  mae (loss): 0.6235, mse: 0.6792 |
128, 64 | relu | 500 | 128 |  mae (loss): 0.0513, mse: 0.0053 |
64, None | selu | 500 | 128 |  mae (loss): 0.5431, mse: 0.4909 |
128, None | selu | 500 | 128 |  mae (loss): 0.0581, mse: 0.0068 |
256, None | selu | 500 | 128 |  mae (loss): 0.0566, mse: 0.0063 |
64, 32 | selu | 500 | 128 |  mae (loss): 0.0582, mse: 0.0069 |
128, 64 | selu | 500 | 128 |  mae (loss): 0.0520, mse: 0.0054 |
64, None | tanh | 500 | 128 |  mae (loss): 0.5504, mse: 0.4784 |
128, None | tanh | 500 | 128 |  mae (loss): 0.0581, mse: 0.0068 |
256, None | tanh | 500 | 128 |  mae (loss): 0.0611, mse: 0.0074 |
64, 32 | tanh | 500 | 128 |  mae (loss): 0.0640, mse: 0.0082 |
128, 64 | tanh | 500 | 128 |  mae (loss): 0.5182, mse: 0.4259 |
64, None | relu | 1000 | 128 |  mae (loss): 0.5490, mse: 0.4921 |
128, None | relu | 1000 | 128 |  mae (loss): 0.0576, mse: 0.0065 |
256, None | relu | 1000 | 128 |  mae (loss): 0.0494, mse: 0.0052 |
64, 32 | relu | 1000 | 128 |  mae (loss): 0.6454, mse: 0.7040 |
128, 64 | relu | 1000 | 128 |  mae (loss): 0.5908, mse: 0.6249 |
64, None | selu | 1000 | 128 |  mae (loss): 0.0552, mse: 0.0062 |
128, None | selu | 1000 | 128 |  mae (loss): 0.5058, mse: 0.4101 |
256, None | selu | 1000 | 128 |  mae (loss): 0.0571, mse: 0.0065 |
64, 32 | selu | 1000 | 128 |  mae (loss): 0.0538, mse: 0.0058 |
128, 64 | selu | 1000 | 128 |  mae (loss): 0.0468, mse: 0.0048 |
64, None | tanh | 1000 | 128 |  mae (loss): 0.5763, mse: 0.5252 |
128, None | tanh | 1000 | 128 |  mae (loss): 0.0583, mse: 0.0069 |
256, None | tanh | 1000 | 128 |  mae (loss): 0.0595, mse: 0.0071 |
64, 32 | tanh | 1000 | 128 |  mae (loss): 0.0578, mse: 0.0067 |
128, 64 | tanh | 1000 | 128 |  mae (loss): 0.0462, mse: 0.0044 |


Best result: neurons=(128, 64), activation=tanh, batch_size=64, epochs=1000, MAE=0.0446, MSE=0.0040
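
The best row can be picked mechanically instead of by eye. The following is a minimal sketch, assuming the table rows are available as plain text in the format shown above. The `parse_row` helper, the `ROW_RE` pattern and the `rows` sample (four lines copied from the table) are illustrative and not part of the original notebook.

```python
import re

# A few rows copied verbatim from the results table above (illustrative subset).
rows = """\
128, 64 | selu | 500 | 64 |  mae (loss): 0.0469, mse: 0.0045 |
64, 32 | relu | 1000 | 64 |  mae (loss): 0.4887, mse: 0.4384 |
128, 64 | tanh | 1000 | 64 |  mae (loss): 0.0446, mse: 0.0040 |
128, 64 | tanh | 1000 | 128 |  mae (loss): 0.0462, mse: 0.0044 |
""".splitlines()

# Hypothetical pattern matching one table row: neurons | activation | epochs | batch | metrics
ROW_RE = re.compile(
    r"^(?P<neurons>[^|]+)\|\s*(?P<activation>\w+)\s*\|\s*(?P<epochs>\d+)\s*\|"
    r"\s*(?P<batch_size>\d+)\s*\|\s*mae \(loss\): (?P<mae>[\d.]+), mse: (?P<mse>[\d.]+)"
)


def parse_row(line):
    """Turn one table row into a dict with typed values."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"Unrecognized row: {line!r}")
    return {
        "neurons": m["neurons"].strip(),
        "activation": m["activation"],
        "epochs": int(m["epochs"]),
        "batch_size": int(m["batch_size"]),
        "mae": float(m["mae"]),
        "mse": float(m["mse"]),
    }


results = [parse_row(line) for line in rows]

# The configuration with the lowest test MAE is reported as the best result.
best = min(results, key=lambda r: r["mae"])
print(best)
```

Sorting `results` by `mae` in descending order instead would bring runs with an MAE around 0.5, such as the second sample row, to the top for inspection.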

### Nesterov Adam optimizer results

- Automated Tests:  
Nadam(lr=1e-4, beta_1=0.9, beta_2=0.999)  

| Neurons (per hidden layer) | Activation function | Epochs | Batch size | Results on test set |
|---------|---------------------|--------|------------|--------------------|
64, None | relu | 100 | 16 |  mae (loss): 0.0637, mse: 0.0075 |
128, None | relu | 100 | 16 |  mae (loss): 0.0515, mse: 0.0050 |
256, None | relu | 100 | 16 |  mae (loss): 0.0600, mse: 0.0062 |
64, 32 | relu | 100 | 16 |  mae (loss): 0.0460, mse: 0.0040 |
128, 64 | relu | 100 | 16 |  mae (loss): 0.0432, mse: 0.0035 |
64, None | selu | 100 | 16 |  mae (loss): 0.0585, mse: 0.0063 |
128, None | selu | 100 | 16 |  mae (loss): 0.0686, mse: 0.0086 |
256, None | selu | 100 | 16 |  mae (loss): 0.0608, mse: 0.0066 |
64, 32 | selu | 100 | 16 |  mae (loss): 0.0483, mse: 0.0044 |
128, 64 | selu | 100 | 16 |  mae (loss): 0.0518, mse: 0.0049 |
64, None | tanh | 100 | 16 |  mae (loss): 0.0602, mse: 0.0067 |
128, None | tanh | 100 | 16 |  mae (loss): 0.0659, mse: 0.0073 |
256, None | tanh | 100 | 16 |  mae (loss): 0.0703, mse: 0.0081 |
64, 32 | tanh | 100 | 16 |  mae (loss): 0.0532, mse: 0.0053 |
128, 64 | tanh | 100 | 16 |  mae (loss): 0.0614, mse: 0.0067 |
64, None | relu | 500 | 16 |  mae (loss): 0.0547, mse: 0.0057 |
128, None | relu | 500 | 16 |  mae (loss): 0.0537, mse: 0.0052 |
256, None | relu | 500 | 16 |  mae (loss): 0.0469, mse: 0.0039 |
64, 32 | relu | 500 | 16 |  mae (loss): 0.0510, mse: 0.0049 |
128, 64 | relu | 500 | 16 |  mae (loss): 0.0425, mse: 0.0033 |
64, None | selu | 500 | 16 |  mae (loss): 0.0570, mse: 0.0061 |
128, None | selu | 500 | 16 |  mae (loss): 0.0603, mse: 0.0066 |
256, None | selu | 500 | 16 |  mae (loss): 0.0842, mse: 0.0114 |
64, 32 | selu | 500 | 16 |  mae (loss): 0.0503, mse: 0.0049 |
128, 64 | selu | 500 | 16 |  mae (loss): 0.0489, mse: 0.0046 |
64, None | tanh | 500 | 16 |  mae (loss): 0.0607, mse: 0.0070 |
128, None | tanh | 500 | 16 |  mae (loss): 0.0807, mse: 0.0098 |
256, None | tanh | 500 | 16 |  mae (loss): 0.0705, mse: 0.0088 |
64, 32 | tanh | 500 | 16 |  mae (loss): 0.0546, mse: 0.0056 |
128, 64 | tanh | 500 | 16 |  mae (loss): 0.0564, mse: 0.0057 |
64, None | relu | 1000 | 16 |  mae (loss): 0.0549, mse: 0.0057 |
128, None | relu | 1000 | 16 |  mae (loss): 0.0532, mse: 0.0053 |
256, None | relu | 1000 | 16 |  mae (loss): 0.0484, mse: 0.0043 |
64, 32 | relu | 1000 | 16 |  mae (loss): 0.0474, mse: 0.0043 |
128, 64 | relu | 1000 | 16 |  mae (loss): 0.0416, mse: 0.0034 |
64, None | selu | 1000 | 16 |  mae (loss): 0.0557, mse: 0.0059 |
128, None | selu | 1000 | 16 |  mae (loss): 0.0608, mse: 0.0066 |
256, None | selu | 1000 | 16 |  mae (loss): 0.0781, mse: 0.0104 |
64, 32 | selu | 1000 | 16 |  mae (loss): 0.0516, mse: 0.0050 |
128, 64 | selu | 1000 | 16 |  mae (loss): 0.0508, mse: 0.0048 |
64, None | tanh | 1000 | 16 |  mae (loss): 0.0586, mse: 0.0063 |
128, None | tanh | 1000 | 16 |  mae (loss): 0.0614, mse: 0.0068 |
256, None | tanh | 1000 | 16 |  mae (loss): 0.0786, mse: 0.0110 |
64, 32 | tanh | 1000 | 16 |  mae (loss): 0.0570, mse: 0.0059 |
128, 64 | tanh | 1000 | 16 |  mae (loss): 0.0505, mse: 0.0046 |
64, None | relu | 100 | 32 |  mae (loss): 0.0637, mse: 0.0077 |
128, None | relu | 100 | 32 |  mae (loss): 0.0577, mse: 0.0063 |
256, None | relu | 100 | 32 |  mae (loss): 0.0544, mse: 0.0054 |
64, 32 | relu | 100 | 32 |  mae (loss): 0.0469, mse: 0.0042 |
128, 64 | relu | 100 | 32 |  mae (loss): 0.0416, mse: 0.0033 |
64, None | selu | 100 | 32 |  mae (loss): 0.0629, mse: 0.0074 |
128, None | selu | 100 | 32 |  mae (loss): 0.0572, mse: 0.0060 |
256, None | selu | 100 | 32 |  mae (loss): 0.0693, mse: 0.0082 |
64, 32 | selu | 100 | 32 |  mae (loss): 0.0504, mse: 0.0048 |
128, 64 | selu | 100 | 32 |  mae (loss): 0.0498, mse: 0.0043 |
64, None | tanh | 100 | 32 |  mae (loss): 0.0612, mse: 0.0067 |
128, None | tanh | 100 | 32 |  mae (loss): 0.0664, mse: 0.0079 |
256, None | tanh | 100 | 32 |  mae (loss): 0.0632, mse: 0.0070 |
64, 32 | tanh | 100 | 32 |  mae (loss): 0.0582, mse: 0.0060 |
128, 64 | tanh | 100 | 32 |  mae (loss): 0.0483, mse: 0.0043 |
64, None | relu | 500 | 32 |  mae (loss): 0.0524, mse: 0.0052 |
128, None | relu | 500 | 32 |  mae (loss): 0.0466, mse: 0.0041 |
256, None | relu | 500 | 32 |  mae (loss): 0.0446, mse: 0.0035 |
64, 32 | relu | 500 | 32 |  mae (loss): 0.0430, mse: 0.0037 |
128, 64 | relu | 500 | 32 |  mae (loss): 0.0409, mse: 0.0032 |
64, None | selu | 500 | 32 |  mae (loss): 0.0575, mse: 0.0064 |
128, None | selu | 500 | 32 |  mae (loss): 0.0599, mse: 0.0063 |
256, None | selu | 500 | 32 |  mae (loss): 0.0626, mse: 0.0067 |
64, 32 | selu | 500 | 32 |  mae (loss): 0.0492, mse: 0.0046 |
128, 64 | selu | 500 | 32 |  mae (loss): 0.0540, mse: 0.0052 |
64, None | tanh | 500 | 32 |  mae (loss): 0.0580, mse: 0.0062 |
128, None | tanh | 500 | 32 |  mae (loss): 0.0700, mse: 0.0083 |
256, None | tanh | 500 | 32 |  mae (loss): 0.0798, mse: 0.0102 |
64, 32 | tanh | 500 | 32 |  mae (loss): 0.0518, mse: 0.0051 |
128, 64 | tanh | 500 | 32 |  mae (loss): 0.0522, mse: 0.0050 |
64, None | relu | 1000 | 32 |  mae (loss): 0.0590, mse: 0.0068 |
128, None | relu | 1000 | 32 |  mae (loss): 0.0594, mse: 0.0062 |
256, None | relu | 1000 | 32 |  mae (loss): 0.0483, mse: 0.0042 |
64, 32 | relu | 1000 | 32 |  mae (loss): 0.0435, mse: 0.0036 |
128, 64 | relu | 1000 | 32 |  mae (loss): 0.0384, mse: 0.0028 |
64, None | selu | 1000 | 32 |  mae (loss): 0.0560, mse: 0.0061 |
128, None | selu | 1000 | 32 |  mae (loss): 0.0604, mse: 0.0067 |
256, None | selu | 1000 | 32 |  mae (loss): 0.0666, mse: 0.0076 |
64, 32 | selu | 1000 | 32 |  mae (loss): 0.0494, mse: 0.0047 |
128, 64 | selu | 1000 | 32 |  mae (loss): 0.0465, mse: 0.0041 |
64, None | tanh | 1000 | 32 |  mae (loss): 0.0581, mse: 0.0064 |
128, None | tanh | 1000 | 32 |  mae (loss): 0.0574, mse: 0.0061 |
256, None | tanh | 1000 | 32 |  mae (loss): 0.0782, mse: 0.0102 |
64, 32 | tanh | 1000 | 32 |  mae (loss): 0.0495, mse: 0.0046 |
128, 64 | tanh | 1000 | 32 |  mae (loss): 0.0542, mse: 0.0052 |
64, None | relu | 100 | 64 |  mae (loss): 0.0616, mse: 0.0073 |
128, None | relu | 100 | 64 |  mae (loss): 0.0489, mse: 0.0045 |
256, None | relu | 100 | 64 |  mae (loss): 0.0469, mse: 0.0040 |
64, 32 | relu | 100 | 64 |  mae (loss): 0.0499, mse: 0.0046 |
128, 64 | relu | 100 | 64 |  mae (loss): 0.0423, mse: 0.0034 |
64, None | selu | 100 | 64 |  mae (loss): 0.0587, mse: 0.0065 |
128, None | selu | 100 | 64 |  mae (loss): 0.0634, mse: 0.0074 |
256, None | selu | 100 | 64 |  mae (loss): 0.0668, mse: 0.0077 |
64, 32 | selu | 100 | 64 |  mae (loss): 0.0555, mse: 0.0059 |
128, 64 | selu | 100 | 64 |  mae (loss): 0.0500, mse: 0.0047 |
64, None | tanh | 100 | 64 |  mae (loss): 0.0545, mse: 0.0056 |
128, None | tanh | 100 | 64 |  mae (loss): 0.0644, mse: 0.0077 |
256, None | tanh | 100 | 64 |  mae (loss): 0.0686, mse: 0.0083 |
64, 32 | tanh | 100 | 64 |  mae (loss): 0.0502, mse: 0.0049 |
128, 64 | tanh | 100 | 64 |  mae (loss): 0.0491, mse: 0.0044 |
64, None | relu | 500 | 64 |  mae (loss): 0.0614, mse: 0.0073 |
128, None | relu | 500 | 64 |  mae (loss): 0.0527, mse: 0.0054 |
256, None | relu | 500 | 64 |  mae (loss): 0.0485, mse: 0.0043 |
64, 32 | relu | 500 | 64 |  mae (loss): 0.0458, mse: 0.0040 |
128, 64 | relu | 500 | 64 |  mae (loss): 0.0399, mse: 0.0029 |
64, None | selu | 500 | 64 |  mae (loss): 0.0565, mse: 0.0062 |
128, None | selu | 500 | 64 |  mae (loss): 0.0579, mse: 0.0063 |
256, None | selu | 500 | 64 |  mae (loss): 0.0647, mse: 0.0074 |
64, 32 | selu | 500 | 64 |  mae (loss): 0.0476, mse: 0.0044 |
128, 64 | selu | 500 | 64 |  mae (loss): 0.0510, mse: 0.0046 |
64, None | tanh | 500 | 64 |  mae (loss): 0.0588, mse: 0.0066 |
128, None | tanh | 500 | 64 |  mae (loss): 0.0729, mse: 0.0090 |
256, None | tanh | 500 | 64 |  mae (loss): 0.0717, mse: 0.0089 |
64, 32 | tanh | 500 | 64 |  mae (loss): 0.0522, mse: 0.0051 |
128, 64 | tanh | 500 | 64 |  mae (loss): 0.0547, mse: 0.0054 |
64, None | relu | 1000 | 64 |  mae (loss): 0.0555, mse: 0.0060 |
128, None | relu | 1000 | 64 |  mae (loss): 0.0574, mse: 0.0061 |
256, None | relu | 1000 | 64 |  mae (loss): 0.0555, mse: 0.0054 |
64, 32 | relu | 1000 | 64 |  mae (loss): 0.0486, mse: 0.0045 |
128, 64 | relu | 1000 | 64 |  mae (loss): 0.0440, mse: 0.0037 |
64, None | selu | 1000 | 64 |  mae (loss): 0.0555, mse: 0.0059 |
128, None | selu | 1000 | 64 |  mae (loss): 0.0607, mse: 0.0070 |
256, None | selu | 1000 | 64 |  mae (loss): 0.0548, mse: 0.0057 |
64, 32 | selu | 1000 | 64 |  mae (loss): 0.0496, mse: 0.0049 |
128, 64 | selu | 1000 | 64 |  mae (loss): 0.0469, mse: 0.0042 |
64, None | tanh | 1000 | 64 |  mae (loss): 0.0559, mse: 0.0060 |
128, None | tanh | 1000 | 64 |  mae (loss): 0.0610, mse: 0.0069 |
256, None | tanh | 1000 | 64 |  mae (loss): 0.0679, mse: 0.0082 |
64, 32 | tanh | 1000 | 64 |  mae (loss): 0.0539, mse: 0.0055 |
128, 64 | tanh | 1000 | 64 |  mae (loss): 0.0520, mse: 0.0047 |
64, None | relu | 100 | 128 |  mae (loss): 0.0542, mse: 0.0058 |
128, None | relu | 100 | 128 |  mae (loss): 0.0573, mse: 0.0061 |
256, None | relu | 100 | 128 |  mae (loss): 0.0560, mse: 0.0057 |
64, 32 | relu | 100 | 128 |  mae (loss): 0.0473, mse: 0.0043 |
128, 64 | relu | 100 | 128 |  mae (loss): 0.0395, mse: 0.0030 |
64, None | selu | 100 | 128 |  mae (loss): 0.0590, mse: 0.0068 |
128, None | selu | 100 | 128 |  mae (loss): 0.0740, mse: 0.0095 |
256, None | selu | 100 | 128 |  mae (loss): 0.0632, mse: 0.0071 |
64, 32 | selu | 100 | 128 |  mae (loss): 0.0488, mse: 0.0048 |
128, 64 | selu | 100 | 128 |  mae (loss): 0.0489, mse: 0.0044 |
64, None | tanh | 100 | 128 |  mae (loss): 0.0620, mse: 0.0072 |
128, None | tanh | 100 | 128 |  mae (loss): 0.0625, mse: 0.0073 |
256, None | tanh | 100 | 128 |  mae (loss): 0.0770, mse: 0.0112 |
64, 32 | tanh | 100 | 128 |  mae (loss): 0.0557, mse: 0.0057 |
128, 64 | tanh | 100 | 128 |  mae (loss): 0.0498, mse: 0.0046 |
64, None | relu | 500 | 128 |  mae (loss): 0.0579, mse: 0.0064 |
128, None | relu | 500 | 128 |  mae (loss): 0.0583, mse: 0.0065 |
256, None | relu | 500 | 128 |  mae (loss): 0.0539, mse: 0.0049 |
64, 32 | relu | 500 | 128 |  mae (loss): 0.0458, mse: 0.0040 |
128, 64 | relu | 500 | 128 |  mae (loss): 0.0444, mse: 0.0038 |
64, None | selu | 500 | 128 |  mae (loss): 0.0598, mse: 0.0068 |
128, None | selu | 500 | 128 |  mae (loss): 0.0598, mse: 0.0067 |
256, None | selu | 500 | 128 |  mae (loss): 0.0619, mse: 0.0067 |
64, 32 | selu | 500 | 128 |  mae (loss): 0.0537, mse: 0.0057 |
128, 64 | selu | 500 | 128 |  mae (loss): 0.0495, mse: 0.0046 |
64, None | tanh | 500 | 128 |  mae (loss): 0.0594, mse: 0.0067 |
128, None | tanh | 500 | 128 |  mae (loss): 0.0579, mse: 0.0064 |
256, None | tanh | 500 | 128 |  mae (loss): 0.0759, mse: 0.0095 |
64, 32 | tanh | 500 | 128 |  mae (loss): 0.0546, mse: 0.0056 |
128, 64 | tanh | 500 | 128 |  mae (loss): 0.0482, mse: 0.0044 |
64, None | relu | 1000 | 128 |  mae (loss): 0.0545, mse: 0.0060 |
128, None | relu | 1000 | 128 |  mae (loss): 0.0562, mse: 0.0060 |
256, None | relu | 1000 | 128 |  mae (loss): 0.0492, mse: 0.0043 |
64, 32 | relu | 1000 | 128 |  mae (loss): 0.0432, mse: 0.0036 |
128, 64 | relu | 1000 | 128 |  mae (loss): 0.0396, mse: 0.0030 |
64, None | selu | 1000 | 128 |  mae (loss): 0.0626, mse: 0.0076 |
128, None | selu | 1000 | 128 |  mae (loss): 0.0557, mse: 0.0060 |
256, None | selu | 1000 | 128 |  mae (loss): 0.0705, mse: 0.0087 |
64, 32 | selu | 1000 | 128 |  mae (loss): 0.0500, mse: 0.0049 |
128, 64 | selu | 1000 | 128 |  mae (loss): 0.0484, mse: 0.0045 |
64, None | tanh | 1000 | 128 |  mae (loss): 0.0637, mse: 0.0076 |
128, None | tanh | 1000 | 128 |  mae (loss): 0.0646, mse: 0.0074 |
256, None | tanh | 1000 | 128 |  mae (loss): 0.0693, mse: 0.0085 |
64, 32 | tanh | 1000 | 128 |  mae (loss): 0.0495, mse: 0.0047 |
128, 64 | tanh | 1000 | 128 |  mae (loss): 0.0524, mse: 0.0050 |


Best result: neurons=(128, 64), activation=relu, batch_size=32, epochs=1000, MAE=0.0384, MSE=0.0028
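
The 180 rows of the table above correspond to a full grid over the tested options. Below is a minimal sketch of how such a grid could be enumerated in the same row order (batch size outermost, architecture innermost). The variable names are illustrative, and the training call is omitted because it is not shown in this section.

```python
from itertools import product

# Options tested in the experiments (values taken from the table above).
batch_sizes = [16, 32, 64, 128]
epochs_options = [100, 500, 1000]
activations = ["relu", "selu", "tanh"]
# (first hidden layer, second hidden layer); None means no second hidden layer.
architectures = [(64, None), (128, None), (256, None), (64, 32), (128, 64)]

# product() varies the rightmost iterable fastest, which reproduces the table
# order: architecture changes on every row, batch size changes last.
grid = [
    {"neurons": n, "activation": a, "epochs": e, "batch_size": b}
    for b, e, a, n in product(batch_sizes, epochs_options, activations, architectures)
]

print(len(grid))   # 180 configurations: 4 * 3 * 3 * 5
print(grid[0])     # first table row: (64, None), relu, 100 epochs, batch size 16
print(grid[-1])    # last table row: (128, 64), tanh, 1000 epochs, batch size 128
```

Each entry in `grid` could then be passed to a model-building and training routine, with the resulting test MAE and MSE recorded alongside it.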