# Training the Fully-Connected Neural Network Model

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import h5py
import re
import pandas as pd

In [2]:
%load_ext tensorboard
%load_ext autoreload

Load code for this project

In [3]:
import fcnn.train
import fcnn.eval
import data_processing.data as dp
%autoreload 1
%aimport fcnn.train
%aimport fcnn.eval

### Load the training data

In [4]:
path_data = './data_processing/voxels/'
train, _ = dp.load_discretized_data(path_data, prefix='Grid20', categorical=False, binary=True, normalize=True)

Loading discretized data from: ./data_processing/voxels/Grid20voxels.h5


In [5]:
train = (train[0].toarray(), train[1])
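
The `.toarray()` call suggests that `train[0]` is a SciPy sparse matrix that is expanded into a dense NumPy array before training. A minimal sketch of that conversion on a toy matrix follows; the shape and values are made up for illustration and are not taken from the data set.

```python
import numpy as np
from scipy import sparse

# Toy stand-in for the voxel features: 3 samples with 8 binary voxels each.
# (Illustrative values only; the real features come from load_discretized_data.)
dense_toy = np.array(
    [[0, 1, 0, 0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0, 0, 1]],
    dtype=np.float32,
)
features_sparse = sparse.csr_matrix(dense_toy)

# toarray() materializes every entry, zeros included, as a NumPy ndarray.
features_dense = features_sparse.toarray()

print(type(features_dense).__name__, features_dense.shape)
print("stored non-zeros:", features_sparse.nnz, "of", features_dense.size)
```

The dense copy needs memory proportional to samples times features, so this step is only practical while the full array fits in RAM.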

Run the following cell to clear the training logs that `TensorBoard` visualizes.

In [6]:
!rm -rf fcnn/logs/*

## Building the model 

The model used in this work consists of one fully-connected hidden layer with ReLU activation, followed by a single-node output layer with sigmoid activation.
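
To make the architecture concrete, the forward pass of such a network can be written in a few lines of NumPy. This is only an illustrative sketch: the layer sizes, the random weights, and the helper names (`relu`, `sigmoid`, `forward`) are assumptions for the example and do not come from the project code.

```python
import numpy as np

def relu(x):
    # ReLU activation: negative values are clipped to zero.
    return np.maximum(x, 0.0)

def sigmoid(x):
    # Sigmoid activation: maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden ReLU layer followed by a single sigmoid output node."""
    hidden = relu(x @ w_hidden + b_hidden)
    return sigmoid(hidden @ w_out + b_out)

rng = np.random.default_rng(0)
n_features, n_hidden = 8, 4          # toy sizes, not the ones used in this notebook
x = rng.random((5, n_features))      # 5 hypothetical samples
w_hidden = rng.normal(size=(n_features, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(size=(n_hidden, 1))
b_out = np.zeros(1)

probs = forward(x, w_hidden, b_hidden, w_out, b_out)
print(probs.shape)  # one output value per sample
```

Each output can be read as the predicted probability of the positive class, which is why a single sigmoid node suffices for a binary label.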

The implementation of the FCNN model is flexible: the number of hidden layers, the number of neurons in each layer, whether dropout is applied, and the dropout probability are all configurable. For training, only one hidden layer was used. An excerpt from the implementation is shown below:

```python
import tensorflow as tf

model = tf.keras.Sequential()
# First hidden layer; input_dim fixes the number of input features.
model.add(tf.keras.layers.Dense(hidden_layers[0], input_dim=train[FEATURES].shape[1], activation='relu'))
if use_dropout:
    model.add(tf.keras.layers.Dropout(dropout))
# Optional further hidden layers; their input size is inferred from the previous layer.
for neurons in hidden_layers[1:]:
    model.add(tf.keras.layers.Dense(neurons, activation='relu'))
    if use_dropout:
        model.add(tf.keras.layers.Dropout(dropout))
# Output layer: one sigmoid node per category (a single node in this notebook).
model.add(tf.keras.layers.Dense(num_categories, activation='sigmoid'))
```

**See also**: [fcnn/train.py](./fcnn/train.py)
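
As a sanity check on the architecture, the number of trainable parameters can be computed by hand: a dense layer with `n_in` inputs and `n_out` neurons holds `n_in * n_out` weights plus `n_out` biases. The sketch below assumes 8000 input features (20 x 20 x 20 voxels, inferred from the `Grid20` prefix and consistent with the model summaries printed by the training cells); `dense_param_count` is a hypothetical helper, not part of the project code.

```python
def dense_param_count(n_inputs, hidden_layers, n_outputs=1):
    """Count weights and biases of a stack of fully-connected layers."""
    total = 0
    n_in = n_inputs
    for n_out in list(hidden_layers) + [n_outputs]:
        total += n_in * n_out + n_out  # weight matrix plus bias vector
        n_in = n_out
    return total

n_voxels = 20 ** 3  # assumed input size for the Grid20 discretization

print(dense_param_count(n_voxels, [32]))   # 256065
print(dense_param_count(n_voxels, [128]))  # 1024257
```

Dropout layers contribute no parameters, so enabling dropout leaves these totals unchanged.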

## Training the model

The following steps were taken in the process of training the model:

### A 'simple' start:

* Only a subset of the data was used for training, e.g. 160 samples
* No regularization, i.e. dropout disabled
* A small number of neurons, e.g. 32
* A starting learning rate of 1e-5, taken from ([Kuchera, 2019](https://www.sciencedirect.com/science/article/pii/S0168900219308046?via%3Dihub))
* The goal was simply to confirm that the model trains, i.e. that the loss decreases with the number of epochs
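
For a sense of scale, the settings used in the next cell (`validation_split=0.15`, `batch_size=32`) leave only a handful of gradient updates per epoch on 160 samples. The sketch below assumes that `validation_split` is forwarded to Keras `model.fit`, which holds out the last fraction of the samples; `split_sizes` and `steps_per_epoch` are hypothetical helpers written for this illustration.

```python
import math

def split_sizes(n_samples, validation_split):
    """Return (n_train, n_val) when the last fraction of samples is held out."""
    n_train = int(math.floor(n_samples * (1.0 - validation_split)))
    return n_train, n_samples - n_train

def steps_per_epoch(n_train, batch_size):
    """Number of mini-batches per epoch; the last batch may be smaller."""
    return math.ceil(n_train / batch_size)

n_train, n_val = split_sizes(160, 0.15)
print("training samples:  ", n_train)
print("validation samples:", n_val)
print("updates per epoch: ", steps_per_epoch(n_train, 32))
```

With so few updates per epoch and a learning rate of 1e-5, the loss is expected to fall slowly, which is acceptable for a first check that training works at all.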

In [7]:
%%time
fcnn.train.train(train=train, 
                log_dir='fcnn/logs/',
                hidden_layers=[32],
                validation_split=0.15,
                lr=1e-5, 
                decay=0.,
                examples_limit=-160,
                epochs=20, 
                batch_size=32,
                seed=71,
                use_dropout=False,
                dropout=0.5,
               )

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 32)                256032    
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 33        
Total params: 256,065
Trainable params: 256,065
Non-trainable params: 0
_________________________________________________________________
None

Writing fits to: fcnn/logs/nodes32_dropoutFalse_lr1e-05_decay0.0_samples-160/20200828-155109
Checkpoint path: fcnn/logs/nodes32_dropoutFalse_lr1e-05_decay0.0_samples-160/20200828-155109/epoch-{epoch:02d}.h5
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
CPU times: user 45.8 s, sys: 29.2 s, total: 1min 15s
Wall time: 25.1

In [8]:
%tensorboard --logdir fcnn/logs/ --port 6008

Reusing TensorBoard on port 6008 (pid 20761), started 2 days, 23:28:39 ago. (Use '!kill 20761' to kill it.)
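
Each run writes to its own subdirectory of `fcnn/logs/`, named after its hyperparameters and start time, so that runs can be compared side by side in TensorBoard. The naming code in `fcnn/train.py` is not shown in this notebook; the following is a sketch of one way to produce the pattern seen in the log output above, and `run_dir` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import PurePosixPath

def run_dir(log_dir, hidden_layers, use_dropout, lr, decay, examples_limit, now=None):
    """Build a per-run log directory from the hyperparameters and a timestamp."""
    now = now or datetime.now()
    name = (f"nodes{hidden_layers[0]}_dropout{use_dropout}"
            f"_lr{lr}_decay{decay}_samples{examples_limit}")
    return str(PurePosixPath(log_dir) / name / now.strftime("%Y%m%d-%H%M%S"))

# Reproduces the directory reported for the first training run.
print(run_dir("fcnn/logs/", [32], False, 1e-5, 0., -160,
              now=datetime(2020, 8, 28, 15, 51, 9)))
```

Encoding the hyperparameters in the path keeps runs with different settings apart, while the timestamp separates repeated runs with identical settings.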

Training posed no problems: the loss decreased steadily (see `TensorBoard` above). A larger model was therefore trained next.

### Towards final model:

* All data included
* No regularization, i.e. dropout disabled
* 128 neurons were included
* A larger learning rate of 1e-3 (tuned)
* The goal now was to train the model smoothly by further tuning the learning rate, and to assess whether early stopping is worthwhile
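
In this notebook, early stopping is assessed by inspecting the loss curves in TensorBoard. An automated alternative is a patience rule: stop once the validation loss has not improved for a fixed number of epochs. The sketch below is a plain-Python illustration with made-up loss values; in Keras the same idea is available as `tf.keras.callbacks.EarlyStopping`, which this notebook does not use.

```python
def stop_with_patience(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-based, for a patience rule."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and keep training.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return epoch, best_epoch
    return len(val_losses), best_epoch

# Hypothetical validation losses, not measured in this notebook.
losses = [0.60, 0.45, 0.38, 0.35, 0.36, 0.35, 0.37, 0.40]
stop_epoch, best_epoch = stop_with_patience(losses, patience=3)
print("stop after epoch", stop_epoch, "- best epoch was", best_epoch)
```

A fixed epoch budget, as chosen later in this notebook, is the manual counterpart of this rule.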

In [9]:
%%time
fcnn.train.train(train=train, 
                log_dir='fcnn/logs/',
                hidden_layers=[128],
                validation_split=0.15,
                lr=1e-3, 
                decay=0.,
                examples_limit=-1,
                epochs=20, 
                batch_size=32,
                seed=71,
                use_dropout=False,
                dropout=0.5,
               )

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 128)               1024128   
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 129       
Total params: 1,024,257
Trainable params: 1,024,257
Non-trainable params: 0
_________________________________________________________________
None

Writing fits to: fcnn/logs/nodes128_dropoutFalse_lr0.001_decay0.0_samples-1/20200828-155135
Checkpoint path: fcnn/logs/nodes128_dropoutFalse_lr0.001_decay0.0_samples-1/20200828-155135/epoch-{epoch:02d}.h5
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
CPU times: user 54.8 s, sys: 38.4 s, total: 1min 33s
Wall time: 

In [10]:
%tensorboard --logdir fcnn/logs/ --port 6008

Reusing TensorBoard on port 6008 (pid 20761), started 2 days, 23:29:09 ago. (Use '!kill 20761' to kill it.)

Training again posed no problems (see `TensorBoard` above). The model converged quickly, and because the validation loss does not start to increase in later epochs, overfitting appears to be limited.

### Final model:

* All data included
* Dropout enabled, with a dropout probability of 0.5
* 128 neurons were included
* A learning rate of 1e-3 (tuned)
* Early stopping: training ends after 12 epochs (`epochs=12`)
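
Dropout regularizes the network by randomly zeroing hidden activations during training. The NumPy sketch below illustrates the "inverted dropout" scheme, in which the surviving activations are scaled by `1 / (1 - rate)` so that their expected value is unchanged; `dropout_forward` is a hypothetical helper for illustration, not the Keras implementation.

```python
import numpy as np

def dropout_forward(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        # At inference time the layer passes its input through unchanged.
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(71)
hidden = np.ones((4, 128))  # toy activations of a 128-neuron hidden layer
dropped = dropout_forward(hidden, rate=0.5, rng=rng)

print("distinct values:", np.unique(dropped))
print("mean activation:", dropped.mean())
```

With a rate of 0.5, roughly half of the 128 hidden units are silenced in every training step, which discourages the network from relying on any single unit.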

In [11]:
%%time
fcnn.train.train(train=train, 
                log_dir='fcnn/logs/',
                hidden_layers=[128],
                validation_split=0.15,
                lr=1e-3, 
                decay=0.,
                examples_limit=-1,
                epochs=12, 
                batch_size=32,
                seed=71,
                use_dropout=True,
                dropout=0.5,
               )

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 128)               1024128   
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 129       
Total params: 1,024,257
Trainable params: 1,024,257
Non-trainable params: 0
_________________________________________________________________
None

Writing fits to: fcnn/logs/nodes128_dropoutTrue_lr0.001_decay0.0_samples-1/20200828-155204
Checkpoint path: fcnn/logs/nodes128_dropoutTrue_lr0.001_decay0.0_samples-1/20200828-155204/epoch-{epoch:02d}.h5
Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12
CPU times: user 33.2 s, sys: 2

In [12]:
%tensorboard --logdir fcnn/logs/ --port 6008

Reusing TensorBoard on port 6008 (pid 20761), started 2 days, 23:29:27 ago. (Use '!kill 20761' to kill it.)