**DeapSECURE module 4: Deep Learning**

# Session 3: Model Tuning

Welcome to the DeapSECURE online training program!
This is a Jupyter notebook for the hands-on learning activities of the
["Deep Learning" (DL) module](https://deapsecure.gitlab.io/deapsecure-lesson04-nn/),
Episode 6: ["Tuning Neural Network Models for Better Accuracy"](https://deapsecure.gitlab.io/deapsecure-lesson04-nn/30-model-tuning/index.html).
Please visit the [DeapSECURE](https://deapsecure.gitlab.io/) website to learn more about our training program.

## Overview

In this session, we will use this notebook to **tune** neural network models to improve the accuracy of the classification task on the Sherlock "Applications" dataset.
We will be using the same dataset, which contains 18 applications.

> **Your challenge** in this notebook is to train more neural network models using the "18-apps" dataset to improve the classification accuracy of the model. Can we reach 99%? How about 99.9%? Or 99.99%?

> **DISCUSSION**: In cybersecurity, why do we care about 99.99% or even 99.999% accuracy?
> Think, for example, of the case of spam detection.
> What will happen if we falsely mark many legitimate emails as spam?
> Or if we let many spam emails into your inbox?

**QUICK LINKS**
* [Setup](#sec-setup)
* [Loading Sherlock Applications Data](#sec-load_data)
* [Neural Network Models](#sec-NN)
* [Model Tuning Methods](#sec-Model_Tuning_Methods)

<a id="sec-setup"></a>
## 1. Setup Instructions

If you are opening this notebook from the Wahab OnDemand interface, you're all set.

If you see this notebook elsewhere and want to perform the exercises on the Wahab cluster, please follow the steps outlined in our setup procedure.

1. Make sure you have activated your HPC service.
2. Point your web browser to https://ondemand.wahab.hpc.odu.edu/ and sign in with your MIDAS ID and password.
3. Create a new Jupyter session with the following parameters: Python version **3.7**, Python suite `tensorflow 2.6 + pytorch 1.10`, Number of Cores **4**, Number of GPU **0**, Partition `main`, and Number of Hours at least **4**. (See <a href="https://wiki.hpc.odu.edu/en/ood-jupyter" target="_blank">ODU HPC wiki</a> for more detailed help.)
4. From the JupyterLab launcher, start a new Terminal session. Then issue the following commands to get the necessary files:

       mkdir -p ~/CItraining/module-nn
       cp -pr /shared/DeapSECURE/module-nn/. ~/CItraining/module-nn

Using the file manager on the left sidebar, now change the working directory to `~/CItraining/module-nn`.
The file name of this notebook is `NN-session-3.ipynb`.

### 1.1 Reminder

* Throughout this notebook, `#TODO` is used as a placeholder where you need to fill in something appropriate.

* To run the code in a cell, press `Shift+Enter`.

* <a href="https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf" target="_blank">Pandas cheatsheet</a>

* <a href="https://deapsecure.gitlab.io/deapsecure-lesson02-bd/10-pandas-intro/index.html#summary-indexing-syntax" target="_blank">Summary table of the commonly used indexing (subscripting) syntax</a> from our own lesson.

* <a href="https://keras.io/api/" target="_blank">Keras API document</a>

We recommend that you open these in separate tabs or print them;
they are handy references for writing your own code.

### 1.2 Loading Python Libraries

As a first step, we need to import the required libraries into this Jupyter notebook:
`pandas`, `numpy`, `matplotlib.pyplot`, and `tensorflow`.

In [None]:
import os
import sys

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# CUSTOMIZATIONS (optional)
np.set_printoptions(linewidth=1000)

%matplotlib inline

In [None]:
# tools for deep learning:
import tensorflow as tf
import tensorflow.keras as keras

# Import key Keras objects
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

In [None]:
# Import ML toolbox functions
from sherlock_ML_toolbox import load_prep_data_18apps, split_data_18apps, \
NN_Model_1H, plot_loss, plot_acc, combine_loss_acc_plots, fn_out_history_1H, \
model_layer_code_XH, fn_dir_tuning_XH, fn_out_history_XH

<a id="sec-load_data"></a>
## 2. Loading Sherlock Applications Data

Use the toolbox module `sherlock_ML_toolbox.py` to load the data and
preprocess it (data cleaning, label/feature separation,
feature normalization/scaling, etc.) until it is ready for ML, except
for train-validation splitting.
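As a rough illustration of two of the preprocessing steps named above (feature scaling and label encoding), here is a minimal sketch in plain Python. The function names `standardize` and `one_hot` are hypothetical and are not part of the toolbox; the toolbox may also use a different scaling method than the zero-mean, unit-variance scaling assumed here.

```python
from statistics import mean, pstdev

def standardize(column):
    """Scale a list of numbers to zero mean and unit standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

def one_hot(labels):
    """Map each label to a 0/1 vector with a single 1 at the label's class index."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    vectors = [[1 if index[lab] == i else 0 for i in range(len(classes))]
               for lab in labels]
    return classes, vectors

# Toy values for illustration only (not taken from the Sherlock dataset)
scaled = standardize([2.0, 4.0, 6.0])
classes, encoded = one_hot(["app_a", "app_b", "app_a"])
print(scaled)
print(classes, encoded)
```

The one-hot vectors play the same role as the `train_L_onehot` and `val_L_onehot` arrays used when fitting the models in this notebook.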

In [None]:
# Load in the pre-processed SherLock data.
datafile = "sherlock/sherlock_18apps.csv"
df_orig, df, labels, df_labels_onehot, df_features \
    = #TODO

# Split the data into train and validation datasets with their respective features and labels.
train_features, val_features, train_labels, val_labels, train_L_onehot, val_L_onehot \
    = #TODO

In [None]:
print("First 10 rows/entries from the preprocessed data:")
#TODO

In [None]:
print("Last 10 rows/entries from the preprocessed data:")
#TODO

<a id="sec-NN"></a>
## 3. Neural Network Model with One Hidden Layer -- the Baseline

Let us now start by building a simple neural network model with just one hidden layer.
This will serve as a *baseline*, which we will attempt to improve through the tuning process below.

### Aside: Metadata

(Experiment) metadata: provides context about each run/experiment.
Saving metadata provides the user with a quick way to recall 
important information about a particular run/experiment.

In these experiments, we will save the following metadata:
 - Expt_ID: shorthand of the naming convention along with the type of experiment
 - Job_ID: this will be unique for each experiment
 - hidden_neurons: as a list, where each element is the number of hidden neurons in that layer
 - learning_rate
 - batch_size
 
This metadata can either be saved during each experiment (this helps ensure that no mistakes are made);
or, it can be saved after if the user is very careful to remember
what to fill in for each run.

Since these experiments very methodically scan the hyperparameter
space, we will collect the metadata at the end of each experiment type.
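One possible way to store such metadata is a small JSON file next to the other outputs of a run. The sketch below is only an illustration: the function name `save_metadata`, the file name `metadata.json`, and the example `Expt_ID` and `Job_ID` values are assumptions, not part of the toolbox.

```python
import json
import os
import tempfile

def save_metadata(out_dir, expt_id, job_id, hidden_neurons, learning_rate, batch_size):
    """Write the run's metadata to <out_dir>/metadata.json and return the file path."""
    metadata = {
        "Expt_ID": expt_id,
        "Job_ID": job_id,
        "hidden_neurons": list(hidden_neurons),  # one entry per hidden layer
        "learning_rate": learning_rate,
        "batch_size": batch_size,
    }
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "metadata.json")
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2)
    return path

# Example: record the baseline run in a temporary directory (hypothetical IDs)
demo_dir = tempfile.mkdtemp()
meta_path = save_metadata(demo_dir, "hidden-neurons-scan", "job-0001", [18], 0.0003, 32)
with open(meta_path) as f:
    loaded = json.load(f)
print(loaded)
```

Writing the file inside the same helper that saves the model outputs would correspond to the "save during each experiment" option described above.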

### 3.1 The Baseline Model

The baseline neural network model has one hidden layer with `18` hidden neurons.

Let us train this model with an initial *learning rate* of `0.0003` and a batch size of `32`.

In [None]:
# Create the outer hidden_neurons directory
dir0_HN = "scan-hidden-neurons/"
os.makedirs(dir0_HN, exist_ok=True)

In [None]:
## Helper function

def saveOutputs_HN(numNeurons, currHistory, currModel):
    """
    Save the outputs of the hidden neurons model tuning experiments.
    It will create a directory within the hidden neurons directory with the 
    MODEL_DIR name.
    Save within this folder the following: 
    1. A loss_acc_plot.png that is the training and validation loss vs. epochs
      and the training and validation accuracy vs. epochs graphs.
    2. model_history.csv that contains the training and validation loss and
      accuracy per epoch data.
    3. model_weights.h5 that contains the saved model.
    
    Args:
      numNeurons: the number of hidden neurons used in this experiment.
      currHistory: the current history object used to create (and then save) the CSV and plot files.
      currModel: the current model to save.
    
    """
    # Create model output directory
    model_name = "model_1H" + str(numNeurons) + "N_lr" + str(0.0003) + "_bs" + str(32) + "_e" + str(10)
    MODEL_DIR = dir0_HN+model_name
    if not os.path.exists(MODEL_DIR):
        os.makedirs(MODEL_DIR)

    ## Save the Output

    # Utilize os.path.join to add the output files to the MODEL_DIR defined above.
    history_file = os.path.join(MODEL_DIR, 'model_history.csv')
    plot_file = os.path.join(MODEL_DIR, 'loss_acc_plot.png')
    model_file = os.path.join(MODEL_DIR, 'model_weights.h5')

    # save the history into a CSV file
    history_df = pd.DataFrame(currHistory.history)
    history_df.to_csv(history_file, index=False)

    # save the plots using the toolbox function and then add a title
    combine_loss_acc_plots(currHistory, plot_loss, plot_acc, show=False)
    plt.suptitle(model_name, fontsize=15)
    plt.savefig(plot_file)

    # save the model
    currModel.save(model_file)

In [None]:
model_1H = #TODO
model_1H_history = model_1H.fit(train_features,
                                train_L_onehot,
                                epochs=10, batch_size=32,
                                validation_data=(val_features, val_L_onehot),
                                verbose=2)

In [None]:
# Save the outputs
#saveOutputs_HN(#TODO)

<a id="sec-Model_Tuning_Methods"></a>
## 4 Model Tuning Methods

Now that we have built and trained the baseline neural network model, we will run a variety of experiments using different combinations of *hyperparameters*, in order to find the best performing model.
A secondary goal is to investigate how increasing or decreasing a hyperparameter affects the accuracy of the model.
Below is a list of hyperparameters that could be interesting to explore; feel free to experiment with your own ideas as well.

We will use the `NN_Model_1H` with 18 neurons in the hidden layer as a baseline.
Starting from this model, let us: 

- test with different numbers of neurons in the hidden layer: **12**, **8**, **4**, **2**, **1**
    - It is also worthwhile to test a higher number of neurons: **40**, **80**, or more
- test with different learning rates: **0.0003**, **0.001**, **0.01**, **0.1**
- test with different batch sizes: **16**, **32**, **64**, **128**, **512**, **1024**
- test with different numbers of hidden layers: **2**, **3**, and so on

> **NOTE:**
> The easiest way to do this exploration is to simply copy the code cell where we constructed and trained the baseline model and paste it to a new cell below, since most of the parameters (`hidden_neurons`, `learning_rate`, `batch_size`, etc.) can be changed when calling the `NN_Model_1H` function or when fitting the model.
> However, to change the number of hidden layers (which we will do much later), the original `NN_Model_1H` function must be duplicated and modified.
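To see how many runs this exploration implies, the sketch below enumerates the "one hyperparameter at a time" configurations from the list above, starting from the baseline values used in this notebook. The function name `one_at_a_time` is a hypothetical helper for illustration; it only builds dictionaries and does not train any model.

```python
# Baseline hyperparameters stated in this notebook
baseline = {"hidden_neurons": 18, "learning_rate": 0.0003, "batch_size": 32}

# Values to scan for each hyperparameter (taken from the list above)
scan_values = {
    "hidden_neurons": [12, 8, 4, 2, 1, 40, 80],
    "learning_rate": [0.0003, 0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64, 128, 512, 1024],
}

def one_at_a_time(baseline, scan_values):
    """Return configs that each differ from the baseline in at most one hyperparameter."""
    configs = []
    for name, values in scan_values.items():
        for value in values:
            config = dict(baseline)
            config[name] = value
            if config not in configs:   # the baseline itself appears only once
                configs.append(config)
    return configs

configs = one_at_a_time(baseline, scan_values)
print(len(configs), "distinct configurations")
for cfg in configs[:3]:
    print(cfg)
```

Scanning one hyperparameter at a time keeps the number of runs small; a full grid over all three lists would require many more runs.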

#### Post-Processing

To take advantage of Jupyter Notebook's ability to immediately inspect graphical 
elements, part of the post-processing will be done after each model's run.
Inspect the resulting loss and accuracy graphs and answer the following.

**QUESTIONS**: Based on the plots shown above (for the baseline model), inspect whether the training runs went as expected.

1) Visually inspect for any anomalies. Note the runs that produce "abnormal training trends", i.e., where the "loss vs. epochs" and/or "accuracy vs. epochs" curves exhibit a different behavior from what is shown for the baseline model.

2) Visually (or numerically) check for convergence (e.g. check the loss or accuracy for the last 4-5 epochs; what their slopes look like in this region; any fluctuations?)

3) Observe the differences in the *final* accuracies as a result of different `hidden_neurons` values. (We will do this more carefully in the next phase)
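For the numerical convergence check in question 2, one simple option is to look at the slope and the spread of a metric over the last few epochs. The following is a minimal sketch; the function name `convergence_summary` and the toy loss values are assumptions for illustration, not results from this notebook.

```python
def convergence_summary(values, last_n=5):
    """Summarize the tail of a per-epoch metric: least-squares slope and spread."""
    tail = list(values)[-last_n:]
    n = len(tail)
    if n < 2:
        raise ValueError("need at least two epochs")
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(tail) / n
    # Slope of the best-fit straight line through the last n points
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, tail))
             / sum((x - x_mean) ** 2 for x in xs))
    # Difference between the largest and smallest value in the tail
    spread = max(tail) - min(tail)
    return {"slope_per_epoch": slope, "spread": spread}

# Made-up loss values for illustration only
toy_loss = [1.00, 0.60, 0.40, 0.30, 0.25, 0.22, 0.21, 0.205, 0.20, 0.20]
summary = convergence_summary(toy_loss, last_n=5)
print(summary)
```

With a real run, the same function could be applied to a list such as `model_1H_history.history["val_loss"]`. A slope close to zero together with a small spread suggests that the metric has leveled off.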


### 4.1 Tuning Experiments #1: Varying Number of Neurons in Hidden Layers

In this round of experiments, we create several variants of `NN_Model_1H` models by varying the `hidden_neurons` hyperparameter, i.e. the number of neurons in the hidden layer.
The accuracy and loss of each model will be assessed as a function of `hidden_neurons`.
All the other hyperparameters (e.g. learning rate, epochs, batch_size, number of hidden layers) will be kept constant; they will be varied later.
Not every number of hidden neurons is tested, so feel free to create new code cells with a different number of neurons as your curiosity leads you.

##### Model "1H12N": 12 neurons in the hidden layer

In [None]:
"""Construct & train a NN_Model_1H with 12 neurons in the hidden layer""";

#model_1H12N = NN_Model_1H(#TODO...)
#model_1H12N_history = model_1H12N.fit(#TODO...)

# Also plot the loss & accuracy

##### Model "1H8N": 8 neurons in the hidden layer

In [None]:
"""Construct & train a NN_Model_1H with 8 neurons in the hidden layer""";

#model_1H8N = NN_Model_1H(#TODO...)
#model_1H8N_history = #TODO

# Also plot the loss & accuracy

> ### Tips & Tricks for Experimental Runs
>
> Do you see the systematic names of the model and history variables, etc?
> The variable called `model_1H12N` means "a model with one hidden layer (`1H`) that has 12 neurons (`12N`)".
> The use of systematic names, albeit complicated, will be very helpful in keeping track of different experiments.
> For example, down below, we will have models with two hidden layers; such a model can be denoted by a variable name such as `model_2H18N12N`, etc.
>
> **DISCUSSION QUESTION:**
> Why don't we just name the variables `model1`, `model2`, `model3`, ...?
> What are the advantages and disadvantages of naming them with this schema?
>
> **Keeping track of experimental results**:
> At this stage, it may be helpful to keep track of the final validation accuracy (at the last epoch) for each model with a distinct `hidden_neurons` value.
> You can use pen-and-paper, or build a spreadsheet with the following
> values:
>
> | `hidden_neurons` | `val_accuracy` |
> |------------------|----------------|
> |        1         |      ....      |
> |       ...        |      ....      |
> |       18         | 0.9792 (example) |
> |       ...        |      ....      |
> |       80         |      ....      |
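The naming scheme described in these tips can also be generated programmatically, which avoids typing mistakes in long names. The sketch below uses hypothetical helper names (`layer_code`, `run_name`); it is not the toolbox's `model_layer_code_XH` function, whose exact interface is not shown here.

```python
def layer_code(hidden_neurons):
    """Build a code such as '2H18N12N' from a list of hidden-layer sizes."""
    return str(len(hidden_neurons)) + "H" + "".join(str(n) + "N" for n in hidden_neurons)

def run_name(hidden_neurons, learning_rate, batch_size, epochs):
    """Build a run name in the style used for the output directories in this notebook."""
    return "model_{}_lr{}_bs{}_e{}".format(
        layer_code(hidden_neurons), learning_rate, batch_size, epochs)

print(layer_code([12]))
print(layer_code([18, 12]))
print(run_name([18], 0.0003, 32, 10))
```

Because the name encodes every hyperparameter of the run, the output directory of each experiment can be reconstructed later from its hyperparameters alone.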

**EXERCISES**: create additional code cells to run models with 4, 2, 1 neurons in the hidden layer

##### Model "1H4N": 4 neurons in the hidden layer

In [None]:
"""Construct & train a NN_Model_1H with 4 neurons in the hidden layer""";
#TODO

##### Model "1H2N": 2 neurons in the hidden layer

In [None]:
"""Construct & train a NN_Model_1H with 2 neurons in the hidden layer""";

#TODO

##### Model "1H1N": 1 neuron in the hidden layer

In [None]:
"""Construct & train a NN_Model_1H with 1 neuron in the hidden layer""";

#TODO

**EXERCISES**: create more code cells to run models with 40 and 80 neurons in the hidden layer. *You are welcome to explore even higher numbers of hidden neurons. Observe carefully what is happening!*

##### Model "1H40N": 40 neurons in the hidden layer

In [None]:
"""Construct & train a NN_Model_1H with 40 neurons in the hidden layer""";

#TODO

##### Model "1H80N": 80 neurons in the hidden layer

In [None]:
"""Construct & train a NN_Model_1H with 80 neurons in the hidden layer""";

#TODO

#### Post Processing for Experiment Type 1: Hidden Neurons

##### Visual inspection of graphs:

Recall the post-processing questions listed above.

#### Post Processing (Compiling the CSV) for Experiment 1: Hidden Neurons

#### *Method 1: Via a temporary data structure*
 
In this method, we will construct and fill a temporary data structure (`all_lastEpochMetrics`) dynamically before forming the dataframe.
This approach is useful when the size of the data (e.g. total number of rows) is not known *a priori*.

The following is a *simplified* loop which shows the logic
of this intermediate data construction:

In [None]:
# outer directory
dirPathHN = dir0_HN

# The number of neurons for each experiment/model
listHN = [1, 2, 4, 8, 12, 18, 40, 80]

# Number of epochs - 1
lastEpochNum = 9 

# Initialize. This will hold the list of dictionaries of last epoch metrics
# (loss, val_loss, accuracy, val_accuracy)
all_lastEpochMetrics = []

# Fill in the rows for the DataFrame
for HN in listHN:
    # Read the history CSV file and get the last row's data, which corresponds to the last epoch data.
    # run_subdir = "model_1H" + str(HN) + "N_lr0.0003_bs32_e10"
    # result_csv = os.path.join(dirPathHN, run_subdir, "model_history.csv")
    result_csv = fn_out_history_1H(dirPathHN, HN, 0.0003, 32, 10)
    print("Reading:", result_csv)
    epochMetrics = pd.read_csv(result_csv)
    # Fetch the loss, accuracy, val_loss, and val_accuracy from the last epoch
    # (should be the last row in the CSV file unless something went wrong
    # during the training)
    lastEpochMetrics = epochMetrics.iloc[lastEpochNum, :].to_dict()
    # Attach the "neurons" value
    lastEpochMetrics["hidden_neurons"] = HN
    all_lastEpochMetrics.append(lastEpochMetrics)

Now construct the `df_HN` dataframe:

In [None]:
df_HN = pd.DataFrame(all_lastEpochMetrics, 
                     columns=["hidden_neurons", "loss", "accuracy", "val_loss", "val_accuracy"])

In [None]:
print(df_HN)

##### Save the post-processing results for the hidden neurons experiment

In [None]:
df_HN.to_csv("post_processing_neurons.csv", index=False)
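Once the summary CSV exists, a natural next step is to pick out the row with the best validation accuracy. The sketch below does this with the standard `csv` module on an in-memory example; the function name `best_row` is hypothetical, and the numbers in `demo_csv` are placeholders for illustration only, not results from these experiments.

```python
import csv
import io

def best_row(csv_text, metric="val_accuracy"):
    """Return the row (as a dict of strings) with the highest value of the given metric."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return max(rows, key=lambda row: float(row[metric]))

# Placeholder numbers for illustration only; real values come from your own runs
demo_csv = """hidden_neurons,loss,accuracy,val_loss,val_accuracy
1,2.0,0.30,2.0,0.31
18,0.1,0.97,0.1,0.96
80,0.05,0.99,0.06,0.98
"""
best = best_row(demo_csv)
print(best["hidden_neurons"], best["val_accuracy"])
```

With the dataframe already in memory, the equivalent pandas expression would be `df_HN.loc[df_HN["val_accuracy"].idxmax()]`.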

### 4.2 Tuning Experiment #2: Varying Learning Rate

In this batch of experiments, the accuracy and loss of each model will be compared while changing the learning rate.
For simplicity, all the other hyperparameters (e.g. the number of neurons, epochs, batch size, hidden layers) will be kept constant.
The model with one hidden layer of 18 neurons will be used.
Not every possible learning rate is included, so feel free to create new code cells with a different learning rate.
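To build intuition for why the learning rate matters, the toy sketch below minimizes the one-dimensional function f(w) = w² with plain gradient descent. This is deliberately not the Adam optimizer used by the models in this notebook, and the learning rates here are not comparable to the values scanned in the experiments; the function name `gradient_descent` is an assumption for illustration.

```python
def gradient_descent(learning_rate, steps=20, w0=1.0):
    """Minimize f(w) = w**2 with plain gradient descent; return the final |w|."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w                  # derivative of w**2
        w = w - learning_rate * grad    # one update step
    return abs(w)

# Distance from the minimum (w = 0) after 20 steps for several learning rates
for lr in [0.01, 0.1, 0.5, 1.1]:
    print(lr, gradient_descent(lr))
```

In this toy problem, a very small learning rate approaches the minimum slowly, a moderate one gets much closer in the same number of steps, and a learning rate that is too large makes the iterate move away from the minimum. Similar qualitative behavior can show up in the loss curves of the experiments below.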

#### Define the helper functions

In [None]:
# Create the outer learning_rate directory
dir0_LR = "scan-learning-rate/"
os.makedirs(dir0_LR, exist_ok=True)

## Helper function

def saveOutputs_LR(learning_rate, currHistory, currModel):
    """
    Save the outputs of the learning rate model tuning experiments.
    It will create a directory within the learning rate directory with the 
    MODEL_DIR name.
    Save within this folder the following: 
    1. A loss_acc_plot.png that is the training and validation loss vs. epochs
      and the training and validation accuracy vs. epochs graphs.
    2. model_history.csv that contains the training and validation loss and
      accuracy per epoch data.
    3. model_weights.h5 that contains the saved model.
    
    Args:
      learning_rate: the learning rate used in this experiment.
      currHistory: the current history object used to create (and then save) the CSV and plot files.
      currModel: the current model to save.
    
    """
    # Create model output directory
    model_name = "model_1H18N_lr" + str(learning_rate) + "_bs" + str(32) + "_e" + str(10)
    MODEL_DIR = dir0_LR+model_name
    if not os.path.exists(MODEL_DIR):
        os.makedirs(MODEL_DIR)

    ## Save the Output

    # Utilize os.path.join to add the output files to the MODEL_DIR defined above.
    history_file = os.path.join(MODEL_DIR, 'model_history.csv')
    plot_file = os.path.join(MODEL_DIR, 'loss_acc_plot.png')
    model_file = os.path.join(MODEL_DIR, 'model_weights.h5')

    # save the history into a CSV file
    history_df = pd.DataFrame(currHistory.history)
    history_df.to_csv(history_file, index=False)

    # save the plots using the toolbox function and then add a title
    combine_loss_acc_plots(currHistory, plot_loss, plot_acc, show=False)
    plt.suptitle(model_name, fontsize=15)
    plt.savefig(plot_file)

    # save the model
    currModel.save(model_file)

##### Model "1H18N" With Learning Rate 0.0003

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & learning rate=0.0003""";

#model_1H18N_LR0_0003 = NN_Model_1H(#TODO...)
#model_1H18N_LR0_0003_history = #TODO

# Also plot the loss & accuracy (optional)

# Save the model


**TODO**

... (Create additional code cells to run `1H18N` models with larger learning rates: **0.001**, **0.01**, **0.1**.)

##### Model "1H18N" With Learning Rate 0.001

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & learning rate=0.001""";

#TODO

##### Model "1H18N" With Learning Rate 0.01

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & learning rate=0.01""";


#TODO

##### Model "1H18N" With Learning Rate 0.1

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & learning rate=0.1""";


#TODO

#### Post Processing for Experiment Type 2: Learning Rate

##### Visual inspection of graphs:

Recall the post-processing questions listed above.

#### Post Processing (Compiling the CSV) for Experiment Type 2: Learning Rate

This follows the same procedure as for the hidden neurons experiment, but with different variable names.
See above for more information.

In [None]:
# outer directory
dirPathLR = dir0_LR

# The learning rates for each experiment/model
listLR = [.0003, 0.001, 0.01, 0.1]

# Number of epochs - 1
lastEpochNum = 9 

# Initialize. This will hold the list of dictionaries of last epoch metrics
# (loss, val_loss, accuracy, val_accuracy)
all_lastEpochMetrics_LR = []

# Fill in the rows for the DataFrame
for LR in listLR:
    # Read the history CSV file and get the last row's data, which corresponds to the last epoch data.
    # run_subdir = "model_1H18N_lr"+str(LR)+"_bs32_e10"
    # result_csv = os.path.join(dirPathLR, run_subdir, "model_history.csv")
    result_csv = fn_out_history_1H(dirPathLR, 18, LR, 32, lastEpochNum+1)
    print("Reading:", result_csv)
    epochMetrics = pd.read_csv(result_csv)
    # Fetch the loss, accuracy, val_loss, and val_accuracy from the last epoch
    # (should be the last row in the CSV file unless something went wrong
    # during the training)
    lastEpochMetrics = epochMetrics.iloc[lastEpochNum, :].to_dict()
    # Attach the "learning rate" value
    lastEpochMetrics["learning_rate"] = LR
    all_lastEpochMetrics_LR.append(lastEpochMetrics)

Now construct the `df_LR` dataframe:

In [None]:
df_LR = pd.DataFrame(all_lastEpochMetrics_LR, 
                     columns=["learning_rate", "loss", "accuracy", "val_loss", "val_accuracy"])

In [None]:
print(df_LR)

##### Save the post-processing results for the learning rate experiments

In [None]:
df_LR.to_csv("post_processing_lr.csv", index=False)

### 4.3 Tuning Experiments #3: Varying Batch Size

The accuracy and loss of each model will be compared while changing the batch size.
For simplicity, all other hyperparameters (e.g. learning rate, epochs, number of neurons, hidden layers) will be kept constant.
The model with one hidden layer of 18 neurons will be used.
Not every possible batch size is included, so feel free to create new code cells with a different batch size.
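One direct consequence of the batch size is the number of weight updates performed in each epoch: the training set is processed in chunks of `batch_size` samples, and the weights are updated once per chunk. The sketch below works this out for a hypothetical training-set size (the real size depends on your train-validation split), assuming the last, partially filled batch is also processed.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of mini-batches (weight updates) in one pass over the training data."""
    return math.ceil(num_samples / batch_size)

# Hypothetical training-set size, chosen only for illustration
num_samples = 200_000
for bs in [16, 32, 64, 128, 512, 1024]:
    print(bs, updates_per_epoch(num_samples, bs))
```

With the number of epochs held fixed, a larger batch size therefore means fewer updates in total, which is worth keeping in mind when comparing the final accuracies.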

##### Define the helper function

In [None]:
# Create the outer batch_size directory
dir0_BS = "scan-batch-size"
os.makedirs(dir0_BS, exist_ok=True)

## Helper function

def saveOutputs_BS(batch_size, currHistory, currModel):
    """
    Save the outputs of the batch size model tuning experiments.
    It will create a directory within the batch size directory with the 
    MODEL_DIR name.
    Save within this folder the following: 
    1. A loss_acc_plot.png that is the training and validation loss vs. epochs
      and the training and validation accuracy vs. epochs graphs.
    2. model_history.csv that contains the training and validation loss and
      accuracy per epoch data.
    3. model_weights.h5 that contains the saved model.
    
    Args:
      batch_size: the batch size used in this experiment.
      currHistory: the current history object used to create (and then save) the CSV and plot files.
      currModel: the current model to save.
    
    """
    # Create model output directory
    model_name = "model_1H18N_lr0.0003_bs" + str(batch_size) + "_e" + str(10)
    MODEL_DIR = os.path.join(dir0_BS, model_name)  # keep outputs where the post-processing step looks for them
    if not os.path.exists(MODEL_DIR):
        os.makedirs(MODEL_DIR)

    ## Save the Output

    # Utilize os.path.join to add the output files to the MODEL_DIR defined above.
    history_file = os.path.join(MODEL_DIR, 'model_history.csv')
    plot_file = os.path.join(MODEL_DIR, 'loss_acc_plot.png')
    model_file = os.path.join(MODEL_DIR, 'model_weights.h5')

    # save the history into a CSV file
    history_df = pd.DataFrame(currHistory.history)
    history_df.to_csv(history_file, index=False)

    # save the plots using the toolbox function and then add a title
    combine_loss_acc_plots(currHistory, plot_loss, plot_acc, show=False)
    plt.suptitle(model_name, fontsize=15)
    plt.savefig(plot_file)

    # save the model
    currModel.save(model_file)

**TODO**

... (Create additional code cells to run `1H18N` models with different batch sizes, e.g. 16, 32, 64, 128, 512, 1024, ...)
Remember that the baseline model used `batch_size=32`.

##### Model "1H18N" With Batch Size 16

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & batch size=16""";

#TODO

##### Model "1H18N" With Batch Size 32

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & batch size=32""";

#TODO

##### Model "1H18N" With Batch Size 64

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & batch size=64""";

#TODO

##### Model "1H18N" With Batch Size 128

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & batch size=128""";

#TODO

##### Model "1H18N" With Batch Size 512

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & batch size=512""";

#TODO

##### Model "1H18N" With Batch Size 1024

In [None]:
"""Construct & train a NN_Model_1H with 18 neurons in the hidden layer & batch size=1024""";

#TODO

#### Post Processing for Experiment Type 3: Batch Size

##### Visual inspection of graphs:



#### Post Processing (Compiling the CSV) for Experiment Type 3: Batch Size

This follows the same procedure as for the hidden neurons experiment, but with different variable names.
See above for more information.

In [None]:
# outer directory
dirPathBS = dir0_BS

# The batch sizes for each experiment/model
listBS = [16, 32, 64, 128, 512, 1024]

# Number of epochs - 1
lastEpochNum = 9 

# Initialize. This will hold the list of dictionaries of last epoch metrics
# (loss, val_loss, accuracy, val_accuracy)
all_lastEpochMetrics_BS = []

# Fill in the rows for the DataFrame
for BS in listBS:
    # Read the history CSV file and get the last row's data, which corresponds to the last epoch data.
    # run_subdir = "model_1H18N_lr0.0003_bs"+str(BS)+"_e10"
    # result_csv = os.path.join(dirPathBS, run_subdir, "model_history.csv")
    result_csv = fn_out_history_1H(dirPathBS, 18, 0.0003, BS, lastEpochNum+1)
    print("Reading:", result_csv)
    epochMetrics = pd.read_csv(result_csv)
    # Fetch the loss, accuracy, val_loss, and val_accuracy from the last epoch
    # (should be the last row in the CSV file unless something went wrong
    # during the training)
    lastEpochMetrics = epochMetrics.iloc[lastEpochNum, :].to_dict()
    # Attach the "batch size" value
    lastEpochMetrics["batch_size"] = BS
    all_lastEpochMetrics_BS.append(lastEpochMetrics)

Now construct the `df_BS` dataframe:

In [None]:
df_BS = pd.DataFrame(all_lastEpochMetrics_BS, 
                     columns=["batch_size", "loss", "accuracy", "val_loss", "val_accuracy"])

In [None]:
print(df_BS)

##### Save the post-processing results for the batch size experiments

In [None]:
df_BS.to_csv("post_processing_bs.csv", index=False)

### 4.4 Tuning Experiments #4: Varying the number of hidden layers

The accuracy and loss of each model will be compared while changing the 'number of hidden layers'.
For simplicity, all other parameters (e.g. learning rate, epochs, batch_size, number of neurons) will be kept constant.
Not every number of hidden layers is included, so feel free to create new code cells with a different number of layers.
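Adding hidden layers increases the number of trainable parameters (weights and biases), which is one way to quantify how much larger each model in this experiment is. The sketch below counts the parameters of a fully connected network, using 19 input features and 18 output classes as in the model definitions in this notebook; the function name `dense_param_count` is a hypothetical helper.

```python
def dense_param_count(n_inputs, hidden_neurons, n_outputs):
    """Count weights and biases in a fully connected (dense) network."""
    sizes = [n_inputs] + list(hidden_neurons) + [n_outputs]
    # Each dense layer has fan_in * fan_out weights plus fan_out biases
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(sizes[:-1], sizes[1:]))

# 19 input features and 18 output classes, matching the models in this notebook
for layers in [[18], [18, 18], [18, 18, 18]]:
    print(layers, dense_param_count(19, layers, 18))
```

For a model that has already been built with Keras, `model.summary()` reports the same kind of count and can be used to cross-check.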

##### Define the helper function

In [None]:
# Create the outer layers directory
dir0_HL = "scan-layers"
os.makedirs(dir0_HL, exist_ok=True)

## Helper function

def saveOutputs_HL(num_layers, currHistory, currModel):
    """
    Save the outputs of the multiple hidden layers model tuning experiments.
    It will create a directory within the layers directory with the 
    MODEL_DIR name.
    Save within this folder the following: 
    1. A loss_acc_plot.png that is the training and validation loss vs. epochs
      and the training and validation accuracy vs. epochs graphs.
    2. model_history.csv that contains the training and validation loss and
      accuracy per epoch data.
    3. model_weights.h5 that contains the saved model.
    
    Args:
      num_layers: the number of hidden layers used in this experiment.
      currHistory: the current history object used to create (and then save) the CSV and plot files.
      currModel: the current model to save.
    
    """
    # Create model output directory
    titleAdd = str(num_layers) + "H"
    for j in range(num_layers):
        titleAdd += "18N"
    model_name = "model_"+titleAdd+"_lr0.0003_bs32" + "_e" + str(10)
    MODEL_DIR = os.path.join(dir0_HL, model_name)  # keep outputs where the post-processing step looks for them
    if not os.path.exists(MODEL_DIR):
        os.makedirs(MODEL_DIR)

    ## Save the Output

    # Utilize os.path.join to add the output files to the MODEL_DIR defined above.
    history_file = os.path.join(MODEL_DIR, 'model_history.csv')
    plot_file = os.path.join(MODEL_DIR, 'loss_acc_plot.png')
    model_file = os.path.join(MODEL_DIR, 'model_weights.h5')

    # save the history into a CSV file
    history_df = pd.DataFrame(currHistory.history)
    history_df.to_csv(history_file, index=False)

    # save the plots using the toolbox function and then add a title
    combine_loss_acc_plots(currHistory, plot_loss, plot_acc, show=False)
    plt.suptitle(model_name, fontsize=15)
    plt.savefig(plot_file)

    # save the model
    currModel.save(model_file)

#### Create the NN_Model_2H function that will build and compile a model with 2 hidden layers

In [None]:
def NN_Model_2H(hidden_neurons_1, hidden_neurons_2, learning_rate):
    """Definition of deep learning model with two dense hidden layers"""
    random_normal_init = tf.random_normal_initializer(mean=0.0, stddev=0.05)
    model = Sequential([
        # More hidden layers can be added here
        Dense(hidden_neurons_1, activation='relu', input_shape=(19,),
              kernel_initializer=random_normal_init), # Hidden Layer
        #TODO: Add another hidden layer
        Dense(18, activation='softmax',
              kernel_initializer=random_normal_init)  # Output Layer
    ])
    adam_opt = Adam(learning_rate=learning_rate, beta_1=0.9, beta_2=0.999, amsgrad=False)
    model.compile(optimizer=adam_opt,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

In [None]:
# the model with 18 neurons in both of the hidden layers
model_2H18N18N = #TODO
model_2H18N18N_history = #TODO

In [None]:
#TODO: save outputs

#### Create the NN_Model_3H function that will build and compile a model with 3 hidden layers

In [None]:
def NN_Model_3H(hidden_neurons_1, hidden_neurons_2, hidden_neurons_3, learning_rate):
    """Definition of deep learning model with three dense hidden layers"""
    random_normal_init = tf.random_normal_initializer(mean=0.0, stddev=0.05)
    model = Sequential([
        # More hidden layers can be added here
        Dense(hidden_neurons_1, activation='relu', input_shape=(19,),
              kernel_initializer=random_normal_init), # Hidden Layer
        Dense(hidden_neurons_2, activation='relu',
              kernel_initializer=random_normal_init), # Hidden Layer
        #TODO: Add another hidden layer
        Dense(18, activation='softmax',
              kernel_initializer=random_normal_init)  # Output Layer
    ])
    adam_opt = Adam(learning_rate=learning_rate, beta_1=0.9, beta_2=0.999, amsgrad=False)
    model.compile(optimizer=adam_opt,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

In [None]:
# the model with 18 neurons in each of the 3 hidden layers 
model_3H18N18N18N = #TODO
model_3H18N18N18N_history = model_3H18N18N18N.fit(train_features,
                                      train_L_onehot,
                                      epochs=10, batch_size=32,
                                      validation_data=(val_features, val_L_onehot),
                                      verbose=2)

In [None]:
# Save the results

#TODO

#### The NN_1H model (the model with 1 hidden layer)
##### For simplicity's sake, we will just save the output from the baseline model defined above

In [None]:
# Save the 1H18N model
saveOutputs_HL(1, model_1H_history, model_1H)

#### Post Processing for Experiment Type 4: Number of Hidden Layers

##### Visual inspection of graphs:

All of them follow the usual trends.

#### Post Processing (Compiling the CSV) for Experiment Type 4: Number of Hidden Layers

This follows the same procedure as the post-processing for the hidden-neuron experiments, only with different variable names; see that section above for more information.
It also requires a helper function, `model_layer_code_XH()`, located in the `sherlock_ML_toolbox.py` file. Test it below.

In [None]:
# Test out model_layer_code_XH(), which follows the naming convention.

#TODO
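
The model names used in this notebook (`1H18N`, `2H18N18N`, `3H18N18N18N`) suggest a simple naming pattern: the number of hidden layers followed by `H`, then each layer's neuron count followed by `N`. The sketch below illustrates that pattern only. The function name `layer_code_sketch` is hypothetical, and the actual `model_layer_code_XH()` in the toolbox may take different arguments or format its output differently, so compare it against your own test of the real helper.

```python
def layer_code_sketch(hidden_neurons):
    """Build a label such as '2H18N18N' from a list of layer sizes.

    Hypothetical stand-in for illustration only; the toolbox helper
    model_layer_code_XH() may behave differently.
    """
    num_layers = len(hidden_neurons)
    # One "<n>N" token per hidden layer, in order from first to last
    neuron_part = "".join(f"{n}N" for n in hidden_neurons)
    return f"{num_layers}H{neuron_part}"


for layers in [[18], [18, 18], [18, 18, 18]]:
    print(layers, "->", layer_code_sketch(layers))
```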

In [None]:
# outer directory
dirPathHL = dir0_HL

# The hidden layers for each experiment/model.
# The number of hidden neurons in each layer input as a list.
# Each number in the list is the number of neurons in that layer.
listHL = [[18], [18, 18], [18, 18, 18]]

# Index of the last epoch (number of epochs - 1)
lastEpochNum = 9

# Initialize. This will hold the list of dictionaries of last epoch metrics
# (loss, val_loss, accuracy, val_accuracy)
all_lastEpochMetrics_HL = []

# Fill in the rows for the DataFrame
for HL in listHL:
    # Read the history CSV file and get the last row's data, which corresponds to the last epoch data.
    result_csv = fn_out_history_XH(dirPathHL, HL, 0.0003, 32, lastEpochNum+1)
    print("Reading:", result_csv)
    epochMetrics = pd.read_csv(result_csv)
    # Fetch the loss, accuracy, val_loss, and val_accuracy from the last epoch
    # (should be the last row in the CSV file unless something went wrong
    # during the training)
    lastEpochMetrics = epochMetrics.iloc[lastEpochNum, :].to_dict()
    # Attach the hidden-layer configuration (stored under the "neurons" key)
    lastEpochMetrics["neurons"] = HL
    all_lastEpochMetrics_HL.append(lastEpochMetrics)
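
The loop above indexes the history with `iloc[lastEpochNum, :]`, which assumes every CSV file holds exactly `lastEpochNum + 1` rows. If a run stopped early, that lookup would raise an error. As a minimal sketch of a more forgiving approach, the helper below always returns the final row, however many epochs were recorded. It uses the standard-library `csv` module instead of pandas so that it has no dependencies; with pandas, `epochMetrics.iloc[-1]` selects the last row in the same way. The name `last_row_metrics` is hypothetical, and the code assumes a header row followed by purely numeric rows.

```python
import csv


def last_row_metrics(csv_path):
    """Return the final data row of a metrics CSV as a dict of floats.

    Hypothetical helper; assumes a header row followed by one numeric
    row per epoch (e.g. loss, accuracy, val_loss, val_accuracy).
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        raise ValueError(f"No epoch rows found in {csv_path}")
    # The last row is the last completed epoch, however many there were
    return {key: float(value) for key, value in rows[-1].items()}
```

Comparing the number of rows against the expected epoch count is still worthwhile, since a short file usually means the training did not finish.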

Now construct the `df_HL` dataframe:

In [None]:
df_HL = pd.DataFrame(all_lastEpochMetrics_HL, 
                     columns=["neurons", "loss", "accuracy", "val_loss", "val_accuracy"])

In [None]:
print(df_HL)

##### Save the post-processing results for the hidden layer experiments

In [None]:
df_HL.to_csv("post_processing_layers.csv", index=False)

## Additional Tuning Opportunities

There are other hyperparameters that can be adjusted:

  * Optimizer (try optimizers other than `Adam`)
  * Activation function (this is actually a part of the network's architecture)

We encourage you to explore the effects of changing these in your network.
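
One way to organize such an exploration is to enumerate every combination of the settings you want to try before running anything. The sketch below does this with `itertools.product`. The candidate lists are illustrative assumptions, not values prescribed by this lesson, and `build_experiment_grid` is a hypothetical helper name.

```python
from itertools import product

# Illustrative candidates (assumptions, not values prescribed by the lesson)
optimizers = ["adam", "sgd", "rmsprop"]
activations = ["relu", "tanh", "sigmoid"]


def build_experiment_grid(optimizers, activations):
    """Return one settings dict per (optimizer, activation) pair."""
    return [
        {"optimizer": opt, "activation": act}
        for opt, act in product(optimizers, activations)
    ]


experiment_grid = build_experiment_grid(optimizers, activations)
for settings in experiment_grid:
    print(settings)
```

Keras accepts these strings directly, as in `Dense(..., activation="tanh")` and `model.compile(optimizer="sgd", ...)`. Note that passing an optimizer by name uses its default learning rate; to control the learning rate, construct the optimizer object explicitly, as done with `Adam` in the model functions above.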

## Closing Remarks

This process of experimentation with different parameters for the neural network can get repetitive and cause this notebook to become very long.
Instead, it would be more beneficial to run experiments like this in a scripting environment.
To do this, we need to identify the relevant code elements for our script.
In a general sense, this is what we should pick out:

* Useful Python libraries & user-defined functions
* Proper sequence of commands that were run throughout this notebook (e.g. one-hot encoding must be done before training the models)
* Code cells that require repetition

In brief, once the initial experiments are done and we have established a working pipeline for machine learning, we need to change the way we work.
Real machine learning work requires many repetitive experiments, each of which may take a long time to complete.
Instead of running many experiments in Jupyter notebooks, where each will require us to wait for a while to finish, we need to be able to carry out many experiments in parallel so that we can obtain our results in a timely manner.
This is a key reason why we should make a script for these experiments and submit the script to run them in batch (non-interactive) mode.
HPC is well suited for this type of workflow; in fact, it is most efficient when used in this way.
Here are the key components of the "batch" way of working:

* A job scheduler (such as the SLURM job scheduler on HPC) to manage our jobs and run them on the appropriate resources;
* The machine learning script written in Python, which will read inputs from files and write outputs to files and/or standard output;
* The job script to launch the machine learning script in the non-interactive environment (e.g. an HPC compute node);
* A way to systematically repeat the experiments with some variations. This can be done by adding some command-line arguments for the (hyper)parameters that will be varied for each experiment.
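
As a minimal sketch of the last point, the hyperparameters varied in this notebook can be exposed as command-line arguments with Python's standard `argparse` module. The function name and option names below are hypothetical choices; the defaults mirror the values used in this notebook (learning rate 0.0003, batch size 32, 10 epochs).

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse the hyperparameters for one training run.

    Pass a list of strings for testing, or None to read sys.argv.
    """
    parser = argparse.ArgumentParser(
        description="Train one neural network model (sketch).")
    parser.add_argument("--hidden-neurons", type=int, nargs="+", default=[18],
                        help="neurons per hidden layer, e.g. 18 18 18")
    parser.add_argument("--learning-rate", type=float, default=0.0003)
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--epochs", type=int, default=10)
    return parser.parse_args(argv)


# Example: the settings for a model with two hidden layers of 18 neurons
args = parse_hyperparameters(["--hidden-neurons", "18", "18"])
print("Hidden layers:", args.hidden_neurons)
print("Learning rate:", args.learning_rate)
print("Batch size   :", args.batch_size)
print("Epochs       :", args.epochs)
# Model construction, training, and saving of outputs would follow here.
```

In an actual script, calling `parse_hyperparameters()` with no argument reads the options from the command line, so each batch job can launch the same script with different values.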