**DeapSECURE module 4: Deep Learning**

# Session 1: Binary Classification

Welcome to the DeapSECURE online training program!
This is a Jupyter notebook for the hands-on learning activities of the
["Deep Learning" (DL) module](https://deapsecure.gitlab.io/deapsecure-lesson04-nn/), Episode 4: ["An Introduction to Keras with Binary Classification Task"](https://deapsecure.gitlab.io/deapsecure-lesson04-nn/20-keras-intro/index.html).
Please visit the [DeapSECURE](https://deapsecure.gitlab.io/) website to learn more about our training program.

In this notebook, we will learn how to use the Keras framework to build a very simple binary classification model.
We will build a one-neuron model to perform the "application classification task" using SherLock's "**2-apps**" dataset introduced in the ["Machine Learning"](https://deapsecure.gitlab.io/deapsecure-lesson03-ml/) module.
A single neuron is the simplest neural network model for this classification task, because only one output is needed to distinguish the two apps.


**QUICK LINKS**
* [Setup](#sec-setup)
* [Loading SherLock Data](#sec-Load_data)
* [Binary Classification](#sec-Binary_clf)
* [Examining the Performance of One-neuron Model for Binary Classification Task](#sec-examining_performance)


<a id="sec-setup"></a>
## 1. Setup Instructions

If you are opening this notebook from the Wahab OnDemand interface, you're all set.

If you see this notebook elsewhere and want to perform the exercises on the Wahab cluster, please follow the steps outlined in our setup procedure.

1. Make sure you have activated your HPC service.
2. Point your web browser to https://ondemand.wahab.hpc.odu.edu/ and sign in with your MIDAS ID and password.
3. Create a new Jupyter session with the following parameters: Python version **3.7**, Python suite `tensorflow 2.6 + pytorch 1.10`, Number of Cores **4**, Number of GPU **0**, Partition `main`, and Number of Hours at least **4**. (See <a href="https://wiki.hpc.odu.edu/en/ood-jupyter" target="_blank">ODU HPC wiki</a> for more detailed help.)
4. From the JupyterLab launcher, start a new Terminal session. Then issue the following commands to get the necessary files:

       mkdir -p ~/CItraining/module-nn
       cp -pr /shared/DeapSECURE/module-nn/. ~/CItraining/module-nn

Using the file manager on the left sidebar, now change the working directory to `~/CItraining/module-nn`.
The file name of this notebook is `NN-session-1.ipynb`.

### 1.1 Reminder

* Throughout this notebook, `#TODO` is used as a placeholder where you need to fill in appropriate code.

* To run the code in a cell, press `Shift+Enter`.

* <a href="https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf" target="_blank">Pandas cheatsheet</a>

* <a href="https://deapsecure.gitlab.io/deapsecure-lesson02-bd/10-pandas-intro/index.html#summary-indexing-syntax" target="_blank">Summary table of the commonly used indexing (subscripting) syntax</a> from our own lesson.

* <a href="https://keras.io/api/" target="_blank">Keras API document</a>

We recommend that you open these in separate tabs or print them;
they are handy references for writing your own code.

### 1.2 Loading Python Libraries

As the first step, we need to import the required libraries into this Jupyter notebook:
`pandas`, `numpy`, `matplotlib.pyplot`, `sklearn`, and `tensorflow`.

In [None]:
"""Import the necessary Python modules""";

import os
import sys
import pandas
import numpy
#import seaborn
from matplotlib import pyplot
import sklearn

# tools for machine learning:
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
# for evaluating model performance
from sklearn.metrics import accuracy_score, confusion_matrix
# classic machine learning models:
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# TensorFlow
import tensorflow
import tensorflow.keras as keras

# Import Keras objects
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

%matplotlib inline

In [None]:
# Some advanced learners may like to use shortcuts,
# so we give them here:
pd = pandas
np = numpy
plt = pyplot

import tensorflow as tf

<a id="sec-Load_data"></a>
## 2. Loading Preprocessed SherLock "2-apps" dataset

First, we load the SherLock's "2-apps" _preprocessed_ features and labels into DataFrames.
We use the reduced set of features saved at the end of the "Machine Learning" module.

In [None]:
df2_features = pd.read_csv('sherlock/2apps_4f/sherlock_2apps_features.csv')
df2_labels = pd.read_csv('sherlock/2apps_4f/sherlock_2apps_labels.csv')

After preprocessing and feature selection, we have only four features: `cutime`, `num_threads`, `otherPrivateDirty`, and `priority`.
The label has two values: `0` representing **Facebook**, and `1` representing **WhatsApp**.

As we did in the ML module, we first split the data into training and testing sets.

In [None]:
train_F, test_F, train_L, test_L = train_test_split(df2_features, df2_labels, test_size=0.2)

<a id="sec-Binary_clf"></a>
## 3. Binary Classification Task in Keras

Keras is a powerful, high-level framework to develop and deploy neural network models in Python.
Keras is intuitive to use, allowing rapid prototyping, experimentation, as well as deployment of deep learning models for real-world problems.
Keras began as a high-level interface to several lower-level software frameworks such as Theano and TensorFlow; however, newer versions are [built exclusively for TensorFlow](https://github.com/keras-team/keras/releases/tag/2.4.0).
In this notebook, we show how easy it is to define, train, evaluate, and deploy neural networks with Keras.

The steps involved in deep learning are very similar to the steps of traditional machine learning:

1. Loading and preprocessing the input data;
2. Defining a neural network model using Keras;
3. Compiling the network (model);
4. Fitting (training) the network using the training data;
5. Evaluating the performance of the network;
6. Improving the model's performance iteratively by adjusting the network's hyperparameters and retraining;
7. Deploying the model to make predictions (i.e. "inference").

Of these steps, the second and third require Keras-specific objects and functions.

### 3.1 Defining a Neural Network the Keras Way

**There are mainly two ways that we can build models in Keras:**

* Sequential
* Functional

In this training, we limit ourselves to the Sequential model, which is sufficient to build simple neural network models.
Please refer to [Keras documentation on the Sequential model](https://keras.io/guides/sequential_model/) to learn more.

Keras organizes its objects in a logical hierarchy:
* A neural network model consists of *layers* (at minimum, an input layer [always implied] and an output layer).
* Each layer consists of one or more neurons (or generally, *nodes*).

The Keras library has many objects for the different types of layers.
The `Dense` object defines a fully connected layer of neurons, which is the most basic type of layer.
We will create neural network models using one or more `Dense` layers.

### 3.2 Constructing a Neural Network Model

Let us create a neural network model with Keras.
This model must have four inputs (defined by the four features in the SherLock "2-apps" data) and one output to distinguish between the two applications: Facebook and WhatsApp.
We will create a `Sequential` object to represent the model.

We will wrap the model construction in a function called `NN_binary_clf` (*binary* for the binary classification task; *clf* is short for "classifier"):

In [None]:
def NN_binary_clf(learning_rate):
    """Create a one-neuron binary classifier using Keras"""
    model = Sequential([
        Dense(1, activation='sigmoid', input_shape=(4,))
    ])
    # `learning_rate` is the documented keyword; the older `lr` alias is deprecated
    adam = Adam(learning_rate=learning_rate,
                beta_1=0.9, beta_2=0.999, amsgrad=False)
    model.compile(optimizer=adam,
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

This function builds a `Sequential` model (full object name: `tensorflow.keras.models.Sequential`).
The model has only one layer defined by this declaration:

```python
Dense(1, activation='sigmoid', input_shape=(4,))
```

The `Dense` class declares a regular fully connected layer, which can be a hidden layer or an output layer.
The arguments have the following meaning:

* `1`: the number of outputs from this layer, which also defines the number of fully connected neurons in this layer.

* `activation='sigmoid'` defines the (nonlinear) activation function used to transform the weighted sum of the input values to the output values.

* `input_shape=(4,)` defines that this layer connects to the input layer that has four inputs.

Please see Keras' [documentation for the Dense layer](https://keras.io/api/layers/core_layers/dense/) for more information and additional parameters.
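
To make the arguments above concrete, here is a minimal NumPy sketch of what a single `Dense(1, activation='sigmoid')` neuron computes for four inputs. The weight and bias values, as well as the sample feature vector, are made up for illustration; in Keras, they are initialized automatically and then learned during training.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def one_neuron_forward(x, weights, bias):
    """Compute sigmoid(w . x + b) for one sample with four features."""
    return sigmoid(np.dot(weights, x) + bias)

# Hypothetical values, for illustration only
weights = np.array([0.5, -0.25, 0.1, 0.0])   # one weight per input feature
bias = 0.2
x = np.array([1.0, 2.0, -1.0, 3.0])          # one (already scaled) sample

output = one_neuron_forward(x, weights, bias)
n_params = weights.size + 1                  # 4 weights + 1 bias = 5 trainable parameters
print(output, n_params)
```

An output above 0.5 would be interpreted as class `1` (WhatsApp) and an output below 0.5 as class `0` (Facebook). The count of five trainable parameters should match what `model.summary()` reports for this one-neuron model.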

In the `NN_binary_clf` function, this dense layer is the first and last layer in the model.

The next line in the function above,

```python
adam = Adam(learning_rate=learning_rate,
            beta_1=0.9, beta_2=0.999, amsgrad=False)
```

defines the *optimizer* used to train the model, i.e. to minimize the loss function.
We use the Adam optimizer, which is a "stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments" ([ref](https://keras.io/api/optimizers/adam/)).
This is the go-to optimizer for many deep learning practitioners.
The critical parameter here is the *learning rate*, which determines how fast the model "learns" based on the feedback from the previous iteration.
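
The effect of the learning rate can be sketched with plain gradient descent on a toy one-parameter loss. This is a deliberate simplification: Adam additionally adapts its step sizes using running estimates of the gradient moments. The function name `gradient_descent` and the toy loss `(w - 3)**2` are made up for illustration.

```python
def gradient_descent(learning_rate, n_steps=20, w_start=0.0):
    """Minimize the toy loss (w - 3)**2 with plain gradient descent."""
    w = w_start
    for _ in range(n_steps):
        grad = 2.0 * (w - 3.0)          # derivative of (w - 3)**2
        w = w - learning_rate * grad    # step against the gradient
    return w

w_slow = gradient_descent(learning_rate=0.01)   # small steps: still far from 3
w_fast = gradient_descent(learning_rate=0.1)    # larger steps: much closer to 3
print(w_slow, w_fast)
```

After the same number of steps, the larger learning rate ends up much closer to the minimum at `w = 3`. A learning rate that is too large, however, can overshoot the minimum and fail to converge, so the value usually needs some experimentation.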

The `compile` call then configures the model for training by attaching the optimizer, the metrics to report, and the other key component of a network, the *loss function*:

```python
model.compile(optimizer=adam,
              loss='binary_crossentropy',
              metrics=['accuracy'])
```

The loss function is one of the key components of a neural network.
The *loss* measures the prediction error of the network, and the *loss function* is the method used to calculate it.
During training, the loss is used to compute the gradients, and the gradients are used to update the weights of the network; this is how a neural network is trained.
The following loss functions are sufficient for most models [(Source: Towards Data Science)](https://towardsdatascience.com/understanding-different-loss-functions-for-neural-networks-dd1ed0274718):

  * Mean Squared Error (MSE)
  * Binary Crossentropy (BCE)
  * Categorical Crossentropy (CC)
  * Sparse Categorical Crossentropy (SCC)
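
As an illustration of the loss used in this notebook, the following is a minimal NumPy sketch of binary cross-entropy, the mean of `-[y*log(p) + (1-y)*log(1-p)]` over all samples. The clipping constant `eps` is an assumption of this sketch, and the labels and probabilities are made up; the Keras implementation may differ in its details.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0) for predictions of exactly 0 or 1
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob)
                          + (1.0 - y_true) * np.log(1.0 - y_prob)))

labels = [0, 1, 1, 0]
good = binary_crossentropy(labels, [0.1, 0.9, 0.8, 0.2])    # confident and correct
bad = binary_crossentropy(labels, [0.9, 0.1, 0.2, 0.8])     # confident and wrong
unsure = binary_crossentropy(labels, [0.5, 0.5, 0.5, 0.5])  # equals ln(2)
print(good, bad, unsure)
```

Confident correct predictions give a small loss, confident wrong predictions give a large one, and a model that always answers 0.5 sits in between at ln(2), about 0.693.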

### 3.3 Model Fitting and Validation

Next, we call the `NN_binary_clf` function to create a model object and start the fitting process. The key settings are:

* epochs: the number of epochs is a hyperparameter that defines the number of times the learning algorithm works through the entire training dataset.

* batch size: the number of training examples used in each estimate of the error gradient. It is an important hyperparameter that influences the dynamics of the learning algorithm.

* Loss function used: `binary_crossentropy`

* Optimizer used: Adam optimizer.
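
The relationship between epochs, batch size, and the number of weight updates can be worked out with a little arithmetic. In the sketch below, the helper name `count_weight_updates` and the training-set size of 1000 are hypothetical; the batch size and number of epochs are the ones used in this notebook.

```python
import math

def count_weight_updates(n_train, batch_size, epochs):
    """Return (updates per epoch, total updates) for mini-batch training."""
    # The last batch of an epoch may be smaller, hence the ceiling
    per_epoch = math.ceil(n_train / batch_size)
    return per_epoch, per_epoch * epochs

# Hypothetical training-set size, for illustration only
per_epoch, total = count_weight_updates(n_train=1000, batch_size=32, epochs=5)
print(per_epoch, total)
```

A smaller batch size therefore means more (but noisier) weight updates per epoch, while a larger batch size means fewer updates, each based on a better gradient estimate.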

In [None]:
model = NN_binary_clf(0.0003)
model_history = model.fit(train_F, train_L,
                          epochs=5, batch_size=32,
                          validation_data=(test_F, test_L),
                          verbose=2)

### 3.4 Explanation of the Training Result

`loss` and `val_loss` are the values of the loss function for your training and validation data, respectively.
Similarly, `accuracy` is the accuracy on the training data and `val_accuracy` is the accuracy on the validation data.
For model validation, we need to focus on the `val_loss` and `val_accuracy` values.

**QUESTION**: Why are `val_loss` and `val_accuracy` more important to pay attention to than the training data's metrics?

**QUESTION**: What are the trends of `val_loss` and `val_accuracy` as the number of epochs increases?

Model validation is already part of the Keras training function (`fit`); there is no need to invoke a separate validation function.
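
The object returned by `fit` keeps the per-epoch metrics in its `history` attribute, a dictionary that maps each metric name to a list of values. The following minimal sketch finds the epoch with the lowest validation loss. The helper name `best_epoch` and the numbers in `example_history` are made up for illustration; they are not results from this notebook.

```python
def best_epoch(history, metric="val_loss"):
    """Return (1-based epoch number, value) where `metric` is lowest."""
    values = history[metric]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Made-up numbers with the same layout as `model_history.history`
example_history = {
    "loss":         [0.60, 0.45, 0.40, 0.37, 0.36],
    "val_loss":     [0.50, 0.42, 0.38, 0.39, 0.41],
    "accuracy":     [0.70, 0.80, 0.83, 0.84, 0.85],
    "val_accuracy": [0.78, 0.82, 0.84, 0.84, 0.83],
}

epoch, value = best_epoch(example_history)
print(epoch, value)
```

After training, one would call `best_epoch(model_history.history)` instead. A validation loss that starts to rise while the training loss keeps falling, as in the made-up numbers above, is a common sign of overfitting.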

This training output shows 5 iterations, or epochs: in each epoch, the model went through all of the training data once so that the model parameters (neuron weights) were adjusted to better fit the training data.
As the result shows, our first epochs took \~20 seconds to complete (this timing varies with the computer hardware used to run the training). The loss drops to just below \~0.35 and the accuracy is \~0.85.

Our model has only the input layer and output layer with no hidden layers in between; there is not much flexibility to account for the complexity embodied in the training data.
Therefore, we should expect a fairly low accuracy outcome.

<a id="sec-examining_performance"></a>
## 4. Examining the Performance of One-neuron Model for Binary Classification Task

We've finished constructing our first model with Keras. Now, let's compare the performance of our one-neuron Keras model against traditional machine learning models. We will construct a Decision Tree model and a Logistic Regression model, both trained with scikit-learn, and then compare their accuracy on the same data with that of our one-neuron Keras model.

Before constructing our models, let us define a function to evaluate the accuracy of a model.

In [None]:
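def model_evaluate(model, test_F, test_L):
    """Print the accuracy and confusion matrix of a fitted scikit-learn classifier.

    The model's `predict` method must return class labels (not probabilities).
    """
    test_L_pred = model.predict(test_F)
    print("Evaluation by using model:", type(model).__name__)
    print("accuracy_score:", accuracy_score(test_L, test_L_pred))
    print("confusion_matrix:", "\n", confusion_matrix(test_L, test_L_pred))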

We can now use the `model_evaluate` function to evaluate our models. First, let's construct our decision tree and logistic regression models.

In [None]:
""" Construct a Decision Tree Model and fit to training data""";
#TODO
#model_dtc =
#model_dtc.fit()

In [None]:
"""Construct a Logistic Regression Model and fit to training data""";
#TODO
#model_lr =
#model_lr.fit()

Now that we have constructed our Decision Tree and Logistic Regression models, we can compare their accuracy to that of our one-neuron Keras model by calling the `model_evaluate` function on these classic machine learning models.

In [None]:
"""Uncomment and run""";
#model_evaluate(model_dtc, test_F, test_L)
#print()
#model_evaluate(model_lr, test_F, test_L)
#print()
#print('1-Neuron model:', model.evaluate(test_F, test_L))
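
Note that `model_evaluate` expects `predict` to return class labels, as scikit-learn classifiers do, whereas a Keras model with a sigmoid output returns probabilities; this is why the cell above calls `model.evaluate` for the one-neuron model. If you would like a confusion matrix for the Keras model as well, the probabilities must first be converted to labels. The following is a minimal sketch; the helper name `probabilities_to_labels`, the 0.5 threshold, and the example probabilities are assumptions made for illustration.

```python
import numpy as np

def probabilities_to_labels(probabilities, threshold=0.5):
    """Convert sigmoid outputs of shape (n, 1) or (n,) into 0/1 class labels."""
    flat = np.asarray(probabilities).reshape(-1)   # drop the trailing axis
    return (flat > threshold).astype(int)

# Made-up probabilities shaped like the output of a one-neuron Keras model
example_probs = np.array([[0.12], [0.91], [0.50], [0.73]])
example_labels = probabilities_to_labels(example_probs)
print(example_labels)
```

With real data, `probabilities_to_labels(model.predict(test_F))` would give labels that can be passed to `accuracy_score` and `confusion_matrix` together with `test_L`.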

### Your Challenge: Improving the One-Neuron Model

**QUESTION**:
Can you think of ways to improve the one-neuron model?
You are welcome to try your ideas using more cells below, and share them with your fellow learners.

We will learn more about these in later notebooks.

In [None]:
## Your responses here

## 5. Conclusion

In this notebook, we learned how to:

* import the necessary libraries;
* load and preprocess our dataset;
* define a neural network model using Keras;
* fit (train) our model.

Please summarize your findings below:

1. Which model performed the best so far: decision tree, logistic regression or a single-neuron model?
    
2. Which model trained faster?

3. Why did the single-neuron model perform as it did in this example?

4. How can we improve the performance of neural networks?