![alt text](images/HDAT9500Banner.PNG)
<br>

# Chapter 6: Artificial Neural Networks / Deep Learning
# Assignment: Add regularization to a neural network by using Dropout


# 1. Introduction

In this exercise, we will regularize the deep neural network we built in the previous exercise.

"Dropout works by probabilistically removing, or “dropping out,” inputs to a layer, which may be input variables in the data sample or activations from a previous layer."

Read:

1. https://machinelearningmastery.com/how-to-reduce-overfitting-with-dropout-regularization-in-keras/
2. http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf

In particular, pay attention to the section: MLP Dropout Regularization.
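
The quoted description of dropout can be sketched in a few lines of NumPy. This is only an illustration of the idea, not the Keras implementation: the function name `dropout_forward`, the toy activations, and the rate of 0.4 are all assumptions made for the example. It uses the "inverted" variant, in which the surviving activations are scaled up by `1 / (1 - rate)` during training so that nothing needs to change at prediction time.

```python
import numpy as np


def dropout_forward(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` while training."""
    if not training or rate == 0.0:
        # At prediction time the layer passes its input through unchanged
        return activations
    keep_prob = 1.0 - rate
    # Random mask: True keeps the unit, False drops it
    mask = rng.random(activations.shape) < keep_prob
    # Scale the survivors so the expected activation stays the same
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
activations = np.ones((4, 5))  # hypothetical activations from a previous layer

dropped = dropout_forward(activations, rate=0.4, rng=rng)
print(dropped)  # a mix of zeros and values scaled by 1 / 0.6
print(dropout_forward(activations, rate=0.4, rng=rng, training=False))
```

Because a different random mask is drawn on every training step, the network cannot rely on any single unit, which is the regularizing effect discussed in the readings.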


## 1.1. Aims of the Exercise:

1. This is an introduction to Artificial Neural Networks / Deep Learning with regularization. 
2. We will use Keras.

 
It aligns with all of our course learning outcomes: 

1.	Distinguish a range of task specific machine learning techniques appropriate for Health Data Science.
2.	Design machine learning tasks for Health Data Science scenarios.
3.	Construct appropriate training and test sets for health research data.


## 1.2. Jupyter Notebook Instructions
1. Read the content of each cell.
2. Where necessary, follow the instructions that are written in each cell.
3. Run/Execute all the cells that contain Python code sequentially (one at a time), using the "Run" button.
4. For those cells in which you are asked to write some code, please write the Python code first and then execute/run the cell.
 
## 1.3. Tips
 1. The square brackets on the left-hand side of each cell indicate whether the cell has been executed or not. Empty square brackets mean that the cell has not been executed, whereas square brackets that contain a number mean that the cell has been executed. Run all the cells in sequence, using the "Run" button.
 2. To edit this notebook, just double-click a cell. Each cell in the document can be a "Code" cell or a "Markdown" (text) cell. To choose between these two options, use the combo-box above.
 3. If you want to save your notebook, please make sure you press the "floppy disk" icon button above.
 4. To clear the output of all cells and start the notebook afresh, please go to Cell->All Output->Clear


# 2. Load the Wisconsin Cancer Data Set and Prepare the data

For data dictionary and all information:
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)

In [None]:
import sys
print(sys.version)
#For this notebook to work, Python must be 3.6.4 or 3.6.5

import numpy as np
import pandas as pd
from IPython.display import display

from plotnine import *
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
cancer = pd.read_csv('data/breast-cancer-wisconsin-data/data.csv', sep=',')

In [None]:
# Sanity Check:
display(cancer.head())
print(cancer.shape)

In [None]:
cancer.dtypes

In [None]:
# Divide the data into X and y (output, labels)
X = cancer.iloc[:, 2:].values
y_categorical = cancer.iloc[:, 1].values

In [None]:
# Sanity check
display(X[:5])

Pay attention to the shape of the input array! Its second dimension is the number of input features.
We will use it later to define the input of our ANN.

In [None]:
print(X.shape)

In [None]:
# Sanity check:
# Among the first 30 labels (indices 0 to 29), entries 19, 20 and 21 are 'B'
display(y_categorical[:30])
print(y_categorical.shape)

Encoding the categorical labels as 0/1:

In [None]:
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder
# LabelEncoder sorts the classes alphabetically, so 'B' becomes 0 and 'M' becomes 1
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y_categorical)

In [None]:
# Sanity check
# We can see that # 19, 20 and 21 are "0" within the array that goes from 0:29
# Therefore, the sanity check confirms that we have done the encoding correctly
display(y[:30])
print(y.shape)
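
To double-check which integer each diagnosis was mapped to, the fitted encoder can be inspected directly. The sketch below uses a small hypothetical label array (`toy_labels`) rather than the real data; `classes_` and `inverse_transform` are standard `LabelEncoder` members.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels standing in for y_categorical
toy_labels = np.array(["M", "B", "B", "M"])

encoder = LabelEncoder()
encoded = encoder.fit_transform(toy_labels)

# classes_ is sorted, so position 0 is 'B' (benign) and position 1 is 'M' (malignant)
mapping = {str(label): int(code) for code, label in enumerate(encoder.classes_)}
print(mapping)
print(encoded)
# inverse_transform recovers the original string labels
print(encoder.inverse_transform(encoded))
```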

Splitting the dataset into the Training set and Test set.

In [None]:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

<font color=red>Scaling</font>  our data is <font color=red> very important </font> when we use ANNs:

In [None]:
# Very very very important: Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
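
Note that the scaler is fitted on the training set only and then reused on the test set, so no information from the test set leaks into training. The following sketch, on hypothetical toy arrays (`toy_train`, `toy_test`), shows what `fit_transform` and `transform` compute and checks the result against a manual calculation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy matrices standing in for X_train and X_test
toy_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
toy_test = np.array([[2.0, 400.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(toy_train)  # learns the mean and std of each training column
test_scaled = scaler.transform(toy_test)        # reuses the training statistics

# Manual equivalent: StandardScaler divides by the population std (ddof=0)
manual_test = (toy_test - toy_train.mean(axis=0)) / toy_train.std(axis=0)

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
print(test_scaled)
print(manual_test)
```

After scaling, every training column has mean 0 and standard deviation 1, which keeps features measured in very different units from dominating the weight updates.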

So far, everything is very familiar. We have used some new scikit-learn instructions but, essentially, we have been following these steps throughout the course.

The new part starts here:

# 3. Our first ANN using Keras


In [None]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

<div class="alert alert-block alert-success">**Start Activity**</div>

### <font color='blue'> Question 1: Build the whole ANN as you did in the previous exercise. Add dropout regularization to the two hidden layers. Choose an appropriate value for the regularization (tune the dropout value). How would you define dropout and how does it affect the network? (70 marks) </font>

<font color='blue'> Play around with several values of dropout and see the effect that this value has on the network.</font>

<b> Write answer here:</b>
#####################################################################################################################

(Double-click here)


#####################################################################################################################

In [None]:
# Write Python Code here:


### <font color='blue'> Question 2: Calculate accuracy, confusion matrix and all the metrics included in classification_report function. (15 marks) </font>
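
As a reminder of how the scikit-learn metric functions are called, here is a minimal sketch on hypothetical toy values. It assumes the network ends in a single sigmoid unit, so the predictions are probabilities that must be thresholded before they can be compared with the labels; the arrays `toy_true` and `toy_prob` and the 0.5 threshold are illustrative assumptions, not results from this data set.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical values: in the exercise these would come from y_test and the model's predictions
toy_true = np.array([0, 0, 1, 1, 1, 0])
toy_prob = np.array([0.1, 0.6, 0.8, 0.4, 0.9, 0.2])  # assumed sigmoid outputs

# Turn probabilities into class labels with a 0.5 threshold
toy_pred = (toy_prob > 0.5).astype(int)

accuracy = accuracy_score(toy_true, toy_pred)
cm = confusion_matrix(toy_true, toy_pred)  # rows: true class, columns: predicted class

print(accuracy)
print(cm)
print(classification_report(toy_true, toy_pred, target_names=["B", "M"]))
```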

In [None]:
# Write Python Code here:
# Compiling the ANN
# Predicting the Test set results


In [None]:
# Making the Confusion Matrix


### <font color='blue'> Question 3: Write your conclusions about the performance and potential use of this classifier. (15 marks) </font>

<b> Write answer here:</b>
#####################################################################################################################

(Double-click here)


#####################################################################################################################

<div class="alert alert-block alert-warning">**End Activity**</div>