![alt text](images/HDAT9500Banner.PNG)
<br>

# Chapter 6: Artificial Neural Networks / Deep Learning
# Exercise 01: 


# 1. Introduction

In this exercise, we will build our first dense neural network using Keras.


## 1.1. Aims of the Exercise:

1. This is an introduction to Artificial Neural Networks / Deep Learning. 
2. We will use Keras, a high-level API built on top of low-level neural network libraries such as TensorFlow and Theano. Keras handles many implementation details for us and is easy to use.

 
This exercise aligns with all of the learning outcomes of our course: 

1.	Distinguish a range of task specific machine learning techniques appropriate for Health Data Science.
2.	Design machine learning tasks for Health Data Science scenarios.
3.	Construct appropriate training and test sets for health research data.


## 1.2. Jupyter Notebook Instructions
1. Read the content of each cell.
2. Where necessary, follow the instructions that are written in each cell.
3. Run/Execute all the cells that contain Python code sequentially (one at a time), using the "Run" button.
4. For those cells in which you are asked to write some code, please write the Python code first and then execute/run the cell.
 
## 1.3. Tips
 1. The square brackets on the left hand side of each cell indicate whether the cell has been executed or not. Empty square brackets mean that the cell has not been executed, whereas square brackets that contain a number mean that the cell has been executed. Run all the cells in sequence, using the "Run" button.
 2. To edit this notebook, just double-click in each cell. In the document, each cell can be a "Code" cell or "text-Markdown" cell. To choose between these two options, go to the combo-box above. 
 3. If you want to save your notebook, please make sure you press the "floppy disk" icon button above. 
 4. To clear the output of all cells and restart the Notebook, go to Cell->All Output->Clear.


# 2. Load the Wisconsin Cancer Data Set and Prepare the data

For data dictionary and all information:
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)

In [None]:
import sys
print(sys.version)
#For this notebook to work, Python must be 3.6.4 or 3.6.5

import numpy as np
import pandas as pd
from IPython.display import display

from plotnine import *
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
cancer = pd.read_csv('data/breast-cancer-wisconsin-data/data.csv', sep=',')

In [None]:
# Sanity Check:
display(cancer.head())
print(cancer.shape)

In [None]:
cancer.dtypes

In [None]:
# Divide the data into X and y (output, labels)
X = cancer.iloc[:, 2:].values
y_categorical = cancer.iloc[:, 1].values

In [None]:
# Sanity check
display(X[:5])

<div class="alert alert-block alert-success">**Start Activity**</div>

Pay attention to the shape of the input vector!!!!
We will use it later in our ANN.

In [None]:
print(X.shape)

### <font color='blue'> Question 1:  What is the meaning of (569,30)? </font>

<b> Write answer here:</b>
#####################################################################################################################

(Double-click here)


#####################################################################################################################

<div class="alert alert-block alert-warning">**End Activity**</div>

In [None]:
# Sanity check:
# We can see that # 19, 20 and 21 are B within the array that goes from 0:29
display(y_categorical[:30])
print(y_categorical.shape)

Encoding categorical data into 0-1

In [None]:
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder
# The encoder is applied to the labels (y), so it is named accordingly
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y_categorical)

In [None]:
# Sanity check
# We can see that # 19, 20 and 21 are "0" within the array that goes from 0:29
# Therefore, the sanity check confirms that we have done the encoding correctly
display(y[:30])
print(y.shape)
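
To see which integer each category receives, the short sketch below runs `LabelEncoder` on a few toy labels. The array `toy_labels` is hypothetical and only assumes the two diagnosis codes "M" and "B"; it is not the course data.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy labels standing in for the diagnosis column.
toy_labels = np.array(["M", "B", "B", "M", "B"])

encoder = LabelEncoder()
toy_encoded = encoder.fit_transform(toy_labels)

# classes_ holds the original labels in sorted order; the position of a
# label in this array is the integer it is encoded as.
print(encoder.classes_)
print(toy_encoded)

# inverse_transform maps the integers back to the original labels.
toy_decoded = encoder.inverse_transform(toy_encoded)
print(toy_decoded)
```

Because the classes are sorted, "B" is encoded as 0 and "M" as 1, which matches the sanity check above.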

Splitting the dataset into the Training set and Test set

In [None]:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 1)

In [None]:
# Sanity Check
print(X_train.shape)
print(X_test.shape)
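
As a rough check of the shapes printed above, the sketch below splits a placeholder array that has the same shape as our data, (569, 30). The arrays `placeholder_X` and `placeholder_y` are assumptions made for illustration and contain only zeros.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shape as the data set (values are zeros).
placeholder_X = np.zeros((569, 30))
placeholder_y = np.zeros(569, dtype=int)

# Same arguments as in the cell above: 20% of the rows go to the test set.
ph_X_train, ph_X_test, ph_y_train, ph_y_test = train_test_split(
    placeholder_X, placeholder_y, test_size=0.2, random_state=1
)

# The number of columns (features) is unchanged; only the rows are split.
print(ph_X_train.shape)
print(ph_X_test.shape)
```

The test set size is rounded up, so 20% of 569 rows gives 114 test rows and 455 training rows.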

<font color=red>Scaling</font>  our data is <font color=red> very important </font> when we use ANNs:

In [None]:
# Very very very important: Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
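
The following sketch illustrates what the scaling step does, using randomly generated toy data (the arrays `toy_train` and `toy_test` are hypothetical). The scaler learns the mean and standard deviation from the training data only and reuses them for the test data, so no information from the test set leaks into training.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two features with mean around 50 and spread around 10.
rng = np.random.RandomState(0)
toy_train = rng.normal(loc=50.0, scale=10.0, size=(100, 2))
toy_test = rng.normal(loc=50.0, scale=10.0, size=(20, 2))

scaler = StandardScaler()
# fit_transform learns the mean and standard deviation from the training data.
toy_train_scaled = scaler.fit_transform(toy_train)
# transform reuses the training mean and standard deviation on the test data.
toy_test_scaled = scaler.transform(toy_test)

print(toy_train_scaled.mean(axis=0))  # close to 0 for every feature
print(toy_train_scaled.std(axis=0))   # close to 1 for every feature
print(toy_test_scaled.mean(axis=0))   # near 0, but generally not exactly 0
```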

So far, everything is very familiar. We have used some new scikit-learn instructions, but essentially, we have been following these same steps throughout the course.

The new part starts here:

# 3. Our first ANN using Keras


In [None]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
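
As background for the activity below, a dense layer multiplies its inputs by a weight matrix, adds a bias, and applies an activation function. The sketch below shows this computation in plain NumPy with toy sizes. It is an illustration only, not Keras code, and the names `relu`, `dense_forward`, `toy_inputs`, `toy_weights` and `toy_bias` are hypothetical.

```python
import numpy as np

def relu(z):
    # relu keeps positive values and replaces negative values with 0.
    return np.maximum(z, 0.0)

def dense_forward(inputs, weights, bias):
    # One dense layer: weighted sum of the inputs plus bias, then relu.
    return relu(inputs @ weights + bias)

rng = np.random.RandomState(0)
toy_inputs = rng.normal(size=(5, 4))    # 5 samples, 4 features
toy_weights = rng.normal(size=(4, 3))   # 4 inputs feeding 3 nodes
toy_bias = np.zeros(3)                  # one bias per node

toy_hidden = dense_forward(toy_inputs, toy_weights, toy_bias)
print(toy_hidden.shape)  # one row per sample, one column per node
```

In Keras, the weights and biases are created and trained for us; we only specify the number of nodes and the activation function.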

<div class="alert alert-block alert-success">**Start Activity**</div>

### <font color='blue'> Question 2: Initialise ANN: Check page 28 of the book we use in this chapter and write the command to initialise our first ANN </font>

In [None]:
# Write Python Code here:


### <font color='blue'> Question 3:  Add input layer and first hidden layer: Using the function add, create the input layer and first hidden layer with 16 nodes and a relu activation function. Use the argument input_dim </font>
<p>
<font color='green'>Page 28 of our book</font>
<p>
<font color='green'>Read here how to use the function "add" by using the argument input_dim: 
<p>
    https://keras.io/getting-started/sequential-model-guide/</font>
<p>
<font color='green'> In addition, read:  
     https://keras.io/layers/merge/#add_1</font>
<p>
<font color='green'> Function "dense":           
    https://keras.io/layers/core/#dense</font>

In [None]:
# Write Python Code here:


### <font color='blue'> Question 4:  Add second hidden layer: Using the functions "add" and "dense", create a second hidden layer with 16 nodes and a relu activation function.  </font>
<p>
<font color='green'>Page 28 of our book</font>
<p>
<font color='green'>Read here how to use the function "add" by using the argument input_dim: 
<p>
    https://keras.io/getting-started/sequential-model-guide/</font>
<p>
<font color='green'> In addition, read:  
     https://keras.io/layers/merge/#add_1</font>
<p>
<font color='green'> Function "dense":           
    https://keras.io/layers/core/#dense</font>

In [None]:
# Write Python Code here:


### <font color='blue'> Question 5:  Add output layer: Using the functions "add" and "dense". What activation function would you use and why?  </font>

In [None]:
# Write Python Code here:


### <font color='blue'> Question 6: Compile our model.  Use a gradient-descent-based optimizer ("adam", for example), "binary_crossentropy" as the loss function, and accuracy as our metric.  </font>

In [None]:
# Write Python Code here:
# Compiling the ANN


### <font color='blue'> Question 7: Fit the ANN to the training set. Set batch_size=100 and nb_epoch=150 (this argument is named epochs in Keras 2 and later). These hyper-parameters normally have to be tuned; the values given here have already been optimised. What accuracy does the model obtain on the training set?</font>
<p>
<font color='blue'>Note: batch_size, nb_epoch can be tuned using GridSearchCV or trial and error. We will tune these parameters in the course final assignment.
</font>

In [None]:
# Write Python Code here:


### <font color='blue'> Question 8: Can you explain what batch_size=100 and nb_epoch=150 mean?</font>
<p>
<font color='blue'>Note: batch_size, nb_epoch can be tuned using GridSearchCV or trial and error. We will tune these parameters in the course final assignment.
</font>

<b> Write answer here:</b>
#####################################################################################################################

(Double-click here)


#####################################################################################################################

### <font color='blue'> Question 9: Calculate accuracy, confusion matrix and all the metrics included in classification_report function  </font>

In [None]:
# Write Python Code here:
# Predicting the Test set results


In [None]:
# Making the Confusion Matrix


### <font color='blue'> Question 10: Write your conclusions about the performance and potential use of this classifier. </font>

<b> Write answer here:</b>
#####################################################################################################################

(Double-click here)


#####################################################################################################################

<div class="alert alert-block alert-warning">**End Activity**</div>