![alt text](images/HDAT9500Banner.PNG)
<br>

# Chapter 6: Artificial Neural Networks / Deep Learning
# Exercise 01: 


# 1. Introduction

In this exercise, we will build our first dense neural network using Keras.


## 1.1. Aims of the Exercise:

1. This is an introduction to Artificial Neural Networks / Deep Learning. 
2. We will use Keras, a high-level API built on top of low-level neural network libraries such as TensorFlow and Theano. Keras handles many implementation details for us and is easy to use.

 
It aligns with all of the learning outcomes of our course: 

1.	Distinguish a range of task specific machine learning techniques appropriate for Health Data Science.
2.	Design machine learning tasks for Health Data Science scenarios.
3.	Construct appropriate training and test sets for health research data.


## 1.2. Jupyter Notebook Instructions
1. Read the content of each cell.
2. Where necessary, follow the instructions that are written in each cell.
3. Run/Execute all the cells that contain Python code sequentially (one at a time), using the "Run" button.
4. For those cells in which you are asked to write some code, please write the Python code first and then execute/run the cell.
 
## 1.3. Tips
 1. The square brackets on the left hand side of each cell indicate whether the cell has been executed or not. Empty square brackets mean that the cell has not been executed, whereas square brackets that contain a number mean that the cell has been executed. Run all the cells in sequence, using the "Run" button.
 2. To edit this notebook, just double-click in each cell. In the document, each cell can be a "Code" cell or "text-Markdown" cell. To choose between these two options, go to the combo-box above. 
 3. If you want to save your notebook, please make sure you press the "floppy disk" icon button above. 
 4. To clear the output of all cells and restart the Notebook, please go to Cell->All Output->Clear


# 2. Load the Wisconsin Cancer Data Set and Prepare the data

For data dictionary and all information:
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)

In [1]:
import sys
print(sys.version)
#For this notebook to work, Python must be 3.6.4 or 3.6.5

import numpy as np
import pandas as pd
from IPython.display import display

from plotnine import *
import matplotlib.pyplot as plt
import seaborn as sns

3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)]


In [2]:
cancer = pd.read_csv('data/breast-cancer-wisconsin-data/data.csv', sep=',')

In [3]:
# Sanity Check:
display(cancer.head())
print(cancer.shape)

Unnamed: 0,id,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
0,842302,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,842517,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,84300903,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,84348301,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,84358402,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


(569, 32)


In [4]:
cancer.dtypes

id                           int64
diagnosis                   object
radius_mean                float64
texture_mean               float64
perimeter_mean             float64
area_mean                  float64
smoothness_mean            float64
compactness_mean           float64
concavity_mean             float64
concave points_mean        float64
symmetry_mean              float64
fractal_dimension_mean     float64
radius_se                  float64
texture_se                 float64
perimeter_se               float64
area_se                    float64
smoothness_se              float64
compactness_se             float64
concavity_se               float64
concave points_se          float64
symmetry_se                float64
fractal_dimension_se       float64
radius_worst               float64
texture_worst              float64
perimeter_worst            float64
area_worst                 float64
smoothness_worst           float64
compactness_worst          float64
concavity_worst     

In [5]:
# Divide the data into X and y (output, labels)
X = cancer.iloc[:, 2:].values
y_categorical = cancer.iloc[:, 1].values

In [6]:
# Sanity check
display(X[:5])

array([[1.799e+01, 1.038e+01, 1.228e+02, 1.001e+03, 1.184e-01, 2.776e-01,
        3.001e-01, 1.471e-01, 2.419e-01, 7.871e-02, 1.095e+00, 9.053e-01,
        8.589e+00, 1.534e+02, 6.399e-03, 4.904e-02, 5.373e-02, 1.587e-02,
        3.003e-02, 6.193e-03, 2.538e+01, 1.733e+01, 1.846e+02, 2.019e+03,
        1.622e-01, 6.656e-01, 7.119e-01, 2.654e-01, 4.601e-01, 1.189e-01],
       [2.057e+01, 1.777e+01, 1.329e+02, 1.326e+03, 8.474e-02, 7.864e-02,
        8.690e-02, 7.017e-02, 1.812e-01, 5.667e-02, 5.435e-01, 7.339e-01,
        3.398e+00, 7.408e+01, 5.225e-03, 1.308e-02, 1.860e-02, 1.340e-02,
        1.389e-02, 3.532e-03, 2.499e+01, 2.341e+01, 1.588e+02, 1.956e+03,
        1.238e-01, 1.866e-01, 2.416e-01, 1.860e-01, 2.750e-01, 8.902e-02],
       [1.969e+01, 2.125e+01, 1.300e+02, 1.203e+03, 1.096e-01, 1.599e-01,
        1.974e-01, 1.279e-01, 2.069e-01, 5.999e-02, 7.456e-01, 7.869e-01,
        4.585e+00, 9.403e+01, 6.150e-03, 4.006e-02, 3.832e-02, 2.058e-02,
        2.250e-02, 4.571e-03, 2.357e

<div class="alert alert-block alert-success">**Start Activity**</div>

Pay attention to the shape of the input matrix: its number of columns is the number of input features.
We will need it later when we define the input layer of our ANN.

In [7]:
 print(X.shape)

(569, 30)


### <font color='blue'> Question 1:  What is the meaning of (569,30)? Which features were removed from the original data?</font>

<b> Write answer here:</b>
#####################################################################################################################

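There are 569 records (rows), each with 30 features (columns). The first two columns of the original data, id and diagnosis, were removed: id is only an identifier, and diagnosis is the label we want to predict (stored in y_categorical).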


#####################################################################################################################

<div class="alert alert-block alert-warning">**End Activity**</div>
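To see why `iloc[:, 2:]` leaves 30 columns, it may help to try the same slicing on a tiny table. The sketch below uses a made-up three-row DataFrame (the values and the name `toy` are illustrative only, not taken from the real file) that has the same two leading columns as the cancer data:

```python
import pandas as pd

# Hypothetical three-row table with the same leading columns as the real file
toy = pd.DataFrame({
    "id": [1, 2, 3],
    "diagnosis": ["M", "B", "M"],
    "radius_mean": [17.99, 13.54, 20.57],
    "texture_mean": [10.38, 14.36, 17.77],
})

X_toy = toy.iloc[:, 2:].values   # every column from position 2 onwards (features)
y_toy = toy.iloc[:, 1].values    # only the column at position 1 (diagnosis)

print(X_toy.shape)   # rows x remaining feature columns
print(y_toy)
```

The two columns in front of position 2 (`id` and `diagnosis`) are dropped from the features, which is how 32 columns become 30 in the real data.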

In [8]:
# Sanity check:
# We can see that # 19, 20 and 21 are B within the array that goes from 0:29
display(y_categorical[:30])
print(y_categorical.shape)

array(['M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'B', 'B', 'B', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M'], dtype=object)

(569,)


Encoding categorical data into 0-1

In [9]:
# Encoding categorical data
# LabelEncoder assigns integers to the sorted class names, so 'B' -> 0 and 'M' -> 1
from sklearn.preprocessing import LabelEncoder
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y_categorical)

In [10]:
# Sanity check
# We can see that # 19, 20 and 21 are "0" within the array that goes from 0:29
# Therefore, the sanity check confirms that we have done the encoding correctly
display(y[:30])
print(y.shape)

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0,
       1, 1, 1, 1, 1, 1, 1, 1], dtype=int64)

(569,)
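The mapping chosen by `LabelEncoder` can be inspected directly rather than inferred from the first 30 values. A minimal sketch on a toy label array (the array is made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = np.array(["M", "B", "B", "M"])   # toy labels, not the real data
encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

print(encoder.classes_)                    # class names in sorted order; position = code
print(encoded)                             # integer codes for each label
print(encoder.inverse_transform(encoded))  # back to the original letters
```

Because the classes are sorted alphabetically, benign ('B') becomes 0 and malignant ('M') becomes 1, so "1" is the positive class in the metrics later on.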


Splitting the dataset into the Training set and Test set

In [11]:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 1)

In [12]:
# Sanity Check
print(X_train.shape)
print(X_test.shape)

(455, 30)
(114, 30)
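The sizes above follow from `test_size = 0.2` applied to 569 rows, and `random_state` makes the split repeatable. The sketch below checks both points on placeholder arrays of the same shape (`X_demo` and `y_demo` are illustrative stand-ins, not the cancer data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shape as X and y
X_demo = np.arange(569 * 30).reshape(569, 30)
y_demo = np.arange(569) % 2

a_train, a_test, _, _ = train_test_split(X_demo, y_demo, test_size=0.2, random_state=1)
b_train, b_test, _, _ = train_test_split(X_demo, y_demo, test_size=0.2, random_state=1)

print(a_train.shape, a_test.shape)       # 20% of 569 rows, rounded up, go to the test set
print(np.array_equal(a_test, b_test))    # same random_state, same rows selected
```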


<font color=red>Scaling</font>  our data is <font color=red> very important </font> when we use ANNs:

In [13]:
# Very very very important: Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
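Note that the scaler is fitted on the training set only and then applied to the test set, so no information from the test set leaks into training. The sketch below writes the same idea out in plain NumPy on random placeholder data (the arrays and the random generator seed are assumptions for illustration); by default `StandardScaler` performs this mean and standard deviation calculation for each column:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=100.0, scale=20.0, size=(455, 3))   # placeholder training features
test = rng.normal(loc=100.0, scale=20.0, size=(114, 3))    # placeholder test features

# Statistics come from the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)          # population standard deviation (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std   # the test set reuses the training statistics

print(train_scaled.mean(axis=0).round(3))   # approximately 0 for every column
print(train_scaled.std(axis=0).round(3))    # approximately 1 for every column
print(test_scaled.mean(axis=0).round(3))    # close to 0, but not exactly
```

Features on very different scales (for example area_mean in the hundreds and smoothness_mean below 1) would otherwise contribute very unevenly to the weighted sums inside the network.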

So far, everything is very familiar. We have used some new scikit-learn functions, but essentially we have been following these steps throughout the course.

The new part starts here:

# 3. Our first ANN using Keras


In [14]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


<div class="alert alert-block alert-success">**Start Activity**</div>

### <font color='blue'> Question 2: Initialise the ANN: Check page 28 of the book we use in this chapter and write the command to initialise our first ANN </font>

In [15]:
# Write Python Code here:
from numpy.random import seed
seed(4)
from tensorflow import set_random_seed
set_random_seed(9)

In [16]:
ann1 = Sequential()




### <font color='blue'> Question 3:  Add the input layer and first hidden layer: Using the function "add", create the input layer and a first hidden layer with 16 nodes and a ReLU activation function. Use the argument input_dim </font>
<p>
<font color='green'>Page 28 of our book</font>
<p>
<font color='green'>Read here how to use the function "add" by using the argument input_dim: 
<p>
    https://keras.io/getting-started/sequential-model-guide/</font>
<p>
<font color='green'> In addition, read:  
     https://keras.io/layers/merge/#add_1</font>
<p>
<font color='green'> Function "Dense":           
    https://keras.io/layers/core/#dense</font>

In [17]:
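# Write Python Code here:
# units (called output_dim in older Keras versions) is the number of nodes in the layer;
# input_dim = 30 matches the number of features, X.shape[1]
ann1.add(Dense(units = 16, kernel_initializer = 'uniform', input_dim = 30))
ann1.add(Activation('relu'))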





  


### <font color='blue'> Question 4:  Add the second hidden layer: Using the functions "add" and "Dense", create a second hidden layer with 16 nodes and a ReLU activation function.  </font>
<p>
<font color='green'>Page 28 of our book</font>
<p>
<font color='green'>Read here how to use the function "add" by using the argument input_dim: 
<p>
    https://keras.io/getting-started/sequential-model-guide/</font>
<p>
<font color='green'> In addition, read:  
     https://keras.io/layers/merge/#add_1</font>
<p>
<font color='green'> Function "Dense":           
    https://keras.io/layers/core/#dense</font>

In [18]:
# Write Python Code here:
ann1.add(Dense(units = 16, kernel_initializer = 'uniform', activation = 'relu'))

  


### <font color='blue'> Question 5:  Add the output layer using the functions "add" and "Dense". What activation function would you use and why?  </font>

In [19]:
# Write Python Code here:
ann1.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
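
To make the 30 -> 16 -> 16 -> 1 architecture concrete, the sketch below reproduces the forward pass of such a network in plain NumPy. It is an illustration only, not what Keras does internally: the weights are small random numbers in an arbitrarily chosen range, the input is random placeholder data, and the helper names (`relu`, `sigmoid`, `params`) are made up for this example.

```python
import numpy as np

def relu(z):
    # Keeps positive values, replaces negative values with 0
    return np.maximum(z, 0.0)

def sigmoid(z):
    # Squashes any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
layer_sizes = [30, 16, 16, 1]   # inputs, hidden 1, hidden 2, output

# One (weights, bias) pair per Dense layer
params = [
    (rng.uniform(-0.05, 0.05, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

x = rng.normal(size=(5, 30))                       # 5 placeholder patients, 30 features
h1 = relu(x @ params[0][0] + params[0][1])         # first hidden layer
h2 = relu(h1 @ params[1][0] + params[1][1])        # second hidden layer
p = sigmoid(h2 @ params[2][0] + params[2][1])      # output layer

n_params = sum(w.size + b.size for w, b in params)
print(p.shape)      # one value per patient
print(n_params)     # (30*16 + 16) + (16*16 + 16) + (16*1 + 1) = 785
```

The sigmoid output lies strictly between 0 and 1, so it can be read as the estimated probability of the positive class (malignant, encoded as 1). This is why a single sigmoid node is a natural choice for a two-class problem.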

  


### <font color='blue'> Question 6: Compile our model.  Use a gradient-descent-based optimizer such as "adam", "binary_crossentropy" as the loss function, and accuracy as our metric.  </font>

In [25]:
# Write Python Code here:
# Compiling the ANN
ann1.compile(optimizer = 'adam',
            loss = 'binary_crossentropy',
            metrics = ['accuracy']
            )
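
As a rough guide to what the "binary_crossentropy" loss measures, the sketch below computes it by hand for a few made-up predictions. The function name, the clipping constant `eps` and the example arrays are assumptions for illustration; Keras has its own implementation.

```python
import numpy as np

def binary_crossentropy(y_true, p, eps=1e-7):
    # Clip probabilities so that log(0) never occurs
    p = np.clip(p, eps, 1 - eps)
    # Average of -[y*log(p) + (1-y)*log(1-p)] over all samples
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([1, 0, 1, 0])
mostly_right = np.array([0.9, 0.1, 0.8, 0.2])   # high probability on the true class
mostly_wrong = np.array([0.1, 0.9, 0.2, 0.8])   # high probability on the wrong class

print(binary_crossentropy(y_true, mostly_right))
print(binary_crossentropy(y_true, mostly_wrong))
```

The loss is small when the predicted probability of the true class is high and grows quickly for confident but wrong predictions, which is the behaviour the optimizer exploits during training.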

### <font color='blue'> Question 7: Fit the ANN to the training set. Set batch_size=100 and epochs=150 (the epochs argument was called nb_epoch in older Keras versions). These hyper-parameters normally have to be tuned; the values given here have already been optimised. What accuracy does the model reach on the training set?</font>
<p>
<font color='blue'>Note: batch_size and epochs can be tuned using GridSearchCV or trial and error. We will tune these parameters in the course final assignment.
</font>

In [26]:
# Write Python Code here:
ann1.fit(X_train, y_train, epochs=150, batch_size=100)

Epoch 1/150
...
Epoch 150/150


<keras.callbacks.History at 0x1d1aeadb6a0>

### <font color='blue'> Question 8: Can you define batch_size and epochs (nb_epoch in older Keras versions)? What do batch_size=100 and epochs=150 mean here?</font>
<p>
<font color='blue'>Note: batch_size and epochs can be tuned using GridSearchCV or trial and error. We will tune these parameters in the course final assignment.
</font>

<b> Write answer here:</b>
#####################################################################################################################

(Double-click here)


#####################################################################################################################
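
The effect of the two settings can be worked out with simple arithmetic. The sketch below assumes the 455 training rows from the split above and the usual mini-batch behaviour, in which the last batch of an epoch holds whatever rows are left over:

```python
import math

n_train = 455      # rows in X_train
batch_size = 100   # rows used for each weight update
epochs = 150       # full passes over the training set

batches_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)   # weight updates in one epoch
print(last_batch_size)     # size of the final, smaller batch
print(total_updates)       # weight updates over the whole training run
```

With these numbers each epoch consists of 5 updates (four batches of 100 rows and one of 55), giving 750 weight updates in total.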

### <font color='blue'> Question 9: Calculate accuracy, confusion matrix and all the metrics included in classification_report function  </font>

In [29]:
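# Write Python Code here:
# Predicting the Test set results
# The sigmoid output is a probability; thresholding it at 0.5 gives the class labels
# (equivalent to predict_classes, which newer Keras releases no longer provide)
y_pred = (ann1.predict(X_test) > 0.5).astype('int32')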

In [32]:
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score

print("Confusion matrix for the test set (benign and malignant cases):")
cm = confusion_matrix(y_true = y_test, y_pred = y_pred)
print(cm)

Confusion matrix for the test set (benign and malignant cases):
[[71  1]
 [ 3 39]]


In [33]:
print("Precision, Recall, F1-score for positive and negative classes of the combined benign and malignant test set:")
print(classification_report(y_test, y_pred))
print('Accuracy for combined benign and malignant test: {:.3f}.'.format(accuracy_score(y_test, y_pred)))

Precision, Recall, F1-score for positive and negative classes of the combined benign and malignant test set:
             precision    recall  f1-score   support

          0       0.96      0.99      0.97        72
          1       0.97      0.93      0.95        42

avg / total       0.97      0.96      0.96       114

Accuracy for combined benign and malignant test: 0.965.
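
The figures in the classification report can be recomputed by hand from the confusion matrix shown earlier, which is a useful check when interpreting them. The sketch below hard-codes that matrix; in scikit-learn's layout the rows are the true classes and the columns are the predicted classes, with class 0 (benign) first.

```python
import numpy as np

cm = np.array([[71, 1],
               [3, 39]])          # rows: true class, columns: predicted class
tn, fp, fn, tp = cm.ravel()       # class 1 (malignant) is treated as positive

accuracy = (tp + tn) / cm.sum()
precision_1 = tp / (tp + fp)      # of the predicted malignant, how many were malignant
recall_1 = tp / (tp + fn)         # of the truly malignant, how many were found
precision_0 = tn / (tn + fn)      # same quantities for the benign class
recall_0 = tn / (tn + fp)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(accuracy, precision_1, recall_1, f1_1)
print(precision_0, recall_0)
```

In this run, 3 of the 42 malignant cases in the test set were predicted as benign (false negatives). In a clinical setting this is usually the more costly kind of error, so the recall of class 1 deserves as much attention as the overall accuracy.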


### <font color='blue'> Question 10: Write your conclusions about the performance and potential use of this classifier. </font>

<b> Write answer here:</b>
#####################################################################################################################

(Double-click here)


#####################################################################################################################

<div class="alert alert-block alert-warning">**End Activity**</div>