<a href="https://colab.research.google.com/github/AhmadM-DL/Intro2DL/blob/main/Introduction2DL_Part2_Artificial_Neural_Networks_with_SKlearn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Artificial Neural Networks with scikit-learn
<!-- 
By: Ahmad Mustapha - Machine Learning researcher @ AUB 
LinkedIn: ahmad-mustapha-ml
-->

[First we tinker with the NN](https://playground.tensorflow.org/#activation=relu&batchSize=10&dataset=spiral&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.81394&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false&playButton_hide=false)

### Machine Learning Task

In this lab session we dive into deep learning using scikit-learn's Multi-Layer Perceptron (MLP) classes.

We will use the [Pima Indians Diabetes Dataset](https://www.openml.org/d/37). The task is to classify whether a female patient of Pima Indian heritage shows signs of diabetes according to World Health Organization criteria (i.e., whether the 2-hour post-load plasma glucose was at least 200 mg/dl). All patients were at least 21 years old.

The dataset contains 768 records, each with 8 features (predictors) and a binary target indicating diabetes or no diabetes. The predictors are:
  1. Number of times pregnant
  2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
  3. Diastolic blood pressure (mm Hg)
  4. Triceps skin fold thickness (mm)
  5. 2-Hour serum insulin (mu U/ml)
  6. Body mass index (weight in kg/(height in m)^2)
  7. Diabetes pedigree function
  8. Age (years)

The dataset is unbalanced: 500 records are labeled negative and 268 are labeled positive.
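
Because the classes are unbalanced, accuracy on its own can be misleading. As a quick sanity check, the sketch below uses only the class counts quoted above to compute the accuracy of a trivial classifier that always predicts the majority class. Any trained model should beat this figure. The variable names are illustrative.

```python
# Class counts quoted in the dataset description.
n_negative = 500
n_positive = 268
n_total = n_negative + n_positive

# Share of each class in the dataset.
positive_rate = n_positive / n_total
negative_rate = n_negative / n_total

# A classifier that always predicts the majority class is right
# exactly as often as that class occurs.
majority_baseline_accuracy = max(positive_rate, negative_rate)

print(f"Positive rate: {positive_rate:.1%}")
print(f"Majority-class baseline accuracy: {majority_baseline_accuracy:.1%}")
```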

 


In [None]:
# Get Diabetes Data
!wget https://raw.githubusercontent.com/plotly/datasets/master/diabetes.csv

In [24]:
# Imports

# Import required libraries
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt

import sklearn
from sklearn.neural_network import MLPClassifier
from sklearn.neural_network import MLPRegressor

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

### Exploring/ Preparing the dataset

In [None]:
# Load Dataset
df = pd.read_csv('diabetes.csv') 

In [None]:
df.head()

In [None]:
df.describe().transpose()

In [49]:
## Prepare Data

# Separate target and predictors as NumPy arrays (the target is the last column)
features, targets = df.iloc[:, :-1].values, df.iloc[:, -1].values

# Scale features (scaling is important for ANNs)
scaler = StandardScaler()
features = scaler.fit_transform(features)

# Split train/test
X_train, X_test, y_train, y_test = train_test_split(features, targets, test_size=0.30, random_state=40)
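
To make the scaling step concrete, here is a minimal sketch of what standardization does to a single feature column: subtract the mean and divide by the standard deviation. It is written in plain Python with made-up glucose values (an assumption for illustration, not rows from the dataset). Unlike the cell above, it estimates the mean and standard deviation from the training values only and reuses them for the test values, which is one common way to keep test statistics out of preprocessing.

```python
from statistics import fmean, pstdev

# Hypothetical glucose readings, for illustration only.
train_values = [85.0, 120.0, 148.0, 183.0, 99.0]
test_values = [110.0, 160.0]

# Estimate the statistics on the training split only.
mean = fmean(train_values)
std = pstdev(train_values)  # population standard deviation (divides by n)

def standardize(values, mean, std):
    """Return z-scores: (value - mean) / std."""
    return [(v - mean) / std for v in values]

train_scaled = standardize(train_values, mean, std)
test_scaled = standardize(test_values, mean, std)  # reuse training statistics

print([round(z, 3) for z in train_scaled])
print([round(z, 3) for z in test_scaled])
```

After this transformation the training column has zero mean and unit variance, while the test column is only approximately standardized, which is expected.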

In [None]:
# The data manifold 
from sklearn.manifold import TSNE
#from sklearn.decomposition import PCA
tsne = TSNE(n_components=2, perplexity=30)
reduced_features = tsne.fit_transform(features)

fig = plt.figure(figsize=(5,5))
p_reduced_features = [row for i,row in enumerate(reduced_features) if targets[i]==1]
n_reduced_features = [row for i,row in enumerate(reduced_features) if targets[i]==0]

plt.scatter([f1 for (f1, _) in n_reduced_features], [f2 for (_, f2) in n_reduced_features], label="Diabetes Negative", c="green")
plt.scatter([f1 for (f1, _) in p_reduced_features], [f2 for (_, f2) in p_reduced_features], label="Diabetes Positive", c="red")

plt.title("Pima Indians Diabetes Data Manifold")
plt.xlabel("tsne_1")
plt.ylabel("tsne_2")
plt.legend()
plt.show()

### Baseline

In this section we initialize and train scikit-learn's [MLPClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html). We start with it to ease the learning curve; however, scikit-learn's API hides many of the underlying details of deep learning.


In [67]:
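# ANN classifier with a single hidden layer of 4 units
# (note the trailing comma: (4,) is a tuple, while (4) is just the integer 4)
mlp = MLPClassifier(hidden_layer_sizes=(4,), activation='relu', solver='adam', max_iter=500)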

# Train
mlp.fit(X_train, y_train)

# Test
predict_train = mlp.predict(X_train)
predict_test = mlp.predict(X_test)

In [None]:
print(classification_report(y_test, predict_test))

### Using Keras

<img src="https://keras.io/img/logo.png" width="330"></img>

Keras is a high-level deep learning library. According to their website:

> Deep learning for humans. Keras is an API designed for human beings, not machines. Keras follows best practices for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear & actionable error messages. It also has extensive documentation and developer guides.

Keras is built on TensorFlow (a low-level deep learning library).






Keras offers different neural-network building blocks. Those are available through the [keras.models](https://keras.io/api/models/model/) module. The [Sequential](https://keras.io/api/models/sequential/) model stacks layers linearly, so that computations flow forward, and gradients backward, from one layer to the next. In this lecture we start with the [Dense](https://keras.io/api/layers/core_layers/dense/) layer building block, which is used to create fully connected layers. Remember that an MLP is a linear stack of fully connected layers.

The Dense constructor takes several parameters; the following are of interest for now:
*   units: The number of units (neurons) in the layer
*   activation: The [activation](https://keras.io/api/layers/activations/) function applied to the weighted sum of the inputs (e.g., relu, sigmoid, tanh)
*   input_dim: The number of input features; used only for the first layer
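
As a rough mental model, a Dense layer computes `activation(inputs · weights + bias)` for each unit. The sketch below spells this out in plain Python for a layer with 3 inputs and 2 units. The weights, biases, and input values are arbitrary numbers chosen for illustration, and the helper names (`relu`, `dense_forward`) are hypothetical, not part of Keras.

```python
def relu(z):
    """Rectified linear unit: negative values become zero."""
    return max(0.0, z)

def dense_forward(inputs, weights, biases, activation):
    """Forward pass of one fully connected layer.

    weights[i][j] connects input i to unit j; biases[j] belongs to unit j.
    """
    outputs = []
    for j, bias in enumerate(biases):
        # Weighted sum of all inputs for unit j, plus its bias.
        z = sum(x * weights[i][j] for i, x in enumerate(inputs)) + bias
        outputs.append(activation(z))
    return outputs

# Arbitrary illustrative values: 3 input features, 2 units.
inputs = [1.0, 2.0, 0.5]
weights = [[0.2, -0.5],
           [0.4, 0.1],
           [-0.6, 0.3]]
biases = [0.1, 0.0]

layer_output = dense_forward(inputs, weights, biases, relu)
print(layer_output)
```

The first unit ends up with a positive weighted sum and passes it through unchanged, while the second unit's sum is negative and is clipped to zero by ReLU.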

In [106]:
# Imports 
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import plot_model  # keras.utils.vis_utils is not available in newer Keras releases
import tensorflow as tf
%load_ext tensorboard

In [90]:
# define the keras model
model = Sequential()
model.add(Dense(4, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

We can inspect the built model using the model's `summary` method or the Keras utility function `plot_model`.

In [None]:
model.summary()
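
The parameter counts that `model.summary()` reports can be checked by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. A small sketch of that arithmetic for the two layers defined above (the helper name `dense_param_count` is hypothetical):

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_params = dense_param_count(n_inputs=8, n_units=4)  # Dense(4, input_dim=8)
output_params = dense_param_count(n_inputs=4, n_units=1)  # Dense(1)
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 36 5 41
```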

In [None]:
plot_model(model, show_shapes=True, show_layer_names=False)

We prepare the model for training using the model's *compile* method, which sets the properties of the training process. Of most interest for now:
*   Loss function: Computes the error between the model's predictions and the ground truth.
*   Optimizer: The optimization method to use, such as stochastic gradient descent (SGD) or adaptive moment estimation (Adam).
*   Metrics: The metrics we want to report. For classification we report accuracy; for regression we might report mean squared error (MSE).
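
To make the loss concrete, here is a minimal plain-Python sketch of binary cross-entropy, the loss commonly used for a two-class problem with a sigmoid output. For a label y and a predicted probability p, it averages -[y·log(p) + (1-y)·log(1-p)] over the records. The labels and probabilities are made up, and the clipping constant `eps` is an assumption added to avoid log(0); Keras's own implementation may differ in detail.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch of records."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities away from 0 and 1 so the logarithm is defined.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Made-up labels and predicted probabilities of the positive class.
y_true = [1, 0, 1, 0]
confident_and_right = [0.9, 0.1, 0.8, 0.2]
confident_and_wrong = [0.1, 0.9, 0.2, 0.8]

loss_right = binary_cross_entropy(y_true, confident_and_right)
loss_wrong = binary_cross_entropy(y_true, confident_and_wrong)

print(loss_right)
print(loss_wrong)
```

The loss is small when confident predictions agree with the labels and grows quickly when confident predictions are wrong, which is what pushes the optimizer toward better-calibrated outputs.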



In [102]:
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

We finally call the model's `fit` method to train it on the data. `fit` takes the following parameters:
*   x: Training data (predictors)
*   y: Training ground truth (targets)
*   epochs: The number of times the model passes over the entire training set
*   batch_size: The number of instances the model is trained on at a time. The batch size is an important parameter: by using batches we can train on huge datasets, because only a small sample of the records is loaded into the memory of the computing device (CPU, GPU) at once. When a batch of records is passed forward through the model, the loss is computed for each record, averaged, and fed back to update the weights through backpropagation.
*   callbacks: A list of callbacks. A callback is a function that fires during training according to a predefined trigger. You don't need to worry about them much, as we will use built-in ones. One callback of interest is the TensorBoard callback: it takes the metrics reported by the model (see compile above) and logs them to a given directory. We can later view the logs using TensorBoard.
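
The batching described above can be sketched in plain Python. The helper `iter_batches` is hypothetical and only illustrates how one epoch is cut into batches. The training-set size of 537 is an assumption (roughly 70% of the 768 records), and the shuffling that Keras applies by default is omitted here.

```python
def iter_batches(n_records, batch_size):
    """Yield (start, stop) index pairs covering all records once."""
    for start in range(0, n_records, batch_size):
        yield start, min(start + batch_size, n_records)

n_train = 537     # assumed training-set size (about 70% of 768 records)
batch_size = 200
epochs = 500

batches = list(iter_batches(n_train, batch_size))
updates_per_epoch = len(batches)          # one weight update per batch
total_updates = updates_per_epoch * epochs

print(batches)
print(updates_per_epoch, total_updates)
```

Under these assumptions the last batch is smaller than the others, and a larger `batch_size` means fewer, less noisy weight updates per epoch.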





In [103]:
tensorboard_callback = tf.keras.callbacks.TensorBoard("./runs", histogram_freq=1)

In [None]:
# fit the keras model on the dataset
model.fit(X_train, y_train, epochs=500, batch_size=200, callbacks=[tensorboard_callback])

In [None]:
# evaluate the keras model
_, accuracy = model.evaluate(X_test, y_test)
print('Accuracy: %.2f' % (accuracy*100))  

In [None]:
%tensorboard --logdir "./runs"