## Aim

This notebook aims to build, compile and fit a neural network model to the Iris dataset. 
We will implement validation, regularisation and callbacks to improve the model.

In [None]:
#### PACKAGE IMPORTS ####

from numpy.random import seed
seed(92)
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, model_selection 


# If you would like to make further imports from tensorflow, add them here
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from tensorflow.keras import initializers

## The Iris dataset

We will use the [Iris dataset](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html). It contains 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. For a reference, see the following paper:

- R. A. Fisher. "The use of multiple measurements in taxonomic problems". Annals of Eugenics. 7 (2): 179–188, 1936.

Our goal is to construct a neural network that classifies each sample into the correct class, and to apply validation and regularisation techniques along the way.

In [None]:
#To connect to the database while programming in Kaggle, we use the following setup:
datafile = "../input/Iris.csv"
#database = "../input/Database.sqlite"

#Alternatively, we can use a built-in function `sklearn.datasets.load_iris()`


In [None]:
iris_data = datasets.load_iris()

targets = iris_data.target
data = iris_data.data    

train_data, test_data, train_targets, test_targets = train_test_split(data, targets, test_size=0.1)

In [None]:
## We will now convert the training and test targets using a one hot encoder.

train_targets = tf.keras.utils.to_categorical(np.array(train_targets))
test_targets = tf.keras.utils.to_categorical(np.array(test_targets))
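
As a rough illustration of what the one-hot conversion above produces, here is a minimal pure-Python sketch. The helper `one_hot` is hypothetical and only mimics the documented behaviour of `to_categorical` for integer labels; it is not the Keras implementation.

```python
def one_hot(labels, num_classes=None):
    """Return one row per integer label, with a 1.0 in the label's column."""
    if num_classes is None:
        # Like to_categorical, infer the class count from the largest label
        num_classes = max(labels) + 1
    return [[1.0 if k == label else 0.0 for k in range(num_classes)]
            for label in labels]

encoded = one_hot([0, 2, 1])
print(encoded)  # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
```

Because the class count is inferred from the largest label, a small test split that happens to miss the highest class would yield fewer columns. Passing `num_classes=3` explicitly to `to_categorical` is one way to guard against that.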

## Sample Model

We now define a function that returns a sample model. 
The sample model has the following characteristics:
* The model will use the `input_shape` in the function argument to set the input size in the first layer.
* The first layer will be a dense layer with 64 units.
* The weights of the first layer will be initialised with the He uniform initializer.
* The biases of the first layer will be all initially equal to one.
* There will be a further four dense layers, each with 128 units.
* This will be followed with four dense layers, each with 64 units.
* All of these Dense layers will use the ReLU activation function.
* The output Dense layer will have 3 units and the softmax activation function.
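
To get a feel for the size of the network described in the list above, the sketch below counts its trainable parameters by hand. The helper `dense_param_count` is a hypothetical name introduced for illustration; it assumes four input features (the Iris measurements) and one bias per unit in every Dense layer.

```python
def dense_param_count(n_inputs, layer_units):
    """Count the weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_units:
        total += fan_in * units + units  # kernel entries plus one bias per unit
        fan_in = units
    return total

# First layer, four 128-unit layers, four 64-unit layers, 3-unit output
units = [64] + [128] * 4 + [64] * 4 + [3]
total_params = dense_param_count(4, units)
print(total_params)  # 79107
```

Roughly 79,000 parameters for a training set of at most 135 samples suggests the model has far more capacity than the data can constrain, which is worth keeping in mind when reading the learning curves.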


In [None]:
## Sample NN model

def get_model(input_shape):
    model=tf.keras.Sequential([
        Dense(64, activation='relu', input_shape=(input_shape),
              kernel_initializer=tf.keras.initializers.he_uniform(),
              bias_initializer=initializers.Ones()),
        Dense(128,activation='relu'),
        Dense(128,activation='relu'),
        Dense(128,activation='relu'),
        Dense(128,activation='relu'),
        
        Dense(64, activation='relu'),
        Dense(64, activation='relu'),
        Dense(64, activation='relu'),
        Dense(64, activation='relu'),
        
        Dense(3, activation='softmax')
    ])
    return model
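
The He uniform initializer used for the first layer draws weights from a uniform distribution on `[-limit, limit]` with `limit = sqrt(6 / fan_in)`. The sketch below reproduces that rule in plain Python for the 4-input, 64-unit first layer; the helper names are hypothetical and the sampling uses Python's `random` module rather than TensorFlow's generator.

```python
import math
import random

def he_uniform_limit(fan_in):
    """Half-width of the He uniform sampling interval."""
    return math.sqrt(6.0 / fan_in)

def he_uniform_sample(fan_in, fan_out, rng):
    """Draw a fan_in x fan_out kernel from U(-limit, limit)."""
    limit = he_uniform_limit(fan_in)
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

rng = random.Random(0)           # fixed seed so the sketch is repeatable
limit = he_uniform_limit(4)      # four input features
kernel = he_uniform_sample(4, 64, rng)
print(round(limit, 4))  # 1.2247
```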

In [None]:
model = get_model(train_data[0].shape)
model.summary()  # summary() prints the table itself and returns None

In [None]:
## Compile the model:
model.compile(loss='mse', optimizer="adam", metrics=["mse","mae","accuracy"])
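
The compile step above pairs a softmax output with mean squared error. That works, although categorical cross-entropy (`loss='categorical_crossentropy'` in Keras) is the more common pairing with one-hot targets. The sketch below compares the two losses on hypothetical predictions for a single sample; the numbers are illustrative only and are not taken from this notebook.

```python
import math

def mse(target, pred):
    """Mean squared error over the class probabilities of one sample."""
    return sum((t - p) ** 2 for t, p in zip(target, pred)) / len(target)

def categorical_crossentropy(target, pred, eps=1e-7):
    """Cross-entropy for one sample; eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))

target = [0.0, 1.0, 0.0]          # one-hot label for the second class
confident = [0.05, 0.90, 0.05]    # hypothetical good prediction
wrong = [0.90, 0.05, 0.05]        # hypothetical confident mistake

for name, pred in [("confident", confident), ("wrong", wrong)]:
    print(name, round(mse(target, pred), 4),
          round(categorical_crossentropy(target, pred), 4))
```

Mean squared error stays bounded for a confident mistake, whereas cross-entropy grows without bound as the probability of the true class approaches zero, so it penalises such mistakes more heavily.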

## Fitting the model

Now we will train the model on the Iris dataset, using the model's `fit` method. 
* The training will run for a fixed number of epochs, given by the `epochs` argument.
* We will keep the returned training history to plot the learning curves.
* We set the batch size to 40 and the validation set to be 40% of the training set.
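
To make the split sizes concrete, the sketch below works out how many samples end up in training and validation. It assumes 135 samples reach `fit` (150 minus the 15 held out for testing) and that Keras takes the validation samples from the end of the arrays, before shuffling, as its documentation describes. The helper `fit_split` is hypothetical, and the exact rounding Keras applies is an assumption here.

```python
import math

def fit_split(n_samples, validation_split, batch_size):
    """Return (n_train, n_val, steps_per_epoch) for a given validation_split."""
    # Leading samples are kept for training; the tail becomes validation data
    n_train = math.floor(n_samples * (1.0 - validation_split))
    n_val = n_samples - n_train
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, steps_per_epoch

print(fit_split(135, 0.40, 40))  # (81, 54, 3)
print(fit_split(135, 0.15, 40))  # (114, 21, 3)
```

The two calls correspond to the `validation_split` values of 0.40 and 0.15 that appear in the `fit` calls of this notebook. With a batch size of 40, both give three gradient updates per epoch.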

In [None]:
history = model.fit( train_data, train_targets, epochs=1000, batch_size=40, validation_split=0.40)

We will now plot two graphs:

* Epoch vs accuracy
* Epoch vs loss

In [None]:
try:
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
except KeyError:
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show() 

In [None]:
#Run this cell to plot the epoch vs loss graph
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show() 

## Oh No!

It seems that we have *overfit* our dataset. We will now try to mitigate this overfitting by regularisation.

The specs for the new regularised model are the same as our original model, with the addition of two dropout layers, weight decay, and a batch normalisation layer. 

In particular, we will

* add a dropout layer after the 3rd Dense layer
* add two more Dense layers with 128 units before a batch normalisation layer
* add two more Dense layers with 64 units and then another Dropout layer
* add two more Dense layers with 64 units and then the final 3-way softmax layer
* add weight decay (l2 kernel regularisation) in all Dense layers except the final softmax layer
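
As a rough sketch of what two of the regularisers listed above do numerically: an L2 kernel regulariser adds `weight_decay * sum(w^2)` to the loss, and dropout zeroes a random fraction of activations during training while rescaling the rest. The helpers below are hypothetical pure-Python stand-ins, not the Keras layers themselves.

```python
import random

def l2_penalty(weights, weight_decay):
    """Penalty added to the loss for one kernel: weight_decay * sum(w^2)."""
    return weight_decay * sum(w * w for w in weights)

def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors
    by 1 / (1 - rate) so the expected activation is unchanged."""
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]

penalty = l2_penalty([0.5, -1.0, 2.0], weight_decay=0.001)
print(round(penalty, 6))  # 0.00525

rng = random.Random(0)  # fixed seed so the sketch is repeatable
dropped = inverted_dropout([1.0] * 10, rate=0.3, rng=rng)
print(dropped)
```

The penalty grows with the squared size of the weights, which nudges the optimiser towards smaller weights. Dropout is only active during training; at evaluation time the activations pass through unchanged.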

In [None]:
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Dropout, BatchNormalization

def get_regularised_model(input_shape, dropout_rate, weight_decay):
    model=tf.keras.Sequential([
        Dense(64, activation='relu', input_shape=(input_shape),
             kernel_initializer=tf.keras.initializers.he_uniform(),
             bias_initializer=initializers.Ones(),
             kernel_regularizer=regularizers.l2(weight_decay)),
        Dense(128,activation='relu', kernel_regularizer=regularizers.l2(weight_decay)),
        Dense(128,activation='relu',kernel_regularizer=regularizers.l2(weight_decay)),
        Dropout(dropout_rate),
        Dense(128,activation='relu',kernel_regularizer=regularizers.l2(weight_decay)),
        Dense(128,activation='relu',kernel_regularizer=regularizers.l2(weight_decay)),
        # Batch normalisation after the 128-unit block, as listed in the spec above
        BatchNormalization(),
        
        Dense(64, activation='relu',kernel_regularizer=regularizers.l2(weight_decay)),
        Dense(64, activation='relu',kernel_regularizer=regularizers.l2(weight_decay)),
        Dropout(dropout_rate),
        Dense(64, activation='relu',kernel_regularizer=regularizers.l2(weight_decay)),
        Dense(64, activation='relu',kernel_regularizer=regularizers.l2(weight_decay)),
        
        Dense(3, activation='softmax')
    ])
    return model
    

#### Instantiate, compile and train the model

In [None]:
# Instantiate the model, using a dropout rate of 0.3 and weight decay coefficient of 0.001

reg_model = get_regularised_model(train_data[0].shape, 0.3, 0.001)

# Compile the model
reg_model.compile(loss='mse', optimizer="adam", metrics=["mse","mae","accuracy"])

In [None]:
history2 = reg_model.fit( train_data, train_targets, epochs=1000, batch_size=40, validation_split=0.40)

In [None]:
#Run this cell to plot the new accuracy vs epoch graph

try:
    plt.plot(history2.history['accuracy'])
    plt.plot(history2.history['val_accuracy'])
except KeyError:
    plt.plot(history2.history['acc'])
    plt.plot(history2.history['val_acc'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show() 

In [None]:
#Run this cell to plot the new loss vs epoch graph

plt.plot(history2.history['loss'])
plt.plot(history2.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show() 

We can see that the regularisation has helped to reduce the overfitting of the network.
We will now incorporate *callbacks* into a new training run that implements early stopping and learning rate reduction on plateaux.

The function below performs the following:

* It creates an `EarlyStopping` callback object and a `ReduceLROnPlateau` callback object
* The early stopping callback is used and monitors validation loss with the mode set to `"min"` and patience of 30.
* The learning rate reduction on plateaux is used with a learning rate factor of 0.2 and a patience of 20.
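
The interplay of the two patience values can be seen by replaying a validation-loss curve through simplified versions of both rules. The sketch below is an assumption-laden approximation: it ignores `min_delta`, `cooldown` and `min_lr`, the loss curve is synthetic, the starting learning rate of 0.001 is assumed, and `simulate_callbacks` is a hypothetical helper rather than Keras code.

```python
def simulate_callbacks(val_losses, stop_patience, lr_patience, factor, lr):
    """Replay a loss curve; return (epochs_run, final_lr)."""
    best = float("inf")
    wait_stop = wait_lr = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Improvement: remember it and reset both patience counters
            best = loss
            wait_stop = wait_lr = 0
            continue
        wait_stop += 1
        wait_lr += 1
        if wait_lr >= lr_patience:
            lr *= factor        # reduce the learning rate on a plateau
            wait_lr = 0
        if wait_stop >= stop_patience:
            return epoch, lr    # early stopping ends training here
    return len(val_losses), lr

# Synthetic curve: two improving epochs, then a long plateau
losses = [1.0, 0.5] + [0.6] * 40
epochs_run, final_lr = simulate_callbacks(
    losses, stop_patience=30, lr_patience=20, factor=0.2, lr=0.001)
print(epochs_run, final_lr)
```

On this curve the learning rate is cut once, after 20 stalled epochs, and training stops after 30 stalled epochs, at epoch 32. Choosing a learning-rate patience shorter than the early-stopping patience gives the smaller learning rate a chance to help before training is abandoned.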

In [None]:
def get_callbacks():

    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=30, mode='min')
    
    learning_rate_reduction = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=20)
    
    return early_stopping, learning_rate_reduction


In [None]:
## Instantiate the model which incorporates the callbacks:

call_model = get_regularised_model(train_data[0].shape, 0.3, 0.0001)
call_model.compile(loss='mse', optimizer="adam", metrics=["mse","mae","accuracy"])


In [None]:

early_stopping, learning_rate_reduction = get_callbacks()
call_history = call_model.fit(train_data, train_targets, epochs=800, validation_split=0.15,
                         callbacks=[early_stopping, learning_rate_reduction], verbose=0)

In [None]:
#Run this cell to plot the new accuracy vs epoch graph

try:
    plt.plot(call_history.history['accuracy'])
    plt.plot(call_history.history['val_accuracy'])
except KeyError:
    plt.plot(call_history.history['acc'])
    plt.plot(call_history.history['val_acc'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')
plt.show() 

In [None]:
#Run this cell to plot the new loss vs epoch graph

plt.plot(call_history.history['loss'])
plt.plot(call_history.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show() 

In [None]:
# Evaluate the model on the test set
results = call_model.evaluate(test_data, test_targets, verbose=0)
# `results` holds the loss value followed by the metrics ["mse", "mae", "accuracy"], in the order set at the compile stage.

print("Test loss: {:.3f}\nTest accuracy: {:.2f}%".format(results[0], 100 * results[3]))
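
Indexing `results` by position is easy to get wrong, so one option is to pair each value with its name. The sketch below uses a hypothetical helper `label_results` and placeholder numbers that are for illustration only; they are not results from this notebook.

```python
def label_results(metric_names, values):
    """Pair each evaluation value with its metric name."""
    if len(metric_names) != len(values):
        raise ValueError("expected one value per metric name")
    return dict(zip(metric_names, values))

# Order follows the compile call: the loss first, then the listed metrics
names = ["loss", "mse", "mae", "accuracy"]
placeholder_values = [0.03, 0.02, 0.05, 0.9]  # made-up numbers, not real output
labelled = label_results(names, placeholder_values)
print("Test accuracy: {:.2f}%".format(100 * labelled["accuracy"]))
```

Note that with L2 kernel regularisers in the model, the reported loss includes the regularisation penalties, so it can be larger than the `mse` metric even though the loss function is also mean squared error.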

The following code gives generalised examples of custom callbacks. The previous examples may be modified accordingly.

In [None]:
# #### Example training callback
# Write a custom callback
from tensorflow.keras.callbacks import Callback

class TrainingCallback(Callback):
    def on_train_begin(self, logs=None):
        print ("Starting training...")
    def on_epoch_begin(self, epoch, logs=None):
        print(f"Starting epoch {epoch}")
    def on_train_batch_begin(self, batch, logs=None):
        print(f"Training: Starting batch {batch}")
    def on_train_batch_end(self, batch, logs=None):
        print(f"Training: Finished batch {batch}")
    def on_epoch_end(self, epoch, logs=None):
        print(f"Training: Finished epoch {epoch}")
    def on_train_end(self, logs=None):
        print ("Finished training.")

class TestingCallback(Callback):
    def on_test_begin(self, logs=None):
        print ("Starting testing...")
    def on_test_batch_begin(self, batch, logs=None):
        print(f"Testing: Starting batch {batch}")
    def on_test_batch_end(self, batch, logs=None):
        print(f"Testing: Finished batch {batch}")
    def on_test_end(self, logs=None):
        print ("Finished testing.")

class PredcitionCallback(Callback):
    def on_predict_begin(self, logs=None):
        print ("Starting predicting...")
    def on_predict_batch_begin(self, batch, logs=None):
        print(f"Predicting: Starting batch {batch}")
    def on_predict_batch_end(self, batch, logs=None):
        print(f"Predicting: Finished batch {batch}")
    def on_predict_end(self, logs=None):
        print ("Finished predicting.")
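
The order in which the training hooks fire can be illustrated without TensorFlow. The toy loop below is a simplified stand-in for `fit`: `RecordingCallback` and `toy_fit` are hypothetical names, the loop performs no actual training, and it omits the validation pass that a real `fit` call with `validation_split` would run each epoch.

```python
class RecordingCallback:
    """Stand-in for a Keras Callback that records which hooks fire."""
    def __init__(self):
        self.events = []
    def on_train_begin(self):
        self.events.append("train_begin")
    def on_epoch_begin(self, epoch):
        self.events.append(f"epoch_begin {epoch}")
    def on_train_batch_begin(self, batch):
        self.events.append(f"batch_begin {batch}")
    def on_train_batch_end(self, batch):
        self.events.append(f"batch_end {batch}")
    def on_epoch_end(self, epoch):
        self.events.append(f"epoch_end {epoch}")
    def on_train_end(self):
        self.events.append("train_end")

def toy_fit(callback, epochs, batches_per_epoch):
    """Call the hooks in the order a training loop would."""
    callback.on_train_begin()
    for epoch in range(epochs):
        callback.on_epoch_begin(epoch)
        for batch in range(batches_per_epoch):
            callback.on_train_batch_begin(batch)
            # A real loop would run one gradient step here
            callback.on_train_batch_end(batch)
        callback.on_epoch_end(epoch)
    callback.on_train_end()

recorder = RecordingCallback()
toy_fit(recorder, epochs=1, batches_per_epoch=2)
print(recorder.events)
```

Epoch and batch indices start at zero, which matches the numbers the callbacks above print.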


## Sample calls

In [None]:
history = call_model.fit(train_data, train_targets, epochs=5, 
                      validation_split=0.15, 
                      batch_size=128, 
                      verbose=0,
                      callbacks=[TrainingCallback()])
# Evaluate the model
call_model.evaluate(test_data , test_targets, verbose=0, callbacks=[TestingCallback()])

# Make predictions with the model
call_model.predict(test_data, verbose=0, callbacks=[PredcitionCallback()])




## Sample calls with early stopping

In [None]:
## Training with early stopping:
# Re-train the regularised model
history = call_model.fit(
    train_data, train_targets, 
    epochs=100, validation_split=0.15, batch_size=64,
    verbose=2, callbacks=[tf.keras.callbacks.EarlyStopping()])