# Activation and Loss Functions

### To complete this assignment, create a Jupyter notebook containing your solutions to the following tasks:

1. In this task, you'll implement several ANN models with different activation functions. Specifically:
   1. Implement a three-layer ANN model with 128, 64, and 10 neurons in the layers, using the tanh activation function for each layer.
   1. Implement a three-layer ANN model with 128, 64, and 10 neurons in the layers, using the sigmoid activation function for each layer.
   1. Implement a three-layer ANN model with 128, 64, and 10 neurons in the layers, using the ReLU activation function for each layer.
   1. Compare the results of the models with each other. Which activation function performed best?
1. In this task, you'll implement the ANN models specified below, using hinge loss as the loss function:
   1. Implement a three-layer ANN model with 128, 64, and 10 neurons in the layers, using the tanh activation function for each layer.
   1. Implement a three-layer ANN model with 128, 64, and 10 neurons in the layers, using the sigmoid activation function for each layer.
   1. Implement a three-layer ANN model with 128, 64, and 10 neurons in the layers, using the ReLU activation function for each layer.
   1. Compare the result of each model with the result of the same model from the previous task. Which loss function performed better?

### Lib imports and data load

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

In [2]:
# Load MNIST: 60,000 training and 10,000 test images of 28x28 pixels
(X_train, y_train), (X_test, y_test) = mnist.load_data()

input_dim = 784  # 28*28 pixels per flattened image
output_dim = nb_classes = 10  # digits 0-9
batch_size = 128

# Flatten each image to a vector and scale pixel values from [0, 255] to [0, 1]
X_train = X_train.reshape(-1, input_dim).astype('float32') / 255
X_test = X_test.reshape(-1, input_dim).astype('float32') / 255

# One-hot encode the integer labels
Y_train = to_categorical(y_train, nb_classes)
Y_test = to_categorical(y_test, nb_classes)
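
To illustrate roughly what this preprocessing produces, the sketch below applies the same steps to a tiny made-up batch using NumPy only. The array `fake_images` and the helper `one_hot` are hypothetical stand-ins (the notebook itself uses `mnist.load_data()` and `to_categorical`), so this approximates the behavior rather than reproducing the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical NumPy stand-in for to_categorical: integer labels -> one-hot rows."""
    return np.eye(num_classes, dtype='float32')[np.asarray(labels)]

# Made-up batch of two 28x28 "images": one all white (255), one all black (0)
fake_images = np.full((2, 28, 28), 255, dtype='uint8')
fake_images[1] = 0
fake_labels = [3, 7]

# Same steps as the notebook cell: flatten, cast to float32, scale to [0, 1]
X = fake_images.reshape(-1, 28 * 28).astype('float32') / 255
Y = one_hot(fake_labels, 10)

print(X.shape)  # (2, 784)
print(Y[0])     # 1.0 at index 3, zeros elsewhere
```

Each row of `Y` has a single 1 in the column of its class, which is the target format both loss functions below are trained against.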

### Model Set 1 - Loss: 'categorical_crossentropy'

In [3]:
# Train one model per activation function and report train/test scores
for actv_func in ['tanh', 'sigmoid', 'relu']:
    # As the assignment asks, every layer (including the output layer) uses actv_func
    model = Sequential()
    model.add(Dense(128, input_shape=(input_dim,), activation=actv_func))
    model.add(Dense(64, activation=actv_func))
    model.add(Dense(output_dim, activation=actv_func))

    mtrcs = ['accuracy']
    loss_fn = 'categorical_crossentropy'
    model.compile(optimizer='adam', loss=loss_fn, metrics=mtrcs)
    model.fit(X_train, Y_train, batch_size=batch_size, epochs=20, verbose=0)

    # evaluate() returns [loss, accuracy], matching the two index rows below
    train_score = model.evaluate(X_train, Y_train, verbose=0)
    test_score = model.evaluate(X_test, Y_test, verbose=0)

    scoredf = pd.DataFrame(index=[f"loss-function: {loss_fn}"] + ["metrics: " + i for i in mtrcs],
                           columns=['train-score', 'test-score'])
    scoredf['train-score'] = train_score
    scoredf['test-score'] = test_score
    display(f"activation function: {actv_func}",
            scoredf)

'activation function: tanh'

| | train-score | test-score |
|---|---|---|
| loss-function: categorical_crossentropy | 7.893569 | 7.833394 |
| metrics: accuracy | 0.09035 | 0.0892 |


'activation function: sigmoid'

| | train-score | test-score |
|---|---|---|
| loss-function: categorical_crossentropy | 0.009655 | 0.069892 |
| metrics: accuracy | 0.998917 | 0.9786 |


'activation function: relu'

| | train-score | test-score |
|---|---|---|
| loss-function: categorical_crossentropy | NaN | NaN |
| metrics: accuracy | 0.098717 | 0.098 |


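<span style="color:blue">Sigmoid performs best, with the highest accuracy (0.9786 on the test set) and the lowest loss on both the training and test data. The tanh and ReLU models stay near chance-level accuracy (about 10%), and the ReLU model has no valid loss value. A likely cause is that categorical cross-entropy expects non-negative, probability-like outputs, whereas a tanh output layer can produce negative values and a ReLU output layer can produce all zeros; a softmax output layer is the usual pairing for this loss.</span>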
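
The tanh and ReLU results may be easier to interpret with a small numerical sketch of categorical cross-entropy. The function below is a NumPy approximation, not the Keras source: it assumes that non-softmax outputs are divided by their sum and clipped to `[eps, 1 - eps]` before the logarithm is taken. The pre-activation vectors are made up for illustration.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy sketch: normalize, clip, then -sum(t * log(p)).

    The normalize-and-clip steps are an assumption about Keras's behavior.
    """
    y_pred = np.asarray(y_pred, dtype='float64')
    with np.errstate(divide='ignore', invalid='ignore'):
        y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)
        y_pred = np.clip(y_pred, eps, 1.0 - eps)
        return -np.sum(y_true * np.log(y_pred), axis=-1)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

y_true = np.array([1.0, 0.0, 0.0])          # one-hot target, class 0

# Hypothetical last-layer pre-activations
favors_class0 = np.array([2.0, -1.0, -0.5])  # largest value at the true class
all_negative = np.array([-2.0, -1.0, -0.5])

loss_softmax = categorical_crossentropy(y_true, softmax(favors_class0))
loss_tanh = categorical_crossentropy(y_true, np.tanh(favors_class0))
loss_relu = categorical_crossentropy(y_true, np.maximum(all_negative, 0.0))

print(loss_softmax, loss_tanh, loss_relu)
```

Under these assumptions, the softmax output gives a small loss, the tanh output gives a large loss even though its largest value is at the true class (the negative entries make the normalized value for class 0 negative, so it is clipped to `eps`), and an all-zero ReLU output gives NaN from the 0/0 normalization.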

### Model Set 2 - Loss: 'hinge'

In [4]:
# Same three architectures as Model Set 1, now trained with hinge loss
for actv_func in ['tanh', 'sigmoid', 'relu']:
    # As the assignment asks, every layer (including the output layer) uses actv_func
    model = Sequential()
    model.add(Dense(128, input_shape=(input_dim,), activation=actv_func))
    model.add(Dense(64, activation=actv_func))
    model.add(Dense(output_dim, activation=actv_func))

    mtrcs = ['accuracy']
    loss_fn = 'hinge'
    model.compile(optimizer='adam', loss=loss_fn, metrics=mtrcs)
    model.fit(X_train, Y_train, batch_size=batch_size, epochs=20, verbose=0)

    # evaluate() returns [loss, accuracy], matching the two index rows below
    train_score = model.evaluate(X_train, Y_train, verbose=0)
    test_score = model.evaluate(X_test, Y_test, verbose=0)

    scoredf = pd.DataFrame(index=[f"loss-function: {loss_fn}"] + ["metrics: " + i for i in mtrcs],
                           columns=['train-score', 'test-score'])
    scoredf['train-score'] = train_score
    scoredf['test-score'] = test_score
    display(f"activation function: {actv_func}",
            scoredf)

'activation function: tanh'

| | train-score | test-score |
|---|---|---|
| loss-function: hinge | 0.157794 | 0.158831 |
| metrics: accuracy | 0.2413 | 0.2411 |


'activation function: sigmoid'

| | train-score | test-score |
|---|---|---|
| loss-function: hinge | 1.000002 | 1.000002 |
| metrics: accuracy | 0.097517 | 0.0974 |


'activation function: relu'

| | train-score | test-score |
|---|---|---|
| loss-function: hinge | 0.959472 | 0.960573 |
| metrics: accuracy | 0.4073 | 0.4063 |


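<span style="color:blue">With the hinge loss function, tanh has the lowest loss and ReLU has the highest accuracy (about 0.41 on the test set). Interestingly, sigmoid, which performed best with categorical cross-entropy, performed worst with hinge loss. Comparing the two loss functions model by model: for sigmoid, categorical cross-entropy is far better (test accuracy 0.9786 versus 0.0974), while for tanh and ReLU, hinge loss gives higher test accuracy (0.2411 and 0.4063 versus 0.0892 and 0.098). Across all six models built, sigmoid activation with categorical cross-entropy performed best.</span>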
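
The hinge losses reported above can be sanity-checked with a small NumPy sketch. It assumes hinge loss is `mean(max(1 - t * p, 0))` and that one-hot 0/1 targets are mapped to -1/+1 before the loss is computed, which is my understanding of how Keras treats binary labels; the example output vectors are hypothetical.

```python
import numpy as np

def hinge(y_true01, y_pred):
    """Hinge loss sketch: mean(max(1 - t * p, 0)) with 0/1 targets mapped to -1/+1.

    The 0/1 -> -1/+1 mapping is an assumption about how Keras treats one-hot labels.
    """
    t = 2.0 * np.asarray(y_true01) - 1.0
    return np.mean(np.maximum(1.0 - t * np.asarray(y_pred), 0.0), axis=-1)

y_true = np.eye(10)[4]  # one-hot target for class 4

# Hypothetical output-layer values at the extremes of each activation's range
best_sigmoid = y_true.copy()      # sigmoid range [0, 1]: 1 for the true class, 0 elsewhere
all_zero_sigmoid = np.zeros(10)   # sigmoid output collapsed to 0 everywhere
best_tanh = 2.0 * y_true - 1.0    # tanh range [-1, 1]: +1 for the true class, -1 elsewhere

print(hinge(y_true, best_sigmoid))      # about 0.9
print(hinge(y_true, all_zero_sigmoid))  # 1.0
print(hinge(y_true, best_tanh))         # 0.0
```

Under these assumptions, an output layer that cannot go below 0 (sigmoid or ReLU) cannot push the hinge loss under 0.9 for a 10-class one-hot target, and an output collapsed to zero gives 1.0, close to the 1.000002 reported for the sigmoid model. Only tanh can reach -1 for the wrong classes, which may explain why it has by far the lowest hinge loss without having the best accuracy.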