<a href="https://colab.research.google.com/github/LokeRuiKee/ml-dl-playground/blob/main/ANN_experiment1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Algorithm: ANN



List of activation functions:

1. Rectified Linear Unit (ReLU)
2. Leaky ReLU
3. Exponential Linear Unit (ELU)
4. Parametric ReLU (PReLU)
5. Sigmoid Function
6. Hyperbolic Tangent (tanh) Function
7. SoftMax Function
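
As a rough reference for the list above, the seven functions can be written in plain Python for scalar inputs (softmax takes a list). This is only a sketch of the usual textbook definitions, not the Keras implementation; the `alpha` defaults are illustrative assumptions, and for PReLU `alpha` would be a learned parameter rather than a constant.

```python
import math

def relu(x):
    # Passes positive inputs through, clamps negatives to 0
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Small fixed slope alpha for negative inputs (alpha value is an assumption)
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    # Smooth exponential curve for negative inputs, saturating at -alpha
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def prelu(x, alpha):
    # Same form as leaky ReLU, but alpha is learned during training
    return x if x > 0 else alpha * x

def sigmoid(x):
    # Squashes any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Squashes any real input into (-1, 1)
    return math.tanh(x)

def softmax(xs):
    # Subtract the maximum first for numerical stability
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(relu(-1.0), sigmoid(0.0), sum(softmax([1.0, 2.0, 3.0])))
```

Unlike the other six, softmax normalises across a whole vector, which is why it is normally used on the output layer.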


List of optimizers:

1. Stochastic Gradient Descent (SGD)
2. SGD with Momentum
3. RMSProp (Root Mean Square Propagation)
4. Adam (Adaptive Moment Estimation)
5. Adamax
6. Nadam (Nesterov-accelerated Adaptive Moment Estimation)
7. AdaGrad (Adaptive Gradient Algorithm)
8. AdaDelta
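
To make the difference between some of these optimizers concrete, the sketch below applies plain SGD, SGD with momentum, and Adam to a toy one-parameter loss f(w) = w². The update rules follow the commonly stated formulations; the toy loss, the helper names, and the hyperparameter values are assumptions for illustration, and library implementations may differ in detail.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = w**2
    return 2.0 * w

def sgd_step(w, lr=0.01):
    # Plain gradient descent step
    return w - lr * grad(w)

def momentum_step(w, velocity, lr=0.01, momentum=0.9):
    # Velocity accumulates a decaying sum of past gradient steps
    velocity = momentum * velocity - lr * grad(w)
    return w + velocity, velocity

def adam_step(w, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    # First and second moment estimates with bias correction (t starts at 1)
    g = grad(w)
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

w_sgd = 5.0
w_mom, vel = 5.0, 0.0
for _ in range(500):
    w_sgd = sgd_step(w_sgd)
    w_mom, vel = momentum_step(w_mom, vel)
print("SGD:", w_sgd, "SGD with momentum:", w_mom)

# Adam's first step has magnitude close to lr, whatever the gradient scale
w_adam, m1, v1 = adam_step(5.0, 0.0, 0.0, t=1, lr=0.1)
print("Adam after one step:", w_adam)
```

On this toy problem, momentum ends up much closer to the minimum than plain SGD at the same learning rate and step count.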


# Template code

In [None]:
# https://colab.research.google.com/
# Importing libraries
import numpy as np
import datetime
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
# Loading the dataset
(X_train, y_train),(X_test, y_test)=fashion_mnist.load_data()
# Normalizing the images
X_train=X_train/255
X_test=X_test/255
# Reshaping the data
X_train.shape
X_train=X_train.reshape(-1,28*28)
X_train.shape
X_test=X_test.reshape(-1,28*28)
X_test.shape


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


(10000, 784)

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='relu',input_shape=(784,)))
# number of units/neurons: 128
# activation function: ReLU
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 128)               100480    
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8704000115394592
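
The parameter counts in the summary above can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit, and dropout has no parameters. The helper name `dense_params` is made up for this check, and the size line assumes 4-byte (float32) parameters.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

hidden = dense_params(784, 128)   # first Dense layer
dropout = 0                       # Dropout has no trainable parameters
output = dense_params(128, 10)    # output Dense layer
total = hidden + dropout + output

print(hidden, output, total)          # 100480 1290 101770
print(round(total * 4 / 1024, 2))     # size in KB, assuming float32: 397.54
```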


# 7 activation functions, optimizer: Adam

activation_functions = ['relu', 'LeakyReLU', 'elu', 'PReLU', 'sigmoid', 'tanh', 'softmax']

*The activation function is varied only in the first hidden layer in this experiment. The output layer stays softmax throughout.

## relu

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='relu',input_shape=(784,)))
# number of units/neurons: 128
# activation function: ReLU
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_2 (Dense)             (None, 128)               100480    
                                                                 
 dropout_1 (Dropout)         (None, 128)               0         
                                                                 
 dense_3 (Dense)             (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.873199999332428


## LeakyReLU

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='LeakyReLU',input_shape=(784,)))
# number of units/neurons: 128
# activation function: LeakyReLU
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_6 (Dense)             (None, 128)               100480    
                                                                 
 dropout_2 (Dropout)         (None, 128)               0         
                                                                 
 dense_7 (Dense)             (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8697999715805054


## elu

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='elu',input_shape=(784,)))
# number of units/neurons: 128
# activation function: elu
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_8 (Dense)             (None, 128)               100480    
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_9 (Dense)             (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8788999915122986


## PReLU

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='PReLU',input_shape=(784,)))
# number of units/neurons: 128
# activation function: PReLU
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_11 (Dense)            (None, 128)               100608    
                                                                 
 dropout_4 (Dropout)         (None, 128)               0         
                                                                 
 dense_12 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101898 (398.04 KB)
Trainable params: 101898 (398.04 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8740000128746033
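
The first layer here reports 100608 parameters instead of the 100480 seen with the other activations. The difference of 128 is consistent with PReLU learning one slope per hidden unit; this is an inference from the printed counts, checked with the arithmetic below.

```python
units = 128

dense = 784 * units + units        # weights + biases of the hidden layer
prelu_alphas = units               # assumed: one learnable slope per hidden unit
hidden_total = dense + prelu_alphas
output = units * 10 + 10           # output Dense layer
total = hidden_total + output

print(dense, hidden_total, total)  # 100480 100608 101898
```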


## sigmoid

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_13 (Dense)            (None, 128)               100480    
                                                                 
 dropout_5 (Dropout)         (None, 128)               0         
                                                                 
 dense_14 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8819000124931335


## tanh

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='tanh',input_shape=(784,)))
# number of units/neurons: 128
# activation function: tanh
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_15 (Dense)            (None, 128)               100480    
                                                                 
 dropout_6 (Dropout)         (None, 128)               0         
                                                                 
 dense_16 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8651999831199646


## softmax

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='softmax',input_shape=(784,)))
# number of units/neurons: 128
# activation function: softmax
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_17 (Dense)            (None, 128)               100480    
                                                                 
 dropout_7 (Dropout)         (None, 128)               0         
                                                                 
 dense_18 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8176000118255615
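
One possible reason softmax does worse as a hidden-layer activation (an assumption, not something verified in this notebook) is that its 128 outputs must sum to 1, so each unit carries a value of about 1/128 on average. The pre-activation values below are hypothetical.

```python
import math

def softmax(xs):
    # Subtract the maximum first for numerical stability
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pre-activations for 128 hidden units
pre_activations = [0.05 * (i % 7 - 3) for i in range(128)]
hidden = softmax(pre_activations)

mean_output = sum(hidden) / len(hidden)
print(sum(hidden))      # outputs are forced to sum to 1
print(mean_output)      # so the average output is 1/128
print(max(hidden))      # and even the largest unit stays small
```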


## Conclusion:
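Best activation function = sigmoid (Test accuracy: 0.8819000124931335). The margin over ELU (0.8788999915122986) is about 0.003, and each configuration was trained only once.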

# Best activation function: Sigmoid, 8 optimizers

optimizers = ['SGD', 'SGD with Momentum', 'RMSProp', 'Adam', 'Adamax', 'Nadam', 'AdaGrad', 'AdaDelta']


## SGD

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='SGD', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: SGD
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_19"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_35 (Dense)            (None, 128)               100480    
                                                                 
 dropout_16 (Dropout)        (None, 128)               0         
                                                                 
 dense_36 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8122000098228455


## SGD with Momentum

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
SGDm = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
model.compile(optimizer=SGDm, loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: SGD with momentum (learning_rate=0.01, momentum=0.9)
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_12"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_21 (Dense)            (None, 128)               100480    
                                                                 
 dropout_9 (Dropout)         (None, 128)               0         
                                                                 
 dense_22 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8611000180244446


## RMSProp

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='RMSProp', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: RMSProp
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_13"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_23 (Dense)            (None, 128)               100480    
                                                                 
 dropout_10 (Dropout)        (None, 128)               0         
                                                                 
 dense_24 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.873199999332428


## Adam

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_14"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_25 (Dense)            (None, 128)               100480    
                                                                 
 dropout_11 (Dropout)        (None, 128)               0         
                                                                 
 dense_26 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8780999779701233


## Adamax

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='adamax', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Adamax
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_15"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_27 (Dense)            (None, 128)               100480    
                                                                 
 dropout_12 (Dropout)        (None, 128)               0         
                                                                 
 dense_28 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8633999824523926


## Nadam

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_29 (Dense)            (None, 128)               100480    
                                                                 
 dropout_13 (Dropout)        (None, 128)               0         
                                                                 
 dense_30 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8812999725341797


## AdaGrad

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='AdaGrad', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: AdaGrad
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_17"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_31 (Dense)            (None, 128)               100480    
                                                                 
 dropout_14 (Dropout)        (None, 128)               0         
                                                                 
 dense_32 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.7505000233650208


## AdaDelta

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='AdaDelta', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: AdaDelta
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_18"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_33 (Dense)            (None, 128)               100480    
                                                                 
 dropout_15 (Dropout)        (None, 128)               0         
                                                                 
 dense_34 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.6292999982833862


## Conclusion:
Best optimizer = Nadam (Test accuracy: 0.8812999725341797)
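
The eight test accuracies recorded above can be ranked in a few lines, which also shows how close the top results are. The numbers are copied from the cell outputs; the variable names are only illustrative.

```python
# Test accuracies copied from the optimizer runs above (sigmoid hidden layer)
results = {
    "SGD": 0.8122000098228455,
    "SGD with Momentum": 0.8611000180244446,
    "RMSProp": 0.873199999332428,
    "Adam": 0.8780999779701233,
    "Adamax": 0.8633999824523926,
    "Nadam": 0.8812999725341797,
    "AdaGrad": 0.7505000233650208,
    "AdaDelta": 0.6292999982833862,
}

# Sort from best to worst test accuracy
ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
best_name, best_acc = ranked[0]

for name, acc in ranked:
    print(f"{name:<18} {acc:.4f}  (gap to best: {best_acc - acc:.4f})")
```

Nadam leads Adam by roughly 0.003, so with a single run per optimizer the ordering at the top should be read with some caution.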


# Experimenting Best Output Activation Function

Hidden-layer activation function = sigmoid, Optimizer = Nadam

## output relu

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='relu'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: relu
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_27"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_51 (Dense)            (None, 128)               100480    
                                                                 
 dropout_24 (Dropout)        (None, 128)               0         
                                                                 
 dense_52 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.10000000149011612
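
A test accuracy of 0.10 is chance level for this dataset. The Fashion MNIST test split is balanced at 1,000 images per class, so a model that always predicts the same class scores 10%. The NumPy sketch below illustrates this with a synthetic, balanced label array (`y_fake` is a stand-in, not the real `y_test`). It suggests, but does not prove, that the network with a ReLU output layer has collapsed to a constant prediction.

```python
import numpy as np

# Synthetic stand-in for y_test: 10 classes with 1,000 samples each
# (assumed balanced, like the Fashion MNIST test split).
num_classes = 10
samples_per_class = 1000
y_fake = np.repeat(np.arange(num_classes), samples_per_class)

# A degenerate classifier that always predicts class 0.
constant_pred = np.zeros_like(y_fake)

# Fraction of labels that happen to match the constant prediction.
constant_accuracy = float(np.mean(constant_pred == y_fake))
print("Accuracy of a constant predictor:", constant_accuracy)
```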


## output LeakyReLU

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='LeakyReLU'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: LeakyReLU
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_28"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_53 (Dense)            (None, 128)               100480    
                                                                 
 dropout_25 (Dropout)        (None, 128)               0         
                                                                 
 dense_54 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.21809999644756317


## output elu

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='elu'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: elu
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))

Model: "sequential_29"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_55 (Dense)            (None, 128)               100480    
                                                                 
 dropout_26 (Dropout)        (None, 128)               0         
                                                                 
 dense_56 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.1460999995470047


## output PReLU

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='PReLU'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: PReLU
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_30"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_57 (Dense)            (None, 128)               100480    
                                                                 
 dropout_27 (Dropout)        (None, 128)               0         
                                                                 
 dense_58 (Dense)            (None, 10)                1300      
                                                                 
Total params: 101780 (397.58 KB)
Trainable params: 101780 (397.58 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.10000000149011612
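
The PReLU summary reports 1300 parameters for the output layer instead of the 1290 seen in the other runs. The arithmetic below is a sketch of where the numbers in the summaries come from. It assumes the 10 extra parameters are PReLU's learnable negative slopes, one per output unit, and that each parameter is stored as a 4-byte float32. `dense_params` is a hypothetical helper written for this note, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 128)   # first Dense layer: 784 inputs, 128 units
output = dense_params(128, 10)    # output Dense layer: 128 inputs, 10 units

# Assumption: PReLU adds one learnable slope per output unit.
prelu_alphas = 10
output_with_prelu = output + prelu_alphas

total_plain = hidden + output
total_prelu = hidden + output_with_prelu

# Size in KB, assuming 4 bytes per parameter (float32).
size_plain_kb = round(total_plain * 4 / 1024, 2)
size_prelu_kb = round(total_prelu * 4 / 1024, 2)

print(hidden, output, output_with_prelu)
print(total_plain, size_plain_kb)
print(total_prelu, size_prelu_kb)
```

These values match the `Param #` column and the `Total params` lines printed by `model.summary()` in the cells above.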


## output Sigmoid

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='sigmoid'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: sigmoid
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_31"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_59 (Dense)            (None, 128)               100480    
                                                                 
 dropout_28 (Dropout)        (None, 128)               0         
                                                                 
 dense_60 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.8780999779701233


## output tanh

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='tanh'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: tanh
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_32"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_61 (Dense)            (None, 128)               100480    
                                                                 
 dropout_29 (Dropout)        (None, 128)               0         
                                                                 
 dense_62 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.37389999628067017


## output softmax

In [None]:
# 1. Defining the model
model=tf.keras.models.Sequential()
# 2. Adding a first fully connected hidden layer
model.add(tf.keras.layers.Dense(units=128, activation='sigmoid',input_shape=(784,)))
# number of units/neurons: 128
# activation function: sigmoid
# input_shape: (784,)
# 3. Adding second layer with dropout
model.add(tf.keras.layers.Dropout(0.2))
# 4. Adding the output layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
# units: number of classes (10 in Fashion MNIST dataset)
# activation: softmax
# 5. Compiling the model
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# Optimizer: Nadam
# Loss: Sparse softmax (categorical) crossentropy
model.summary()
# 6. Training the model
model.fit(X_train, y_train, epochs=10)
# 7. Model evaluation and prediction
test_loss, test_accuracy=model.evaluate(X_test, y_test)
print("Test accuracy: {}".format(test_accuracy))


Model: "sequential_33"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_63 (Dense)            (None, 128)               100480    
                                                                 
 dropout_30 (Dropout)        (None, 128)               0         
                                                                 
 dense_64 (Dense)            (None, 10)                1290      
                                                                 
Total params: 101770 (397.54 KB)
Trainable params: 101770 (397.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.879800021648407


## Conclusion:
Best output layer activation function = softmax (Test accuracy: 0.879800021648407), narrowly ahead of sigmoid (Test accuracy: 0.8780999779701233)
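
One likely reason for this ranking: with its default settings, `sparse_categorical_crossentropy` interprets the model outputs as class probabilities, and of the activations tried only softmax guarantees non-negative outputs that sum to 1. The NumPy sketch below compares four of the activations on a made-up vector of pre-activations (`z` is illustrative, not taken from the trained models).

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up pre-activations for a 4-class example.
z = np.array([2.0, -1.0, 0.5, -3.0])

outputs = {
    "softmax": softmax(z),
    "sigmoid": sigmoid(z),
    "tanh": np.tanh(z),
    "relu": np.maximum(z, 0.0),
}

# A valid probability distribution has min >= 0 and sum == 1.
for name, out in outputs.items():
    print(f"{name:8s} sum={out.sum():.3f} min={out.min():.3f}")
```

Sigmoid outputs stay in (0, 1) even though they need not sum to 1, which may be why it trails softmax only slightly, while tanh can go negative and ReLU can output exact zeros for every class.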