# Assignment 4 Dataset:

Use the Fashion-MNIST dataset for this assignment. You will have a chance to explore various deep learning models in this assignment.

# Dataset description:

Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

# Labels:
Each training and test example is assigned to one of the following labels:

0. T-shirt/top
1. Trouser
2. Pullover
3. Dress
4. Coat
5. Sandal
6. Shirt
7. Sneaker
8. Bag
9. Ankle boot




# Assignment 4 - Questions

This assignment focuses on exploring various deep learning models (neural networks) and model building.

1. Load the fashion-mnist dataset from TensorFlow Datasets or download it from Kaggle (https://www.kaggle.com/zalando-research/fashionmnist)
2. Normalize the data - briefly comment on why we need to normalize
3. Split the dataset into train (50,000), valid (10,000) and test (10,000) sets
4. Reshape the input data to 2D for the MLP (Multi-Layer Perceptron) and to a 4D tensor for the CNN (Convolutional Neural Network)
5. Build a Neural Network Multi-Layer Perceptron Classifier model (you can use sklearn neural network MLP Classifier)
6. Experiment with the architecture of the MLP classifier that you already built and report if you see any improvement in the accuracy
7. Build a basic sequential deep learning model (CNN) and compare the accuracy with the MLP Classifier
8. Experiment with the architecture of the CNN and report if you see any improvement in the accuracy
9. Evaluate MLPClassifier and CNN using F1 score values and accuracy (only for the best model)
10. Explain the parameters to tune to reduce the risk of overfitting in deep learning models
11. Optional: Explain the meaning of Precision, Recall and F1-Score and why these are used to evaluate classification models (instead of using Accuracy as a metric)
12. Optional: Try any advanced deep learning model

Submit the .ipynb and .html files.


In [3]:
# Import all required libraries (duplicates removed; Keras is taken from
# the public tensorflow.keras namespace throughout)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.layers import Dropout
from tensorflow.keras.models import Sequential
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

1) Load the fashion-mnist dataset from the tensorflow datasets or download from kaggle(https://www.kaggle.com/zalando-research/fashionmnist)

In [4]:
# Load training and test data into dataframes
data_train = pd.read_csv('fashion-mnist_train.csv')
data_test = pd.read_csv('fashion-mnist_test.csv')

2) Normalize the data - briefly comment why we need to normalize?

In [5]:
#Normalising data
train_data = np.array(data_train, dtype='float32')
test_data = np.array(data_test, dtype='float32')

X_train = train_data[:, 1:] / 255
y_train = train_data[:, 0]

X_test = test_data[:, 1:] / 255
y_test = test_data[:, 0]

**Normalization puts all input features on the same scale, which usually makes gradient-based training faster and more stable. For the Fashion-MNIST
data set, pixel values in [0, 255] are mapped to [0, 1] by dividing by 255, and the same scaling is applied to the training, validation and test data.**
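
As a minimal illustration of the scaling described above, the sketch below maps a few made-up 8-bit pixel values to [0, 1]. The array `pixels` is a hypothetical stand-in for part of one image row, not data from the assignment.

```python
import numpy as np

# Hypothetical 8-bit pixel intensities (stand-in for part of one image row)
pixels = np.array([0, 51, 102, 255], dtype="float32")

# Divide by the maximum possible intensity so every value lies in [0, 1]
scaled = pixels / 255.0

print(scaled.min(), scaled.max())
```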

3) Split the dataset into train(50,000), valid(10,000) and test sets(10,000)

In [6]:
#Splitting the dataset
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=10000)
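
The call above draws a different random split on every run because no seed is given (`train_test_split` accepts a `random_state` argument for that purpose). As a rough sketch of what a reproducible 50,000/10,000 split looks like, the example below shuffles indices with a seeded NumPy generator. The seed value and the small placeholder arrays are assumptions made for illustration.

```python
import numpy as np

n_samples, n_valid = 60000, 10000

# Lightweight placeholders standing in for the images and their labels
X = np.arange(n_samples, dtype="float32").reshape(-1, 1)
y = np.arange(n_samples) % 10

# A fixed seed (arbitrary choice) makes the shuffle repeatable
rng = np.random.default_rng(42)
order = rng.permutation(n_samples)

# First 10,000 shuffled indices go to validation, the rest to training
valid_idx, train_idx = order[:n_valid], order[n_valid:]
X_tr, y_tr = X[train_idx], y[train_idx]
X_va, y_va = X[valid_idx], y[valid_idx]

print(X_tr.shape, X_va.shape)
```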

4) Reshape the input data to a 2D for MLP(Multi Layer perceptron) and tensor(4d) for CNN(Convolution Neural Network)

In [7]:
#Reshaping the data to 2D for MLP
X_train = X_train.reshape(50000, 784)
X_valid = X_valid.reshape(10000, 784)
X_test = X_test.reshape(10000, 784)

y_train = keras.utils.to_categorical(y_train, 10)
y_valid = keras.utils.to_categorical(y_valid, 10)
y_test = keras.utils.to_categorical(y_test, 10)

In [8]:
X_train.shape

(50000, 784)
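
The `keras.utils.to_categorical` calls above turn integer class labels into one-hot rows. The sketch below reproduces the idea with plain NumPy on a few made-up labels; it illustrates the encoding and is not the Keras implementation.

```python
import numpy as np

labels = np.array([0, 3, 9])   # hypothetical integer class labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="float32")[labels]

# argmax along each row recovers the original integer labels
recovered = np.argmax(one_hot, axis=1)

print(one_hot.shape, recovered)
```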

5) Build a Neural Network Multi-Layer Perceptron Classifier model (you can use sklearn neural network MLP Classifier)

6) Experiment with the architecture of the MLP classifier that you already built and report if you see any improvement in the accuracy

In [9]:
#Building MLP Classifier
mlp_clf = MLPClassifier(max_iter=100, verbose=True,
                        hidden_layer_sizes=(150,))

mlp_clf.fit(X_train, y_train)

Iteration 1, loss = 1.31405484
Iteration 2, loss = 0.85496585
Iteration 3, loss = 0.76453662
Iteration 4, loss = 0.70416721
Iteration 5, loss = 0.66379934
Iteration 6, loss = 0.63384224
Iteration 7, loss = 0.60988214
Iteration 8, loss = 0.59113659
Iteration 9, loss = 0.57092746
Iteration 10, loss = 0.55422922
Iteration 11, loss = 0.53478312
Iteration 12, loss = 0.52524976
Iteration 13, loss = 0.50984523
Iteration 14, loss = 0.50003477
Iteration 15, loss = 0.49057679
Iteration 16, loss = 0.47923097
Iteration 17, loss = 0.46735221
Iteration 18, loss = 0.46056224
Iteration 19, loss = 0.44901337
Iteration 20, loss = 0.44182715
Iteration 21, loss = 0.43283593
Iteration 22, loss = 0.42375932
Iteration 23, loss = 0.41889549
Iteration 24, loss = 0.40757107
Iteration 25, loss = 0.39935114
Iteration 26, loss = 0.39765259
Iteration 27, loss = 0.38867228
Iteration 28, loss = 0.38146999
Iteration 29, loss = 0.37579527
Iteration 30, loss = 0.36443001
Iteration 31, loss = 0.36046088
...



MLPClassifier(hidden_layer_sizes=(150,), max_iter=100, verbose=True)

In [10]:
#Evaluating the model
mlp1_pred = mlp_clf.predict(X_test)
mlp1_pred = np.argmax(mlp1_pred,axis=1)
y_true = np.argmax(y_test,axis=1)

mlp1_f1 = metrics.f1_score(y_true, mlp1_pred, average= "weighted")
mlp1_accuracy = metrics.accuracy_score(y_true, mlp1_pred)
mlp1_confusion_matrix = metrics.confusion_matrix(y_true, mlp1_pred)

In [11]:
#Printing evaluation metrics for MLP Classifier
print("-----------------1st MLP Classifier Report---------------")
print("F1 score: {}".format(mlp1_f1))
print("Accuracy score: {}".format(mlp1_accuracy))
print("Confusion matrix: \n", mlp1_confusion_matrix)

-----------------1st MLP Classifier Report---------------
F1 score: 0.8760185150665216
Accuracy score: 0.8742
Confusion matrix: 
 [[871   0   8  16   1   1 101   0   2   0]
 [  7 983   0   6   0   1   2   0   1   0]
 [ 86   0 780  11  52   0  69   0   2   0]
 [ 56   9  11 905  11   0   8   0   0   0]
 [121   1  52  50 726   1  48   0   1   0]
 [ 16   1   0   1   0 946   1  25   0  10]
 [193   1  56  28  23   0 692   0   7   0]
 [ 11   0   0   0   0  22   0 938   0  29]
 [ 27   0   3   4   2   2  14   1 947   0]
 [ 11   0   0   0   0   6   0  28   1 954]]
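
A caveat on the evaluation above: because `y_train` is one-hot encoded, scikit-learn's `MLPClassifier` fits it as a multilabel problem, and `predict` returns a 0/1 indicator matrix in which a row can be all zeros when no class is predicted. `np.argmax` maps such rows to class 0, which may explain part of the heavy first column in the confusion matrix. The sketch below shows the effect on a made-up indicator matrix (the values are hypothetical, not model output).

```python
import numpy as np

# Hypothetical multilabel-style predictions for 4 samples and 3 classes.
# The second row is all zeros: no class was predicted for that sample.
indicator_pred = np.array([
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
])

# argmax returns the first index of the maximum, so an all-zero row becomes 0
labels = np.argmax(indicator_pred, axis=1)

# Rows with no predicted class can be detected before taking the argmax
no_label_rows = indicator_pred.sum(axis=1) == 0

print(labels, no_label_rows)
```

One possible alternative, not used in this notebook, is to fit the classifier on integer labels (for example `np.argmax(y_train, axis=1)`), so that `predict` returns class labels directly.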


In [12]:
mlp2 = MLPClassifier(hidden_layer_sizes=(100,35,10),
                       max_iter=100,
                       activation = 'tanh')
mlp2.fit(X_train, y_train)



MLPClassifier(activation='tanh', hidden_layer_sizes=(100, 35, 10), max_iter=100)

In [13]:
#Evaluating the model
mlp2_pred = mlp2.predict(X_test)
mlp2_pred = np.argmax(mlp2_pred,axis=1)
y_true = np.argmax(y_test,axis=1)

mlp2_f1 = metrics.f1_score(y_true, mlp2_pred, average= "weighted")
mlp2_accuracy = metrics.accuracy_score(y_true, mlp2_pred)
mlp2_confusion_matrix = metrics.confusion_matrix(y_true, mlp2_pred)

In [14]:
#Printing evaluation metrics for MLP Classifier
print("-----------------2nd MLP Classifier Report---------------")
print("F1 score: {}".format(mlp2_f1))
print("Accuracy score: {}".format(mlp2_accuracy))
print("Confusion matrix: \n", mlp2_confusion_matrix)

-----------------2nd MLP Classifier Report---------------
F1 score: 0.8784498038942871
Accuracy score: 0.8772
Confusion matrix: 
 [[825   3   7  17   2   0 140   0   6   0]
 [  7 977   0  10   0   1   5   0   0   0]
 [ 79   1 717  15  83   0 104   0   1   0]
 [ 52  12   4 887  27   0  16   0   2   0]
 [ 29   0  47  23 829   0  69   0   3   0]
 [ 15   1   1   1   0 937   0  31   1  13]
 [140   2  29  29  48   0 746   0   6   0]
 [ 13   0   0   0   0  17   0 932   0  38]
 [ 13   1   4   1   2   1  12   2 962   2]
 [  6   0   0   0   0   5   1  26   2 960]]


**Comparing the two MLP classifiers, the second one (three hidden layers with tanh activation) has a slightly higher F1 score (0.878 vs. 0.876) and accuracy (0.877 vs. 0.874) than the first, so the change in architecture gave a small improvement.**

7) Build a basic sequential deep learning model(CNN) and compare the accuracy with MLP Classifier

In [51]:
#Reshaping data to 4D
X_train = X_train.reshape([-1, 28, 28, 1])
X_test = X_test.reshape([-1, 28, 28, 1])
X_valid = X_valid.reshape([-1, 28, 28, 1])
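
The reshape above only reinterprets the 784 values of each flattened image as a 28x28 grid with one channel; no pixel values change. A small check on placeholder data (the array `flat` is made up for this example):

```python
import numpy as np

# Placeholder batch of 5 flattened images (784 values each)
flat = np.arange(5 * 784, dtype="float32").reshape(5, 784)

# -1 lets NumPy infer the batch size; the trailing 1 is the grayscale channel
images = flat.reshape(-1, 28, 28, 1)

# Flattening again gives back exactly the same array
round_trip = images.reshape(-1, 784)

print(images.shape, np.array_equal(flat, round_trip))
```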

In [16]:
#Building CNN Model
modelCNN = Sequential()
modelCNN.add(layers.Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
modelCNN.add(layers.MaxPooling2D((2, 2)))
modelCNN.add(layers.Flatten())
modelCNN.add(layers.Dense(100, activation='tanh', kernel_initializer='he_uniform'))
modelCNN.add(layers.Dense(50, activation='tanh', kernel_initializer='he_uniform'))
modelCNN.add(layers.Dense(10, activation='softmax'))

# Initialise the Adam optimizer (`learning_rate` replaces the deprecated `lr` argument)
opt = keras.optimizers.Adam(learning_rate=0.0001)

modelCNN.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
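
To see where the capacity of this model sits, the layer sizes can be worked out by hand. The sketch assumes the Keras defaults for `Conv2D` (stride 1, `'valid'` padding) and `MaxPooling2D` (stride equal to the pool size); the helper names are made up for this example.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial output size of a 'valid' (no padding) convolution or pooling."""
    return (size - kernel) // stride + 1


def dense_params(n_in, n_out):
    """Number of weights plus one bias per output unit."""
    return n_in * n_out + n_out


conv_side = conv_output_size(28, 3)              # 3x3 convolution on 28x28
pool_side = conv_output_size(conv_side, 2, 2)    # 2x2 max pooling
flat_units = pool_side * pool_side * 32          # 32 feature maps, flattened

conv_params = 3 * 3 * 1 * 32 + 32                # kernel weights plus biases
total = (conv_params
         + dense_params(flat_units, 100)
         + dense_params(100, 50)
         + dense_params(50, 10))

print(conv_side, pool_side, flat_units, total)   # 26 13 5408 546780
```

Under these assumptions almost all of the parameters sit in the first dense layer, which is one reason the size of the dense layers matters when thinking about overfitting.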

In [17]:
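# validation_split=0.2 holds out the last 20% of X_train for validation;
# the X_valid set created in step 3 is not used here.
modelCNN.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2, verbose=1)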

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fa372820430>

In [18]:
# Evaluating CNN Model
cnn_pred = modelCNN.predict(X_test)
cnn_pred = np.argmax(cnn_pred,axis=1)
y_true = np.argmax(y_test,axis=1)

cnn_f1 = metrics.f1_score(y_true, cnn_pred, average= "weighted")
cnn_accuracy = metrics.accuracy_score(y_true, cnn_pred)
cnn_confusion_matrix = metrics.confusion_matrix(y_true, cnn_pred)

In [19]:
# Printing evaluating metrics for CNN Model
print("-----------------1st Convolutional Neural Network Report---------------")
print("F1 score: {}".format(cnn_f1))
print("Accuracy score: {}".format(cnn_accuracy))
print("Confusion matrix: \n", cnn_confusion_matrix)

-----------------1st Convolutional Neural Network Report---------------
F1 score: 0.9083799513240239
Accuracy score: 0.9086
Confusion matrix: 
 [[812   0  13  30   0   1 136   0   8   0]
 [  0 986   1  10   0   0   3   0   0   0]
 [ 14   1 889  20  21   0  52   0   3   0]
 [  8   9   2 973   3   0   5   0   0   0]
 [  2   2  94  59 771   0  72   0   0   0]
 [  0   1   0   0   0 965   0  24   2   8]
 [ 74   2  63  48  26   0 781   0   6   0]
 [  0   0   0   0   0   3   0 972   0  25]
 [  4   2   4   2   1   4   6   2 973   2]
 [  0   0   0   0   0   3   0  33   0 964]]


Comparing accuracy on the test set, the CNN model (0.9086) performed better than the best MLP classifier (0.8772).

8) Experiment with the architecture of CNN and report if you see any improvement in the accuracy

In [43]:
# Experimenting with CNN Model
modelCNN2 = Sequential()
modelCNN2.add(layers.Conv2D(32, (3, 3), activation='tanh', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
modelCNN2.add(layers.MaxPooling2D((2, 2)))
modelCNN2.add(layers.Flatten())
modelCNN2.add(layers.Dense(784, activation='tanh', kernel_initializer='he_uniform'))
modelCNN2.add(Dropout(rate=0.3))
modelCNN2.add(layers.Dense(500, activation='tanh', kernel_initializer='he_uniform'))
modelCNN2.add(Dropout(rate=0.2))
modelCNN2.add(layers.Dense(100, activation='tanh', kernel_initializer='he_uniform'))
modelCNN2.add(Dropout(rate=0.1))
modelCNN2.add(layers.Dense(10, activation='softmax'))

# Initialise the Adam optimizer (`learning_rate` replaces the deprecated `lr` argument)
opt = keras.optimizers.Adam(learning_rate=0.001)

modelCNN2.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
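
The `Dropout` layers in the model above randomly zero a fraction `rate` of their inputs during training and scale the remaining ones by 1 / (1 - rate), so the expected activation is unchanged; at inference time they pass inputs through untouched. A minimal NumPy sketch of that training-time behaviour follows. The function name is made up, and this is an illustration rather than the Keras implementation.

```python
import numpy as np


def inverted_dropout(x, rate, rng):
    """Zero each element with probability `rate` and rescale the survivors."""
    keep_mask = rng.random(x.shape) >= rate
    return np.where(keep_mask, x / (1.0 - rate), 0.0)


rng = np.random.default_rng(0)          # arbitrary seed for repeatability
activations = np.ones((1000, 100))      # placeholder activations
dropped = inverted_dropout(activations, rate=0.3, rng=rng)

# Survivors are scaled by 1 / (1 - rate), so the mean stays close to 1
print(round(float(dropped.mean()), 2))
```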

In [44]:
modelCNN2.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fa2eb78d9d0>

In [45]:
#Evaluating CNN Model
cnn2_pred = modelCNN2.predict(X_test)
cnn2_pred = np.argmax(cnn2_pred,axis=1)
y_true = np.argmax(y_test,axis=1)

cnn2_f1 = metrics.f1_score(y_true, cnn2_pred, average= "weighted")
cnn2_accuracy = metrics.accuracy_score(y_true, cnn2_pred)
cnn2_confusion_matrix = metrics.confusion_matrix(y_true, cnn2_pred)

In [46]:
#Printing metrics of CNN Model
print("-----------------2nd Convolutional Neural Network Report---------------")
print("F1 score: {}".format(cnn2_f1))
print("Accuracy score: {}".format(cnn2_accuracy))
print("Confusion matrix: \n", cnn2_confusion_matrix)

-----------------2nd Convolutional Neural Network Report---------------
F1 score: 0.8966773985708857
Accuracy score: 0.8964
Confusion matrix: 
 [[749   2  11  49   2   0 176   0  11   0]
 [  1 974   1  14   2   1   6   0   1   0]
 [  9   0 777  12 109   0  91   0   2   0]
 [  9   7   5 933  25   0  21   0   0   0]
 [  0   1  18  25 908   0  47   0   1   0]
 [  0   1   0   0   0 959   0  23   3  14]
 [ 61   2  30  36  68   0 793   0  10   0]
 [  0   0   0   0   0  14   0 927   0  59]
 [  2   2   3   3   3   3   8   3 971   2]
 [  0   0   0   0   0   3   0  24   0 973]]

The modified architecture did not improve on the first CNN: accuracy dropped from 0.9086 to 0.8964 and the F1 score from 0.908 to 0.897.


9) Evaluate MLPClassifier and CNN using F1 score values and accuracy(only for the best model)

In [49]:
#Printing metrics of Best CNN Model
print("-----------------1st Convolutional Neural Network Report---------------")
print("F1 score: {}".format(cnn_f1))
print("Accuracy score: {}".format(cnn_accuracy))
print("Confusion matrix: \n", cnn_confusion_matrix)

-----------------1st Convolutional Neural Network Report---------------
F1 score: 0.9083799513240239
Accuracy score: 0.9086
Confusion matrix: 
 [[812   0  13  30   0   1 136   0   8   0]
 [  0 986   1  10   0   0   3   0   0   0]
 [ 14   1 889  20  21   0  52   0   3   0]
 [  8   9   2 973   3   0   5   0   0   0]
 [  2   2  94  59 771   0  72   0   0   0]
 [  0   1   0   0   0 965   0  24   2   8]
 [ 74   2  63  48  26   0 781   0   6   0]
 [  0   0   0   0   0   3   0 972   0  25]
 [  4   2   4   2   1   4   6   2 973   2]
 [  0   0   0   0   0   3   0  33   0 964]]


In [50]:
# Printing Metrics for best MLP Model
print("-----------------2nd MLP Classifier Report---------------")
print("F1 score: {}".format(mlp2_f1))
print("Accuracy score: {}".format(mlp2_accuracy))
print("Confusion matrix: \n", mlp2_confusion_matrix)

-----------------2nd MLP Classifier Report---------------
F1 score: 0.8784498038942871
Accuracy score: 0.8772
Confusion matrix: 
 [[825   3   7  17   2   0 140   0   6   0]
 [  7 977   0  10   0   1   5   0   0   0]
 [ 79   1 717  15  83   0 104   0   1   0]
 [ 52  12   4 887  27   0  16   0   2   0]
 [ 29   0  47  23 829   0  69   0   3   0]
 [ 15   1   1   1   0 937   0  31   1  13]
 [140   2  29  29  48   0 746   0   6   0]
 [ 13   0   0   0   0  17   0 932   0  38]
 [ 13   1   4   1   2   1  12   2 962   2]
 [  6   0   0   0   0   5   1  26   2 960]]


**The reports above give the evaluation metrics (F1 score, accuracy, and confusion matrix) for the best CNN model and the best MLP model.**
- The first CNN model performed best among all the models above, with about 91% accuracy (0.9086).
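
Accuracy alone hides which classes are confused with each other. Per-class precision, recall and F1 can be read off a confusion matrix; the sketch below does this for the first CNN's matrix as printed above, assuming scikit-learn's convention of rows for true labels and columns for predictions.

```python
import numpy as np

# Confusion matrix of the first CNN, copied from the report above
# (rows = true class, columns = predicted class)
cm = np.array([
    [812,   0,  13,  30,   0,   1, 136,   0,   8,   0],
    [  0, 986,   1,  10,   0,   0,   3,   0,   0,   0],
    [ 14,   1, 889,  20,  21,   0,  52,   0,   3,   0],
    [  8,   9,   2, 973,   3,   0,   5,   0,   0,   0],
    [  2,   2,  94,  59, 771,   0,  72,   0,   0,   0],
    [  0,   1,   0,   0,   0, 965,   0,  24,   2,   8],
    [ 74,   2,  63,  48,  26,   0, 781,   0,   6,   0],
    [  0,   0,   0,   0,   0,   3,   0, 972,   0,  25],
    [  4,   2,   4,   2,   1,   4,   6,   2, 973,   2],
    [  0,   0,   0,   0,   0,   3,   0,  33,   0, 964],
])

true_pos = np.diag(cm)
precision = true_pos / cm.sum(axis=0)   # correct / everything predicted as the class
recall = true_pos / cm.sum(axis=1)      # correct / everything truly in the class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_pos.sum() / cm.sum()

print("accuracy:", accuracy)
print("lowest-recall class index:", int(np.argmin(recall)))
```

For this matrix the lowest recall is for class 4 (Coat) at 0.771, followed by class 6 (Shirt) at 0.781.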

10) Explain the parameters to tune to reduce the risk of overfitting in deep learning models

**Parameters to tune to reduce the risk of overfitting in deep learning models:**  

  1) Number of epochs - Training for more epochs improves performance only up to a point. Once the training loss 
    and accuracy stop improving, or the validation loss starts to rise while the training loss keeps falling, 
    further epochs mainly increase overfitting, so the number of epochs should be chosen by experiment together 
    with the learning rate. Dropout layers in the CNN model and the choice of optimizer (e.g. SGD, RMSprop, Adam) 
    also affect how well the model generalizes.
    
  2) Learning rate - The learning rate controls how quickly or slowly a model learns a problem, and it is 
    perhaps the most important hyperparameter to tune in order to achieve good performance. A common 
    heuristic when fine-tuning is to start with a learning rate about 10 times smaller than the one 
    used for initial training.
    
  3) Batch size - Batch size affects the resource requirements of the training process, its speed, and the 
    number of iterations per epoch. Small batches add noise to the gradient estimates, which acts as a mild 
    regularizer, while large batches add less, so batch size can be used to balance the amount of regularization.
    
  4) Number of hidden units - The number of hidden units is a measure of the model’s learning capacity. Slightly 
    more units than the optimal number is not a problem, but a much larger number can lead to overfitting.
    
  5) Number of layers - Increasing the number of hidden units and/or layers may lead to overfitting because it 
    makes it easier for the neural network to memorize the training set, that is, to learn a function 
    that perfectly separates the training set but does not generalize to unseen data.
    
Tuning these parameters together, while monitoring performance on the validation set, helps reduce overfitting and improve how well deep learning models generalize.
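
One way to choose the number of epochs automatically is early stopping: training halts once the validation loss has not improved for a set number of epochs (Keras provides a `keras.callbacks.EarlyStopping` callback for this). The sketch below shows the logic in plain Python on made-up validation losses; the function name and the loss values are assumptions for illustration, not results from this notebook.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-based epoch with the best validation loss, ending the scan
    once the loss has failed to improve for `patience` epochs in a row."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch


# Made-up validation losses: improving at first, then rising again
val_losses = [0.60, 0.45, 0.40, 0.41, 0.43, 0.39]
print(early_stopping_epoch(val_losses, patience=2))  # 2
```

With `patience=2` the scan stops before it ever sees the final, lower loss, which shows the trade-off: a small patience saves training time but can stop too early on a noisy validation curve.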