

## **COMP6685 Deep Learning Coursework A1**


Individual (25% of total mark)


**TASK:** You are required to develop Python code with appropriate comments and answer questions.

**Description**: Using this template, write code to train a Convolutional Neural Network (CNN) on the Fashion MNIST dataset, available at https://keras.io/api/datasets/fashion_mnist/ .

Fashion MNIST is a dataset of 60,000 28x28 grayscale images of 10 fashion categories, along with a test set of 10,000 images.

The dataset should be imported in the code and one sample image should be visualised before applying the model.

Define a CNN and comment on the chosen parameters of the network. Apply a regularization method (L1, L2 or L1L2). Divide the dataset into training, validation and test sets. Obtain the accuracy on the validation set and plot the final results using the data from the test set. Comment your lines of code appropriately to explain your solution.

Enhance the model's performance to obtain the best or optimal validation accuracy. Further questions about final remarks on the results should be answered in the markdown cells defined in the template.

---
---

Note: This is only a template. You can add more code/text cells if necessary.

Import the dataset and divide it into training, validation and test sets. Explain how you obtained the validation set. How did you choose the size of the validation set? **(10 marks)**

---

In [None]:
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras import utils
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.optimizers import SGD, Adam, RMSprop
OPTIMIZER = SGD(learning_rate=0.1) # Stochastic gradient descent optimiser

# importing of service libraries
import numpy as np
import matplotlib.pyplot as plt

# each 2D image has 28x28 pixels, which are flattened into a vector of 784 values for the dense layers
RESHAPED = 784


print('Libraries imported.')

In [None]:
#training constants
BATCH_SIZE = 128
N_EPOCH = 20  
N_CLASSES = 10
VERBOSE = 1
VALIDATION_SPLIT = 0.2
OPTIM = RMSprop()

print('Main variables initialised.')

In [None]:
# the Fashion MNIST training set contains 60K grayscale images of 28x28 pixels
IMG_CHANNELS = 1
IMG_ROWS = 28
IMG_COLS = 28

print('Image variables initialisation')

In [None]:
#load dataset
(input_X_train, output_y_train), (input_X_test, output_y_test) = fashion_mnist.load_data()
print('input_X_train shape:', input_X_train.shape)
print(input_X_train.shape[0], 'train samples')
print(input_X_test.shape[0], 'test samples')

# convert to categorical
output_Y_train = utils.to_categorical(output_y_train, N_CLASSES)
output_Y_test = utils.to_categorical(output_y_test, N_CLASSES) 

# I used 20% of the training data for the validation set, as this is a standard choice. I also tried 10%, but that made little difference.
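
As a minimal sketch of how a validation set can be held out explicitly, the snippet below shuffles the sample indices and reserves 20% of them. It uses random arrays as stand-ins for the Fashion MNIST images, and the helper name `split_train_validation` and the seeds are illustrative assumptions, not part of the template. By contrast, the `validation_split` argument of Keras `fit` takes the last fraction of the samples, before any shuffling.

```python
import numpy as np

def split_train_validation(x, y, validation_fraction=0.2, seed=0):
    """Shuffle the samples and hold out a fraction of them for validation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))
    n_val = int(len(x) * validation_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])

# Stand-in data with the same per-image shape as Fashion MNIST (assumption: 1,000 samples)
x = np.random.default_rng(1).integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)
y = np.random.default_rng(2).integers(0, 10, size=1000)

(x_train, y_train), (x_val, y_val) = split_train_validation(x, y)
print(x_train.shape, x_val.shape)
```

With the real data, `input_X_train` and `output_Y_train` would take the place of `x` and `y`, and the held-out pair could be passed to `fit` through its `validation_data` argument.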

Visualise a random sample image of the dataset. **(10 marks)**

---



In [None]:
# print the pixel values and show a 2D colour-mapped plot of a sample Fashion MNIST image
Selected_Image = 1
image = input_X_train[Selected_Image]
print("Sample input image: " + str(image))
plt.imshow(image)
plt.show()
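
The cell above always shows the image at index 1. A possible way to pick a random sample and label it with its class name is sketched below. The helper names `pick_random_sample` and `show_sample` are hypothetical, the label array is a small stand-in, and the class names follow the order listed in the Keras Fashion MNIST documentation.

```python
import numpy as np

# Class names for labels 0-9, in the order given by the Keras Fashion MNIST documentation
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

def pick_random_sample(labels, seed=None):
    """Return a random sample index and the class name of its label."""
    rng = np.random.default_rng(seed)
    index = int(rng.integers(0, len(labels)))
    return index, CLASS_NAMES[int(labels[index])]

def show_sample(images, labels, index):
    """Plot one image in grayscale with its class name as the title."""
    import matplotlib.pyplot as plt  # imported here so the helper above works without it
    plt.imshow(images[index], cmap="gray")
    plt.title(CLASS_NAMES[int(labels[index])])
    plt.axis("off")
    plt.show()

# Stand-in integer labels (assumption); with the real data, pass output_y_train instead
labels = np.array([9, 0, 0, 3, 0])
index, name = pick_random_sample(labels, seed=42)
print(index, name)
```

With the dataset loaded, `show_sample(input_X_train, output_y_train, index)` would display the chosen image. Note that it needs the integer labels, not the one-hot encoded ones.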

Define your CNN model. Specify the network and training parameters and comment them. **(10 marks)**

---

In [None]:
# convert to float32 and normalise pixel values to the range [0, 1]
input_X_train = input_X_train.astype('float32')
input_X_test = input_X_test.astype('float32')
input_X_train /= 255
input_X_test /= 255
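
A `Conv2D` layer declared with `input_shape=(IMG_ROWS, IMG_COLS, IMG_CHANNELS)` expects a trailing channel axis, whereas the loaded arrays have shape `(N, 28, 28)`. The following sketch, which assumes a small random batch in place of the real images and a hypothetical helper `prepare_images`, shows the scaling and the reshape together.

```python
import numpy as np

IMG_ROWS, IMG_COLS, IMG_CHANNELS = 28, 28, 1

def prepare_images(images):
    """Scale uint8 pixels to [0, 1] and add a trailing channel axis."""
    scaled = images.astype("float32") / 255.0
    return scaled.reshape(-1, IMG_ROWS, IMG_COLS, IMG_CHANNELS)

# Stand-in batch of 4 images (assumption); the real input would be input_X_train
raw = np.random.default_rng(0).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
prepared = prepare_images(raw)
print(prepared.shape, prepared.dtype)
```

Applying the same function to both the training and the test images keeps the two sets on the same scale.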

In [None]:
# CNN model definition
model = Sequential()

# Block 1: two 3x3 convolutions with 32 filters; 'same' padding keeps the 28x28 size
model.add(Conv2D(32, kernel_size=3, padding='same', input_shape=(IMG_ROWS, IMG_COLS, IMG_CHANNELS)))
model.add(Activation('relu'))
model.add(Conv2D(32, kernel_size=3, padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))  # halve the spatial resolution
model.add(Dropout(0.25))                   # drop 25% of activations to reduce overfitting

# Block 2: two 3x3 convolutions with 64 filters
model.add(Conv2D(64, kernel_size=3, padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, kernel_size=3))       # default stride 1 and 'valid' padding
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# Classifier head: flatten, one hidden dense layer, softmax over the 10 classes
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(N_CLASSES))
model.add(Activation('softmax'))

In [None]:
# compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIM, metrics=['accuracy'])

model.summary()
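
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below applies the standard formulas for convolutional and dense layers to the first two convolutions and the output layer of the CNN above. The function names are illustrative and not part of Keras.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights of a square-kernel Conv2D layer plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Weights of a fully connected layer plus one bias per output unit."""
    return (in_units + 1) * out_units

def same_padding_size(size, stride=1):
    """Spatial output size with padding='same' (ceiling division)."""
    return -(-size // stride)

def pooled_size(size, pool=2):
    """Spatial output size after non-overlapping max pooling."""
    return size // pool

first_conv = conv2d_params(3, 1, 32)             # 3x3 kernel, 1 input channel, 32 filters
second_conv = conv2d_params(3, 32, 32)           # 3x3 kernel, 32 input channels, 32 filters
after_pool = pooled_size(same_padding_size(28))  # 28x28 is preserved, then pooling halves it
output_layer = dense_params(512, 10)             # 512 hidden units feeding 10 classes

print(first_conv, second_conv, after_pool, output_layer)  # 320 9248 14 5130
```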

Train the CNN model. **(10 marks)**

---

In [None]:
# import the regularizers
from tensorflow.keras import regularizers

N_EPOCH = 20
N_HIDDEN = 128
P_DROPOUT = 0.3

input_X_train = input_X_train.reshape(60000, RESHAPED)
input_X_test = input_X_test.reshape(10000, RESHAPED)


model = Sequential()

# Hidden layer with 128 units, L2 activity regularisation (factor 1e-5) and ReLU activation
model.add(Dense(N_HIDDEN, activity_regularizer=regularizers.L2(1e-5), input_shape=(RESHAPED,)))
model.add(Activation('relu'))


# output layer with 10 units and softmax activation
model.add(Dense(N_CLASSES))
model.add(Activation('softmax'))

model.summary()
# model compilation
model.compile(loss='categorical_crossentropy', optimizer=OPTIM, metrics=['accuracy'])
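
To make the effect of the regulariser concrete, the sketch below computes the L1 and L2 penalty terms in plain NumPy, following the definitions used by the Keras `L1` and `L2` regularisers (a factor times the sum of absolute values or of squares). The input array is a made-up example. In Keras the same penalty is added to the loss, applied to the layer weights with `kernel_regularizer` or to the layer output with `activity_regularizer`, as in the cell above.

```python
import numpy as np

def l1_penalty(values, factor):
    """L1 penalty: factor times the sum of absolute values."""
    return factor * np.sum(np.abs(values))

def l2_penalty(values, factor):
    """L2 penalty: factor times the sum of squared values."""
    return factor * np.sum(np.square(values))

# Made-up values standing in for a weight matrix or a layer output (assumption)
values = np.array([[0.5, -1.0], [2.0, 0.0]])

print(l1_penalty(values, 0.01))  # 0.01 * (0.5 + 1.0 + 2.0 + 0.0)
print(l2_penalty(values, 0.01))  # 0.01 * (0.25 + 1.0 + 4.0 + 0.0)
```

Because the L2 term grows with the square of each value, it penalises large values far more strongly than small ones, whereas the L1 term penalises all magnitudes at the same rate.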

In [None]:
history = model.fit(input_X_train, output_Y_train, batch_size=BATCH_SIZE, epochs=N_EPOCH, validation_split=VALIDATION_SPLIT,  verbose=VERBOSE)

Evaluate your model. What is the best/highest validation accuracy your network achieved? How did you obtain this accuracy? **(10 marks)**


In [None]:
score = model.evaluate(input_X_test, output_Y_test, batch_size=BATCH_SIZE, verbose=VERBOSE)
print("\nTest score/loss:", score[0])
print('Test accuracy:', score[1])

# With my base model I got a test accuracy of 74% and a test score (loss) of 5.5. I knew this was not the best achievable accuracy and that regularization could improve it. Using L2 regularization, I reached a test accuracy of 86% and a test score of 3.6. L2 was the best choice, as using L1 or a combination of L1 and L2 (L1L2) led to worse test accuracy.
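
The question above asks for the best validation accuracy, which is recorded per epoch in `history.history['val_accuracy']` after `fit` has run. One way to read it out is sketched below, using a plain dictionary with placeholder values in place of the real training history. The numbers are illustrative only and are not results from this notebook.

```python
# Stand-in for history.history (assumption): placeholder values, not measured results
history_dict = {
    "val_accuracy": [0.71, 0.78, 0.83, 0.82, 0.85, 0.84],
    "val_loss": [0.90, 0.66, 0.52, 0.55, 0.47, 0.49],
}

def best_validation_epoch(history_dict, metric="val_accuracy"):
    """Return the 1-based epoch with the highest metric value, and that value."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

epoch, accuracy = best_validation_epoch(history_dict)
print(f"Best validation accuracy {accuracy:.2f} at epoch {epoch}")
```

With a trained model, `best_validation_epoch(history.history)` would report the corresponding figures for the actual run.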

Plot the final results on the test set and print the accuracy/loss on that set. **(10 marks)**

---

In [None]:
# summarize history for accuracy (training vs. validation)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

# summarize history for loss (training vs. validation)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Additional questions:


*   Describe whether you found any differences in the network’s accuracy when applying regularisation compared to not applying it. If there were differences, which regularisation did you use? If no differences were found, what could be the reason? **(10 marks)**

# Applying regularization increased the network's efficiency, because the total number of parameters was reduced by 100,000. This made the network easier and much faster to train, and it increased the test accuracy while decreasing the test score.

*   Write your conclusions about the results achieved with your model on the fashion MNIST dataset and ideas to improve these results/performance further. **(10 marks)**

# I could have used a larger training set, which would help because the CNN would have more variations of each image to learn from. To do this I could use data augmentation. I could also have added a dropout layer in addition to the L2 regularization, because it would help reduce overfitting.
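
As a rough illustration of the data augmentation idea mentioned above, the sketch below creates flipped and horizontally shifted copies of a batch of images using NumPy only. The function name `augment_batch`, the shift range and the random stand-in images are assumptions made for the example. Keras also offers its own preprocessing layers for this purpose.

```python
import numpy as np

def augment_batch(images, max_shift=2, seed=0):
    """Return randomly flipped and horizontally shifted copies of 2D images."""
    rng = np.random.default_rng(seed)
    augmented = np.empty_like(images)
    for i, image in enumerate(images):
        if rng.random() < 0.5:
            image = image[:, ::-1]  # horizontal flip
        dx = int(rng.integers(-max_shift, max_shift + 1))
        # np.roll wraps pixels around the edge, which is a simplification
        augmented[i] = np.roll(image, dx, axis=1)
    return augmented

# Stand-in batch of 8 images (assumption); the real input would be input_X_train
batch = np.random.default_rng(3).integers(0, 256, size=(8, 28, 28), dtype=np.uint8)
augmented = augment_batch(batch)
print(augmented.shape)
```

The augmented copies keep the labels of the originals, so they can be appended to the training set together with a repeated copy of the label array.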





Additional remarks:

*   Code outline appropriately commented. **(10 marks)**
*   Code running without errors. **(10 marks)**

---

