# Deep Learning Challenge
>* Section 5.3.6

* Using the fashion MNIST dataset:
* 1. Preprocess your data so that you can feed it into ANN models.
* 2. Split your data into training and test sets.
* 3. Try different ANN models and train them on your training set. You can play with the following:

>* Number of layers
>* Activation functions of the layers
>* Number of neurons in the layers
>* Different batch sizes during training

* 4. Compare your models' training scores and interpret your results.
* 5. Evaluate how your models perform on your test set. Compare the results of your models. 

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential 
from tensorflow.keras.layers import Dense

In [2]:
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

In [3]:
print("Train images shape:", X_train.shape)
print("Train labels shape:", y_train.shape)
print("Test images shape:", X_test.shape)
print("Test labels shape:", y_test.shape)

Train images shape: (60000, 28, 28)
Train labels shape: (60000,)
Test images shape: (10000, 28, 28)
Test labels shape: (10000,)


In [4]:
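input_dim = 784  # 28 * 28 pixels per image
output_dim = nb_classes = 10  # ten clothing categories
nb_epoch = 20

# Flatten each 28x28 image into a 784-element vector
X_train = X_train.reshape(60000, input_dim)
X_test = X_test.reshape(10000, input_dim)
# Convert to float and scale pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255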
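
The reshape in this cell hard-codes the sample counts (60000 and 10000). A minimal sketch of the same flatten-and-scale step that infers those sizes from the array is shown below. The helper name `flatten_and_scale` is hypothetical, and a small random array stands in for the real images so the snippet runs on its own.

```python
import numpy as np

def flatten_and_scale(images):
    """Flatten (n, 28, 28) uint8 images to (n, 784) float32 values in [0, 1]."""
    # -1 lets NumPy infer the flattened length (28 * 28 = 784) from the input
    flat = images.reshape(len(images), -1).astype("float32")
    return flat / 255.0

# Stand-in data with the same dtype and per-image shape as Fashion-MNIST
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

X = flatten_and_scale(fake_images)
print(X.shape, X.dtype)  # (5, 784) float32
```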

In [5]:
Y_train = to_categorical(y_train, nb_classes)
Y_test = to_categorical(y_test, nb_classes)
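
`to_categorical` turns each integer label into a one-hot row vector, which is the target format `categorical_crossentropy` expects. As an illustration of what that encoding looks like, here is a NumPy sketch (not the Keras implementation; `one_hot` is a hypothetical helper):

```python
import numpy as np

def one_hot(labels, num_classes):
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype="float32")[labels]

labels = np.array([0, 3, 9])
encoded = one_hot(labels, 10)
print(encoded.shape)           # (3, 10)
print(encoded.argmax(axis=1))  # [0 3 9]
```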

In [6]:
model = Sequential()
# The first dense layer
model.add(Dense(128, input_shape=(784,), activation="tanh"))
# The second dense layer
model.add(Dense(64, activation="tanh"))
# Third dense layer
model.add(Dense(32, activation="tanh"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=8, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27aeb12ad00>

In [7]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.21143603324890137
Training accuracy: 0.9205333590507507
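
Because the model is compiled with `loss='categorical_crossentropy'` and `metrics=['accuracy']`, `model.evaluate` returns the loss first and the accuracy second, so the "Training score" printed here is the cross-entropy loss (lower is better). The sketch below computes both quantities by hand on toy predictions to show what they measure. It follows the textbook definitions rather than Keras's exact internals, and the function names are hypothetical.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0); eps is an assumed small constant
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    # Average over samples of -sum(true * log(predicted))
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    # Fraction of samples whose most probable class matches the label
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))

# Two toy samples with three classes each
y_true = np.array([[1, 0, 0], [0, 1, 0]], dtype="float32")
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]], dtype="float32")

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss, acc)
```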


In [8]:
model = Sequential()
# The first dense layer
model.add(Dense(128, input_shape=(784,), activation="tanh"))
# The second dense layer
model.add(Dense(64, activation="tanh"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=8, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27aeb8d4430>

In [9]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.200212761759758
Training accuracy: 0.9269833564758301


In [10]:
model = Sequential()
# The first dense layer
model.add(Dense(128, input_shape=(784,), activation="relu"))
# The second dense layer
model.add(Dense(64, activation="relu"))
# Third dense layer
model.add(Dense(32, activation="relu"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=8, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27b123dc2b0>

In [11]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.19804686307907104
Training accuracy: 0.92535001039505


In [12]:
model = Sequential()
# The first dense layer
model.add(Dense(128, input_shape=(784,), activation="relu"))
# The second dense layer
model.add(Dense(64, activation="relu"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=8, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27b128075b0>

In [13]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.18942220509052277
Training accuracy: 0.9304500222206116


In [14]:
model = Sequential()
# The first dense layer
model.add(Dense(128, input_shape=(784,), activation="tanh"))
# The second dense layer
model.add(Dense(64, activation="tanh"))
# Third dense layer
model.add(Dense(32, activation="tanh"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=128, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27b166ff280>

In [15]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.36468929052352905
Training accuracy: 0.870199978351593
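
This run differs from the first model only in `batch_size` (128 instead of 8). With the number of epochs fixed, a larger batch means fewer weight updates, which is one plausible reason for the lower training accuracy after 20 epochs. The arithmetic can be sketched as follows, assuming the final partial batch is kept (the Keras default for NumPy inputs); `updates_per_epoch` is a hypothetical helper.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    # One gradient update per batch; the last batch may be smaller
    return math.ceil(num_samples / batch_size)

for batch_size in (8, 128):
    per_epoch = updates_per_epoch(60000, batch_size)
    print(batch_size, per_epoch, per_epoch * 20)
# 8 7500 150000
# 128 469 9380
```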


In [16]:
model = Sequential()
# The first dense layer
model.add(Dense(128, input_shape=(784,), activation="relu"))
# The second dense layer
model.add(Dense(64, activation="relu"))
# Third dense layer
model.add(Dense(32, activation="relu"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=128, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27b2872f970>

In [17]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.3533790111541748
Training accuracy: 0.8759499788284302


In [18]:
model = Sequential()
# The first dense layer
model.add(Dense(156, input_shape=(784,), activation="tanh"))
# The second dense layer
model.add(Dense(78, activation="tanh"))
# Third dense layer
model.add(Dense(39, activation="tanh"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=128, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27b16bc1340>

In [19]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.3555021584033966
Training accuracy: 0.8713499903678894
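
Widening the hidden layers from (128, 64, 32) to (156, 78, 39) increases the number of trainable parameters by roughly a quarter, which may be part of why the scores barely move. A small sketch of the parameter count for a stack of `Dense` layers (weights plus biases; `dense_param_count` is a hypothetical helper):

```python
def dense_param_count(input_dim, layer_sizes):
    """Count weights and biases for consecutive fully connected layers."""
    total = 0
    prev = input_dim
    for units in layer_sizes:
        # Each layer has prev * units weights and one bias per unit
        total += prev * units + units
        prev = units
    return total

print(dense_param_count(784, [128, 64, 32, 10]))  # 111146
print(dense_param_count(784, [156, 78, 39, 10]))  # 138187
```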


In [20]:
model = Sequential()
# The first dense layer
model.add(Dense(156, input_shape=(784,), activation="relu"))
# The second dense layer
model.add(Dense(78, activation="relu"))
# Third dense layer
model.add(Dense(39, activation="relu"))
# The last layer is the output layer
model.add(Dense(10, activation="softmax"))

model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=128, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x27b31c23fa0>

In [21]:
train_score = model.evaluate(X_train, Y_train, verbose=0)
print('Training score:', train_score[0])
print('Training accuracy:', train_score[1])

Training score: 0.34525319933891296
Training accuracy: 0.8782333135604858
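
Step 5 of the challenge asks for test-set performance. A minimal sketch of how that comparison could be made for any one trained model is given below. `compare_train_test` is a hypothetical helper, and it assumes the model was compiled with `metrics=['accuracy']` so that `evaluate` returns `[loss, accuracy]`. Since each cell rebinds `model`, it would need to be called right after each `fit`.

```python
def compare_train_test(model, X_train, Y_train, X_test, Y_test):
    """Return loss and accuracy on both splits for an already-trained model."""
    train_loss, train_acc = model.evaluate(X_train, Y_train, verbose=0)
    test_loss, test_acc = model.evaluate(X_test, Y_test, verbose=0)
    return {
        "train_loss": train_loss,
        "train_accuracy": train_acc,
        "test_loss": test_loss,
        "test_accuracy": test_acc,
        # A large positive gap suggests overfitting to the training set
        "accuracy_gap": train_acc - test_acc,
    }
```

For example, `scores = compare_train_test(model, X_train, Y_train, X_test, Y_test)` followed by `print(scores)` would report both splits side by side.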



| Activation Function | Hidden Layers (Units) | Batch Size | Training Accuracy | Training Score (Loss) |
|---------------------|-----------------------|------------|-------------------|-----------------------|
| Tanh                | 3 (128, 64, 32)       | 8          | 0.92              | 0.21                  |
| Tanh                | 2 (128, 64)           | 8          | 0.93              | 0.20                  |
|                     |                       |            |                   |                       |
| ReLU                | 3 (128, 64, 32)       | 8          | 0.93              | 0.20                  |
| ReLU                | 2 (128, 64)           | 8          | 0.93              | 0.19                  |
|                     |                       |            |                   |                       |
| Tanh                | 3 (128, 64, 32)       | 128        | 0.87              | 0.36                  |
| ReLU                | 3 (128, 64, 32)       | 128        | 0.88              | 0.35                  |
|                     |                       |            |                   |                       |
| Tanh                | 3 (156, 78, 39)       | 128        | 0.87              | 0.36                  |
| ReLU                | 3 (156, 78, 39)       | 128        | 0.88              | 0.35                  |

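* Tanh and ReLU performed very similarly across every configuration tested.
* The best models were the 2-hidden-layer networks (128, 64, plus the 10-unit output layer) trained with batch size 8.
* There was a very slight loss of training performance when using 3 hidden layers instead of 2.
<br>
* Both activations again performed very similarly when the batch size was increased to 128, but training accuracy dropped markedly (from about 0.92-0.93 to about 0.87-0.88).
* Changing the number of neurons didn't seem to have an effect at batch size 128.
<br>
* With this dataset, the smaller batch size was the main factor in the training scores. With the epochs fixed at 20, batch size 8 gives about 16 times as many SGD weight updates as batch size 128, which likely explains much of the gap.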