In [1]:
from __future__ import absolute_import, division, print_function, unicode_literals
import numpy as np
import tensorflow as tf
from tensorflow import keras as ks
print(tf.__version__)

2.4.1


In [4]:
mnist_fashion = ks.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist_fashion.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


Of the total 70,000 images, 60,000 are used for training and the 
remaining 10,000 for testing. Each label is an integer from 0 to 9 that 
identifies one of ten clothing classes. The class names are not part of 
the data set.
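
Because the class names are not shipped with the data, a lookup list is handy when inspecting predictions. The sketch below uses the standard Fashion-MNIST class order; the names `CLASS_NAMES`, `label_to_name`, and `sample_labels` are illustrative and do not appear elsewhere in this notebook.

```python
# Fashion-MNIST class names, indexed by integer label (0 to 9).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label):
    """Return the class name for an integer label."""
    return CLASS_NAMES[int(label)]


# Hypothetical labels, standing in for a few entries of training_labels.
sample_labels = [9, 0, 3]
sample_names = [label_to_name(label) for label in sample_labels]
print(sample_names)
```

The same function can be applied to entries of `training_labels` or to predicted labels once the model is trained.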

In [5]:
print(training_images.shape)
print(training_labels.shape)
print(test_images.shape)
print(test_labels.shape)

(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)


As the pixel values range from 0 to 255, we will scale them to the 
range 0 to 1 before feeding them to the model. We scale both the training 
and test data sets by dividing the values by 255.

In [6]:
training_images = training_images / 255.0
test_images = test_images / 255.0
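
As a small sanity check on the scaling step, the sketch below applies the same division to a synthetic array (an assumption, used so that no download is needed) and confirms the resulting range. The cast to `float32` is an optional extra, not something the cells above do.

```python
import numpy as np

# Synthetic stand-in for a batch of 8-bit grayscale images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by a float promotes uint8 to float64;
# casting to float32 afterwards halves the memory footprint.
scaled = (fake_images / 255.0).astype(np.float32)

print(scaled.dtype, scaled.min(), scaled.max())
```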

Reshape each image to 28 × 28 × 1, adding the single channel dimension that the convolutional layer expects.

In [7]:
training_images = training_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))

print(training_images.shape)
print(test_images.shape)

(60000, 28, 28, 1)
(10000, 28, 28, 1)
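
The reshape above hard-codes the number of images. As a sketch of alternatives that do not depend on the data set size, the following shows three equivalent ways to add the channel axis, demonstrated on a small placeholder array (`images` is an assumption, not the real data).

```python
import numpy as np

# Placeholder batch of five 28 x 28 images.
images = np.zeros((5, 28, 28), dtype=np.float32)

# -1 lets NumPy infer the number of images.
with_channel_a = images.reshape((-1, 28, 28, 1))

# expand_dims appends a new axis of length 1 at the end.
with_channel_b = np.expand_dims(images, axis=-1)

# Indexing with np.newaxis does the same thing.
with_channel_c = images[..., np.newaxis]

print(with_channel_a.shape, with_channel_b.shape, with_channel_c.shape)
```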


Now, let's build the layers of the model using the Keras implementation 
of the CNN building blocks. We will keep it simple, with only three 
stages: a convolutional layer, a pooling layer, and a fully connected classifier.

First layer: convolutional layer with ReLU activation 
function. This layer takes the 2D array (28 × 28 pixels, 
one channel) as input. We use 50 convolutional kernels 
(filters) of shape 3 × 3 pixels. The output of the 
convolution is passed through a ReLU activation function 
before going to the next layer.

In [8]:
cnn_model = ks.models.Sequential()
# 50 filters of 3 x 3 over a 28 x 28 single-channel input, followed by ReLU
cnn_model.add(ks.layers.Conv2D(50, (3, 3), activation='relu',
                               input_shape=(28, 28, 1), name='Conv2D_layer'))
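
To see where the output size and parameter count of this layer come from, here is a minimal arithmetic sketch. It assumes stride 1 and no padding, which are the Conv2D defaults; the helper names are illustrative.

```python
def conv2d_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Weights per filter plus one bias, times the number of filters."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


out_size = conv2d_output_size(28, 3)          # 28 - 3 + 1 = 26
conv_params = conv2d_param_count(3, 3, 1, 50)  # (9 + 1) * 50 = 500
print(out_size, conv_params)
```

Each of the 50 filters therefore produces a 26 × 26 feature map, and the layer has 500 trainable parameters.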

Second layer: pooling layer. This layer takes the 50 
feature maps of 26 × 26 as input and transforms them into 
the same number (50) of arrays, each with half the height 
and width (i.e., from 26 × 26 to 13 × 13 pixels).

In [9]:
cnn_model.add(ks.layers.MaxPooling2D((2,2), name='Maxpooling_2D'))
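
What 2 × 2 max pooling does to a single feature map can be sketched in NumPy as follows. This is an illustration of the operation on a toy array, not the Keras implementation; `max_pool_2x2` is a hypothetical helper.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Split rows and columns into (block index, position within block).
    blocks = feature_map[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))


toy_map = np.arange(16, dtype=np.float32).reshape(4, 4)
pooled = max_pool_2x2(toy_map)
print(pooled)

# A 26 x 26 map shrinks to 13 x 13, as described above.
print(max_pool_2x2(np.zeros((26, 26))).shape)
```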

Third layer: fully connected layer. This layer takes the 
50 13 × 13 2D arrays as input and flattens them into 
a 1D array of 8450 elements (50 × 13 × 13). These 8450 
elements are passed through a fully connected network 
with a 50-unit hidden layer, which gives the probability 
scores for each of the 10 output labels (at the output layer).

In [10]:
cnn_model.add(ks.layers.Flatten(name='Flatten'))
cnn_model.add(ks.layers.Dense(50, activation='relu', name='Hidden_layer'))
cnn_model.add(ks.layers.Dense(10, activation='softmax', name='Output_layer'))

Let's check the layers with the summary method.

In [11]:
cnn_model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
Conv2D_layer (Conv2D)        (None, 26, 26, 50)        500       
_________________________________________________________________
Maxpooling_2D (MaxPooling2D) (None, 13, 13, 50)        0         
_________________________________________________________________
Flatten (Flatten)            (None, 8450)              0         
_________________________________________________________________
Hidden_layer (Dense)         (None, 50)                422550    
_________________________________________________________________
Output_layer (Dense)         (None, 10)                510       
Total params: 423,560
Trainable params: 423,560
Non-trainable params: 0
_________________________________________________________________
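
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the layer sizes shown above; `dense_param_count` is an illustrative helper.

```python
def dense_param_count(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


conv_params = (3 * 3 * 1 + 1) * 50                     # 500
flattened = 50 * 13 * 13                               # 8450
hidden_params = dense_param_count(flattened, 50)       # 422550
output_params = dense_param_count(50, 10)              # 510
total_params = conv_params + hidden_params + output_params

print(flattened, hidden_params, output_params, total_params)
```

Almost all of the 423,560 parameters sit in the hidden dense layer, since every one of the 8450 flattened values connects to each of its 50 units.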


Now we configure training with the compile method. The Adam optimizer 
minimizes the sparse_categorical_crossentropy loss, which works directly 
with integer labels, and accuracy is reported as a metric during training 
and evaluation:

In [13]:
cnn_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
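
For intuition about the loss, here is a plain-Python sketch of sparse categorical cross-entropy: the mean negative log of the probability assigned to the true class. The probabilities and labels are made up for illustration, and the small `eps` clip is an assumption added to guard against log(0).

```python
import math


def sparse_categorical_crossentropy(labels, probabilities, eps=1e-7):
    """Mean of -log(p[true class]) over all samples."""
    losses = []
    for label, probs in zip(labels, probabilities):
        p_true = min(max(probs[label], eps), 1.0 - eps)  # avoid log(0)
        losses.append(-math.log(p_true))
    return sum(losses) / len(losses)


# Hypothetical softmax outputs for two samples and three classes.
example_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
example_labels = [0, 1]
example_loss = sparse_categorical_crossentropy(example_labels, example_probs)
print(round(example_loss, 4))
```

The "sparse" variant takes the integer label directly, so the labels do not need to be one-hot encoded first.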

Model training:

In [14]:
cnn_model.fit(training_images, training_labels, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fd6861dddd0>
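
Since no batch_size is passed to fit, Keras uses its default of 32. The number of gradient updates per epoch then follows from simple arithmetic, sketched here with an illustrative helper:

```python
import math


def steps_per_epoch(num_samples, batch_size=32):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


steps = steps_per_epoch(60000)   # 60000 / 32 = 1875
total_updates = steps * 10       # over 10 epochs
print(steps, total_updates)
```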

Model evaluation:

In [15]:
training_loss, training_accuracy = cnn_model.evaluate(training_images, training_labels)
print('Training accuracy {}'.format(round(float(training_accuracy), 2)))

Training accuracy 0.96


In [16]:
test_loss, test_accuracy = cnn_model.evaluate(test_images, test_labels)
print('Test Accuracy {}'.format(round(float(test_accuracy), 2)))

Test Accuracy 0.91
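
The accuracy metric reported above is the fraction of images whose highest-probability class matches the true label. A minimal sketch of that calculation, using made-up probabilities rather than real model output:

```python
def accuracy_from_probabilities(probabilities, labels):
    """Fraction of samples where the argmax class equals the label."""
    predicted = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
    correct = sum(int(pred == label) for pred, label in zip(predicted, labels))
    return correct / len(labels)


# Hypothetical two-class outputs for four samples.
demo_probs = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
demo_labels = [1, 0, 0, 0]
demo_accuracy = accuracy_from_probabilities(demo_probs, demo_labels)
print(demo_accuracy)
```

With the trained model, the same comparison could be made by taking the argmax over the rows of `cnn_model.predict(test_images)` and comparing against `test_labels`.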
