Import the Zalando dataset

In [1]:
from tensorflow.keras.datasets import fashion_mnist
import numpy as np

((trainX, trainY), (testX, testY)) = fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


Prepare the data (reshaping the samples and one-hot encoding the labels):

In [2]:
labels_train = np.zeros((60000, 10))
labels_train[np.arange(60000), trainY] = 1
data_train = trainX.reshape(60000, 28, 28, 1)

labels_test = np.zeros((10000, 10))
labels_test[np.arange(10000), testY] = 1
data_test = testX.reshape(10000, 28, 28, 1)
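
The indexing trick used for the labels can be checked on a small example. The following is a minimal sketch, using a made-up label vector (`toy_labels` and `num_classes` are hypothetical and not part of the dataset), that builds a one-hot matrix in the same way and compares it with an equivalent lookup into the identity matrix.

```python
import numpy as np

# Hypothetical toy labels standing in for trainY (3 samples, 4 classes)
toy_labels = np.array([2, 0, 3])
num_classes = 4

# Same approach as above: start from zeros, then set one entry per row to 1
one_hot = np.zeros((len(toy_labels), num_classes))
one_hot[np.arange(len(toy_labels)), toy_labels] = 1

# Equivalent alternative: select rows of the identity matrix
one_hot_eye = np.eye(num_classes)[toy_labels]

print(one_hot)
```

Taking `argmax` along the class axis recovers the original integer labels, which is a quick way to verify the encoding.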

Note that in this case the network’s inputs are tensors of dimensions
(number_of_images, image_height, image_width, color_channels). Since the
Zalando dataset is made up of grayscale images, color_channels is equal to 1.
Unlike feed-forward neural networks, which take flattened tensors as input (one
observation per row), a CNN keeps the two-dimensional structure of each image.
Check the dimensions with the following code:

In [3]:
print('Dimensions of the training dataset: ', data_train.shape)
print('Dimensions of the test dataset: ', data_test.shape)
print('Dimensions of the training labels: ', labels_train.shape)
print('Dimensions of the test labels: ', labels_test.shape)

Dimensions of the training dataset:  (60000, 28, 28, 1)
Dimensions of the test dataset:  (10000, 28, 28, 1)
Dimensions of the training labels:  (60000, 10)
Dimensions of the test labels:  (10000, 10)


Normalize the data by rescaling the pixel values to the range [0, 1]:

In [4]:
data_train_norm = np.array(data_train / 255.0)
data_test_norm = np.array(data_test / 255.0)
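
The effect of the division can be verified on synthetic data. This is a small sketch, assuming 8-bit pixel values between 0 and 255 (`toy_images` is a hypothetical random array, not the real dataset): dividing by 255.0 converts the integers to floating point and rescales them to the range [0, 1] without changing the shape.

```python
import numpy as np

# Hypothetical stand-in for two 28x28 grayscale images with a channel axis
toy_images = np.random.randint(0, 256, size=(2, 28, 28, 1), dtype=np.uint8)

# Dividing by 255.0 promotes uint8 to floating point and rescales to [0, 1]
toy_norm = toy_images / 255.0

print(toy_norm.dtype, toy_norm.shape, toy_norm.min(), toy_norm.max())
```

Since the division already returns a new array, the extra `np.array(...)` wrapper in the cell above is harmless but probably not required.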

Build the network. With Keras, creating and training a CNN
model is straightforward; the following function defines the network’s architecture:

In [8]:
from tensorflow.keras import models, layers


def build_model():
	# create model
	model = models.Sequential()
	model.add(layers.Conv2D(6, (5, 5), strides=(1, 1),
	                        activation='relu',
	                        input_shape=(28, 28, 1)))
	model.add(layers.MaxPooling2D(pool_size=(2, 2),
	                              strides=(2, 2)))
	model.add(layers.Conv2D(16, (5, 5), strides=(1, 1),
	                        activation='relu'))
	model.add(layers.MaxPooling2D(pool_size=(2, 2),
	                              strides=(2, 2)))
	model.add(layers.Flatten())
	model.add(layers.Dense(128, activation='relu'))
	model.add(layers.Dense(10, activation='softmax'))
	# compile model
	model.compile(loss='categorical_crossentropy',
	              optimizer='adam',
	              metrics=['categorical_accuracy'])
	return model

When building CNNs in Keras, each layer corresponds to a single line of code (a call
to model.add with a Keras layer class). The build_model function creates a CNN by
stacking Conv2D (which builds a convolutional layer) and MaxPooling2D (which builds
a max pooling layer) layers. The stride is a tuple since it gives the stride along each
dimension (rows and columns). In our example we have grayscale images, but we could
also have RGB images, for example. In that case the last dimension of the input would
have size 3 (one entry per color channel) instead of 1.

Display the architecture of the model so far, using model.summary():

In [9]:
model = build_model()
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 24, 24, 6)         156       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 12, 12, 6)        0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 8, 8, 16)          2416      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 4, 4, 16)         0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 256)               0         
                                                                 
 dense (Dense)               (None, 128)               3

Note that, for each sample, the output of every convolutional and pooling layer is a
3D tensor of shape (height, width, number_of_filters). The first dimension in the
summary (i.e., the batch size) is set to None since the network does not know it yet;
the model can thus be applied to batches of any size. The width and height dimensions
decrease as you go deeper into the network. The number of output channels of each
Conv2D layer is controlled by its first argument. Typically, as the width and height
decrease, you can afford (computationally) to add more output filters to each Conv2D layer.
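
The shrinking height and width in the summary can be reproduced by hand. The sketch below assumes the Keras default of no padding ('valid') for both Conv2D and MaxPooling2D, in which case the output length along one axis is floor((size - window) / stride) + 1. The helper `valid_output_size` is a hypothetical name introduced only for this illustration.

```python
def valid_output_size(size, window, stride):
    """Output length along one axis when no padding is applied."""
    return (size - window) // stride + 1


size = 28                                 # input images are 28x28
sizes = []
size = valid_output_size(size, 5, 1)      # conv2d: 5x5 kernel, stride 1
sizes.append(size)
size = valid_output_size(size, 2, 2)      # max_pooling2d: 2x2 window, stride 2
sizes.append(size)
size = valid_output_size(size, 5, 1)      # conv2d_1: 5x5 kernel, stride 1
sizes.append(size)
size = valid_output_size(size, 2, 2)      # max_pooling2d_1: 2x2 window, stride 2
sizes.append(size)

flattened = size * size * 16              # 16 filters in the last Conv2D layer
print(sizes, flattened)
```

The resulting sizes (24, 12, 8, 4) and the flattened length of 256 match the Output Shape column of the summary.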

To complete the model, we added two Dense layers. They take vectors as input
(which are 1D), while the current output is a 3D tensor. This is why you first need to
flatten the 3D output to 1D, then add one or more Dense layers on top.
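
The Param # column of the summary can also be checked with simple arithmetic. This is a minimal sketch assuming the standard counts for these layer types: a convolutional layer has one weight per kernel entry and input channel plus one bias for each filter, and a dense layer has one weight per input plus one bias for each unit. The helper names `conv2d_params` and `dense_params` are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


counts = {
    "conv2d": conv2d_params(5, 5, 1, 6),      # 1 input channel, 6 filters
    "conv2d_1": conv2d_params(5, 5, 6, 16),   # 6 input channels, 16 filters
    "dense": dense_params(256, 128),          # 256 flattened features
    "dense_1": dense_params(128, 10),         # 10 output classes
}
total = sum(counts.values())
print(counts, total)
```

The two convolutional counts (156 and 2416) agree with the values listed in the summary; note that most of the parameters sit in the first Dense layer, not in the convolutional layers.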

Train and test the network. We use mini-batch training (with the Adam optimizer
chosen in build_model) with a batch size of 100, and train the network for ten epochs.
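
As a quick sanity check on these settings, the number of weight updates follows directly from the dataset size and the batch size. The sketch below is plain arithmetic under the assumption that every batch triggers exactly one update; the variable names are chosen only for this illustration.

```python
import math

num_train = 60000    # training images
batch_size = 100
epochs = 10

# One update per mini-batch; the last batch may be smaller in general
steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

The 600 steps per epoch correspond to the `1/600` counter shown in the training progress output.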

If you run the following code (it took roughly four minutes on a medium-performance laptop),
the training accuracy will be about 76.3% after just one epoch. After ten epochs it will
reach a training accuracy of 91% (88% on the test set, which is used here as validation data).

In [10]:
model.fit(data_train_norm, labels_train, validation_data=(data_test_norm, labels_test),
          epochs=10, batch_size=100, verbose=1)

Epoch 1/10
  1/600 [..............................] - ETA: 1:34 - loss: 2.3344 - categorical_accuracy: 0.0600

Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x167ae6440>

Try changing the network’s hyperparameters (for example, the kernel size, stride,
and padding) to see whether you can get a better accuracy.
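
Before retraining, it can help to estimate how such changes affect the size of the feature maps, since this determines the number of inputs to the first Dense layer. The following sketch assumes the usual Keras rules for the two padding modes ('valid': ceil((size - window + 1) / stride), 'same': ceil(size / stride)) and the same two-stage layout as build_model. The helpers `output_size` and `flattened_features` are hypothetical and do not build or train a model.

```python
import math


def output_size(size, window, stride, padding):
    """Output length along one axis for 'valid' or 'same' padding."""
    if padding == "same":
        return math.ceil(size / stride)
    return math.ceil((size - window + 1) / stride)


def flattened_features(kernel, conv_stride, padding,
                       input_size=28, last_filters=16):
    """Number of inputs to the first Dense layer for a given configuration."""
    size = input_size
    for _ in range(2):  # two Conv2D + MaxPooling2D stages, as in build_model
        size = output_size(size, kernel, conv_stride, padding)  # Conv2D
        size = output_size(size, 2, 2, "valid")                 # 2x2 max pooling
    return size * size * last_filters


results = {}
for kernel, stride, padding in [(5, 1, "valid"), (3, 1, "valid"),
                                (5, 1, "same"), (3, 2, "same")]:
    results[(kernel, stride, padding)] = flattened_features(kernel, stride, padding)
    print(kernel, stride, padding, results[(kernel, stride, padding)])
```

The first configuration is the one used in this section. Larger flattened sizes mean more parameters in the first Dense layer, so any accuracy gain has to be weighed against longer training times.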