# Training a Convolutional Neural Network


In this notebook we will train a convolutional neural network on Zalando's Fashion-MNIST dataset (28x28 grayscale images of clothing). Let's start by preparing the data:

In [1]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt 

In [10]:

class_names = ["T-Shirt","Trouser","Pullover","Dress","Coat","Sandal","Shirt","Sneaker","Bag","Ankle Boot"]


fashion_mnist = keras.datasets.fashion_mnist

(x_train,y_train),(x_test,y_test) = fashion_mnist.load_data()

#We have to reshape in order to add the color channel. The other dimensions remain the same

x_train = x_train.reshape((len(x_train),x_train[0].shape[0],x_train[0].shape[1],1))
x_test = x_test.reshape((len(x_test),x_test[0].shape[0],x_test[0].shape[1],1))

x_train = x_train.astype('float32')/255
x_test = x_test.astype('float32')/255
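
To make the reshape and scaling steps concrete, here is a minimal sketch on synthetic data. The array `fake_images` is a made-up stand-in for a small batch of images (it is not part of the dataset), and it is used only so that the example runs without downloading anything. It shows that the `reshape` call above is equivalent to appending a new axis with `np.newaxis`.

```python
import numpy as np

# Synthetic stand-in for a batch of images (assumption: 5 images, 28x28, uint8).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Two equivalent ways of appending a channel axis of size 1.
with_reshape = fake_images.reshape((len(fake_images), 28, 28, 1))
with_newaxis = fake_images[..., np.newaxis]
assert np.array_equal(with_reshape, with_newaxis)

# Scale the pixel values from [0, 255] to [0, 1] as float32.
scaled = with_newaxis.astype("float32") / 255

print(scaled.shape, scaled.dtype)
```

The resulting array has one extra dimension of size 1, which is the channel axis that `Conv2D` expects.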



Now we build the neural network (the same code we had in the `Convolutional_Neural_Network` notebook):

In [22]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten


model = Sequential()
model.add(Conv2D(32,(5,5), activation="relu", #convolutional layer with 32 filters and 5x5 windows; the activation function is ReLU
         input_shape=(28,28,1))) #the input shape is specified in the first layer. Images are grayscale, so there is only one color channel
model.add(MaxPooling2D(2,2)) #pooling with a 2x2 window (stride 2)
model.add(Conv2D(64,(5,5), activation="relu")) #remember that the input shape only needs to be specified in the first layer
model.add(MaxPooling2D(2,2)) #pooling with a 2x2 window (stride 2)
model.add(Flatten()) #remember that we need to flatten the 3D tensor to 1D in order to feed a densely connected layer
model.add(Dense(10,activation="softmax")) #output layer: one probability per class
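
As a rough check of what this architecture does to the image size, the following sketch traces the height/width through the layers by hand. The helper names `conv_valid` and `pool` are hypothetical (they are not Keras functions), and the arithmetic assumes stride-1 convolutions without padding and non-overlapping 2x2 pooling, which matches the defaults used above.

```python
def conv_valid(size, kernel):
    """Output height/width of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool(size, window=2):
    """Output height/width of non-overlapping pooling (floor division)."""
    return size // window

size = 28                    # input images are 28x28
size = conv_valid(size, 5)   # first Conv2D with a 5x5 window
size = pool(size)            # first MaxPooling2D
size = conv_valid(size, 5)   # second Conv2D with a 5x5 window
size = pool(size)            # second MaxPooling2D

# Flatten turns the (size, size, 64) tensor into a single vector.
flattened = size * size * 64

print(size, flattened)  # 4 1024
```

So, under these assumptions, the final dense layer receives a vector of 1024 values per image.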

Please bear in mind that we have **not** converted our labels to categorical (one-hot) form. This is because we will use `sparse_categorical_crossentropy` as the loss function. It is, in essence, the same as `categorical_crossentropy`, but it allows us to pass the labels as integers rather than in one-hot encoded form (for more information, see https://www.dlology.com/blog/how-to-use-keras-sparse_categorical_crossentropy/; the Keras documentation also covers it).
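
To illustrate why the two losses are equivalent, here is a small NumPy sketch. It is not the Keras implementation (Keras also clips probabilities for numerical stability, for instance); the two functions and the toy probabilities are written only for this illustration.

```python
import numpy as np

def categorical_crossentropy(one_hot_labels, probs):
    """Mean cross-entropy when labels are one-hot encoded."""
    return float(np.mean(-np.sum(one_hot_labels * np.log(probs), axis=1)))

def sparse_categorical_crossentropy(int_labels, probs):
    """Mean cross-entropy when labels are integer class indices."""
    picked = probs[np.arange(len(int_labels)), int_labels]
    return float(np.mean(-np.log(picked)))

# Toy softmax outputs for three samples and three classes (made-up values).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]])
int_labels = np.array([0, 1, 2])
one_hot_labels = np.eye(3)[int_labels]  # one-hot version of the same labels

loss_sparse = sparse_categorical_crossentropy(int_labels, probs)
loss_dense = categorical_crossentropy(one_hot_labels, probs)
print(loss_sparse, loss_dense)
```

Both calls print the same value: the one-hot vector simply selects the probability of the true class, which is exactly what indexing with the integer label does.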


Now that we have explained this, it is time to train the neural network:

In [23]:
model.compile(optimizer='sgd',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy']) #here we apply the new loss function

model.fit(x_train,y_train,epochs=5)



Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x28d45ff56c8>

In [25]:
test_loss,test_acc = model.evaluate(x_test,y_test)

print("Test accuracy", test_acc)

Test accuracy 0.8313000202178955


As you can see, one epoch with this neural network takes significantly longer than with a basic one. However, the results are much better (83% test accuracy, compared to 76% with a basic network).
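
For reference, the `accuracy` value reported above is the fraction of images whose most probable class matches the true label. A minimal NumPy sketch of that computation follows; the probabilities, labels and shortened `class_names` list are made up for the illustration and do not come from the trained model.

```python
import numpy as np

# Shortened class list, for the illustration only.
class_names = ["T-Shirt", "Trouser", "Pullover"]

# Hypothetical softmax outputs for four images (one row per image).
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
true_labels = np.array([0, 1, 2, 1])

# The predicted class is the index of the largest probability in each row.
predicted = np.argmax(probs, axis=1)

# Accuracy is the share of predictions that match the integer labels.
accuracy = float(np.mean(predicted == true_labels))
predicted_names = [class_names[i] for i in predicted]

print(accuracy, predicted_names)
```

Here three of the four predictions are correct, so the accuracy is 0.75.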



## Improving the neural network


Maybe we can get even better results if we add more filters to the first convolutional layer (64 instead of 32) and a second densely connected layer before the last one. Apart from this, we can also use padding in the convolutional layers. Remember that padding allows us to keep the height and width of the output equal to those of the input. To do this, we add the parameter `padding="same"` to each convolutional layer that we want padded.

In [27]:
model = Sequential()
model.add(Conv2D(64,(5,5), activation="relu",padding="same", #padding added
         input_shape=(28,28,1))) 
model.add(MaxPooling2D(2,2))
model.add(Conv2D(64,(5,5), activation="relu",padding="same")) 
model.add(MaxPooling2D(2,2)) 
model.add(Flatten()) 
model.add(Dense(64,activation="relu"))
model.add(Dense(10,activation="softmax"))

Let's have a look at the summary to check that the outputs of the convolutional layers have the same height and width as their inputs:

In [28]:
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_6 (Conv2D)            (None, 28, 28, 64)        1664      
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 14, 14, 64)        0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 14, 14, 64)        102464    
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 7, 7, 64)          0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 3136)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 64)                200768    
_________________________________________________________________
dense_4 (Dense)              (None, 10)                650       

As you can see, the first convolutional layer keeps the (28,28) shape of the input (without padding, its output was a 24x24 matrix), while the second one keeps the (14,14) shape it receives (the pooling layer in between halves each dimension).
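
The parameter counts in the summary can also be reproduced by hand, which is a useful sanity check. The sketch below uses two hypothetical helpers, `conv_params` and `dense_params` (they are not part of Keras), and assumes every layer has a bias term, which is the Keras default.

```python
def conv_params(kernel, in_channels, filters):
    """Weights of a kernel x kernel convolution plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights of a densely connected layer plus one bias per unit."""
    return inputs * units + units

conv1 = conv_params(5, 1, 64)     # first Conv2D: 1 input channel, 64 filters
conv2 = conv_params(5, 64, 64)    # second Conv2D: 64 input channels, 64 filters
flat = 7 * 7 * 64                 # size of the vector produced by Flatten
dense1 = dense_params(flat, 64)   # hidden dense layer with 64 units
dense2 = dense_params(64, 10)     # output layer with 10 classes
total = conv1 + conv2 + dense1 + dense2

print(conv1, conv2, flat, dense1, dense2, total)
# 1664 102464 3136 200768 650 305546
```

Note that most of the parameters sit in the first dense layer, not in the convolutional layers: the flattened 7x7x64 tensor is connected to each of the 64 units.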


Let's train the model. It will take even more time than before, so you can let the code run and go grab a cup of tea/coffee :)

In [29]:
model.compile(optimizer='sgd',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy']) #same loss function as before

model.fit(x_train,y_train,epochs=5)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x28d4614ccc8>

In [30]:
test_loss,test_acc = model.evaluate(x_test,y_test)

print("Test accuracy", test_acc)

Test accuracy 0.8730000257492065


As we should have expected, we obtain even better results (87% test accuracy). This is still not the best we can do, but we will leave further improvements for the next notebook, `Improving_Convolutional_Neural_Network`.