# Introduction to Convolutional Neural Networks

Fully connected neural networks typically don't work well on images. If each pixel is an input, the number of parameters grows very quickly with the size of the image and with the number of neurons in each layer. Let's say you had a 32 by 32 image with three color channels. A single fully connected neuron in the first hidden layer of a regular neural network would have 32 × 32 × 3 = 3,072 weights. For a color image that is still fairly small, say 200 wide by 200 high with three color channels, a single fully connected neuron in the first hidden layer would have 200 × 200 × 3 = 120,000 weights, and a layer contains many such neurons.

The other challenge is that a number of parameters this large can quickly lead to overfitting. One workaround is to use smaller images, but then we clearly lose information. What we have not taken into account is that what makes one image distinguishable from another is its spatial structure: pixels that are close to each other are strongly related. In the next video, we will look at how we can preserve the spatial size of our input volumes.
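
The weight counts above are simple multiplication, which we can check quickly. This is a minimal sketch: the function name `weights_per_neuron` is made up for illustration, the bias term is ignored as in the text, and the hidden layer width of 1,000 neurons is an assumed value.

```python
def weights_per_neuron(width, height, channels):
    """Weights needed by one fully connected neuron that sees a whole image.

    Every pixel in every channel gets its own weight (bias not counted).
    """
    return width * height * channels


small = weights_per_neuron(32, 32, 3)
large = weights_per_neuron(200, 200, 3)
print(small)  # 3072
print(large)  # 120000

# A hidden layer multiplies this by its number of neurons.
# 1,000 neurons is an assumed layer width, for illustration only.
hidden_units = 1000
print(large * hidden_units)  # 120000000
```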

As before, we need to load the libraries first. In regular neural networks, we only had the fully connected layer, otherwise known as the dense layer. With convolutional neural networks, we have more operations: the convolution operation, max pooling, flattening, and also a fully connected or dense layer. We'll use Sequential from keras.models, because this gives us a linear stack of neural network layers, and we'll use the MNIST data set as in the previous example, as this is one of the data sets available with Keras. Finally, to_categorical converts the integer labels into one-hot vectors, so that each label is spread across 10 categories or bins.
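
To get a feel for the new operations before using the Keras layers, here is a small pure-Python sketch of convolution, max pooling and flattening on a toy 5 by 5 image. The helper names, the image and the kernel are all made up for illustration. Like deep learning libraries, the convolution here does not flip the kernel, and it uses no padding and a stride of 1.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]


def flatten(feature_map):
    """Turn a 2D grid into a flat list, row by row."""
    return [value for row in feature_map for value in row]


# Toy data, chosen only for illustration.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 2, 1, 0, 1],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d_valid(image, kernel)  # 5x5 image, 2x2 kernel -> 4x4
pooled = max_pool_2x2(feature_map)         # 4x4 -> 2x2
flat = flatten(pooled)                     # 2x2 -> 4 values

print(len(feature_map), len(feature_map[0]))  # 4 4
print(pooled)                                 # [[0, 1], [0, 2]]
print(flat)                                   # [0, 1, 0, 2]
```

Notice that the same four kernel weights are reused at every position of the image, which is why a convolutional layer needs far fewer parameters than a dense layer.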

## Import the libraries

In [1]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential
from keras.datasets import mnist
from keras.utils import to_categorical

import matplotlib.pyplot as plt
%matplotlib inline

## Load the data

In [2]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [3]:
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)


## Pre-processing
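Our MNIST images only have a depth of 1 (a single grayscale channel), but we must explicitly declare that by reshaping the data to include the channel dimension. We also scale the pixel values to the range 0 to 1 and one-hot encode the labels.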

In [4]:
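num_classes = 10
epochs = 3

# Add an explicit channel axis: (samples, 28, 28) -> (samples, 28, 28, 1)
X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)
# Scale pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0
# One-hot encode the digit labels: (samples,) -> (samples, 10)
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)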

In [5]:
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

(60000, 28, 28, 1)
(60000, 10)
(10000, 28, 28, 1)
(10000, 10)
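
The label shapes change from (60000,) to (60000, 10) because each digit label becomes a row of ten values with a single 1 in it. The sketch below mimics that behaviour in plain Python so we can see what is happening. The `one_hot` helper and the three example labels are made up for illustration, and this is not how Keras implements `to_categorical` internally.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        encoded.append(row)
    return encoded


labels = [5, 0, 4]  # example digit labels, chosen for illustration
encoded = one_hot(labels, num_classes=10)

for label, row in zip(labels, encoded):
    print(label, row)

# Shape goes from (3,) to (3, 10)
print(len(encoded), len(encoded[0]))  # 3 10
```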


## Create and compile the model

As before, it's helpful to have a view of what the model we're trying to create looks like. We start with our original image, which is 28 by 28 with one channel, so it's a grayscale image. We then do a convolution operation with a five by five kernel and 32 filters. Because the code below uses padding='same', the feature maps stay 28 by 28. We then do a max pooling, which halves the image from 28 by 28 to 14 by 14. We then do another convolution with a five by five kernel, this time with 64 filters, followed by another pooling, which again halves the image, from 14 by 14 to 7 by 7. (Without padding, the same layers would give 24 by 24, 12 by 12, eight by eight and four by four.) Finally, the result is flattened into 7 × 7 × 64 = 3,136 values and passed to a fully connected layer with 1,024 nodes. All of the 1,024 nodes terminate in the ten outputs, which correspond to the ten digits, zero to nine.

![CNN](images/cnn-model.jpg)
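
The feature-map sizes can be traced with a little arithmetic. The sketch below assumes a stride of 1 for the convolutions and 2 by 2 max pooling, which are the Keras defaults used in this notebook. It compares padding='same', which the model code uses, with padding='valid' (no padding). The helper names are made up for illustration.

```python
import math


def conv_output_size(size, kernel, padding, stride=1):
    """Spatial size after a convolution along one dimension."""
    if padding == 'same':
        return math.ceil(size / stride)
    if padding == 'valid':
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


def trace_sizes(size, padding, kernel=5, stages=2):
    """Sizes after each conv and pool layer, starting from the input size."""
    sizes = [size]
    for _ in range(stages):
        size = conv_output_size(size, kernel, padding)
        sizes.append(size)
        size = pool_output_size(size)
        sizes.append(size)
    return sizes


same_sizes = trace_sizes(28, 'same')
valid_sizes = trace_sizes(28, 'valid')
print(same_sizes)   # [28, 28, 14, 14, 7]
print(valid_sizes)  # [28, 24, 12, 8, 4]

# Values entering the dense layer after flattening 64 feature maps
print(same_sizes[-1] ** 2 * 64)   # 3136
print(valid_sizes[-1] ** 2 * 64)  # 1024
```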

In [6]:
cnn = Sequential()

In [7]:
cnn.add(Conv2D(32, kernel_size=(5,5), input_shape=(28,28,1), padding='same', activation='relu'))

In [8]:
cnn.add(MaxPooling2D())

In [9]:
cnn.add(Conv2D(64, kernel_size=(5,5), padding='same', activation='relu'))

In [10]:
cnn.add(MaxPooling2D())

In [11]:
cnn.add(Flatten())

In [12]:
cnn.add(Dense(1024, activation='relu'))

In [13]:
cnn.add(Dense(10,activation='softmax'))

In [14]:
cnn.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])

In [15]:
print(cnn.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 28, 28, 32)        832       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 64)        51264     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 3136)              0         
_________________________________________________________________
dense (Dense)                (None, 1024)              3212288   
_________________________________________________________________
dense_1 (Dense)              (None, 10)                10250     
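
The Param # column can be reproduced by hand. Each convolution filter has one weight per kernel position per input channel plus one bias, and each dense unit has one weight per input plus one bias. The sketch below is only a check on the arithmetic, with helper names made up for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


param_counts = {
    'conv2d': conv2d_params(5, 5, 1, 32),
    'conv2d_1': conv2d_params(5, 5, 32, 64),
    'dense': dense_params(7 * 7 * 64, 1024),
    'dense_1': dense_params(1024, 10),
}

for name, count in param_counts.items():
    print(name, count)

total_params = sum(param_counts.values())
print('total', total_params)  # total 3274634
```

Almost all of the parameters sit in the first dense layer. The two convolutional layers together account for only about 52,000 of them.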

## Train the model

Training this model takes between 15 and 20 minutes per epoch. You can train the model this way if you want, but I've left the code commented out and I'm going to show you an alternative way of obtaining the weights. Keras makes this very easy: we call load_weights on the model, cnn, and point it at the file cnn-model5.h5, which is stored in the weights folder.

In [16]:
#history_cnn = cnn.fit(X_train, y_train, epochs=5, verbose=1, validation_data=(X_test, y_test))

In [17]:
# Keys are 'acc'/'val_acc' in older Keras releases and 'accuracy'/'val_accuracy' in newer ones
#plt.plot(history_cnn.history['acc'])
#plt.plot(history_cnn.history['val_acc'])

In [18]:
cnn.load_weights('weights/cnn-model5.h5')

In [19]:
score = cnn.evaluate(X_test,y_test)



In [20]:
score

[0.026782656088471413, 0.9929999709129333]

If you recall, our regular neural network model had an accuracy of 97.78%. So let's take a look at how well our convolutional neural network does. Remember that score is a list holding the loss and the accuracy, so we're looking for the second element. We can see that the accuracy of our convolutional neural network is about 99.3%, which is better than that of our neural network model. The difference between 97.8% and 99.3% might not seem significant, but it means the error rate falls from about 2.2% to about 0.7%, roughly a third as many mistakes. When you're looking at thousands or tens of thousands of images, that makes a big difference to the predictive capability of a model.
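
To make the comparison concrete, we can turn the two accuracies into an approximate number of misclassified images on the 10,000-image test set. This is a rough sketch: the helper name `expected_errors` is made up for illustration, and the accuracies are the rounded values quoted above.

```python
def expected_errors(accuracy, num_images):
    """Approximate number of misclassified images at a given accuracy."""
    return round((1 - accuracy) * num_images)


test_images = 10000
dense_errors = expected_errors(0.9778, test_images)  # regular neural network
cnn_errors = expected_errors(0.993, test_images)     # convolutional network

print(dense_errors)               # 222
print(cnn_errors)                 # 70
print(dense_errors - cnn_errors)  # 152
```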