# Convolutional Neural Networks with Keras


This will illustrate the basics of convolutional neural networks, using Keras to create an image classification network. WITHOUT A LOT OF MATH

## What is a Convolutional whatnot?

A better question is this: 

Is the image below a dog or a cat?

![Dog Image](https://www.what-dog.net/Images/faces2/scroll0015.jpg)


Obviously it's a dog!



It took you, a human, no time at all to say that that is a dog. Our brains are cool like that. So how did you know that's a dog? Well, you were taught what dogs look like, vs what cats look like, by your parents. When you asked what that fluffy thing was, they told you it was a dog, and the same goes for the cat.

Ok, easy, teach babies what cats and dogs are. Not that hard, right? Well, yes. Our brains, even at a young age, are far more advanced than computers. So how would you teach a computer?


## Computers vs. Humans: Vision

If you know the physics behind our eyes, you will know that they take in wavelengths of light that bounce off objects around us, and our brains interpret them. It's not hard for us to see because that process is done subconsciously. It's very easy for us to recognize images just by looking at them, but computers don't have the same ability.

A picture like this:
![Lion Image](https://a-z-animals.com/media/animals/images/original/lion7.jpg)

Is easy to recognize as a lion.


For computers it's a bit different. Images are stored as a bunch of numbers.

![Matrix](http://chem4823.usask.ca/images/matrices1.gif)

Try classifying this. You can't. A, it means nothing; it's just a bunch of random numbers I found on Google. B, it's not in a format you understand.

Computers have to take images as a bunch of numbers, which they can understand. So how can computers classify images? Well, they use something called neural networks.
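To make "a bunch of numbers" concrete, here is a minimal sketch using NumPy. The tiny 2x2 image and its pixel values are made up for illustration; the CIFAR-10 images used later have shape (32, 32, 3).

```python
import numpy as np

# A made-up 2x2 color image: height x width x 3 color channels (red, green, blue).
# Each value is a brightness from 0 (dark) to 255 (bright).
tiny_image = np.array([
    [[255, 0, 0], [0, 255, 0]],      # red pixel, green pixel
    [[0, 0, 255], [255, 255, 255]],  # blue pixel, white pixel
], dtype=np.uint8)

print(tiny_image.shape)  # (2, 2, 3)
print(tiny_image[0, 0])  # the three numbers that make up the top-left (red) pixel
```

To the computer, the picture is nothing more than this grid of numbers.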

## Computers vs. Humans: Networks

The idea of a neural network actually stems from biology: networks of neurons are the basis for how our brains learn.

![Neural Network](https://upload.wikimedia.org/wikipedia/commons/thumb/b/b5/Neuron.svg/1200px-Neuron.svg.png)

The neural network a computer uses follows a similar pattern. Input comes from a previous neuron and gets passed along with a certain strength (a weight), which plays the role of the strength of the synapse between the two neurons.
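As a rough sketch of that idea, a single artificial neuron can be written as a weighted sum of its inputs. The function name `neuron_output` and all the numbers below are made up for illustration.

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias (before any activation function)."""
    return float(np.dot(inputs, weights) + bias)

inputs = np.array([0.5, 1.0, -1.0])   # signals coming from three previous neurons
weights = np.array([0.8, 0.1, 0.4])   # connection ("synapse") strengths
result = neuron_output(inputs, weights, bias=0.0)
print(result)
```

Training a network mostly comes down to nudging those weights until the outputs are useful.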


## Back to Convolutional whatnot...

To classify images using a computer, we use a type of neural network called a Convolutional Neural Network (CNN). A CNN slides small filters over an image, picks up patterns in the pixel values, and passes what it finds on to the rest of the network for classification.

![Neural Network Image](http://cs231n.github.io/assets/nn1/neural_net2.jpeg)

![Network Image 2](http://cs231n.github.io/assets/cnn/convnet.jpeg)
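The "sliding over the image" part can be sketched in a few lines of NumPy. This is a simplified illustration, not how Keras implements it: it handles one grayscale channel, uses no padding, and the image and filter values are made up. (Like most deep learning libraries, it does not flip the filter, so strictly speaking it is a cross-correlation.)

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kh, col:col + kw]
            output[row, col] = np.sum(patch * kernel)
    return output

# Made-up 4x5 grayscale image: dark on the left, bright on the right.
image = np.array([
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
], dtype=float)

# Hand-picked 3x3 filter that responds to a dark-to-bright vertical edge.
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

feature_map = convolve2d(image, kernel)
print(feature_map)
```

The output is 0 where the image is flat and large where the edge is. In a real CNN the filter values are not hand-picked; they are learned during training.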

## Let's build a CNN with Python and Keras

In [1]:
# dependencies
from __future__ import print_function

# We're importing the Keras pieces we need, such as Dense, Dropout, and MaxPooling2D.
# All things we will need to make our network
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, MaxPooling2D, Conv2D
from keras.datasets import cifar10
from keras.optimizers import rmsprop
from keras.preprocessing.image import ImageDataGenerator
import keras.utils as np_utils
import numpy as np

import os

Using TensorFlow backend.


Now that we've imported our dependencies, we can get started. We're going to define some constants that we'll use later on in our program.

In [2]:
batch_size = 32
number_of_classes = 10
epochs = 200
data_augmentation = True
number_of_predictions = 20
save_directory = os.path.join(os.getcwd(), 'saved_models')
model_name = 'keras_first_cnn_test.h5'

The number_of_classes is the number of classes we want to sort images into, i.e. 0, 1, 2, 3 ... 9. Each one of those numbers is a class. We also define our save_directory and the model_name that we'll save the model under.

In [3]:
# Keras provides the CIFAR-10 data already split into training and test sets,
# so load_data() hands us both:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices.
y_train = np_utils.to_categorical(y_train, number_of_classes)
y_test = np_utils.to_categorical(y_test, number_of_classes)

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
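"Binary class matrices" just means one-hot encoding: each label becomes a row of zeros with a single 1 at the label's position. Here is a small NumPy sketch of the idea; `one_hot` is a made-up helper for illustration and not the Keras implementation of `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels into rows with a single 1 at the label's index."""
    labels = np.asarray(labels).reshape(-1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

encoded = one_hot([3, 0, 9], num_classes=10)
print(encoded)
```

Each row lines up with the 10 outputs of the network we are about to build, which makes it easy to compare predictions against the truth.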


### Model Time

We're going to create a Keras sequential model. With this we can add layers easily with the .add() method.

In [4]:
model = Sequential()
model.add(Conv2D(32, (3,3), padding='same', input_shape=x_train.shape[1:]))

Our first layer is our initial convolutional layer. We're going to be following this basic format:

![Hello](https://fr.mathworks.com/content/mathworks/fr/fr/discovery/convolutional-neural-network/jcr:content/mainParsys/image_copy.adapt.full.high.jpg/1506591540823.jpg)


The only exception is that we will be implementing dropout layers as well as some other additions.


In [5]:
# add an activation layer with ReLU
model.add(Activation('relu'))
# add another convolutional layer
model.add(Conv2D(32, (3,3)))
# add another activation layer now
model.add(Activation('relu'))
# perform max pooling
model.add(MaxPooling2D(pool_size=(2,2)))
# the dropout layer gets added a little further down

What did we just do?

Well, we added a ReLU activation layer, another convolutional layer with 32 filters and a kernel_size of (3,3), then another ReLU activation, and finally a MaxPooling layer with a pool size of (2,2). Easy... 😉
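It can help to track how the image size changes through these layers. The sketch below does the arithmetic by hand, assuming a stride of 1 for the convolutions and the Keras defaults of no padding for Conv2D (unless `padding='same'` is given) and non-overlapping windows for MaxPooling2D. The helper names are made up for illustration.

```python
def conv_output_size(size, kernel, padding):
    """Output height/width of a stride-1 convolution."""
    if padding == "same":
        return size            # zero-padding keeps the size unchanged
    return size - kernel + 1   # "valid": no padding, so the border is lost

def pool_output_size(size, pool):
    """Output height/width of non-overlapping pooling."""
    return size // pool

size = 32                                  # CIFAR-10 images are 32x32
size = conv_output_size(size, 3, "same")   # first Conv2D:  32 -> 32
size = conv_output_size(size, 3, "valid")  # second Conv2D: 32 -> 30
size = pool_output_size(size, 2)           # MaxPooling2D:  30 -> 15
print(size)  # 15
```

So after this first block, each image has become a 15x15 grid with 32 values (one per filter) at every position.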

## Activation Functions

It sounds like something you could say to your teacher to make them think you're some kind of genius, but it's actually quite simple. An activation function loosely mimics the human brain. Inside the neuron, a chemical reaction takes place to determine if the signal should pass to the next neuron. That, in basic terms, is what an activation function does: it decides how much of a neuron's signal gets passed on.

We're using a ReLU (Rectified Linear Unit) activation function. It outputs 0 for negative inputs and passes positive inputs through unchanged, a bit like a neuron that either stays quiet or fires.

Some other activation functions include:

* Sigmoid: https://en.wikipedia.org/wiki/Sigmoid_function

* Softplus

* Gaussian

* Softmax

Some more info is listed here: https://en.wikipedia.org/wiki/Rectifier_(neural_networks)
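For a feel of what these functions do to numbers, here is a short NumPy sketch of three of them, written from their standard definitions (Gaussian and softmax are left out to keep it short). The input values are arbitrary.

```python
import numpy as np

def relu(x):
    """0 for negative inputs, unchanged for positive inputs."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """Squashes any input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    """A smooth version of ReLU: log(1 + e^x)."""
    return np.log1p(np.exp(x))

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))
print(sigmoid(x))
print(softplus(x))
```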


## Pooling

So what's pooling?

Pooling is a way to shrink an image (or feature map) while keeping its strongest signals, which cuts down on noise and makes analysis easier for the next layer.

Our pooling layer, MaxPooling2D, uses the max pooling algorithm to throw away extra information.
![Max Pooling](https://upload.wikimedia.org/wikipedia/commons/e/e9/Max_pooling.png)


It takes the largest value in each small window of the matrix (2x2 in our case), and that becomes the new pixel value in the output matrix.
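Here is a minimal NumPy sketch of 2x2 max pooling on a single channel. The helper `max_pool_2x2` and the input values are made up for illustration; Keras handles batches and channels for us.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the largest value from each non-overlapping 2x2 window."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2  # drop a trailing row/column if the size is odd
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
])

pooled = max_pool_2x2(feature_map)
print(pooled)
# [[6 8]
#  [3 4]]
```

The 4x4 input shrinks to 2x2, and only the strongest value in each window survives.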



In [6]:
# add the dropout layer now, and drop out 25%
model.add(Dropout(0.25))

## Dropout Function

The dropout function is interesting because it prevents "overfitting" by doing something that may seem counterintuitive. During training, dropout "destroys" a given percentage of the network by randomly setting that fraction of a layer's outputs to 0 on each pass. It's basically like smashing this delicately constructed piece of art with a hammer. The reasoning for this is that it keeps the model from leaning too heavily on any one set of neurons, or on memorizing the training images. By implementing dropout, the model can better classify images of dogs, let's say, that it hasn't seen before. Overfitting can reduce the number of correct classifications on data the network hasn't seen yet. Another analogy is taking the blinders off of a racehorse: now it can see so many more places to run, besides just straight.
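A rough NumPy sketch of the idea, using the common "inverted dropout" formulation: zero out a random fraction of the values and scale up the survivors so the overall signal stays about the same size. The helper name and the values are made up, and this only mimics what happens during training; at prediction time dropout is switched off.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero a random `rate` fraction of values and rescale the survivors."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True where we keep a value
    return activations * mask / keep_prob

rng = np.random.default_rng(seed=0)
activations = np.ones(8)
dropped = dropout(activations, rate=0.25, rng=rng)
print(dropped)
```

Every entry of the result is either 0 (dropped) or 1/0.75 (kept and rescaled), and a different random set is dropped on every call.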


In [7]:
# same as before, just double the number of filters from 32 to 64
model.add(Conv2D(64, (3,3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))

In [8]:
# our final bunch of layers
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(number_of_classes))
model.add(Activation('softmax'))


## Flattening a Layer

Here, we flatten the output of our last layer, turning its multi-dimensional feature maps into one long one-dimensional vector for use in the next layer.
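In NumPy terms, flattening is just a reshape. The sketch below uses a stand-in array of zeros with shape (6, 6, 64), which is the per-image shape we would expect after the second pooling layer if you follow the size arithmetic through both blocks (32 -> 32 -> 30 -> 15 -> 15 -> 13 -> 6).

```python
import numpy as np

# Stand-in for one image's output after the second pooling layer:
# a 6x6 grid with 64 values (one per filter) at every position.
feature_maps = np.zeros((6, 6, 64))

flat = feature_maps.reshape(-1)  # line every value up in a single row
print(flat.shape)  # (2304,)
```

No values are changed or lost; they are only rearranged so a Dense layer can take them as input.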

## Dense Layers

A dense layer is just another name for a fully connected layer: every input node is connected to every node in the layer. In our Dense layer, we specify our units, which set the size of the output, i.e. (*, 512).

The Keras documentation shows:

```python
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# now the model will take as input arrays of shape (*, 16)
# and output arrays of shape (*, 32)
```
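Under the hood, a dense layer boils down to a matrix multiplication plus a bias. Here is a NumPy sketch with the same (*, 16) to (*, 32) shapes; the random weights are placeholders for values a real layer would learn, and the activation function is left out.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
batch = rng.normal(size=(4, 16))     # 4 samples with 16 features each
weights = rng.normal(size=(16, 32))  # one weight per (input, unit) pair
bias = np.zeros(32)                  # one bias per unit

outputs = batch @ weights + bias
print(outputs.shape)  # (4, 32)
```

"Fully connected" shows up in the shape of `weights`: all 16 inputs feed into all 32 units.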

## Softmax classification

Our final activation is softmax, the function used in multinomial logistic regression. Basically, it turns the network's raw scores into a probability for each class, and those probabilities add up to 1. So for example it might output:

```
0: 0.98
8: 0.01
9: 0.005
...
```

Meaning that the network is 98% confident that the input image belongs to class 0 (which is "airplane" in CIFAR-10).
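Softmax itself is only a couple of lines. This NumPy sketch follows the standard definition (exponentiate each score, then divide by the total); the three raw scores are made up.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that add up to 1."""
    shifted = scores - np.max(scores)  # subtracting the max avoids overflow
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])  # made-up raw outputs for three classes
probs = softmax(scores)
print(probs)
print(probs.sum())
```

The largest score gets the largest probability, and the predicted class is simply the position of that largest value.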

In [9]:
# initiate RMSProp optimizer
opt = rmsprop(lr=0.0001, decay=1e-6)

## RMS Optimization

According to cs.toronto.edu:

rmsprop: Divide the learning rate for a weight by a running average of the magnitudes of recent gradients for that weight.

We want to minimize the output of our loss function, so we're using RMSprop optimization, which is a variant of gradient descent.
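The quoted rule can be sketched as a single update step. This is a simplified illustration, not the Keras implementation: `rho` (how quickly the running average forgets old gradients) and `eps` (a tiny number that prevents division by zero) are assumed values, the learning-rate decay is ignored, and the toy loss is f(w) = w², whose gradient is 2w.

```python
import numpy as np

def rmsprop_step(weight, grad, avg_sq, lr=0.0001, rho=0.9, eps=1e-7):
    """One RMSprop update for a single weight."""
    # Running average of the squared gradient.
    avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
    # Divide the step by the root of that average.
    weight = weight - lr * grad / (np.sqrt(avg_sq) + eps)
    return weight, avg_sq

weight, avg_sq = 1.0, 0.0
for _ in range(3):
    grad = 2.0 * weight  # gradient of the toy loss w^2
    weight, avg_sq = rmsprop_step(weight, grad, avg_sq, lr=0.01)
print(weight)
```

The weight creeps toward 0, where the toy loss is smallest. Dividing by the running average keeps the step size reasonable whether the gradients are huge or tiny.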

In [10]:
# compile the model with our loss function and the RMSprop optimizer (training comes later)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

## Cross Entropy

Here, we're using a loss function called categorical cross entropy, which measures how far the predicted class probabilities are from the true labels. We pass in `opt`, the optimizer we created previously, and ask Keras to report accuracy as our metric.

More on cross entropy here: https://en.wikipedia.org/wiki/Cross_entropy
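For a single image, categorical cross entropy is the negative log of the probability the network gave to the correct class. Here is a NumPy sketch from that definition; the helper name mirrors the Keras loss name, but this is not the Keras code, and the predictions are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross entropy between a one-hot label and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid taking log(0)
    return float(-np.sum(y_true * np.log(y_pred)))

y_true = np.array([0.0, 0.0, 1.0])        # the correct class is the third one
confident = np.array([0.05, 0.05, 0.90])  # a good prediction
unsure = np.array([0.40, 0.30, 0.30])     # a poor prediction

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The confident, correct prediction gets a small loss and the unsure one gets a larger loss, which is exactly the signal the optimizer needs.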

In [11]:
# Finally, convert the image data to float32 and scale the pixel values from 0-255 down to 0-1
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
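The same scaling on a tiny made-up array shows what is going on. The conversion to float32 has to come first, because whole-number pixel values cannot hold fractions like 0.5.

```python
import numpy as np

pixels = np.array([0, 128, 255], dtype=np.uint8)  # raw pixel values
scaled = pixels.astype("float32") / 255           # now between 0.0 and 1.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

Small, similarly sized inputs like these tend to make training behave better than raw 0-255 values.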


In [12]:
# If we're not using data augmentation, we fit the model directly on the training data,
# with our batch size and number of epochs. Otherwise, we fit it on augmented batches.
# Either way, the test set is passed as validation_data so we can watch how the model does on unseen images

if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True)
else:
    print('Using real-time data augmentation.')
    datagen = ImageDataGenerator(
        # leave the pixel statistics alone
        featurewise_center=False,
        samplewise_center=False,
        featurewise_std_normalization=False,
        samplewise_std_normalization=False,
        zca_whitening=False,
        rotation_range=0,        # no random rotation
        width_shift_range=0.1,   # shift left/right by up to 10% of the width
        height_shift_range=0.1,  # shift up/down by up to 10% of the height
        horizontal_flip=True,    # randomly mirror images left to right
        vertical_flip=False
    )
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(x_train, y_train,
                                     batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0] // batch_size,
                        epochs=epochs,
                        validation_data=(x_test, y_test),
                        workers=4)

Using real-time data augmentation.
Epoch 1/200
Epoch 2/200
Epoch 3/200
...
Epoch 198/200
...

NOTE: Executing the above will train the model and will take some time: a few hours on a decent laptop via Jupyter.

In [15]:
# Save model and weights
if not os.path.isdir(save_directory):
    os.makedirs(save_directory)
model_path = os.path.join(save_directory, model_name)
model.save(model_path)
print('Saved trained model at %s ' % model_path)

# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

Saved trained model at /Users/ryan/GitHub/AITesting/Misc./saved_models/keras_first_cnn_test.h5 
Test accuracy: 0.6293


## Accuracy

Our output accuracy is about 0.63, which is terrible, but with better models we can improve that accuracy.
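Accuracy here is simply the fraction of test images whose most probable predicted class matches the true class. A small NumPy sketch with made-up labels and predictions (the helper name `accuracy` is ours, not a Keras function):

```python
import numpy as np

def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of samples whose most probable class matches the true class."""
    true_classes = np.argmax(y_true_one_hot, axis=1)
    predicted_classes = np.argmax(y_pred_probs, axis=1)
    return float(np.mean(true_classes == predicted_classes))

y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # true classes: 0, 1, 2
y_pred = np.array([
    [0.7, 0.2, 0.1],  # predicts class 0: correct
    [0.1, 0.8, 0.1],  # predicts class 1: correct
    [0.5, 0.3, 0.2],  # predicts class 0: wrong
])

score = accuracy(y_true, y_pred)
print(score)
```

For comparison, guessing at random among 10 classes would be right about 10% of the time.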

Original Source code located:
https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py