# Convolutional Neural Network

Make sure you have access to the keras and tensorflow packages:
```pip install keras``` and ```pip install tensorflow```

A convolutional neural network (CNN) is mainly used to classify images. Loosely inspired by how the brain processes visual information, its layers are capable of extracting features from images. CNNs have two components:
1. The hidden layers / feature extraction
2. The Classification layers

<img src="./images/architecture.png">

### Sources
[https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050](https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050)

[https://keras.io/layers/convolutional/#conv2d](https://keras.io/layers/convolutional/#conv2d)


In [4]:
import keras
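from keras.models import Sequential, Model
from keras.layers import Input, Dense, Flatten, Activation
from keras.layers import Conv2D, MaxPooling2D
# BatchNormalization and LeakyReLU are importable from keras.layers directly;
# the older keras.layers.normalization / advanced_activations paths no longer exist in recent Keras releases
from keras.layers import BatchNormalization, LeakyReLU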

batch_size = 64
epochs = 20
num_classes = 10

# Setup the model
model = Sequential()

# Hidden layers / Feature extraction
## Conv2D Layer
This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If ```use_bias``` is True, a bias vector is created and added to the outputs. Finally, if ```activation``` is not ```None```, it is applied to the outputs as well.

When using this layer as the first layer in a model, provide the keyword argument ```input_shape``` (tuple of integers, does not include the sample axis), e.g. ```input_shape=(128, 128, 3)``` for 128x128 RGB pictures in ```data_format="channels_last"```.

### Arguments
* __filters__: Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
* __kernel_size__: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
* __strides__: An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any ```dilation_rate``` value != 1.
* __padding__: one of ```"valid"``` or ```"same"``` (case-insensitive). Note that ```"same"``` is slightly inconsistent across backends with strides != 1.
* __data_format__: A string, one of ```"channels_last"``` or ```"channels_first"```. The ordering of the dimensions in the inputs. ```"channels_last"``` corresponds to inputs with shape ```(batch, height, width, channels)``` while ```"channels_first"``` corresponds to inputs with shape ```(batch, channels, height, width)```. It defaults to the ```image_data_format``` value found in your Keras config file at ```~/.keras/keras.json```. If you never set it, then it will be "channels_last".
* __dilation_rate__: an integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any ```dilation_rate``` value != 1 is incompatible with specifying any stride value != 1.
* __activation__: Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: ```a(x) = x```).
* __use_bias__: Boolean, whether the layer uses a bias vector.
* __kernel_initializer__: Initializer for the ```kernel``` weights matrix.
* __bias_initializer__: Initializer for the bias vector.
* __kernel_regularizer__: Regularizer function applied to the ```kernel``` weights matrix.
* __bias_regularizer__: Regularizer function applied to the bias vector.
* __activity_regularizer__: Regularizer function applied to the output of the layer (its "activation").
* __kernel_constraint__: Constraint function applied to the kernel matrix.
* __bias_constraint__: Constraint function applied to the bias vector.

In [5]:
# Images fed into this model are 512x512 pixels with 3 channels (RGB)
image_input_shape = (512, 512, 3)

# Add a convolutional layer with 32 filters of size 3x3 (each filter spans all 3 input channels).
# The stride defaults to 1, and padding='same' keeps the 512x512 spatial size.
model.add(Conv2D(32, kernel_size=(3, 3), activation='linear',
                 input_shape=image_input_shape, padding='same'))
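
To make the effect of ```padding``` and ```strides``` on the output size concrete, here is a minimal plain-Python sketch. The helper name ```conv_output_size``` is hypothetical (it is not part of Keras), and the formulas assume the TensorFlow convention with ```dilation_rate = 1```.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Spatial output length of a convolution along one dimension.

    Hypothetical helper for illustration; assumes dilation_rate == 1.
    """
    padding = padding.lower()
    if padding == "same":
        # "same" pads the input so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # "valid" uses no padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")


# The layer above: 512x512 input, 3x3 kernel, stride 1.
same_size = conv_output_size(512, 3, stride=1, padding="same")
valid_size = conv_output_size(512, 3, stride=1, padding="valid")
print(same_size, valid_size)  # 512 510
```

With ```padding='same'``` the feature maps stay 512x512, which is why the first row of the model summary reports an output shape of ```(None, 512, 512, 32)```.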

## Activation Layer _(ReLU Layer)_

Rectified Linear Unit.

With default values, it returns element-wise ```max(x, 0)```.

Otherwise, it follows: ```f(x) = max_value``` for ```x >= max_value```, ```f(x) = x``` for ```threshold <= x < max_value```, and ```f(x) = alpha * (x - threshold)``` otherwise.

### Arguments

* __activation__: name of activation function to use, or alternatively, a Theano or TensorFlow operation.

In [6]:
# Add relu activation to the layer 
model.add(Activation('relu'))
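
The piecewise definition of ReLU given above can be written out for a single number. This is only a sketch for illustration: the function ```relu``` below is a hypothetical scalar helper, not the Keras implementation, which operates element-wise on tensors.

```python
def relu(x, alpha=0.0, max_value=None, threshold=0.0):
    """Scalar version of the piecewise ReLU definition (illustrative helper)."""
    if max_value is not None and x >= max_value:
        # Saturate at max_value.
        return max_value
    if x >= threshold:
        # Pass the input through unchanged.
        return x
    # Below the threshold: scaled by alpha (0 with the default values).
    return alpha * (x - threshold)


print(relu(3.0))                 # 3.0
print(relu(8.0, max_value=6.0))  # 6.0
print(relu(-2.0, alpha=0.1))     # -0.2
```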

## Pooling Layer

Max pooling operation for spatial data.

### Arguments

* __pool_size__: integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions.
* __strides__: Integer, tuple of 2 integers, or None. Strides values. If None, it will default to ```pool_size```.
* __padding__: One of ```"valid"``` or ```"same"``` (case-insensitive).
* __data_format__: A string, one of ```channels_last``` (default) or ```channels_first```. The ordering of the dimensions in the inputs.  ```channels_last``` corresponds to inputs with shape  ```(batch, height, width, channels)``` while ```channels_first``` corresponds to inputs with shape  ```(batch, channels, height, width)```. It defaults to the  ```image_data_format``` value found in your Keras config file at ```~/.keras/keras.json```. If you never set it, then it will be "channels_last".

In [8]:
# Max pooling with a 2x2 window (halves the height and width)
model.add(MaxPooling2D(2))
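
As a rough illustration of what max pooling does to a single feature map, the sketch below slides a window over a small 2D list and keeps the largest value in each window. The function ```max_pool_2d``` is a hypothetical helper that assumes "valid" padding and one channel; Keras applies the same idea to every channel of a batched tensor.

```python
def max_pool_2d(image, pool_size=2, stride=None):
    """Max pooling over a 2D list of numbers with "valid" padding (illustrative)."""
    # As in Keras, the stride defaults to the pool size.
    stride = pool_size if stride is None else stride
    height, width = len(image), len(image[0])
    pooled = []
    for top in range(0, height - pool_size + 1, stride):
        row = []
        for left in range(0, width - pool_size + 1, stride):
            window = [image[top + i][left + j]
                      for i in range(pool_size)
                      for j in range(pool_size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 1, 3, 5],
]
print(max_pool_2d(feature_map))  # [[6, 2], [7, 9]]
```

The 4x4 input becomes 2x2, the same halving that turns the 512x512 feature maps of this model into 256x256.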

# Classification Layers
## Flatten Layer

Flattens the input. Does not affect the batch size.

### Arguments

* __data_format__: A string, one of ```channels_last``` (default) or ```channels_first```. The ordering of the dimensions in the inputs. The purpose of this argument is to preserve weight ordering when switching a model from one data format to another.  ```channels_last``` corresponds to inputs with shape  ```(batch, ..., channels)``` while  ```channels_first``` corresponds to inputs with shape ```(batch, channels, ...)```. It defaults to the ```image_data_format``` value found in your Keras config file at  ```~/.keras/keras.json```. If you never set it, then it will be "channels_last".

### Example

```Python
# shapes below assume the default data_format="channels_last"
model = Sequential()
model.add(Conv2D(64, (3, 3),
                 input_shape=(32, 32, 3), padding='same'))
# now: model.output_shape == (None, 32, 32, 64)

model.add(Flatten())
# now: model.output_shape == (None, 65536)
```

In [9]:
# Use Flatten to convert 3D data to 1D
model.add(Flatten())
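
To show what "3D to 1D" means for a single sample, here is a small sketch that flattens a nested ```(height, width, channels)``` list in row-major order. The helper ```flatten``` is hypothetical and assumes ```channels_last``` data; the batch axis is left out because flattening does not touch it.

```python
def flatten(feature_maps):
    """Flatten a (height, width, channels) nested list in row-major order."""
    return [value
            for row in feature_maps
            for pixel in row
            for value in pixel]


# A tiny 2x2 feature map with 3 channels per position.
tiny = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
print(flatten(tiny))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```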

## Dense Layers

Just your regular densely-connected NN layer.

```Dense``` implements the operation: ```output = activation(dot(input, kernel) + bias)``` where ```activation``` is the element-wise activation function passed as the ```activation``` argument, ```kernel``` is a weights matrix created by the layer, and ```bias``` is a bias vector created by the layer (only applicable if ```use_bias``` is ```True```).

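Note: if the input to the layer has a rank greater than 2, ```Dense``` computes the dot product with ```kernel``` along the last axis of the input, so it is applied independently at every position of the other axes. The input is not flattened automatically, which is why a ```Flatten``` layer is placed before the dense layer in this model.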

### Arguments

* __units__: Positive integer, dimensionality of the output space.
* __activation__: Activation function to use (see activations). If you don't specify anything, no activation is applied (i.e. "linear" activation: ```a(x) = x```).
* __use_bias__: Boolean, whether the layer uses a bias vector.
* __kernel_initializer__: Initializer for the ```kernel``` weights matrix.
* __bias_initializer__: Initializer for the bias vector.
* __kernel_regularizer__: Regularizer function applied to the ```kernel``` weights matrix.
* __bias_regularizer__: Regularizer function applied to the bias vector.
* __activity_regularizer__: Regularizer function applied to the output of the layer (its "activation").
* __kernel_constraint__: Constraint function applied to the ```kernel``` weights matrix.
* __bias_constraint__: Constraint function applied to the bias vector.

### Example
```Python
# as first layer in a sequential model:
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# now the model will take as input arrays of shape (*, 16)
# and output arrays of shape (*, 32)

# after the first layer, you don't need to specify
# the size of the input anymore:
model.add(Dense(32))
```

In [10]:
# Add a dense layer with 10 neurons, one per class (num_classes = 10)
model.add(Dense(num_classes))
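
The operation ```output = activation(dot(input, kernel) + bias)``` can be spelled out for one sample with plain lists. This is a sketch under simplifying assumptions: the function ```dense``` and the example weights are made up for illustration, whereas Keras initializes and learns ```kernel``` and ```bias``` itself.

```python
def dense(inputs, kernel, bias, activation=None):
    """Compute activation(dot(inputs, kernel) + bias) for a single sample.

    inputs: list of length n; kernel: n x units nested list; bias: list of length units.
    """
    units = len(bias)
    outputs = []
    for j in range(units):
        # Weighted sum of all inputs for output unit j, plus that unit's bias.
        total = sum(x * kernel[i][j] for i, x in enumerate(inputs)) + bias[j]
        outputs.append(total)
    return activation(outputs) if activation is not None else outputs


inputs = [1.0, 2.0, 3.0]
kernel = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # shape (3, 2): 3 inputs, 2 units
bias = [0.5, -0.5]
print(dense(inputs, kernel, bias))  # [4.5, 4.5]
```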

## Activation Layer (Softmax Layer)

Softmax activation function.

Applies the softmax function to an input tensor. Softmax "squishes" the inputs so that every output lies between 0 and 1 and the outputs sum to 1; it's a way of normalizing. The output of a softmax has the same shape as its input, only the values change. The outputs can be interpreted as probabilities.

### Arguments

* __activation__: name of activation function to use, or alternatively, a Theano or TensorFlow operation.

In [11]:
# we use the softmax activation function for our last layer
model.add(Activation('softmax'))
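
A minimal plain-Python sketch of softmax for one vector of scores is shown below. The function name ```softmax``` and the example scores are illustrative; subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Normalize a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # [0.659, 0.242, 0.099]
```

The largest score gets the largest probability, and the order of the inputs is preserved.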

In [12]:
# give an overview of our model
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 512, 512, 32)      896       
_________________________________________________________________
activation_1 (Activation)    (None, 512, 512, 32)      0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 256, 256, 32)      0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 2097152)           0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                20971530  
_________________________________________________________________
activation_2 (Activation)    (None, 10)                0         
Total params: 20,972,426
Trainable params: 20,972,426
Non-trainable params: 0
________________________________________________________________
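
The parameter counts in the summary above can be reproduced by hand. The helper names ```conv2d_params``` and ```dense_params``` are hypothetical; the sketch assumes every layer uses a bias, which is the Keras default.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Weights per filter (plus one bias each) times the number of filters."""
    return (kernel_h * kernel_w * in_channels + (1 if use_bias else 0)) * filters


def dense_params(input_dim, units, use_bias=True):
    """One weight per input (plus one bias) for every output unit."""
    return (input_dim + (1 if use_bias else 0)) * units


conv_count = conv2d_params(3, 3, 3, 32)       # 3x3 kernel, 3 input channels, 32 filters
flat_size = 256 * 256 * 32                    # output of Flatten after 2x2 pooling
dense_count = dense_params(flat_size, 10)     # 10 output classes
print(conv_count, dense_count, conv_count + dense_count)  # 896 20971530 20972426
```

Almost all parameters sit in the dense layer, because it connects each of the 2,097,152 flattened values to all 10 outputs.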

# Training

Training a CNN works in the same way as training a regular neural network: backpropagation computes the gradient of the loss with respect to the weights, and gradient descent uses that gradient to update them. The math is somewhat more involved, however, because of the convolution operations.
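
As a toy illustration of the gradient descent update, the sketch below minimizes a one-parameter loss whose gradient is known in closed form. The function ```gradient_descent``` and the quadratic loss are assumptions chosen for illustration; in a real CNN the gradient for every weight comes from backpropagation.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Minimize loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0), start=0.0)
print(round(w_final, 4))  # 3.0
```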

In [22]:
# Before training, we have to configure the learning process.
# It consists of 3 elements: an optimizer, a loss function and a metric.

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
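
Categorical cross-entropy compares the softmax output with a one-hot encoded label. The sketch below computes it for a single sample in plain Python. The helpers ```one_hot``` and ```categorical_crossentropy``` and the clipping value ```eps``` are illustrative assumptions, not the Keras implementation.

```python
import math


def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample: -sum(true * log(predicted)); eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = one_hot(2, num_classes=3)
loss = categorical_crossentropy(y_true, [0.1, 0.2, 0.7])
print(round(loss, 4))  # 0.3567
```

The loss only depends on the probability assigned to the correct class: the closer it is to 1, the closer the loss is to 0.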