# Digit recognition Notebook
This notebook outlines how the digit recognition script included in this project works. It explains how a Convolutional Neural Network (CNN) works, and covers how to run the script and the different functions that it performs.

The model used by the script was trained on the MNIST dataset, which was covered in a previous notebook. MNIST consists of 60,000 training images (plus 10,000 test images) of handwritten digits in the range 0-9, each with a label, and is derived from the larger NIST dataset.

## Running the script
To run the script you will need to clone the git repository from [here](https://github.com/lanodburke/Emerging-Technologies-Project.git).

The script takes the following command line arguments:
- `-h` (help): prints instructions describing the different parameters
- `-b` (build): builds the Keras model that is used to run the digit recognition
- `-t` (test): takes an input image file as an argument, such as a PNG, JPG or another image format
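
As a rough illustration of how flags like these can be wired up, here is a minimal sketch using Python's standard `argparse` module. It is not taken from `digitrec.py`: the function name `build_parser` and the long option names `--build` and `--test` are assumptions made for the example. `argparse` adds `-h`/`--help` automatically.

```python
import argparse


def build_parser():
    """Create a parser with the -b and -t flags described above (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Handwritten digit recognition (illustrative sketch)"
    )
    # -b takes no value: it only switches the build step on
    parser.add_argument("-b", "--build", action="store_true",
                        help="build and save the Keras model")
    # -t expects the path of the image to classify
    parser.add_argument("-t", "--test", metavar="IMAGE",
                        help="predict the digit shown in IMAGE")
    return parser


# Parse an explicit argument list instead of the real command line
args = build_parser().parse_args(["-t", "image.png"])
print(args.build, args.test)  # False image.png
```

Passing a list to `parse_args` keeps the example independent of how the notebook or script was launched.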

### Build the model 
Builds the model and saves it in the `.h5` (HDF5) file format.
```zsh
python digitrec.py -b
```

### Test the model
Runs the saved model and makes a prediction on the image passed in as an argument.
```zsh
python digitrec.py -t image.png
```

## Convolutional Neural Network
> Convolutional Neural Networks have a different architecture than regular Neural Networks. Regular Neural Networks transform an input by putting it through a series of hidden layers. Every layer is made up of a set of neurons, where each layer is fully connected to all neurons in the layer before. Finally, there is a last fully-connected layer, the output layer, that represents the predictions.

> Convolutional Neural Networks are a bit different. First of all, the layers are organised in 3 dimensions: width, height and depth. Further, the neurons in one layer do not connect to all the neurons in the next layer but only to a small region of it. Lastly, the final output will be reduced to a single vector of probability scores, organized along the depth dimension.

### Neural Network Architecture 
![alt text](http://cs231n.github.io/assets/nn1/neural_net2.jpeg "Neural Network")

### Convolutional Neural Network Architecture
![alt text](http://cs231n.github.io/assets/cnn/cnn.jpeg "Convolutional Neural Network")


Convolutional Neural Networks have two parts:
- Hidden layers/Feature extraction (Common in most image classifiers)
- Classification

### Feature Extraction
To understand feature extraction, we first have to look at what an image feature is.

An image feature can be described as a section of an image that appears consistently across images of the same object. For example, in an image of a cat, the cat's eye can be seen as a visual feature of a cat. This feature is generally consistent across most cats.

Feature extraction finds features in an image and looks for patterns between them. If these features are consistent across all images of an object, the network learns that they can be used to identify that object.

The main purpose of feature extraction in image classification is to transform a visual feature into a mathematical vector that can be used in computation.

We do this so computers will be able to compare the similarity of features easily.
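
To make the idea of comparing feature vectors concrete, the sketch below uses cosine similarity, which is one common way to measure how alike two vectors are. The choice of cosine similarity and the three feature vectors are assumptions made up for illustration; they do not come from the project's model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up feature vectors, for illustration only
cat_a = [0.9, 0.1, 0.8]
cat_b = [0.8, 0.2, 0.9]
dog = [0.1, 0.9, 0.2]

# The two "cat" vectors point in nearly the same direction, the "dog" vector does not
print(cosine_similarity(cat_a, cat_b) > cosine_similarity(cat_a, dog))  # True
```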

### Convolution
> Convolution is the first layer to extract features from an input image. Convolution preserves the relationship between pixels by learning image features using small squares of input data. It is a mathematical operation that takes two inputs such as an image matrix and a filter or kernel.

#### Example
Consider a 5 x 5 image matrix whose pixel values are 0 or 1, and a 3 x 3 filter matrix, as shown below.

![example](https://cdn-images-1.medium.com/max/1600/1*4yv0yIH0nVhSOv3AkLUIiw.png)

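The filter slides over the 5 x 5 image matrix. At each position, the overlapping values of the image and the 3 x 3 filter are multiplied element by element and summed. The resulting output matrix is called the "Feature Map", as shown below.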

![example](https://cdn-images-1.medium.com/max/1600/1*MrGSULUtkXc0Ou07QouV8A.gif)
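
The sliding-window operation described above can be sketched in plain Python. The 5 x 5 image and 3 x 3 filter values below are illustrative and are assumed, not guaranteed, to match the figures. The function name `convolve2d` is also just chosen for this example. Like most CNN libraries, the sketch does not flip the filter, so strictly speaking it computes a cross-correlation; for this symmetric filter the result is the same.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (stride 1, no padding) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the overlapping values element by element and sum them
            total = sum(image[i + di][j + dj] * kernel[di][dj]
                        for di in range(k_h) for dj in range(k_w))
            row.append(total)
        feature_map.append(row)
    return feature_map


image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

result = convolve2d(image, kernel)
print(result)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

A 5 x 5 input with a 3 x 3 filter, stride 1 and no padding gives a 3 x 3 feature map, because the filter fits in 5 - 3 + 1 = 3 positions along each axis.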

Convolving an image with different filters can perform operations such as edge detection, blurring and sharpening. The example below shows the images that result from applying different types of filters (kernels).

![example](https://cdn-images-1.medium.com/max/1600/1*uJpkfkm2Lr72mJtRaqoKZg.png)
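
As a small sketch of how the choice of filter changes the result, the code below applies a commonly used edge-detection kernel to two tiny made-up images. The helper name `apply_filter` and the image values are assumptions for illustration. Because the kernel's values sum to zero, flat regions produce 0 and only the area around an edge produces a non-zero response.

```python
def apply_filter(image, kernel):
    """Apply a square kernel to a 2D image with stride 1 and no padding."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_w)]
            for i in range(out_h)]


# Edge-detection kernel: centre weight 8, neighbours -1 (values sum to 0)
edge_kernel = [
    [-1, -1, -1],
    [-1, 8, -1],
    [-1, -1, -1],
]

# Dark on the left, bright on the right: a vertical edge
step_image = [[0, 0, 0, 1, 1] for _ in range(5)]
# Uniform brightness: no edges at all
flat_image = [[5] * 5 for _ in range(5)]

print(apply_filter(step_image, edge_kernel))  # [[0, -3, 3], [0, -3, 3], [0, -3, 3]]
print(apply_filter(flat_image, edge_kernel))  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```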

The examples above describe convolution in 2D, but in practice these convolutions are performed in 3D, because each image is represented by its height, width and depth. The height and width describe the position of the pixels in the image, and the depth describes the color channels, i.e. the RGB values (a grayscale image such as an MNIST digit has a depth of 1). An example of a convolution operation on such a volume can be seen below.

![example](https://cdn-images-1.medium.com/max/1600/1*EuSjHyyDRPAQUdKCKLTgIQ.png)
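
Extending the 2D case to a volume mainly means that the filter gets the same depth as the image and the sum also runs over the channels. The sketch below assumes a height x width x channels layout stored as nested lists. The name `convolve3d` and all values are made up for illustration.

```python
def convolve3d(image, kernel):
    """image: H x W x C nested lists, kernel: kH x kW x C. Returns a 2D feature map."""
    k_h, k_w, channels = len(kernel), len(kernel[0]), len(kernel[0][0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    # Each output value sums over the window's height, width AND depth
    return [[sum(image[i + a][j + b][c] * kernel[a][b][c]
                 for a in range(k_h) for b in range(k_w) for c in range(channels))
             for j in range(out_w)]
            for i in range(out_h)]


# 3 x 3 RGB image in which every pixel is (R, G, B) = (1, 2, 3)
rgb_image = [[[1, 2, 3] for _ in range(3)] for _ in range(3)]
# 2 x 2 x 3 filter of ones
ones_kernel = [[[1, 1, 1] for _ in range(2)] for _ in range(2)]

print(convolve3d(rgb_image, ones_kernel))  # [[24, 24], [24, 24]]
```

Each window covers 4 pixels and each pixel contributes 1 + 2 + 3 = 6, so every output value is 24. One filter produces one 2D feature map, however many channels the input has.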

### Classification 
>After the convolution layer, our classification part consists of a few fully connected layers. However, these fully connected layers can only accept 1 Dimensional data. To convert our 3D data to 1D, we use the function flatten in Python. This essentially arranges our 3D volume into a 1D vector.

>The last layers of a Convolutional NN are fully connected layers. Neurons in a fully connected layer have full connections to all the activations in the previous layer. This part is in principle the same as a regular Neural Network.
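
The classification steps described above (flatten, then a fully connected layer, then probability scores) can be sketched in plain Python. This is a simplified illustration, not the Keras code used by the script: the weights are hypothetical, there are only 2 output classes instead of 10 digits, and using softmax to produce the probabilities is an assumption.

```python
import math


def flatten(volume):
    """Unroll an H x W x C nested list into a flat 1D list."""
    return [value for row in volume for pixel in row for value in pixel]


def dense(inputs, weights, biases):
    """Fully connected layer: every output is connected to every input."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # a 2 x 2 x 2 volume
vector = flatten(volume)
print(vector)  # [1, 2, 3, 4, 5, 6, 7, 8]

# Hypothetical weights for a 2-class output layer (one row per class)
weights = [[0.1] * 8, [-0.1] * 8]
biases = [0.0, 0.0]
probabilities = softmax(dense(vector, weights, biases))
predicted_class = probabilities.index(max(probabilities))
```

The predicted class is simply the index of the largest probability. For the digit recogniser, the output layer would have 10 entries, one per digit.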


## References
- [High level outline of CNNs](https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050)
- [Section on Convolution in Neural Networks](https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148)