### How can Deep Learning be used to identify a single object in an image?

We want to design an image classifier that takes a hand-written digit as input and predicts which digit it is. Ideally, the predicted class matches the digit shown in the image.  

To build this we will use the MNIST database:  
A collection of 70,000 grayscale images of hand-written digits.  
Each image depicts one of the digits 0 through 9.  
All the images are 28x28 pixels in size.  
This is perhaps one of the most famous databases in the fields of ML and DL.

Using Deep Learning, we take a data-driven approach: the algorithm learns patterns from the images that help it distinguish one digit from another.  

Check out the most popular datasets of all time at https://www.kaggle.com/benhamner/popular-datasets-over-time

#### The MNIST database

![MNIST](img/mnist.png)

### Normalizing Image Inputs

Data normalization is an important pre-processing step. It ensures that each input (each pixel value, in this case) comes from a standard distribution. That is, the range of pixel values in one input image is the same as the range in another image. This standardization helps our model train and reach a minimum error faster.  

Data normalization is typically done by subtracting the mean (the average of all pixel values) from each pixel, and then dividing the result by the standard deviation of all the pixel values. Sometimes you'll see an approximation here, where we use a mean and standard deviation of 0.5 to center the pixel values. Read more about the Normalize transformation in PyTorch at https://pytorch.org/docs/stable/torchvision/transforms.html#transforms-on-torch-tensor  
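As a minimal sketch of the mean and standard deviation approach described above, the snippet below standardizes a single image with NumPy. The random image is a hypothetical stand-in for an MNIST digit, and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for one 28x28 grayscale image with raw values in 0..255.
rng = np.random.default_rng(seed=0)
image = rng.integers(0, 256, size=(28, 28)).astype("float64")

# Standardize: subtract the mean of all pixels, then divide by their standard deviation.
mean = image.mean()
std = image.std()
standardized = (image - mean) / std

# After standardization the pixels have (approximately) zero mean and unit variance.
print("mean after:", float(standardized.mean()))
print("std after:", float(standardized.std()))
```

In practice the mean and standard deviation are usually computed once over the whole training set rather than per image, so that every image is transformed in the same way.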

After this kind of normalization, the distribution of the data should resemble a Gaussian function (http://mathworld.wolfram.com/GaussianFunction.html) centered at zero. For image inputs, a common and simpler alternative is to keep the pixel values non-negative and scale them to the normalized range [0,1].
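The sketch below shows the [0,1] scaling on a hypothetical 8-bit image, followed by the optional centering with a mean and standard deviation of 0.5 mentioned earlier. The random image and the variable names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 28x28 image stored as 8-bit integers (values 0..255), like MNIST pixels.
rng = np.random.default_rng(seed=1)
raw = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Scale into [0, 1] by dividing by the largest possible pixel value.
scaled = raw.astype("float32") / 255.0

# Optional centering with the 0.5 / 0.5 approximation: maps [0, 1] onto [-1, 1].
centered = (scaled - 0.5) / 0.5

print("scaled range:", float(scaled.min()), float(scaled.max()))
print("centered range:", float(centered.min()), float(centered.max()))
```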

#### Input Output Layers

![io-layer](img/io-layer.png)

#### Class Score Predictions

![class-scores](img/class-scores.png)
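As a rough illustration of class scores: the final layer produces one score per digit, a softmax turns those scores into probabilities, and the digit with the highest probability is the prediction. The scores below are made up for the example, and `softmax` is a small helper written here rather than a library call.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the digits 0 through 9 (list index = digit).
scores = [0.1, 0.3, 0.2, 4.0, 0.5, 0.1, 0.2, 1.5, 0.4, 0.3]

probabilities = softmax(scores)
predicted_digit = probabilities.index(max(probabilities))

print("predicted digit:", predicted_digit)  # 3, the index of the largest score
```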

#### Layers' Architecture

![CNN architecture](img/layer-architecture.png)

## A Basic Solution for hand-written digit classification

The original code can be found at https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py

### The code:

```python
'''Trains a simple deep NN on the MNIST dataset.
Gets to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''

from __future__ import print_function

import keras
from keras.datasets import mnist     # MNIST dataset loader
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop

batch_size = 128
num_classes = 10
epochs = 20

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each 28x28 image into a vector of size 784
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
# Scale pixel values from [0, 255] to [0, 1]
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices (one-hot encoding)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))     # Hidden layer 1
model.add(Dropout(0.2))     # Dropout layer 1: each unit is dropped with probability 0.2 during training
model.add(Dense(512, activation='relu'))     # Hidden layer 2
model.add(Dropout(0.2))     # Dropout layer 2: these layers help avoid overfitting
model.add(Dense(num_classes, activation='softmax'))     # Output layer: 10 nodes, one per digit class

model.summary()

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
```

```text
Using TensorFlow backend.

60000 train samples
10000 test samples
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
...
```
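The parameter counts in the model summary can be checked by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit, and dropout layers add no parameters. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer: weights plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer widths of the model above: 784 inputs -> 512 -> 512 -> 10 outputs.
layer_sizes = [784, 512, 512, 10]

# Pair each layer's input width with its output width.
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [401920, 262656, 5130]
print(sum(per_layer))  # 669706
```

These values match the `Param #` column and the `Total params` line printed by `model.summary()`.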

![do-research](img/do-research.png)

##### This is just one solution to the task of hand-written digit classification.

The next steps in a problem like this are to:  
1. Keep looking for better solutions or network structures, and  
2. When we find a model that looks more promising, try it out in code and see how well it performs.