# MNIST Digits recognition with Deep neural networks
The goal of this exercise is to get a basic understanding of Deep Neural Networks (DNN).

This example borrows bits and pieces from: https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py

## Tasks:
1. Load data
2. Define the architecture of neural network
3. Define loss function and optimizer to optimise it
4. Compile and train the network
5. Evaluate results

# Import some important libraries we need

In [29]:
# Keras is an easy to use Deep Learning library for Python
import keras

from keras import backend as K

# Import the MNIST dataset loader
from keras.datasets import mnist

# Load Sequential model architecture
from keras.models import Sequential

# Import the Dense (fully connected) and Dropout layers
from keras.layers import Dense, Dropout

# Load RMSprop optimizer to minimize cost to train the network
from keras.optimizers import RMSprop

## Define training parameters

In [36]:
batch_size = 256
epochs = 10

## 1. Load Data

In [16]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [25]:
# Get the number of classes
num_classes = 10

## 2. Preprocessing

In [18]:
### Convert 28 x 28 images to 784 x 1 vectors

In [19]:
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)

In [20]:
### Normalize images from scale [0, 255] to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

60000 train samples
10000 test samples
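The reshape and scaling steps above hard-code the sample counts 60000 and 10000. The following is a minimal sketch of the same two steps on a small stand-in array (`fake_images` is a hypothetical name holding random values, not real MNIST digits), using `-1` so that NumPy infers the number of samples.

```python
import numpy as np

# Stand-in batch with the same dtype and layout as the raw MNIST arrays
# (uint8, shape: samples x 28 x 28). The values are random, not real digits.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# -1 lets NumPy infer the number of samples instead of hard-coding it.
flat = fake_images.reshape(-1, 28 * 28)

# Convert to float before dividing: uint8 cannot hold fractional values.
scaled = flat.astype("float32") / 255

print(flat.shape, scaled.dtype)
```

The same pattern should apply unchanged to `x_train` and `x_test`, whatever their number of samples.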


### Convert class vectors to binary class matrices

In [21]:
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
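To see what a "binary class matrix" looks like, here is a minimal NumPy sketch of one-hot encoding. It is an illustration of the idea, not the Keras implementation; `one_hot` and `example_labels` are hypothetical names introduced for this example.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example_labels = [5, 0, 4]  # made-up labels for illustration
example_encoded = one_hot(example_labels, 10)
print(example_encoded)
```

Each row contains exactly one 1, in the column of the digit class, which is the target format that the softmax output layer and `categorical_crossentropy` expect.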

## 2. Define the architecture of neural network

In [37]:
# It is a good idea to clear the session (remove graphs etc from GPU/CPU) before defining a new model
K.clear_session()

model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
#model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
#model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
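The `Param #` column can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A short sketch of that arithmetic (`dense_params` is a helper name introduced here, not part of Keras):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input size followed by the size of each Dense layer in the model above.
layer_sizes = [784, 512, 512, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [401920, 262656, 5130]
print(total)      # 669706
```

These values match the three layers and the total reported by `model.summary()`.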


## 3. Define loss function and optimizer to optimise it

In [38]:
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
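For intuition about what `categorical_crossentropy` measures, here is a minimal NumPy sketch of the formula: the negative log of the probability assigned to the true class, averaged over the samples. This is an illustration, not the Keras implementation; the clipping constant `eps` and the example arrays are assumptions made for this sketch.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    # Clip to avoid log(0) for predictions that are exactly 0 or 1.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(np.mean(per_sample))

# Two made-up samples with three classes each.
y_true = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])

loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

The loss shrinks towards 0 as the predicted probability of the correct class approaches 1, which is what the optimizer tries to achieve during training.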

## 4. Compile and train the network

In [39]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
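The number of weight updates behind these ten epochs follows from the training parameters. A small sketch of the arithmetic, assuming the last, smaller batch of each epoch is also used:

```python
import math

num_train_samples = 60000
batch_size = 256
epochs = 10

# The final batch of an epoch is smaller than batch_size but still counts.
steps_per_epoch = math.ceil(num_train_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 235
print(total_updates)    # 2350
```

A smaller `batch_size` gives more, noisier updates per epoch; a larger one gives fewer, smoother updates.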


## 5. Evaluate results

In [40]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.10230890556128044
Test accuracy: 0.9782
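The accuracy metric compares the most probable predicted class with the true class. A minimal NumPy sketch of that computation follows; the probabilities and labels are made up for illustration and are not outputs of the model above.

```python
import numpy as np

# Made-up softmax outputs for four samples and three classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.2, 0.5, 0.3]])

# Made-up one-hot targets for the same four samples.
true_one_hot = np.array([[0, 1, 0],
                         [1, 0, 0],
                         [0, 1, 0],
                         [0, 1, 0]])

predicted = probs.argmax(axis=1)       # most probable class per sample
actual = true_one_hot.argmax(axis=1)   # index of the 1 in each target row

accuracy = float(np.mean(predicted == actual))
print(accuracy)  # 0.75
```

By the same definition, the test accuracy of 0.9782 above corresponds to 218 misclassified images out of 10000.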


# Exercise part-b: Convolutional neural network
In a convolutional neural network, the first Dense layers, which extract features from the images, are replaced with 2D convolutional layers.

## Convolutional layers

In Keras, the input layer is defined as:
```python
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
```
And for all the other layers after the input layer:
```python
model.add(Conv2D(64, (3, 3), activation='relu'))
```
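Convolutional layers expect images rather than the 784-element vectors used in the first part, so the data has to be reshaped and `input_shape` defined accordingly. The sketch below shows the reshape on a stand-in array (`flat_batch` is a hypothetical name). It assumes the `channels_last` layout `(samples, rows, cols, channels)`; if Keras is configured for `channels_first`, the channel axis moves to position 1.

```python
import numpy as np

img_rows, img_cols = 28, 28

# Stand-in for a flattened batch such as x_train from the first part.
flat_batch = np.zeros((4, img_rows * img_cols), dtype="float32")

# Assumption: channels_last layout, with 1 channel for grayscale images.
image_batch = flat_batch.reshape(-1, img_rows, img_cols, 1)

# input_shape describes a single sample, so the batch axis is dropped.
input_shape = image_batch.shape[1:]
print(input_shape)  # (28, 28, 1)
```

Applying the same reshape to `x_train` and `x_test` should give arrays that can be passed to a model whose first layer uses this `input_shape`.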

## Max Pooling layers

It is also important to use Max Pooling layers. They improve translation invariance and shrink the feature maps, which reduces the number of parameters in the layers that follow. Typically, Max Pooling is added at the end of each convolutional block. In Keras, a Max Pooling layer is added as follows:

```python
model.add(MaxPooling2D(pool_size=(2, 2)))
```
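To illustrate what a 2 x 2 Max Pooling layer computes, here is a minimal NumPy sketch on a single 2D feature map. It is not the Keras implementation; `max_pool_2x2` is a helper name introduced for this example, and it assumes non-overlapping windows, with any odd trailing row or column dropped.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 window."""
    rows, cols = feature_map.shape
    # Drop an odd trailing row or column so the map splits evenly.
    rows, cols = rows - rows % 2, cols - cols % 2
    blocks = feature_map[:rows, :cols].reshape(rows // 2, 2, cols // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 2, 0, 1],
                        [3, 4, 1, 0],
                        [0, 0, 5, 6],
                        [1, 0, 7, 8]])

pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 1]
#  [1 8]]
```

The output has half the height and width of the input, and shifting a strong activation by one pixel inside a window leaves the result unchanged, which is the source of the translation invariance mentioned above.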


## Typical design pattern for CNN network
1. Convolutional layer
2. Activation (typically ReLU)
3. MaxPooling

Several of these blocks are stacked, followed by a Dense classifier part at the end:

1. Block 1:
    1. Convolutional layer
    2. Activation
    3. MaxPooling
2. Block 2:
    1. Convolutional layer
    2. Activation
    3. MaxPooling
3. Block 3:
    1. Convolutional layer
    2. Activation
    3. MaxPooling
4. Dense classification part
    1. Flatten (to convert 2D matrices into a very long vector)
    2. Dense layer with a fairly large number of hidden units
    3. Dense layer with `num_classes` output units
    4. SoftMax activation to normalize outputs to [0, 1]
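Before building such a network, it helps to check how the image size shrinks from block to block. The sketch below traces the height (equal to the width) of a 28 x 28 input through three blocks. It assumes 3 x 3 kernels without padding and with stride 1, and 2 x 2 pooling with rounding down; other settings give different sizes.

```python
def conv_output_size(size, kernel=3):
    """Output size of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_output_size(size, pool=2):
    """Output size of non-overlapping pooling, rounding down."""
    return size // pool

size = 28
trace = [size]
for block in range(3):
    size = conv_output_size(size)   # Convolutional layer
    trace.append(size)
    size = pool_output_size(size)   # MaxPooling
    trace.append(size)

print(trace)  # [28, 26, 13, 11, 5, 3, 1]
```

Under these assumptions, three blocks reduce a 28 x 28 image to a single pixel per filter, so a fourth block would not fit. For small images like MNIST, fewer blocks or fewer pooling layers may be the better choice.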

## 1. Define your network here

In [None]:
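# Import the layers used in this part (they were not imported above)
from keras.layers import Conv2D, MaxPooling2D, Flatten

# Clear previous models, it is important
K.clear_session()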

# Init new model
model = Sequential()

# TODO: Add Convolutional layers here


# DO NOT REMOVE THIS LINE. We need to flatten 2d matrices to vectors before MLP part of the network
model.add(Flatten())

# TODO: You can change the number of hidden units (512) and the activation function if you want
model.add(Dense(512, activation='relu'))

# DO NOT edit lines after this line (if you don't want to see how it breaks)
model.add(Dense(num_classes, activation='softmax'))
model.summary()