## Handwritten Digit Classification Using Deep Learning
A.M.N.Hirushan

In [26]:
# imports for loading the MNIST dataset and building the model
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from matplotlib import pyplot
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
#tf.keras.optimizers.SGD

## Load Dataset

In [21]:
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # convert class vectors (integers) to one-hot binary class matrices
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

In [None]:
# to_categorical is imported from tensorflow.keras.utils

<img src="to_categorical.jpg" style="width:700px;"/>
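
To make the one-hot encoding concrete, here is a minimal NumPy sketch of the same idea. The helper `one_hot` is a hypothetical name used only for illustration; it mimics what `to_categorical` produces for integer labels, and is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # put a 1 in the column that matches each integer label
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# three example digit labels, ten possible classes (0-9)
example = one_hot([5, 0, 4], num_classes=10)
print(example.shape)  # (3, 10)
print(example[0])     # the 1 sits at index 5
```

Each row has exactly one non-zero entry, which is the form that the `categorical_crossentropy` loss expects for the targets.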

## Prepare Pixel Data

A good starting point is to normalize the pixel values of grayscale images, e.g. rescale them to the range [0,1]. 
This involves first converting the data type from unsigned integers to floats, then dividing the pixel values by the maximum value.

In [3]:
# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    return train_norm, test_norm
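
As a quick check of the scaling step, the sketch below applies the same two operations to a tiny made-up array instead of the full dataset. The 2×2 `pixels` array is an assumption for illustration only.

```python
import numpy as np

# hypothetical 2x2 "image" holding raw 8-bit pixel intensities
pixels = np.array([[0, 64], [128, 255]], dtype="uint8")

# same two steps as prep_pixels: cast to float32, then divide by the maximum value
scaled = pixels.astype("float32") / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

Casting before dividing keeps the result in float32; dividing the `uint8` array directly would give NumPy's default float64 and double the memory used.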

## Define Model

### Sequential
The Sequential API builds a model by stacking Keras layers in order, so that the output of each layer feeds the next one; this is where the name comes from.

from tensorflow.keras.models import Sequential<br> 
model = Sequential()

### Activation function
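An activation function introduces non-linearity, so the network can learn more than a linear mapping.<br><br>
<b>ReLU(x) = max(0, x)</b><br><br>
<b>Softmax</b><br>
For example, with scores (1, 3, 2) the probability of the first class is exp(1) / (exp(1) + exp(3) + exp(2)) ≈ 0.09.<br>
1) The softmax function converts real values into probabilities that sum to 1.<br>
2) It is typically used only in the output layer of a neural network.<br>
3) The class with the highest probability is taken as the prediction.<br>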
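
The two activation functions can be sketched in plain Python. This is an illustrative version, not the Keras implementation; the scores (1, 3, 2) are the ones from the softmax example, and subtracting the maximum score is a common numerical-stability trick that does not change the result.

```python
import math

def relu(x):
    # pass positive values through, clamp negatives to zero
    return max(0.0, x)

def softmax(scores):
    # shift by the largest score so exp() never overflows
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 3.0, 2.0])
print([round(p, 4) for p in probs])  # [0.09, 0.6652, 0.2447]

# the predicted class is the index with the highest probability
predicted = probs.index(max(probs))
print(predicted)  # 1
```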


### Filter
We define the Conv2D layer with 32 filters.<br>
Each filter is two-dimensional and square, with shape 3×3.<br> 
The layer expects input samples with the shape [rows, columns, channels], here [28, 28, 1].
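
To show what a single 3×3 filter does to a 28×28 input, here is a naive NumPy sketch. It assumes no padding and a stride of 1, which are the Conv2D defaults, and it uses a random image and a made-up averaging kernel; a trained layer would learn its kernel values instead.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1; return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype="float32")
    for r in range(out_h):
        for c in range(out_w):
            # element-wise product of the current window and the kernel, then sum
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

image = np.random.default_rng(0).random((28, 28)).astype("float32")
kernel = np.ones((3, 3), dtype="float32") / 9.0  # hypothetical averaging filter
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (26, 26)
```

The output shrinks from 28 to 26 in each dimension because a 3×3 window fits in only 28 - 3 + 1 positions per axis. With 32 filters, the layer produces 32 such feature maps.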

### SGD optimizer
We use a conservative configuration for the stochastic gradient descent optimizer with a learning rate of 0.01 and <br>
a momentum of 0.9.
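
The momentum update can be sketched on a one-dimensional toy problem. This follows the rule given in the Keras documentation for SGD with (non-Nesterov) momentum; the function `sgd_momentum_step` and the objective f(w) = w² are assumptions chosen for illustration.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.01, momentum=0.9):
    # velocity accumulates a decaying sum of past gradient steps
    velocity = momentum * velocity - learning_rate * grad
    # the weight moves by the velocity, not by the raw gradient
    return w + velocity, velocity

# minimise f(w) = w**2, whose gradient is 2 * w
w, velocity = 5.0, 0.0
for _ in range(300):
    grad = 2.0 * w
    w, velocity = sgd_momentum_step(w, velocity, grad)

print(w)  # very close to the minimum at 0
```

The momentum term lets consecutive steps in the same direction build up speed, which is why a small learning rate such as 0.01 can still make steady progress.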

In [28]:
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='softmax'))
# compile model
opt = SGD(learning_rate=0.01, momentum=0.9)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
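
As a sanity check on the architecture, the layer output sizes and parameter counts can be worked out by hand. The arithmetic below assumes the Conv2D defaults (no padding, stride 1) and a non-overlapping 2×2 pooling window; the totals can be compared against `model.summary()`.

```python
# Conv2D(32, (3, 3)) on a 28x28x1 input, no padding: 28 - 3 + 1 = 26
conv_side = 28 - 3 + 1
conv_params = (3 * 3 * 1 + 1) * 32        # 9 weights + 1 bias per filter

# MaxPooling2D((2, 2)) halves each spatial dimension: 26 -> 13
pool_side = conv_side // 2

# Flatten turns 13 x 13 x 32 feature maps into one vector
flat_units = pool_side * pool_side * 32

# Dense layers: inputs * units weights, plus one bias per unit
dense1_params = flat_units * 100 + 100
dense2_params = 100 * 10 + 10

total_params = conv_params + dense1_params + dense2_params
print(flat_units, total_params)  # 5408 542230
```

Almost all of the parameters sit in the first Dense layer, because it connects every one of the 5408 flattened values to each of its 100 units.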


In [27]:
trainX, trainY, testX, testY = load_dataset()
# prepare pixel data
trainX, testX = prep_pixels(trainX, testX)