
## Step 1: Import libraries

In [0]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Conv2D, MaxPooling2D, Flatten
from keras.optimizers import SGD, Adam, Adadelta
from keras.utils import np_utils
from keras.datasets import mnist
from keras import backend as K

## Step 2: Load mnist data

In [0]:
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == "channels_first":
  x_train = x_train.reshape(x_train.shape[0], 1, 28, 28)
  x_test = x_test.reshape(x_test.shape[0], 1, 28, 28)
  input_shape = (1, 28, 28)
else:
  x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
  x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
  input_shape = (28, 28, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train = x_train/255
x_test = x_test/255

# convert class vectors to binary class matrices (one-hot encoding)
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)
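
The two preprocessing steps above (scaling pixels to [0, 1] and one-hot encoding the labels) can be sketched in plain NumPy. This is only an illustration of what the calls do, not how Keras implements them; the `one_hot` helper and the tiny `pixels` array are hypothetical stand-ins, not part of the notebook.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Fancy indexing: in row i, set column labels[i] to 1.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Hypothetical stand-in data: two 2x2 "images" with pixel values in [0, 255].
pixels = np.array([[[0, 255], [51, 102]],
                   [[255, 255], [0, 0]]], dtype=np.uint8)
scaled = pixels.astype("float32") / 255  # values now lie in [0, 1]

labels = np.array([5, 0, 9])
encoded = one_hot(labels, 10)
print(encoded[0])
```

Each row of `encoded` has exactly one non-zero entry, at the index of the class label.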

## Step 3: Get familiar with the data

In [0]:
print("x_train shape:", x_train.shape)
print("y_train shape:", y_train.shape)
print("x_test shape:", x_test.shape)
print("y_test shape:", y_test.shape)

x_train shape: (60000, 28, 28, 1)
y_train shape: (60000, 10)
x_test shape: (10000, 28, 28, 1)
y_test shape: (10000, 10)


In [0]:
print(y_train[0]) # the first label is 5, shown here one-hot encoded

[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
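
To read the digit back out of a one-hot row like the one printed above, take the index of the largest entry. A minimal sketch, with the row copied by hand rather than taken from `y_train`, and a small made-up `batch` for the multi-row case:

```python
import numpy as np

# The one-hot row printed above, copied by hand.
one_hot_row = np.array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])
label = int(np.argmax(one_hot_row))
print(label)  # 5

# For a whole batch, take the argmax along the class axis.
batch = np.array([[0., 1., 0.],
                  [0., 0., 1.]])
labels = np.argmax(batch, axis=1)
print(labels)
```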


## Step 4: Create the CNN model

Layer types:
  
  1). **Dense**: fully connected layer
  2). **MaxPooling2D**: downsamples the feature maps
  3). **Conv2D**: convolutional layer
  4). **Flatten**: reshapes the feature maps into a vector for the Dense layers

Parameters:
1. Activation functions: 
    **1). relu** (cheap to compute; the usual choice for hidden layers)
    **2). sigmoid**
    **3). tanh**
    **4). softmax** (usually the last layer in multi-class classification)
      
2. Loss functions: 
    **1). mse** (poorly suited to classification)
    **2). categorical_crossentropy**
      
3. Optimizers: 
    **1). SGD(lr = 0.01)**
    **2). Adam**
    **3). Adadelta**
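
Some of the activation and loss functions listed above can be written out in NumPy to show what they compute. This is a rough sketch for intuition, not the Keras implementation; the function names and example predictions are made up for illustration.

```python
import numpy as np

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn scores into probabilities that sum to 1 along the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """-sum(y_true * log(y_pred)) per sample; eps avoids log(0)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

def mse(y_true, y_pred):
    """Mean squared error per sample."""
    return np.mean((y_true - y_pred) ** 2, axis=-1)

probs = softmax(np.array([[2.0, 1.0, 0.1]]))

# Hypothetical predictions for a sample whose true class is index 2.
y_true = np.array([[0., 0., 1.]])
confident_wrong = np.array([[0.98, 0.01, 0.01]])
confident_right = np.array([[0.01, 0.01, 0.98]])

ce_wrong = categorical_crossentropy(y_true, confident_wrong)[0]
ce_right = categorical_crossentropy(y_true, confident_right)[0]
mse_wrong = mse(y_true, confident_wrong)[0]
print(ce_wrong, ce_right, mse_wrong)
```

On the confidently wrong prediction the cross-entropy is several times larger than the mse, and it keeps growing as the predicted probability of the true class approaches zero, whereas mse stays bounded. That is one common argument for preferring cross-entropy in classification.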
      
  

In [0]:
model = Sequential()

# use the input_shape computed in Step 2 so channels_first backends also work
model.add(Conv2D(32, kernel_size = (3, 3), activation = 'relu', input_shape = input_shape))

model.add(Conv2D(64, (3, 3), activation = 'relu'))
model.add(MaxPooling2D((2, 2)))
# model.add(Dropout(0.25))  # dropout is one remedy for overfitting

model.add(Flatten())

model.add(Dense(128, activation = 'relu'))
# model.add(Dropout(0.5))

model.add(Dense(10, activation ='softmax'))

model.compile(loss = "categorical_crossentropy", optimizer = Adadelta(), metrics = ['accuracy'])

model.fit(x_train, y_train,
         batch_size = 128, 
         epochs = 20,
         verbose = 1,
         validation_data = (x_test, y_test))

score = model.evaluate(x_train, y_train, verbose = 0) 
print("Train Acc:", score[1])

score = model.evaluate(x_test, y_test, verbose = 0) 

print("Test Acc:", score[1])

Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Train Acc: 0.9999833333333333
Test Acc: 0.9909
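
As a sanity check on the architecture above, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults the model relies on: `padding='valid'` and stride 1 for `Conv2D`, and a `MaxPooling2D` stride equal to its pool size. The helper names are illustrative; `model.summary()` is the authoritative way to confirm these numbers.

```python
def valid_out(size, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution or pooling."""
    return (size - kernel) // stride + 1

def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

size = 28
size = valid_out(size, 3)            # after Conv2D(32, 3x3): 26
size = valid_out(size, 3)            # after Conv2D(64, 3x3): 24
size = valid_out(size, 2, stride=2)  # after MaxPooling2D(2x2): 12
flat = size * size * 64              # Flatten: 12 * 12 * 64 values

total = (conv2d_params(3, 1, 32)
         + conv2d_params(3, 32, 64)
         + dense_params(flat, 128)
         + dense_params(128, 10))
print(flat, total)
```

Nearly all of the parameters sit in the first Dense layer, which goes some way towards explaining how the model can fit the training set almost perfectly (train accuracy above 0.9999 versus 0.9909 on the test set).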
