## Homework 2

- Build and train an MLP model to classify the MNIST dataset

 1- An MLP accepts 1D input, so flatten each 2D image into a vector, then print the dimensions of the resulting arrays.
 
 2- Normalize the data by rescaling pixel values to [0, 1]
 
 3- Convert label arrays to one-hot representation (`keras.utils.to_categorical`)
 
 4- Define the model
    * Hidden Layer 1: Fully Connected + ReLU Activation (e.g. 512 Neurons)
    * Hidden Layer 2: Fully Connected + ReLU Activation (e.g. 512 Neurons)
    * Output Layer: Fully Connected + Softmax Activation
 
 
- Build and train a CNN+MLP deep learning model with Keras for the MNIST dataset, using the following specs:

    1. Conv2D(32, kernel_size=(3, 3), activation='relu')
    2. Conv2D(64, kernel_size=(3, 3), activation='relu')
    3. MaxPooling2D(pool_size=(2, 2))
    4. Dense(128, activation='relu')
    5. Dense(num_classes, activation='softmax')

    Also build a second model that adds BatchNormalization and Dropout.
    Compare the performance of the two CNN + MLP models on the test data.

In [1]:
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras import utils
from tensorflow.keras.optimizers import SGD

(x_train, y_train), (x_test, y_test) = mnist.load_data()

print('train data dim:', x_train.shape)
print('test data dim:', x_test.shape)
print('test label dim:', y_test.shape)

print('max of training data:', np.max(x_train))

train data dim: (60000, 28, 28)
test data dim: (10000, 28, 28)
test label dim: (10000,)
max of training data: 255


### MLP Model

In [27]:
# Reshaping and normalizing data

# Max pixel value is 255
x_train = np.reshape(x_train, [-1, 28*28])
x_test = np.reshape(x_test, [-1, 28*28])
x_train = x_train / 255
x_test = x_test / 255

# Number of classes is 10 (arabic numerals 0 through 9)
y_train = utils.to_categorical(y_train, 10)
y_test = utils.to_categorical(y_test, 10)
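
To make the three preprocessing steps concrete, here is a minimal NumPy-only sketch on a toy batch. The array `toy_images`, the labels `toy_labels`, and the helper `one_hot` are hypothetical stand-ins; the helper only mirrors what one-hot encoding produces and is not meant to reproduce the Keras implementation of `to_categorical`.

```python
import numpy as np

# Hypothetical toy batch: 3 grayscale "images" of 2x2 pixels, values in 0..255
toy_images = np.array([[[0, 255], [128, 64]],
                       [[255, 255], [0, 0]],
                       [[32, 16], [8, 4]]], dtype=np.uint8)
toy_labels = np.array([2, 0, 1])

# 1) Flatten each 2D image into a 1D vector: (3, 2, 2) -> (3, 4)
flat = toy_images.reshape(len(toy_images), -1)

# 2) Rescale pixel values from [0, 255] to [0, 1]
scaled = flat / 255.0

# 3) One-hot encode the labels (illustrative stand-in for to_categorical)
def one_hot(labels, num_classes):
    encoded = np.zeros((len(labels), num_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

encoded_labels = one_hot(toy_labels, num_classes=3)

print(flat.shape)                  # (3, 4)
print(scaled.min(), scaled.max())  # 0.0 1.0
print(encoded_labels)
```

On the real data the same reshape turns `(60000, 28, 28)` into `(60000, 784)`, which is the input size the first `Dense` layer expects.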

In [28]:
# Defining our MLP model

model = Sequential()

model.add(Dense(512, input_shape=(784,), activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(10, activation='softmax'))

In [29]:
# Use categorical cross-entropy as the loss function
# (`learning_rate` replaces the deprecated `lr` argument)
sgd = SGD(learning_rate=0.01)
model.compile(loss='categorical_crossentropy',
             optimizer=sgd,
             metrics=['accuracy'])
print(model.summary())

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_12 (Dense)             (None, 512)               401920    
_________________________________________________________________
dense_13 (Dense)             (None, 512)               262656    
_________________________________________________________________
dense_14 (Dense)             (None, 10)                5130      
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
None
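
The parameter counts in the summary can be checked by hand: a fully connected layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A small sketch of that arithmetic (the helper name `dense_params` is made up for illustration):

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

layer_sizes = [784, 512, 512, 10]  # input, hidden 1, hidden 2, output
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [401920, 262656, 5130]
print(sum(counts))  # 669706
```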


In [30]:
# Training the model
history = model.fit(x=x_train, y=y_train,
                   batch_size = 32,
                   epochs = 10,
                   verbose = 1,
                   validation_data=(x_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


Not bad! Let's try going DEEPER.

### CNN + MLP Deep Learning Model

In [2]:
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.optimizers import Adadelta

# Define the model

deep_model = Sequential()
deep_model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(28, 28, 1)))
deep_model.add(Conv2D(64, (3, 3), activation='relu'))
deep_model.add(MaxPool2D(pool_size=(2, 2)))
deep_model.add(Flatten())
deep_model.add(Dense(128, activation='relu'))
deep_model.add(Dense(10, activation='softmax'))

deep_model.compile(loss=categorical_crossentropy,
              optimizer=Adadelta(),
              metrics=['accuracy'])

print(deep_model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               1179776   
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
______________________________________________
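
The output shapes and parameter counts above follow from the Keras defaults of stride 1 and `'valid'` (no) padding for `Conv2D`. The sketch below reproduces them; all helper names are hypothetical and exist only for this check.

```python
def conv_out(size, kernel, stride=1):
    """Output length of a 'valid' (no padding) convolution along one axis."""
    return (size - kernel) // stride + 1

def conv_params(kernel, c_in, c_out):
    """kernel*kernel*c_in weights per filter, plus one bias per filter."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

s1 = conv_out(28, 3)   # 26 after the first 3x3 convolution
s2 = conv_out(s1, 3)   # 24 after the second
s3 = s2 // 2           # 12 after 2x2 max pooling
flat = s3 * s3 * 64    # 9216 features after Flatten

params = [conv_params(3, 1, 32),    # 320
          conv_params(3, 32, 64),   # 18496
          dense_params(flat, 128),  # 1179776
          dense_params(128, 10)]    # 1290

print(s1, s2, s3, flat)  # 26 24 12 9216
print(sum(params))       # 1199882
```

Almost all of the parameters sit in the first `Dense` layer, because `Flatten` hands it 9216 inputs.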

In [4]:
# Loading the data

(dx_train, dy_train), (dx_test, dy_test) = mnist.load_data()

print('train data dim:', dx_train.shape)
print('test data dim:', dx_test.shape)
print('test label dim:', dy_test.shape)

print('max of training data:', np.max(dx_train))

train data dim: (60000, 28, 28)
test data dim: (10000, 28, 28)
test label dim: (10000,)
max of training data: 255


In [5]:
# Reshaping the data

img_rows, img_cols = 28, 28
dx_train = dx_train.reshape(dx_train.shape[0], img_rows, img_cols, 1)
dx_test = dx_test.reshape(dx_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Number of classes is 10 (arabic numerals 0 through 9)
dy_train = utils.to_categorical(dy_train, 10)
dy_test = utils.to_categorical(dy_test, 10)

print(dx_train[0].shape)
print(dx_train[1].shape)
print(dy_train[0].shape)
print(dy_test[0].shape)

(28, 28, 1)
(28, 28, 1)
(10,)
(10,)


In [36]:
# Training the model
d_history = deep_model.fit(x=dx_train, y=dy_train,
                   batch_size = 32,
                   epochs = 10,
                   verbose = 1,
                   validation_data=(dx_test, dy_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


DEEEPER

### CNN + MLP Deep Learning Model with Batch Normalization and Dropout

In [3]:
from tensorflow.keras.layers import BatchNormalization, Dropout

# Define the model

bd_model = Sequential()
bd_model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(28, 28, 1)))
bd_model.add(BatchNormalization())
bd_model.add(Conv2D(64, (3, 3), activation='relu'))
bd_model.add(BatchNormalization())
bd_model.add(MaxPool2D(pool_size=(2, 2)))
bd_model.add(Dropout(0.25))
bd_model.add(Flatten())
bd_model.add(Dense(128, activation='relu'))
bd_model.add(BatchNormalization())
bd_model.add(Dropout(0.5))
bd_model.add(Dense(10, activation='softmax'))

bd_model.compile(loss=categorical_crossentropy,
              optimizer=Adadelta(),
              metrics=['accuracy'])

print(bd_model.summary())

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
batch_normalization (BatchNo (None, 26, 26, 32)        128       
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
batch_normalization_1 (Batch (None, 24, 24, 64)        256       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 64)        0         
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 9216)             
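
The `Param #` values in the `BatchNormalization` rows above come from the layer keeping four vectors per channel with its default settings: a learned scale (gamma) and offset (beta), plus a moving mean and moving variance that are tracked but not trained. A quick check against those rows (the helper name is hypothetical):

```python
def batchnorm_params(channels):
    """Return (trainable, non_trainable) parameter counts for one BN layer."""
    trainable = 2 * channels      # gamma and beta
    non_trainable = 2 * channels  # moving mean and moving variance
    return trainable, non_trainable

for channels in (32, 64):
    trainable, non_trainable = batchnorm_params(channels)
    print(channels, trainable + non_trainable)
# 32 128
# 64 256
```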

In [6]:
# Training the model
bd_history = bd_model.fit(x=dx_train, y=dy_train,
                   batch_size = 32,
                   epochs = 10,
                   verbose = 1,
                   validation_data=(dx_test, dy_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


🤷‍♀️
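
The assignment also asks for a comparison of the two CNN + MLP models on the test data. One way to sketch it, assuming both models were compiled with `metrics=['accuracy']` as above, so that `evaluate` returns `[loss, accuracy]`. The helpers `compare_on_test` and `print_comparison` and the dictionary keys are illustrative names, not part of the original notebook.

```python
def compare_on_test(models, x_test, y_test):
    """Evaluate each Keras-style model and return {name: (loss, accuracy)}."""
    results = {}
    for name, model in models.items():
        # With a single 'accuracy' metric, evaluate() returns [loss, accuracy]
        loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
        results[name] = (float(loss), float(accuracy))
    return results


def print_comparison(results):
    """Print one line per model, highest test accuracy first."""
    ranked = sorted(results.items(), key=lambda item: item[1][1], reverse=True)
    for name, (loss, accuracy) in ranked:
        print(f"{name:<20} loss={loss:.4f}  accuracy={accuracy:.4f}")

# Intended use in this notebook, after both models are trained:
# results = compare_on_test({'CNN': deep_model,
#                            'CNN + BN + Dropout': bd_model},
#                           dx_test, dy_test)
# print_comparison(results)
```

Since `validation_data` in the `fit` calls is the same test set, the numbers should match the last epoch's `val_loss` and `val_accuracy` for each model.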