Instructions<br>

**Based on the MNIST dataset, design and implement a proper convolutional neural network.**<br>
**Based on CNN classifiers, please implement an object detection task (including face recognition).**

(1)Based on the MNIST dataset, design and implement a proper convolutional neural network.<br>
    LeNet-5 CNN for handwrite digits

In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Load dataset

In [3]:
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Reshape the input to fit LeNet-5

In [4]:
X_train = np.array(X_train)
X_test = np.array(X_test)

X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

Padding the images by 2 pixels since in the paper input images were 32x32

In [5]:
X_train = np.pad(X_train, ((0,0),(2,2),(2,2),(0,0)), 'constant')
X_test = np.pad(X_test, ((0,0),(2,2),(2,2),(0,0)), 'constant')

Standardization

In [6]:
mean_px = X_train.mean().astype(np.float32)
std_px = X_train.std().astype(np.float32)
X_train = (X_train - mean_px)/(std_px)

In [9]:
import keras 
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

network1 = Sequential()

Layer 1: Conv Layer 1

In [10]:
network1.add(Conv2D(filters = 6,
                 kernel_size = 5,
                 strides = 1,
                 activation = 'relu',
                 input_shape = (32, 32, 1)))

Pooling layer 1

In [11]:
network1.add(MaxPooling2D(pool_size = 2, strides = 2))

Layer 2: Conv Layer 2

In [12]:
network1.add(Conv2D(filters = 16, 
                 kernel_size = 5,
                 strides = 1,
                 activation = 'relu',
                 input_shape = (14,14,6)))

Pooling Layer 2

In [13]:
network1.add(MaxPooling2D(pool_size = 2, strides = 2))

Flatten

In [14]:
network1.add(Flatten())

Layer 3: Fully connected layer 1

In [15]:
network1.add(Dense(units = 120, activation = 'relu'))

Layer 4: Fully connected layer 2

In [16]:
network1.add(Dense(units = 84, activation = 'relu'))

Layer 5: Output Layer

In [17]:
network1.add(Dense(units = 10, activation = 'softmax'))
network1.compile(optimizer = 'adam', loss = 'categorical_crossentropy',
              metrics = ['accuracy'])

Prepare the label

In [19]:
from keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

Train network1

In [21]:
network1.fit(X_train ,y_train, batch_size=128, epochs = 5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x2381fbcb240>

Test network1 with dataset

In [22]:
test_loss1, test_acc1 = network1.evaluate(X_test, y_test)
print('test_acc of network1: ', test_acc1)

test_acc of network1:  0.9774


(2)Based on CNN classifiers, please implement an object detection task (including face recognition).
I implement a face recognition network, tested it with data `olivettifaces.gif`

In [13]:
import numpy as np
from PIL import Image
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
from keras import backend as K

In [14]:
def get_load_data(dataset_path):
    img = Image.open(dataset_path)
    img_ndarray = np.asarray(img, dtype = 'float64')/255
    # 400 pictures, size: 57*47 = 2679  
    faces_data = np.empty((400, 2679))
    for row in range(20):  
       for column in range(20):
           faces_data[row*20+column] = np.ndarray.flatten(img_ndarray[row*57:(row+1)*57, column*47:(column+1)*47])
    label = np.empty(400)
    for i in range(40):
        label[i*10:(i+1)*10] = i
    label = label.astype(np.int)

    train_data = np.empty((320, 2679))
    train_label = np.empty(320)
    valid_data = np.empty((40, 2679))
    valid_label = np.empty(40)
    test_data = np.empty((40, 2679))
    test_label = np.empty(40)
    for i in range(40):
        train_data[i*8:i*8+8] = faces_data[i*10:i*10+8] 
        train_label[i*8:i*8+8] = label[i*10 : i*10+8] 
        valid_data[i] = faces_data[i*10+8]   
        valid_label[i] = label[i*10+8]       
        test_data[i] = faces_data[i*10+9]   
        test_label[i] = label[i*10+9]       
    train_data = train_data.astype('float32')
    valid_data = valid_data.astype('float32')
    test_data = test_data.astype('float32')

    result = [(train_data, train_label), (valid_data, valid_label), (test_data, test_label)]
    return result

In [15]:
def get_set_model(lr=0.005,decay=1e-6,momentum=0.9):
    model = Sequential()
    if K.image_data_format() == 'channels_first':
        model.add(Conv2D(nb_filters1, kernel_size=(3, 3), input_shape = (1, img_rows, img_cols)))
    else:
        model.add(Conv2D(nb_filters1, kernel_size=(2, 2), input_shape = (img_rows, img_cols, 1)))
    model.add(Activation('tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(nb_filters2, kernel_size=(3, 3)))
    model.add(Activation('tanh'))  
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))  

    model.add(Flatten())  
    model.add(Dense(1000))       #Full connection
    model.add(Activation('tanh'))  
    model.add(Dropout(0.5))  
    model.add(Dense(40))
    model.add(Activation('softmax'))  

    sgd = SGD(lr=lr, decay=decay, momentum=momentum, nesterov=True)  
    model.compile(loss='categorical_crossentropy', optimizer=sgd)
    return model  

In [16]:
def get_train_model(model,X_train, Y_train, X_val, Y_val):
    model.fit(X_train, Y_train, batch_size = batch_size, epochs = epochs,  
          verbose=1, validation_data=(X_val, Y_val))

    model.save_weights('model_weights.h5', overwrite=True)  
    return model  

In [17]:
def get_test_model(model,X,Y):
    model.load_weights('model_weights.h5')  
    score = model.evaluate(X, Y, verbose=0)
    return score  

In [18]:
# [start]
epochs = 35          
batch_size = 40     
img_rows, img_cols = 57, 47         
nb_filters1, nb_filters2 = 20, 40  

In [19]:
(X_train, y_train), (X_val, y_val),(X_test, y_test) = get_load_data('C:/Users/28347/olivettifaces.gif')

if K.image_data_format() == 'channels_first':    
    X_train = X_train.reshape(X_train.shape[0],1,img_rows,img_cols)
    X_val = X_val.reshape(X_val.shape[0], 1, img_rows, img_cols)  
    X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)  
    input_shape = (1, img_rows, img_cols)
else:
    X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)  
    X_val = X_val.reshape(X_val.shape[0], img_rows, img_cols, 1)  
    X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)  
    input_shape = (img_rows, img_cols, 1)

print('X_train shape:', X_train.shape)
# convert class vectors to binary class matrices  
Y_train = np_utils.to_categorical(y_train, 40)
Y_val = np_utils.to_categorical(y_val, 40)
Y_test = np_utils.to_categorical(y_test, 40)

model = get_set_model()
get_train_model(model, X_train, Y_train, X_val, Y_val)
score = get_test_model(model, X_test, Y_test)

model.load_weights('model_weights.h5')
classes = model.predict_classes(X_test, verbose=0)  
test_accuracy = np.mean(np.equal(y_test, classes))
print("last accuarcy:", test_accuracy)
for i in range(0,40):
    if y_test[i] != classes[i]:
        print(y_test[i], 'be misclassified as: ', classes[i]);

X_train shape: (320, 57, 47, 1)
Train on 320 samples, validate on 40 samples
Epoch 1/35
Epoch 2/35
Epoch 3/35
Epoch 4/35
Epoch 5/35
Epoch 6/35
Epoch 7/35
Epoch 8/35
Epoch 9/35
Epoch 10/35
Epoch 11/35
Epoch 12/35
Epoch 13/35
Epoch 14/35
Epoch 15/35
Epoch 16/35
Epoch 17/35
Epoch 18/35
Epoch 19/35
Epoch 20/35
Epoch 21/35
Epoch 22/35
Epoch 23/35
Epoch 24/35
Epoch 25/35
Epoch 26/35
Epoch 27/35
Epoch 28/35
Epoch 29/35
Epoch 30/35
Epoch 31/35
Epoch 32/35
Epoch 33/35
Epoch 34/35
Epoch 35/35
last accuarcy: 0.975
18.0 be misclassified as:  14
