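## Assignment
Since not every student has a GPU, the example here uses a simplified version of ResNet so that everyone can train it without trouble!


For the final day's assignment, please read this very thorough [article](https://blog.gtwang.org/programming/keras-resnet-50-pre-trained-model-build-dogs-cats-image-classification-system/), which covers essentially all of the techniques commonly used to train CNNs. Use every training technique you have learned to push the accuracy on the CIFAR-10 test data as high as you can, then take a screenshot of your best result and upload it to complete the final assignment!

These techniques are also widely used on Kaggle, and people keep developing new ones. One example is to use a model pretrained on ImageNet as a feature extractor and then train a new model on the extracted features. We will come back to these techniques in a more advanced course; interested students can also take a look at this [reference](https://www.kaggle.com/insaff/img-feature-extraction-with-pretrained-resnet).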
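
The feature-extractor idea can be sketched without downloading any pretrained weights. In the toy example below, a fixed random projection followed by ReLU is only a hypothetical stand-in for a pretrained network, and the data are synthetic. The point is the workflow: the extractor stays frozen and only a new classifier head is fitted on its outputs. All names here (`extract_features`, `frozen_weights`, `head`) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data standing in for images: 200 samples, 48 raw values, 3 classes.
num_samples, num_pixels, num_classes = 200, 48, 3
labels = rng.integers(0, num_classes, size=num_samples)
class_means = rng.normal(size=(num_classes, num_pixels))
images = class_means[labels] + 0.5 * rng.normal(size=(num_samples, num_pixels))

# Hypothetical stand-in for a pretrained network: a fixed projection plus ReLU.
frozen_weights = rng.normal(size=(num_pixels, 32))

def extract_features(x):
    """Frozen extractor: its weights are never updated."""
    return np.maximum(x @ frozen_weights, 0.0)

features = extract_features(images)

# Fit only a new linear head on the extracted features
# (least squares against one-hot targets, with a bias column).
targets = np.eye(num_classes)[labels]
design = np.hstack([features, np.ones((num_samples, 1))])
head, *_ = np.linalg.lstsq(design, targets, rcond=None)

predictions = (design @ head).argmax(axis=1)
train_accuracy = (predictions == labels).mean()
print('train accuracy of the new head:', train_accuracy)
```

With a real pretrained model the structure would be the same: compute the features once, then train a small model on them.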

In [30]:
from __future__ import print_function
import numpy as np
import os
import keras
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, Flatten
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras.callbacks import ReduceLROnPlateau
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.datasets import cifar10

In [31]:
batch_size = 128  
epochs = 30
data_augmentation = True
num_classes = 10
n = 16
depth = 6 * n + 2
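
The formula `depth = 6 * n + 2` follows from how the network below is built: one initial convolution, three stacks of `n` residual blocks with two convolutions each, and one final dense layer. The small helper below (`blocks_per_stack` is a name introduced here, not part of the notebook) checks that arithmetic and can be used to try a smaller `n` when training on a CPU.

```python
def blocks_per_stack(depth):
    """Residual blocks in each of the 3 stacks of a CIFAR-style ResNet v1."""
    if (depth - 2) % 6 != 0:
        raise ValueError('depth should be 6n + 2 (e.g. 20, 32, 44, ...)')
    return (depth - 2) // 6

n = 16
depth = 6 * n + 2
print(depth, blocks_per_stack(depth))   # 98 16
print(blocks_per_stack(6 * 3 + 2))      # 3 blocks per stack for depth 20
```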

In [32]:
# The data, shuffled and split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
x_train = x_train / 255.
x_test = x_test / 255.

input_shape = x_train.shape[1:]

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
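
To see what the two preprocessing steps do, here is a NumPy-only sketch on a made-up mini-batch (the arrays `fake_images` and `fake_labels` are placeholders, not CIFAR-10 data). It mirrors the one-hot encoding performed by `to_categorical` and the division by 255.

```python
import numpy as np

# Hypothetical mini-batch: 4 images and integer labels of shape (4, 1),
# the same layout that cifar10.load_data() returns.
fake_images = np.random.randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
fake_labels = np.array([[3], [0], [9], [3]])
num_classes = 10

# One-hot encoding: row i has a single 1 in column fake_labels[i].
one_hot = np.eye(num_classes, dtype='float32')[fake_labels.ravel()]

# Scaling: uint8 pixels in [0, 255] become floats in [0, 1].
scaled = fake_images / 255.

print(one_hot[0])
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```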


In [33]:
def lr_schedule(epoch):
    """Return the learning rate for `epoch` (stepped down after epochs 80, 120, 160 and 180)."""
    lr = 1e-5
    if epoch > 180:
        lr *= 0.5e-5
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    print('Learning rate: ', lr)
    return lr
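
It can help to tabulate a step schedule before training. The sketch below re-implements the same thresholds and multipliers as a lookup table (`STEPS` and `schedule` are names introduced here) and shows that a 30-epoch run never leaves the first branch, so the rate stays at its base value for the whole run.

```python
# (epoch threshold, multiplier) pairs copied from lr_schedule
STEPS = [(180, 0.5e-5), (160, 1e-3), (120, 1e-2), (80, 1e-1)]

def schedule(epoch, base_lr=1e-5):
    """Return base_lr scaled by the first multiplier whose threshold is exceeded."""
    for threshold, factor in STEPS:
        if epoch > threshold:
            return base_lr * factor
    return base_lr

rates_for_30_epochs = {schedule(epoch) for epoch in range(30)}
print(rates_for_30_epochs)            # a single value: the base rate
print(schedule(80), schedule(81))     # the first drop happens after epoch 80
```

If the run is shorter than the first threshold, either the thresholds or the number of epochs would need to change for the decay to take effect.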

In [34]:
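def resnet_layer(inputs,
                 num_filters=16,
                 kernel_size=3,
                 strides=1,
                 activation='relu',
                 batch_normalization=True,
                 conv_first=True):
    """Build a Conv2D / BatchNormalization / Activation stack.

    With conv_first=True the order is conv -> BN -> activation;
    otherwise it is BN -> activation -> conv.
    """
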
    conv = Conv2D(num_filters,
                  kernel_size=kernel_size,
                  strides=strides,
                  padding='same',
                  kernel_initializer='he_normal',
                  kernel_regularizer=l2(1e-4))

    x = inputs
    if conv_first:
        x = conv(x)
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
    else:
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
        x = conv(x)
    return x

In [35]:
def resnet_v1(input_shape, depth=depth, num_classes=10):
    """Build a ResNet v1 classifier; depth should be 6n + 2 (e.g. 20, 32, 44)."""

    num_filters = 16
    num_res_blocks = int((depth - 2) / 6)
    
    inputs = Input(shape=input_shape)
    
    x = resnet_layer(inputs=inputs)
    
    for stack in range(3):
        
        for res_block in range(num_res_blocks):
            strides = 1
            if stack > 0 and res_block == 0:  # first block of stacks 1 and 2
                strides = 2  # downsample
            y = resnet_layer(inputs=x,
                             num_filters=num_filters,
                             strides=strides)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters,
                             activation=None)
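            if stack > 0 and res_block == 0:
                # Project the shortcut with a 1x1 convolution so that its
                # shape matches the downsampled, wider residual branch.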
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y]) 
            x = Activation('relu')(x)
        num_filters *= 2

    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    model = Model(inputs=inputs, outputs=outputs)
    return model
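
The shapes inside `resnet_v1` can be traced by hand. The sketch below (`trace_stacks` is a helper introduced here) follows the same rules as the loops above: stacks 1 and 2 halve the spatial size through a stride-2 convolution with `'same'` padding, and the filter count doubles after every stack.

```python
def trace_stacks(input_size=32, num_filters=16, num_stacks=3):
    """Return (stack index, spatial size, filters) for each stack's output."""
    shapes = []
    size = input_size
    for stack in range(num_stacks):
        if stack > 0:
            size //= 2  # stride-2 convolution in the first block of the stack
        shapes.append((stack, size, num_filters))
        num_filters *= 2
    return shapes

for stack, size, filters in trace_stacks():
    print('stack', stack, '->', (size, size, filters))
```

For a 32x32 input the last stack outputs 8x8x64, which is why `AveragePooling2D(pool_size=8)` reduces each feature map to a single value before the dense layer.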

In [36]:
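model = resnet_v1(input_shape=input_shape, depth=depth)

# CIFAR-10 is a single-label, 10-class problem with a softmax output,
# so use categorical cross-entropy rather than binary cross-entropy.
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=lr_schedule(0)),
              metrics=['accuracy'])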
model.summary()

Learning rate:  1e-05
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 32, 32, 3)    0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 32, 32, 16)   448         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 32, 32, 16)   64          conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 32, 32, 16)   0           batch_normalization_1[0][0]      
...

batch_normalization_51 (BatchNo (None, 16, 16, 32)   128         conv2d_52[0][0]                  
__________________________________________________________________________________________________
add_25 (Add)                    (None, 16, 16, 32)   0           activation_49[0][0]              
                                                                 batch_normalization_51[0][0]     
__________________________________________________________________________________________________
activation_51 (Activation)      (None, 16, 16, 32)   0           add_25[0][0]                     
__________________________________________________________________________________________________
conv2d_53 (Conv2D)              (None, 16, 16, 32)   9248        activation_51[0][0]              
__________________________________________________________________________________________________
batch_normalization_52 (BatchNo (None, 16, 16, 32)   128         conv2d_53[0][0]                  
...

Total params: 1,546,986
Trainable params: 1,539,786
Non-trainable params: 7,200
__________________________________________________________________________________________________
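
For a 10-class softmax output, the reported accuracy depends on how predictions are compared with the one-hot labels. As far as I know, Keras 2.x resolves `metrics=['accuracy']` to an element-wise binary accuracy when the loss is `binary_crossentropy`, and to categorical accuracy when the loss is `categorical_crossentropy`. The NumPy sketch below (both functions are written here for illustration, not taken from Keras) shows how different the two numbers can be on predictions that are entirely wrong.

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class is the true class."""
    return (y_true.argmax(axis=-1) == y_pred.argmax(axis=-1)).mean()

def binary_accuracy(y_true, y_pred):
    """Fraction of individual outputs that match after rounding at 0.5."""
    return (y_true == np.round(y_pred)).mean()

y_true = np.eye(10)[[3, 7]]        # two samples, true classes 3 and 7
y_pred = np.full((2, 10), 0.05)
y_pred[:, 0] = 0.55                # both predictions pick class 0 (wrong)

print('categorical accuracy:', categorical_accuracy(y_true, y_pred))
print('binary accuracy:', binary_accuracy(y_true, y_pred))
```

Both samples are misclassified, yet 8 of the 10 outputs per sample still match after rounding, so the element-wise metric looks far better than the model really is.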


In [37]:
lr_scheduler = LearningRateScheduler(lr_schedule)

lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1),
                               cooldown=0,
                               patience=5,
                               min_lr=0.5e-6)

callbacks = [lr_reducer, lr_scheduler]
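
The arithmetic behind `factor=np.sqrt(0.1)` and `min_lr=0.5e-6` is easy to check in isolation: two reductions divide the rate by 10, and the rate is never allowed to fall below `min_lr`. The function below is a simplified model of a single reduction step, not the Keras callback itself, and the starting rate of `1e-3` is chosen only for illustration.

```python
import numpy as np

def reduce_on_plateau(lr, factor=np.sqrt(0.1), min_lr=0.5e-6):
    """One reduction step: scale by factor, never going below min_lr."""
    return max(lr * factor, min_lr)

lr = 1e-3  # hypothetical starting rate
history = [lr]
for _ in range(8):
    lr = reduce_on_plateau(lr)
    history.append(lr)

for step, value in enumerate(history):
    print(step, value)
```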

In [38]:
augment_generator = ImageDataGenerator(rotation_range=10, width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True)
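
Two of these augmentations are simple enough to write out in NumPy. The sketch below is only an approximation of what the generator does: `horizontal_flip` and `shift_right` are helpers introduced here, and the shift fills the gap by repeating the edge column, which corresponds to the generator's default `fill_mode='nearest'`. With `width_shift_range=0.1`, a 32-pixel-wide image moves by at most about 3 pixels.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an (H, W, C) image left to right."""
    return image[:, ::-1, :]

def shift_right(image, pixels):
    """Shift an (H, W, C) image right, repeating the left edge column in the gap."""
    if pixels == 0:
        return image.copy()
    shifted = np.empty_like(image)
    shifted[:, pixels:, :] = image[:, :-pixels, :]
    shifted[:, :pixels, :] = image[:, :1, :]
    return shifted

image = np.arange(32 * 32 * 3).reshape(32, 32, 3)
flipped = horizontal_flip(image)
shifted = shift_right(image, 3)
print(flipped.shape, shifted.shape)
```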

In [39]:
history = model.fit_generator(
    augment_generator.flow(x_train, y_train, batch_size=batch_size),
    steps_per_epoch=len(x_train) // batch_size,
    epochs=epochs,
    verbose=1,
    validation_data=(x_test, y_test),
    workers=4,
    callbacks=callbacks)

scores = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

Epoch 1/30
Learning rate:  1e-05
Epoch 2/30
Learning rate:  1e-05
Epoch 3/30
Learning rate:  1e-05
Epoch 4/30
Learning rate:  1e-05
Epoch 5/30
Learning rate:  1e-05
Epoch 6/30
Learning rate:  1e-05
Epoch 7/30
Learning rate:  1e-05
Epoch 8/30
Learning rate:  1e-05
Epoch 9/30
Learning rate:  1e-05
Epoch 10/30
Learning rate:  1e-05
Epoch 11/30
Learning rate:  1e-05
Epoch 12/30
Learning rate:  1e-05
Epoch 13/30
Learning rate:  1e-05
Epoch 14/30
Learning rate:  1e-05
Epoch 15/30
Learning rate:  1e-05
Epoch 16/30
Learning rate:  1e-05
Epoch 17/30
Learning rate:  1e-05
Epoch 18/30
Learning rate:  1e-05
Epoch 19/30
Learning rate:  1e-05
Epoch 20/30
Learning rate:  1e-05
Epoch 21/30
Learning rate:  1e-05
Epoch 22/30
Learning rate:  1e-05
Epoch 23/30
Learning rate:  1e-05
Epoch 24/30
Learning rate:  1e-05
Epoch 25/30
Learning rate:  1e-05
Epoch 26/30
Learning rate:  1e-05
Epoch 27/30
Learning rate:  1e-05
Epoch 28/30
Learning rate:  1e-05
Epoch 29/30
Learning rate:  1e-05
Epoch 30/30
Learning ra