This tutorial will guide you through the process of using _transfer learning_ to train an accurate image classifier from a small number of training samples. Transfer learning refers to taking an existing neural network that was previously trained to good performance on a larger dataset and using it as the basis for a new model, which reuses what the previous model learned for the new task. This procedure is also called "fine-tuning", because we take a previously trained network and fine-tune its weights on a new dataset.

How it works: we first load a previously trained neural net, Keras's built-in VGG16 model, which was trained to do ImageNet classification on 1000 classes. We remove its final layer, the 1000-neuron softmax classification layer, and replace it with a new classification layer sized for the number of classes we are training on. We then "freeze" all the weights in the network except the ones connecting to the new classification layer, and re-train the model on our new dataset. By keeping the earlier weights fixed, we get to reuse the features that were discovered by the previous model.

We will compare this method to training a small neural network from scratch on the new dataset. As we shall see, transfer learning dramatically improves our accuracy.

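To make the idea of freezing concrete before we bring in Keras, here is a minimal NumPy sketch. It is not the VGG16 pipeline used in this tutorial: the "pre-trained" part is a hypothetical fixed random projection, the dataset is made up, and only a new softmax layer on top is trained with plain gradient descent. The names `W_frozen`, `W_head` and `b_head` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pre-trained network: a fixed random projection plus ReLU.
# (Hypothetical; in this tutorial the frozen part is VGG16 without its last layer.)
W_frozen = rng.normal(size=(20, 16)) / np.sqrt(20)
W_frozen_before = W_frozen.copy()

def features(x):
    """Frozen feature extractor: its weights are never updated below."""
    return np.maximum(x @ W_frozen, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, y_onehot):
    return float(-np.mean(np.sum(y_onehot * np.log(p + 1e-12), axis=1)))

# Toy dataset with 3 classes (made up for illustration).
x = rng.normal(size=(150, 20))
y_onehot = np.eye(3)[np.argmax(x[:, :3], axis=1)]

# New classification head: the only trainable parameters.
W_head = np.zeros((16, 3))
b_head = np.zeros(3)

f = features(x)  # the features never change, so we can compute them once
initial_loss = cross_entropy(softmax(f @ W_head + b_head), y_onehot)

learning_rate = 0.1
for _ in range(300):
    p = softmax(f @ W_head + b_head)
    grad_logits = (p - y_onehot) / len(x)  # gradient of the mean cross-entropy w.r.t. the logits
    W_head -= learning_rate * (f.T @ grad_logits)
    b_head -= learning_rate * grad_logits.sum(axis=0)

final_loss = cross_entropy(softmax(f @ W_head + b_head), y_onehot)
print(initial_loss, final_loss)
```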
In [71]:
import os
import random
#from tqdm import tqdm
import numpy as np
import keras
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input

from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D


In [7]:
model = keras.applications.VGG16(weights='imagenet', include_top=True)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
__________

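The `Param #` column in the summary above can be checked by hand. The sketch below assumes 3x3 kernels, which is what VGG16 uses in every convolutional layer, and counts one kernel per input/output channel pair plus one bias per filter. The helper name `conv2d_params` is our own, not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights for every input/output channel pair, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

# (layer name, input channels, number of filters), read off the summary above
vgg_first_convs = [
    ("block1_conv1", 3, 64),
    ("block1_conv2", 64, 64),
    ("block2_conv1", 64, 128),
    ("block2_conv2", 128, 128),
]

counts = {name: conv2d_params(3, 3, c_in, c_out) for name, c_in, c_out in vgg_first_convs}
for name, n in counts.items():
    print(name, n)
```

The pooling layers have no weights, which is why their parameter count is 0; they only halve the width and height.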
In [None]:

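from keras.models import Model
from keras.layers import Input, Dense
# minimal setup so the example runs: a single Dense layer that starts out frozen
x = Input(shape=(32,))
layer = Dense(2)
layer.trainable = False
y = layer(x)

frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')

layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')

# placeholder random data, only so that fit() has something to train on
data = np.random.random((64, 32))
labels = np.random.random((64, 2))
frozen_model.fit(data, labels)  # this does NOT update the weights of `layer`
trainable_model.fit(data, labels)  # this updates the weights of `layer`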


In [16]:
from keras.models import Model

#x = Input(shape=(32,))
#layer = Dense(2)
#layer.trainable = False
#y = layer(x)


#m2 = Model(inputs=model.input, outputs=vgg.get_layer("fc2").output)

#model.layers.pop()
model.summary()

from keras.layers import Dense



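# new softmax layer with one output per class in our dataset
new_layer = Dense(5, activation='softmax')

inp = model.input
# attach the new layer to the output of the second-to-last layer, so that it
# replaces VGG16's original 1000-way softmax layer instead of stacking on top of it
out = new_layer(model.layers[-2].output)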

model2 = Model(inp, out)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
__________

In [17]:
model2.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
__________

In [44]:
# freeze every layer except the new classification layer; this has to happen
# before compile(), because changes to `trainable` only take effect once the
# model is (re)compiled
for l in model2.layers[:-1]:
    l.trainable = False

model2.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])



In [49]:
def get_image(path):
    img = image.load_img(path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    return x

root = '/Users/gene/Teaching/ML4A/ml4a-guides/data/101_ObjectCategories'
categories = ['cellphone', 'chair', 'chandelier', 'cougar_body', 'cougar_face']

dataset = []
for c, category in enumerate(categories):
    folder_path = os.path.join(root, category)
    images = [os.path.join(dp, f) for dp, dn, filenames in os.walk(folder_path) for f in filenames if os.path.splitext(f)[1].lower() in ['.jpg','.png','.jpeg']]
    for img in images:
        dataset.append([get_image(img), c])


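The call to `preprocess_input` in `get_image` is what puts our images into the form VGG16 was trained on. As a rough NumPy sketch, assuming the default ("caffe"-style) behaviour of `keras.applications.imagenet_utils.preprocess_input`, it reorders the channels from RGB to BGR and subtracts the per-channel ImageNet means. The function name `caffe_style_preprocess` is hypothetical, and the real function may differ in details such as dtype handling.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the "caffe" preprocessing mode.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(batch_rgb):
    """Approximate preprocess_input for a float batch of shape (N, H, W, 3) in RGB order."""
    batch_bgr = batch_rgb[..., ::-1]      # RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR  # zero-center each channel

# A dummy batch containing one 224x224 image where every pixel is (R, G, B) = (10, 20, 30).
batch = np.zeros((1, 224, 224, 3), dtype=np.float32)
batch[...] = [10.0, 20.0, 30.0]

out = caffe_style_preprocess(batch)
print(out.shape, out[0, 0, 0])
```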

In [65]:

random.shuffle(dataset)

num_classes = 5

n_train = int(0.8 * len(dataset))
train = dataset[:n_train]
test = dataset[n_train:]

x_train, y_train = np.array([t[0][0] for t in train]), [t[1] for t in train]
x_test, y_test = np.array([t[0][0] for t in test]), [t[1] for t in test]

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)


print(model2.input_shape)
print(np.array(x_train).shape)
print(np.array(x_test).shape)
print(np.array(y_train).shape)
print(np.array(y_test).shape)

model2.summary()

(None, 224, 224, 3)
(275, 224, 224, 3)
(69, 224, 224, 3)
(275, 5)
(69, 5)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
bl

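The shapes printed above follow from two small steps: the 80/20 split and the one-hot encoding done by `keras.utils.to_categorical`. Here is a NumPy sketch of both. The `one_hot` helper is our own stand-in for `to_categorical`, and the total of 344 samples is simply 275 + 69 from the output above.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 in the label's column."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((len(labels), num_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

n_samples = 344                   # 275 training + 69 test samples, as printed above
n_train = int(0.8 * n_samples)    # int() truncates 275.2 down to 275
n_test = n_samples - n_train
print(n_train, n_test)

example = one_hot([0, 2, 4], num_classes=5)
print(example)
```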
In [66]:
model2.fit(x_train, y_train, batch_size=16, epochs=10, verbose=1, validation_data=(x_test, y_test))

Train on 275 samples, validate on 69 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x113920650>

In [67]:
score = model2.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

input_shape = model.input_shape
print(input_shape)

('Test loss:', 0.15770488923442536)
('Test accuracy:', 0.97101449275362317)
(None, 224, 224, 3)

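The accuracy metric reported above is the fraction of test images whose highest-probability class matches the true label; a value of 0.971 on 69 test images corresponds to 67 correct predictions. The sketch below shows the computation on a small, made-up set of probabilities (not real model output). The `accuracy` helper is our own, not a Keras function.

```python
import numpy as np

def accuracy(probabilities, y_onehot):
    """Fraction of rows where the most probable class equals the labeled class."""
    predicted = np.argmax(probabilities, axis=1)
    actual = np.argmax(y_onehot, axis=1)
    return float(np.mean(predicted == actual))

# Hypothetical softmax outputs for 4 samples and 3 classes.
probs = np.array([
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.30, 0.30, 0.40],
    [0.60, 0.30, 0.10],  # predicted class 0, but the label below is class 1
])
labels = np.eye(3)[[0, 1, 2, 1]]

acc = accuracy(probs, labels)
print(acc)
```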

In [79]:
model3 = Sequential()
model3.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=model.input_shape[1:]))
model3.add(Conv2D(64, (3, 3), activation='relu'))
model3.add(MaxPooling2D(pool_size=(2, 2)))
model3.add(Dropout(0.25))
model3.add(Flatten())
model3.add(Dense(128, activation='relu'))
model3.add(Dropout(0.5))
model3.add(Dense(num_classes, activation='softmax'))


model3.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])

model3.summary()  # summary() prints the table itself, so no print() is needed
model3.fit(x_train, y_train,
          batch_size=64,
          epochs=10,
          verbose=1,
          validation_data=(x_test, y_test))
score = model3.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_8 (Conv2D)            (None, 222, 222, 32)      896       
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 220, 220, 64)      18496     
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 110, 110, 64)      0         
_________________________________________________________________
dropout_7 (Dropout)          (None, 110, 110, 64)      0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 774400)            0         
_________________________________________________________________
dense_12 (Dense)             (None, 128)               99123328  
_________________________________________________________________
dropout_8 (Dropout)          (None, 128)               0         
__________

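Almost all of this small network's parameters sit in the first dense layer, and the summary above lets us see why. The sketch below retraces the shapes by hand, assuming Keras's default `'valid'` padding for `Conv2D` (each 3x3 convolution trims 2 pixels from the width and height) and a 2x2 max-pool that halves them. The helper names are our own.

```python
def conv_valid_output(size, kernel):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel + 1

def dense_params(n_in, n_out):
    """One weight per input/output pair, plus one bias per output unit."""
    return n_in * n_out + n_out

side = 224
side = conv_valid_output(side, 3)    # after the first Conv2D
after_conv1 = side
side = conv_valid_output(side, 3)    # after the second Conv2D
after_conv2 = side
side = side // 2                     # after 2x2 max-pooling
after_pool = side

flat = after_pool * after_pool * 64  # Flatten: width * height * channels
dense = dense_params(flat, 128)      # first Dense layer
print(after_conv1, after_conv2, after_pool, flat, dense)
```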
In [80]:
yy = model3.predict(x_test)
print(yy)

[[  4.67201098e-05   3.29545233e-04   9.99495208e-01   5.86423012e-05
    6.99033480e-05]
 [  9.86965537e-01   5.12156216e-03   2.68430007e-03   7.02193181e-04
    4.52640792e-03]
 [  2.31203809e-03   9.95093107e-01   8.53715814e-04   1.50004285e-03
    2.41120099e-04]
 [  2.17481679e-03   3.42527893e-03   9.66433145e-05   3.88829976e-01
    6.05473280e-01]
 [  2.27084711e-05   7.57834528e-07   1.38215762e-06   5.07252710e-03
    9.94902611e-01]
 [  2.46833963e-07   9.99996305e-01   1.27271235e-07   3.17139620e-06
    1.26873019e-07]
 [  3.08975868e-05   1.52450838e-07   5.61762761e-07   9.97974455e-01
    1.99394347e-03]
 [  1.04177769e-08   1.18090320e-08   9.89422162e-08   8.46949297e-06
    9.99991417e-01]
 [  8.79706495e-05   4.50998954e-07   4.88280457e-05   9.98866141e-01
    9.96620860e-04]
 [  2.39015008e-05   5.19912355e-05   2.35802308e-03   1.04002702e-05
    9.97555673e-01]
 [  4.24749214e-06   9.80457008e-01   1.91632938e-02   1.48924650e-04
    2.26530974e-04]
 [  6.5529