## Modify the Model

### Retrain last layer's linear model

The original VGG16 network's last layer is Dense (a linear model), so it is a little odd and wasteful that in lesson 2 we added an additional linear model on top of it.

Also, you may notice that the last layer has a softmax activation, which is an odd choice for what becomes an intermediate layer once we add another linear layer on top of the model.
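One way to see why softmax is awkward in the middle of a network: it forces its outputs to sum to 1, so it throws away any constant offset in its inputs before the next layer sees them. The snippet below is a small NumPy illustration of that property and is not part of the original notebook; the `softmax` helper and the example values are assumptions made for the demonstration.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; this does not change the result.
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])   # illustrative pre-activation values
shifted = logits + 5.0               # same values with a constant offset added

p = softmax(logits)
p_shifted = softmax(shifted)

# Both inputs give the same probabilities, so a layer placed after the
# softmax cannot tell them apart.
print(p)
print(p_shifted)
```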

So, we start by removing the last layer, and telling Keras to fix the weights in all the other layers.

In [1]:
%matplotlib inline
from importlib import reload
# import utils; reload(utils)
# from utils import *
import keras
print(keras.__version__)
print(keras.__path__)
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential
from keras.layers import Input
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD, RMSprop
from keras.preprocessing import image

ModuleNotFoundError: No module named 'keras'

In [9]:
from vgg16 import Vgg16
vgg = Vgg16()
model = vgg.model

vgg.model.summary()

ModuleNotFoundError: No module named 'keras'

In [3]:
model.pop()  # remove the last layer
for layer in model.layers:
    layer.trainable = False

**WARNING:** Now that we have modified the definition of _model_, be careful not to rerun any code in the previous sections.

In [4]:
model.add(Dense(2, activation = 'softmax'))

Now, compile our updated model, and set up our batches to use the preprocessed images.

In [7]:
path = "data/dogscats/sample/"
# path = "data/dogscats/"
model_path = path + 'models/'

trn_data = load_array(model_path+'train_data.bc')
val_data = load_array(model_path+'valid_data.bc')

# Use batch size of 1 since we're just doing preprocessing on the CPU
val_batches = get_batches(path+'valid', shuffle=False, batch_size=1)
batches = get_batches(path+'train', shuffle=False, batch_size=1)

val_classes = val_batches.classes
trn_classes = batches.classes
val_labels = onehot(val_classes)
trn_labels = onehot(trn_classes)

batch_size = 16
gen = image.ImageDataGenerator()
batches = gen.flow(trn_data, trn_labels, batch_size=batch_size, shuffle=True)
val_batches = gen.flow(val_data, val_labels, batch_size=batch_size, shuffle=False)

Found 50 images belonging to 2 classes.
Found 200 images belonging to 2 classes.


We define a simple function for fitting models, just to save some typing:

In [13]:
def fit_model(model, batches, val_batches, nb_epoch=1):
    model.fit_generator(batches, samples_per_epoch=batches.N, nb_epoch=nb_epoch,
                       validation_data=val_batches, nb_val_samples=val_batches.N)

...and now, we can use it to train the last layer of our model!
Be warned, it will run quite slowly, because it still has to calculate all the previous layers in order to know what input to pass to the new final layer.

You can always precalculate the output of the penultimate layer, like we did earlier, but since we're only likely to want 1 or 2 iterations, let's just run it.

In [9]:
opt = RMSprop(lr=0.1)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

In [14]:
fit_model(model, batches, val_batches, nb_epoch=2)

Epoch 1/2
Epoch 2/2
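For comparison, the precalculation alternative mentioned above can be sketched in plain NumPy. This is a toy illustration under stated assumptions, not the notebook's Keras code: `frozen_features` is a hypothetical stand-in for the frozen VGG16 layers, and the data, sizes, and learning rate are made up. The point is that the expensive frozen part runs once, and only the small softmax layer is trained on the cached features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen layers: a fixed projection plus ReLU.
W_frozen = rng.normal(size=(20, 8)) / np.sqrt(20)

def frozen_features(x):
    return np.maximum(x @ W_frozen, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy two-class data set (made up for the illustration).
x = rng.normal(size=(100, 20))
labels = (x[:, 0] > 0).astype(int)
y = np.eye(2)[labels]                # one-hot labels

# Run the frozen part once and cache its output.
feats = frozen_features(x)

# Train only the new final layer (weights W, bias b) on the cached features.
W = np.zeros((8, 2))
b = np.zeros(2)

def loss(W, b):
    p = softmax(feats @ W + b)
    return -np.mean(np.sum(y * np.log(p + 1e-12), axis=1))

initial_loss = loss(W, b)
for _ in range(200):
    p = softmax(feats @ W + b)
    grad = (p - y) / len(feats)      # gradient of mean cross-entropy w.r.t. logits
    W -= 0.2 * feats.T @ grad
    b -= 0.2 * grad.sum(axis=0)
final_loss = loss(W, b)

print(initial_loss, final_loss)
```

Every pass of the training loop reuses `feats`, so the cost of the frozen layers is paid once rather than once per epoch.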


### How many layers to retrain?

Well, for the dogs vs. cats problem, the classes are similar to the ImageNet model's output classes, so there is no need to retrain more layers. But for State Farm, we may consider retraining more of the Dense layers.

However, even for State Farm there is no need to retrain the convolutional layers, because the spatial relationships in the pictures are very likely to be the same. Figuring out whether someone is using a mobile phone is not going to require different spatial features.
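A minimal sketch of that policy, keeping the convolutional layers frozen and making everything from the first Dense layer onward trainable. The classes below are hypothetical stand-ins so the snippet runs without Keras; with a real model you would pass `model.layers` instead, and the check on the class name `Dense` is an assumption about how the layers are identified.

```python
# Hypothetical stand-ins for Keras layer classes (illustration only).
class Layer:
    def __init__(self):
        self.trainable = False

class Convolution2D(Layer):
    pass

class Flatten(Layer):
    pass

class Dense(Layer):
    pass

def unfreeze_from_first_dense(layers):
    """Freeze every layer before the first Dense layer, unfreeze the rest."""
    first_dense = next(i for i, layer in enumerate(layers)
                       if type(layer).__name__ == "Dense")
    for layer in layers[:first_dense]:
        layer.trainable = False
    for layer in layers[first_dense:]:
        layer.trainable = True
    return first_dense

layers = [Convolution2D(), Convolution2D(), Flatten(), Dense(), Dense()]
first_dense = unfreeze_from_first_dense(layers)
flags = [layer.trainable for layer in layers]
print(first_dense, flags)
```

In Keras, the model generally needs to be compiled again after changing `trainable` for the new setting to take effect.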