## Introduction to our first task: 'Dogs vs Cats'

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [2]:
PATH = "/home/paperspace/data/dogscats/"
sz=224
batch_size=64

In [3]:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras.applications import ResNet50
from keras.models import Model, Sequential
from keras import backend as K
from keras.applications.resnet50 import preprocess_input

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


In [4]:
train_data_dir = f'{PATH}train'
validation_data_dir = f'{PATH}valid'

Rather than creating a data object directly, we first need to create a data generator, which defines how the data is generated: what kind of data augmentation and data normalization we'd like to do.

We also need to know a little bit about what ResNet50 expects as input.

Generally speaking, copying and pasting Keras code from the internet is a good way to be sure you've got the right settings to make that work.
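As a rough illustration of what "data normalization" and "data augmentation" mean here, the NumPy sketch below mimics two of the settings used in the next cell, `rescale=1. / 255` and `horizontal_flip=True`. It is not Keras's implementation: the helper names `rescale` and `hflip` are made up for this example, and Keras applies the flip at random per image rather than every time.

```python
import numpy as np

def rescale(img, factor=1.0 / 255):
    """Normalization: map uint8 pixel values in [0, 255] to floats in [0, 1]."""
    return img.astype(np.float32) * factor

def hflip(img):
    """Augmentation: mirror a (height, width, channels) image left to right."""
    return img[:, ::-1, :]

# A tiny 1x3 RGB "image" (hypothetical data, only for the demo)
img = np.array([[[0, 0, 0], [128, 128, 128], [255, 255, 255]]], dtype=np.uint8)

normalized = rescale(img)
flipped = hflip(img)

print(normalized.min(), normalized.max())  # smallest and largest pixel after rescaling
print(flipped[0, 0])                        # the pixel that used to be on the right edge
```

The validation generator keeps the rescaling but drops the augmentation, since validation images should be scored as they are.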

In [5]:
train_datagen = ImageDataGenerator(
                                   rescale=1. / 255,
                                   #preprocessing_function=preprocess_input,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)

# It's up to you to create a separate generator without data augmentation
test_datagen = ImageDataGenerator(
                                 rescale=1. / 255
                                 #preprocessing_function=preprocess_input
                                    )

# We then create a batch generator from that data generator
# by pointing it at a directory
train_generator = train_datagen.flow_from_directory(train_data_dir,
                                                    target_size=(sz,sz),
                                                    batch_size=batch_size,
                                                    class_mode='binary')

# You have to do the same for the validation set
validation_generator = \
    test_datagen.flow_from_directory(validation_data_dir,
                                     shuffle=False,
                                     target_size=(sz, sz),
                                     batch_size=batch_size,
                                     class_mode='binary')

Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.


I use ResNet50 because Keras unfortunately doesn't have ResNet34.

You have to construct a model on top of the base model by hand.

In [6]:
%time base_model = ResNet50(weights='imagenet', include_top=False)

CPU times: user 7.84 s, sys: 616 ms, total: 8.46 s
Wall time: 8.25 s


In [7]:
# base model
x = base_model.output
# layers on top of that
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(1, activation='sigmoid')(x)
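To show what these three layers compute, here is a rough NumPy sketch, assuming channels-last feature maps. The weights are random stand-ins and the sizes are shrunk so it runs instantly; the function names are hypothetical and none of this is Keras code.

```python
import numpy as np

rng = np.random.default_rng(0)

def global_average_pooling_2d(x):
    """Average over the spatial axes: (batch, h, w, channels) -> (batch, channels)."""
    return x.mean(axis=(1, 2))

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(x, w, b, activation):
    """Fully connected layer: activation(x @ w + b)."""
    return activation(x @ w + b)

# Stand-in for base_model.output: 2 images, 7x7 feature maps, 8 channels
features = rng.normal(size=(2, 7, 7, 8))

pooled = global_average_pooling_2d(features)                            # (2, 8)
hidden = dense(pooled, rng.normal(size=(8, 16)), np.zeros(16), relu)    # (2, 16)
probs = dense(hidden, rng.normal(size=(16, 1)), np.zeros(1), sigmoid)   # (2, 1)

print(pooled.shape, hidden.shape, probs.shape)
```

The single sigmoid unit gives one probability per image, which is what `class_mode='binary'` in the generators expects.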

There is no concept of automatically freezing layers, and no API for it, so you have to loop through the layers you want to freeze.

Keras has a concept we don't have in fastai or PyTorch: compiling a model. In Keras you have to specify the optimizer, loss, and metrics yourself when you compile. With fastai we know which loss is the right one to use. You can always override it, but for a particular model we give you good defaults.
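For reference, the `binary_crossentropy` loss passed to `compile` in the next cell is, for a label y in {0, 1} and a predicted probability p, the value -(y log p + (1 - y) log(1 - p)), averaged over the batch. A minimal NumPy sketch follows. The clipping constant `eps` is an assumption added for numerical safety, not necessarily what Keras uses internally.

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from 0 (assumed value)."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    losses = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return float(np.mean(losses))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # mostly right and fairly sure
unsure = np.array([0.5, 0.5, 0.5, 0.5])     # always predicts 50/50

print(binary_crossentropy(y_true, confident))
print(binary_crossentropy(y_true, unsure))   # log(2), about 0.693
```

Confident correct predictions give a lower loss than a constant 0.5 guess, which is why this loss suits a single sigmoid output.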

In [8]:
model = Model(inputs=base_model.input, outputs=predictions)
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

* Pass in `train_generator` and `validation_generator`.
* For some reason Keras also expects you to tell it how many batches there are per epoch.
    * The number of batches is the number of samples in the generator divided by the batch size.
* You can tell it how many epochs to run.
* Just like in fastai, you can tell it how many processes (`workers`) to use for preprocessing.
    * Unlike fastai, Keras defaults to a single worker, so to get good speed you need to make sure to set this.

That's basically enough to start fine-tuning the last layers.
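The batches-per-epoch arithmetic can be checked with the image counts that `flow_from_directory` reported above. The cell below uses floor division, which leaves the final partial batch out of the count; ceiling division is shown only as an alternative, not as what the notebook does.

```python
import math

batch_size = 64
n_train, n_valid = 23000, 2000  # counts printed by flow_from_directory

# Floor division, as in the notebook: the final partial batch is not counted
train_steps_floor = n_train // batch_size
valid_steps_floor = n_valid // batch_size

# Ceiling division: the final partial batch counts as one extra step
train_steps_ceil = math.ceil(n_train / batch_size)
valid_steps_ceil = math.ceil(n_valid / batch_size)

print(train_steps_floor, train_steps_ceil)  # 359 360
print(valid_steps_floor, valid_steps_ceil)  # 31 32

# Images that fall outside the counted steps with floor division
print(n_train - train_steps_floor * batch_size)  # 24
print(n_valid - valid_steps_floor * batch_size)  # 16
```

With floor division, 16 of the 2000 validation images fall outside the counted steps, so the choice has a small effect on the reported validation accuracy.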

In [9]:
%%time
model.fit_generator(train_generator,
                    train_generator.n // batch_size,
                    epochs=3,
                    workers=4,
                    validation_data=validation_generator,
                    validation_steps=validation_generator.n // batch_size)

Epoch 1/3
Epoch 2/3
Epoch 3/3
CPU times: user 19min 36s, sys: 41.3 s, total: 20min 18s
Wall time: 9min 49s


<keras.callbacks.History at 0x7fe3ab61a0b8>

There is no concept of layer groups, differential learning rates, or partial unfreezing.

So I have to print out all of the layers and decide manually how many I want to fine-tune. I decided to fine-tune everything from layer `140` onwards.

After you change this you have to recompile the model, and then I run another epoch of training.
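Printing the layers with their indices makes it easier to pick a split point. The sketch below relies only on each layer having `name` and `trainable` attributes, which Keras layers do. `describe` and `set_trainable_from` are hypothetical helper names, and the stand-in layers exist only so the snippet runs without Keras.

```python
from types import SimpleNamespace

def set_trainable_from(layers, split_at):
    """Freeze layers[:split_at] and unfreeze layers[split_at:]."""
    for i, layer in enumerate(layers):
        layer.trainable = i >= split_at

def describe(layers):
    """Return one 'index name trainable' line per layer for manual inspection."""
    return [f"{i:3d} {layer.name:<12} trainable={layer.trainable}"
            for i, layer in enumerate(layers)]

# Hypothetical stand-in layers; with Keras you would pass model.layers instead
layers = [SimpleNamespace(name=f"layer_{i}", trainable=False) for i in range(6)]

set_trainable_from(layers, split_at=4)
print("\n".join(describe(layers)))
```

With a real model, the call would be `set_trainable_from(model.layers, 140)`, followed by `model.compile(...)` so the change takes effect.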

In [10]:
split_at = 140
for layer in model.layers[:split_at]:
    layer.trainable = False
for layer in model.layers[split_at:]:
    layer.trainable = True
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])    

In [11]:
%%time
model.fit_generator(train_generator,
                    train_generator.n // batch_size,
                    epochs=1, workers=3,
                    validation_data=validation_generator,
                    validation_steps=validation_generator.n // batch_size)

Epoch 1/1
CPU times: user 7min 4s, sys: 25.2 s, total: 7min 30s
Wall time: 4min 1s


<keras.callbacks.History at 0x7fe329b48da0>