In [12]:
# Install packages
!pip install -q keras-trainer keras-model-specs stored

In [2]:
# Import dependencies
import os 
import numpy as np
import json
%reload_ext autoreload
%autoreload 2
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
from keras_trainer import Trainer
from keras_model_specs import ModelSpec

In [2]:
import stored
# Download Cats vs Dogs Dataset
stored.sync('https://storage.googleapis.com/sample-datasets/cats-vs-dogs/train.zip', 'data/train')
stored.sync('https://storage.googleapis.com/sample-datasets/cats-vs-dogs/valid.zip', 'data/valid')

## Keras Trainer

An abstraction for training Keras CNN models for image classification. It requires the `keras-model-specs` package to be installed as well.

The following models are supported:

`vgg16`, `vgg19`, `resnet50`, `resnet152`, `mobilenet_v1`, `xception`,
`inception_resnet_v2`, `inception_v3`, `inception_v4`, `nasnet_large`, `nasnet_mobile`, `densenet_169`,
`densenet_121`, `densenet_201`

Their defaults are specified [here](https://github.com/triagemd/keras-model-specs/blob/master/keras_model_specs/model_specs.json).

The following loads the default model spec for the `mobilenet_v1` architecture:

In [11]:
model_spec = ModelSpec.get('mobilenet_v1')

Here you can see the contents:

In [4]:
print(json.dumps(model_spec.as_json(), indent=True))

{'preprocess_func': 'between_plus_minus_1', 'name': 'mobilenet_v1', 'preprocess_args': None, 'klass': 'keras.applications.mobilenet.MobileNet', 'target_size': [224, 224, 3]}


You can override the defaults by passing different parameters. Let's use `preprocess_func='mean_subtraction'` as the image preprocessing function and pass the mean to subtract as `preprocess_args=dataset_mean`.

In [13]:
dataset_mean = [142.69182214, 119.05833338, 106.89884415]
model_spec = ModelSpec.get('mobilenet_v1', preprocess_func='mean_subtraction', preprocess_args=dataset_mean)

We'll see the changes now:

In [14]:
print(json.dumps(model_spec.as_json(), indent=True))

{'klass': 'keras.applications.mobilenet.MobileNet', 'preprocess_func': 'mean_subtraction', 'target_size': [224, 224, 3], 'name': 'mobilenet_v1', 'preprocess_args': [142.69182214, 119.05833338, 106.89884415]}
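To make the two preprocessing options concrete, here is a per-pixel sketch of the arithmetic their names suggest. This is an assumption about what `between_plus_minus_1` and `mean_subtraction` compute, not code taken from `keras-model-specs`. The helper names are hypothetical, and the real functions work on whole image arrays rather than single pixels.

```python
def scale_between_plus_minus_1(pixel):
    """Map 8-bit channel values from [0, 255] to [-1, 1] (assumed behaviour)."""
    return [(value / 255.0 - 0.5) * 2.0 for value in pixel]


def subtract_mean(pixel, mean):
    """Subtract a per-channel mean from one RGB pixel (assumed behaviour)."""
    return [value - channel_mean for value, channel_mean in zip(pixel, mean)]


dataset_mean = [142.69182214, 119.05833338, 106.89884415]
pixel = [255.0, 0.0, 127.5]  # one RGB pixel

scaled = scale_between_plus_minus_1(pixel)
centered = subtract_mean(pixel, dataset_mean)
print(scaled)
print(centered)
```

Under these assumptions, the first option bounds every channel to a fixed range, while the second only shifts each channel so that the dataset average sits near zero.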


### Keras Trainer definition

These are the default options:

In [5]:
Trainer.OPTIONS

{'activation': {'default': 'softmax', 'type': str},
 'batch_size': {'default': 1, 'type': int},
 'callback_list': {'default': [], 'type': list},
 'checkpoint_path': {'default': None, 'type': str},
 'class_weights': {'default': None, 'type': None},
 'decay': {'default': 0.0005, 'type': float},
 'dropout_rate': {'default': 0.0, 'type': float},
 'epochs': {'default': 1, 'type': int},
 'freeze_layers_list': {'default': None, 'type': None},
 'include_top': {'default': False, 'type': bool},
 'input_shape': {'default': None, 'type': None},
 'loss_function': {'default': 'categorical_crossentropy', 'type': str},
 'max_queue_size': {'default': 16, 'type': int},
 'metrics': {'default': ['accuracy'], 'type': list},
 'model_kwargs': {'default': {}, 'type': dict},
 'model_spec': {'type': str},
 'momentum': {'default': 0.9, 'type': float},
 'num_gpus': {'default': 0, 'type': int},
 'optimizer': {'default': None, 'type': None},
 'output_logs_dir': {'type': str},
 'output_model_dir': {'type': str},
 'p
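Each entry in `Trainer.OPTIONS` pairs an option name with a `type` and, for optional settings, a `default`. Entries without a `default`, such as `output_logs_dir`, appear to be required. The sketch below shows one way such a table could be resolved against user keyword arguments. `resolve_options` and the small `OPTIONS` excerpt are hypothetical and are not the library's implementation.

```python
OPTIONS = {
    'batch_size': {'default': 1, 'type': int},
    'epochs': {'default': 1, 'type': int},
    'momentum': {'default': 0.9, 'type': float},
    'output_logs_dir': {'type': str},  # no default: treated as required here
}


def resolve_options(spec, **kwargs):
    """Merge user kwargs with defaults and check the declared types."""
    unknown = set(kwargs) - set(spec)
    if unknown:
        raise ValueError('Unknown options: %s' % sorted(unknown))

    resolved = {}
    for name, rule in spec.items():
        if name in kwargs:
            value = kwargs[name]
        elif 'default' in rule:
            value = rule['default']
        else:
            raise ValueError('Missing required option: %s' % name)
        expected = rule.get('type')
        # A None type or a None value skips the type check.
        if expected is not None and value is not None and not isinstance(value, expected):
            raise TypeError('%s must be of type %s' % (name, expected.__name__))
        resolved[name] = value
    return resolved


options = resolve_options(OPTIONS, output_logs_dir='output/logs/', epochs=10)
print(options)
```

Omitting `output_logs_dir` in this sketch raises a `ValueError`, which mirrors the idea that options without a default must always be supplied.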

### Setting up the training data

To train a model, you first need the data in place. There must be a parent folder containing one folder per class, e.g. for the cats vs dogs classification problem: `'data/train/cats'`, `'data/train/dogs'`.
A validation set with the same layout is also required: `'data/valid/cats'`, `'data/valid/dogs'`.

Specify these folders with `train_dataset_dir` and `val_dataset_dir`. You also need to specify paths for the model outputs and logs with `output_model_dir` and `output_logs_dir`:

In [5]:
train_dataset_dir = 'data/train/'
val_dataset_dir = 'data/valid/'
output_model_dir = 'output/models/'
output_logs_dir = 'output/logs/'
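Before training, it can help to confirm that a dataset folder really has one sub-folder per class. The sketch below counts the images in each class folder. `count_images_per_class` and the extension list are hypothetical helpers, and the demo runs on a temporary directory that imitates the expected layout rather than on the real data.

```python
import os
import tempfile

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')


def count_images_per_class(dataset_dir):
    """Return {class_name: image_count} for each sub-folder of dataset_dir."""
    counts = {}
    for class_name in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[class_name] = sum(
            1 for filename in os.listdir(class_dir)
            if filename.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


# Demonstrate on a throwaway directory that mimics the expected layout.
with tempfile.TemporaryDirectory() as root:
    for class_name, n_images in (('cats', 3), ('dogs', 2)):
        os.makedirs(os.path.join(root, class_name))
        for i in range(n_images):
            open(os.path.join(root, class_name, '%d.jpg' % i), 'w').close()
    demo_counts = count_images_per_class(root)

print(demo_counts)
```

On the real data you would call `count_images_per_class('data/train/')` and compare the totals with the "Found N images belonging to K classes" message printed when the trainer is created.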

By default, Keras Trainer uses Keras generators with the following data augmentation:

```
train_data_generator = image.ImageDataGenerator(
    rotation_range=180,
    width_shift_range=0,
    height_shift_range=0,
    preprocessing_function=self.model_spec.preprocess_input,
    shear_range=0,
    zoom_range=0.1,
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode='nearest'
)
```

You can also pass your own: use `train_data_generator` and `val_data_generator` to supply custom data augmentation, or `train_generator` and `val_generator` to supply complete iterators.

### Setting up the model, fine tuning a pre-trained model

By default, ImageNet weights are loaded (`weights='imagenet'`) and the top dense layers are not included (`include_top=False`), so you can define new top layers to fine-tune the network. You can choose `weights='None'` to train from scratch.

To put your own layers on top, pass a list of Keras layers as `top_layers`:

In [4]:
from keras.layers import Dense, Dropout, Activation

# Create a dropout layer with dropout rate 0.5
dropout = Dropout(0.5)
# Create a dense layer with 10 outputs
dense = Dense(10, name='dense')
# Create a softmax activation layer
softmax = Activation('softmax', name='softmax_activation')

top_layers = [dropout, dense, softmax]

If you don't, a linear `Dense` layer with `num_classes` outputs followed by a `Softmax` activation layer is added by default.
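For intuition about the default head, the sketch below shows what a softmax activation does to the raw outputs (logits) of a linear dense layer. It is a plain-Python illustration with made-up logits, not the Keras implementation.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Two logits, as a linear Dense layer could emit for cats vs dogs.
probabilities = softmax([2.0, 0.0])
print(probabilities)
```

The larger logit receives the larger probability, and the outputs always sum to one, which is what `categorical_crossentropy` expects.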

### Optimizers, Callbacks, Metrics and Loss Functions

By default, the SGD optimizer is used with the default parameters shown in `OPTIONS`.

```
self.optimizer = self.optimizer or optimizers.SGD(
    lr=self.sgd_lr,
    decay=self.decay,
    momentum=self.momentum,
    nesterov=True
)
```
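The `decay` argument of the Keras 2.x SGD optimizer applies time-based decay: the effective rate is `lr / (1 + decay * iterations)`, where `iterations` counts batch updates. The sketch below evaluates that formula for the 23000-image training set used in this notebook with a batch size of 32. The base rate of 0.001 is just an example value, and `decayed_lr` is a hypothetical helper.

```python
import math


def decayed_lr(base_lr, decay, iteration):
    """Time-based decay: the rate shrinks as the number of batch updates grows."""
    return base_lr / (1.0 + decay * iteration)


# 23000 training images in batches of 32, assuming one update per batch.
steps_per_epoch = math.ceil(23000 / 32)

for epoch in (0, 1, 5, 10):
    iteration = epoch * steps_per_epoch
    print('start of epoch %2d: lr = %.6f' % (epoch, decayed_lr(0.001, 0.0005, iteration)))
```

With `decay=0.0005`, the rate has already halved after 2000 updates, so the decay interacts noticeably with any additional learning rate schedule.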

Any other optimizer can be used: define it and pass it as `optimizer`. You can also define a variable learning rate in the form of a Keras callback.
You can define as many callbacks as you want; they go in `callback_list`.
Let's see an example:

In [8]:
from keras.callbacks import LearningRateScheduler
from keras import optimizers

BASE_LR = 0.001

# Divide the learning rate by 10 at epoch 10 and again at epoch 20.
# The rate is computed from the epoch index alone, so the function does not
# rely on a global `model` or on the Keras backend `K`, neither of which is
# defined at this point in the notebook.
def scheduler(epoch):
    if epoch >= 20:
        lr = BASE_LR / 100
    elif epoch >= 10:
        lr = BASE_LR / 10
    else:
        lr = BASE_LR
    if epoch in (10, 20):
        print("lr changed to {}".format(lr))
    return lr

schedule_lr = LearningRateScheduler(scheduler)
callback_list = [schedule_lr]

optim = optimizers.SGD(lr=BASE_LR, decay=0.0005, momentum=0.9, nesterov=True)

You can also define a dictionary of class weights, keyed by class index:

In [9]:
class_weights = {0: 13.883058178601447, 1: 1.4222778260019158, 2: 9.875415960083256, 3: 1.8788427493250286}
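Class weights are often derived from how many samples each class has. The sketch below uses one common heuristic, "balanced" weighting, where each weight is `n_samples / (n_classes * class_count)`. The counts are hypothetical, and the weights shown in this notebook were not necessarily computed this way.

```python
def balanced_class_weights(class_counts):
    """Give rarer classes proportionally larger weights."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        class_index: total / (n_classes * count)
        for class_index, count in class_counts.items()
    }


# Hypothetical imbalanced dataset: class 0 is nine times more frequent.
class_counts = {0: 900, 1: 100}
weights = balanced_class_weights(class_counts)
print(weights)
```

With this heuristic a perfectly balanced dataset gets a weight of 1.0 for every class.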

Custom metrics and loss functions can also be passed via `metrics` and `loss_function`; by default, `accuracy` and `categorical_crossentropy` are used, respectively.
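As a reminder of what the two defaults measure, here is a plain-Python sketch of categorical cross-entropy and accuracy on a tiny made-up batch. It is only an illustration: a real custom Keras metric or loss is a function of `(y_true, y_pred)` written with backend tensor operations, not Python lists.

```python
import math


def categorical_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Clip to avoid log(0) when a probability is exactly 0.
    clipped = [min(max(p, epsilon), 1.0 - epsilon) for p in y_pred]
    return -sum(t * math.log(p) for t, p in zip(y_true, clipped))


def accuracy(batch_true, batch_pred):
    """Fraction of samples whose highest-probability class matches the target."""
    def argmax(values):
        return max(range(len(values)), key=lambda i: values[i])
    hits = sum(argmax(t) == argmax(p) for t, p in zip(batch_true, batch_pred))
    return hits / len(batch_true)


batch_true = [[1, 0], [0, 1], [1, 0]]
batch_pred = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]

losses = [categorical_crossentropy(t, p) for t, p in zip(batch_true, batch_pred)]
mean_loss = sum(losses) / len(losses)
batch_accuracy = accuracy(batch_true, batch_pred)
print(mean_loss, batch_accuracy)
```

The third sample is misclassified, so it lowers the accuracy and also contributes the largest loss term.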

### Creating the Trainer

Once everything is ready, we create the trainer object:

In [16]:
trainer = Trainer(model_spec=model_spec,
                  train_dataset_dir=train_dataset_dir,
                  val_dataset_dir=val_dataset_dir,
                  output_model_dir=output_model_dir,
                  output_logs_dir=output_logs_dir,
                  batch_size=32,
                  epochs=10,
                  workers=16,
                  max_queue_size=128,
                  num_gpus=0,
                  optimizer=optim,
                  class_weights=class_weights,
                  verbose=False,
                  input_shape=(None, None, 3)
                )

Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.


The trainer object contains the model, which you can access directly:

In [17]:
trainer.model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
conv1 (Conv2D)               (None, 112, 112, 32)      864       
_________________________________________________________________
conv1_bn (BatchNormalization (None, 112, 112, 32)      128       
_________________________________________________________________
conv1_relu (Activation)      (None, 112, 112, 32)      0         
_________________________________________________________________
conv_dw_1 (DepthwiseConv2D)  (None, 112, 112, 32)      288       
_________________________________________________________________
conv_dw_1_bn (BatchNormaliza (None, 112, 112, 32)      128       
_________________________________________________________________
conv_dw_1_relu (Activation)  (None, 112, 112, 32)      0         
__________

### Freeze Layers

You can also freeze the layers you don't want to train by passing a list of their names or indices:

In [25]:
# Let's freeze the first 10 layers
layers_to_freeze = np.arange(0,10,1)
print(layers_to_freeze)

[0 1 2 3 4 5 6 7 8 9]
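Freezing a layer in Keras means setting its `trainable` attribute to `False`. The sketch below shows how a mixed list of indices and names could be applied. `freeze_layers` is a hypothetical helper, the layer objects are simple stand-ins rather than real Keras layers, and this is not necessarily how `Trainer` handles `freeze_layers_list` internally.

```python
from types import SimpleNamespace


def freeze_layers(layers, layers_to_freeze):
    """Mark layers as non-trainable, selecting them by index or by name."""
    targets = set(layers_to_freeze)
    for index, layer in enumerate(layers):
        if index in targets or layer.name in targets:
            layer.trainable = False


# Stand-in objects with the two attributes the helper relies on.
layers = [
    SimpleNamespace(name=name, trainable=True)
    for name in ('conv1', 'conv1_bn', 'conv1_relu', 'conv_dw_1')
]

freeze_layers(layers, [0, 'conv1_relu'])
print([(layer.name, layer.trainable) for layer in layers])
```

With a real Keras model you would pass `model.layers`, and the model has to be compiled again after changing `trainable` for the change to take effect.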


In [38]:
trainer = Trainer(model_spec=model_spec,
                  train_dataset_dir=train_dataset_dir,
                  val_dataset_dir=val_dataset_dir,
                  output_model_dir=output_model_dir,
                  output_logs_dir=output_logs_dir,
                  batch_size=32,
                  epochs=2,
                  workers=16,
                  max_queue_size=128,
                  num_gpus=1,
                  optimizer=optim,
                  class_weights=class_weights,
                  verbose=False,
                  input_shape=(None, None, 3),
                  freeze_layers_list=layers_to_freeze
                )

Training data
Found 23000 images belonging to 2 classes.
Validation data
Found 2000 images belonging to 2 classes.


In [39]:
trainer.model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_12 (InputLayer)        (None, None, None, 3)     0         
_________________________________________________________________
conv1 (Conv2D)               (None, None, None, 32)    864       
_________________________________________________________________
conv1_bn (BatchNormalization (None, None, None, 32)    128       
_________________________________________________________________
conv1_relu (Activation)      (None, None, None, 32)    0         
_________________________________________________________________
conv_dw_1 (DepthwiseConv2D)  (None, None, None, 32)    288       
_________________________________________________________________
conv_dw_1_bn (BatchNormaliza (None, None, None, 32)    128       
_________________________________________________________________
conv_dw_1_relu (Activation)  (None, None, None, 32)    0         
__________

### Model Training

Now, let's train the model:

In [40]:
trainer.run()

Epoch 1/2

Epoch 00001: val_acc improved from -inf to 0.90474, saving model to output/models/model_max_acc.hdf5

Epoch 00001: val_loss improved from inf to 0.27277, saving model to output/models/model_min_loss.hdf5
Epoch 2/2

Epoch 00002: val_acc improved from 0.90474 to 0.93648, saving model to output/models/model_max_acc.hdf5

Epoch 00002: val_loss improved from 0.27277 to 0.16118, saving model to output/models/model_min_loss.hdf5
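The log lines above are consistent with two "save best only" checkpoints, one maximizing `val_acc` and one minimizing `val_loss`. The sketch below reproduces that bookkeeping in plain Python using the values from the log. `BestTracker` is a hypothetical class for illustration and is not part of Keras or Keras Trainer.

```python
class BestTracker:
    """Remember the best value seen so far for one monitored metric."""

    def __init__(self, mode):
        self.mode = mode
        # Start from the worst possible value so the first epoch always improves.
        self.best = float('-inf') if mode == 'max' else float('inf')

    def update(self, value):
        """Return True (and store the value) when it beats the current best."""
        improved = value > self.best if self.mode == 'max' else value < self.best
        if improved:
            self.best = value
        return improved


max_acc = BestTracker(mode='max')
min_loss = BestTracker(mode='min')

# Validation metrics copied from the training log above.
epochs = [
    {'val_acc': 0.90474, 'val_loss': 0.27277},
    {'val_acc': 0.93648, 'val_loss': 0.16118},
]
saved = []
for number, logs in enumerate(epochs, start=1):
    if max_acc.update(logs['val_acc']):
        saved.append((number, 'model_max_acc.hdf5'))
    if min_loss.update(logs['val_loss']):
        saved.append((number, 'model_min_loss.hdf5'))

print(saved)
```

The initial values explain why the first epoch reports an improvement "from -inf" for accuracy and "from inf" for loss.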


After the model is trained, we can access its history:

In [41]:
history = trainer.history       

And inside the history we can find the training stats:

In [50]:
for i in range(len(history.history['val_acc'])):
    print('Epoch %d' % i)
    print('Training Accuracy was %.3f' % history.history['acc'][i])
    print('Training Loss was %.3f' % history.history['loss'][i])
    print('Validation Accuracy was %.3f' % history.history['val_acc'][i])
    print('Validation Loss was %.3f' % history.history['val_loss'][i])
    print()

Epoch 0
Training Accuracy was 0.868
Training Loss was 0.997
Validation Accuracy was 0.905
Validation Loss was 0.273

Epoch 1
Training Accuracy was 0.925
Training Loss was 0.457
Validation Accuracy was 0.936
Validation Loss was 0.161
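A common next step is to pick the best epoch from these statistics. The sketch below does so on a plain dictionary filled with the rounded values printed above. `best_epoch` is a hypothetical helper, and the dictionary stands in for `trainer.history.history`.

```python
# Rounded values copied from the printed statistics above.
history_dict = {
    'acc': [0.868, 0.925],
    'loss': [0.997, 0.457],
    'val_acc': [0.905, 0.936],
    'val_loss': [0.273, 0.161],
}


def best_epoch(values, mode):
    """Return the zero-based epoch index with the highest or lowest value."""
    pick = max if mode == 'max' else min
    return pick(range(len(values)), key=lambda i: values[i])


best_by_acc = best_epoch(history_dict['val_acc'], 'max')
best_by_loss = best_epoch(history_dict['val_loss'], 'min')
print('Best validation accuracy at epoch %d' % best_by_acc)
print('Best validation loss at epoch %d' % best_by_loss)
```

After a real run you would pass `trainer.history.history['val_acc']` and `trainer.history.history['val_loss']` instead of the hard-coded lists.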

