# General Principles for Computer Vision CNNs

A simple list of things to consider:
    1. Use ImageNet-pretrained VGG conv features (for full-colour photos).
    2. Add data augmentation: vary each parameter (rotation, shift, etc.) on its own on a sample to find the best value, then combine them. Use these parameters to create 5x more data up front, since precomputed conv features can't otherwise be combined with on-the-fly augmentation.
    3. Add dropout (start with p=0.5 and experiment).
    4. Add pseudo-labelling: predict on the validation and test sets and append those predictions, as labels, to the training labels.

## Sample Setup
This assumes the data is already in the directory structure we want, including the sample and validation sets.

In [1]:
from theano.sandbox import cuda
cuda.use('gpu0')

%matplotlib inline
from __future__ import print_function, division
import utils; reload(utils)
from utils import *
from IPython.display import FileLink
from keras_tqdm import TQDMCallback

Using gpu device 0: Tesla K80 (CNMeM is disabled, cuDNN 5103)
Using Theano backend.


In [2]:
batch_size=64

In [None]:
path = "data/state/sample/"
%cd /home/ubuntu/courses/deeplearning1/

In [None]:
# Remember not to shuffle the validation batches
batches = get_batches(path+'train', batch_size=batch_size)
val_batches = get_batches(path+'valid', batch_size=batch_size*2, shuffle=False)

In [None]:
(val_classes, trn_classes, val_labels, trn_labels, 
    val_filenames, filenames, test_filenames) = get_classes(path)

In [None]:
# Saving off samples for quick access
trn = get_data(path+'train')
val = get_data(path+'valid')

In [None]:
# Sample save!
save_array('data/state/results/sample_val.dat', val)
save_array('data/state/results/sample_trn.dat', trn)

In [None]:
# Sample load!
val = load_array('data/state/results/sample_val.dat')
trn = load_array('data/state/results/sample_trn.dat')

## Experimenting with the sample dataset

In [None]:
# Defining a simple 2-layer CNN
def conv2(batches):
    model = Sequential([
            BatchNormalization(axis=1, input_shape=(3,224,224)),
            Convolution2D(32,3,3, activation='relu'),
            BatchNormalization(axis=1),
            MaxPooling2D((3,3)),
            Convolution2D(64,3,3, activation='relu'),
            BatchNormalization(axis=1),
            MaxPooling2D((3,3)),
            Flatten(),
            Dense(200, activation='relu'),
            BatchNormalization(),
            Dense(10, activation='softmax')
        ])

    model.compile(Adam(lr=1e-4), loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit_generator(batches, batches.nb_sample, nb_epoch=2, validation_data=val_batches, 
                     nb_val_samples=val_batches.nb_sample)
    model.optimizer.lr = 0.001
    model.fit_generator(batches, batches.nb_sample, nb_epoch=4, validation_data=val_batches, 
                     nb_val_samples=val_batches.nb_sample)
    return model

In [None]:
conv2(batches)

This represents a reasonable base, so let's add regularisation.

## Data Augmentation

I tried a few different settings here, but essentially went with the defaults from the class.

In [None]:
gen_t = image.ImageDataGenerator(rotation_range=15, height_shift_range=0.05, 
                shear_range=0.1, channel_shift_range=20, width_shift_range=0.1)
batches = get_batches(path+'train', gen_t, batch_size=batch_size)
model = conv2(batches)

Move on to using the full data set.

# Full Data Set

In [3]:
path = "data/state/"
%cd /home/ubuntu/courses/deeplearning1/

/home/ubuntu/courses/deeplearning1


In [None]:
# Remember not to shuffle the validation batches
batches = get_batches(path+'train', batch_size=batch_size)
val_batches = get_batches(path+'valid', batch_size=batch_size*2, shuffle=False)

In [None]:
(val_classes, trn_classes, val_labels, trn_labels, 
    val_filenames, filenames, test_filenames) = get_classes(path)

In [None]:
# Saving off data for quick access
trn = get_data(path+'train')
val = get_data(path+'valid')

In [None]:
# save!
save_array(path+'results/val.dat', val)
save_array(path+'results/trn.dat', trn)

In [None]:
# load!
del val
del trn
val = load_array(path+'results/val.dat')
trn = load_array(path+'results/trn.dat')

### Making a deeper model

The full State Farm notebook shows that the results are unstable, so we should look at building a deeper network. Let's compare 3 conv layers to 4.

In [None]:
def conv3(batches):
    model = Sequential([
            BatchNormalization(axis=1, input_shape=(3,224,224)),
            Convolution2D(32,3,3, activation='relu'),
            BatchNormalization(axis=1),
            MaxPooling2D(),
            Convolution2D(64,3,3, activation='relu'),
            BatchNormalization(axis=1),
            MaxPooling2D(),
            Convolution2D(128,3,3, activation='relu'),
            BatchNormalization(axis=1),
            MaxPooling2D(),
            Flatten(),
            Dense(200, activation='relu'),
            BatchNormalization(),
            Dropout(0.5),
            Dense(200, activation='relu'),
            BatchNormalization(),
            Dropout(0.5),
            Dense(10, activation='softmax')
        ])

    model.compile(Adam(lr=1e-4), loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit_generator(batches, batches.nb_sample, nb_epoch=2, validation_data=val_batches, 
                     nb_val_samples=val_batches.nb_sample)
    model.optimizer.lr = 0.001
    model.fit_generator(batches, batches.nb_sample, nb_epoch=4, validation_data=val_batches, 
                     nb_val_samples=val_batches.nb_sample)
    return model

In [None]:
conv3(batches)

I looked at a 4-layer network, but it seemed to run out of memory, so I skipped it.
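
One way to see how depth changes the size of these models is to trace the spatial dimensions by hand. The sketch below assumes unpadded 3x3 convolutions with stride 1 and non-overlapping max-pooling, which I believe are the defaults that `conv2` and `conv3` rely on; the helper names are made up for illustration. It only counts the flattened features and the weights of the first dense layer, so it says nothing about activation memory, which may be where a deeper network actually runs out.

```python
def conv_out(size, kernel=3):
    # An unpadded convolution with stride 1 shrinks each side by kernel - 1
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping max-pooling floors the division
    return size // pool

def trace(filters, pool=2, size=224, dense=200):
    """Return (flattened feature count, parameters in the first Dense layer)."""
    for _ in filters:
        size = pool_out(conv_out(size), pool)
    flat = filters[-1] * size * size
    return flat, flat * dense + dense

# conv2 pools with (3,3); conv3 uses the default pooling, assumed to be (2,2)
conv2_flat, conv2_dense = trace([32, 64], pool=3)
conv3_flat, conv3_dense = trace([32, 64, 128], pool=2)
print(conv2_flat, conv2_dense)
print(conv3_flat, conv3_dense)
```

Under these assumptions the first dense layer dominates the parameter count in both models, so the pooling size matters at least as much as the number of conv layers.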

### Add Imagenet/VGG

Pre-trained models usually speed things up, so let's add VGG here. Jeremy used the normal Vgg16() and then added a simplified version of the bn layers later.

In [9]:
vgg = Vgg16()
model=vgg.model
last_conv_idx = [i for i,l in enumerate(model.layers) if type(l) is Convolution2D][-1]
conv_layers = model.layers[:last_conv_idx+1]

In [10]:
conv_model = Sequential(conv_layers)
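
The split at the last convolutional layer can be illustrated without Keras. The classes below are hypothetical stand-ins for layer types, and `split_at_last` is a made-up helper that mirrors the list comprehension used for `last_conv_idx`.

```python
class Conv(object):
    """Stand-in for a convolutional layer (not a Keras class)."""

class Pool(object):
    """Stand-in for a pooling layer."""

class Dense(object):
    """Stand-in for a fully connected layer."""

def split_at_last(layers, layer_type):
    # Index of the last layer whose exact type matches layer_type
    last_idx = [i for i, l in enumerate(layers) if type(l) is layer_type][-1]
    # Everything up to and including that layer, then the rest
    return layers[:last_idx + 1], layers[last_idx + 1:]

layers = [Conv(), Pool(), Conv(), Conv(), Pool(), Dense(), Dense()]
conv_part, top_part = split_at_last(layers, Conv)
print(len(conv_part), len(top_part))
```

Note that the pooling layer after the last convolution ends up in the second half, which would explain why the dense models later in this notebook start with a `MaxPooling2D` layer.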

In [11]:
(val_classes, trn_classes, val_labels, trn_labels, 
    val_filenames, filenames, test_filenames) = get_classes(path)

Found 17272 images belonging to 10 classes.
Found 5152 images belonging to 10 classes.
Found 0 images belonging to 0 classes.


In [None]:
conv_feat = conv_model.predict_generator(batches, batches.nb_sample)
conv_val_feat = conv_model.predict_generator(val_batches, val_batches.nb_sample)

In [None]:
# Test batches are never set up in the State Farm notebook
test_batches = get_batches(path+'test', batch_size=batch_size, shuffle=False)
conv_test_feat = conv_model.predict_generator(test_batches, test_batches.nb_sample)

In [None]:
save_array(path+'results/conv_val_feat.dat', conv_val_feat)
#save_array(path+'results/conv_test_feat.dat', conv_test_feat)
save_array(path+'results/conv_feat.dat', conv_feat)

In [15]:
# Free any features already in memory before reloading from disk
try:
    del conv_feat
except NameError:
    pass
try:
    del conv_val_feat
except NameError:
    pass
#conv_feat = load_array(path+'results/conv_feat.dat')
conv_val_feat = load_array(path+'results/conv_val_feat.dat')
#conv_val_feat.shape

The pre-computed conv features are now saved. The next step is to add the batch-normed dense layers on top.

### Add dense layers (batch-normed) to pre-trained conv

In [None]:
def get_bn_layers(p):
    return [
        MaxPooling2D(input_shape=conv_layers[-1].output_shape[1:]),
        Flatten(),
        Dropout(p/2),
        Dense(128, activation='relu'),
        BatchNormalization(),
        Dropout(p/2),
        Dense(128, activation='relu'),
        BatchNormalization(),
        Dropout(p),
        Dense(10, activation='softmax')
        ]

In [None]:
p=0.8
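
To get a feel for how aggressive p=0.8 is, here is a plain-Python sketch of dropout. It assumes the "inverted" form, where the surviving activations are scaled up by 1/(1-p) at training time, which is how I understand Keras to do it; the `dropout` function here is illustrative only and not the Keras layer.

```python
import random

def dropout(values, p, rng):
    # Zero each value with probability p and scale survivors by 1/(1-p)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)
acts = [1.0] * 10000
dropped = dropout(acts, 0.8, rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / float(len(dropped))
mean_after = sum(dropped) / len(dropped)
print(kept_fraction, mean_after)
```

Roughly one activation in five survives, but the mean stays close to the original because of the rescaling.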

In [None]:
bn_model = Sequential(get_bn_layers(p))
bn_model.compile(Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
bn_model.fit(conv_feat, trn_labels, batch_size=batch_size, nb_epoch=1, 
             validation_data=(conv_val_feat, val_labels))

In [None]:
bn_model.optimizer.lr=0.01

In [None]:
bn_model.fit(conv_feat, trn_labels, batch_size=batch_size, nb_epoch=2, 
             validation_data=(conv_val_feat, val_labels))

In [None]:
bn_model.save_weights(path+'models/vgg_state_noaug.h5')

### Pre-compute Data Augmentation

In [None]:
gen_t = image.ImageDataGenerator(rotation_range=15, height_shift_range=0.05, 
                shear_range=0.1, channel_shift_range=20, width_shift_range=0.1)
da_batches = get_batches(path+'train', gen_t, batch_size=batch_size, shuffle=False)

Use the original augmentation parameters to create 5x extra data.

In [None]:
da_conv_feat = conv_model.predict_generator(da_batches, da_batches.nb_sample*5)

In [None]:
save_array(path+'results/da_conv_feat_aug5x_noorig.dat', da_conv_feat)

In [None]:
da_trn_labels = np.concatenate([trn_labels]*5)
save_array(path+'results/train_labels.bc', da_trn_labels)
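
Tiling the labels five times only lines up with the augmented features because `da_batches` was created with `shuffle=False`: the generator then walks the files in the same order on every pass, and only the pixels change. The toy sketch below uses strings in place of images; `ordered_batches` and `take` are hypothetical helpers, not part of Keras or `utils`.

```python
def ordered_batches(items, batch_size):
    # Cycle forever through items in a fixed order, like an unshuffled generator
    while True:
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]

def take(gen, n):
    # Collect exactly n items from a batch generator
    out = []
    for batch in gen:
        out.extend(batch)
        if len(out) >= n:
            return out[:n]

labels = ['c0', 'c1', 'c2', 'c0', 'c1', 'c2', 'c0']   # 7 toy training labels
passes = 5
seen = take(ordered_batches(labels, batch_size=3), len(labels) * passes)
tiled = labels * passes
print(seen == tiled)
```

The last batch of each pass is short (7 is not a multiple of 3), yet the order still matches the tiled labels, which is the property the 5x features depend on.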

In [33]:
da_conv_feat = bcolz.open(path + 'results/da_conv_feat_aug5x_noorig.dat', mode='a')
da_trn_labels = bcolz.open(path + 'results/train_labels.bc', mode='r')
trn_batches = BcolzArrayIterator(da_conv_feat, da_trn_labels, batch_size=da_conv_feat.chunklen * batch_size, shuffle=True)
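
My understanding is that `BcolzArrayIterator` wants its batch size to be a multiple of the array's `chunklen`, so that every batch reads whole chunks from disk; that would be why the batch size above is `chunklen * batch_size`. The sketch below is a simplified illustration of chunk-aligned shuffling under that assumption, not the iterator's actual code, and `chunk_aligned_batches` is a made-up name.

```python
import random

def chunk_aligned_batches(n_items, chunklen, chunks_per_batch, rng):
    # Batch boundaries fall on chunk boundaries, so each batch reads whole chunks
    batch_len = chunklen * chunks_per_batch
    starts = list(range(0, n_items, batch_len))
    rng.shuffle(starts)                      # shuffle the order of the batches
    for start in starts:
        idx = list(range(start, min(start + batch_len, n_items)))
        rng.shuffle(idx)                     # shuffle the items within a batch
        yield idx

rng = random.Random(0)
index_batches = list(chunk_aligned_batches(n_items=100, chunklen=10,
                                           chunks_per_batch=3, rng=rng))
print([len(b) for b in index_batches])
```

Every index is visited exactly once per epoch, but items only ever mix with neighbours from the same batch, so this is a weaker shuffle than a full permutation.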

Now adding the dense layers

In [8]:
def get_bn_da_layers(conv_layers,p=0.8):
    return [
        MaxPooling2D(input_shape=conv_layers[-1].output_shape[1:]),
        Flatten(),
        Dropout(p),
        Dense(256, activation='relu'),
        BatchNormalization(),
        Dropout(p),
        Dense(256, activation='relu'),
        BatchNormalization(),
        Dropout(p),
        Dense(10, activation='softmax')
        ]

In [14]:
bn_model = Sequential(get_bn_da_layers(conv_layers))
bn_model.compile(Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

In [20]:
bn_model.fit_generator(trn_batches, samples_per_epoch=trn_batches.N, nb_epoch=1, 
             validation_data=(conv_val_feat, val_labels), verbose=0, callbacks=[TQDMCallback()])


Training:   0%|          | 0/1 [00:00<?, ?it/s]
Epoch: 0:   0%|          | 0/86360 [00:00<?, ?it/s]
Epoch: 0 - loss: 5.117, acc: 0.100  1%|          | 640/86360 [00:00<01:01, 1400.28it/s]
Epoch: 0 - loss: 5.141, acc: 0.091  1%|▏         | 1280/86360 [00:00<00:54, 1571.02it/s]
Epoch: 0 - loss: 5.002, acc: 0.094  2%|▏         | 1920/86360 [00:01<00:49, 1691.22it/s]
Epoch: 0 - loss: 4.939, acc: 0.096  3%|▎         | 2560/86360 [00:01<00:46, 1799.86it/s]
Epoch: 0 - loss: 4.888, acc: 0.101  4%|▎         | 3200/86360 [00:01<00:43, 1897.79it/s]
Epoch: 0 - loss: 4.839, acc: 0.102  4%|▍         | 3840/86360 [00:01<00:42, 1960.79it/s]
Epoch: 0 - loss: 4.820, acc: 0.104  5%|▌         | 4480/86360 [00:02<00:40, 2033.22it/s]
Epoch: 0 - loss: 4.797, acc: 0.105  6%|▌         | 5120/86360 [00:02<00:39, 2043.52it/s]
Epoch: 0 - loss: 4.753, acc: 0.107  7%|▋         | 5760/86360 [00:02<00:38, 2084.71it/s]
Epoch: 0 - loss: 4.736, a

<keras.callbacks.History at 0x7fc6e1780c10>

In [21]:
bn_model.optimizer.lr=0.01

In [22]:
bn_model.fit_generator(trn_batches, samples_per_epoch=trn_batches.N, nb_epoch=1, 
             validation_data=(conv_val_feat, val_labels), verbose=0, callbacks=[TQDMCallback()])


Training:   0%|          | 0/1 [00:00<?, ?it/s]
Epoch: 0:   0%|          | 0/86360 [00:00<?, ?it/s]
Epoch: 0 - loss: 1.642, acc: 0.434  1%|          | 640/86360 [00:00<00:58, 1456.34it/s]
Epoch: 0 - loss: 1.545, acc: 0.463  1%|▏         | 1280/86360 [00:00<00:52, 1614.20it/s]
Epoch: 0 - loss: 1.579, acc: 0.459  2%|▏         | 1920/86360 [00:01<00:48, 1751.99it/s]
Epoch: 0 - loss: 1.584, acc: 0.457  3%|▎         | 2560/86360 [00:01<00:45, 1842.02it/s]
Epoch: 0 - loss: 1.599, acc: 0.455  4%|▎         | 3200/86360 [00:01<00:43, 1921.36it/s]
Epoch: 0 - loss: 1.596, acc: 0.458  4%|▍         | 3840/86360 [00:01<00:41, 1985.61it/s]
Epoch: 0 - loss: 1.587, acc: 0.462  5%|▌         | 4480/86360 [00:02<00:40, 2034.01it/s]
Epoch: 0 - loss: 1.578, acc: 0.464  6%|▌         | 5120/86360 [00:02<00:39, 2044.62it/s]
Epoch: 0 - loss: 1.583, acc: 0.464  7%|▋         | 5760/86360 [00:02<00:38, 2067.14it/s]
Epoch: 0 - loss: 1.584, a

<keras.callbacks.History at 0x7fc6e1780210>

In [23]:
bn_model.optimizer.lr=0.0001

In [24]:
bn_model.fit_generator(trn_batches, samples_per_epoch=trn_batches.N, nb_epoch=1, 
             validation_data=(conv_val_feat, val_labels), verbose=0, callbacks=[TQDMCallback()])


Training:   0%|          | 0/1 [00:00<?, ?it/s]
Epoch: 0:   0%|          | 0/86360 [00:00<?, ?it/s]
Epoch: 0 - loss: 1.013, acc: 0.670  1%|          | 640/86360 [00:00<01:05, 1305.71it/s]
Epoch: 0 - loss: 1.005, acc: 0.670  1%|▏         | 1280/86360 [00:00<00:58, 1450.00it/s]
Epoch: 0 - loss: 0.986, acc: 0.668  2%|▏         | 1920/86360 [00:01<00:52, 1608.39it/s]
Epoch: 0 - loss: 0.992, acc: 0.657  3%|▎         | 2560/86360 [00:01<00:48, 1737.79it/s]
Epoch: 0 - loss: 0.993, acc: 0.656  4%|▎         | 3200/86360 [00:01<00:45, 1844.55it/s]
Epoch: 0 - loss: 0.983, acc: 0.661  4%|▍         | 3840/86360 [00:02<00:43, 1907.93it/s]
Epoch: 0 - loss: 0.991, acc: 0.657  5%|▌         | 4480/86360 [00:02<00:41, 1987.96it/s]
Epoch: 0 - loss: 0.992, acc: 0.661  6%|▌         | 5120/86360 [00:02<00:41, 1976.70it/s]
Epoch: 0 - loss: 0.992, acc: 0.661  7%|▋         | 5760/86360 [00:02<00:39, 2049.86it/s]
Epoch: 0 - loss: 0.990, a

<keras.callbacks.History at 0x7fc6e1780b90>

In [26]:
bn_model.save_weights(path+'results/da5x_conv.h5')

### Pseudo-labelling

In [27]:
val_pseudo = bn_model.predict(conv_val_feat, batch_size=batch_size)

In [30]:
comb_pseudo = np.concatenate([da_trn_labels, val_pseudo])

In [34]:
da_conv_feat.append(conv_val_feat)
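
The idea behind these steps is that the model's own predictions on unlabelled (or held-out) rows are appended as soft labels, while the matching features are appended in the same order. A minimal sketch with plain lists, where `toy_predict` is a hypothetical stand-in for `bn_model.predict` and the data is invented:

```python
def pseudo_label(trn_feat, trn_labels, extra_feat, predict):
    # Predict soft labels for the extra rows
    soft = [predict(f) for f in extra_feat]
    # Append features and labels in the same order so row i still matches label i
    return trn_feat + extra_feat, trn_labels + soft

def toy_predict(feature):
    # Stand-in for a model's prediction on one row (2 classes)
    p = 0.9 if feature > 0 else 0.2
    return [p, 1.0 - p]

trn_feat = [1.0, -1.0, 2.0]
trn_labels = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # one-hot true labels
val_feat = [0.5, -0.5]

comb_feat, comb_labels = pseudo_label(trn_feat, trn_labels, val_feat, toy_predict)
print(len(comb_feat), len(comb_labels))
print(comb_labels[-2:])
```

The appended labels are probabilities rather than one-hot vectors. In this notebook the combined set should have 86360 + 5152 = 91512 rows, which matches the "Train on 91512 samples" line in the output further down.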

In [37]:
comb_batches = BcolzArrayIterator(da_conv_feat, comb_pseudo, batch_size=da_conv_feat.chunklen * batch_size, shuffle=True)

AttributeError: 'NoneType' object has no attribute 'chunklen'

In [35]:
bn_model.load_weights(path+'results/da5x_conv.h5')

In [39]:
bn_model.fit(da_conv_feat, comb_pseudo, batch_size=batch_size, nb_epoch=1, 
             validation_data=(conv_val_feat, val_labels))

Train on 91512 samples, validate on 5152 samples
Epoch 1/1


<keras.callbacks.History at 0x7fc6e0326ad0>

In [40]:
bn_model.fit(da_conv_feat, comb_pseudo, batch_size=batch_size, nb_epoch=1, 
             validation_data=(conv_val_feat, val_labels), verbose=0, callbacks=[TQDMCallback()])




Training:   0%|          | 0/1 [00:00<?, ?it/s]
Epoch: 0:   0%|          | 0/91512 [00:00<?, ?it/s]
Epoch: 0 - loss: 0.645, acc: 0.781  0%|          | 128/91512 [00:00<01:43, 879.69it/s]
Epoch: 0 - loss: 0.703, acc: 0.773  0%|          | 256/91512 [00:00<01:42, 886.54it/s]
Epoch: 0 - loss: 0.730, acc: 0.776  0%|          | 384/91512 [00:00<01:42, 886.27it/s]
Epoch: 0 - loss: 0.728, acc: 0.781  1%|          | 512/91512 [00:00<01:42, 888.44it/s]
Epoch: 0 - loss: 0.747, acc: 0.778  1%|          | 640/91512 [00:00<01:42, 888.52it/s]
Epoch: 0 - loss: 0.723, acc: 0.781  1%|          | 768/91512 [00:00<01:41, 890.25it/s]
Epoch: 0 - loss: 0.735, acc: 0.773  1%|          | 896/91512 [00:01<01:41, 891.25it/s]
Epoch: 0 - loss: 0.736, acc: 0.779  1%|          | 1024/91512 [00:01<01:41, 894.23it/s]
Epoch: 0 - loss: 0.726, acc: 0.783  1%|▏         | 11

<keras.callbacks.History at 0x7fc76c12b450>

In [41]:
bn_model.fit(da_conv_feat, comb_pseudo, batch_size=batch_size, nb_epoch=3, 
             validation_data=(conv_val_feat, val_labels), verbose=0, callbacks=[TQDMCallback()])




Training:   0%|          | 0/3 [00:00<?, ?it/s]
Epoch: 0:   0%|          | 0/91512 [00:00<?, ?it/s]
Epoch: 0 - loss: 0.735, acc: 0.805  0%|          | 128/91512 [00:00<01:43, 885.67it/s]
Epoch: 0 - loss: 0.605, acc: 0.836  0%|          | 256/91512 [00:00<01:42, 886.12it/s]
Epoch: 0 - loss: 0.558, acc: 0.849  0%|          | 384/91512 [00:00<01:42, 886.78it/s]
Epoch: 0 - loss: 0.589, acc: 0.828  1%|          | 512/91512 [00:00<01:42, 887.81it/s]
Epoch: 0 - loss: 0.559, acc: 0.839  1%|          | 640/91512 [00:00<01:42, 889.86it/s]
Epoch: 0 - loss: 0.570, acc: 0.833  1%|          | 768/91512 [00:00<01:42, 887.67it/s]
Epoch: 0 - loss: 0.569, acc: 0.835  1%|          | 896/91512 [00:01<01:42, 887.88it/s]
Epoch: 0 - loss: 0.564, acc: 0.834  1%|          | 1024/91512 [00:01<01:41, 890.12it/s]
Epoch: 0 - loss: 0.580, acc: 0.829  1%|▏         | 11

<keras.callbacks.History at 0x7fc6e1780d10>

In [42]:
bn_model.optimizer.lr=0.00001

In [43]:
bn_model.fit(da_conv_feat, comb_pseudo, batch_size=batch_size, nb_epoch=4, 
             validation_data=(conv_val_feat, val_labels))

Train on 91512 samples, validate on 5152 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4


<keras.callbacks.History at 0x7fc6e0326810>

In [44]:
bn_model.save_weights(path+'results/da5x_pseudo.h5')

### Submit

In [45]:
# Clip predictions to [(1-mx)/9, mx] so a confident mistake can't blow up the log loss
def do_clip(arr, mx): return np.clip(arr, (1-mx)/9, mx)
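
The 9 in `do_clip` presumably comes from the 10 classes: if the top class is capped at `mx`, the remaining `1-mx` is spread evenly over the other nine. The sketch below works through one made-up prediction in plain Python to show the trade-off; `clip_probs` and `log_loss` are illustrative helpers, not functions from `utils` or Keras.

```python
import math

def clip_probs(probs, mx, n_classes=10):
    # Same bounds as do_clip: floor of (1-mx)/(n_classes-1), ceiling of mx
    lo = (1.0 - mx) / (n_classes - 1)
    return [min(max(p, lo), mx) for p in probs]

def log_loss(true_idx, probs):
    # Cross-entropy for a single one-hot label
    return -math.log(probs[true_idx])

# A very confident prediction for class 0 over 10 classes
probs = [0.991] + [0.001] * 9
clipped = clip_probs(probs, 0.93)

print(log_loss(0, probs), log_loss(0, clipped))   # the prediction was right
print(log_loss(3, probs), log_loss(3, clipped))   # the prediction was wrong
print(sum(clipped))
```

Clipping costs a little when the confident prediction is right and saves a lot when it is wrong, and with these bounds the clipped row still sums to roughly 1.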

In [52]:
conv_val_feat.shape # These are 4-D conv features, not predictions; the loss needs the true labels (val_labels) and the model's predictions

(5152, 512, 14, 14)

In [47]:
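# Passing conv_val_feat (4-D features) as the predictions raised "TypeError: rank mismatch between coding and true distributions"
val_preds = bn_model.predict(conv_val_feat, batch_size=batch_size)
keras.metrics.categorical_crossentropy(val_labels, do_clip(val_preds, 0.93)).eval()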

In [48]:
conv_test_feat = load_array(path+'results/conv_test_feat.dat')

IOError: [Errno 2] No such file or directory: 'data/state/results/conv_test_feat.dat/meta/sizes'