# Lesson 03 - v02 - Under and Over Fitting

We will build a VGG-16 model using Keras. This is the same notebook I used for lesson 02; here I only remove the dropout from the dense layers and make those layers trainable, then fine-tune the network to see whether performance improves.

## Build the Model

In [1]:
import numpy as np
import os, json

In [2]:
from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers.pooling import GlobalAveragePooling2D
from keras.layers.normalization import BatchNormalization
from keras.optimizers import SGD, RMSprop, Adam
from keras.preprocessing import image
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.callbacks import ModelCheckpoint

Using TensorFlow backend.


In [3]:
# We use the TensorFlow backend, so we must explicitly set the Theano (channels-first) image ordering
K.set_image_dim_ordering('th')

Here I remove dropout by setting p=0 (Dropout(0.)), so no units are dropped during training. In addition, we do not need to double the saved dense layer weights, because Keras uses inverted dropout: activations are rescaled at training time, so the trained weights are already on the same scale as in the no-dropout case. See https://www.quora.com/What-is-inverted-dropout
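
To make the inverted-dropout argument concrete, here is a minimal NumPy sketch. It is not Keras's actual implementation, and the function name `inverted_dropout` is made up for illustration. Kept activations are scaled by 1/(1-p) during training, so the expected activation matches the no-dropout case, and with p=0 the layer is simply the identity.

```python
import numpy as np

def inverted_dropout(x, p, rng):
    """Hypothetical helper: apply inverted dropout with drop probability p."""
    if p == 0.0:
        return x  # nothing is dropped, nothing is rescaled
    keep = 1.0 - p
    mask = rng.random(x.shape) < keep   # True for units that are kept
    return x * mask / keep              # rescale so the expected value is unchanged

rng = np.random.default_rng(0)
x = np.ones((1000, 1000), dtype=np.float32)

no_drop = inverted_dropout(x, 0.0, rng)
half_drop = inverted_dropout(x, 0.5, rng)

print(np.array_equal(no_drop, x))   # identity when p=0
print(half_drop.mean())             # close to 1.0 on average
```

Since the mean activation is the same with and without dropout, weights trained with p=0.5 should be usable as-is once dropout is switched off.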

In [4]:
# Parameters and helper functions
vgg_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32).reshape((3,1,1))

def vgg_preprocess(x):
    """
        Subtracts the mean RGB value and reverses the channel order from RGB to BGR.
        The mean RGB was computed on the image set used to train the VGG model.
        Args:
            x: Batch of images (batch x channels x height x width)
        Returns:
            Batch of images (batch x reversed_channels x height x width)
    """
    x = x - vgg_mean
    return x[:, ::-1] # reverse the channel axis: rgb->bgr

def conv_block(n_layers, n_filters):
    """Adds a convolutional block."""
    
    for i in range(n_layers):
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(n_filters, (3,3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

# Dropout is disabled by setting p=0 (Dropout(0.)), so no units are dropped.
# We also do not need to double the saved dense layer weights,
# since Keras uses inverted dropout, which keeps the trained weights
# on the same scale as in the no-dropout case.
def fc_block():
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.))
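
As a quick sanity check of the preprocessing step, the sketch below applies the same two operations to a tiny made-up batch in channels-first layout (batch x channels x height x width). The array `batch` and its pixel values are assumptions for illustration only.

```python
import numpy as np

vgg_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32).reshape((3, 1, 1))

def vgg_preprocess(x):
    x = x - vgg_mean       # broadcasts over batch, height, and width
    return x[:, ::-1]      # reverse the channel axis: RGB -> BGR

# One 2x2 image whose R, G, B planes are constant 200, 150, 100 (made-up values).
batch = np.empty((1, 3, 2, 2), dtype=np.float32)
batch[0, 0], batch[0, 1], batch[0, 2] = 200.0, 150.0, 100.0

out = vgg_preprocess(batch)
print(out.shape)        # shape is unchanged: (1, 3, 2, 2)
print(out[0, :, 0, 0])  # blue first: B - 103.939, G - 116.779, R - 123.68
```

Slicing with `[:, ::-1]` only reverses axis 1, which is why the docstring should describe a batched, channels-first array.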

In [5]:
# Build model
model = Sequential()

model.add(Lambda(vgg_preprocess, input_shape=(3,224,224), output_shape=(3,224,224)))
conv_block(2, 64)
conv_block(2, 128)
conv_block(3, 256)
conv_block(3, 512)
conv_block(3, 512)
model.add(Flatten())
fc_block()
fc_block()
model.add(Dense(1000, activation='softmax'))

In [6]:
# Load weights
file_path = 'http://files.fast.ai/models/'
file_name = 'vgg16.h5'
model.load_weights(get_file(file_name, file_path+file_name, cache_subdir='models'))

In [7]:
# load imagenet class index
file_path = 'http://files.fast.ai/models/'
file_name = 'imagenet_class_index.json'
json_path = get_file(file_name, file_path+file_name, cache_subdir='models')
# load json
with open(json_path) as f:
    class_dict = json.load(f)
imagenet_classes = [class_dict[str(i)][1] for i in range(len(class_dict))]

In [8]:
print(len(imagenet_classes))
print(imagenet_classes[:5])

1000
['tench', 'goldfish', 'great_white_shark', 'tiger_shark', 'hammerhead']


In [9]:
class_names = ['cats', 'dogs']

We will remove the final output layer again, but this time we leave the dense layers trainable (the Flatten layer has no weights, so it is unaffected). Only the conv layers are frozen.

In [10]:
# get the last conv layer index
last_conv_index = [i for i,layer in enumerate(model.layers) if type(layer) is Convolution2D][-1]

print(last_conv_index)

30
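
The index 30 follows directly from the order in which the layers were added. Here is a minimal sketch that reproduces the count without Keras, using plain strings as stand-ins for layer objects. The list `layers` and the helper `conv_block_names` are made up for illustration.

```python
def conv_block_names(n_layers):
    """Layer types added by one conv_block call, in order."""
    names = []
    for _ in range(n_layers):
        names += ['ZeroPadding2D', 'Convolution2D']
    names.append('MaxPooling2D')
    return names

layers = ['Lambda']
for n_layers in (2, 2, 3, 3, 3):       # the five conv_block calls
    layers += conv_block_names(n_layers)
layers += ['Flatten', 'Dense', 'Dropout', 'Dense', 'Dropout', 'Dense']

last_conv_index = [i for i, name in enumerate(layers) if name == 'Convolution2D'][-1]
print(last_conv_index)                 # 30
print(layers[last_conv_index + 1:])    # everything after the last conv layer
```

Everything after index 30 is either weightless (pooling, flatten, dropout) or a dense layer, so freezing `model.layers[:last_conv_index+1]` leaves exactly the dense layers trainable.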


In [11]:
# Remove the last layer and add a new output layer for cats and dogs
# We can remove the last added layer in a Sequential model by calling .pop()
model.pop() # remove the 1000-way softmax layer
for layer in model.layers[:last_conv_index+1]:
    layer.trainable=False
# add the new output layer
model.add(Dense(2, activation='softmax'))

Here we will use RMSprop, and since we are really **fine**-tuning the dense layers, we choose a very small learning rate.
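
To see why such a small learning rate keeps each update tiny, here is a minimal NumPy sketch of the standard RMSprop update rule. It is a simplified illustration rather than Keras's implementation, and `rmsprop_step` and the sample gradients are made up. Because each gradient is divided by the root of its running mean square, the first step has a magnitude of roughly lr / sqrt(1 - rho) regardless of the gradient's scale.

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=1e-5, rho=0.7, eps=1e-7):
    """One RMSprop update; cache is the running mean of squared gradients."""
    cache = rho * cache + (1.0 - rho) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

w = np.array([0.5, -0.3])
cache = np.zeros_like(w)
grad = np.array([10.0, -0.01])   # made-up gradients of very different scale

w_new, cache = rmsprop_step(w, grad, cache)
print(w_new - w)   # both weights move by about lr / sqrt(1 - rho), against the gradient sign
```

With lr=0.00001 and rho=0.7 that first step is on the order of 2e-5 per weight, which suits nudging pretrained dense layers rather than retraining them.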

In [12]:
# Compile the model
model.compile(
    optimizer=RMSprop(lr=0.00001, rho=0.7), loss='categorical_crossentropy', metrics=['accuracy'])

## Train

In [13]:
batch_size = 32

# Generators
gen = image.ImageDataGenerator()

train_batches = gen.flow_from_directory(
    'data/train-fast-ai',
    target_size=(224,224),
    class_mode='categorical',
    shuffle=True,
    batch_size=batch_size)

valid_batches = gen.flow_from_directory(
    'data/valid-fast-ai',
    target_size=(224,224),
    class_mode='categorical',
    shuffle=False,
    batch_size=batch_size)

Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.


In [14]:
print(train_batches.samples)
print(valid_batches.samples)

23000
2000
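
Given these counts, it is worth checking how many images the integer-division step counts in the training call actually cover. The sketch below is plain arithmetic. Rounding up with `math.ceil` is a suggestion for full coverage, not what this notebook does.

```python
import math

batch_size = 32
n_train, n_valid = 23000, 2000   # counts reported by flow_from_directory

train_steps = n_train // batch_size
valid_steps = n_valid // batch_size
print(train_steps, n_train - train_steps * batch_size)  # 718 steps, 24 images short of a full pass
print(valid_steps, n_valid - valid_steps * batch_size)  # 62 steps, 16 images short of a full pass

# Rounding up instead would cover every image, with a smaller final batch.
print(math.ceil(n_train / batch_size), math.ceil(n_valid / batch_size))  # 719 63
```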


In [15]:
# Train
model.fit_generator(
    train_batches, steps_per_epoch=train_batches.samples//batch_size,
    epochs=7, validation_data=valid_batches,
    validation_steps=valid_batches.samples//batch_size)

Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7


<keras.callbacks.History at 0x7f0c04341160>

OK, not bad. We have passed the previous underfitting validation accuracy of ~0.92. The model is now overfitting the training set; Jeremy uses data augmentation techniques to counter this, which I will not pursue here for this dataset.

In [16]:
# save weights
%mkdir -p saved
model.save_weights('saved/saved_weights_without_dropout.h5')

## Predict and submit

In [19]:
batch_size = 25
gen = image.ImageDataGenerator()

test_batches = gen.flow_from_directory(
    './data/test',
    target_size=(224,224),
    class_mode=None,
    shuffle=False,
    batch_size=batch_size)

Found 12500 images belonging to 1 classes.


In [20]:
preds = model.predict_generator(test_batches, test_batches.samples//batch_size)

In [36]:
from importlib import reload
import utils
reload(utils)
from utils import submit2redux

In [37]:
submit2redux(test_batches, preds, file_name='subm_without_dropout3.csv')

File saved as ./data/subm_without_dropout3.csv.
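
The helper `submit2redux` lives in my local `utils` module and is not shown here. As a rough idea of what such a step involves, here is a hypothetical sketch (not the actual helper) that pairs each test image id with its predicted dog probability. It assumes the class order is alphabetical (cats=0, dogs=1), that test file names are numeric ids, and it uses made-up file names and predictions.

```python
import os
import numpy as np

def make_submission_rows(filenames, preds):
    """Hypothetical helper: pair each image id with its predicted dog probability."""
    ids = [int(os.path.splitext(os.path.basename(f))[0]) for f in filenames]
    dog_prob = preds[:, 1]            # column 1 is 'dogs' (classes are indexed alphabetically)
    order = np.argsort(ids)           # sort rows by numeric image id
    return [(ids[i], float(dog_prob[i])) for i in order]

filenames = ['unknown/10.jpg', 'unknown/2.jpg']   # made-up paths
preds = np.array([[0.9, 0.1], [0.2, 0.8]])        # made-up softmax outputs
rows = make_submission_rows(filenames, preds)
print(rows)   # [(2, 0.8), (10, 0.1)]
```

In the notebook the equivalent inputs would be `test_batches.filenames` and `preds`, and the rows would then be written to a CSV with an `id,label` header.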
