# Simpsons Detector
## Creating dataset

The following Python notebook accompanies the [Simpsons detector post](http://zachmoshe.com).

This notebook shows how to train the model.

Some classes come from my `general` and `simpsons` modules, which are available in [my GitHub repo](https://github.com/zachmoshe/datalab).

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import keras
import keras.applications
import keras.preprocessing.image
from keras.layers import *
from keras.regularizers import *
from keras.optimizers import *
from keras.models import *
import tensorflow as tf
import collections
import PIL
import glob 

import scipy
import sklearn
import sklearn.metrics
from sklearn.pipeline import Pipeline

import math

import sys
import logging 
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger("py.warnings").setLevel(logging.ERROR)

%matplotlib inline

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 


Using TensorFlow backend.


In [3]:
import general
from general.plotting import gridplot
general.dl.make_keras_picklable()

In [4]:
import simpsons

In [5]:
tf.__version__, keras.__version__, np.__version__

('1.0.0', '1.2.1', '1.12.1')

In [6]:
from tensorflow.python.client import device_lib
local_device_protos = device_lib.list_local_devices()
[x.name for x in local_device_protos]

['/cpu:0']

## Some configs

In [7]:
BASE_PATH = "/Users/zach/proj/datalab-data/simpsons/"

In [8]:
config = dict(
    training_img_shape=(202,360),     # 1/4 of HD image
    network_input_shape=(202,360),    
)

## Load datasets

In [9]:
import gzip
import pickle

with gzip.GzipFile(BASE_PATH+"dataset.pickle.zip", "r") as f:
    data = pickle.load(f)

In [10]:
print("train images by char: ", data.train.y.sum(axis=0))
print("train_dev images by char: ", data.train_dev.y.sum(axis=0))
print("dev  images by char: ", data.dev.y.sum(axis=0))
print("test images by char: ", data.test.y.sum(axis=0))
data.show()

train images by char:  [ 109.  118.   76.   49.]
train_dev images by char:  [ 36.  40.  25.  17.]
dev  images by char:  [  74.  102.   65.   63.]
test images by char:  [  73.  101.   65.   66.]
train
  - X: (663, 300, 300, 3)
  - y: (663, 4)
train_dev
  - X: (222, 300, 300, 3)
  - y: (222, 4)
dev
  - X: (351, 202, 360, 3)
  - y: (351, 4)
test
  - X: (351, 202, 360, 3)
  - y: (351, 4)


## Build network

`create_model` builds and compiles the actual neural network model based on many hyper-parameters.

In [11]:
def create_model(opt='adam', fc_detectors=(), train_all=False, batchnorm=True, dropout=0.5, base_output_layer='block5_pool', 
                 num_detectors=16, num_outputs=4, reg_penalty=0.0):
    # An integer N is shorthand for the VGG16 layer 'blockN_pool'
    if isinstance(base_output_layer, int):
        base_output_layer = "block{}_pool".format(base_output_layer)

    unknown_img_size_inp = Input(shape=(None,None,3), name='input')
    vgg = keras.applications.vgg16.VGG16(include_top=False)
    base_model = Model(input=vgg.input, output=vgg.get_layer(base_output_layer).output)
    
    base_model_output = x = base_model(unknown_img_size_inp)
    
    if batchnorm:
        x = BatchNormalization(name='bn')(x)
    x = Dropout(dropout, name="dropout")(x)

    for ind, fc_det in enumerate(fc_detectors):
        x = Convolution2D(fc_det, 1, 1, activation='relu', name="fc_detector_{}".format(ind), W_regularizer=keras.regularizers.l2(reg_penalty))(x)
    
    x = Convolution2D(num_detectors,1,1, activation='relu', name='detectors_spatial', W_regularizer=keras.regularizers.l2(reg_penalty))(x)
    x = GlobalMaxPooling2D(name='detectors')(x)
    
    x = Dense(num_outputs, name='output', activation='sigmoid')(x)
       
    model = Model(input=unknown_img_size_inp, output=x) 
    
    # Freeze the pre-trained VGG16 layers unless the whole network should be trained
    if not train_all:
        for l in base_model.layers:
            l.trainable = False
    
    # Compile model
    model.compile(optimizer=opt, loss='mse', metrics=['mae', 'accuracy'])
    return model
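
The head of this network can work on feature maps of any spatial size: the 1x1 convolutions act like a dense layer applied at every location, and the global max pooling keeps, for each detector, only its strongest response anywhere in the image. The NumPy sketch below illustrates just that idea. The `detector_head` name, the random weights and the feature-map shapes are illustrative assumptions, not the trained model (the two shapes are roughly what `block5_pool` should produce for 202x360 and 300x300 inputs).

```python
import numpy as np


def detector_head(feature_map, w_det, b_det, w_out, b_out):
    """Toy detectors head: 1x1 conv -> ReLU -> global max pooling -> sigmoid.

    feature_map: (H, W, C) activations from the convolutional base.
    w_det:       (C, D) weights of a 1x1 convolution with D detectors.
    w_out:       (D, K) weights of the final dense layer with K outputs.
    """
    # A 1x1 convolution is a matrix product applied at every spatial location.
    spatial = np.maximum(feature_map @ w_det + b_det, 0.0)   # (H, W, D)
    # Global max pooling: strongest response of each detector over the image.
    detectors = spatial.max(axis=(0, 1))                      # (D,)
    logits = detectors @ w_out + b_out                        # (K,)
    return 1.0 / (1.0 + np.exp(-logits))                      # sigmoid


rng = np.random.RandomState(0)
C, D, K = 512, 16, 4   # channels, detectors, outputs (assumed for the demo)
w_det, b_det = rng.randn(C, D) * 0.01, np.zeros(D)
w_out, b_out = rng.randn(D, K) * 0.1, np.zeros(K)

# Feature maps with different spatial sizes give outputs of the same shape.
small = detector_head(rng.rand(6, 11, C), w_det, b_det, w_out, b_out)
large = detector_head(rng.rand(9, 9, C), w_det, b_det, w_out, b_out)
print(small.shape, large.shape)
```

This is why the model can be trained on 300x300 crops and still be evaluated on 202x360 frames.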


## Train

`CharactersChooser` is a generic class that can also transform the `y` vector. I used it earlier, when trying the model on a single character only. Here it selects all four characters, so it is effectively a no-op.

`BatchTrainingsetGenerator` is the generator that reads the training set files (generated in an earlier step) and feeds the training images to the network.

In [12]:
# choose characters
chars_chooser = simpsons.preprocessing.CharactersChooser([0,1,2,3])  # bart,homer,lisa,marge
train_X, train_y         = chars_chooser.transform(data.train.X, data.train.y)
train_dev_X, train_dev_y = chars_chooser.transform(data.train_dev.X, data.train_dev.y)
dev_X, dev_y             = chars_chooser.transform(data.dev.X, data.dev.y)

# Preprocess pipeline
simpsons_preprocess = Pipeline([
        ('vgg_preprocessor', simpsons.preprocessing.VGGPreprocessing()),
        ('resize', general.utils.ImageResizeTransformer(config['network_input_shape'])),
    ])

class BatchTrainingsetGenerator:
    def __init__(self, folder):
        self.folder = folder
        
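    def generate(self, X, y, batch_size=None):
        # The X and y arguments are not used: batches are read from the pickled files in self.folder
        while True:
            for f in np.random.permutation(glob.glob(self.folder+"/*")):
                with open(f, "rb") as fh:
                    batch_X, batch_y = pickle.load(fh)
                bs = batch_size if batch_size is not None else batch_X.shape[0]
                for i in range(0, batch_X.shape[0], bs):
                    yield batch_X[i:i+bs], batch_y[i:i+bs]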
            
frame_generator = BatchTrainingsetGenerator(BASE_PATH+"trainingset")
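
To see the generator pattern in isolation, here is a small self-contained sketch that writes two fake training-set files to a temporary folder and reads mini-batches back from them. The function name `batches_from_folder`, the file names and the array shapes are made up for the demo; only the read-shuffle-slice loop mirrors the class above.

```python
import glob
import os
import pickle
import tempfile

import numpy as np


def batches_from_folder(folder, batch_size=None):
    """Yield (X, y) mini-batches forever from pickled (X, y) files in `folder`."""
    while True:
        # Visit the files in a fresh random order on every pass.
        paths = sorted(glob.glob(os.path.join(folder, "*")))
        for path in np.random.permutation(paths):
            with open(path, "rb") as fh:
                X, y = pickle.load(fh)
            bs = batch_size if batch_size is not None else X.shape[0]
            for i in range(0, X.shape[0], bs):
                yield X[i:i + bs], y[i:i + bs]


# Write two tiny fake training-set files (5 samples each).
folder = tempfile.mkdtemp()
for n in range(2):
    X = np.zeros((5, 8, 8, 3), dtype=np.float32)
    y = np.zeros((5, 4), dtype=np.float32)
    with open(os.path.join(folder, "part{}.pickle".format(n)), "wb") as fh:
        pickle.dump((X, y), fh)

gen = batches_from_folder(folder, batch_size=2)
first_batches = [next(gen) for _ in range(3)]
print([xb.shape[0] for xb, _ in first_batches])
```

Note that the last batch taken from each file can be smaller than `batch_size`, because slicing never crosses file boundaries.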

## Build params

After some tuning, I disabled the batchnorm layer and used 128 detectors, with no additional fully-connected layers before them.

In [None]:
# Run model only and visualize learning curve (loss/epoch)
build_params = [
    {
        'num_outputs': [4],
        'batchnorm': [False],
        'num_detectors': [128],
        'fc_detectors': [[]],
        'reg_penalty': [ 0.001 ],
    }
]
generator_params = [
    {
        'batch_size': [128],
    }
]

gch = general.dl.DLGridSearch(create_model, build_params=build_params,
                generator_fn=frame_generator.generate, generator_params=generator_params)

gch.fit(data.train.X, data.train.y, nb_epoch=2500, samples_per_epoch=1024, 
       callbacks=[
           keras.callbacks.ModelCheckpoint(BASE_PATH+"model_checkpoints/model.{epoch:d}.hdf5", monitor='val_loss', 
                                          verbose=0, save_best_only=False, save_weights_only=False, 
                                          mode='auto', period=100)
       ])
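
The parameter dictionaries above map each name to a list of candidate values. `DLGridSearch` is part of the `general` module and its internals are not shown here, so the following is only a sketch of how such a grid is commonly expanded into individual configurations (the same convention scikit-learn's `ParameterGrid` uses). The `expand_grid` helper and the example values are hypothetical.

```python
import itertools


def expand_grid(param_grids):
    """Expand a list of {name: [values]} dicts into one dict per combination."""
    combos = []
    for grid in param_grids:
        names = sorted(grid)
        for values in itertools.product(*(grid[name] for name in names)):
            combos.append(dict(zip(names, values)))
    return combos


# Hypothetical grid with two candidates for two of the parameters.
example_build_params = [
    {
        'batchnorm': [False, True],
        'num_detectors': [16, 128],
        'fc_detectors': [[]],
    }
]

for combo in expand_grid(example_build_params):
    print(combo)
```

Under that assumption, the grid in this notebook describes exactly one configuration, because every list holds a single value.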

In [13]:
general.plotting.KerasHistoryPlotter().plot(gch.results, ylim=(0.,0.2))

In [14]:
simpsons.scoring.SimpsonsReporter(
        train_dev_X, train_dev_y, dev_X, dev_y, 
        dev_preprocess=simpsons_preprocess, generator_fn=frame_generator.generate,
        train_dev_num_images=500
).report(gch.results)

## Dumping the model and GridSearch results to disk

In [15]:
result_id = 0  # avoid shadowing the built-in `id`

model = gch.results[result_id].history.model
params = gch.results[result_id].params


In [16]:
import pickle
import gzip

# with gzip.GzipFile("/data/simpsons/gch.pickle.gz", "rb") as f:
#     gch = pickle.load(f)
    
with gzip.GzipFile("/data/simpsons/gch.pickle.gz", "wb") as f:
    pickle.dump(gch, f)

print("saving model for ", params)
model.save("/data/simpsons/model.hdf5")


saving model for  {'batchnorm': False, 'fc_detectors': [], 'reg_penalty': 0.001, 'num_outputs': 4, 'num_detectors': 128, 'batch_size': 128}


## Load a pre-trained model (if needed)

In [14]:
#model = keras.models.load_model(BASE_PATH + "model_checkpoints/model.999.hdf5")

## Optimize classifier's thresholds

Runs the selected model on the dev set and sets the thresholds to reach a given precision, reach a given recall, maximize accuracy, or maximize F-score.
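
`ClassifierThresholdOptimizer` comes from the `general` module and its implementation is not shown in this notebook. As a rough illustration of the F-score option only, the sketch below scans a fixed grid of candidate thresholds per output and keeps the one with the best F1 on the dev set. The function name, the candidate grid and the toy data are assumptions.

```python
import numpy as np


def best_fscore_thresholds(y_true, y_score, candidates=None):
    """Pick, per output column, the threshold that maximizes F1."""
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)
    thresholds = []
    for k in range(y_true.shape[1]):
        best_t, best_f1 = candidates[0], -1.0
        for t in candidates:
            pred = y_score[:, k] > t
            tp = np.sum(pred & (y_true[:, k] == 1))
            fp = np.sum(pred & (y_true[:, k] == 0))
            fn = np.sum(~pred & (y_true[:, k] == 1))
            f1 = 2.0 * tp / (2.0 * tp + fp + fn) if tp > 0 else 0.0
            if f1 > best_f1:   # strict: the lowest best threshold wins ties
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return np.array(thresholds)


# Tiny made-up dev set with two characters.
y_true = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [1, 1]])
y_score = np.array([[0.91, 0.22], [0.72, 0.11], [0.42, 0.37],
                    [0.21, 0.61], [0.33, 0.04], [0.83, 0.43]])

thresholds = best_fscore_thresholds(y_true, y_score)
# Broadcasting compares every column against its own threshold.
y_pred = (y_score > thresholds).astype(int)
print(thresholds)
print(y_pred)
```

Each character gets its own threshold, which matters when the outputs are not equally well calibrated.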

In [15]:
dev_X = data.dev.X
dev_y = data.dev.y
dev_preds = model.predict(simpsons_preprocess.transform(dev_X))

In [16]:
cls = general.utils.ClassifierThresholdOptimizer(model, None, dev_y, preds=dev_preds)

In [None]:
cls.fit(maximize_fscore=True)
preds_fscore = cls.predict(None, preds=dev_preds)

cls.fit(maximize_accuracy=True)
preds_acc = cls.predict(None, preds=dev_preds)

cls.fit(recall=0.75)
preds_rec_075 = cls.predict(None, preds=dev_preds)

cls.fit(precision=0.75)
preds_pre_075 = cls.predict(None, preds=dev_preds)

num_imgs = 25
general.plotting.gridplot_sidebyside(
    [dev_X[:num_imgs], preds_pre_075[:num_imgs]], 
    [dev_X[:num_imgs], preds_rec_075[:num_imgs]], 
    [dev_X[:num_imgs], preds_fscore[:num_imgs]], 
    [dev_X[:num_imgs], preds_acc[:num_imgs]]
)

In [None]:
# diff between fscore and acc
diff_ind = (preds_fscore != preds_acc).any(axis=1)
general.plotting.gridplot_sidebyside(
    [dev_X[diff_ind], preds_fscore[diff_ind]],
    [dev_X[diff_ind], preds_acc[diff_ind]],
)

The F-score thresholds were chosen as the best option.

In [15]:
cls.fit(maximize_fscore=True)
cls.thresholds

## Performance on test set

In [20]:
test_preds = model.predict(simpsons_preprocess.transform(data.test.X))
test_preds = (test_preds > cls.thresholds).astype(int)
test_true = data.test.y

In [21]:
simpsons.scoring.final_report(test_true, test_preds)

Char #0 - bart
ACCURACY - 0.863
[[263  15]
 [ 33  40]]
             precision    recall  f1-score   support

        0.0       0.89      0.95      0.92       278
        1.0       0.73      0.55      0.62        73

avg / total       0.85      0.86      0.86       351




Char #1 - homer
ACCURACY - 0.744
[[188  62]
 [ 28  73]]
             precision    recall  f1-score   support

        0.0       0.87      0.75      0.81       250
        1.0       0.54      0.72      0.62       101

avg / total       0.78      0.74      0.75       351




Char #2 - lisa
ACCURACY - 0.889
[[272  14]
 [ 25  40]]
             precision    recall  f1-score   support

        0.0       0.92      0.95      0.93       286
        1.0       0.74      0.62      0.67        65

avg / total       0.88      0.89      0.88       351




Char #3 - marge
ACCURACY - 0.917
[[277   8]
 [ 21  45]]
             precision    recall  f1-score   support

        0.0       0.93      0.97      0.95       285
        1.0      
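
The per-character figures in this report can be recomputed from the confusion matrices alone. The sketch below assumes the matrices follow scikit-learn's `confusion_matrix` layout (rows are true labels, columns are predictions), which is consistent with the support counts printed above. The `binary_metrics` helper is hypothetical.

```python
def binary_metrics(confusion):
    """Accuracy, precision, recall and F1 for the positive class.

    `confusion` is assumed to be laid out as [[TN, FP], [FN, TP]].
    """
    (tn, fp), (fn, tp) = confusion
    accuracy = (tp + tn) / float(tp + tn + fp + fn)
    precision = tp / float(tp + fp)
    recall = tp / float(tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Confusion matrices copied from the report above.
reports = {
    "bart": [[263, 15], [33, 40]],
    "homer": [[188, 62], [28, 73]],
    "lisa": [[272, 14], [25, 40]],
    "marge": [[277, 8], [21, 45]],
}
for name, cm in reports.items():
    acc, prec, rec, f1 = binary_metrics(cm)
    print("{:6s} acc={:.3f} precision={:.2f} recall={:.2f} f1={:.2f}".format(
        name, acc, prec, rec, f1))
```

Accuracy looks high for every character partly because negatives dominate each class, so the positive-class precision and recall are the more informative numbers.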

## Store final model and thresholds

In [31]:
model_and_thresholds = { "model": model, "preprocess_pipeline": simpsons_preprocess, "thresholds": cls.thresholds }
# Use a context manager so the file is flushed and closed after writing
with open(BASE_PATH+"model_and_thresholds.pickle", "wb") as f:
    pickle.dump(model_and_thresholds, f)