# Assignment 2

This assignment's purpose is to build a convolutional neural network to classify images as hot dogs or not-hot dogs. This is the same problem as seen in the HBO TV show "Silicon Valley" (https://www.youtube.com/watch?v=pqTntG1RXSY).  We'll be using the dataset put together by a user on Kaggle (https://www.kaggle.com/dansbecker/hot-dog-not-hot-dog) which contains 498 training images and 500 test images.

There are two parts to this assignment:

1. A simple CNN is given below.  Due to the small sample size it has a very poor test set accuracy (around 55%). Your task is to build a CNN that can beat this test set accuracy by a large margin (better than or equal to 70% test set accuracy).
2. Describe 3 changes that you made beyond what is given in this notebook and explain what effect they had on the test set accuracy (see below for more instructions).

### Submission

Submit this completed and executed notebook on Quercus that shows your best test set accuracy.

We will run a competition in class to see who can achieve the best test set accuracy. The winner will receive a small prize (and bragging rights).


# Student Info

Name: Alex Nolfi

Collaborators (if any): Nikita Fodtchouk

Nature of Collaboration (if any): Shared ideas about transfer learning and image preprocessing

# Code

In [1]:
import numpy as np
import matplotlib.pyplot as plt

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, UpSampling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
from keras_tqdm import TQDMNotebookCallback
from scipy.ndimage import zoom
from keras.datasets import cifar10
from keras.applications.mobilenet import MobileNet, preprocess_input

Using TensorFlow backend.


## Loading Hotdog-Not-Hotdog Dataset 

In [2]:
# Re-scaled dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = '/global/project/rotman/Col_NN/hotdog/train'
test_data_dir = '/global/project/rotman/Col_NN/hotdog/test'

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

# Model

In [4]:
# Initialize the pretrained MobileNet base for transfer learning (ImageNet weights, no classifier head)
mobilenet_base = MobileNet(weights='imagenet', include_top=False)
mobilenet_base



<keras.engine.training.Model at 0x7f6780309c18>

In [5]:
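def mymodel():
    ''' Transfer-learning model: frozen MobileNet base plus a sigmoid output.
        Replaces the simple baseline model from: https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d
    '''
    model = Sequential()

    # Pretrained convolutional base followed by a small binary classification head
    model.add(mobilenet_base)
    model.add(GlobalAveragePooling2D())
    model.add(Dense(1))
    model.add(Activation("sigmoid"))

    # Freeze the pretrained layers so that only the Dense head is trained
    for layer in mobilenet_base.layers:
        layer.trainable = False

    model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='rmsprop')

    return model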

# Test function
mymodel().summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
mobilenet_1.00_224 (Model)   (None, None, None, 1024)  3228864   
_________________________________________________________________
global_average_pooling2d_1 ( (None, 1024)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 1025      
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0         
=================================================================
Total params: 3,229,889
Trainable params: 1,025
Non-trainable params: 3,228,864
_________________________________________________________________
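
The parameter counts in the summary can be checked by hand. The following is a minimal sketch that uses only the numbers printed above: the `Dense(1)` head has one weight per pooled feature plus one bias, and everything in the frozen MobileNet base counts as non-trainable.

```python
# Numbers copied from the model summary above.
mobilenet_params = 3228864   # frozen MobileNet base (non-trainable)
pooled_features = 1024       # output size of GlobalAveragePooling2D

# Dense(1): one weight per input feature plus a single bias term.
dense_params = pooled_features * 1 + 1

total_params = mobilenet_params + dense_params
print('Trainable params:', dense_params)
print('Total params:', total_params)
```

Only the 1,025 parameters of the head are updated during training, which is one reason a model like this can be fit on a few hundred images.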


### Loading data on the fly

We load the data directly from the images on disk via the Keras helpers `ImageDataGenerator` and `flow_from_directory`. They perform two transformations:

* Scaling pixels to [-1, 1] with MobileNet's `preprocess_input` (in place of the baseline's rescaling to [0, 1])
* Resizing images to be in `img_width`x`img_height` (150x150)

During training for each batch, the images are read from disk on the fly, loaded into memory and then the transformations are applied.
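
To make the pixel scaling concrete, here is a small NumPy sketch that compares the two options: dividing by 255, which gives values in [0, 1], and the MobileNet-style scaling to [-1, 1]. The formula `x / 127.5 - 1` is assumed to match what `preprocess_input` from `keras.applications.mobilenet` does; the tiny array is a made-up stand-in for an image.

```python
import numpy as np

# A made-up 2x2 RGB "image" with pixel values in the usual 0-255 range.
pixels = np.array([[[0, 128, 255], [64, 64, 64]],
                   [[255, 255, 255], [0, 0, 0]]], dtype=np.float32)

# Option 1: rescale=1./255 maps pixels to [0, 1].
unit_scaled = pixels / 255.0

# Option 2: MobileNet-style scaling maps pixels to [-1, 1].
# (Assumption: this mirrors keras.applications.mobilenet.preprocess_input.)
mobilenet_scaled = pixels / 127.5 - 1.0

print(unit_scaled.min(), unit_scaled.max())
print(mobilenet_scaled.min(), mobilenet_scaled.max())
```

A pretrained network generally expects inputs scaled the same way as during its original training, which is why the preprocessing function is swapped together with the model.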

In [6]:
# You may optionally change these parameters
batch_size = 50
epochs = 10
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

# Data parameters (DO NOT MODIFY)
num_train_samples = 498
num_test_samples = 500

# Data generators (DO NOT MODIFY)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary'
)

test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary'
)

Found 498 images belonging to 2 classes.
Found 500 images belonging to 2 classes.


In [7]:
def evaluate_model(runs=5):
    ''' DO NOT MODIFY THIS FUNCTION '''
    scores = [] 
    for i in range(runs):
        print('Executing run %d' % (i+1))
        model = mymodel()
        model.fit_generator(train_generator,
                            callbacks=[TQDMNotebookCallback()],
                            steps_per_epoch=num_train_samples // batch_size,
                            epochs=epochs, verbose=0)
        print(' * Evaluating model on test set')
        scores.append(model.evaluate_generator(test_generator, 
                                               steps=num_test_samples // batch_size,
                                               verbose=0))
        print(' * Test set Loss: %.4f, Accuracy: %.4f' % (scores[-1][0], scores[-1][1]))
        
    accuracies = [score[1] for score in scores]     
    return np.mean(accuracies), np.std(accuracies)
        
mean_accuracy, std_accuracy = evaluate_model(runs=5)

Executing run 1




 * Evaluating model on test set
 * Test set Loss: 0.3486, Accuracy: 0.8600
Executing run 2




 * Evaluating model on test set
 * Test set Loss: 0.5078, Accuracy: 0.7700
Executing run 3




 * Evaluating model on test set
 * Test set Loss: 0.3525, Accuracy: 0.8500
Executing run 4




 * Evaluating model on test set
 * Test set Loss: 0.5564, Accuracy: 0.7780
Executing run 5




 * Evaluating model on test set
 * Test set Loss: 0.4339, Accuracy: 0.8100


In [8]:
# You will be evaluated on your mean test set accuracy over 5 runs
print('Mean test set accuracy over 5 runs: %.4f +/- %.4f' % (mean_accuracy, std_accuracy))

Mean test set accuracy over 5 runs: 0.8136 +/- 0.0365
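
As a sanity check, the reported mean and standard deviation can be reproduced from the five per-run accuracies printed above. This is a minimal sketch; the accuracies are copied from the log (rounded to four decimals), so the result matches the reported figures only up to that rounding.

```python
import numpy as np

# Test accuracies printed for runs 1-5 above.
accuracies = [0.8600, 0.7700, 0.8500, 0.7780, 0.8100]

mean_accuracy = np.mean(accuracies)
# np.std defaults to the population standard deviation (ddof=0),
# which is also what evaluate_model() computes.
std_accuracy = np.std(accuracies)

print('Mean test set accuracy over 5 runs: %.4f +/- %.4f' % (mean_accuracy, std_accuracy))
```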


# Describe 3 Changes

Describe three modifications you made to your network that are not included in this notebook (1-3 paragraphs each) and the effect they had on the test set performance. Not all of the changes need to be in your final version (e.g. some of the changes may not have improved the test set performance).

0. Using a data augmentation technique instead of transfer learning

When I first began this assignment, I didn't think I would need transfer learning to reach the 70% threshold. While researching how to improve predictive power for models with small training sets, I found many sources that recommended image augmentation. Specifically, I passed the following arguments to `ImageDataGenerator` for the training set only: `rescale=1. / 255`, `shear_range=0.2`, `zoom_range=0.2`, `horizontal_flip=True`, `rotation_range=60`. The goal was to keep the model from overfitting, that is, from "learning" details of the training set that do not help it generalize to the test set. By randomly flipping, rotating, shearing, and zooming the images, I was able to get the model up to a test set accuracy of 65-67% (in conjunction with a few other changes, such as adding layers and changing the number of epochs).
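
To illustrate what one of these augmentations does, here is a NumPy-only sketch of a random horizontal flip. It is not Keras' implementation: the helper name `random_horizontal_flip` and the random stand-in batch are hypothetical, and shear, zoom, and rotation are left out.

```python
import numpy as np

def random_horizontal_flip(batch, rng, p=0.5):
    """Flip each image in a (N, H, W, C) batch left-to-right with probability p."""
    flipped = batch.copy()
    flip_mask = rng.random(len(batch)) < p
    # Reversing the width axis mirrors the selected images horizontally.
    flipped[flip_mask] = flipped[flip_mask][:, :, ::-1, :]
    return flipped

rng = np.random.default_rng(0)
batch = rng.random((4, 150, 150, 3))   # stand-in for four 150x150 RGB images
augmented = random_horizontal_flip(batch, rng)
print(augmented.shape)
```

Each epoch sees a slightly different version of the same images, so the network has less opportunity to memorize individual training pictures.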

1. Applying the Adam optimizer

Although the Adam optimizer can be seen as an improvement over RMSprop in some ways, this change did not significantly affect the test set performance. In fact, when applied to the baseline model originally in this notebook, the test set accuracy decreased by a small amount from the 55% baseline. Of course, this may simply have been due to random chance, but I expected Adam to be more effective than RMSprop, because it also stores an exponentially decaying average of past gradients, similar to a momentum term (RMSprop only keeps a decaying average of past squared gradients).
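
The difference between the two update rules can be sketched in a few lines of NumPy. This is a simplified illustration on a toy one-dimensional problem, not the Keras implementation; the function names, the hyperparameter values, and the objective `f(w) = w^2` are all illustrative assumptions.

```python
import numpy as np

def rmsprop_step(w, grad, state, lr=0.1, rho=0.9, eps=1e-8):
    # RMSprop keeps only a decaying average of squared gradients.
    state['v'] = rho * state['v'] + (1 - rho) * grad ** 2
    return w - lr * grad / (np.sqrt(state['v']) + eps)

def adam_step(w, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    state['t'] += 1
    # First moment: decaying average of gradients (the momentum-like term).
    state['m'] = beta1 * state['m'] + (1 - beta1) * grad
    # Second moment: decaying average of squared gradients, as in RMSprop.
    state['v'] = beta2 * state['v'] + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialised averages.
    m_hat = state['m'] / (1 - beta1 ** state['t'])
    v_hat = state['v'] / (1 - beta2 ** state['t'])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy problem: minimise f(w) = w^2, whose gradient is 2w.
w_rms, w_adam = 5.0, 5.0
rms_state = {'v': 0.0}
adam_state = {'t': 0, 'm': 0.0, 'v': 0.0}
for _ in range(300):
    w_rms = rmsprop_step(w_rms, 2 * w_rms, rms_state)
    w_adam = adam_step(w_adam, 2 * w_adam, adam_state)

print(abs(w_rms), abs(w_adam))
```

Both optimizers move `w` toward the minimum at zero on this toy problem, which is consistent with the observation that swapping one for the other need not change the final accuracy much.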

2. Using batch normalization instead of dropout

After 10 or so runs of my CNN, I realized that it was going to be extremely difficult to reach a 70% test score without making drastic changes to the model. While doing some research, I came across a video that discussed how to use `model.add(BatchNormalization())` to speed up training. (The video also said that this technique should not be used in conjunction with dropout, because the two have opposite effects on how quickly the model learns.) Due to the small training sample, however, this approach actually resulted in overfitting, lowering the test set performance by a few points from the 55% baseline. I decided to abandon this approach altogether when I realized it is especially hard (and risky) to apply to models with small training sets.
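
For reference, the core computation of batch normalization during training can be sketched in NumPy. This is a simplified forward pass only, not the Keras layer: the running statistics used at inference time are omitted, and the helper name, `eps` value, and random activations are illustrative assumptions.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-3):
    """Normalise each feature over the batch, then scale by gamma and shift by beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
# 50 samples (one batch in this notebook) with 8 features far from zero mean and unit variance.
activations = rng.normal(loc=10.0, scale=5.0, size=(50, 8))
normalised = batch_norm_forward(activations, gamma=np.ones(8), beta=np.zeros(8))

print(normalised.mean(axis=0).round(3))
print(normalised.std(axis=0).round(3))
```

The mean and variance are estimated from a single batch, so with only a few hundred training images and batches of 50 those estimates are noisy, which may be part of why the technique was hard to use here.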

3. Adding a group of new layers of the form:

    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

Originally, I thought I could improve the test set performance by making the CNN deeper. Unfortunately, a deeper network usually requires a lot more data than the 498 samples we have in this training set. (This is why I settled on a data augmentation plus transfer learning approach in the end.) While a shallow structure is best for this particular classification task, I found that adding a third layer with 32 units (as shown in this notebook) did seem to improve test set performance a little bit. Overall, this strategy did not prove to be very effective for this task.
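
One practical limit on stacking these blocks is how quickly the feature map shrinks. The sketch below tracks the spatial size through repeated conv-plus-pool blocks, assuming the Keras defaults for the layers shown above (3x3 kernels with `'valid'` padding and stride 1, then 2x2 max pooling). The helper name `conv_pool_output` is hypothetical.

```python
def conv_pool_output(size, kernel=3, pool=2):
    """Spatial size after a 3x3 'valid' convolution followed by 2x2 max pooling."""
    after_conv = size - kernel + 1   # stride 1, no padding
    return after_conv // pool        # pooling stride equals the pool size

size = 150   # img_width = img_height = 150 in this notebook
sizes = [size]
for _ in range(4):
    size = conv_pool_output(size)
    sizes.append(size)

print(sizes)   # [150, 74, 36, 17, 7]
```

After four blocks a 150x150 input is down to 7x7, so there is little room for further blocks without changing the padding or pooling.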