# Springboard Capstone Project 2
## Comparison of different convolutional weights
___

The first aspect of the model to investigate is whether convolutional weights pretrained on the ImageNet dataset are useful for this dataset. To evaluate this, three model conditions were compared. In the first condition, the ImageNet weights were discarded and all weights were trained from scratch. In the second condition, the ImageNet weights were kept and the convolutional layers were frozen, so training could not alter them. In the final condition, the ImageNet weights were kept and left trainable, so the model could fine-tune the feature extraction for this dataset.
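The three conditions differ in only two settings: whether the convolutional base starts from ImageNet weights, and whether its layers are trainable. The sketch below shows one way this could be expressed with the Keras VGG16 application. It is not the code in `cap2tools`, whose internals are not shown in this notebook; `CONDITIONS`, `build_conv_base`, and the input shape are assumptions made for illustration.

```python
# Hypothetical sketch: cap2tools.run_in_replicate is not shown in this notebook,
# so the names below are illustrative only.

# Each condition maps to the (new_weights, trainable) flags used in this notebook.
CONDITIONS = {
    'new_weights':        {'new_weights': True,  'trainable': True},
    'imagenet_baseline':  {'new_weights': False, 'trainable': False},
    'imagenet_trainable': {'new_weights': False, 'trainable': True},
}


def build_conv_base(new_weights, trainable, input_shape=(224, 224, 3)):
    """Return a VGG16 convolutional base configured for one condition."""
    # Imported lazily so the flag table above can be inspected without Keras.
    from keras.applications import VGG16

    # weights=None gives random initialisation; 'imagenet' loads pretrained weights.
    weights = None if new_weights else 'imagenet'
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)

    # Freezing the layers keeps the convolutional filters fixed during training.
    for layer in base.layers:
        layer.trainable = trainable
    return base
```

Calling `build_conv_base(**CONDITIONS['imagenet_baseline'])` would give a frozen, pretrained base, on top of which the fully connected layers are trained.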

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import keras.backend as K
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping, ModelCheckpoint
from importlib import reload

# custom module for capstone 2
import cap2tools as c2t
reload(c2t)

Using TensorFlow backend.


<module 'cap2tools' from 'C:\\Users\\Nils\\Documents\\GitHub\\Springboard-Capstone-2-local-yelp\\cap2tools.py'>

In [2]:
# configure GPU memory usage by tensorflow
config = K.tf.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.80
K.tensorflow_backend.set_session(K.tf.Session(config=config))

In [3]:
# define paths to image directories
train_path = 'downsampled/train'
valid_path = 'downsampled/val'

# create image data generators to feed the model from image directories
train_batches, valid_batches = c2t.build_datagens(train_path, valid_path, augment=False)

Found 5480 images belonging to 5 classes.
Found 525 images belonging to 5 classes.


In [4]:
widths = (500, 500) # 500 nodes in the FC layers
replicates = 3 #run each condition in triplicate
n_epochs = 10
histories = dict()

# baseline ImageNet weights
condition = 'imagenet_baseline'
histories[condition] = c2t.run_in_replicate(widths, condition, train_batches, valid_batches, 
                                            replicates=replicates, n_epochs=n_epochs, new_weights=False, 
                                            trainable=False)

# trainable ImageNet weights
condition = 'imagenet_trainable'
histories[condition] = c2t.run_in_replicate(widths, condition, train_batches, valid_batches, 
                                            replicates=replicates, n_epochs=n_epochs, new_weights=False, 
                                            trainable=True)

# new weights
condition = 'new_weights'
histories[condition] = c2t.run_in_replicate(widths, condition, train_batches, valid_batches, 
                                            replicates=replicates, n_epochs=n_epochs, new_weights=True, 
                                            trainable=True)

2018-09-30 15:00:04 - Started training models/vgg16_imagenet_baseline_1
2018-09-30 15:10:00 - Started training models/vgg16_imagenet_baseline_2
2018-09-30 15:20:01 - Started training models/vgg16_imagenet_baseline_3
2018-09-30 15:29:56 - Started training models/vgg16_imagenet_trainable_1
2018-09-30 15:54:33 - Started training models/vgg16_imagenet_trainable_2
2018-09-30 16:19:21 - Started training models/vgg16_imagenet_trainable_3
2018-09-30 16:44:18 - Started training models/vgg16_new_weights_1
2018-09-30 17:09:09 - Started training models/vgg16_new_weights_2
2018-09-30 17:33:55 - Started training models/vgg16_new_weights_3


## Image Augmentation
___
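
This section repeats the trainable ImageNet condition with augmented training images. The log lines of the form "Found 5480 images belonging to 5 classes." are the messages Keras prints from `flow_from_directory`, so the generators are presumably built on `ImageDataGenerator`. The sketch below shows how an `augment` flag could be wired up under that assumption (Keras 2.x, as used in this notebook). The function names and every augmentation setting are hypothetical, the real `c2t.build_datagens` is not shown, and pixel preprocessing is left out.

```python
# Hypothetical sketch; the real c2t.build_datagens is not shown in this notebook.

def datagen_kwargs(augment):
    """Return ImageDataGenerator keyword arguments for one data split."""
    kwargs = {}
    if augment:
        # Assumed augmentation settings, chosen only for illustration.
        kwargs.update(rotation_range=20,
                      width_shift_range=0.1,
                      height_shift_range=0.1,
                      zoom_range=0.1,
                      horizontal_flip=True)
    return kwargs


def build_datagens_sketch(train_path, valid_path, augment=False,
                          target_size=(224, 224), batch_size=32):
    """Build training and validation generators from image directories."""
    # Imported lazily so the settings above can be inspected without Keras.
    from keras.preprocessing.image import ImageDataGenerator

    train_gen = ImageDataGenerator(**datagen_kwargs(augment))
    # Validation images are never augmented, so metrics stay comparable.
    valid_gen = ImageDataGenerator(**datagen_kwargs(False))

    train_batches = train_gen.flow_from_directory(
        train_path, target_size=target_size,
        batch_size=batch_size, class_mode='categorical')
    valid_batches = valid_gen.flow_from_directory(
        valid_path, target_size=target_size,
        batch_size=batch_size, class_mode='categorical', shuffle=False)
    return train_batches, valid_batches
```

Only the training generator receives the augmentation settings; the validation generator stays unaugmented, so validation scores from runs with and without augmentation can be compared directly.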


In [4]:
widths = (500, 500)
replicates = 3
n_epochs = 10
histories = dict()

# create new data generators with image augmentation
train_batches, valid_batches = c2t.build_datagens(train_path, valid_path, augment=True)

# trainable ImageNet weights with image augmentation
condition = 'imagenet_trainable_augment'
histories[condition] = c2t.run_in_replicate(widths, condition, train_batches, valid_batches, 
                                            replicates=replicates, n_epochs=n_epochs, new_weights=False, 
                                            trainable=True)

Found 5480 images belonging to 5 classes.
Found 525 images belonging to 5 classes.
2018-09-30 21:59:28 - Started training models/vgg16_imagenet_trainable_augment_1
2018-09-30 22:24:37 - Started training models/vgg16_imagenet_trainable_augment_2
2018-09-30 22:49:53 - Started training models/vgg16_imagenet_trainable_augment_3


In [None]:
# save training history
hist_df = pd.DataFrame(histories).transpose()
hist_df.to_json('VGG16_pretraining_comparison_history.json')

In [4]:
# evaluate trained models on validation dataset
model_paths = {'ImageNet_baseline_1 - 1': 'models/vgg16_imagenet_baseline_1.h5', 
               'ImageNet_baseline_1 - 2': 'models/vgg16_imagenet_baseline_2.h5', 
               'ImageNet_baseline_1 - 3': 'models/vgg16_imagenet_baseline_3.h5', 
               'ImageNet_trainable_2 - 1': 'models/vgg16_imagenet_trainable_1.h5', 
               'ImageNet_trainable_2 - 2': 'models/vgg16_imagenet_trainable_2.h5', 
               'ImageNet_trainable_2 - 3': 'models/vgg16_imagenet_trainable_3.h5',
               'New_weights_3 - 1': 'models/vgg16_new_weights_1.h5', 
               'New_weights_3 - 2': 'models/vgg16_new_weights_2.h5', 
               'New_weights_3 - 3': 'models/vgg16_new_weights_3.h5', 
               'ImageNet_augmented_4 - 1': 'models/vgg16_imagenet_trainable_augment_1.h5', 
               'ImageNet_augmented_4 - 2': 'models/vgg16_imagenet_trainable_augment_2.h5', 
               'ImageNet_augmented_4 - 3': 'models/vgg16_imagenet_trainable_augment_3.h5'}

model_metrics = c2t.eval_models(model_paths, valid_path)

Building image generator...
Found 525 images belonging to 5 classes.
Loading models/vgg16_imagenet_baseline_1.h5
Evaluating models/vgg16_imagenet_baseline_1.h5
Loading models/vgg16_imagenet_baseline_2.h5
Evaluating models/vgg16_imagenet_baseline_2.h5
Loading models/vgg16_imagenet_baseline_3.h5
Evaluating models/vgg16_imagenet_baseline_3.h5
Loading models/vgg16_imagenet_trainable_1.h5
Evaluating models/vgg16_imagenet_trainable_1.h5
Loading models/vgg16_imagenet_trainable_2.h5
Evaluating models/vgg16_imagenet_trainable_2.h5
Loading models/vgg16_imagenet_trainable_3.h5
Evaluating models/vgg16_imagenet_trainable_3.h5
Loading models/vgg16_new_weights_1.h5
Evaluating models/vgg16_new_weights_1.h5
Loading models/vgg16_new_weights_2.h5
Evaluating models/vgg16_new_weights_2.h5
Loading models/vgg16_new_weights_3.h5
Evaluating models/vgg16_new_weights_3.h5
Loading models/vgg16_imagenet_trainable_augment_1.h5
Evaluating models/vgg16_imagenet_trainable_augment_1.h5
Loading models/vgg16_imagenet_tra

In [6]:
# create table of evaluation results
table = c2t.eval_table(model_metrics, 'Condition', decimals=3)

In [7]:
table

| Condition | acc (max) | acc (mean) | loss (min) | loss (mean) | mpcr (max) | mpcr (mean) |
|---|---|---|---|---|---|---|
| 1 (ImageNet_baseline) | 0.89 | 0.877 | 0.865 | 0.904 | 0.89 | 0.877 |
| 2 (ImageNet_trainable) | 0.874 | 0.872 | 0.357 | 0.375 | 0.874 | 0.872 |
| 3 (New_weights) | 0.712 | 0.707 | 0.744 | 0.763 | 0.712 | 0.707 |
| 4 (ImageNet_augmented) | 0.863 | 0.854 | 0.353 | 0.381 | 0.863 | 0.854 |
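
The table reports the best and the mean value of each metric across the three replicates of a condition. The sketch below illustrates that kind of aggregation in plain Python. It is not the implementation of `c2t.eval_table`, which is not shown here; `summarize` is a hypothetical helper, and the input numbers are placeholders rather than results from this notebook. With pandas, a similar summary could be produced with `DataFrame.groupby(...).agg(...)`.

```python
from statistics import mean


def summarize(replicate_metrics, decimals=3):
    """Group per-replicate metrics by condition and report best and mean values."""
    grouped = {}
    for condition, metrics in replicate_metrics:
        grouped.setdefault(condition, []).append(metrics)

    summary = {}
    for condition, rows in grouped.items():
        acc = [r['acc'] for r in rows]
        loss = [r['loss'] for r in rows]
        summary[condition] = {
            # Higher accuracy is better, so the best replicate is the maximum.
            'acc_max': round(max(acc), decimals),
            'acc_mean': round(mean(acc), decimals),
            # Lower loss is better, so the best replicate is the minimum.
            'loss_min': round(min(loss), decimals),
            'loss_mean': round(mean(loss), decimals),
        }
    return summary


# Placeholder numbers for illustration only; not results from this notebook.
example = [
    ('A', {'acc': 0.5, 'loss': 1.0}),
    ('A', {'acc': 0.75, 'loss': 0.5}),
    ('B', {'acc': 0.25, 'loss': 2.0}),
]
print(summarize(example))
```

Reporting both the best and the mean replicate shows how much of a condition's score depends on a single favourable run.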
