# Benchmark Model

In this notebook, we train a CNN without transfer learning to serve as a benchmark for the transfer-learning models in the next section.

### Splitting the data into training, validation and test set

Here we create the training, validation, and test sets. Each split is shuffled and stratified on `y`, so every set keeps the class balance of the full data (about 50% of each class).

In [1]:
import numpy as np

X = np.load('data/processed_data/X_299.npy')
y = np.load('data/processed_data/y.npy')

In [2]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42, shuffle=True, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.15, 
                                                  random_state=42, shuffle=True, stratify=y_train)

In [3]:
print('The training set contains {} examples, the validation contains {} and the test set {}'
      .format(X_train.shape[0], X_val.shape[0], X_test.shape[0]))

The training set contains 289 examples, the validation contains 51 and the test set 60
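
The sizes printed above follow from applying a 15% hold-out twice. Here is a minimal sketch of that arithmetic, assuming 400 examples in total (inferred from 289 + 51 + 60) and assuming the held-out size is rounded up, as scikit-learn does for a fractional `test_size`. The helper name `split_sizes` is hypothetical.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_kept, n_held_out) for a fractional hold-out, rounding the hold-out up."""
    n_held_out = math.ceil(test_size * n_samples)
    return n_samples - n_held_out, n_held_out


n_total = 400  # assumption: 289 + 51 + 60 from the printed output

# First split: hold out the test set from the full data.
n_train_val, n_test = split_sizes(n_total, 0.15)
# Second split: hold out the validation set from what remains.
n_train, n_val = split_sizes(n_train_val, 0.15)

print(n_train, n_val, n_test)
```

Because the second split is taken from the remaining 85%, the validation set is 12.75% of the full data rather than 15%.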


### Creating the benchmark models

I will train three models as benchmarks for the next step of the project. For each model, the best weights (evaluated by validation loss) will be saved in the benchmark_models folder.

I will compare the three models by their validation accuracy and then compute the test accuracy of the best one. This test accuracy will be the benchmark that we want our final model to improve on.
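
The "best weights by validation loss" rule can be illustrated without Keras. The sketch below mimics what `ModelCheckpoint(save_best_only=True)` does with its default `val_loss` monitor: a checkpoint is written only when the loss drops below the best value seen so far. The function name `epochs_that_save` and the loss values are hypothetical and chosen for illustration only.

```python
def epochs_that_save(val_losses):
    """Return (best_loss, epochs) where epochs lists the 1-based epochs that improved."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than anything seen so far
            best = loss
            saved.append(epoch)
    return best, saved


# Hypothetical validation losses for four epochs.
best_loss, saved_epochs = epochs_that_save([0.72, 0.69, 0.70, 0.68])
print(best_loss, saved_epochs)
```

The file on disk therefore holds the weights from the last improving epoch, which is not necessarily the final epoch of training.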

### First model

The first model is the simplest one. It contains two pairs of convolution/max-pooling layers, a global average pooling (GAP) layer, and finally a 1-neuron dense layer that predicts the probability of the picture belonging to class 1 (Kuki).
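
Since the last layer is meant to output a probability, it is worth seeing how the choice of activation affects the output range. The cell that follows uses `relu` on the output unit. A sigmoid is the usual pairing with binary cross-entropy because it maps any real value into (0, 1). The sketch below compares the two on a few illustrative pre-activation values (not taken from the notebook), in plain Python.

```python
import math


def sigmoid(z):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Clamp negative values to zero; positive values pass through unbounded."""
    return max(0.0, z)


pre_activations = [-2.0, 0.0, 3.0]  # illustrative values only
sigmoid_out = [sigmoid(z) for z in pre_activations]
relu_out = [relu(z) for z in pre_activations]

print(sigmoid_out)
print(relu_out)
```

The `relu` outputs can be exactly 0 or greater than 1, so they cannot be read directly as probabilities.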

In [30]:
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential

model1 = Sequential()

model1.add(Conv2D(filters=16, kernel_size=2, padding='same', activation='relu', input_shape=(299,299,3)))
model1.add(MaxPooling2D(pool_size=2))
model1.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
model1.add(MaxPooling2D(pool_size=2))
model1.add(GlobalAveragePooling2D())
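# Single output unit. 'sigmoid' is the usual activation for a probability output;
# 'relu' is kept here because it produced the results recorded below.
model1.add(Dense(1, activation='relu'))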

model1.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_16 (Conv2D)           (None, 299, 299, 16)      208       
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 149, 149, 16)      0         
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 149, 149, 32)      2080      
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 74, 74, 32)        0         
_________________________________________________________________
global_average_pooling2d_7 ( (None, 32)                0         
_________________________________________________________________
dense_11 (Dense)             (None, 1)                 33        
Total params: 2,321
Trainable params: 2,321
Non-trainable params: 0
_________________________________________________________________
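
The parameter counts in the summary can be checked by hand: a convolution has one weight per kernel position and input channel, plus one bias, for each filter, and a dense layer has one weight per input plus one bias for each unit. The helper names below are hypothetical.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights (k * k * in_channels) plus one bias, for each filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """One weight per input plus one bias, for each output unit."""
    return (in_units + 1) * out_units


conv1 = conv2d_params(kernel_size=2, in_channels=3, filters=16)
conv2 = conv2d_params(kernel_size=2, in_channels=16, filters=32)
dense = dense_params(in_units=32, out_units=1)
total = conv1 + conv2 + dense

print(conv1, conv2, dense, total)
```

Pooling and global average pooling layers have no trainable parameters, which is why they show 0 in the summary.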


In [31]:
from keras.callbacks import ModelCheckpoint  

model1.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

epochs = 20

checkpointer = ModelCheckpoint(filepath='models/benchmark_models/weights.best.benchmark_1.hdf5', 
                               verbose=1, save_best_only=True)

model1.fit(X_train, y_train, 
          validation_data=(X_val, y_val),
          epochs=epochs, batch_size=20, callbacks=[checkpointer], verbose=1)

Train on 289 samples, validate on 51 samples
Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.71765, saving model to benchmark_models/weights.best.benchmark_1.hdf5
Epoch 2/20

Epoch 00002: val_loss improved from 0.71765 to 0.69069, saving model to benchmark_models/weights.best.benchmark_1.hdf5
Epoch 3/20

Epoch 00003: val_loss did not improve from 0.69069
Epoch 4/20

Epoch 00004: val_loss improved from 0.69069 to 0.68993, saving model to benchmark_models/weights.best.benchmark_1.hdf5
Epoch 5/20

Epoch 00005: val_loss improved from 0.68993 to 0.68670, saving model to benchmark_models/weights.best.benchmark_1.hdf5
Epoch 6/20

Epoch 00006: val_loss improved from 0.68670 to 0.68566, saving model to benchmark_models/weights.best.benchmark_1.hdf5
Epoch 7/20

Epoch 00007: val_loss improved from 0.68566 to 0.68384, saving model to benchmark_models/weights.best.benchmark_1.hdf5
Epoch 8/20

Epoch 00008: val_loss improved from 0.68384 to 0.68373, saving model to benchmark_models/weights.b

<keras.callbacks.History at 0x2130f8cd940>

In [48]:
from sklearn.metrics import accuracy_score

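# Threshold the model outputs at 0.5 to get class labels (0 or 1)
predictions_1 = (model1.predict(X_val) > 0.5) * 1
# accuracy_score expects (y_true, y_pred)
val_accuracy_1 = accuracy_score(y_val, predictions_1)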

print('For the first model, the validation accuracy was {}'.format(np.round(val_accuracy_1,3)))

For the first model, the validation accuracy was 0.667
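
The evaluation step above does two things: it turns each model output into a 0/1 label by thresholding at 0.5, and it measures the fraction of labels that match the ground truth. Here is a minimal plain-Python sketch of both steps, with hypothetical outputs and labels; the function names are illustrative.

```python
def to_labels(outputs, threshold=0.5):
    """Map each model output to 1 if it exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in outputs]


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


outputs = [0.9, 0.4, 0.6, 0.1]  # hypothetical model outputs
y_true = [1, 0, 0, 0]           # hypothetical ground truth
y_pred = to_labels(outputs)

print(y_pred, accuracy(y_true, y_pred))
```

Accuracy is symmetric in its two arguments, so swapping them does not change the value, although scikit-learn documents the order as `(y_true, y_pred)`.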


### Second model

The second model is a bit more complex; I added one more fully-connected layer with 16 neurons followed by a dropout of 0.2 before the final fully-connected layer.
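
For intuition, dropout with rate 0.2 zeroes each activation with probability 0.2 during training and rescales the survivors by 1 / (1 - 0.2), so the expected sum is unchanged. At inference time the layer passes its input through untouched. The sketch below shows the training-time behaviour in plain Python; the function name, the seed, and the activations are hypothetical.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep_probability = 1.0 - rate
    return [a / keep_probability if rng.random() >= rate else 0.0
            for a in activations]


rng = random.Random(0)        # fixed seed so the sketch is repeatable
activations = [1.0] * 16      # hypothetical outputs of the 16-neuron layer
dropped = dropout(activations, rate=0.2, rng=rng)

print(dropped)
```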

In [38]:
model2 = Sequential()

model2.add(Conv2D(filters=16, kernel_size=2, padding='same', activation='relu', input_shape=(299,299,3)))
model2.add(MaxPooling2D(pool_size=2))
model2.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
model2.add(MaxPooling2D(pool_size=2))
model2.add(GlobalAveragePooling2D())
model2.add(Dense(16, activation='relu'))
model2.add(Dropout(0.2))
model2.add(Dense(1, activation='relu'))

model2.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_18 (Conv2D)           (None, 299, 299, 16)      208       
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 149, 149, 16)      0         
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 149, 149, 32)      2080      
_________________________________________________________________
max_pooling2d_19 (MaxPooling (None, 74, 74, 32)        0         
_________________________________________________________________
global_average_pooling2d_8 ( (None, 32)                0         
_________________________________________________________________
dense_12 (Dense)             (None, 16)                528       
_________________________________________________________________
dropout_3 (Dropout)          (None, 16)                0         
__________

In [39]:
model2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

epochs = 20

checkpointer = ModelCheckpoint(filepath='models/benchmark_models/weights.best.benchmark_2.hdf5', 
                               verbose=1, save_best_only=True)

model2.fit(X_train, y_train, 
          validation_data=(X_val, y_val),
          epochs=epochs, batch_size=20, callbacks=[checkpointer], verbose=1)

Train on 289 samples, validate on 51 samples
Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.80983, saving model to benchmark_models/weights.best.benchmark_2.hdf5
Epoch 2/20

Epoch 00002: val_loss improved from 0.80983 to 0.69328, saving model to benchmark_models/weights.best.benchmark_2.hdf5
Epoch 3/20

Epoch 00003: val_loss improved from 0.69328 to 0.68931, saving model to benchmark_models/weights.best.benchmark_2.hdf5
Epoch 4/20

Epoch 00004: val_loss did not improve from 0.68931
Epoch 5/20

Epoch 00005: val_loss did not improve from 0.68931
Epoch 6/20

Epoch 00006: val_loss did not improve from 0.68931
Epoch 7/20

Epoch 00007: val_loss improved from 0.68931 to 0.68744, saving model to benchmark_models/weights.best.benchmark_2.hdf5
Epoch 8/20

Epoch 00008: val_loss did not improve from 0.68744
Epoch 9/20

Epoch 00009: val_loss improved from 0.68744 to 0.68474, saving model to benchmark_models/weights.best.benchmark_2.hdf5
Epoch 10/20

Epoch 00010: val_loss improved from 0.6

<keras.callbacks.History at 0x213097bc978>

In [47]:
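# Threshold the model outputs at 0.5 to get class labels (0 or 1)
predictions_2 = (model2.predict(X_val) > 0.5) * 1
# accuracy_score expects (y_true, y_pred)
val_accuracy_2 = accuracy_score(y_val, predictions_2)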

print('For the second model, the validation accuracy was {}'.format(np.round(val_accuracy_2,3)))

For the second model, the validation accuracy was 0.647


### Third model

The third model is similar to the second model, but I added one more convolutional / max pooling layer pair before the GAP layer. Also, I increased the size of the fully connected layer to 20 neurons.
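
Adding a third convolution/pooling pair halves the spatial resolution once more. With `padding='same'` the convolutions keep the width and height, while each 2x2 max-pooling layer (default stride equal to the pool size, no padding) divides them by two and rounds down. The sketch below is just that arithmetic; the helper name is hypothetical.

```python
def pooled_size(size, pool=2):
    """Spatial size after a max-pooling layer with stride equal to the pool size."""
    return size // pool


size = 299  # input width and height
sizes_after_pooling = []
for _ in range(3):  # three convolution / max-pooling pairs
    size = pooled_size(size)
    sizes_after_pooling.append(size)

print(sizes_after_pooling)  # [149, 74, 37]
```

The GAP layer then averages each of the 64 feature maps over its 37 x 37 grid, leaving a 64-dimensional vector for the dense layers.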

In [42]:
model3 = Sequential()

model3.add(Conv2D(filters=16, kernel_size=2, padding='same', activation='relu', input_shape=(299,299,3)))
model3.add(MaxPooling2D(pool_size=2))
model3.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
model3.add(MaxPooling2D(pool_size=2))
model3.add(Conv2D(filters=64, kernel_size=2, padding='same', activation='relu'))
model3.add(MaxPooling2D(pool_size=2))
model3.add(GlobalAveragePooling2D())
model3.add(Dense(20, activation='relu'))
model3.add(Dropout(0.2))
model3.add(Dense(1, activation='relu'))

model3.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_20 (Conv2D)           (None, 299, 299, 16)      208       
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 149, 149, 16)      0         
_________________________________________________________________
conv2d_21 (Conv2D)           (None, 149, 149, 32)      2080      
_________________________________________________________________
max_pooling2d_21 (MaxPooling (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 74, 74, 64)        8256      
_________________________________________________________________
max_pooling2d_22 (MaxPooling (None, 37, 37, 64)        0         
_________________________________________________________________
global_average_pooling2d_9 ( (None, 64)                0         
__________

In [43]:
model3.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

epochs = 20

checkpointer = ModelCheckpoint(filepath='models/benchmark_models/weights.best.benchmark_3.hdf5', 
                               verbose=1, save_best_only=True)

model3.fit(X_train, y_train, 
          validation_data=(X_val, y_val),
          epochs=epochs, batch_size=20, callbacks=[checkpointer], verbose=1)

Train on 289 samples, validate on 51 samples
Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.69755, saving model to benchmark_models/weights.best.benchmark_3.hdf5
Epoch 2/20

Epoch 00002: val_loss did not improve from 0.69755
Epoch 3/20

Epoch 00003: val_loss improved from 0.69755 to 0.69423, saving model to benchmark_models/weights.best.benchmark_3.hdf5
Epoch 4/20

Epoch 00004: val_loss did not improve from 0.69423
Epoch 5/20

Epoch 00005: val_loss improved from 0.69423 to 0.69290, saving model to benchmark_models/weights.best.benchmark_3.hdf5
Epoch 6/20

Epoch 00006: val_loss did not improve from 0.69290
Epoch 7/20

Epoch 00007: val_loss improved from 0.69290 to 0.69191, saving model to benchmark_models/weights.best.benchmark_3.hdf5
Epoch 8/20

Epoch 00008: val_loss improved from 0.69191 to 0.69044, saving model to benchmark_models/weights.best.benchmark_3.hdf5
Epoch 9/20

Epoch 00009: val_loss did not improve from 0.69044
Epoch 10/20

Epoch 00010: val_loss improved from 0.6

<keras.callbacks.History at 0x21309f649e8>

In [46]:
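# Threshold the model outputs at 0.5 to get class labels (0 or 1)
predictions_3 = (model3.predict(X_val) > 0.5) * 1
# accuracy_score expects (y_true, y_pred)
val_accuracy_3 = accuracy_score(y_val, predictions_3)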

print('For the third model, the validation accuracy was {}'.format(np.round(val_accuracy_3,3)))

For the third model, the validation accuracy was 0.588


### Results

The three trained models obtained the following validation accuracy:

__First model:__ 0.667 <br>
__Second model:__ 0.647 <br>
__Third model:__ 0.588 <br>

These results confirmed my intuition that simpler models would perform better, given our small dataset. We now calculate the test accuracy for the first model.

In [51]:
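# Threshold the model outputs at 0.5 to get class labels (0 or 1)
predictions = (model1.predict(X_test) > 0.5) * 1
# accuracy_score expects (y_true, y_pred)
test_accuracy = accuracy_score(y_test, predictions)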

print('The test accuracy is {}'.format(np.round(test_accuracy, 3)))

The test accuracy is 0.617


The test accuracy that will be used as the benchmark is 0.617.
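
To put the reported accuracies in context, they can be converted back into counts of correctly classified images. The sketch below assumes each printed accuracy is an exact fraction rounded to three decimals, and searches for the count that reproduces it. The function name `correct_counts` is hypothetical.

```python
def correct_counts(accuracy, n_examples, digits=3):
    """Return every count k for which k / n_examples rounds to the given accuracy."""
    return [k for k in range(n_examples + 1)
            if round(k / n_examples, digits) == accuracy]


# Validation set: 51 examples; test set: 60 examples (sizes printed earlier).
validation = {
    "first": correct_counts(0.667, 51),
    "second": correct_counts(0.647, 51),
    "third": correct_counts(0.588, 51),
}
test = correct_counts(0.617, 60)

print(validation, test)
```

On a 51-image validation set, one image is worth about 0.02 of accuracy.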