# The plan

The post [Data augmentation of convolutional feature maps](http://mtdat.blogspot.com/2017/02/data-augmentation-of-convolutional.html) explains when we might want to spatially transform feature maps.  
This notebook runs an experiment to see how data augmentation in feature-map space can improve the performance of a classifier.
 
To see whether there are any gains, I'm going to do the following:

Use VGG with fixed weights as the pre-trained network and develop three models on top of it:

1. A simple fully convolutional baseline model
2. The same model trained on additional images produced by applying image augmentation to the original images
3. The same model trained with transformations of feature maps

The models will be trained on small subsets (10, 20, 40, and 80 samples) of Kaggle's [Dogs vs. Cats dataset](https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data).

Then I will compare the performance of the models and their training time.




Francois Chollet in [his tutorial]() used the Kaggle [Dogs vs Cats Redux](https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data) competition data to show how to use image augmentation and pre-trained models to train a classifier on a small amount of data. Francois used only 2,000 (1,000 cats and 1,000 dogs) out of 25,000 images to achieve an accuracy of 0.94.

I will make the task more challenging: I will train a classifier on a very small number of examples, starting from 5 samples per class, then 10, 20, and 40 samples per class.



In [1]:
import sys
sys.path.append('..')
from util import *

import math

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
%load_ext autoreload
%autoreload 2


Using TensorFlow backend.


In [7]:
fpath = full_path('/train/*.jpg')
img_df = pd.DataFrame({'path':glob.glob(fpath)})
img_df['dog'] = (img_df.path.str.find('train/dog.') > 0).astype(int)
img_df.shape

(25000, 2)

In [8]:
from sklearn.model_selection import ShuffleSplit, StratifiedShuffleSplit
from sklearn.model_selection import train_test_split
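# split() yields (train, test) index arrays: the first 10% becomes the fixed test set,
# the second 10% (tmp_idx) is the pool for the small train and validation sets
test_idx, tmp_idx = next(StratifiedShuffleSplit(train_size = 0.1, test_size = 0.1, random_state= 517).split(img_df, img_df.dog))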


In [338]:
#n_samples = 10
n_samples = 80

In [339]:
train_idx, val_idx = train_test_split(tmp_idx, train_size = n_samples, test_size=500, random_state=801, 
                                     stratify = img_df.iloc[tmp_idx].dog)
#next(ShuffleSplit(test_size = 0.8, random_state= 617).split(val_idx, val_idx))
val_idx.shape
test_idx.shape
train_idx.shape

(500,)

(2500,)

(80,)

We will have a fixed test set of 2,500 images.

In [340]:
for idx in [test_idx, val_idx,train_idx]:
    img_df.loc[idx].shape
    np.mean(img_df.loc[idx].dog)
    

(2500, 2)

0.5

(500, 2)

0.5

(80, 2)

0.5

In [341]:
val_df = img_df.iloc[val_idx]
val_df.shape
test_df = img_df.iloc[test_idx]
test_df.shape
train_df = img_df.iloc[train_idx]
train_df.shape

(500, 2)

(2500, 2)

(80, 2)

We will have a fixed test set of 2,500 images, a validation set of 500 images, and a training set of *n_samples* images.  
All the sets have 50% cats and 50% dogs. 
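
To make the class balance concrete, here is a minimal NumPy-only sketch of drawing a small balanced sample from a pool of labels. The helper `stratified_sample` and the synthetic `labels` array are illustrative assumptions, not part of this notebook's `util` module, and the helper takes an equal share from each class rather than reproducing scikit-learn's `stratify` logic exactly.

```python
import numpy as np

def stratified_sample(labels, n_samples, rng):
    """Pick n_samples indices with an equal share from each class (hypothetical helper)."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    per_class = n_samples // len(classes)
    picked = [
        rng.choice(np.flatnonzero(labels == c), size=per_class, replace=False)
        for c in classes
    ]
    # Shuffle so the classes are not grouped together in the result
    return rng.permutation(np.concatenate(picked))

rng = np.random.default_rng(801)
labels = np.array([0, 1] * 1250)   # synthetic stand-in for the dog/cat labels
idx = stratified_sample(labels, 80, rng)
print(len(idx), labels[idx].mean())  # 80 0.5
```

With two equally frequent classes this gives the same 50/50 balance that the checks above report for every set.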


In [278]:
img_dims = [load_img(p).size for p in test_df.path]
from collections import Counter
c = Counter(img_dims)
from operator import itemgetter
c_sort = sorted(c.items(), key=itemgetter(1), reverse = True)
c_sort[:10]

[((500, 374), 294),
 ((499, 375), 269),
 ((374, 500), 28),
 ((499, 333), 26),
 ((375, 499), 24),
 ((320, 239), 20),
 ((500, 332), 17),
 ((319, 240), 15),
 ((300, 224), 12),
 ((399, 300), 12)]

We see that the most common image size is (500, 374), along with (499, 375), which is virtually the same.  
So I'm going to resize all the images to (500, 374) (width, height).

In [342]:
target_size = (374,500) #(height,width)

In [343]:
X_train, y_train = prep_x_y(train_df, target_size = target_size)
X_train.shape, y_train.shape

X_val, y_val = prep_x_y(val_df, target_size = target_size)
X_val.shape, y_val.shape


((80, 374, 500, 3), (80,))

((500, 374, 500, 3), (500,))

In [18]:

X_test, y_test = prep_x_y(test_df, target_size = target_size)
X_test.shape, y_test.shape

((2500, 374, 500, 3), (2500,))

In [254]:
res_lst = []

## Baseline model

The first model sets a baseline: a fully convolutional network attached to the top of a pre-trained VGG.  
Since I'm not going to use image augmentation for the baseline, I will first run VGG on the train, validation, and test sets and keep the output tensors as input to the model.


In [344]:
conv_out = attach_top_to_vgg(lambda x: x, layer_name = 'block5_conv3')

In [345]:
X2_val = conv_out.predict(X_val, batch_size=24)
X2_val.shape

X2_train = conv_out.predict(X_train, batch_size=24)
X2_train.shape

(500, 23, 31, 512)

(80, 23, 31, 512)

In [21]:
%time X2_test = conv_out.predict(X_test, batch_size=24)
X2_test.shape

CPU times: user 53.8 s, sys: 12.7 s, total: 1min 6s
Wall time: 1min 2s


(2500, 23, 31, 512)
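
The 23 x 31 spatial size of these feature maps can be checked by hand. The sketch below assumes the standard Keras VGG layout, in which the 3x3 convolutions use `same` padding and four 2x2 max-pooling layers with stride 2 come before `block5_conv3`; the function name is illustrative.

```python
def vgg_feature_map_size(height, width, n_pools=4):
    """Spatial size after n_pools 2x2, stride-2 max-pooling layers.

    Assumes the convolutions keep the size ('same' padding) and that
    pooling rounds odd sizes down, as Keras does with 'valid' pooling.
    """
    for _ in range(n_pools):
        height, width = height // 2, width // 2
    return height, width

print(vgg_feature_map_size(374, 500))  # (23, 31)
```

So each 374 x 500 image is reduced to a 23 x 31 grid of 512-dimensional feature vectors.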

In [151]:
def fc_bin_m1(x):
    """
    Fully Convolutional binary model
    """
    x = MaxPooling2D((2, 2), strides=(2, 2), name='b1_pool')(x)
    x = Convolution2D(256, 1, 1, activation='relu', name='conv_1x1')(x)
    x = BatchNormalization(axis=3)(x)
    x = Convolution2D(128, 7, 7, activation='relu', border_mode='valid', name='conv_FC1')(x)
    x = BatchNormalization(axis=3)(x)
    x = Convolution2D(1, 1, 1, activation='sigmoid', border_mode='valid', name='conv_FC_last')(x)
    x = GlobalAveragePooling2D(name='avg_pool')(x)
    return x

In [346]:
m_name = 'bs'
input_shape = (None, None, 512)
m_input = Input(shape=input_shape)
model = Model(m_input, fc_bin_m1(m_input))
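
Because the head is fully convolutional and the input shape is `(None, None, 512)`, it accepts feature maps of different spatial sizes. A small sketch of the size arithmetic for `fc_bin_m1` follows; the helper name is an assumption, and the layer behaviour is read off the definition above (2x2 pooling, then a 7x7 `valid` convolution).

```python
def head_output_size(height, width):
    """Spatial size of the sigmoid map just before global average pooling."""
    # b1_pool: 2x2 max pooling with stride 2 halves each side (rounding down)
    height, width = height // 2, width // 2
    # conv_1x1 keeps the size; conv_FC1 is a 7x7 'valid' convolution
    height, width = height - 6, width - 6
    if height < 1 or width < 1:
        raise ValueError("feature map too small for the 7x7 convolution")
    return height, width

print(head_output_size(23, 31))  # (5, 9)
print(head_output_size(18, 24))  # (3, 6)
```

Under these assumptions any feature map of at least 14 x 14 produces a valid output, and the global average pooling collapses whatever grid remains into a single probability.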

In [347]:
model.compile(loss='binary_crossentropy',
              optimizer= Adam(lr=0.001),
              metrics=['accuracy'])

In [348]:
batch_size = 16 if n_samples > 16 else n_samples
batch_size

16

In [349]:
model.optimizer.lr = 1e-5
history = model.fit(X2_train, y_train, validation_data=(X2_val, y_val),
                    batch_size = batch_size,
                    nb_epoch = 30, verbose = 2)


Train on 80 samples, validate on 500 samples
Epoch 1/30
2s - loss: 0.6481 - acc: 0.6500 - val_loss: 0.5314 - val_acc: 0.7640
Epoch 2/30
0s - loss: 0.5398 - acc: 0.7875 - val_loss: 0.5283 - val_acc: 0.6840
Epoch 3/30
0s - loss: 0.4509 - acc: 0.9000 - val_loss: 0.5270 - val_acc: 0.6600
Epoch 4/30
0s - loss: 0.3870 - acc: 0.9125 - val_loss: 0.5134 - val_acc: 0.6680
Epoch 5/30
0s - loss: 0.3588 - acc: 0.9125 - val_loss: 0.4771 - val_acc: 0.7080
Epoch 6/30
0s - loss: 0.3135 - acc: 0.9375 - val_loss: 0.4483 - val_acc: 0.7320
Epoch 7/30
0s - loss: 0.2772 - acc: 0.9500 - val_loss: 0.4133 - val_acc: 0.7640
Epoch 8/30
0s - loss: 0.2763 - acc: 0.9250 - val_loss: 0.3852 - val_acc: 0.7820
Epoch 9/30
0s - loss: 0.2360 - acc: 0.9625 - val_loss: 0.3513 - val_acc: 0.8220
Epoch 10/30
0s - loss: 0.2769 - acc: 0.9125 - val_loss: 0.3322 - val_acc: 0.8340
Epoch 11/30
0s - loss: 0.2074 - acc: 0.9750 - val_loss: 0.3046 - val_acc: 0.8600
Epoch 12/30
0s - loss: 0.2387 - acc: 0.9250 - val_loss: 0.2838 - val_acc:

In [351]:
%%time
model.optimizer.lr = 1e-6
history = model.fit(X2_train, y_train, validation_data=(X2_val, y_val),
                    batch_size = batch_size,
                    nb_epoch = 20, verbose = 2)

Train on 80 samples, validate on 500 samples
Epoch 1/20
0s - loss: 0.1336 - acc: 0.9625 - val_loss: 0.1936 - val_acc: 0.9520
Epoch 2/20
0s - loss: 0.0818 - acc: 1.0000 - val_loss: 0.1925 - val_acc: 0.9480
Epoch 3/20
0s - loss: 0.0804 - acc: 0.9875 - val_loss: 0.1914 - val_acc: 0.9480
Epoch 4/20
0s - loss: 0.0979 - acc: 1.0000 - val_loss: 0.1900 - val_acc: 0.9500
Epoch 5/20
0s - loss: 0.0967 - acc: 1.0000 - val_loss: 0.1884 - val_acc: 0.9500
Epoch 6/20
0s - loss: 0.0803 - acc: 1.0000 - val_loss: 0.1870 - val_acc: 0.9500
Epoch 7/20
0s - loss: 0.0635 - acc: 1.0000 - val_loss: 0.1861 - val_acc: 0.9520
Epoch 8/20
0s - loss: 0.1609 - acc: 0.9625 - val_loss: 0.1857 - val_acc: 0.9500
Epoch 9/20
0s - loss: 0.0765 - acc: 1.0000 - val_loss: 0.1848 - val_acc: 0.9500
Epoch 10/20
0s - loss: 0.0620 - acc: 1.0000 - val_loss: 0.1839 - val_acc: 0.9500
Epoch 11/20
0s - loss: 0.0614 - acc: 1.0000 - val_loss: 0.1831 - val_acc: 0.9460
Epoch 12/20
0s - loss: 0.0743 - acc: 1.0000 - val_loss: 0.1824 - val_acc:

In [352]:
pred = model.predict(X2_test)
pred.shape
pred = reduce_conf(pred)
pred_class = (pred > 0.5).astype('float32')
sc = (n_samples, m_name, metrics.accuracy_score(y_test, pred_class), metrics.log_loss(y_test, pred))
sc
res_lst

(2500, 1)

(80, 'bs', 0.94079999999999997, 0.19860990840010345)

[(10, 'fm_aug', 0.85319999999999996, 0.36306085548214612),
 (10, 'bs', 0.73480000000000001, 0.47574473093301056),
 (10, 'img_aug', 0.80600000000000005, 0.39488133698888123),
 (20, 'bs', 0.90800000000000003, 0.29638660823404789),
 (20, 'fm_aug', 0.91479999999999995, 0.26443358783498405),
 (20, 'img_aug', 0.93200000000000005, 0.25656785827279088),
 (40, 'bs', 0.92879999999999996, 0.2364862256679684),
 (40, 'fm_aug', 0.93320000000000003, 0.22113356392942368),
 (40, 'img_aug', 0.93240000000000001, 0.21645076183713974)]

In [353]:
res_lst.append(sc)
res_lst

[(10, 'fm_aug', 0.85319999999999996, 0.36306085548214612),
 (10, 'bs', 0.73480000000000001, 0.47574473093301056),
 (10, 'img_aug', 0.80600000000000005, 0.39488133698888123),
 (20, 'bs', 0.90800000000000003, 0.29638660823404789),
 (20, 'fm_aug', 0.91479999999999995, 0.26443358783498405),
 (20, 'img_aug', 0.93200000000000005, 0.25656785827279088),
 (40, 'bs', 0.92879999999999996, 0.2364862256679684),
 (40, 'fm_aug', 0.93320000000000003, 0.22113356392942368),
 (40, 'img_aug', 0.93240000000000001, 0.21645076183713974),
 (80, 'bs', 0.94079999999999997, 0.19860990840010345)]

In [292]:
from sklearn.metrics import classification_report
print(classification_report(y_test, (pred > 0.5).astype('int')))

             precision    recall  f1-score   support

        0.0       0.93      0.88      0.91      1250
        1.0       0.89      0.93      0.91      1250

avg / total       0.91      0.91      0.91      2500



We have got a very strong baseline with a very simple and fast model, thanks to the pre-trained network.
But let's see if we can improve on that with image augmentation.

## Image Augmentation

We are going to use Keras `ImageDataGenerator` to do the following augmentations:
 random crops (implemented below as random width and height shifts) and random horizontal flips.  

Why those?
Because they are common in ImageNet Challenge entries, and because similar transformations can be applied to feature maps. So we are giving feature map transformation a fair chance.  


In [363]:
train_datagen = ImageDataGenerator(
        #shear_range=0.2,
        width_shift_range=0.2,
        height_shift_range =0.1,
        horizontal_flip=True)

Because of image augmentation, we cannot reuse the precomputed output of the VGG convolutional layers. So we attach our model to VGG; it is the same model, but training now starts from the image level. For every batch we run the input through all the layers of VGG, which takes time.

In [364]:
m_name = 'img_aug'
model = attach_top_to_vgg(fc_bin_m1, layer_name = 'block5_conv3')


In [365]:
model.compile(loss='binary_crossentropy',
              optimizer= Adam(lr=0.001),
              metrics=['accuracy'])

In [366]:
model.optimizer.lr = 1e-5
history = model.fit_generator(train_datagen.flow(X_train, y_train, batch_size = batch_size),
                    samples_per_epoch=len(X_train), nb_epoch = 30, verbose = 2,
                    validation_data=(X_val, y_val))

Epoch 1/30
16s - loss: 0.7018 - acc: 0.5125 - val_loss: 0.7821 - val_acc: 0.4940
Epoch 2/30
14s - loss: 0.6080 - acc: 0.7500 - val_loss: 0.7070 - val_acc: 0.5180
Epoch 3/30
14s - loss: 0.5195 - acc: 0.8000 - val_loss: 0.6654 - val_acc: 0.5440
Epoch 4/30
14s - loss: 0.4838 - acc: 0.8375 - val_loss: 0.6292 - val_acc: 0.5700
Epoch 5/30
14s - loss: 0.4172 - acc: 0.8500 - val_loss: 0.5862 - val_acc: 0.6100
Epoch 6/30
14s - loss: 0.3925 - acc: 0.8875 - val_loss: 0.5511 - val_acc: 0.6400
Epoch 7/30
14s - loss: 0.3754 - acc: 0.8500 - val_loss: 0.5158 - val_acc: 0.6720
Epoch 8/30
14s - loss: 0.3545 - acc: 0.8875 - val_loss: 0.4804 - val_acc: 0.6940
Epoch 9/30
14s - loss: 0.3482 - acc: 0.9125 - val_loss: 0.4417 - val_acc: 0.7300
Epoch 10/30
14s - loss: 0.3137 - acc: 0.9250 - val_loss: 0.4054 - val_acc: 0.7780
Epoch 11/30
14s - loss: 0.3184 - acc: 0.9250 - val_loss: 0.3746 - val_acc: 0.8060
Epoch 12/30
14s - loss: 0.3016 - acc: 0.9500 - val_loss: 0.3499 - val_acc: 0.8200
Epoch 13/30
14s - loss: 0

In [368]:
%%time
model.optimizer.lr = 1e-6
history = model.fit_generator(train_datagen.flow(X_train, y_train, batch_size = batch_size),
                    samples_per_epoch=len(X_train), nb_epoch = 20, verbose = 2,
                    validation_data=(X_val, y_val))

Epoch 1/20
14s - loss: 0.1656 - acc: 0.9750 - val_loss: 0.2039 - val_acc: 0.9420
Epoch 2/20
14s - loss: 0.1591 - acc: 0.9750 - val_loss: 0.2013 - val_acc: 0.9420
Epoch 3/20
14s - loss: 0.1485 - acc: 0.9875 - val_loss: 0.1986 - val_acc: 0.9480
Epoch 4/20
14s - loss: 0.1612 - acc: 0.9625 - val_loss: 0.1962 - val_acc: 0.9460
Epoch 5/20
14s - loss: 0.1497 - acc: 1.0000 - val_loss: 0.1942 - val_acc: 0.9460
Epoch 6/20
14s - loss: 0.1437 - acc: 0.9875 - val_loss: 0.1925 - val_acc: 0.9500
Epoch 7/20
14s - loss: 0.1336 - acc: 1.0000 - val_loss: 0.1909 - val_acc: 0.9520
Epoch 8/20
14s - loss: 0.1285 - acc: 1.0000 - val_loss: 0.1895 - val_acc: 0.9520
Epoch 9/20
14s - loss: 0.1244 - acc: 0.9875 - val_loss: 0.1879 - val_acc: 0.9520
Epoch 10/20
14s - loss: 0.1382 - acc: 0.9750 - val_loss: 0.1867 - val_acc: 0.9520
Epoch 11/20
14s - loss: 0.1254 - acc: 0.9875 - val_loss: 0.1856 - val_acc: 0.9480
Epoch 12/20
14s - loss: 0.1313 - acc: 1.0000 - val_loss: 0.1848 - val_acc: 0.9480
Epoch 13/20
14s - loss: 0

In [369]:
pred = model.predict(X_test,batch_size=24)
pred.shape
pred = reduce_conf(pred)
pred_class = (pred > 0.5).astype('float32')
sc = (n_samples, m_name, metrics.accuracy_score(y_test, pred_class), metrics.log_loss(y_test, pred))
sc
res_lst

(2500, 1)

(80, 'img_aug', 0.93359999999999999, 0.20017809524908661)

[(10, 'fm_aug', 0.85319999999999996, 0.36306085548214612),
 (10, 'bs', 0.73480000000000001, 0.47574473093301056),
 (10, 'img_aug', 0.80600000000000005, 0.39488133698888123),
 (20, 'bs', 0.90800000000000003, 0.29638660823404789),
 (20, 'fm_aug', 0.91479999999999995, 0.26443358783498405),
 (20, 'img_aug', 0.93200000000000005, 0.25656785827279088),
 (40, 'bs', 0.92879999999999996, 0.2364862256679684),
 (40, 'fm_aug', 0.93320000000000003, 0.22113356392942368),
 (40, 'img_aug', 0.93240000000000001, 0.21645076183713974),
 (80, 'bs', 0.94079999999999997, 0.19860990840010345),
 (80, 'fm_aug', 0.94079999999999997, 0.20648835626021028)]

In [370]:
res_lst.append(sc)
res_lst

[(10, 'fm_aug', 0.85319999999999996, 0.36306085548214612),
 (10, 'bs', 0.73480000000000001, 0.47574473093301056),
 (10, 'img_aug', 0.80600000000000005, 0.39488133698888123),
 (20, 'bs', 0.90800000000000003, 0.29638660823404789),
 (20, 'fm_aug', 0.91479999999999995, 0.26443358783498405),
 (20, 'img_aug', 0.93200000000000005, 0.25656785827279088),
 (40, 'bs', 0.92879999999999996, 0.2364862256679684),
 (40, 'fm_aug', 0.93320000000000003, 0.22113356392942368),
 (40, 'img_aug', 0.93240000000000001, 0.21645076183713974),
 (80, 'bs', 0.94079999999999997, 0.19860990840010345),
 (80, 'fm_aug', 0.94079999999999997, 0.20648835626021028),
 (80, 'img_aug', 0.93359999999999999, 0.20017809524908661)]

You have to be patient: training gets much slower. In the logs above each epoch takes about 14 seconds, compared with less than a second for the baseline.

## Augmenting feature maps

Now we are going to apply similar spatial transformations to the feature maps extracted by the pre-trained model.

In [166]:
def aug_fun(x, cr = (0.1,0.1)):
    """
    Randomly crop and horizontally flip a batch of feature maps.
    x: array of shape (batch, height, width, channels)
    cr: fraction of (height, width) to crop away
    """
    shape = (x.shape[1],x.shape[2]) #expects a batch of image tensors
    crop_shape = ((int(shape[0] - cr[0]*shape[0])), (int(shape[1] - cr[1]*shape[1])))
    #random top-left offset of the crop window
    stride = [0,0]
    stride[0] = np.random.randint(shape[0] - crop_shape[0] + 1)
    stride[1] = np.random.randint(shape[1] - crop_shape[1] + 1)
    res = x[:,stride[0]:(crop_shape[0]+stride[0]),stride[1]:(crop_shape[1]+stride[1]),:]
    #random horizontal flip (along the width axis) with probability 0.5
    if np.random.binomial(1,0.5):
        res = np.flip(res, axis = 2)
    return res
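
To see what this does to the shapes, here is a self-contained sketch of the same crop-and-flip idea on synthetic data. `crop_and_flip` is an illustrative variant that takes an explicit random generator for reproducibility; it is not the function used in the experiment, and the random `fmaps` array only stands in for real VGG outputs.

```python
import numpy as np

def crop_and_flip(x, cr=(0.2, 0.2), rng=None):
    """Random crop plus random horizontal flip of a (batch, height, width, channels) array."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = x.shape[1], x.shape[2]
    crop_h, crop_w = int(h - cr[0] * h), int(w - cr[1] * w)
    top = rng.integers(h - crop_h + 1)    # random top-left corner of the window
    left = rng.integers(w - crop_w + 1)
    out = x[:, top:top + crop_h, left:left + crop_w, :]
    if rng.random() < 0.5:                # flip along the width axis half the time
        out = out[:, :, ::-1, :]
    return out

rng = np.random.default_rng(0)
fmaps = rng.normal(size=(4, 23, 31, 512)).astype("float32")  # synthetic feature maps
cropped = crop_and_flip(fmaps, cr=(0.2, 0.2), rng=rng)
print(cropped.shape)  # (4, 18, 24, 512)
```

A 20% crop of a 23 x 31 map leaves an 18 x 24 window, and the whole batch shares the same window and flip decision.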
        

In [167]:
def conv_crop_gen(X, y, cr = (0.2,0.2)):
    while True:
        for i in np.random.permutation(X.shape[0]):
            res = (aug_fun(X[i:i+1,::], cr), y[i:i+1])
            yield res

In [168]:
def aug_fun_flex(x, cr = (0.1,0.1)):
    """
    Like aug_fun, but the crop size is random as well:
    up to a fraction cr of (height, width) is cropped away.
    """
    shape = (x.shape[1],x.shape[2]) #expects a batch of image tensors
    crop_shape = (shape[0] - np.random.randint(math.ceil(cr[0]*shape[0])+1), 
                  shape[1] - np.random.randint(math.ceil(cr[1]*shape[1])+1))
    #random top-left offset of the crop window
    stride = [0,0]
    stride[0] = np.random.randint(shape[0] - crop_shape[0] + 1)
    stride[1] = np.random.randint(shape[1] - crop_shape[1] + 1)
    res = x[:,stride[0]:(crop_shape[0]+stride[0]),stride[1]:(crop_shape[1]+stride[1]),:]
    #random horizontal flip (along the width axis) with probability 0.5
    if np.random.binomial(1,0.5):
        res = np.flip(res, axis = 2)
    return res

In [217]:
def crop_gen_flex(X, y, cr = (0.3,0.3), batch_size = 8):
    while True:
        idx = np.random.permutation(X.shape[0])
        for j in range(0, X.shape[0], batch_size):
            #print(j)
            b_idx = idx[j:(j+batch_size)]
            res = (aug_fun_flex(X[b_idx,::], cr), y[b_idx])
            yield res

In [354]:
train_gen = gen_batch_wrap(conv_crop_gen(X2_train, y_train, cr = (0.2,0.2)), batch_size = batch_size)
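
`gen_batch_wrap` comes from the `util` module and its source is not shown here. As a rough idea of what such a wrapper might do, the sketch below groups single-sample `(x, y)` pairs into batches; `batch_wrap` and `toy_gen` are hypothetical names, and the real helper may work differently. Stacking only works when every sample has the same crop size, which holds for a fixed-size crop generator.

```python
import numpy as np

def batch_wrap(sample_gen, batch_size):
    """Group single-sample (x, y) pairs into batches (hypothetical stand-in for gen_batch_wrap)."""
    while True:
        xs, ys = zip(*(next(sample_gen) for _ in range(batch_size)))
        # Each x has shape (1, h, w, c) and each y has shape (1,), so concatenate on axis 0
        yield np.concatenate(xs, axis=0), np.concatenate(ys, axis=0)

def toy_gen():
    """Endless stream of fake single-sample feature maps with alternating labels."""
    i = 0
    while True:
        yield np.full((1, 18, 24, 4), i, dtype="float32"), np.array([i % 2])
        i += 1

xb, yb = next(batch_wrap(toy_gen(), batch_size=16))
print(xb.shape, yb.shape)  # (16, 18, 24, 4) (16,)
```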


In [381]:
m_name = 'fm_aug'
input_shape = (None, None, 512)
m_input = Input(shape=input_shape)
model = Model(m_input, fc_bin_m1(m_input))

In [382]:
model.compile(loss='binary_crossentropy',
              optimizer= Adam(lr=0.001),
              metrics=['accuracy'])

In [383]:
model.optimizer.lr = 1e-5
history = model.fit_generator(train_gen,
                    samples_per_epoch=len(X2_train), nb_epoch = 30, verbose = 2,
                    validation_data=(X2_val, y_val))


Epoch 1/30
2s - loss: 0.7343 - acc: 0.5500 - val_loss: 0.6855 - val_acc: 0.5620
Epoch 2/30
0s - loss: 0.6479 - acc: 0.5875 - val_loss: 0.6158 - val_acc: 0.6340
Epoch 3/30
0s - loss: 0.5221 - acc: 0.7750 - val_loss: 0.6177 - val_acc: 0.5800
Epoch 4/30
0s - loss: 0.4715 - acc: 0.8500 - val_loss: 0.6221 - val_acc: 0.5760
Epoch 5/30
0s - loss: 0.3986 - acc: 0.9125 - val_loss: 0.5918 - val_acc: 0.6040
Epoch 6/30
0s - loss: 0.3876 - acc: 0.9125 - val_loss: 0.5585 - val_acc: 0.6240
Epoch 7/30
0s - loss: 0.3368 - acc: 0.9375 - val_loss: 0.5114 - val_acc: 0.6760
Epoch 8/30
0s - loss: 0.2901 - acc: 0.9500 - val_loss: 0.4685 - val_acc: 0.7180
Epoch 9/30
0s - loss: 0.3008 - acc: 0.9125 - val_loss: 0.4372 - val_acc: 0.7540
Epoch 10/30
0s - loss: 0.2615 - acc: 0.9875 - val_loss: 0.4075 - val_acc: 0.7820
Epoch 11/30
0s - loss: 0.2598 - acc: 0.9625 - val_loss: 0.3877 - val_acc: 0.8040
Epoch 12/30
0s - loss: 0.2812 - acc: 0.9125 - val_loss: 0.3659 - val_acc: 0.8240
Epoch 13/30
0s - loss: 0.1997 - acc: 

In [384]:
%%time
model.optimizer.lr = 1e-6
history = model.fit_generator(train_gen,
                    samples_per_epoch=len(X2_train), nb_epoch = 20, verbose = 2,
                    validation_data=(X2_val, y_val))

Epoch 1/20
0s - loss: 0.1087 - acc: 0.9875 - val_loss: 0.2179 - val_acc: 0.9440
Epoch 2/20
0s - loss: 0.1159 - acc: 0.9875 - val_loss: 0.2150 - val_acc: 0.9420
Epoch 3/20
0s - loss: 0.1145 - acc: 0.9875 - val_loss: 0.2125 - val_acc: 0.9420
Epoch 4/20
0s - loss: 0.1232 - acc: 0.9750 - val_loss: 0.2101 - val_acc: 0.9420
Epoch 5/20
0s - loss: 0.1088 - acc: 0.9875 - val_loss: 0.2082 - val_acc: 0.9420
Epoch 6/20
0s - loss: 0.1088 - acc: 1.0000 - val_loss: 0.2064 - val_acc: 0.9420
Epoch 7/20
0s - loss: 0.1036 - acc: 0.9875 - val_loss: 0.2047 - val_acc: 0.9420
Epoch 8/20
0s - loss: 0.0779 - acc: 1.0000 - val_loss: 0.2034 - val_acc: 0.9460
Epoch 9/20
0s - loss: 0.0845 - acc: 1.0000 - val_loss: 0.2016 - val_acc: 0.9440
Epoch 10/20
0s - loss: 0.0854 - acc: 1.0000 - val_loss: 0.2000 - val_acc: 0.9440
Epoch 11/20
0s - loss: 0.0957 - acc: 1.0000 - val_loss: 0.1982 - val_acc: 0.9440
Epoch 12/20
0s - loss: 0.0901 - acc: 1.0000 - val_loss: 0.1966 - val_acc: 0.9460
Epoch 13/20
0s - loss: 0.1016 - acc: 

In [360]:
pred = model.predict(X2_test)
pred.shape
pred = reduce_conf(pred)
pred_class = (pred > 0.5).astype('float32')
sc = (n_samples, m_name, metrics.accuracy_score(y_test, pred_class), metrics.log_loss(y_test, pred))
sc
res_lst

(2500, 1)

(80, 'fm_aug', 0.94079999999999997, 0.20648835626021028)

[(10, 'fm_aug', 0.85319999999999996, 0.36306085548214612),
 (10, 'bs', 0.73480000000000001, 0.47574473093301056),
 (10, 'img_aug', 0.80600000000000005, 0.39488133698888123),
 (20, 'bs', 0.90800000000000003, 0.29638660823404789),
 (20, 'fm_aug', 0.91479999999999995, 0.26443358783498405),
 (20, 'img_aug', 0.93200000000000005, 0.25656785827279088),
 (40, 'bs', 0.92879999999999996, 0.2364862256679684),
 (40, 'fm_aug', 0.93320000000000003, 0.22113356392942368),
 (40, 'img_aug', 0.93240000000000001, 0.21645076183713974),
 (80, 'bs', 0.94079999999999997, 0.19860990840010345)]

In [362]:
res_lst.append(sc)
res_lst

[(10, 'fm_aug', 0.85319999999999996, 0.36306085548214612),
 (10, 'bs', 0.73480000000000001, 0.47574473093301056),
 (10, 'img_aug', 0.80600000000000005, 0.39488133698888123),
 (20, 'bs', 0.90800000000000003, 0.29638660823404789),
 (20, 'fm_aug', 0.91479999999999995, 0.26443358783498405),
 (20, 'img_aug', 0.93200000000000005, 0.25656785827279088),
 (40, 'bs', 0.92879999999999996, 0.2364862256679684),
 (40, 'fm_aug', 0.93320000000000003, 0.22113356392942368),
 (40, 'img_aug', 0.93240000000000001, 0.21645076183713974),
 (80, 'bs', 0.94079999999999997, 0.19860990840010345),
 (80, 'fm_aug', 0.94079999999999997, 0.20648835626021028)]

In [388]:
pred_lst = [model.predict(aug_fun(X2_test, cr = (0.1,0.3))) for i in range(10)]

In [389]:
sc = [metrics.log_loss(y_test, reduce_conf(pr)) for pr in pred_lst] 
sc
pred = reduce_conf(np.mean(pred_lst, axis = 0))
metrics.log_loss(y_test, pred)
pred_class = (pred > 0.5).astype('float32')
metrics.accuracy_score(y_test, pred_class)

[0.17868495296649634,
 0.18569375315867365,
 0.20369601135216653,
 0.18570210080705582,
 0.18602345951274038,
 0.19475754460133612,
 0.20732106727436184,
 0.18169053114876152,
 0.18219134603515266,
 0.17960556001923977]

0.1826728599641472

0.94040000000000001
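
The two cells above apply the same idea at prediction time: predict on ten random crops of the test feature maps and average the results. A minimal, self-contained sketch of that averaging is shown below; `predict_with_tta`, `random_crop`, and `toy_predict` are illustrative names, and `toy_predict` is only a placeholder for `model.predict`.

```python
import numpy as np

def predict_with_tta(predict_fn, augment_fn, x, n_rounds=10):
    """Average predict_fn over n_rounds random augmentations of x."""
    preds = [predict_fn(augment_fn(x)) for _ in range(n_rounds)]
    return np.mean(preds, axis=0)

rng = np.random.default_rng(0)

def random_crop(x, cr=(0.1, 0.3)):
    """Random spatial crop of a (batch, height, width, channels) array."""
    h, w = x.shape[1], x.shape[2]
    crop_h, crop_w = int(h - cr[0] * h), int(w - cr[1] * w)
    top = rng.integers(h - crop_h + 1)
    left = rng.integers(w - crop_w + 1)
    return x[:, top:top + crop_h, left:left + crop_w, :]

def toy_predict(x):
    """Placeholder model: sigmoid of the mean activation, shape (batch, 1)."""
    score = x.mean(axis=(1, 2, 3))
    return (1.0 / (1.0 + np.exp(-score)))[:, None]

x_test = rng.normal(size=(8, 23, 31, 16))   # synthetic feature maps
pred = predict_with_tta(toy_predict, random_crop, x_test)
print(pred.shape)  # (8, 1)
```

Averaging works here because the fully convolutional head returns one probability per sample regardless of the crop size.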

## Conclusion

In [371]:
df = pd.DataFrame(res_lst, columns=['n_samples','model','accuracy','logloss'])
df.shape
df.head()

(12, 4)

| | n_samples | model | accuracy | logloss |
|---|---|---|---|---|
| 0 | 10 | fm_aug | 0.8532 | 0.363061 |
| 1 | 10 | bs | 0.7348 | 0.475745 |
| 2 | 10 | img_aug | 0.806 | 0.394881 |
| 3 | 20 | bs | 0.908 | 0.296387 |
| 4 | 20 | fm_aug | 0.9148 | 0.264434 |


In [378]:
df = df.sort_values(by=['n_samples','model'])
df

| | n_samples | model | accuracy | logloss | sec/epoch |
|---|---|---|---|---|---|
| 1 | 10 | baseline | 0.7348 | 0.475745 | 0 |
| 0 | 10 | feature maps augmentation | 0.8532 | 0.363061 | 0 |
| 2 | 10 | image augmentation | 0.806 | 0.394881 | 0 |
| 3 | 20 | baseline | 0.908 | 0.296387 | 0 |
| 4 | 20 | feature maps augmentation | 0.9148 | 0.264434 | 0 |
| 5 | 20 | image augmentation | 0.932 | 0.256568 | 0 |
| 6 | 40 | baseline | 0.9288 | 0.236486 | 0 |
| 7 | 40 | feature maps augmentation | 0.9332 | 0.221134 | 0 |
| 8 | 40 | image augmentation | 0.9324 | 0.216451 | 0 |
| 9 | 80 | baseline | 0.9408 | 0.19861 | 0 |


In [377]:
df['sec/epoch'] = 0
df = df.replace({'model':{'bs':'baseline','img_aug':'image augmentation','fm_aug':'feature maps augmentation'}})

In [390]:
#df
df.dtypes
#print(df.to_html(index = False, float_format='%.3f'))
df

n_samples      int64
model         object
accuracy     float64
logloss      float64
sec/epoch      int64
dtype: object

| | n_samples | model | accuracy | logloss | sec/epoch |
|---|---|---|---|---|---|
| 1 | 10 | baseline | 0.7348 | 0.475745 | 0 |
| 0 | 10 | feature maps augmentation | 0.8532 | 0.363061 | 0 |
| 2 | 10 | image augmentation | 0.806 | 0.394881 | 0 |
| 3 | 20 | baseline | 0.908 | 0.296387 | 0 |
| 4 | 20 | feature maps augmentation | 0.9148 | 0.264434 | 0 |
| 5 | 20 | image augmentation | 0.932 | 0.256568 | 0 |
| 6 | 40 | baseline | 0.9288 | 0.236486 | 0 |
| 7 | 40 | feature maps augmentation | 0.9332 | 0.221134 | 0 |
| 8 | 40 | image augmentation | 0.9324 | 0.216451 | 0 |
| 9 | 80 | baseline | 0.9408 | 0.19861 | 0 |
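
One simple way to read these results is to pick, for each training-set size, the model with the lowest test log loss. The sketch below does this in plain Python using the twelve tuples recorded in `res_lst` earlier in the notebook (values rounded); the variable names are illustrative. Each number comes from the single run recorded here, so small differences should be read with caution.

```python
# (n_samples, model, accuracy, logloss) copied from res_lst, rounded
results = [
    (10, 'fm_aug', 0.8532, 0.363061),
    (10, 'bs', 0.7348, 0.475745),
    (10, 'img_aug', 0.8060, 0.394881),
    (20, 'bs', 0.9080, 0.296387),
    (20, 'fm_aug', 0.9148, 0.264434),
    (20, 'img_aug', 0.9320, 0.256568),
    (40, 'bs', 0.9288, 0.236486),
    (40, 'fm_aug', 0.9332, 0.221134),
    (40, 'img_aug', 0.9324, 0.216451),
    (80, 'bs', 0.9408, 0.198610),
    (80, 'fm_aug', 0.9408, 0.206488),
    (80, 'img_aug', 0.9336, 0.200178),
]

# Keep the entry with the lowest log loss for each training-set size
best_by_size = {}
for n, name, acc, logloss in results:
    if n not in best_by_size or logloss < best_by_size[n][1]:
        best_by_size[n] = (name, logloss)

for n in sorted(best_by_size):
    print(n, *best_by_size[n])
# 10 fm_aug 0.363061
# 20 img_aug 0.256568
# 40 img_aug 0.216451
# 80 bs 0.19861
```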
