## Using a backbone from the pre-trained Keras ResNet50/VGG19 applications

In [1]:
%matplotlib inline
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras.applications.resnet50 import *
from tensorflow.keras.datasets import cifar10
# from keras.datasets import cifar10
# from keras.applications.resnet50 import *
# from keras.layers import *
# from keras.models import *
# import keras

import numpy as np
import pandas as pd
keras.__version__

'2.2.4-tf'

In [2]:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

### Let's train a binary classifier and see what happens when positive/negative samples are extremely imbalanced

In [5]:
x_train_bin = x_train.copy()
y_train_bin = (y_train == 1)  # positive class is 1, consistent with y_valid_bin and the class-1 session mask
x_train_bin.shape, y_train_bin.shape

((50000, 32, 32, 3), (50000, 1))

In [6]:
x = preprocess_input(x_train_bin[0])

In [7]:
x = np.expand_dims(x, axis=0)
x.shape

(1, 32, 32, 3)
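For reference, `preprocess_input` from the ResNet50 application follows the "caffe" convention: it reorders the channels from RGB to BGR and subtracts the per-channel ImageNet means, without any scaling. The block below is a NumPy sketch of that transformation, not the library code; `caffe_style_preprocess` is a hypothetical name. The `fit` calls in this notebook pass the raw `uint8` images, so applying the same preprocessing to the training data may be worth trying.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by the 'caffe' preprocessing mode.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(images):
    """Approximate ResNet50 preprocess_input: RGB -> BGR, then subtract means."""
    x = np.asarray(images, dtype=np.float32)  # work in float32
    x = x[..., ::-1]                          # reverse the channel axis: RGB -> BGR
    return x - IMAGENET_BGR_MEAN              # broadcasts over the last axis, returns a new array

# A dummy batch shaped like one CIFAR-10 image.
batch = np.zeros((1, 32, 32, 3), dtype=np.uint8)
out = caffe_style_preprocess(batch)
print(out.shape, out.dtype)
```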

In [7]:
m = ResNet50(weights='imagenet', include_top=False)
m.summary()



__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, None, None, 3 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, None, None, 3 0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, None, None, 6 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, None, None, 6 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation

In [8]:
for layer in m.layers:
    layer.trainable = False

In [10]:
x_features = m.predict(x=x)
x_features.shape

(1, 1, 1, 2048)

In [9]:
x = m.output
x = GlobalAveragePooling2D()(x)
x = Dense(10, activation='relu')(x)
prediction = Dense(1, activation='sigmoid')(x)

In [10]:
baseline_model = Model(inputs=m.input, outputs=prediction)
baseline_model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, None, None, 3 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, None, None, 3 0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, None, None, 6 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, None, None, 6 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation

In [11]:
baseline_model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['acc'])

In [12]:
x_valid_bin = x_test.copy()
y_valid_bin = y_test==1

In [13]:
hist = baseline_model.fit(x=x_train_bin, y=y_train_bin,
                          validation_data=(x_valid_bin, y_valid_bin),
                          epochs=10, verbose=1)

Train on 50000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [31]:
baseline_model.evaluate(x=x_valid_bin, y=y_valid_bin)



[0.3152909840464592, 0.8947]
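An accuracy of 0.8947 should be read against the class balance. The CIFAR-10 test set has 1,000 images per class, so a one-vs-rest label marks only 10% of it as positive, and a model that always answers "negative" already scores 0.90. The sketch below uses synthetic labels with that ratio (an assumption standing in for `y_valid_bin`) to show this trivial baseline next to recall, which exposes it.

```python
import numpy as np

# Synthetic stand-in for y_valid_bin: 1,000 positives out of 10,000 samples.
y_true = np.zeros((10000, 1), dtype=bool)
y_true[:1000] = True

# A degenerate "classifier" that always predicts the negative class.
y_pred = np.zeros_like(y_true)

accuracy = np.mean(y_pred == y_true)
# Recall: fraction of true positives that were actually detected.
recall = np.sum(y_pred & y_true) / np.sum(y_true)

print("accuracy: %.4f, recall: %.4f" % (accuracy, recall))
```

Reporting recall (or precision) alongside `acc` makes it easier to tell a useful model from one that has collapsed to the majority class.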

### Train the binary classifier with skewed data

In [14]:
feats = m.output

In [42]:
x = GlobalAveragePooling2D()(feats)
x = Dense(10, activation='relu')(x)
y = Dense(1, activation='sigmoid')(x)
test_model = Model(inputs=m.inputs, outputs=y)

In [43]:
test_model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['acc'])

In [38]:
mask = (y_train==1)
mask = np.squeeze(mask)
mask.shape

(50000,)

In [39]:
x_train_bin_sess1 = x_train_bin[mask]
x_train_bin_sess1.shape

(5000, 32, 32, 3)

In [40]:
y_train_bin_sess1 = y_train_bin[mask]
y_train_bin_sess1.shape

(5000, 1)

In [42]:
test_model.fit(x=x_train_bin_sess1, y=y_train_bin_sess1, epochs=10, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f9d369d9eb8>

In [43]:
test_model.evaluate(x=x_valid_bin, y=y_valid_bin)



[10.203698847961427, 0.1]

Apparently, that won't work, since this model has only been taught with positive examples. Some automatic generator of negative examples seems to be necessary. For now, just keep streaming in negative samples and see what happens.
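One simple way to keep negatives in every session is to pair the positives with an equal number of negatives drawn at random from the other classes. The sketch below shows the idea with NumPy only; `make_balanced_session` and its arguments are hypothetical names, not part of this notebook.

```python
import numpy as np

def make_balanced_session(X, y, pos_cls, rng):
    """Return all samples of pos_cls plus an equal number of random negatives."""
    labels = np.ravel(y)
    pos_idx = np.flatnonzero(labels == pos_cls)
    neg_idx = np.flatnonzero(labels != pos_cls)
    # Sample without replacement so no negative image is repeated.
    neg_idx = rng.choice(neg_idx, size=len(pos_idx), replace=False)
    # Shuffle so positives and negatives are interleaved.
    idx = rng.permutation(np.concatenate([pos_idx, neg_idx]))
    return X[idx], (y[idx] == pos_cls)

# Tiny synthetic data set shaped like CIFAR-10: 10 classes, 20 samples each.
rng = np.random.default_rng(0)
y_demo = np.repeat(np.arange(10), 20).reshape(-1, 1)
X_demo = np.zeros((len(y_demo), 32, 32, 3), dtype=np.uint8)

X_sess, y_sess = make_balanced_session(X_demo, y_demo, pos_cls=1, rng=rng)
print(X_sess.shape, y_sess.mean())
```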

In [56]:
# this time, stream in samples of class 2 only (all negatives)
mask = y_train==2
mask = np.squeeze(mask)
x_train_bin_sess2 = x_train_bin[mask]
y_train_bin_sess2 = y_train_bin[mask]
mask.shape, x_train_bin_sess2.shape, y_train_bin_sess2.shape

((50000,), (5000, 32, 32, 3), (5000, 1))

In [57]:
test_model.fit(x=x_train_bin_sess2, y=y_train_bin_sess2, epochs=10, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f9d27e9e7b8>

In [59]:
testout = test_model.predict(x_valid_bin)

array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])

The model simply swings to the opposite extreme.

In [63]:
np.sum(testout>0.1)

1

Even for a binary classifier, skewed data hurts the model badly.
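When the imbalance cannot be removed by resampling, a common mitigation is to weight the loss by inverse class frequency. The sketch below computes such weights with NumPy; the resulting dictionary has the form accepted by the `class_weight` argument of Keras `Model.fit`. The 10%/90% split is an assumed example and `inverse_frequency_weights` is a hypothetical helper.

```python
import numpy as np

def inverse_frequency_weights(y_bin):
    """Weights n_samples / (n_classes * class_count) for binary labels."""
    y_int = np.ravel(y_bin).astype(int)
    counts = np.bincount(y_int, minlength=2)
    return {cls: len(y_int) / (2.0 * counts[cls]) for cls in (0, 1)}

# Assumed example: 500 positives among 5,000 samples.
y_demo = np.zeros((5000, 1), dtype=bool)
y_demo[:500] = True

weights = inverse_frequency_weights(y_demo)
print(weights)
# Possible use (not run here): model.fit(x, y, class_weight=weights, ...)
```

Weights only help when both classes appear in a session. A session that contains a single class, as in the two runs above, leaves nothing to reweight.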

## Train the bin classifier incrementally

Next, we manually create 'perfect' learning sessions to try to beat the baseline model (89% validation accuracy). The purpose of this experiment is to find out whether extra negative samples help the model.

In [3]:
def get_model_bin_classify(feature_extractor):
    x = feature_extractor.output
    x = GlobalAveragePooling2D()(x)
    x = Dense(10, activation='relu')(x)
    x = Dense(1, activation='sigmoid')(x)
    model = Model(inputs=feature_extractor.input, outputs=x)
    model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['acc'])
    return model

In [4]:
def get_train_samples(X_train, y_train, cls_li):
    # Select every sample whose original label is in cls_li.
    # Builtin bool works across NumPy versions (np.bool is missing from some releases).
    mask = np.zeros(y_train.shape, dtype=bool)
    for cls in cls_li:
        mask |= (y_train == cls)
    mask = np.squeeze(mask)
    return X_train[mask], y_train[mask]

def eva_bin_on_cls(model, X_valid, y_valid, cls_li, pos_samples):
    # Evaluate the binary model separately on each original class in cls_li.
    for cls in cls_li:
        mask = (y_valid == cls)
        mask = np.squeeze(mask)
        X_valid_cur_sess = X_valid[mask]
        y_valid_cur_sess = y_valid[mask]
        y_valid_cur_sess = convert_to_bin_samples(y_valid_cur_sess, pos_samples)
        loss, metric = model.evaluate(x=X_valid_cur_sess, y=y_valid_cur_sess)
        print('Evaluation loss: %f. Acc: %f' % (loss, metric))

def convert_to_bin_samples(labels, pos_samples):
    # labels = original class labels, pos_samples = classes to mark as positive
    re_label = np.zeros(labels.shape, dtype=bool)
    for it in pos_samples:
        re_label |= (labels == it)
    return re_label

In [5]:
feat_ext = ResNet50(weights='imagenet', include_top=False)



In [6]:
test_model = get_model_bin_classify(feat_ext)

In [16]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 1])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1], [0])

Evaluation loss: 0.108045. Acc: 0.964000
Evaluation loss: 0.099776. Acc: 0.970000


In [19]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 2])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.203777. Acc: 0.941000
Evaluation loss: 2.102987. Acc: 0.589000
Evaluation loss: 0.203168. Acc: 0.941000


In [21]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 3])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.226183. Acc: 0.939000
Evaluation loss: 1.429062. Acc: 0.710000
Evaluation loss: 0.496108. Acc: 0.886000
Evaluation loss: 0.070448. Acc: 0.978000


In [23]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 4])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3, 4], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.220857. Acc: 0.948000
Evaluation loss: 2.498029. Acc: 0.561000
Evaluation loss: 0.620862. Acc: 0.870000
Evaluation loss: 0.155282. Acc: 0.951000
Evaluation loss: 0.072679. Acc: 0.980000


In [24]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 5])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3, 4, 5], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.245425. Acc: 0.938000
Evaluation loss: 1.942354. Acc: 0.628000
Evaluation loss: 0.467712. Acc: 0.904000
Evaluation loss: 0.067387. Acc: 0.977000
Evaluation loss: 0.116070. Acc: 0.966000
Evaluation loss: 0.037063. Acc: 0.989000


In [25]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 6])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3, 4, 5, 6], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.227224. Acc: 0.942000
Evaluation loss: 1.758390. Acc: 0.667000
Evaluation loss: 0.471796. Acc: 0.888000
Evaluation loss: 0.117186. Acc: 0.966000
Evaluation loss: 0.140975. Acc: 0.958000
Evaluation loss: 0.060888. Acc: 0.985000
Evaluation loss: 0.030794. Acc: 0.993000


In [26]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 7])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3, 4, 5, 6, 7], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.402271. Acc: 0.921000
Evaluation loss: 1.474920. Acc: 0.723000
Evaluation loss: 0.479810. Acc: 0.903000
Evaluation loss: 0.062342. Acc: 0.979000
Evaluation loss: 0.070819. Acc: 0.986000
Evaluation loss: 0.017832. Acc: 0.992000
Evaluation loss: 0.041253. Acc: 0.984000
Evaluation loss: 0.033257. Acc: 0.993000


In [27]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 8])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3, 4, 5, 6, 7, 8], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.517809. Acc: 0.878000
Evaluation loss: 0.418596. Acc: 0.892000
Evaluation loss: 0.961993. Acc: 0.818000
Evaluation loss: 0.383018. Acc: 0.913000
Evaluation loss: 0.346881. Acc: 0.908000
Evaluation loss: 0.248727. Acc: 0.937000
Evaluation loss: 0.221799. Acc: 0.945000
Evaluation loss: 0.417072. Acc: 0.873000
Evaluation loss: 0.100033. Acc: 0.964000


In [28]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 9])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=3)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.575581. Acc: 0.870000
Evaluation loss: 0.069536. Acc: 0.979000
Evaluation loss: 1.360952. Acc: 0.749000
Evaluation loss: 0.297052. Acc: 0.921000
Evaluation loss: 0.418982. Acc: 0.885000
Evaluation loss: 0.172563. Acc: 0.954000
Evaluation loss: 0.137431. Acc: 0.961000
Evaluation loss: 0.211672. Acc: 0.931000
Evaluation loss: 0.369526. Acc: 0.901000
Evaluation loss: 0.052197. Acc: 0.985000


The result is fairly satisfying yet confusing. Why does the performance on the 2nd class fluctuate from session to session? Let's see how a partially trained binary classifier performs on unseen data.
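To make the fluctuation concrete, the per-class accuracies printed above can be collected per session and compared. The sketch below copies the accuracies reported for classes 1 and 2 after each session and computes the change from one session to the next; the variable and function names are illustrative.

```python
# Accuracy after each session; keys are the negative class trained in that session.
acc_cls1 = {1: 0.970, 2: 0.589, 3: 0.710, 4: 0.561, 5: 0.628,
            6: 0.667, 7: 0.723, 8: 0.892, 9: 0.979}
acc_cls2 = {2: 0.941, 3: 0.886, 4: 0.870, 5: 0.904,
            6: 0.888, 7: 0.903, 8: 0.818, 9: 0.749}

def session_deltas(acc_by_session):
    """Change in accuracy relative to the previous session."""
    sessions = sorted(acc_by_session)
    return {cur: round(acc_by_session[cur] - acc_by_session[prev], 3)
            for prev, cur in zip(sessions, sessions[1:])}

for name, acc in (("class 1", acc_cls1), ("class 2", acc_cls2)):
    deltas = session_deltas(acc)
    worst = min(deltas, key=deltas.get)  # session with the most negative change
    print("%s: largest drop %.3f after training on negative class %d"
          % (name, deltas[worst], worst))
```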

## Experiments on unseen negative samples

In [7]:
X_train_sess, y_train_sess = get_train_samples(x_train, y_train, [0, 1])
y_train_sess = convert_to_bin_samples(y_train_sess, [0])
test_model.fit(x=X_train_sess, y=y_train_sess, epochs=5, verbose=1)
eva_bin_on_cls(test_model, x_test, y_test, [0, 1, 2, 3, 4, 5], [0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Evaluation loss: 0.200314. Acc: 0.934000
Evaluation loss: 0.188399. Acc: 0.922000
Evaluation loss: 3.093190. Acc: 0.224000
Evaluation loss: 1.510724. Acc: 0.454000
Evaluation loss: 2.346873. Acc: 0.255000
Evaluation loss: 1.400316. Acc: 0.506000


In [28]:
X_test_subset, Y_test_set = get_train_samples(x_test, y_test, [2])
prediction = test_model.predict(X_test_subset)

In [38]:
np.sum(prediction < 0.9) / np.squeeze(prediction).shape[0]

0.476

The model **does not generalize well enough to unseen negative samples**: even with the decision threshold raised to 0.9, only 47.6% of the class 2 test predictions fall below it. Raising the threshold that far will also **significantly degrade performance on positive samples**. As a result, every relevant classifier needs to be updated in turn when a new class arrives.
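This trade-off can be examined by sweeping the decision threshold and measuring, for each value, the fraction of positives still accepted and the fraction of unseen negatives rejected. The sketch below uses small made-up score arrays in place of `test_model.predict(...)` outputs; `threshold_sweep` is a hypothetical helper.

```python
import numpy as np

def threshold_sweep(pos_scores, neg_scores, thresholds):
    """For each threshold t, return (t, positive recall, negative rejection rate)."""
    pos_scores = np.ravel(pos_scores)
    neg_scores = np.ravel(neg_scores)
    rows = []
    for t in thresholds:
        recall = np.mean(pos_scores >= t)    # positives still accepted
        rejection = np.mean(neg_scores < t)  # negatives correctly rejected
        rows.append((t, float(recall), float(rejection)))
    return rows

# Made-up sigmoid outputs: positives score high, unseen negatives are spread out.
pos_scores = np.array([0.99, 0.97, 0.95, 0.85, 0.60])
neg_scores = np.array([0.98, 0.93, 0.91, 0.70, 0.20])

rows = threshold_sweep(pos_scores, neg_scores, [0.5, 0.9])
for t, recall, rejection in rows:
    print("threshold %.1f: recall %.2f, rejection %.2f" % (t, recall, rejection))
```

With real predictions, `pos_scores` would come from class 0 test images and `neg_scores` from a class the model has not been trained on.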

## Conclusion
- Performance on previously seen negative samples is disturbed when the model trains on newly arrived negative samples; it sometimes increases and sometimes decreases.
- The model's generalization capability appears to improve once enough negative classes have been trained on.
- The model performs poorly on unseen negative samples, which emphasizes the need to update it whenever a new negative class arrives.
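One common remedy for the interference noted in the first point is rehearsal: keep a small sample from earlier sessions and mix it into each new session's training data. The sketch below is a NumPy-only illustration under that assumption; `ReplayBuffer`, its capacity, and the sampling scheme are hypothetical and were not part of these experiments.

```python
import numpy as np

class ReplayBuffer:
    """Keeps up to `per_class` stored images for every class seen so far."""

    def __init__(self, per_class, seed=0):
        self.per_class = per_class
        self.rng = np.random.default_rng(seed)
        self.store = {}  # class id -> array of stored images

    def add(self, X, y):
        """Store a random subset of each class present in (X, y)."""
        labels = np.ravel(y)
        for cls in np.unique(labels):
            imgs = X[labels == cls]
            n_keep = min(self.per_class, len(imgs))
            keep = self.rng.choice(len(imgs), size=n_keep, replace=False)
            self.store[int(cls)] = imgs[keep]

    def mix_into(self, X_new, y_new):
        """Append stored samples of classes that are absent from the new session."""
        present = set(np.unique(y_new).tolist())
        old = [c for c in self.store if c not in present]
        Xs = [X_new] + [self.store[c] for c in old]
        ys = [y_new] + [np.full((len(self.store[c]), 1), c, dtype=y_new.dtype)
                        for c in old]
        return np.concatenate(Xs), np.concatenate(ys)

# Synthetic data: three classes, 20 images each.
X = np.zeros((60, 32, 32, 3), dtype=np.uint8)
y = np.repeat(np.array([0, 1, 2]), 20).reshape(-1, 1)

buf = ReplayBuffer(per_class=5)
first = np.ravel(y) <= 1             # session 1: classes 0 and 1
buf.add(X[first], y[first])

second = np.ravel(y) != 1            # session 2: classes 0 and 2
X_mix, y_mix = buf.mix_into(X[second], y[second])
print(X_mix.shape, np.unique(y_mix))
```

The mixed labels keep their original class ids, so they can be passed through `convert_to_bin_samples` before calling `fit`.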