# Exercise on the value of unsupervised constructed features for training a classifier with few labeled examples

To construct features of an image without using any labels of the target data set, we can use a pretrained CNN as a feature extractor.

We have done this to extract features from 1000 CIFAR-10 images. As the pretrained CNN we use a VGG16 architecture that was trained on ImageNet data and was the runner-up in the classification task of the ImageNet competition in 2014.

As a check on the quality of the feature representation of the CIFAR-10 data, we train a classifier on 100 labeled examples (10 per class), once using the raw pixel features and once using the VGG features. If the VGG features are indeed better than the raw pixel values, we would expect a better classifier with the VGG features than with the pixel features.

a) What accuracy would you expect from a classifier that cannot distinguish between the 10 classes and is only guessing?



b) Go through the code that is used to set up, train, and evaluate a CNN classifier on the raw pixel features. Discuss your thoughts on the achieved accuracy (e.g. with your neighbor).


c) Now we use the unsupervised constructed VGG features. We want to check whether these VGG features are good enough to train a classifier with only a few labeled examples and still get a satisfying performance. For this purpose, please complete the code to set up a fully connected NN and run the provided subsequent code to train it and determine its accuracy on the test set. Compare it to the accuracy that we achieve with a random forest (RF). Discuss the results (e.g. with your neighbor).




### Imports

In [3]:
%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.image as imgplot
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from pylab import *

import time
import tensorflow as tf

import keras
import sys
print ("Keras {} TF {} Python {}".format(keras.__version__, tf.__version__, sys.version_info))

Keras 2.4.3 TF 2.4.1 Python sys.version_info(major=3, minor=7, micro=10, releaselevel='final', serial=0)


### CIFAR Data preparation

In [4]:
# download the CIFAR-10 data
from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# the official test split is not used; all splits below are drawn from x_train
del x_test, y_test

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [5]:
# loop over the class labels, sample 100 random images per label, and collect their indices
np.random.seed(seed=222)
idx=np.empty(0,dtype=int)
for i in range(0,len(np.unique(y_train))):
    idx=np.append(idx,np.random.choice(np.where(y_train[:,0]==i)[0],100,replace=False))

x_train= x_train[idx]
y_train= y_train[idx]

In [6]:
print(x_train.shape)
print(y_train.shape)
print(np.unique(y_train,return_counts=True))

(1000, 32, 32, 3)
(1000, 1)
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100]))
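
The per-class sampling loop above can also be written as a small reusable function. The following is only a sketch: the name `sample_per_class` and the toy labels are assumptions for illustration, and because it uses `np.random.default_rng` it will not reproduce the exact indices drawn in the notebook.

```python
import numpy as np

def sample_per_class(labels, n_per_class, seed=0):
    """Return indices that pick n_per_class random examples of every class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).ravel()  # accepts shape (N,) or (N, 1)
    picked = [
        rng.choice(np.flatnonzero(labels == c), size=n_per_class, replace=False)
        for c in np.unique(labels)
    ]
    return np.concatenate(picked)

# Toy labels standing in for y_train: 10 classes with 50 examples each
toy_labels = np.repeat(np.arange(10), 50).reshape(-1, 1)
toy_idx = sample_per_class(toy_labels, n_per_class=5, seed=222)

# Every class should appear exactly 5 times in the subset
toy_classes, toy_counts = np.unique(toy_labels[toy_idx], return_counts=True)
print(toy_classes, toy_counts)
```

Sampling without replacement inside each class is what guarantees the balanced counts printed above.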


In [7]:
# split into train, validation, and test sets
# first loop over the class labels and sample 10 random images per label for the training set
np.random.seed(seed=123)
idx_train=np.empty(0,dtype=int)
for i in range(0,len(np.unique(y_train))):
    idx_train=np.append(idx_train,np.random.choice(np.where(y_train[:,0]==i)[0],10,replace=False))

x_train_new = x_train[idx_train]
y_train_new = y_train[idx_train]

In [8]:
x_test_new=(np.delete(x_train,idx_train,axis=0))
y_test_new=(np.delete(y_train,idx_train,axis=0))

In [9]:
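# sample 10 images per label for the validation set;
# idx_vaild indexes the 900 rows that remain after removing the training rows
np.random.seed(seed=127)
idx_vaild=np.empty(0,dtype=int)
for i in range(0,len(np.unique(y_test_new))):
    idx_vaild=np.append(idx_vaild,np.random.choice(np.where(y_test_new[:,0]==i)[0],10,replace=False))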

x_vaild_new = x_test_new[idx_vaild]
y_valid_new = y_test_new[idx_vaild]

In [10]:
x_test_new=(np.delete(x_test_new,idx_vaild,axis=0))
y_test_new=(np.delete(y_test_new,idx_vaild,axis=0))

In [11]:
x_train_new = np.reshape(x_train_new, (100,32,32,3))
x_vaild_new = np.reshape(x_vaild_new, (100,32,32,3))
x_test_new = np.reshape(x_test_new, (800,32,32,3))

In [12]:
print(np.unique(y_train_new,return_counts=True))
print(np.unique(y_valid_new,return_counts=True))
print(np.unique(y_test_new,return_counts=True))

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10]))
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10]))
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([80, 80, 80, 80, 80, 80, 80, 80, 80, 80]))
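
The split is built in two steps: training rows are removed first, and the validation indices then refer to positions in the remaining rows rather than in the original array. The sketch below illustrates this with toy row ids (not the notebook data, and without the per-class stratification) and checks that the three resulting sets are disjoint. All names in it are assumptions for illustration.

```python
import numpy as np

n_total = 1000
rng = np.random.default_rng(0)

# Step 1: choose training positions among all 1000 rows
ids_train = rng.choice(n_total, size=100, replace=False)
remaining = np.delete(np.arange(n_total), ids_train)  # 900 original row ids

# Step 2: validation positions index into `remaining`, not into the original rows
pos_valid = rng.choice(len(remaining), size=100, replace=False)
ids_valid = remaining[pos_valid]
ids_test = np.delete(remaining, pos_valid)

# The three sets should not overlap and should cover every row exactly once
all_ids = np.concatenate([ids_train, ids_valid, ids_test])
print(len(ids_train), len(ids_valid), len(ids_test), len(np.unique(all_ids)))
```

Any other array that is aligned row by row with the images has to go through the same two steps in the same order to stay aligned with the labels.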


In [13]:
from keras.utils import to_categorical

y_train_new=to_categorical(y_train_new,10)
y_valid_new=to_categorical(y_valid_new,10)
y_test_new=to_categorical(y_test_new,10)



In [14]:
print(x_train_new.shape)
print(y_train_new.shape)

print(x_vaild_new.shape)
print(y_valid_new.shape)

print(x_test_new.shape)
print(y_test_new.shape)

(100, 32, 32, 3)
(100, 10)
(100, 32, 32, 3)
(100, 10)
(800, 32, 32, 3)
(800, 10)
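
The label arrays now have 10 columns because each integer label was turned into a one-hot row. As a rough illustration of that encoding, the sketch below builds the same kind of array with plain NumPy; the helper name `one_hot` is an assumption and not part of Keras.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels of shape (N,) or (N, 1) to one-hot rows of shape (N, num_classes)."""
    labels = np.asarray(labels).ravel()
    return np.eye(num_classes, dtype="float32")[labels]

toy_y = np.array([[3], [0], [9]])     # same (N, 1) layout as the CIFAR-10 labels
toy_encoded = one_hot(toy_y, 10)
toy_decoded = np.argmax(toy_encoded, axis=1)  # recovers the integer labels

print(toy_encoded.shape)
print(toy_decoded)
```

The `np.argmax(..., axis=1)` call is the inverse step that the evaluation cells use to get back from one-hot rows to class indices.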


In [15]:
# center and standardize the data using the per-pixel mean and standard deviation of the training set
X_mean = np.mean( x_train_new, axis = 0)
X_std = np.std( x_train_new, axis = 0)

x_train_new = (x_train_new - X_mean ) / (X_std + 0.0001)
x_vaild_new = (x_vaild_new - X_mean ) / (X_std + 0.0001)
x_test_new = (x_test_new - X_mean ) / (X_std + 0.0001)
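
The standardization uses statistics of the training set only and applies them unchanged to the validation and test sets. A minimal sketch of this pattern on random toy data (the arrays and the name `eps` are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
toy_train = rng.uniform(0, 255, size=(100, 32, 32, 3))
toy_test = rng.uniform(0, 255, size=(20, 32, 32, 3))

# Statistics are computed per pixel and channel, over the training images only
mean = toy_train.mean(axis=0)
std = toy_train.std(axis=0)
eps = 1e-4  # guards against division by zero for constant pixels

toy_train_std = (toy_train - mean) / (std + eps)
toy_test_std = (toy_test - mean) / (std + eps)  # reuses the training statistics

print(mean.shape, std.shape)
print(np.abs(toy_train_std.mean(axis=0)).max())
```

After this step the training data has zero mean per pixel, while the test data is only approximately centered because its own statistics were never used.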

### Setting up the CNN classifier based on raw image data

In [16]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, BatchNormalization
from keras.layers import Convolution2D, MaxPooling2D, Flatten


In [17]:
# here we define the hyperparameters of the NN
batch_size = 10
nb_classes = 10
nb_epoch = 30
img_rows, img_cols = 32, 32
kernel_size = (3, 3)
input_shape = (img_rows, img_cols, 3)
pool_size = (2, 2)

In [18]:
model = Sequential()

model.add(Convolution2D(8,kernel_size,padding='same',input_shape=input_shape))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Convolution2D(8, kernel_size,padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))

model.add(Convolution2D(16, kernel_size,padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(Convolution2D(16,kernel_size,padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))

model.add(Flatten())
model.add(Dense(40))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(Activation('relu'))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])


In [19]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 32, 32, 8)         224       
_________________________________________________________________
batch_normalization (BatchNo (None, 32, 32, 8)         32        
_________________________________________________________________
activation (Activation)      (None, 32, 32, 8)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 8)         584       
_________________________________________________________________
batch_normalization_1 (Batch (None, 32, 32, 8)         32        
_________________________________________________________________
activation_1 (Activation)    (None, 32, 32, 8)         0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 8)         0
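
The parameter counts in the summary rows shown above can be checked by hand. The sketch below uses the standard formulas for a 2D convolution with bias and for batch normalization; the helper names are assumptions for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of all filters plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels

first_conv = conv2d_params(3, 3, 3, 8)   # RGB input, 8 filters of size 3x3
second_conv = conv2d_params(3, 3, 8, 8)  # 8 input channels, 8 filters of size 3x3
bn = batchnorm_params(8)

print(first_conv, second_conv, bn)
```

These values agree with the `Param #` column for `conv2d`, `conv2d_1`, and `batch_normalization` in the summary.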

In [20]:
# train with the hyperparameters defined above
history=model.fit(x_train_new, y_train_new, 
                  batch_size=batch_size, 
                  epochs=nb_epoch,
                  verbose=2, 
                  validation_data=(x_vaild_new, y_valid_new),shuffle=True)

Epoch 1/30
10/10 - 2s - loss: 2.7014 - accuracy: 0.1300 - val_loss: 2.2885 - val_accuracy: 0.1400
Epoch 2/30
10/10 - 0s - loss: 2.0478 - accuracy: 0.2500 - val_loss: 2.2634 - val_accuracy: 0.1600
Epoch 3/30
10/10 - 0s - loss: 1.7819 - accuracy: 0.3800 - val_loss: 2.2411 - val_accuracy: 0.1400
Epoch 4/30
10/10 - 0s - loss: 1.5334 - accuracy: 0.5200 - val_loss: 2.2143 - val_accuracy: 0.2100
Epoch 5/30
10/10 - 0s - loss: 1.3252 - accuracy: 0.6200 - val_loss: 2.2050 - val_accuracy: 0.1800
Epoch 6/30
10/10 - 0s - loss: 1.2511 - accuracy: 0.6500 - val_loss: 2.2441 - val_accuracy: 0.1600
Epoch 7/30
10/10 - 0s - loss: 1.1097 - accuracy: 0.7200 - val_loss: 2.2666 - val_accuracy: 0.1600
Epoch 8/30
10/10 - 0s - loss: 0.9868 - accuracy: 0.7500 - val_loss: 2.2543 - val_accuracy: 0.1600
Epoch 9/30
10/10 - 0s - loss: 0.7885 - accuracy: 0.8600 - val_loss: 2.2802 - val_accuracy: 0.1400
Epoch 10/30
10/10 - 0s - loss: 0.8120 - accuracy: 0.8400 - val_loss: 2.3055 - val_accuracy: 0.1400
Epoch 11/30
10/10 -
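
The log shows the training accuracy rising while the validation metrics stay flat. As a small aid for the discussion, the sketch below copies the values of the first ten epochs from the log above and reports the epoch with the lowest validation loss and the gap between training and validation accuracy. It only covers the epochs that are shown.

```python
# Values copied from epochs 1-10 of the training log above
val_loss = [2.2885, 2.2634, 2.2411, 2.2143, 2.2050, 2.2441, 2.2666, 2.2543, 2.2802, 2.3055]
train_acc = [0.13, 0.25, 0.38, 0.52, 0.62, 0.65, 0.72, 0.75, 0.86, 0.84]
val_acc = [0.14, 0.16, 0.14, 0.21, 0.18, 0.16, 0.16, 0.16, 0.14, 0.14]

# Epochs are 1-based in the Keras log
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# A growing gap between training and validation accuracy is a sign of overfitting
gap = [round(t - v, 2) for t, v in zip(train_acc, val_acc)]

print("epoch with lowest val_loss:", best_epoch)
print("train/val accuracy gap per epoch:", gap)
```

Within these ten epochs the validation loss is lowest at epoch 5, after which the network keeps fitting the 100 training images without improving on the validation images.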

### Evaluation of the CNN classifier that was trained on raw image data

In [21]:
from sklearn.metrics import confusion_matrix
pred=model.predict(x_test_new)
print(confusion_matrix(np.argmax(y_test_new,axis=1),np.argmax(pred,axis=1)))
print("Acc = " ,np.sum(np.argmax(y_test_new,axis=1)==np.argmax(pred,axis=1))/len(y_test_new))


[[36  6  6  6  3  0  1  1 16  5]
 [ 7 16  5  2  8  4  3  2 18 15]
 [ 3  5 17  7 27  0  8  0 10  3]
 [ 0  9 17 15 14  2 17  0  4  2]
 [ 5  5  8  6 28  2 12  0 13  1]
 [ 5  5 11 13 19  7 10  3  6  1]
 [ 4  4 10 13 22  2 21  1  1  2]
 [ 4  2 10  9 24  4  5 10  9  3]
 [18  2  8  3  5  2  3  2 31  6]
 [ 5 10  7  4 13  3  3  1 19 15]]
Acc =  0.245
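
The reported accuracy can be recomputed from the confusion matrix: it is the sum of the diagonal divided by the total number of test images. The sketch below copies the matrix printed above and also derives the per-class recall, which shows how unevenly the classes are recognized. The variable names are assumptions for illustration.

```python
import numpy as np

# Confusion matrix of the pixel-based CNN, copied from the output above
# (rows: true class, columns: predicted class)
cm_pixels = np.array([
    [36,  6,  6,  6,  3,  0,  1,  1, 16,  5],
    [ 7, 16,  5,  2,  8,  4,  3,  2, 18, 15],
    [ 3,  5, 17,  7, 27,  0,  8,  0, 10,  3],
    [ 0,  9, 17, 15, 14,  2, 17,  0,  4,  2],
    [ 5,  5,  8,  6, 28,  2, 12,  0, 13,  1],
    [ 5,  5, 11, 13, 19,  7, 10,  3,  6,  1],
    [ 4,  4, 10, 13, 22,  2, 21,  1,  1,  2],
    [ 4,  2, 10,  9, 24,  4,  5, 10,  9,  3],
    [18,  2,  8,  3,  5,  2,  3,  2, 31,  6],
    [ 5, 10,  7,  4, 13,  3,  3,  1, 19, 15],
])

accuracy_pixels = np.trace(cm_pixels) / cm_pixels.sum()
recall_pixels = np.diag(cm_pixels) / cm_pixels.sum(axis=1)

print("accuracy:", accuracy_pixels)
print("recall per class:", np.round(recall_pixels, 3))
```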


### Getting the VGG features for CIFAR

In [22]:
# download the precomputed VGG embeddings if they are not available locally
import urllib.request
import os
if not os.path.isfile('cifar_EMB_1000.npz'):
    urllib.request.urlretrieve(
        "https://www.dropbox.com/s/si287al91c1ls0d/cifar_EMB_1000.npz?dl=1",
        "cifar_EMB_1000.npz")
%ls -hl cifar_EMB_1000.npz

-rw-r--r-- 1 root root 18M Mar  8 20:41 cifar_EMB_1000.npz


In [23]:
Data=np.load("cifar_EMB_1000.npz")
vgg_features_cifar = Data["arr_0"]

In [24]:
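# apply the same two-step split as for the images so that features and labels stay aligned
vgg_features_cifar_train = vgg_features_cifar[idx_train]
vgg_features_cifar_test=(np.delete(vgg_features_cifar,idx_train,axis=0))
# idx_vaild refers to the rows that remain after removing the training rows
vgg_features_cifar_valid = vgg_features_cifar_test[idx_vaild]
vgg_features_cifar_test=(np.delete(vgg_features_cifar_test,idx_vaild,axis=0))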


In [25]:
print(vgg_features_cifar_train.shape)
print(vgg_features_cifar_valid.shape)
print(vgg_features_cifar_test.shape)

(100, 4096)
(100, 4096)
(800, 4096)


### Setting up the fully connected NN classifier based on VGG features

In [26]:
model = Sequential()
model.add(Dense(200,batch_input_shape=(None, 4096)))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Dense(200))

#### we still need to add the last layers to get the predictions on the 10 classes
### your code here


####### end of your code ######


model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
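
The `categorical_crossentropy` loss expects the network to output one probability per class. The first model in this notebook obtains these with a `Dense(nb_classes)` layer followed by a softmax activation. As an illustration of what softmax does to the 10 raw outputs, here is a minimal NumPy sketch; the function and the toy logits are assumptions, not part of the exercise code.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
toy_logits = rng.normal(size=(4, 10))  # 4 examples, 10 classes
toy_probs = softmax(toy_logits)

print(toy_probs.sum(axis=1))
print(np.argmax(toy_probs, axis=1))
```

Softmax does not change which class has the largest score, so `np.argmax` on the probabilities gives the same prediction as `np.argmax` on the raw outputs.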


In [27]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 200)               819400    
_________________________________________________________________
batch_normalization_5 (Batch (None, 200)               800       
_________________________________________________________________
dropout_1 (Dropout)          (None, 200)               0         
_________________________________________________________________
activation_6 (Activation)    (None, 200)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 200)               40200     
_________________________________________________________________
dense_4 (Dense)              (None, 10)                2010      
_________________________________________________________________
activation_7 (Activation)    (None, 10)               

In [28]:
history=model.fit(vgg_features_cifar_train, y_train_new, 
                  batch_size=10, 
                  epochs=20,
                  verbose=2, 
                  validation_data=(vgg_features_cifar_valid, y_valid_new),shuffle=True)

Epoch 1/20
10/10 - 1s - loss: 2.3058 - accuracy: 0.2400 - val_loss: 7.8908 - val_accuracy: 0.3400
Epoch 2/20
10/10 - 0s - loss: 1.0309 - accuracy: 0.6500 - val_loss: 5.0176 - val_accuracy: 0.3900
Epoch 3/20
10/10 - 0s - loss: 0.6810 - accuracy: 0.8000 - val_loss: 3.8941 - val_accuracy: 0.4000
Epoch 4/20
10/10 - 0s - loss: 0.5009 - accuracy: 0.8600 - val_loss: 2.7359 - val_accuracy: 0.4700
Epoch 5/20
10/10 - 0s - loss: 0.3253 - accuracy: 0.9100 - val_loss: 2.2944 - val_accuracy: 0.5100
Epoch 6/20
10/10 - 0s - loss: 0.3167 - accuracy: 0.8800 - val_loss: 2.1687 - val_accuracy: 0.5100
Epoch 7/20
10/10 - 0s - loss: 0.2178 - accuracy: 0.9400 - val_loss: 2.1188 - val_accuracy: 0.5100
Epoch 8/20
10/10 - 0s - loss: 0.2724 - accuracy: 0.8900 - val_loss: 2.1433 - val_accuracy: 0.5300
Epoch 9/20
10/10 - 0s - loss: 0.1461 - accuracy: 0.9700 - val_loss: 2.1637 - val_accuracy: 0.5000
Epoch 10/20
10/10 - 0s - loss: 0.1870 - accuracy: 0.9500 - val_loss: 1.9074 - val_accuracy: 0.5400
Epoch 11/20
10/10 -

### Evaluation of the fully connected NN classifier that was trained on VGG features

In [29]:
pred=model.predict(vgg_features_cifar_test)

#### we now want to get the confusion matrix for the predictions on the test data
### your code here


########## end of your code ###############################


[[31  1  5  1  1  1  0  1 34  5]
 [ 1 66  0  0  1  0  0  1  6  5]
 [ 8  1 26  3 23  6  6  1  5  1]
 [ 0  1  1 26  3 17 13  3 10  6]
 [ 0  0  0  0 52  3  7 10  7  1]
 [ 0  0  0 18  5 47  2  8  0  0]
 [ 4  0 16  1 11  2 44  1  1  0]
 [ 0  0  0  3 23  5  0 44  4  1]
 [ 1  1  0  0  0  0  0  2 74  2]
 [ 0 19  0  0  2  1  0  0  7 51]]
Acc =  0.57625
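
Beyond the overall accuracy, the confusion matrix shows which classes are mixed up most often. The sketch below copies the matrix printed above, masks the diagonal, and looks up the largest remaining entry. The class names follow the standard CIFAR-10 label order; the variable names are assumptions for illustration.

```python
import numpy as np

# Standard CIFAR-10 label order
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

# Confusion matrix of the NN on VGG features, copied from the output above
cm_vgg = np.array([
    [31,  1,  5,  1,  1,  1,  0,  1, 34,  5],
    [ 1, 66,  0,  0,  1,  0,  0,  1,  6,  5],
    [ 8,  1, 26,  3, 23,  6,  6,  1,  5,  1],
    [ 0,  1,  1, 26,  3, 17, 13,  3, 10,  6],
    [ 0,  0,  0,  0, 52,  3,  7, 10,  7,  1],
    [ 0,  0,  0, 18,  5, 47,  2,  8,  0,  0],
    [ 4,  0, 16,  1, 11,  2, 44,  1,  1,  0],
    [ 0,  0,  0,  3, 23,  5,  0, 44,  4,  1],
    [ 1,  1,  0,  0,  0,  0,  0,  2, 74,  2],
    [ 0, 19,  0,  0,  2,  1,  0,  0,  7, 51],
])

# Zero the diagonal so that only misclassifications remain
off_diag = cm_vgg.copy()
np.fill_diagonal(off_diag, 0)
true_cls, pred_cls = (int(i) for i in np.unravel_index(np.argmax(off_diag), off_diag.shape))

accuracy_vgg = np.trace(cm_vgg) / cm_vgg.sum()
print("accuracy:", accuracy_vgg)
print("most frequent confusion:", class_names[true_cls], "->", class_names[pred_cls],
      "count:", off_diag[true_cls, pred_cls])
```

With this matrix the most frequent error is airplanes predicted as ships (34 of 80), which is plausible given the similar backgrounds of both classes.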


### Baseline: use the VGG features to train a Random Forest model

In [30]:
from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier()
clf.fit(vgg_features_cifar_train,np.argmax(y_train_new, axis=1))

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=100,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)

In [31]:
from sklearn.metrics import confusion_matrix
pred=clf.predict(vgg_features_cifar_test)
print(confusion_matrix(np.argmax(y_test_new, axis=1), pred))
np.sum(pred==np.argmax(y_test_new, axis=1))/len(np.argmax(y_test_new, axis=1))


[[35  2  4  1  0  0  1  3 27  7]
 [ 2 61  0  0  2  0  0  0  5 10]
 [ 8  2 21 11 22  7  4  3  1  1]
 [ 3  0  8 27  0 18 11  6  2  5]
 [ 0  0 10  6 38  2  4 14  5  1]
 [ 3  0  1 17  2 47  1  9  0  0]
 [ 3  0 19  5  7  5 38  3  0  0]
 [ 1  0  0  6 16  3  0 51  1  2]
 [ 8  4  2  3  0  0  0  0 58  5]
 [ 0 19  0  0  1  0  0  0  4 56]]


0.54
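
When comparing the accuracies, it helps to keep in mind that they are estimated from only 800 test images. The sketch below uses the simple binomial approximation for the standard error of an accuracy. This is a rough guide only: both VGG-based classifiers were evaluated on the same test images, so a paired comparison would be more appropriate for judging their difference. The helper name is an assumption for illustration.

```python
import math

def accuracy_standard_error(acc, n):
    """Binomial approximation of the standard error of an accuracy estimate."""
    return math.sqrt(acc * (1.0 - acc) / n)

n_test = 800
results = [
    ("CNN on raw pixels", 0.245),
    ("NN on VGG features", 0.57625),
    ("RF on VGG features", 0.54),
]

for name, acc in results:
    se = accuracy_standard_error(acc, n_test)
    # acc +/- 2 * se is a rough 95% interval under this approximation
    print(f"{name}: {acc:.3f} +/- {2 * se:.3f}")

se_rf = accuracy_standard_error(0.54, n_test)
se_nn = accuracy_standard_error(0.57625, n_test)
```

Under this approximation the gap between the pixel-based CNN and either VGG-based classifier is many standard errors wide, while the gap between the NN and the RF on VGG features is much smaller relative to its uncertainty.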