# Exercise on the value of unsupervised constructed features for training a classifier with few labeled examples: SOLUTION

To get unsupervised constructed features of an image, we can use a pretrained CNN as a feature extractor.

We have done this to extract features from 1000 CIFAR-10 images. As the pretrained CNN we use a VGG16 architecture that was trained on ImageNet data and was the runner-up in the ImageNet classification competition in 2014.

As a check on the quality of the feature representation of the CIFAR-10 data, we train a classifier once on the raw pixel features and once on the VGG features, each time using only 100 labeled images (10 per class). If the VGG features are indeed better than the raw pixel values, we would expect the classifier trained on the VGG features to outperform the one trained on the pixel features.

a) What accuracy would you expect from a classifier that cannot distinguish between the 10 classes and is only guessing?

**Solution: 10%**
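
The 10% figure follows from the classes being balanced: a guess that ignores the image is right with probability 1/10. As a rough illustration (a simulation on synthetic labels, not on the CIFAR-10 data), this can be checked as follows; the names `y_true` and `y_guess` are made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 10
n_samples = 100_000

# Balanced true labels and uniformly random guesses that ignore the input
y_true = rng.integers(0, n_classes, size=n_samples)
y_guess = rng.integers(0, n_classes, size=n_samples)

chance_accuracy = np.mean(y_true == y_guess)
print(f"simulated chance accuracy: {chance_accuracy:.3f}")  # close to 1/10
```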


b) Go through the code which is used to set up, train, and evaluate a CNN classifier using the raw pixel features. Discuss your thoughts on the achieved accuracy (e.g. with your neighbor).

**Solution: At around 20%, the accuracy is better than guessing but still very poor. This is not surprising: the resolution of the images is very low, and it is already quite difficult to distinguish the classes by eye. Moreover, we have very few training examples (only 10 per class), rather poor features (the raw pixel values), and a model with many parameters (around 45k parameters).**

c) Now we use the unsupervised constructed VGG features. We want to check whether these VGG features are good enough to train a classifier with only a few labeled examples and still get a satisfying performance. For this purpose, please complete the code to set up a fully connected NN and run the provided subsequent code to train it and determine its accuracy on the test set. Compare it to the accuracy which we achieve with a RF. Discuss the results (e.g. with your neighbor).

**Solution: For the code completion see below. At more than 55%, the accuracy of the fcNN is much better than that of the CNN trained from scratch on raw pixels, which was around 20%. This implies that the VGG features are quite good and more informative than the raw pixel features. With a RF trained on the VGG features we achieve a similar performance.**



## Imports

In [1]:
%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.image as imgplot
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from pylab import *

import time
import tensorflow as tf

import  tensorflow.keras as keras
import sys
print ("Keras {} TF {} Python {}".format(keras.__version__, tf.__version__, sys.version_info))

Keras 2.4.0 TF 2.3.0 Python sys.version_info(major=3, minor=7, micro=11, releaselevel='final', serial=0)


## CIFAR Data preparation

In [2]:
#download CIFAR data
from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
del [x_test,y_test]

In [3]:
#loop over each class label, sample 100 random images per label, and save the indices used for subsetting
np.random.seed(seed=222)
idx=np.empty(0,dtype="int64")  # int64 so that the index array can hold positions up to 50000
for i in range(0,len(np.unique(y_train))):
    idx=np.append(idx,np.random.choice(np.where((y_train[0:len(y_train)])==i)[0],100,replace=False))

x_train= x_train[idx]
y_train= y_train[idx]

In [4]:
print(x_train.shape)
print(y_train.shape)
print(np.unique(y_train,return_counts=True))

(1000, 32, 32, 3)
(1000, 1)
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100], dtype=int64))
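
The per-class sampling loop above is reused for the train and validation splits further down. One possible way to package it is a small helper, sketched here on synthetic labels; `sample_per_class` and `toy_labels` are hypothetical names that do not appear in the notebook, and the helper is not guaranteed to reproduce the exact indices drawn above.

```python
import numpy as np

def sample_per_class(labels, n_per_class, seed):
    """Return row indices with exactly n_per_class randomly chosen rows per label."""
    rng = np.random.RandomState(seed)
    labels = np.asarray(labels).ravel()
    picked = [
        rng.choice(np.flatnonzero(labels == c), n_per_class, replace=False)
        for c in np.unique(labels)
    ]
    return np.concatenate(picked)

# Synthetic stand-in for y_train: 10 classes, 50 rows each, shape (500, 1)
toy_labels = np.repeat(np.arange(10), 50).reshape(-1, 1)
toy_idx = sample_per_class(toy_labels, n_per_class=5, seed=222)

# Every class should appear exactly 5 times in the sampled subset
print(np.unique(toy_labels[toy_idx], return_counts=True))
```

Sampling without replacement inside each class is what guarantees the perfectly balanced counts printed above.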


In [5]:
#make train, valid and test splits
#loop over each class label, sample 10 random images per label, and save the indices of the training subset
np.random.seed(seed=123)
idx_train=np.empty(0,dtype="int64")
for i in range(0,len(np.unique(y_train))):
    idx_train=np.append(idx_train,np.random.choice(np.where((y_train[0:len(y_train)])==i)[0],10,replace=False))

x_train_new = x_train[idx_train]
y_train_new = y_train[idx_train]

In [6]:
x_test_new=(np.delete(x_train,idx_train,axis=0))
y_test_new=(np.delete(y_train,idx_train,axis=0))

In [7]:
np.random.seed(seed=127)
# idx_valid indexes the rows that remain after the training rows were removed
idx_valid=np.empty(0,dtype="int64")
for i in range(0,len(np.unique(y_test_new))):
    idx_valid=np.append(idx_valid,np.random.choice(np.where((y_test_new[0:len(y_test_new)])==i)[0],10,replace=False))

x_valid_new = x_test_new[idx_valid]
y_valid_new = y_test_new[idx_valid]

In [8]:
x_test_new=(np.delete(x_test_new,idx_valid,axis=0))
y_test_new=(np.delete(y_test_new,idx_valid,axis=0))

In [9]:
x_train_new = np.reshape(x_train_new, (100,32,32,3))
x_valid_new = np.reshape(x_valid_new, (100,32,32,3))
x_test_new = np.reshape(x_test_new, (800,32,32,3))

In [10]:
print(np.unique(y_train_new,return_counts=True))
print(np.unique(y_valid_new,return_counts=True))
print(np.unique(y_test_new,return_counts=True))

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10], dtype=int64))
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10], dtype=int64))
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([80, 80, 80, 80, 80, 80, 80, 80, 80, 80], dtype=int64))


In [11]:
from tensorflow.keras.utils import to_categorical   

y_train_new=to_categorical(y_train_new,10)
y_valid_new=to_categorical(y_valid_new,10)
y_test_new=to_categorical(y_test_new,10)
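
For readers unfamiliar with one-hot encoding, the following NumPy sketch mimics what `to_categorical` does for integer labels (apart from the output dtype) and shows why later cells call `np.argmax(..., axis=1)` to get the integer labels back. The `labels` array is a made-up example.

```python
import numpy as np

labels = np.array([[3], [0], [9]])        # same (n, 1) layout as y_train
one_hot = np.eye(10)[labels.ravel()]      # one row per label with a single 1
recovered = np.argmax(one_hot, axis=1)    # position of the 1 gives the label back

print(one_hot.shape)  # (3, 10)
print(recovered)      # [3 0 9]
```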



In [12]:
print(x_train_new.shape)
print(y_train_new.shape)

print(x_valid_new.shape)
print(y_valid_new.shape)

print(x_test_new.shape)
print(y_test_new.shape)

(100, 32, 32, 3)
(100, 10)
(100, 32, 32, 3)
(100, 10)
(800, 32, 32, 3)
(800, 10)


In [13]:
# center and standardize the data
X_mean = np.mean( x_train_new, axis = 0)
X_std = np.std( x_train_new, axis = 0)

x_train_new = (x_train_new - X_mean ) / (X_std + 0.0001)
x_valid_new = (x_valid_new - X_mean ) / (X_std + 0.0001)
x_test_new = (x_test_new - X_mean ) / (X_std + 0.0001)
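
Note that the mean and standard deviation are computed on the training split only and then reused for the validation and test splits, so no information from held-out images enters the preprocessing. A minimal sketch of the same pattern on small random arrays (the `toy_*` arrays are stand-ins, not the CIFAR-10 data):

```python
import numpy as np

rng = np.random.RandomState(0)
toy_train = rng.randint(0, 256, size=(100, 4, 4, 3)).astype("float64")
toy_test = rng.randint(0, 256, size=(20, 4, 4, 3)).astype("float64")

# Statistics come from the training split only (one mean/std per pixel and channel)
mean = toy_train.mean(axis=0)
std = toy_train.std(axis=0)
eps = 1e-4  # avoids division by zero for pixels that are constant in the training set

train_z = (toy_train - mean) / (std + eps)
test_z = (toy_test - mean) / (std + eps)   # same statistics, different data

print(mean.shape)                            # (4, 4, 3)
print(np.allclose(train_z.mean(axis=0), 0))  # True
```

The test split is only approximately centered after this transformation, which is expected.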

## Baseline 1: use raw images to train a Random Forest model

In [14]:
# reshape images for rf
x_train_rf = x_train_new.reshape(len(x_train_new),32*32*3)
x_valid_rf = x_valid_new.reshape(len(x_valid_new),32*32*3)
x_test_rf = x_test_new.reshape(len(x_test_new),32*32*3)

In [15]:
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()
clf.fit(x_train_rf, np.argmax(y_train_new, axis=1))

RandomForestClassifier()

In [16]:
from sklearn.metrics import confusion_matrix
pred = clf.predict(x_test_rf)
print(confusion_matrix(np.argmax(y_test_new, axis=1), pred))
np.sum(pred == np.argmax(y_test_new, axis=1)) / len(np.argmax(y_test_new, axis=1))


[[34  2  7  4  1  4  2  8 12  6]
 [ 5 26  7  2  7  5  4  2  8 14]
 [ 6  3 14 10 12  5 11  9  7  3]
 [ 5  5  9 15  9  3  9 12  3 10]
 [ 5  5 12  5 23  6  7  9  5  3]
 [ 5  5  8 15  8  4  7 18  4  6]
 [ 2  5  7 13 17  1 16 15  2  2]
 [ 5  3  2  8 25 10  3 17  1  6]
 [11  9  3  6  2  1  0  6 30 12]
 [ 6 14  4  3  4  5  1  6 13 24]]


0.25375
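
The accuracy can also be read directly off the confusion matrix: correct predictions sit on the diagonal, and each row holds the 80 test images of one true class. The sketch below copies the matrix printed above and additionally derives the per-class recall, which is not computed in the notebook.

```python
import numpy as np

# Confusion matrix of the RF on raw pixels, copied from the output above
cm = np.array([
    [34,  2,  7,  4,  1,  4,  2,  8, 12,  6],
    [ 5, 26,  7,  2,  7,  5,  4,  2,  8, 14],
    [ 6,  3, 14, 10, 12,  5, 11,  9,  7,  3],
    [ 5,  5,  9, 15,  9,  3,  9, 12,  3, 10],
    [ 5,  5, 12,  5, 23,  6,  7,  9,  5,  3],
    [ 5,  5,  8, 15,  8,  4,  7, 18,  4,  6],
    [ 2,  5,  7, 13, 17,  1, 16, 15,  2,  2],
    [ 5,  3,  2,  8, 25, 10,  3, 17,  1,  6],
    [11,  9,  3,  6,  2,  1,  0,  6, 30, 12],
    [ 6, 14,  4,  3,  4,  5,  1,  6, 13, 24],
])

accuracy = np.trace(cm) / cm.sum()               # diagonal = correct predictions
recall_per_class = np.diag(cm) / cm.sum(axis=1)  # rows = true classes

print(accuracy)  # 0.25375
print(np.round(recall_per_class, 3))
```

The recall values show that the overall accuracy hides large differences between classes: class 0 is recognized far more often than class 5.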

## Setting up the CNN classifier based on raw image data

In [17]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, BatchNormalization
from tensorflow.keras.layers import Convolution2D, MaxPooling2D, Flatten


In [18]:
# here we define the hyperparameters of the NN
batch_size = 10
nb_classes = 10
nb_epoch = 30
img_rows, img_cols = 32, 32
kernel_size = (3, 3)
input_shape = (img_rows, img_cols, 3)
pool_size = (2, 2)

In [19]:
model = Sequential()

model.add(Convolution2D(8,kernel_size,padding='same',input_shape=input_shape))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Convolution2D(8, kernel_size,padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))

model.add(Convolution2D(16, kernel_size,padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(Convolution2D(16,kernel_size,padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))

model.add(Flatten())
model.add(Dense(40))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(Activation('relu'))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])


In [20]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 32, 32, 8)         224       
_________________________________________________________________
batch_normalization (BatchNo (None, 32, 32, 8)         32        
_________________________________________________________________
activation (Activation)      (None, 32, 32, 8)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 8)         584       
_________________________________________________________________
batch_normalization_1 (Batch (None, 32, 32, 8)         32        
_________________________________________________________________
activation_1 (Activation)    (None, 32, 32, 8)         0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 8)         0
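
The summary output above is cut off, but the parameter counts can be recomputed by hand from the layer definitions in cell 19. The following plain-Python sketch does this under the standard Keras conventions (a bias per filter or unit, four values per channel in batch normalization); the helper names are made up for this illustration. Its first four entries agree with the visible rows of the summary.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # one kernel per (input channel, filter) pair plus one bias per filter
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(in_units, out_units):
    return in_units * out_units + out_units

def batchnorm_params(channels):
    # gamma and beta (trainable) plus moving mean and variance (non-trainable)
    return 4 * channels

flat_units = 8 * 8 * 16  # 32x32 input, two 2x2 poolings, 16 channels

layers = [
    ("conv2d", conv2d_params(3, 3, 3, 8)),
    ("batch_normalization", batchnorm_params(8)),
    ("conv2d_1", conv2d_params(3, 3, 8, 8)),
    ("batch_normalization_1", batchnorm_params(8)),
    ("conv2d_2", conv2d_params(3, 3, 8, 16)),
    ("batch_normalization_2", batchnorm_params(16)),
    ("conv2d_3", conv2d_params(3, 3, 16, 16)),
    ("batch_normalization_3", batchnorm_params(16)),
    ("dense", dense_params(flat_units, 40)),
    ("batch_normalization_4", batchnorm_params(40)),
    ("dense_1", dense_params(40, 10)),
]

for name, n in layers:
    print(f"{name:24s}{n:>8d}")

total = sum(n for _, n in layers)
print(total)  # 46058
```

Almost all parameters sit in the first dense layer (41000 of them), which is one reason why 100 training images are far too few for this model.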

In [21]:
history=model.fit(x_train_new, y_train_new, 
                  batch_size=10, 
                  epochs=30,
                  verbose=2, 
                  validation_data=(x_valid_new, y_valid_new),shuffle=True)

Epoch 1/30
10/10 - 2s - loss: 2.7816 - accuracy: 0.0600 - val_loss: 2.2889 - val_accuracy: 0.1000
Epoch 2/30
10/10 - 1s - loss: 2.1237 - accuracy: 0.2400 - val_loss: 2.2922 - val_accuracy: 0.1100
Epoch 3/30
10/10 - 1s - loss: 1.7232 - accuracy: 0.4100 - val_loss: 2.2888 - val_accuracy: 0.1000
Epoch 4/30
10/10 - 1s - loss: 1.4993 - accuracy: 0.4400 - val_loss: 2.2759 - val_accuracy: 0.1200
Epoch 5/30
10/10 - 1s - loss: 1.3135 - accuracy: 0.6400 - val_loss: 2.2627 - val_accuracy: 0.1500
Epoch 6/30
10/10 - 1s - loss: 1.1685 - accuracy: 0.6900 - val_loss: 2.2608 - val_accuracy: 0.1500
Epoch 7/30
10/10 - 1s - loss: 1.0235 - accuracy: 0.7900 - val_loss: 2.2737 - val_accuracy: 0.1100
Epoch 8/30
10/10 - 1s - loss: 0.9387 - accuracy: 0.8900 - val_loss: 2.2844 - val_accuracy: 0.1000
Epoch 9/30
10/10 - 1s - loss: 0.9045 - accuracy: 0.8100 - val_loss: 2.3088 - val_accuracy: 0.1000
Epoch 10/30
10/10 - 1s - loss: 0.7858 - accuracy: 0.9200 - val_loss: 2.3236 - val_accuracy: 0.1000
Epoch 11/30
10/10 -

### Evaluation of the CNN classifier that was trained on raw image data

In [22]:
from sklearn.metrics import confusion_matrix
pred = model.predict(x_test_new)
print(confusion_matrix(np.argmax(y_test_new,axis=1), np.argmax(pred,axis=1)))
print("Acc = " ,np.sum(np.argmax(y_test_new,axis=1)==np.argmax(pred,axis=1))/len(y_test_new))


[[35  4  2 10  4  0  1  7 13  4]
 [11  9  4  5 11 10  3 10  5 12]
 [ 3  3 12  7 24  9  5  5  9  3]
 [ 3  2 15 20 15  7  2  5  8  3]
 [ 1  1 13  7 28  3  3  5 14  5]
 [ 4  1  6 19 19  6  1 14  9  1]
 [ 3  3 15  7 25  8 12  2  3  2]
 [ 4  1  5 17 20  6  0 16  7  4]
 [ 8  4 11  2  4  6  1  9 29  6]
 [ 6 14  7  3  9  4  1 10  8 18]]
Acc =  0.23125


## Getting the VGG features for CIFAR

In [23]:
# Downloading embeddings
import urllib.request  # the request submodule has to be imported explicitly
import os
if not os.path.isfile('cifar_EMB_1000.npz'):
    urllib.request.urlretrieve(
    "https://www.dropbox.com/s/si287al91c1ls0d/cifar_EMB_1000.npz?dl=1",
    "cifar_EMB_1000.npz")
%ls -hl cifar_EMB_1000.npz

 Volume in drive C is Windows
 Volume Serial Number is 6A65-4827

 Directory of C:\Users\brdd\Documents\GitHub\dl_course_2023\notebooks

07.03.2023  09:59        17'840'468 cifar_EMB_1000.npz
               1 File(s),     17'840'468 bytes
               0 Dir(s), 242'575'511'552 bytes free


In [24]:
Data=np.load("cifar_EMB_1000.npz")
vgg_features_cifar = Data["arr_0"]

In [25]:
vgg_features_cifar_train = vgg_features_cifar[idx_train]
vgg_features_cifar_test=(np.delete(vgg_features_cifar,idx_train,axis=0))
vgg_features_cifar_valid = vgg_features_cifar_test[idx_valid]
vgg_features_cifar_test=(np.delete(vgg_features_cifar_test,idx_valid,axis=0))
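
The feature rows only stay matched to the labels because exactly the same sequence of indexing and deletion is applied here as was applied to the images: `idx_train` refers to the full 1000 rows, while `idx_valid` refers to the 900 rows left after the training rows were removed. A toy sketch of this bookkeeping, with made-up arrays and a hypothetical helper `three_way_split`:

```python
import numpy as np

n = 12
toy_images = np.arange(n) * 10         # stand-in for the rows of x_train
toy_features = np.arange(n) * 10 + 1   # stand-in for the rows of vgg_features_cifar

idx_a = np.array([0, 5, 7])            # indexes the full arrays
idx_b = np.array([1, 2])               # indexes what is left after removing idx_a

def three_way_split(data, first_idx, second_idx):
    """Split rows into (first, second, rest) the same way the notebook does."""
    first = data[first_idx]
    rest = np.delete(data, first_idx, axis=0)
    second = rest[second_idx]
    rest = np.delete(rest, second_idx, axis=0)
    return first, second, rest

img_parts = three_way_split(toy_images, idx_a, idx_b)
feat_parts = three_way_split(toy_features, idx_a, idx_b)

# Each feature row still belongs to the image in the same position
for img, feat in zip(img_parts, feat_parts):
    assert np.array_equal(img + 1, feat)

print([len(p) for p in img_parts])  # [3, 2, 7]
```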


In [26]:
print(vgg_features_cifar_train.shape)
print(vgg_features_cifar_valid.shape)
print(vgg_features_cifar_test.shape)

(100, 4096)
(100, 4096)
(800, 4096)


## Baseline 2: use VGG features to train a Random Forest model

In [27]:
from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier()
clf.fit(vgg_features_cifar_train,np.argmax(y_train_new, axis=1))

RandomForestClassifier()

In [28]:
from sklearn.metrics import confusion_matrix
pred=clf.predict(vgg_features_cifar_test)
print(confusion_matrix(np.argmax(y_test_new, axis=1), pred))
np.sum(pred==np.argmax(y_test_new, axis=1))/len(np.argmax(y_test_new, axis=1))


[[39  2  2  0  2  0  1  1 25  8]
 [ 0 62  1  1  1  1  0  0  3 11]
 [ 6  1 30 13 10  9  5  3  2  1]
 [ 2  1  4 27  0 14 24  3  0  5]
 [ 5  0  9  2 40  3  9  8  2  2]
 [ 1  0  2 27  5 38  3  3  0  1]
 [ 2  0 17  1  5  4 48  2  1  0]
 [ 1  0  1  8 15  4  0 48  1  2]
 [ 5  2  0  2  0  1  1  0 62  7]
 [ 0 18  0  0  0  0  1  0  5 56]]


0.5625

## Setting up the NN classifier based on VGG features

In [29]:
model = Sequential()
model.add(Dense(200,batch_input_shape=(None, 4096)))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Dense(200))

#### we still need to add the last layers to get the predictions on the 10 classes
### your code here

model.add(Dense(nb_classes))
model.add(Activation('softmax'))

####### end of your code ######


model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
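
To make the role of the two completed layers concrete: `Dense(nb_classes)` maps the 200 hidden values to 10 scores, and the softmax activation turns these scores into class probabilities. The NumPy sketch below imitates this with random stand-in weights (`hidden`, `W` and `b` are hypothetical, not the trained values).

```python
import numpy as np

def softmax(logits):
    # subtracting the row maximum keeps the exponentials numerically stable
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.RandomState(0)
hidden = rng.randn(4, 200)       # stand-in for the output of the second Dense(200)
W = rng.randn(200, 10) * 0.05    # hypothetical weights of Dense(nb_classes)
b = np.zeros(10)                 # hypothetical biases

probs = softmax(hidden @ W + b)

print(probs.shape)        # (4, 10)
print(probs.sum(axis=1))  # every row sums to 1
print(W.size + b.size)    # 2010
```

The last number is the parameter count of such a layer, 200 * 10 weights plus 10 biases.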


In [30]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 200)               819400    
_________________________________________________________________
batch_normalization_5 (Batch (None, 200)               800       
_________________________________________________________________
dropout_1 (Dropout)          (None, 200)               0         
_________________________________________________________________
activation_6 (Activation)    (None, 200)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 200)               40200     
_________________________________________________________________
dense_4 (Dense)              (None, 10)                2010      
_________________________________________________________________
activation_7 (Activation)    (None, 10)               

In [31]:
history=model.fit(vgg_features_cifar_train, y_train_new, 
                  batch_size=10, 
                  epochs=20,
                  verbose=2, 
                  validation_data=(vgg_features_cifar_valid, y_valid_new),shuffle=True)

Epoch 1/20
10/10 - 1s - loss: 2.6514 - accuracy: 0.2300 - val_loss: 7.6833 - val_accuracy: 0.2800
Epoch 2/20
10/10 - 0s - loss: 0.9340 - accuracy: 0.6800 - val_loss: 4.9032 - val_accuracy: 0.3500
Epoch 3/20
10/10 - 0s - loss: 0.6010 - accuracy: 0.8100 - val_loss: 3.2697 - val_accuracy: 0.4700
Epoch 4/20
10/10 - 0s - loss: 0.5742 - accuracy: 0.8300 - val_loss: 2.6853 - val_accuracy: 0.4700
Epoch 5/20
10/10 - 0s - loss: 0.4067 - accuracy: 0.8700 - val_loss: 2.3595 - val_accuracy: 0.4900
Epoch 6/20
10/10 - 0s - loss: 0.4187 - accuracy: 0.8800 - val_loss: 2.1491 - val_accuracy: 0.5000
Epoch 7/20
10/10 - 0s - loss: 0.2045 - accuracy: 0.9500 - val_loss: 1.9905 - val_accuracy: 0.4900
Epoch 8/20
10/10 - 0s - loss: 0.2516 - accuracy: 0.9300 - val_loss: 2.1082 - val_accuracy: 0.5300
Epoch 9/20
10/10 - 0s - loss: 0.1757 - accuracy: 0.9500 - val_loss: 1.9797 - val_accuracy: 0.5400
Epoch 10/20
10/10 - 0s - loss: 0.1777 - accuracy: 0.9500 - val_loss: 1.8250 - val_accuracy: 0.5400
Epoch 11/20
10/10 -

### Evaluation of the NN classifier that was trained on VGG features

In [32]:
pred=model.predict(vgg_features_cifar_test)

#### we now want to get the confusion matrix for the predictions on the test data
### your code here

print(confusion_matrix(np.argmax(y_test_new,axis=1),np.argmax(pred,axis=1)))
print("Acc = " ,np.sum(np.argmax(y_test_new,axis=1)==np.argmax(pred,axis=1))/len(y_test_new))

########## end of your code ###############################


[[42  1  3  1  1  2  0  2 23  5]
 [ 1 56  2  0  1  3  0  1  1 15]
 [ 7  0 31  2 13 18  2  3  3  1]
 [ 0  0  3 26  2 30  7  2  4  6]
 [ 0  0  3  5 46  4  1 14  4  3]
 [ 0  0  0 16  5 52  3  4  0  0]
 [ 2  0 25  2 11  8 29  2  1  0]
 [ 1  0  1  5 15  6  0 50  1  1]
 [ 3  1  1  1  0  3  1  2 63  5]
 [ 1 13  0  0  1  1  0  0  2 62]]
Acc =  0.57125
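
To judge how much weight the differences between the four reported accuracies can bear, one can attach a rough standard error to each of them. The sketch below uses the simple binomial approximation for 800 test images; this is an assumption made for illustration, and it ignores that all models are scored on the same test images and that training itself is random.

```python
import math

n_test = 800
accuracy = {
    "RF on raw pixels": 0.25375,
    "CNN on raw pixels": 0.23125,
    "RF on VGG features": 0.5625,
    "fcNN on VGG features": 0.57125,
}

def standard_error(p, n):
    """Binomial standard error of an accuracy p measured on n test examples."""
    return math.sqrt(p * (1 - p) / n)

for name, acc in accuracy.items():
    print(f"{name:22s} acc = {acc:.3f} +/- {standard_error(acc, n_test):.3f}")

gap_vgg_models = accuracy["fcNN on VGG features"] - accuracy["RF on VGG features"]
gap_features = accuracy["RF on VGG features"] - accuracy["RF on raw pixels"]
se_fcnn = standard_error(accuracy["fcNN on VGG features"], n_test)

print(f"fcNN vs RF on VGG features: {gap_vgg_models:.4f}")
print(f"VGG features vs raw pixels (RF): {gap_features:.4f}")
```

Under this approximation the gap between the fcNN and the RF on VGG features is smaller than one standard error, while the gap between VGG features and raw pixels is many times larger.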
