# **Feature extraction using pre-trained ResNet50, dimensionality reduction and SVM classification**

Before running the code, switch the runtime to GPU: **Runtime -> Change runtime type -> GPU**

In [7]:
import numpy as np
from PIL import Image

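# Import everything from tensorflow.keras so that the dataset, Model and ResNet50 come from the same Keras build
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Model
from tensorflow.keras.applications import resnet50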

from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

**lowSampleDataset(X, Y)** keeps a random 5% of the examples so that this example runs faster.

In [8]:
def lowSampleDataset(X, Y):
    # Keep a random 5% of the examples. The same indices are applied to X and Y
    # so that images and labels stay aligned.
    perm = np.random.permutation(X.shape[0])
    keep = perm[:int(X.shape[0] * 0.05)]
    return X[keep], Y[keep]
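
Because `np.random.permutation` draws from NumPy's global random state, each run selects a different 5% of the data, so the accuracy obtained at the end may differ between runs. A minimal sketch of a reproducible variant follows. The helper name `subsample` and its `fraction` and `seed` parameters are hypothetical and not part of the original notebook:

```python
import numpy as np

def subsample(X, Y, fraction=0.05, seed=0):
    """Keep a random `fraction` of the examples, using a fixed seed.

    Hypothetical helper: `fraction` and `seed` are illustrative parameters.
    """
    rng = np.random.default_rng(seed)
    n_keep = int(X.shape[0] * fraction)
    # Draw the indices once so that images and labels stay aligned
    keep = rng.permutation(X.shape[0])[:n_keep]
    return X[keep], Y[keep]

# Toy data standing in for CIFAR10: 200 "images" whose label equals their pixel value
X_demo = np.arange(200).reshape(200, 1, 1, 1).repeat(3, axis=3)
Y_demo = np.arange(200).reshape(200, 1)

Xs, Ys = subsample(X_demo, Y_demo)
Xs_again, Ys_again = subsample(X_demo, Y_demo)
print(Xs.shape, Ys.shape)
```

Calling the helper twice with the same seed returns the same subset, which makes later results comparable across runs.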

**Pre-processing:**

1.   Load the CIFAR10 dataset
2.   Reduce the number of examples
3.   Resize the examples from 32x32 to 224x224, the input size used for ResNet50 below

In [9]:
print("Loading CIFAR10 images ...")
(Xtrain, Ytrain), (Xtest, Ytest) = cifar10.load_data()

print('\tOriginal training set shape: ', Xtrain.shape)
print('\tOriginal testing set shape: ', Xtest.shape)

Xtrain, Ytrain = lowSampleDataset(Xtrain, Ytrain)
Xtest, Ytest = lowSampleDataset(Xtest, Ytest)

# Upscale every 32x32 image to 224x224, the input size used for ResNet50 below
Xtrain = np.array([np.array(Image.fromarray(img).resize((224, 224))) for img in Xtrain])
Xtest = np.array([np.array(Image.fromarray(img).resize((224, 224))) for img in Xtest])

print('\tTraining set shape: ', Xtrain.shape)
print('\tTesting set shape: ', Xtest.shape)

Loading CIFAR10 images ...
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
	Original training set shape:  (50000, 32, 32, 3)
	Original testing set shape:  (10000, 32, 32, 3)
	Training set shape:  (2500, 224, 224, 3)
	Testing set shape:  (500, 224, 224, 3)
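
The shapes above also show why the dataset is reduced before resizing: each image grows from 32x32 to 224x224 pixels, 49 times as many. The arithmetic can be sketched as follows, assuming the arrays stay in `uint8` as produced by PIL; `array_megabytes` is a hypothetical helper used only for this illustration:

```python
import numpy as np

def array_megabytes(shape, dtype):
    """Size in MB (10**6 bytes) of a dense array with the given shape and dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize / 1e6

subsampled = array_megabytes((2500, 224, 224, 3), np.uint8)
full = array_megabytes((50000, 224, 224, 3), np.uint8)
print(f"5% of the training set at 224x224: {subsampled:.1f} MB")
print(f"Full training set at 224x224: {full:.1f} MB")
```

The 5% subset needs about 376 MB, while the full training set at this resolution would need about 7.5 GB before any conversion to floating point.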


**Feature extraction:**

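Load ResNet50 with ImageNet weights and extract features from the second-to-last layer (index -2), the global average pooling layer that feeds the final 1000-class prediction layer. Its output is a 2048-dimensional vector per image.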

In [10]:
print("Loading the ResNet50-ImageNet model ...")
model = resnet50.ResNet50(include_top=True, weights='imagenet', input_shape=(224, 224, 3), classes=1000)
# Cut the network at the second-to-last layer so that predict() returns features
model = Model(inputs=model.input, outputs=model.get_layer(index=-2).output)
#model.summary()

# The pooled output is already 2-D with shape (n_samples, 2048), so no reshape is needed
Xtrain = model.predict(Xtrain)
Xtest = model.predict(Xtest)

print('\tFeatures training shape: ', Xtrain.shape)
print('\tFeatures testing shape: ', Xtest.shape)

Loading the ResNet50-ImageNet model ...
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels.h5
	Features training shape:  (2500, 2048)
	Features testing shape:  (500, 2048)
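
The cell above passes raw 0-255 RGB pixels to the network. The ImageNet weights shipped with Keras were trained on inputs transformed by `resnet50.preprocess_input`, which for ResNet50 converts RGB to BGR and subtracts the per-channel ImageNet mean without rescaling. The sketch below reproduces that transformation in plain NumPy for illustration only. `caffe_style_preprocess` is a hypothetical name; in the notebook itself one would instead call `resnet50.preprocess_input` on a float copy of the images before `model.predict`. Whether doing so changes the accuracy of this notebook was not measured here.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by Keras's "caffe" preprocessing mode
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(images_rgb):
    """Illustrative re-implementation: RGB -> BGR, then subtract the channel means."""
    x = images_rgb.astype(np.float32)  # astype returns a copy, the input is untouched
    x = x[..., ::-1]                   # reverse the channel axis: RGB -> BGR
    return x - IMAGENET_MEAN_BGR       # zero-center each channel, no scaling

# A single 2x2 pure-red image with uint8 pixels, shaped like a batch of one
batch = np.zeros((1, 2, 2, 3), dtype=np.uint8)
batch[..., 0] = 255

out = caffe_style_preprocess(batch)
print(out[0, 0, 0])  # B, G, R values of the first pixel after preprocessing
```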


**Dimensionality reduction:**

The extracted features have 2048 dimensions. To make the classifier cheaper to train, we project them onto 256 dimensions with PCA. The projection is fitted on the training features only and then applied unchanged to the test features:

In [11]:
print("Dimensionality reduction with PCA ...")
pca = PCA(n_components=256)
Xtrain = pca.fit_transform(Xtrain)
Xtest = pca.transform(Xtest)

print('\tFeatures training shape: ', Xtrain.shape)
print('\tFeatures testing shape: ', Xtest.shape)

Dimensionality reduction with PCA ...
	Features training shape:  (2500, 256)
	Features testing shape:  (500, 256)
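
`fit_transform` learns the projection from the training features, while `transform` reuses the training mean and components on the test features, so no information from the test set enters the projection. A minimal NumPy sketch of that idea, using an SVD on random toy data, is shown below. The function names `pca_fit` and `pca_transform` are hypothetical, and scikit-learn's `PCA` does more than this (for example, it can choose among several SVD solvers):

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Center X with the *training* mean and project it onto the components."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 20))  # toy stand-in for the 2048-d features
test = rng.normal(size=(30, 20))

mean, components = pca_fit(train, n_components=5)
train_low = pca_transform(train, mean, components)
test_low = pca_transform(test, mean, components)
print(train_low.shape, test_low.shape)
```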


**Classification:**

Any classifier can be trained on the reduced features. Here we use an SVM with a linear kernel:

In [12]:
print("Classification with Linear SVM ...")
svm = SVC(kernel='linear')
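# CIFAR10 labels have shape (n, 1); ravel flattens them to the 1-D shape scikit-learn expects
svm.fit(Xtrain, np.ravel(Ytrain))
result = svm.predict(Xtest)

# accuracy_score takes the true labels first, then the predictions
acc = accuracy_score(np.ravel(Ytest), result)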
print("\tAccuracy Linear SVM: %0.4f" % acc)

Classification with Linear SVM ...
	Accuracy Linear SVM: 0.7760
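
Accuracy is the fraction of test examples whose predicted label matches the true label. A single number can hide differences between the ten CIFAR10 classes, so a per-class breakdown is often useful. A minimal sketch with made-up labels follows. The arrays are illustrative and are not results from this notebook, and `per_class_accuracy` is a hypothetical helper:

```python
import numpy as np

def per_class_accuracy(y_true, y_pred):
    """Map each class label to the fraction of its examples predicted correctly."""
    y_true = np.ravel(y_true)  # accepts (n, 1) label arrays like those from CIFAR10
    y_pred = np.ravel(y_pred)
    return {int(c): float(np.mean(y_pred[y_true == c] == c)) for c in np.unique(y_true)}

# Made-up labels for three classes, only to exercise the function
y_true = np.array([[0], [0], [1], [1], [2], [2], [2], [2]])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0, 2])

overall = float(np.mean(np.ravel(y_true) == y_pred))
print("overall:", overall)
print("per class:", per_class_accuracy(y_true, y_pred))
```

In the notebook, the same helper could be called as `per_class_accuracy(Ytest, result)`.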
