<a href="https://colab.research.google.com/github/nourhanOfTerra/CIFAR-10/blob/main/CIFAR10_SVM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#CIFAR-10 Classification Model: Support Vector Machines

The overall guide: https://towardsdatascience.com/multiclass-classification-using-k-nearest-neighbours-ca5281a9ef76

##Mounting Google Drive

In [1]:
from google.colab import drive
drive.mount('/content/drive')
import os
os.chdir('/content/drive/MyDrive/CIFAR-10')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


##Importing the necessary libraries

###General Libraries

In [3]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

###Model Specific Libraries

In [12]:
from sklearn import preprocessing
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.experimental import enable_halving_search_cv
from sklearn.model_selection import HalvingGridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

##Importing the data from csv files to variables

In [5]:
df_extracted_features = pd.read_csv('extracted features.csv')
df_labels = pd.read_csv('labels.csv')

The DataFrames are now converted to NumPy arrays, which are easier to work with in the scikit-learn functions used below.

In [6]:
extracted_features = df_extracted_features.to_numpy()
labels = df_labels.to_numpy()

##Dividing the data into training, validation and testing sets

In [7]:
standard_extracted_features = StandardScaler().fit_transform(extracted_features)        # Necessary standardisation step
X_train, X_test, y_train, y_test = train_test_split(standard_extracted_features, labels, test_size = 0.2)

In [8]:
y_train = np.ravel(y_train, order = 'C')
y_test = np.ravel(y_test, order = 'C')
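
The cells above standardise all samples before splitting them. A common alternative, sketched here on synthetic stand-in data rather than the real features, fits the scaler on the training rows only, so that the test rows do not influence the mean and variance used for scaling. The names `X_demo`, `y_demo` and the `*_raw` / `*_std` variables are assumptions made for this illustration and are not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the extracted features and labels (assumption).
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(200, 4))
y_demo = rng.integers(0, 10, size=200)

# Split first, then fit the scaler on the training rows only.
X_train_raw, X_test_raw, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)
scaler = StandardScaler().fit(X_train_raw)

# Apply the same training-set mean and variance to both subsets.
X_train_std = scaler.transform(X_train_raw)
X_test_std = scaler.transform(X_test_raw)

print(X_train_std.mean(axis=0).round(6))  # close to 0 for every feature
print(X_test_std.mean(axis=0).round(3))   # close to 0, but not exactly 0
```

The results reported in this notebook come from the original cells, not from this variant.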

##Training and Cross Validation
The initial plan was to use GridSearchCV to optimize the hyperparameters, running the fits in parallel on all available processor cores to save time. The model is then trained on the training data with the best hyperparameters found.

Guide: https://www.geeksforgeeks.org/svm-hyperparameter-tuning-using-gridsearchcv-ml/

It turns out that GridSearchCV is very expensive here: it fits every parameter combination on the full training set for every fold, so the search takes a very long time even with all cores in use. I will therefore use HalvingGridSearchCV instead, which the source below reports to be about 20 times faster. This approach evaluates all parameter combinations on only a small part of the dataset, then keeps the best-performing combinations and re-evaluates them on a larger subset. These two steps are repeated until the best hyperparameters are selected.

Source: https://towardsdatascience.com/20x-times-faster-grid-search-cross-validation-19ef01409b7c
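
The schedule of such a search can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of the idea, not scikit-learn's actual implementation; `halving_schedule` is a hypothetical helper. It assumes 50 candidates (5 values of C, 5 of gamma and 2 kernels in the grid used in the next cell), 48000 training samples and `factor = 3`, which are the values reported in the search log further down.

```python
import math

def halving_schedule(n_candidates, max_resources, factor=3):
    """Return (n_candidates, n_samples) for each round of a halving search."""
    # Rounds needed so that at most `factor` candidates reach the last round.
    n_rounds = 1
    while factor ** n_rounds <= n_candidates:
        n_rounds += 1

    # Start small enough that the last round uses almost all samples.
    min_resources = max_resources // factor ** (n_rounds - 1)

    schedule = []
    for i in range(n_rounds):
        schedule.append((n_candidates, min_resources * factor ** i))
        # Only the best 1/factor of the candidates survive to the next round.
        n_candidates = math.ceil(n_candidates / factor)
    return schedule

for i, (cands, samples) in enumerate(halving_schedule(50, 48000)):
    print(f"iter {i}: {cands} candidates on {samples} samples")
```

Each round multiplies the sample count by `factor` and divides the number of surviving candidates by it, so only a handful of combinations are ever fitted on nearly the full training set.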

Note that the upcoming block of code needs to run only once.


In [15]:
#parameters = {'C': [0.1, 1, 10, 100, 1000], 'gamma': [1, 0.1, 0.01, 0.001, 0.0001], 'kernel': ['rbf', 'poly']}
#model = HalvingGridSearchCV(svm.SVC(), parameters, verbose = 3, n_jobs = -1, factor = 3)
#model.fit(X_train, y_train)
#print(model.best_params_)
#print()
#print(model.best_estimator_)

n_iterations: 4
n_required_iterations: 4
n_possible_iterations: 4
min_resources_: 1777
max_resources_: 48000
aggressive_elimination: False
factor: 3
----------
iter: 0
n_candidates: 50
n_resources: 1777
Fitting 5 folds for each of 50 candidates, totalling 250 fits
----------
iter: 1
n_candidates: 17
n_resources: 5331
Fitting 5 folds for each of 17 candidates, totalling 85 fits
----------
iter: 2
n_candidates: 6
n_resources: 15993
Fitting 5 folds for each of 6 candidates, totalling 30 fits
----------
iter: 3
n_candidates: 2
n_resources: 47979
Fitting 5 folds for each of 2 candidates, totalling 10 fits




{'C': 100, 'gamma': 0.001, 'kernel': 'rbf'}

SVC(C=100, gamma=0.001)


According to the result from the HalvingGridSearchCV, the optimum hyperparameters are:

C = 100

gamma = 0.001

kernel = 'rbf'

The downside of this method is that these values are only the best among the candidates listed manually in the parameter dictionary in the code block above, so better options outside that grid may remain unexplored.

In [17]:
model = svm.SVC(C = 100, kernel = 'rbf', gamma = 0.001)
model.fit(X_train, y_train)

SVC(C=100, gamma=0.001)

##Testing

In [18]:
y_predicted = model.predict(X_test)
confused = confusion_matrix(y_test, y_predicted)
print("The confusion matrix: ")
print(confused)
report = classification_report(y_test, y_predicted)
print("\nThe classification report: ")
print(report)
acc_score = accuracy_score(y_test, y_predicted)
print("\nTest Accuracy: ", acc_score)

The confusion matrix: 
[[695  71  67  24  40  15  21  27 147  60]
 [ 55 780  11  21  18  11  20  20  53 175]
 [ 93  29 549 104 141  83  77  54  22  23]
 [ 35  31 125 518  83 207 108  61  22  48]
 [ 60   8 161  89 631  57  71  96  19  18]
 [ 13  24 136 241  82 527  58  94  14  22]
 [ 18  32  96 133 114  57 700  19  19  22]
 [ 28  22  64  86  99  95  21 709  10  42]
 [131  64  25  26  21  14   8   6 824  58]
 [ 62 216  11  37  26  17  17  39  74 773]]
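
In scikit-learn's convention the rows of this matrix are the true classes and the columns are the predicted ones. As a rough sketch of how the per-class figures in the classification report relate to it, the snippet below recomputes support, recall and precision directly from the matrix printed above. The variable names are chosen for this illustration only.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows: true class, columns: predicted class).
cm = np.array([
    [695,  71,  67,  24,  40,  15,  21,  27, 147,  60],
    [ 55, 780,  11,  21,  18,  11,  20,  20,  53, 175],
    [ 93,  29, 549, 104, 141,  83,  77,  54,  22,  23],
    [ 35,  31, 125, 518,  83, 207, 108,  61,  22,  48],
    [ 60,   8, 161,  89, 631,  57,  71,  96,  19,  18],
    [ 13,  24, 136, 241,  82, 527,  58,  94,  14,  22],
    [ 18,  32,  96, 133, 114,  57, 700,  19,  19,  22],
    [ 28,  22,  64,  86,  99,  95,  21, 709,  10,  42],
    [131,  64,  25,  26,  21,  14,   8,   6, 824,  58],
    [ 62, 216,  11,  37,  26,  17,  17,  39,  74, 773],
])

correct = np.diag(cm)                 # correctly classified samples per class
support = cm.sum(axis=1)              # true samples per class (row sums)
recall = correct / support            # correct / all samples of that class
precision = correct / cm.sum(axis=0)  # correct / all predictions of that class

for label in range(cm.shape[0]):
    print(f"class {label}: precision {precision[label]:.2f}, "
          f"recall {recall[label]:.2f}, support {support[label]}")
```

The large off-diagonal entries show which classes are confused with each other most often, for example classes 3 and 5, or classes 1 and 9.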

The classification report: 
              precision    recall  f1-score   support

           0       0.58      0.60      0.59      1167
           1       0.61      0.67      0.64      1164
           2       0.44      0.47      0.45      1175
           3       0.41      0.42      0.41      1238
           4       0.50      0.52      0.51      1210
           5       0.49      0.44      0.46      1211
           6       0.64      0.58      0.61      1210
           7       0.63      0.60      0.62      1176
           8       0.68      0