## Grid Search for Support Vector Machine Classifier

#### Link to README section:

https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/blob/main/README.md#support-vector-machine-classifier-vs-final-trained-classifier

#### Citations:

- https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
- https://scikit-learn.org/stable/modules/svm.html

**Motivation:** To compare an SVM model with our fine-tuned CNN classifier, we ran a grid search to find the best hyperparameters for a simple SVM. Because a grid search over the entire training set took too long, we ran it on a subset of the training data instead.

#### 1. Initial Set-Up

The code below imports everything the notebook needs. Note that `GridSearchCV` is imported from `sklearn.model_selection` to run the grid search for our SVM model.

In [None]:
import math
import io

import sklearn
from sklearn.datasets import load_files
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from matplotlib.cbook import flatten
from skimage.transform import resize
from skimage.io import imread
from sklearn.preprocessing import StandardScaler

from sklearn.metrics import accuracy_score
from sklearn.metrics import f1_score
from sklearn.metrics import classification_report, confusion_matrix

#### 2. Load and Preprocess

Load the KDEF face images. This experiment uses the **KDEF dataset**, which is available at:
https://www.kdef.se/

In [None]:
# load KDEF face images
dataset = load_files('face_images', shuffle=True)
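
`load_files` expects one sub-folder per class and returns the raw file contents as bytes, which is why the next step decodes each entry with `io.BytesIO`. The sketch below illustrates this on a throwaway directory; the folder names (`happy`, `sad`) and file names are hypothetical placeholders, not the actual layout of `face_images`.

```python
import tempfile
from pathlib import Path

from sklearn.datasets import load_files

# Build a tiny stand-in directory tree; folder and file names are hypothetical.
with tempfile.TemporaryDirectory() as root:
    for label, filename in [("happy", "a.jpg"), ("sad", "b.jpg")]:
        class_dir = Path(root) / label
        class_dir.mkdir()
        # Placeholder bytes, not a real image
        (class_dir / filename).write_bytes(b"placeholder")

    # Contents are read into memory here, so they survive the directory cleanup
    demo = load_files(root, shuffle=True, random_state=0)

print(demo.target_names)   # class names taken from the sub-folder names
print(type(demo.data[0]))  # raw bytes, one entry per file
print(len(demo.target))    # one integer label per file
```

Each integer in `demo.target` indexes into `demo.target_names`, so the labels used later in the notebook can be mapped back to folder names.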

Next, we apply some basic preprocessing: each image is resized to 224x224x3 and then flattened into a one-dimensional feature vector.

In [None]:
# resize images to 224,224,3 for resnet
# then flatten images
flat_data_arr = []
for i in dataset.data:
  img_array = imread(io.BytesIO(i))
  img_resized = resize(img_array, (224,224,3))
  flat_data_arr.append(img_resized.flatten())
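
To make the size of the resulting feature vectors concrete, the sketch below runs the same resize-and-flatten step on a single synthetic image. The random image and its 100x80 size are assumptions for illustration only; they stand in for a decoded KDEF photo.

```python
import numpy as np
from skimage.transform import resize

# Synthetic RGB image standing in for one decoded photo (size is arbitrary)
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(100, 80, 3), dtype=np.uint8)

# Same two steps as in the loop above: resize, then flatten
resized = resize(fake_image, (224, 224, 3))
flat = resized.flatten()

print(flat.shape)   # one feature per pixel per channel: 224 * 224 * 3
print(flat.dtype)   # resize returns floating-point values
print(flat.nbytes)  # memory used by a single flattened image
```

With 150,528 features per image, memory use and SVM training time grow quickly with the number of images, which is consistent with the motivation for searching on a subset.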

#### 3. Define the Hyperparameter space to perform gridsearch

In [None]:
# parameters for gridsearchCV
parameter_space = [
    {
        'C': [1, 10, 100],
        'kernel': ['rbf', 'poly'],
        'gamma': ['scale', 'auto']
    },
    {
        'C': [1, 10, 100],
        'kernel': ['linear']
    }
]
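
As a rough way to gauge the cost of this search, the sketch below counts the candidate combinations with scikit-learn's `ParameterGrid`. It copies the grid defined above and assumes the 10-fold cross-validation used later in the notebook.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as above, repeated so this snippet runs on its own
parameter_space = [
    {
        'C': [1, 10, 100],
        'kernel': ['rbf', 'poly'],
        'gamma': ['scale', 'auto']
    },
    {
        'C': [1, 10, 100],
        'kernel': ['linear']
    }
]

# 3 * 2 * 2 combinations from the first dict plus 3 from the second
n_candidates = len(ParameterGrid(parameter_space))

# Each candidate is trained once per fold (cv=10 is assumed here)
cv_folds = 10
n_fits = n_candidates * cv_folds

print('candidates:', n_candidates)
print('cross-validation fits:', n_fits)
```

The linear kernel is listed in its own dictionary because `gamma` has no effect on it, which avoids fitting duplicate candidates.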

#### 4. Split the Data

Now we split the data into training and test sets using an 80/20 split. The `StandardScaler` is fit on the training portion only and then applied to the test portion.

In [None]:
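# index that separates the first 80% of the samples from the last 20%
split_idx = math.trunc((len(flat_data_arr)) * 0.8)

# fit the scaler on the training portion only
ss = StandardScaler()
X_train = ss.fit_transform(flat_data_arr[:split_idx])
y_train = dataset.target[:split_idx]

# reuse the training statistics to scale the test portion
X_test = ss.transform(flat_data_arr[split_idx:])
y_test = dataset.target[split_idx:]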
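
The slice-based split above relies on `load_files` having shuffled the data, but it does not guarantee that every class keeps the same proportion in both parts. A possible alternative, shown here only as a sketch on synthetic data, is `train_test_split` with `stratify`. The array sizes and the seven synthetic classes are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 70 samples, 7 classes with 10 samples each
rng = np.random.default_rng(0)
X = rng.normal(size=(70, 12))
y = np.repeat(np.arange(7), 10)

# 80/20 split that preserves the class proportions in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on the training part only, then apply it to the test part
scaler = StandardScaler()
X_tr_scaled = scaler.fit_transform(X_tr)
X_te_scaled = scaler.transform(X_te)

print('train class counts:', np.bincount(y_tr))
print('test class counts: ', np.bincount(y_te))
```

This is not the split the notebook uses; it is one option to consider if the class balance of the test set matters for the comparison with the CNN.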

#### 5. Create SVM classifier and Apply Gridsearch

In [None]:
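print('training...')
model = SVC(random_state=0)
# 10-fold cross-validation over every candidate in parameter_space;
# n_jobs=1 runs the fits sequentially (n_jobs=-1 would use all CPU cores)
grid_clf = GridSearchCV(model, parameter_space, cv=10, n_jobs=1, verbose=3)

# fit every candidate, then refit the best one on all of X_train (refit=True by default)
grid_clf.fit(X_train, y_train)
print('done training')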
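
Because the features were scaled before the grid search, each cross-validation fold is scaled with statistics computed from the whole training set, including the fold being validated. One way to avoid this, sketched below on synthetic data, is to put the scaler and the SVM in a `Pipeline` so the scaler is refit inside every fold. The dataset from `make_classification`, the reduced grid, and `cv=3` are assumptions chosen to keep the example fast; they are not the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small synthetic classification problem standing in for the image features
X_demo, y_demo = make_classification(
    n_samples=120, n_features=20, n_informative=5, n_classes=3, random_state=0
)

# Scaler and classifier are fit together, once per cross-validation fold
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', SVC(random_state=0)),
])

# Pipeline parameters are addressed as <step name>__<parameter name>
pipe_grid = [
    {'svc__C': [1, 10], 'svc__kernel': ['rbf'], 'svc__gamma': ['scale', 'auto']},
    {'svc__C': [1, 10], 'svc__kernel': ['linear']},
]

search = GridSearchCV(pipe, pipe_grid, cv=3, n_jobs=1)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(search.best_score_)
```

With this layout the raw, unscaled features are passed to `fit`, and `search.predict` applies the fitted scaler automatically.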

#### 6. Find the best hyperparameters yielded from gridsearch

In [None]:
# show best results
print('Best parameters found from gridsearchCV:')
print(grid_clf.best_params_)
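
Beyond `best_params_`, a fitted `GridSearchCV` object also exposes `best_score_` and a `cv_results_` dictionary with the mean and standard deviation of the validation score for every candidate. The sketch below lists the top candidates on a small synthetic problem; the data, the reduced grid, and `cv=3` are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small synthetic problem so the search finishes quickly
X_small, y_small = make_classification(
    n_samples=120, n_features=20, n_informative=5, n_classes=3, random_state=0
)

small_grid = {'C': [1, 10, 100], 'kernel': ['rbf', 'linear']}
demo_search = GridSearchCV(SVC(random_state=0), small_grid, cv=3)
demo_search.fit(X_small, y_small)

results = demo_search.cv_results_
# Candidate indices ordered from best rank (1) to worst
order = np.argsort(results['rank_test_score'])

print('best mean CV accuracy:', demo_search.best_score_)
for idx in order[:3]:
    print(
        results['rank_test_score'][idx],
        results['params'][idx],
        round(results['mean_test_score'][idx], 3),
        round(results['std_test_score'][idx], 3),
    )
```

Looking at the runners-up shows whether the best candidate wins by a clear margin or only by an amount comparable to the fold-to-fold standard deviation.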

#### 7. Evaluate the Model and Compute Metrics

In [None]:
# predict on test set
y_pred = grid_clf.predict(X_test)

# compute accuracy
print('Model accuracy:', accuracy_score(y_test, y_pred))

# compute f1 score
f1 = f1_score(y_test, y_pred, average='weighted')
print('F1 score:', f1)

print('confusion matrix:')
print(confusion_matrix(y_test, y_pred))

print('classification report:')
print(classification_report(y_test, y_pred))

<div>
<img src= "https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/raw/e95318dde059ffdb369ad6051b225932dcec0edb/Images/svm-results.png"/>
</div>
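
To help read the metrics above, the following sketch computes the same quantities on a tiny hand-made example. The labels are invented for illustration and are unrelated to the KDEF results.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Invented labels for three classes, two samples per class
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0]

# Fraction of predictions that match the true label (4 of 6 here)
acc_demo = accuracy_score(y_true_demo, y_pred_demo)

# Rows are true classes, columns are predicted classes
cm_demo = confusion_matrix(y_true_demo, y_pred_demo)

# Per-class F1 scores averaged with weights equal to each class's support
f1_demo = f1_score(y_true_demo, y_pred_demo, average='weighted')

print('accuracy:', acc_demo)
print('confusion matrix:')
print(cm_demo)
print('weighted F1:', f1_demo)
```

In the confusion matrix, the diagonal holds the correctly classified samples, and each off-diagonal entry counts samples of the row's class that were predicted as the column's class. With `average='weighted'`, classes with more test samples contribute more to the F1 score.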