## Support Vector Machine Classifier vs Final Trained Classifier

#### Link to Readme section:    

https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/blob/main/README.md#support-vector-machine-classifier-vs-final-trained-classifier

#### Citations:

- https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
- https://scikit-learn.org/stable/modules/svm.html

**Motivation:** Investigate how a simple SVM model (our non-deep-learning baseline) performs compared to our fine-tuned CNN classifier. We expect the SVM, a classical ML model trained on flattened pixel values, to classify emotions less accurately than the CNN.

#### 1. Initial Set-Up

This cell imports everything the notebook needs. We import 'load_files' from 'sklearn.datasets' to read the image files from disk, and 'SVC' from 'sklearn.svm', a support vector classifier that can perform multi-class classification on a dataset.

In [None]:
import math
import io

import sklearn
from sklearn.datasets import load_files
from sklearn.svm import SVC
from matplotlib.cbook import flatten
from skimage.transform import resize
from skimage.io import imread
from sklearn.preprocessing import StandardScaler

from sklearn.metrics import accuracy_score
from sklearn.metrics import f1_score
from sklearn.metrics import classification_report, confusion_matrix

#### 2. Load and Preprocess

Load the KDEF face images. This experiment uses the **KDEF Dataset**, which is available at https://www.kdef.se/.

In [None]:
# load KDEF face images
dataset = load_files('../face_images', shuffle=True)
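
As a rough illustration of what `load_files` returns, the sketch below builds a tiny throwaway folder tree and loads it. The folder names, file names, and byte payloads are hypothetical stand-ins; the actual layout of '../face_images' is assumed to follow the same one-subfolder-per-class convention.

```python
import tempfile
from pathlib import Path

from sklearn.datasets import load_files

# Build a tiny stand-in folder tree: one subfolder per (hypothetical) class.
root = Path(tempfile.mkdtemp())
for label, payload in [("happy", b"\x00\x01"), ("sad", b"\x02\x03")]:
    (root / label).mkdir()
    (root / label / "sample.bin").write_bytes(payload)

demo = load_files(str(root), shuffle=True, random_state=0)

print(demo.target_names)   # class names taken from the subfolder names
print(type(demo.data[0]))  # raw file contents as bytes, not decoded images
print(demo.target)         # one integer label per file, indexing target_names
```

Because the file contents come back as raw bytes, they still have to be decoded into pixel arrays before any resizing can happen.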

Next we apply some basic preprocessing: each image is resized to 224x224x3 and then flattened into a one-dimensional feature vector.

In [None]:
# resize images to 224,224,3 for resnet
# then flatten images
flat_data_arr = []
for i in dataset.data:
  img_array = imread(io.BytesIO(i))
  img_resized = resize(img_array, (224,224,3))
  flat_data_arr.append(img_resized.flatten())
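
To make the size of the resulting feature vectors concrete, here is a small sketch that uses a zero-filled array as a stand-in for one resized image (the real pipeline gets this array from `resize`).

```python
import numpy as np

# Stand-in for one resized image; values are placeholders, only the shape matters.
img_resized = np.zeros((224, 224, 3))

# Flattening turns the 3-D image into a single feature vector.
flat = img_resized.flatten()
print(flat.shape)  # (150528,)

# Stacking per-image vectors gives the 2-D (n_samples, n_features) matrix
# that scikit-learn estimators expect.
X = np.stack([flat, flat])
print(X.shape)  # (2, 150528)
```

Each image therefore contributes 224 * 224 * 3 = 150528 features to the SVM.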

#### 3. Split the Data

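Now we split our data into training and validation sets using an 80/20 split (the validation set is named X_test and y_test in the code). The features are standardized with a 'StandardScaler' that is fit on the training set only and then applied to the validation set.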

In [None]:
# split data 80/20 train/val
split_idx = math.trunc((len(flat_data_arr)) * 0.8)

ss = StandardScaler()
X_train = ss.fit_transform(flat_data_arr[:split_idx])
y_train = dataset.target[:split_idx]

X_test = ss.transform(flat_data_arr[split_idx:])
y_test = dataset.target[split_idx:]
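
An index-based split relies on the shuffle to balance the classes across the two sets. As an alternative sketch (not the method used in this notebook), `train_test_split` with `stratify` keeps the class proportions equal by construction. The data below is synthetic, and the class count and feature count are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(70, 12))    # 70 synthetic samples, 12 features
y_demo = np.repeat(np.arange(7), 10)  # 7 hypothetical classes, 10 samples each

# 80/20 split that preserves the class proportions in both parts.
X_tr, X_va, y_tr, y_va = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)

scaler = StandardScaler()
X_tr = scaler.fit_transform(X_tr)  # statistics learned from training rows only
X_va = scaler.transform(X_va)      # validation rows reuse those statistics

print(np.bincount(y_tr), np.bincount(y_va))
```

Fitting the scaler on the training rows only, as the notebook also does, keeps validation statistics from leaking into training.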

#### 4. Create the SVM Classifier and Apply Grid Search

Now we create the SVM classifier using SVC and train it. Because grid search on the entire training set took too long, we ran it on a smaller subset of the training set to find the best hyperparameters, which we then use to train the model on the entire training set, as seen below.

The code for the gridsearch can be found here: https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/blob/main/svm_experiment/svm_gridsearch.py

In [None]:
# create SVM classifier with the hyperparameters found by gridsearch
print('training...')

model = SVC(C=100, kernel='rbf', gamma='auto', random_state=0)
model.fit(X_train, y_train)

print('done training')
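
The subset-based search described above could look roughly like the sketch below. This is not the linked svm_gridsearch.py script: the parameter grid, subset size, fold count, and synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the scaled training data: 3 well-separated classes.
centers = np.array([[0, 0], [6, 0], [0, 6]])
y_full = np.repeat(np.arange(3), 100)
X_full = centers[y_full] + rng.normal(size=(300, 2))

# Search on a random subset only, to keep the search cheap.
subset = rng.choice(len(X_full), size=90, replace=False)
param_grid = {  # hypothetical grid, not the one used in the project
    "C": [1, 10, 100],
    "gamma": ["scale", "auto"],
    "kernel": ["rbf"],
}
search = GridSearchCV(SVC(random_state=0), param_grid, cv=3)
search.fit(X_full[subset], y_full[subset])

# Retrain on the full training set with the selected hyperparameters.
final_model = SVC(**search.best_params_, random_state=0).fit(X_full, y_full)
print(search.best_params_, final_model.score(X_full, y_full))
```

The trade-off is that hyperparameters tuned on a subset are not guaranteed to be the best ones for the full training set, so the final model should still be checked on held-out data.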

#### 5. Evaluate the Model and Compute Metrics

In [None]:
# predict on test set
y_pred = model.predict(X_test)

# compute accuracy
print('Model accuracy:', accuracy_score(y_test, y_pred))

# compute f1 score
f1 = f1_score(y_test, y_pred, average='weighted')
print('F1 score:', f1)

print('confusion matrix:')
print(confusion_matrix(y_test, y_pred))

print('classification report:')
print(classification_report(y_test, y_pred))
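
Since the F1 score above uses `average='weighted'`, a small worked example may help show what that averaging does: it computes one F1 value per class and weights each by the number of true samples in that class. The labels below are made up for illustration and are not results from this experiment.

```python
import numpy as np
from sklearn.metrics import f1_score

# Made-up labels for three classes with unequal support.
y_true = np.array([0, 0, 0, 0, 1, 1, 2])
y_hat = np.array([0, 0, 1, 0, 1, 2, 2])

per_class_f1 = f1_score(y_true, y_hat, average=None)  # one F1 value per class
support = np.bincount(y_true)                         # true samples per class
manual_weighted = np.sum(per_class_f1 * support) / support.sum()

print(per_class_f1, manual_weighted)
print(f1_score(y_true, y_hat, average="weighted"))    # matches manual_weighted
```

With imbalanced classes, the weighted average is dominated by the larger classes, so the per-class rows of the classification report are worth reading alongside it.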

<div>
<img src= "https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/raw/e95318dde059ffdb369ad6051b225932dcec0edb/Images/svm-results.png"/>
</div>