# Machine Learning 3 - Support Vector Machines

An SVM classifier looks for a hyperplane (or, for multi-class problems, a set of hyperplanes) that separates the classes while maximizing the margin, i.e. the distance between the decision border and the closest data points.

![SVM](http://scikit-learn.org/stable/_images/sphx_glr_plot_separating_hyperplane_0011.png "Decision border in an SVM")

This separation is generally not possible to achieve in the original data space. Therefore, the first step of the SVM is to implicitly project the data into a high-dimensional (possibly infinite-dimensional) space in which this linear separation can be done. The projection is defined by a kernel: linear, polynomial, or, more commonly, "RBF" (radial basis function).
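To make the effect of the kernel concrete, here is a minimal sketch on a synthetic toy dataset (two concentric rings from `make_circles`, an assumption chosen for illustration and unrelated to the CIFAR data). No straight line can separate the rings, so the linear kernel should do poorly while the RBF kernel should do well.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic toy data (illustration only): two concentric rings
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit one SVM per kernel and record its accuracy on the held-out points
kernel_scores = {}
for kernel in ("linear", "rbf"):
    toy_clf = SVC(kernel=kernel).fit(X_train, y_train)
    kernel_scores[kernel] = toy_clf.score(X_test, y_test)

print(kernel_scores)
```

The only difference between the two models is the `kernel` argument. The data and the optimizer are identical.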

In [1]:
from lab_tools import CIFAR10, evaluate_classifier, get_hog_image

dataset = CIFAR10('./CIFAR10')

Pre-loading training data
Pre-loading test data


**Build a simple SVM** using [the SVC (Support Vector Classification) from sklearn](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC).
**Train** it on the CIFAR dataset.

In [3]:
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Standardize the HOG features: fit the scaler on the training set only,
# then apply the same transform to the test set
scaler = StandardScaler()
scaled_train_hog = scaler.fit_transform(dataset.train['hog'])
scaled_test_hog = scaler.transform(dataset.test['hog'])

# Initialize SVC (Support Vector Classification)
svm_clf = SVC(random_state=42)

# Define the parameter grid for the SVM
param_grid = {
    'kernel': ['linear', 'poly', 'rbf'],
    'C': [0.1, 1, 10],
    'gamma': ['scale', 'auto']
}

# Initialize StratifiedKFold with 5 folds
kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Initialize GridSearchCV with the classifier, parameter grid, StratifiedKFold, and scoring
grid_search = GridSearchCV(svm_clf, param_grid, cv=kf, scoring='accuracy', n_jobs=-1, verbose=2)

# Train the grid search to find the best combination of hyperparameters
grid_search.fit(scaled_train_hog, dataset.train['labels'])

# Get the best estimator (classifier) from the grid search
best_svm_clf = grid_search.best_estimator_

# Now, you can use this best classifier to make predictions on the test set
svm_test_preds = best_svm_clf.predict(scaled_test_hog)
svm_test_accuracy = accuracy_score(dataset.test['labels'], svm_test_preds)

# Print the best parameters and test accuracy
print("Best Hyperparameters:", grid_search.best_params_)
print("SVM Classifier Predictive Performance (Accuracy) on Test Data:", svm_test_accuracy)


Fitting 5 folds for each of 18 candidates, totalling 90 fits
Best Hyperparameters: {'C': 10, 'gamma': 'auto', 'kernel': 'rbf'}
SVM Classifier Predictive Performance (Accuracy) on Test Data: 0.8323333333333334


In [5]:
print(grid_search.best_score_)

0.8251333333333333


**Explore the classifier**. How many support vectors are there? What are support vectors?

In [None]:
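# Each row is one support vector, with the same columns as the training features
# (here the scaled HOG features, since best_svm_clf was trained on them)
all_support_vectors = best_svm_clf.support_vectors_
vectors_per_class = best_svm_clf.n_support_  # Number of support vectors for each class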

# -- Your code here -- #
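As a hint for this question: support vectors are the training points that lie on or inside the margin (or on the wrong side of the border), and they alone determine the decision function. The sketch below inspects them on a small synthetic dataset (`make_blobs`, an assumption for illustration). The same attributes exist on any fitted `SVC`.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class toy data (illustration only, not the CIFAR features)
X_toy, y_toy = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

toy_svm = SVC(kernel="rbf", C=1.0).fit(X_toy, y_toy)

# support_vectors_ has one row per support vector and one column per input feature
print("Support vector matrix shape:", toy_svm.support_vectors_.shape)
# n_support_ gives the count per class, in the order of toy_svm.classes_
print("Support vectors per class:", dict(zip(toy_svm.classes_, toy_svm.n_support_)))
# support_ holds the indices of the support vectors in the training set
print("Fraction of training points kept:", len(toy_svm.support_) / len(X_toy))
print("Rows match training points:",
      np.allclose(X_toy[toy_svm.support_], toy_svm.support_vectors_))
```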


**Try to find the best "C" (error penalty) and "gamma" parameters** using cross-validation. What influence does "C" have on the number of support vectors?

In [None]:

# -- Your code here -- #
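One way to look at the influence of "C" is to fit the same model with several values and count the support vectors each time. The sketch below does this on a noisy synthetic dataset (`make_moons`, an assumption for illustration). A small "C" tolerates many margin violations, so the margin is wide and many points end up as support vectors. A large "C" penalizes violations heavily, which typically leaves fewer support vectors.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Noisy, overlapping toy data (illustration only)
X_moons, y_moons = make_moons(n_samples=300, noise=0.3, random_state=0)

# Fit an RBF SVM for each value of C and count its support vectors
sv_counts = {}
for C in (0.01, 1, 100):
    moon_svm = SVC(kernel="rbf", C=C, gamma="scale").fit(X_moons, y_moons)
    sv_counts[C] = int(moon_svm.n_support_.sum())

print(sv_counts)
```

The same loop can be run on the scaled HOG features, at the cost of one full SVM fit per value of "C".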


# Comparing algorithms

Using the best hyper-parameters that you found for each of the algorithms (kNN, Decision Trees, Random Forests, MLP, SVM):

* Re-train the models on the full training set.
* Compare their results on the test set.

In [None]:

# -- Your code here -- #
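A possible structure for this comparison is sketched below. It is not a reference solution. The data is a synthetic stand-in from `make_classification`, and every hyper-parameter value is a placeholder to be replaced with the best value you found in the corresponding lab. The models that are sensitive to feature scale are wrapped in a pipeline with a `StandardScaler`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real features (assumption, not the CIFAR HOG data)
X_all, y_all = make_classification(
    n_samples=600, n_features=20, n_informative=10, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.25, stratify=y_all, random_state=0
)

# Placeholder hyper-parameters: replace with the best values from each lab
models = {
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Decision Tree": DecisionTreeClassifier(max_depth=10, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "MLP": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0),
    ),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="auto")),
}

# Re-train each model on the full training set, then score it on the test set
test_scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    test_scores[name] = model.score(X_te, y_te)

# Print the models from best to worst test accuracy
for name, acc in sorted(test_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:>13}: {acc:.3f}")
```

Keeping all models in one dictionary means they are trained and scored on exactly the same split, which makes the accuracies directly comparable.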
