# Support Vector Classifier 

This notebook implements an SVM classifier with hyperparameter optimization using the [scikit-learn](https://scikit-learn.org/stable/index.html) library. The hyperparameters are optimized with a *grid search with 5-fold cross-validation*. The search grid covers two kernels, linear and RBF, and four settings for C (the penalty term), giving eight candidate configurations.

In [1]:
import numpy as np
import pandas as pd
from load_data import loadVectors
from sklearn import svm
from sklearn.model_selection import GridSearchCV

The training and validation sets are merged, because cross-validation creates its own train/test splits.

In [2]:
x_train, y_train, x_validation, y_validation, x_test = loadVectors()
x = pd.concat([x_train, x_validation])
y = np.concatenate((y_train, y_validation))
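
For a classifier, `GridSearchCV` with an integer `cv` uses stratified K-fold splitting, so each held-out fold keeps roughly the class proportions of the full set. The following is a minimal sketch of that splitting on made-up toy labels; `X_toy` and `y_toy` are hypothetical stand-ins, not the vectors returned by `loadVectors()`.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data (hypothetical): 20 samples, two balanced classes
X_toy = np.arange(40).reshape(20, 2)
y_toy = np.array([0] * 10 + [1] * 10)

# Five stratified folds, the same scheme cv=5 uses for classifiers
skf = StratifiedKFold(n_splits=5)

fold_sizes = []
for train_idx, test_idx in skf.split(X_toy, y_toy):
    # Record the size of the training part and the held-out part
    fold_sizes.append((len(train_idx), len(test_idx)))
    # Class counts in the held-out fold
    print(len(train_idx), len(test_idx), np.bincount(y_toy[test_idx]))
```

Every sample lands in exactly one held-out fold, which is why a separate validation set is not needed during the search.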

Finding the best classifier requires defining the hyperparameter grid and the base SVM classifier to tune.

In [3]:
# Hyperparameters to search over
parameters = {
    'kernel': ('linear', 'rbf'), 
    'C':[0.01, 0.1, 1, 10]
}

# Support Vector Classifier
svc = svm.SVC(gamma="scale")

# GridSearchCV evaluates every combination with 5-fold CV and refits the best one
clf = GridSearchCV(svc, parameters, cv=5, verbose=20, n_jobs=-1)
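
As a quick check of the search cost, the grid can be enumerated with `ParameterGrid` before fitting anything. This sketch repeats the `parameters` dictionary from the cell above; the names `candidates` and `n_fits` are introduced here only for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the notebook cell above
parameters = {
    'kernel': ('linear', 'rbf'),
    'C': [0.01, 0.1, 1, 10]
}

# Every kernel/C combination that the grid search will evaluate
candidates = list(ParameterGrid(parameters))

# Each candidate is fitted once per fold (cv=5)
n_fits = len(candidates) * 5

for params in candidates:
    print(params)
print(len(candidates), "candidates,", n_fits, "cross-validation fits")
```

With `refit=True` (the default), one more fit of the best candidate on the full data follows the cross-validation fits.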

The grid search is fitted on the entire labeled dataset.

In [4]:
clf.fit(x, y)

Fitting 5 folds for each of 8 candidates, totalling 40 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done   1 tasks      | elapsed: 15.7min
[Parallel(n_jobs=-1)]: Done   2 tasks      | elapsed: 15.7min
[Parallel(n_jobs=-1)]: Done   3 tasks      | elapsed: 15.7min
[Parallel(n_jobs=-1)]: Done   4 tasks      | elapsed: 15.7min
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed: 22.5min
[Parallel(n_jobs=-1)]: Done   6 tasks      | elapsed: 34.7min
[Parallel(n_jobs=-1)]: Done   7 tasks      | elapsed: 34.7min
[Parallel(n_jobs=-1)]: Done   8 tasks      | elapsed: 34.7min
[Parallel(n_jobs=-1)]: Done   9 tasks      | elapsed: 41.3min
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed: 41.5min
[Parallel(n_jobs=-1)]: Done  11 tasks      | elapsed: 41.6min
[Parallel(n_jobs=-1)]: Done  12 tasks      | elapsed: 48.0min
[Parallel(n_jobs=-1)]: Done  13 tasks      | elapsed: 48.0min
[Parallel(n_jobs=-1)]: Done  14 tasks      | elapsed: 48.1min
[Parallel(n_jobs=-1)]: Done  15 tasks      | elapsed: 53

GridSearchCV(cv=5, error_score='raise-deprecating',
       estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False),
       fit_params=None, iid='warn', n_jobs=-1,
       param_grid={'kernel': ('linear', 'rbf'), 'C': [0.01, 0.1, 1, 10]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',
       scoring=None, verbose=20)

## Grid Search Results

In [6]:
Results = pd.DataFrame(clf.cv_results_)
Results

Unnamed: 0,mean_fit_time,std_fit_time,mean_score_time,std_score_time,param_C,param_kernel,params,split0_test_score,split1_test_score,split2_test_score,...,mean_test_score,std_test_score,rank_test_score,split0_train_score,split1_train_score,split2_train_score,split3_train_score,split4_train_score,mean_train_score,std_train_score
0,271.994743,80.842013,140.115429,41.165375,0.01,linear,"{'C': 0.01, 'kernel': 'linear'}",0.852599,0.854077,0.87523,...,0.835999,0.039666,2,0.999384,0.999384,0.998923,0.998924,0.999693,0.999262,0.000298
1,669.925747,6.411299,91.820483,0.680193,0.01,rbf,"{'C': 0.01, 'kernel': 'rbf'}",0.091743,0.090742,0.09035,...,0.091166,0.000553,8,0.091175,0.091427,0.091371,0.0913,0.091174,0.091289,0.000102
2,109.132209,2.336807,57.637934,0.868455,0.1,linear,"{'C': 0.1, 'kernel': 'linear'}",0.852599,0.854077,0.87646,...,0.835753,0.040377,3,1.0,1.0,1.0,1.0,1.0,1.0,0.0
3,399.578479,4.922437,82.989512,0.912757,0.1,rbf,"{'C': 0.1, 'kernel': 'rbf'}",0.752905,0.741876,0.746159,...,0.733268,0.025531,7,0.762051,0.762198,0.760806,0.767138,0.782809,0.767,0.008196
4,105.444882,2.354912,56.161458,0.579528,1.0,linear,"{'C': 1, 'kernel': 'linear'}",0.852599,0.854077,0.87646,...,0.835753,0.040377,3,1.0,1.0,1.0,1.0,1.0,1.0,0.0
5,205.525481,1.699417,69.266541,0.580346,1.0,rbf,"{'C': 1, 'kernel': 'rbf'}",0.865443,0.841815,0.863553,...,0.832308,0.040471,6,0.939935,0.940126,0.941701,0.943283,0.955641,0.944137,0.005878
6,105.560806,2.304609,56.410568,1.091998,10.0,linear,"{'C': 10, 'kernel': 'linear'}",0.852599,0.854077,0.87646,...,0.835753,0.040377,3,1.0,1.0,1.0,1.0,1.0,1.0,0.0
7,193.551679,13.402285,64.181612,6.691864,10.0,rbf,"{'C': 10, 'kernel': 'rbf'}",0.862385,0.849172,0.870928,...,0.838214,0.038752,1,0.99923,0.999076,0.999077,0.998924,0.999386,0.999139,0.000157
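
The full `cv_results_` table is wide, so it can help to keep only the ranking-related columns and sort by rank. The sketch below runs the same grid on the iris dataset as a stand-in (an assumption, since the notebook's vectors are not available here) and adds a hypothetical `train_minus_test` column. In the table above, for example, the linear kernel reaches a mean training accuracy of 1.0 against a mean test accuracy of about 0.836, and such a gap may indicate overfitting. Recent scikit-learn versions only record training scores when `return_train_score=True` is passed.

```python
import pandas as pd
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

# Stand-in data (iris), not the notebook's vectors
X_demo, y_demo = datasets.load_iris(return_X_y=True)

demo = GridSearchCV(
    svm.SVC(gamma="scale"),
    {'kernel': ('linear', 'rbf'), 'C': [0.01, 0.1, 1, 10]},
    cv=5,
    return_train_score=True,  # needed for the *_train_score columns
)
demo.fit(X_demo, y_demo)

# Keep the columns that matter for comparing candidates
cols = ['param_kernel', 'param_C', 'mean_test_score', 'std_test_score',
        'mean_train_score', 'rank_test_score']
summary = (pd.DataFrame(demo.cv_results_)[cols]
           .sort_values('rank_test_score')
           .reset_index(drop=True))

# Difference between mean training and mean cross-validation accuracy
summary['train_minus_test'] = summary['mean_train_score'] - summary['mean_test_score']
print(summary)
```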


The best hyperparameter settings found are used to predict the classes of the unlabeled test set.

In [7]:
predictions = clf.predict(x_test)

In [8]:
predictions

array([13, 13,  8, ..., 17, 22, 22], dtype=int64)

In [9]:
# One row per test sample; ids start at 1
submission = pd.DataFrame({'id': np.arange(1, len(predictions) + 1), 'label': predictions})
submission.to_csv("submissionSVM.csv", index=False)

In [10]:
clf.best_estimator_

SVC(C=10, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
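
Besides `best_estimator_`, a fitted `GridSearchCV` exposes the winning settings and their mean cross-validation score as `best_params_` and `best_score_`. The sketch below again uses iris as stand-in data (an assumption) and checks that, with the default `refit=True`, calling `predict` on the search object gives the same result as calling it on `best_estimator_`.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

# Stand-in data (iris), not the notebook's vectors
X_demo, y_demo = datasets.load_iris(return_X_y=True)

demo = GridSearchCV(
    svm.SVC(gamma="scale"),
    {'kernel': ('linear', 'rbf'), 'C': [0.01, 0.1, 1, 10]},
    cv=5,
)
demo.fit(X_demo, y_demo)

# Winning hyperparameters and their mean cross-validation accuracy
print(demo.best_params_)
print(demo.best_score_)

# predict() on the search object delegates to the refitted best estimator
same = np.array_equal(demo.predict(X_demo),
                      demo.best_estimator_.predict(X_demo))
print(same)
```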