## Classification task using Support Vector Machines

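* Train an SVM classifier on the MNIST dataset. Since SVM classifiers are binary classifiers, you will need to use one-versus-the-rest to classify all 10 digits. You may want to tune the hyperparameters using small validation sets to speed up the process. What accuracy can you reach?

Note: this notebook uses scikit-learn's `load_digits` dataset (1,797 images of 8×8 pixels) as a smaller stand-in for MNIST, so the accuracies below are not MNIST results.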
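As a minimal sketch of the one-versus-the-rest strategy the exercise asks for, the snippet below wraps a binary `SVC` in scikit-learn's `OneVsRestClassifier`, which fits one classifier per digit. It assumes the small `load_digits` data rather than full MNIST; the split, the default hyperparameters, and the names `ovr_clf`, `n_binary_models` and `ovr_accuracy` are illustrative choices, not tuned values.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo, y_demo = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# One binary SVC per digit: "this digit" versus "all other digits".
ovr_clf = make_pipeline(StandardScaler(), OneVsRestClassifier(SVC(kernel="rbf")))
ovr_clf.fit(X_tr, y_tr)

# make_pipeline names each step after its lowercased class name.
n_binary_models = len(ovr_clf.named_steps["onevsrestclassifier"].estimators_)
ovr_accuracy = ovr_clf.score(X_te, y_te)
print(n_binary_models, ovr_accuracy)
```

The number of fitted binary models equals the number of classes, which is what distinguishes this scheme from the one-vs-one training that a bare `SVC` performs.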

In [1]:
import numpy as np

from sklearn.datasets import load_digits
mnist = load_digits()

In [2]:
mnist.keys()

dict_keys(['data', 'target', 'frame', 'feature_names', 'target_names', 'images', 'DESCR'])

In [3]:
X = mnist["data"]
y = mnist["target"]  # integer class labels 0-9

In [4]:
np.unique(y), X.shape, y.shape

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), (1797, 64), (1797,))

In [5]:
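# SVC handles multiclass problems itself: it trains one-vs-one models internally,
# and decision_function_shape="ovr" (the default) only changes the shape of the decision function.
# For a true one-versus-the-rest scheme, wrap it in sklearn.multiclass.OneVsRestClassifier.
from sklearn.svm import SVC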
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

In [6]:
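# Standardize features to zero mean and unit variance.
# Caveat: fitting the scaler on all of X before cross-validation lets the validation folds influence the scaling.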
sc = StandardScaler()
X_new = sc.fit_transform(X)

In [7]:
from sklearn.model_selection import GridSearchCV

param_grid = [
    {
        "C": [0.1, 0.5, 1, 2, 5, 10],
        "kernel": ["linear", "rbf"],
        # "gamma": [0.01, 0.05, 0.1],  # left out: it did not change performance much
    }
]

svm_param_check = SVC(random_state=42)  # fixed seed for reproducible results

grid_search = GridSearchCV(svm_param_check, param_grid, cv=5, scoring='accuracy', return_train_score=True)

grid_search.fit(X_new, y)

GridSearchCV(cv=5, estimator=SVC(random_state=42),
             param_grid=[{'C': [0.1, 0.5, 1, 2, 5, 10],
                          'kernel': ['linear', 'rbf']}],
             return_train_score=True, scoring='accuracy')

In [8]:
grid_search.best_params_

{'C': 2, 'kernel': 'rbf'}

In [9]:
grid_search.best_score_  

0.954363974001857
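One caveat with the search above is that the scaler was fitted on all of `X` before cross-validation. A hedged alternative is to put the scaler inside a `Pipeline` and search over the pipeline, so each fold is scaled using only its own training part, with a test set held out first. The sketch below assumes the same `load_digits` data; the step names (`scaler`, `svm`), the reduced grid, and the `gamma` values are illustrative assumptions, and the scores it prints are not the ones reported in this notebook.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = load_digits(return_X_y=True)
X_tune, X_holdout, y_tune, y_holdout = train_test_split(
    X_all, y_all, test_size=0.2, random_state=42
)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svm", SVC(kernel="rbf", random_state=42)),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
pipe_grid = {
    "svm__C": [0.5, 2, 10],
    "svm__gamma": ["scale", 0.01, 0.05],
}

pipe_search = GridSearchCV(pipe, pipe_grid, cv=5, scoring="accuracy")
pipe_search.fit(X_tune, y_tune)  # the scaler is refitted inside every fold

print(pipe_search.best_params_)
print(pipe_search.best_score_)                      # cross-validated accuracy
print(pipe_search.score(X_holdout, y_holdout))      # accuracy on untouched rows
```

Because `GridSearchCV` refits the best pipeline on the whole tuning set by default, `pipe_search` can be used directly for prediction afterwards.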

In [10]:
import warnings
# Suppresses all warnings, which also hides any ConvergenceWarning raised by LinearSVC.
warnings.filterwarnings("ignore")

from sklearn.svm import LinearSVC

param_grid2 = [
    {
        "C": [0.1, 0.5, 1, 2, 5, 10],
        "loss": ["hinge", "squared_hinge"],
    }
]

linear_svm_param_check = LinearSVC(random_state=42)  # fixed seed for reproducible results

grid_search2 = GridSearchCV(linear_svm_param_check, param_grid2, cv=5, scoring='accuracy', return_train_score=True)

grid_search2.fit(X_new, y)

GridSearchCV(cv=5, estimator=LinearSVC(random_state=42),
             param_grid=[{'C': [0.1, 0.5, 1, 2, 5, 10],
                          'loss': ['hinge', 'squared_hinge']}],
             return_train_score=True, scoring='accuracy')

In [11]:
grid_search2.best_params_

{'C': 0.1, 'loss': 'squared_hinge'}

In [12]:
grid_search2.best_score_

0.9165413184772516
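The blanket `warnings.filterwarnings("ignore")` in the cell above hides every warning, including any `ConvergenceWarning` that `LinearSVC` may raise. As a hedged alternative, the sketch below records warnings for a single fit and raises `max_iter` instead of silencing them. It reuses the best parameters reported above (`C=0.1`, `loss="squared_hinge"`); the value `max_iter=10000`, the split, and the variable names are assumptions chosen for illustration.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X_lin, y_lin = load_digits(return_X_y=True)
X_lin_tr, X_lin_te, y_lin_tr, y_lin_te = train_test_split(
    X_lin, y_lin, test_size=0.2, random_state=42
)

linear_clf = make_pipeline(
    StandardScaler(),
    LinearSVC(C=0.1, loss="squared_hinge", max_iter=10000, random_state=42),
)

# Record warnings for this fit only, instead of disabling them globally.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    linear_clf.fit(X_lin_tr, y_lin_tr)

convergence_warnings = [
    w for w in caught if issubclass(w.category, ConvergenceWarning)
]
linear_accuracy = linear_clf.score(X_lin_te, y_lin_te)
print(len(convergence_warnings), linear_accuracy)
```

If `convergence_warnings` is non-empty, the solver stopped before converging and the reported accuracy may understate what the model can reach.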

In [13]:
from sklearn.model_selection import train_test_split
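# Split the unscaled data; the pipeline below does the scaling.
# Caveat: the grid searches above used all of X, so the test rows already influenced the chosen hyperparameters.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)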

In [14]:
# The first grid search (SVC) gave the best cross-validation score, so its parameters are used in the pipeline.

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("SVM", SVC(C=grid_search.best_params_['C'], 
                kernel=grid_search.best_params_['kernel'],
                random_state=42))
])

svm_clf.fit(X_train,y_train)

Pipeline(steps=[('scaler', StandardScaler()),
                ('SVM', SVC(C=2, random_state=42))])

In [15]:
# Make predictions on the held-out test set
y_predSVM = svm_clf.predict(X_test)

In [16]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

# Compute performance metrics (precision, recall and F1 are macro-averaged over the 10 classes)
pre_score = precision_score(y_test, y_predSVM, average='macro')
rec_score = recall_score(y_test, y_predSVM, average='macro')
f1 = f1_score(y_test, y_predSVM, average='macro')
cm = confusion_matrix(y_test, y_predSVM)  # rows: true digit, columns: predicted digit

# Display results
print('THE SVM CLASSIFICATION RESULTS ARE:')
print(f'The overall accuracy score is {accuracy_score(y_test,y_predSVM)}')
print(f'The precision score is {pre_score}')
print(f'The recall score is {rec_score}')
print(f'The f1-score is {f1}')
print(f'The confusion matrix is \n {cm}')

THE SVM CLASSIFICATION RESULTS ARE:
The overall accuracy score is 0.9833333333333333
The precision score is 0.9841923932693645
The recall score is 0.9832154776804337
The f1-score is 0.9835399099136002
The confusion matrix is 
 [[33  0  0  0  0  0  0  0  0  0]
 [ 0 28  0  0  0  0  0  0  0  0]
 [ 0  0 33  0  0  0  0  0  0  0]
 [ 0  0  0 33  0  1  0  0  0  0]
 [ 0  0  0  0 46  0  0  0  0  0]
 [ 0  0  0  0  0 46  1  0  0  0]
 [ 0  0  0  0  0  0 35  0  0  0]
 [ 0  0  0  0  1  0  0 32  0  1]
 [ 0  0  1  0  0  0  0  0 29  0]
 [ 0  0  0  0  0  0  0  0  1 39]]
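To make the link between the confusion matrix and the reported scores explicit, the sketch below recomputes accuracy, macro recall and macro precision directly from the matrix printed above. The matrix is copied by hand from that output, and the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows: true digit, columns: predicted digit).
cm_printed = np.array([
    [33,  0,  0,  0,  0,  0,  0,  0,  0,  0],
    [ 0, 28,  0,  0,  0,  0,  0,  0,  0,  0],
    [ 0,  0, 33,  0,  0,  0,  0,  0,  0,  0],
    [ 0,  0,  0, 33,  0,  1,  0,  0,  0,  0],
    [ 0,  0,  0,  0, 46,  0,  0,  0,  0,  0],
    [ 0,  0,  0,  0,  0, 46,  1,  0,  0,  0],
    [ 0,  0,  0,  0,  0,  0, 35,  0,  0,  0],
    [ 0,  0,  0,  0,  1,  0,  0, 32,  0,  1],
    [ 0,  0,  1,  0,  0,  0,  0,  0, 29,  0],
    [ 0,  0,  0,  0,  0,  0,  0,  0,  1, 39],
])

correct = np.diag(cm_printed)

# Recall divides by the row totals (true counts per digit),
# precision by the column totals (predicted counts per digit).
per_class_recall = correct / cm_printed.sum(axis=1)
per_class_precision = correct / cm_printed.sum(axis=0)

accuracy_from_cm = correct.sum() / cm_printed.sum()
macro_recall = per_class_recall.mean()
macro_precision = per_class_precision.mean()

print(accuracy_from_cm, macro_recall, macro_precision)
```

The matrix contains 354 correct predictions out of 360 test samples, which is where the 0.983 accuracy comes from; the six errors are spread over digits 3, 5, 7, 8 and 9.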
