## Exercise

All three kernel types appear to perform quite well on the handwritten digits data. Nonetheless, their performance can be improved.
1. Split your training data again (using **train_test_split**) to obtain a validation set. Then optimize your model's performance on the validation data by trying different kernels (linear, poly, and rbf), different values of C, different decision function shapes (ovr or ovo), and any other hyperparameters you find useful. The full list of tunable options is at https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html.

**Hint**: You may use the notebook I have uploaded under this lecture as a starting point ($\texttt{exercise-svm-k-classes.ipynb}$). It provides some of the code, and you then have to fill in the rest. You do not have to use it - it is there if you think it might be helpful!

In [5]:
# imports
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import svm
import pandas as pd
import numpy as np

In [15]:
# Load the digits dataset
X, y = load_digits(return_X_y=True)

# We use `train_test_split` to split our data into a train and a test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

# Now split the train data to also obtain validation data
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)
print(X_train.shape, X_val.shape, X_test.shape, y_train.shape, y_val.shape, y_test.shape)

(1437, 64) (360, 64) (1437,) (360,)
(1149, 64) (288, 64) (360, 64) (1149,) (288,) (360,)
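
The two-stage split above can also be done with class proportions preserved in each part. The sketch below is a variation, not the notebook's code: the `stratify` argument and the names `X_dev`, `y_dev`, and `val_counts` are assumptions added for illustration. The resulting sizes should match the shapes printed above, since stratification does not change how many samples go into each part.

```python
from collections import Counter

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# First split: hold out 20% as the test set, keeping class proportions.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Second split: hold out 20% of the remainder as the validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X_dev, y_dev, test_size=0.2, random_state=42, stratify=y_dev
)

print(X_train.shape, X_val.shape, X_test.shape)

# Per-class sample counts in the validation set.
val_counts = Counter(y_val.tolist())
print(sorted(val_counts.items()))
```

With only 288 validation samples, a stratified split helps ensure that every digit is represented when configurations are compared.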


In [25]:
kernels = ['linear', 'poly', 'rbf']
Cs = [0.1, 2, 3, 4, 5, 100]
decision_function = ['ovr','ovo']

results = []

for kernel in kernels:
    for C in Cs:
        for decision_function_shape in decision_function:
            svm_current = svm.SVC(kernel=kernel, C=C, decision_function_shape=decision_function_shape)
            # Fit on the training split only; the validation split is reserved for scoring
            svm_current.fit(X_train, y_train)
            y_val_hat = svm_current.predict(X_val)
            # accuracy_score expects (y_true, y_pred)
            accuracy = accuracy_score(y_val, y_val_hat)

            results.append([accuracy, kernel, C, decision_function_shape])

# Name the columns after what they hold: the second column is the kernel, not a polynomial degree
results = pd.DataFrame(results, columns=['Accuracy', 'Kernel', 'C', 'decision_function'])
print(results)

    Accuracy  Kernel      C decision_function
0   0.993056  linear    0.1               ovr
1   0.993056  linear    0.1               ovo
2   0.993056  linear    2.0               ovr
3   0.993056  linear    2.0               ovo
4   0.993056  linear    3.0               ovr
5   0.993056  linear    3.0               ovo
6   0.993056  linear    4.0               ovr
7   0.993056  linear    4.0               ovo
8   0.993056  linear    5.0               ovr
9   0.993056  linear    5.0               ovo
10  0.993056  linear  100.0               ovr
11  0.993056  linear  100.0               ovo
12  0.986111    poly    0.1               ovr
13  0.986111    poly    0.1               ovo
14  1.000000    poly    2.0               ovr
15  1.000000    poly    2.0               ovo
16  1.000000    poly    3.0

In [29]:
# Extract the best parameter combinations (all rows tied at the maximum validation accuracy)
results[results['Accuracy'] == results['Accuracy'].max()]

    Accuracy  Kernel      C decision_function
14       1.0    poly    2.0               ovr
15       1.0    poly    2.0               ovo
16       1.0    poly    3.0               ovr
17       1.0    poly    3.0               ovo
18       1.0    poly    4.0               ovr
19       1.0    poly    4.0               ovo
20       1.0    poly    5.0               ovr
21       1.0    poly    5.0               ovo
22       1.0    poly  100.0               ovr
23       1.0    poly  100.0               ovo
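
Several configurations tie at 100% validation accuracy, so a single validation set of 288 samples cannot tell them apart. One possible way to break the tie, sketched below, is 5-fold cross-validation over the combined training and validation data. This is an addition rather than part of the original solution: the use of `cross_val_score`, the choice of `cv=5`, and the names `X_dev`, `tied_Cs`, and `cv_summary` are assumptions. The test set is left untouched.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Same first split as in the notebook: X_dev is the training + validation data.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The C values of the polynomial-kernel configurations that tied on the validation set.
tied_Cs = [2, 3, 4, 5, 100]

cv_summary = {}
for C in tied_Cs:
    # 5-fold cross-validated accuracy for this value of C.
    scores = cross_val_score(SVC(kernel="poly", C=C), X_dev, y_dev, cv=5)
    cv_summary[C] = (scores.mean(), scores.std())
    print(f"C={C:>5}: mean accuracy {scores.mean():.4f} (std {scores.std():.4f})")
```

If the means still differ by less than the spread across folds, the configurations are effectively equivalent on this dataset, and a smaller C is a reasonable default because it regularizes more strongly.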


In [30]:
# Build the optimized model from one of the tied best settings
svm_optimized = svm.SVC(kernel="poly", C=100, decision_function_shape='ovr')

# Refit on the combined training and validation data
svm_optimized.fit(np.concatenate([X_train, X_val]), np.concatenate([y_train, y_val]))

# Predict on the held-out test data
y_test_hat_optimized = svm_optimized.predict(X_test)

# Compute accuracy on the test data; accuracy_score expects (y_true, y_pred)
accuracy_optimized = accuracy_score(y_test, y_test_hat_optimized)
print(f'Optimized SVM achieved {round(accuracy_optimized * 100, 1)}% accuracy.')


Optimized SVM achieved 98.6% accuracy.
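
The manual search over kernels and C values could also be written with scikit-learn's `GridSearchCV`, using `PredefinedSplit` so that scoring still happens on the same single validation set. The sketch below is one possible formulation, not the notebook's method. The parameter grid (including `degree` and `gamma`) and the names `test_fold`, `X_dev`, and `search` are assumptions chosen for illustration. `decision_function_shape` is left out of the grid because, as the results table suggests, it does not change the predictions: according to the scikit-learn documentation, `SVC` always trains one-vs-one internally and the option only affects the shape of the `decision_function` output.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, PredefinedSplit, train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)

# -1 marks rows used only for fitting; 0 marks rows in the single validation fold.
test_fold = np.concatenate(
    [np.full(len(X_train), -1), np.zeros(len(X_val), dtype=int)]
)
X_dev = np.concatenate([X_train, X_val])
y_dev = np.concatenate([y_train, y_val])

# Illustrative grid: each kernel only gets the parameters that apply to it.
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10, 100]},
    {"kernel": ["poly"], "C": [0.1, 1, 10, 100], "degree": [2, 3, 4]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10, 100], "gamma": ["scale", 0.001, 0.01]},
]

search = GridSearchCV(
    SVC(), param_grid, cv=PredefinedSplit(test_fold), scoring="accuracy"
)
# With the default refit=True, the best configuration is retrained on all of X_dev.
search.fit(X_dev, y_dev)

print(search.best_params_)
print(f"Validation accuracy: {search.best_score_:.4f}")
print(f"Test accuracy: {search.score(X_test, y_test):.4f}")
```

Because the default refit step retrains on the training and validation data together, `search.score(X_test, y_test)` plays the same role as the final test evaluation above.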
