In this notebook I analyse the CIFAR-10 dataset using the KPCA+LDA method. Before starting the analysis, I explain a few things about KPCA+LDA and about how I will proceed.

The idea behind KPCA+LDA is to first apply KPCA and reduce the dimensionality of the data, so that insignificant information is discarded. Then, in the resulting KPCA space, we apply LDA to extract further information about the data. In the paper of the same name, Jian Yang et al. prove that these two steps are equivalent to KFD, and they regard KPCA+LDA as doing exactly the same thing as Kernel Discriminant Analysis.
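
The two stages can be sketched on toy data as follows. The synthetic dataset from `make_classification`, the kernel settings, and the `*_demo` names are illustrative assumptions, not the configuration used later in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for image data (assumption: 4 classes, 50 features).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=50, n_informative=10,
    n_classes=4, random_state=0,
)

# Stage 1: KPCA maps the 50 input features to 20 kernel principal components.
kpca = KernelPCA(n_components=20, kernel='rbf', gamma=0.01)
X_kpca = kpca.fit_transform(X_demo)

# Stage 2: LDA is fitted in the KPCA space; by default it returns
# min(n_classes - 1, n_features) discriminant axes, here 3.
lda = LinearDiscriminantAnalysis()
X_lda = lda.fit_transform(X_kpca, y_demo)

print(X_demo.shape, X_kpca.shape, X_lda.shape)
```

With 10 classes, as in CIFAR-10, the default LDA output would have at most 9 dimensions.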

As a benchmark I take the accuracy score of the Nearest Neighbor and Nearest Class Centroid classifiers in the original data space, as well as in a PCA space that retains 90% of the variance. I then proceed as with SVMs, applying linear (PCA+LDA), polynomial and RBF kernels in the KPCA+LDA transformation and recording the performance of the two classifiers in each case. To select the hyperparameters (`gamma` for RBF and `coef0` for poly, together with `n_components`), I use a 2D grid search, starting from a very wide range of values and narrowing it according to the results of each pass. For LDA, I leave the output dimension at its default.
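
A coarse-to-fine 2D search of this kind might look like the sketch below. The helper `search_2d`, the synthetic data, and the specific grids are hypothetical and chosen only to keep the example fast; the real searches in this notebook use the loops shown in the cells further down.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline

# Small synthetic problem (assumption), split into train and validation parts.
X_demo, y_demo = make_classification(
    n_samples=240, n_features=30, n_informative=8,
    n_classes=3, random_state=0,
)
X_tr, X_va, y_tr, y_va = train_test_split(X_demo, y_demo, random_state=0)


def search_2d(gammas, n_components_list):
    """Score every (gamma, n_components) pair on the validation split."""
    scores = {}
    for gamma, n_comp in product(gammas, n_components_list):
        model = make_pipeline(
            KernelPCA(n_components=n_comp, kernel='rbf', gamma=gamma),
            LinearDiscriminantAnalysis(),
            NearestCentroid(),
        )
        scores[(gamma, n_comp)] = model.fit(X_tr, y_tr).score(X_va, y_va)
    best = max(scores, key=scores.get)
    return best, scores


# Coarse pass: values spread over several orders of magnitude.
(best_gamma, best_n), coarse = search_2d([1, 1e-1, 1e-2, 1e-3], [5, 10, 20])

# Fine pass: a narrower grid centred on the coarse optimum.
fine_gammas = [best_gamma / 2, best_gamma, best_gamma * 2]
fine_ns = [max(2, best_n - 3), best_n, best_n + 3]
(best_gamma, best_n), fine = search_2d(fine_gammas, fine_ns)

print(f'Selected gamma={best_gamma}, n_components={best_n}')
```

The selection uses only the validation split, so the test set stays untouched until the final comparison.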

# CIFAR-10 analysis using KPCA+LDA
## Dataset import and preprocess
I follow the instructions at https://www.cs.toronto.edu/~kriz/cifar.html to download and load the CIFAR-10 and CIFAR-100 data. I then apply a simple min-max scaling of the data values to $[0, 1]$.

In [16]:
import pickle

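def unpickle(file):
    """Load one CIFAR batch file; the keys of the returned dict are bytes."""
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')  # avoid shadowing built-in `dict`
    return batch

def get_batch_data(filename):
    """Return the (data, labels) pair stored in a single batch file."""
    decoded_data = unpickle(filename)

    return decoded_data[b'data'], decoded_data[b'labels']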

def combine_data_from_files(filenames_list):
    data = []
    labels = []
    
    for filename in filenames_list:
        curr_data, curr_labels = get_batch_data(filename)
        
        data.extend(curr_data)
        labels.extend(curr_labels)
    
    return data, labels

In [17]:
batch_filenames = [ 'cifar10/data_batch_1', 'cifar10/data_batch_2', 'cifar10/data_batch_3',
                    'cifar10/data_batch_4', 'cifar10/data_batch_5']

train_data, train_labels = combine_data_from_files(batch_filenames)
test_data, test_labels = get_batch_data('cifar10/test_batch')

In [20]:
import numpy as np

In [61]:
X, y = np.array(train_data), np.array(train_labels)
X_test, y_test = np.array(test_data), np.array(test_labels)
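
Each row of the data array holds one flattened 32x32 RGB image. According to the dataset page linked above, the first 1024 values are the red channel, the next 1024 the green and the last 1024 the blue, each in row-major order. A minimal sketch of turning such a row back into an image array, using a made-up `row` instead of real data:

```python
import numpy as np

# Hypothetical stand-in for one CIFAR-10 row: 3072 uint8 values.
row = (np.arange(3072) % 256).astype(np.uint8)

# (channel, height, width) -> (height, width, channel), the layout most
# plotting functions expect.
image = row.reshape(3, 32, 32).transpose(1, 2, 0)

print(image.shape)
```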

Because of the size of the dataset, I keep 5,000 samples and split them evenly: 2,500 for training and 2,500 for validation.

In [62]:
from sklearn.model_selection import train_test_split

used_dataset_size = 5000 # amount of samples to retain

X, _, y, _ = train_test_split(X, y, train_size=used_dataset_size) 

X_train, X_val, y_train, y_val = train_test_split(X, y, train_size=.5) # 50-50 split

In [63]:
print(X_train.shape, y_train.shape, X_val.shape, y_val.shape, X_test.shape, y_test.shape)

(2500, 3072) (2500,) (2500, 3072) (2500,) (10000, 3072) (10000,)
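
The split above is random and unseeded, so class counts and results vary from run to run. A possible variant, shown here only as a sketch on synthetic labels, passes `stratify` and `random_state` to `train_test_split` to keep the classes balanced and the split reproducible. The arrays `X_all` and `y_all` are placeholders, and this is not the split that produced the numbers reported in this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_all = rng.random((1000, 8))            # placeholder features
y_all = np.repeat(np.arange(10), 100)    # 10 balanced classes

# Subsample 200 points, keeping the class proportions.
X_sub, _, y_sub, _ = train_test_split(
    X_all, y_all, train_size=200, stratify=y_all, random_state=0)

# 50-50 split into training and validation, again stratified.
X_tr, X_va, y_tr, y_va = train_test_split(
    X_sub, y_sub, train_size=0.5, stratify=y_sub, random_state=0)

print(np.bincount(y_tr), np.bincount(y_va))
```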


Rescale:

In [64]:
X_train, X_val, X_test = X_train / 255, X_val / 255, X_test / 255
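
Dividing by 255 is min-max scaling with the fixed bounds 0 and 255 of 8-bit pixel intensities, rather than bounds estimated from the data. A tiny check on a made-up `pixels` array:

```python
import numpy as np

pixels = np.array([[0, 128, 255], [64, 32, 16]], dtype=np.uint8)

lo, hi = 0.0, 255.0  # fixed intensity bounds, not the observed min and max
scaled = (pixels.astype(np.float64) - lo) / (hi - lo)

print(scaled.min(), scaled.max())
```

Because the bounds are constants, the same transformation applies to the training, validation and test sets without fitting anything.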

## Analysis
I start by applying KNN (with $k=5$) and NearestCentroid in the original space and in a plain PCA space that retains 90% of the variance.

In [65]:
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid
from sklearn.metrics import accuracy_score

In [66]:
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

y_pred = knn.predict(X_test)
print(f'KNN: {accuracy_score(y_pred, y_test)}')

nc = NearestCentroid()
nc.fit(X_train, y_train)

y_pred = nc.predict(X_test)
print(f'NC: {accuracy_score(y_pred, y_test)}')

KNN: 0.2522
NC: 0.2694
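
For reference, the nearest class centroid rule assigns each sample to the class whose mean vector is closest in Euclidean distance. The sketch below reimplements it by hand on toy Gaussian data (an assumption made for illustration) and compares it with `NearestCentroid` at its default settings.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
y_toy = np.repeat([0, 1], 50)
X_query = rng.normal(1.5, 2, (20, 4))

# One centroid per class: the mean of that class's training samples.
classes = np.unique(y_toy)
centroids = np.stack([X_toy[y_toy == c].mean(axis=0) for c in classes])

# Distance from every query point to every centroid, then pick the closest.
dists = np.linalg.norm(X_query[:, None, :] - centroids[None, :, :], axis=2)
manual_pred = classes[dists.argmin(axis=1)]

sk_pred = NearestCentroid().fit(X_toy, y_toy).predict(X_query)
print((manual_pred == sk_pred).all())
```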


In [67]:
from sklearn.decomposition import PCA

pca = PCA(n_components=0.9)

X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_pca, y_train)

y_pred = knn.predict(X_test_pca)
print(f'KNN: {accuracy_score(y_pred, y_test)}')

nc = NearestCentroid()
nc.fit(X_train_pca, y_train)

y_pred = nc.predict(X_test_pca)
print(f'NC: {accuracy_score(y_pred, y_test)}')

KNN: 0.2745
NC: 0.2697


We observe a small improvement in KNN accuracy in the PCA space (from 25.22% to 27.45%), while NC is essentially unchanged.
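
When `n_components` is a float between 0 and 1, `PCA` keeps the smallest number of components whose cumulative explained variance ratio reaches that fraction. A short sketch on synthetic low-rank data (the `latent` and `mixing` matrices are made up for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# 40 observed features driven by 5 latent directions plus small noise.
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 40))
X_demo = latent @ mixing + 0.1 * rng.normal(size=(500, 40))

pca_demo = PCA(n_components=0.9).fit(X_demo)
cumulative = np.cumsum(pca_demo.explained_variance_ratio_)

print(pca_demo.n_components_, cumulative)
```

On the notebook's own data, `pca.n_components_` would report how many of the 3072 dimensions survive the 90% threshold.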

In [76]:
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def kpca_plus_lda(kpca_model, lda_model_in=None):
    X_train_kpca = kpca_model.fit_transform(X_train)
    X_val_kpca = kpca_model.transform(X_val)
    
    lda_model = lda_model_in if lda_model_in is not None else LinearDiscriminantAnalysis()
    
    X_train_lda = lda_model.fit_transform(X_train_kpca, y_train)
    X_val_lda = lda_model.transform(X_val_kpca)
    
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_train_lda, y_train)

    y_pred = knn.predict(X_val_lda)
    knn_acc = accuracy_score(y_pred, y_val)

    nc = NearestCentroid()
    nc.fit(X_train_lda, y_train)

    y_pred = nc.predict(X_val_lda)
    nc_acc = accuracy_score(y_pred, y_val)
    
    return knn_acc, nc_acc

### PCA + LDA
I start by testing the simple PCA + LDA case.

In [72]:
# Reuse X_train_pca / X_test_pca (PCA keeping 90% of the variance) from the cell above.

print('PCA + LDA:')

lda_model = LinearDiscriminantAnalysis()

X_train_lda = lda_model.fit_transform(X_train_pca, y_train)
X_test_lda = lda_model.transform(X_test_pca)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_lda, y_train)

y_pred = knn.predict(X_test_lda)
print(f'KNN: {accuracy_score(y_pred, y_test)}')

nc = NearestCentroid()
nc.fit(X_train_lda, y_train)

y_pred = nc.predict(X_test_lda)
print(f'NC: {accuracy_score(y_pred, y_test)}')

PCA + LDA:
KNN: 0.3072
NC: 0.3544


We already see a substantial improvement over plain PCA: KNN rises from 27.45% to 30.72% and NC from 26.97% to 35.44%.

### RBF Kernel
I first try a wide range of values:

In [79]:
for gamma in [ 1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5 ]:
    for n_components in [ 500, 1000, 1500, 2000, 2500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='rbf', gamma=gamma)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)
        
        print(f'RBF with n_components={n_components}, gamma={gamma}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')


RBF with n_components=500, gamma=1: KNN: 9.56%, NC: 9.56%
RBF with n_components=1000, gamma=1: KNN: 10.08%, NC: 9.44%
RBF with n_components=1500, gamma=1: KNN: 10.68%, NC: 10.08%
RBF with n_components=2000, gamma=1: KNN: 9.36%, NC: 9.36%
RBF with n_components=2500, gamma=1: KNN: 9.36%, NC: 10.48%
RBF with n_components=500, gamma=0.1: KNN: 11.20%, NC: 11.12%
RBF with n_components=1000, gamma=0.1: KNN: 9.72%, NC: 11.20%
RBF with n_components=1500, gamma=0.1: KNN: 10.64%, NC: 10.20%
RBF with n_components=2000, gamma=0.1: KNN: 10.52%, NC: 10.12%
RBF with n_components=2500, gamma=0.1: KNN: 11.08%, NC: 15.32%
RBF with n_components=500, gamma=0.01: KNN: 38.28%, NC: 41.56%
RBF with n_components=1000, gamma=0.01: KNN: 38.12%, NC: 40.64%
RBF with n_components=1500, gamma=0.01: KNN: 36.80%, NC: 38.32%
RBF with n_components=2000, gamma=0.01: KNN: 37.80%, NC: 36.04%
RBF with n_components=2500, gamma=0.01: KNN: 20.44%, NC: 28.76%
RBF with n_components=500, gamma=0.001: KNN: 36.88%, NC: 38.72%
RBF wi

We get very good performance for `n_components=500, gamma=0.01`, with `KNN: 38.28%, NC: 41.56%`. I run one more grid search.
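
One plausible reading of the chance-level accuracies for `gamma >= 0.1` is that the RBF kernel $\exp(-\gamma \lVert x - y \rVert^2)$ collapses towards the identity matrix when squared distances are large, as they tend to be in 3072 dimensions. The sketch below illustrates this on uniform random vectors, which are only a rough stand-in for real images; the exact numbers for CIFAR-10 will differ.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_demo = rng.random((200, 3072))  # hypothetical "pixels" in [0, 1]

mean_offdiag = {}
for gamma in [1, 1e-2, 1e-5]:
    K = rbf_kernel(X_demo, gamma=gamma)
    # Average similarity between distinct samples.
    off_diag = K[~np.eye(len(K), dtype=bool)]
    mean_offdiag[gamma] = off_diag.mean()
    print(f'gamma={gamma}: mean off-diagonal kernel value = {mean_offdiag[gamma]:.3g}')
```

At the large-`gamma` extreme every sample looks dissimilar to every other, and at the small-`gamma` extreme every sample looks almost identical, so intermediate values are where the kernel carries usable structure.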

In [81]:
for gamma in [ 0.005, 0.01, 0.02 ]:
    for n_components in [ 700, 600, 500, 400, 300 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='rbf', gamma=gamma)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'RBF with n_components={n_components}, gamma={gamma}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

RBF with n_components=700, gamma=0.005: KNN: 40.52%, NC: 41.60%
RBF with n_components=600, gamma=0.005: KNN: 39.28%, NC: 41.56%
RBF with n_components=500, gamma=0.005: KNN: 38.92%, NC: 41.08%
RBF with n_components=400, gamma=0.005: KNN: 38.36%, NC: 40.36%
RBF with n_components=300, gamma=0.005: KNN: 38.12%, NC: 40.36%
RBF with n_components=700, gamma=0.01: KNN: 38.84%, NC: 41.72%
RBF with n_components=600, gamma=0.01: KNN: 37.64%, NC: 41.40%
RBF with n_components=500, gamma=0.01: KNN: 38.28%, NC: 41.56%
RBF with n_components=400, gamma=0.01: KNN: 38.00%, NC: 41.04%
RBF with n_components=300, gamma=0.01: KNN: 36.56%, NC: 41.68%
RBF with n_components=700, gamma=0.02: KNN: 29.20%, NC: 35.92%
RBF with n_components=600, gamma=0.02: KNN: 28.88%, NC: 36.40%
RBF with n_components=500, gamma=0.02: KNN: 27.44%, NC: 36.72%
RBF with n_components=400, gamma=0.02: KNN: 31.24%, NC: 37.12%
RBF with n_components=300, gamma=0.02: KNN: 29.16%, NC: 37.40%


In [84]:
for gamma in [ .003, .004, .005, .006 ]:
    for n_components in [ 650, 700, 750, 800 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='rbf', gamma=gamma)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'RBF with n_components={n_components}, gamma={gamma}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

RBF with n_components=650, gamma=0.003: KNN: 38.64%, NC: 41.24%
RBF with n_components=700, gamma=0.003: KNN: 38.80%, NC: 41.24%
RBF with n_components=750, gamma=0.003: KNN: 39.72%, NC: 41.48%
RBF with n_components=800, gamma=0.003: KNN: 39.48%, NC: 41.08%
RBF with n_components=650, gamma=0.004: KNN: 39.96%, NC: 41.72%
RBF with n_components=700, gamma=0.004: KNN: 40.40%, NC: 41.76%
RBF with n_components=750, gamma=0.004: KNN: 40.48%, NC: 41.68%
RBF with n_components=800, gamma=0.004: KNN: 39.24%, NC: 41.56%
RBF with n_components=650, gamma=0.005: KNN: 38.88%, NC: 41.92%
RBF with n_components=700, gamma=0.005: KNN: 40.52%, NC: 41.60%
RBF with n_components=750, gamma=0.005: KNN: 39.72%, NC: 42.20%
RBF with n_components=800, gamma=0.005: KNN: 39.92%, NC: 41.72%
RBF with n_components=650, gamma=0.006: KNN: 38.20%, NC: 41.52%
RBF with n_components=700, gamma=0.006: KNN: 39.60%, NC: 41.64%
RBF with n_components=750, gamma=0.006: KNN: 39.96%, NC: 42.28%
RBF with n_components=800, gamma=0.006: 

For RBF, we get very good performance for `n_components=700, gamma=0.005`, with `KNN: 40.52%, NC: 41.60%`, both above 40%.

### Poly deg 2
I continue with the analysis for polynomial kernels of degree 2.

In [86]:
for coef0 in [ 100, 10, 1, 1e-1, 1e-2, 1e-3 ]:
    for n_components in [ 500, 1000, 1500, 2000, 2500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=2, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 2 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 2 with n_components=500, coef0=100: KNN: 29.44%, NC: 30.40%
Poly deg 2 with n_components=1000, coef0=100: KNN: 25.96%, NC: 27.60%
Poly deg 2 with n_components=1500, coef0=100: KNN: 24.88%, NC: 25.16%
Poly deg 2 with n_components=2000, coef0=100: KNN: 21.32%, NC: 21.52%
Poly deg 2 with n_components=2500, coef0=100: KNN: 14.56%, NC: 21.44%
Poly deg 2 with n_components=500, coef0=10: KNN: 30.80%, NC: 31.88%
Poly deg 2 with n_components=1000, coef0=10: KNN: 29.16%, NC: 30.16%
Poly deg 2 with n_components=1500, coef0=10: KNN: 28.96%, NC: 29.36%
Poly deg 2 with n_components=2000, coef0=10: KNN: 27.24%, NC: 27.80%
Poly deg 2 with n_components=2500, coef0=10: KNN: 14.52%, NC: 22.48%
Poly deg 2 with n_components=500, coef0=1: KNN: 33.08%, NC: 34.68%
Poly deg 2 with n_components=1000, coef0=1: KNN: 32.72%, NC: 34.04%
Poly deg 2 with n_components=1500, coef0=1: KNN: 31.24%, NC: 32.48%
Poly deg 2 with n_components=2000, coef0=1: KNN: 32.28%, NC: 32.40%
Poly deg 2 with n_components=2500, c

* `n_components=1000, coef0=0.1: KNN: 35.28%, NC: 37.04%`
* `n_components=1000, coef0=0.01: KNN: 35.36%, NC: 37.68%`

In [87]:
for coef0 in [ .005, .01, .04, .07, .1, .13 ]:
    for n_components in [ 800, 900, 1000, 1100, 1200 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=2, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 2 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 2 with n_components=800, coef0=0.005: KNN: 34.20%, NC: 38.04%
Poly deg 2 with n_components=900, coef0=0.005: KNN: 35.52%, NC: 37.32%
Poly deg 2 with n_components=1000, coef0=0.005: KNN: 34.84%, NC: 37.72%
Poly deg 2 with n_components=1100, coef0=0.005: KNN: 34.80%, NC: 37.04%
Poly deg 2 with n_components=1200, coef0=0.005: KNN: 35.92%, NC: 36.72%
Poly deg 2 with n_components=800, coef0=0.01: KNN: 34.28%, NC: 37.80%
Poly deg 2 with n_components=900, coef0=0.01: KNN: 35.00%, NC: 37.56%
Poly deg 2 with n_components=1000, coef0=0.01: KNN: 35.36%, NC: 37.68%
Poly deg 2 with n_components=1100, coef0=0.01: KNN: 34.88%, NC: 36.96%
Poly deg 2 with n_components=1200, coef0=0.01: KNN: 35.24%, NC: 36.48%
Poly deg 2 with n_components=800, coef0=0.04: KNN: 34.68%, NC: 37.60%
Poly deg 2 with n_components=900, coef0=0.04: KNN: 35.52%, NC: 37.44%
Poly deg 2 with n_components=1000, coef0=0.04: KNN: 35.52%, NC: 37.16%
Poly deg 2 with n_components=1100, coef0=0.04: KNN: 35.20%, NC: 36.84%
Poly de

Overall there are no large differences among the values above. Some settings give better accuracy for KNN and others for NCC.
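
For context on what `coef0` controls: scikit-learn's polynomial kernel is $(\gamma \langle x, y \rangle + c_0)^d$, and when `gamma` is left at `None` it defaults to `1 / n_features`. The check below, on a small random matrix chosen for illustration, compares a manual evaluation with `polynomial_kernel`.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

rng = np.random.default_rng(0)
X_demo = rng.random((5, 12))

degree, coef0 = 2, 0.01
gamma = 1 / X_demo.shape[1]  # default used when gamma is None

K_manual = (gamma * X_demo @ X_demo.T + coef0) ** degree
K_sklearn = polynomial_kernel(X_demo, degree=degree, gamma=None, coef0=coef0)

print(np.allclose(K_manual, K_sklearn))
```

Since `coef0` is added to the scaled inner product before the power is taken, its effect depends on how large it is relative to $\gamma \langle x, y \rangle$.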

### Poly deg 3
I try the same with polynomials of degree 3.

In [89]:
for coef0 in [ 100, 10, 1, 1e-1, 1e-2, 1e-3 ]:
    for n_components in [ 500, 1000, 1500, 2000, 2500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=3, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 3 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 3 with n_components=500, coef0=100: KNN: 29.48%, NC: 30.68%
Poly deg 3 with n_components=1000, coef0=100: KNN: 26.76%, NC: 27.88%
Poly deg 3 with n_components=1500, coef0=100: KNN: 26.64%, NC: 26.68%
Poly deg 3 with n_components=2000, coef0=100: KNN: 22.84%, NC: 23.12%
Poly deg 3 with n_components=2500, coef0=100: KNN: 14.52%, NC: 21.72%
Poly deg 3 with n_components=500, coef0=10: KNN: 29.40%, NC: 32.28%
Poly deg 3 with n_components=1000, coef0=10: KNN: 30.48%, NC: 31.44%
Poly deg 3 with n_components=1500, coef0=10: KNN: 30.04%, NC: 30.40%
Poly deg 3 with n_components=2000, coef0=10: KNN: 28.32%, NC: 29.64%
Poly deg 3 with n_components=2500, coef0=10: KNN: 15.88%, NC: 22.92%
Poly deg 3 with n_components=500, coef0=1: KNN: 33.44%, NC: 35.48%
Poly deg 3 with n_components=1000, coef0=1: KNN: 34.92%, NC: 35.80%
Poly deg 3 with n_components=1500, coef0=1: KNN: 33.36%, NC: 34.32%
Poly deg 3 with n_components=2000, coef0=1: KNN: 32.60%, NC: 33.84%
Poly deg 3 with n_components=2500, c

`n_components=1000, coef0=0.001: KNN: 36.92%, NC: 36.40%`

In [90]:
for coef0 in [ .0008, .0009, .001, .0011, .0012 ]:
    for n_components in [ 800, 900, 1000, 1100, 1200 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=3, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 3 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 3 with n_components=800, coef0=0.0008: KNN: 34.88%, NC: 37.20%
Poly deg 3 with n_components=900, coef0=0.0008: KNN: 34.52%, NC: 36.72%
Poly deg 3 with n_components=1000, coef0=0.0008: KNN: 36.76%, NC: 36.52%
Poly deg 3 with n_components=1100, coef0=0.0008: KNN: 35.12%, NC: 36.68%
Poly deg 3 with n_components=1200, coef0=0.0008: KNN: 35.56%, NC: 35.84%
Poly deg 3 with n_components=800, coef0=0.0009: KNN: 34.64%, NC: 37.16%
Poly deg 3 with n_components=900, coef0=0.0009: KNN: 34.76%, NC: 36.56%
Poly deg 3 with n_components=1000, coef0=0.0009: KNN: 36.80%, NC: 36.44%
Poly deg 3 with n_components=1100, coef0=0.0009: KNN: 34.84%, NC: 36.80%
Poly deg 3 with n_components=1200, coef0=0.0009: KNN: 35.72%, NC: 35.96%
Poly deg 3 with n_components=800, coef0=0.001: KNN: 34.48%, NC: 37.12%
Poly deg 3 with n_components=900, coef0=0.001: KNN: 34.96%, NC: 36.96%
Poly deg 3 with n_components=1000, coef0=0.001: KNN: 36.92%, NC: 36.40%
Poly deg 3 with n_components=1100, coef0=0.001: KNN: 34.80%,

We do not observe any significant improvement.

### Poly deg 5
I continue with polynomial kernels of degree 5.

In [91]:
for coef0 in [ 100, 10, 1, 1e-1, 1e-2, 1e-3 ]:
    for n_components in [ 500, 1000, 1500, 2000, 2500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=5, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 5 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 5 with n_components=500, coef0=100: KNN: 30.40%, NC: 30.92%
Poly deg 5 with n_components=1000, coef0=100: KNN: 27.12%, NC: 28.60%
Poly deg 5 with n_components=1500, coef0=100: KNN: 27.32%, NC: 27.36%
Poly deg 5 with n_components=2000, coef0=100: KNN: 24.80%, NC: 25.40%
Poly deg 5 with n_components=2500, coef0=100: KNN: 14.32%, NC: 22.08%
Poly deg 5 with n_components=500, coef0=10: KNN: 30.28%, NC: 33.60%
Poly deg 5 with n_components=1000, coef0=10: KNN: 31.56%, NC: 32.32%
Poly deg 5 with n_components=1500, coef0=10: KNN: 30.36%, NC: 30.88%
Poly deg 5 with n_components=2000, coef0=10: KNN: 30.60%, NC: 31.20%
Poly deg 5 with n_components=2500, coef0=10: KNN: 15.40%, NC: 22.48%
Poly deg 5 with n_components=500, coef0=1: KNN: 34.80%, NC: 37.16%
Poly deg 5 with n_components=1000, coef0=1: KNN: 35.56%, NC: 37.44%
Poly deg 5 with n_components=1500, coef0=1: KNN: 35.48%, NC: 35.36%
Poly deg 5 with n_components=2000, coef0=1: KNN: 33.92%, NC: 33.84%
Poly deg 5 with n_components=2500, c

`n_components=500, coef0=0.1: KNN: 35.68%, NC: 38.12%`

In [96]:
for coef0 in [ .09, .1, .11 ]:
    for n_components in [ 400, 500, 600 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=5, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 5 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 5 with n_components=400, coef0=0.09: KNN: 34.20%, NC: 38.32%
Poly deg 5 with n_components=500, coef0=0.09: KNN: 35.76%, NC: 37.56%
Poly deg 5 with n_components=600, coef0=0.09: KNN: 35.92%, NC: 37.80%
Poly deg 5 with n_components=400, coef0=0.1: KNN: 34.44%, NC: 38.44%
Poly deg 5 with n_components=500, coef0=0.1: KNN: 35.68%, NC: 38.12%
Poly deg 5 with n_components=600, coef0=0.1: KNN: 35.32%, NC: 37.56%
Poly deg 5 with n_components=400, coef0=0.11: KNN: 34.88%, NC: 38.60%
Poly deg 5 with n_components=500, coef0=0.11: KNN: 35.40%, NC: 38.48%
Poly deg 5 with n_components=600, coef0=0.11: KNN: 35.32%, NC: 37.64%


We keep `n_components=500, coef0=0.1: KNN: 35.68%, NC: 38.12%`.

### Poly deg 7

In [92]:
for coef0 in [ 100, 10, 1, 1e-1, 1e-2, 1e-3 ]:
    for n_components in [ 500, 1000, 1500, 2000, 2500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=7, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 7 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 7 with n_components=500, coef0=100: KNN: 29.92%, NC: 31.24%
Poly deg 7 with n_components=1000, coef0=100: KNN: 27.24%, NC: 29.48%
Poly deg 7 with n_components=1500, coef0=100: KNN: 27.92%, NC: 28.16%
Poly deg 7 with n_components=2000, coef0=100: KNN: 25.76%, NC: 26.20%
Poly deg 7 with n_components=2500, coef0=100: KNN: 14.56%, NC: 23.00%
Poly deg 7 with n_components=500, coef0=10: KNN: 33.40%, NC: 34.12%
Poly deg 7 with n_components=1000, coef0=10: KNN: 32.32%, NC: 33.20%
Poly deg 7 with n_components=1500, coef0=10: KNN: 31.52%, NC: 31.84%
Poly deg 7 with n_components=2000, coef0=10: KNN: 32.08%, NC: 32.04%
Poly deg 7 with n_components=2500, coef0=10: KNN: 16.48%, NC: 23.04%
Poly deg 7 with n_components=500, coef0=1: KNN: 35.64%, NC: 37.92%
Poly deg 7 with n_components=1000, coef0=1: KNN: 35.80%, NC: 37.88%
Poly deg 7 with n_components=1500, coef0=1: KNN: 35.84%, NC: 36.92%
Poly deg 7 with n_components=2000, coef0=1: KNN: 33.92%, NC: 34.00%
Poly deg 7 with n_components=2500, c

* `n_components=500, coef0=1: KNN: 35.64%, NC: 37.92%`
* `n_components=1000, coef0=1: KNN: 35.80%, NC: 37.88%`

In [97]:
for coef0 in [ .9, 1, 1.1  ]:
    for n_components in [ 400, 500, 700, 850, 1000, 1100 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=7, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 7 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 7 with n_components=400, coef0=0.9: KNN: 35.20%, NC: 38.16%
Poly deg 7 with n_components=500, coef0=0.9: KNN: 36.28%, NC: 37.84%
Poly deg 7 with n_components=700, coef0=0.9: KNN: 35.24%, NC: 38.40%
Poly deg 7 with n_components=850, coef0=0.9: KNN: 36.00%, NC: 38.92%
Poly deg 7 with n_components=1000, coef0=0.9: KNN: 35.64%, NC: 37.96%
Poly deg 7 with n_components=1100, coef0=0.9: KNN: 36.60%, NC: 38.08%
Poly deg 7 with n_components=400, coef0=1: KNN: 35.28%, NC: 37.92%
Poly deg 7 with n_components=500, coef0=1: KNN: 35.64%, NC: 37.92%
Poly deg 7 with n_components=700, coef0=1: KNN: 35.16%, NC: 38.36%
Poly deg 7 with n_components=850, coef0=1: KNN: 35.96%, NC: 38.48%
Poly deg 7 with n_components=1000, coef0=1: KNN: 35.80%, NC: 37.88%
Poly deg 7 with n_components=1100, coef0=1: KNN: 36.60%, NC: 37.84%
Poly deg 7 with n_components=400, coef0=1.1: KNN: 34.96%, NC: 37.76%
Poly deg 7 with n_components=500, coef0=1.1: KNN: 35.20%, NC: 37.40%
Poly deg 7 with n_components=700, coef0=1.

In [99]:
for coef0 in [ .9, 1, 1.1  ]:
    for n_components in [ 1100, 1300, 1500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=7, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 7 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 7 with n_components=1100, coef0=0.9: KNN: 36.60%, NC: 38.08%
Poly deg 7 with n_components=1300, coef0=0.9: KNN: 36.32%, NC: 37.80%
Poly deg 7 with n_components=1500, coef0=0.9: KNN: 36.12%, NC: 36.96%
Poly deg 7 with n_components=1100, coef0=1: KNN: 36.60%, NC: 37.84%
Poly deg 7 with n_components=1300, coef0=1: KNN: 36.32%, NC: 37.56%
Poly deg 7 with n_components=1500, coef0=1: KNN: 35.84%, NC: 36.92%
Poly deg 7 with n_components=1100, coef0=1.1: KNN: 35.72%, NC: 37.44%
Poly deg 7 with n_components=1300, coef0=1.1: KNN: 36.16%, NC: 37.84%
Poly deg 7 with n_components=1500, coef0=1.1: KNN: 35.84%, NC: 36.64%


`n_components=1100, coef0=0.9: KNN: 36.60%, NC: 38.08%`

### Poly deg 9

In [93]:
for coef0 in [ 100, 10, 1, 1e-1, 1e-2, 1e-3 ]:
    for n_components in [ 500, 1000, 1500, 2000, 2500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=9, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 9 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 9 with n_components=500, coef0=100: KNN: 30.20%, NC: 31.40%
Poly deg 9 with n_components=1000, coef0=100: KNN: 28.08%, NC: 29.88%
Poly deg 9 with n_components=1500, coef0=100: KNN: 29.04%, NC: 28.80%
Poly deg 9 with n_components=2000, coef0=100: KNN: 26.28%, NC: 27.00%
Poly deg 9 with n_components=2500, coef0=100: KNN: 15.24%, NC: 22.24%
Poly deg 9 with n_components=500, coef0=10: KNN: 33.00%, NC: 34.48%
Poly deg 9 with n_components=1000, coef0=10: KNN: 33.24%, NC: 34.04%
Poly deg 9 with n_components=1500, coef0=10: KNN: 31.56%, NC: 32.72%
Poly deg 9 with n_components=2000, coef0=10: KNN: 33.12%, NC: 32.96%
Poly deg 9 with n_components=2500, coef0=10: KNN: 15.36%, NC: 22.60%
Poly deg 9 with n_components=500, coef0=1: KNN: 36.12%, NC: 37.96%
Poly deg 9 with n_components=1000, coef0=1: KNN: 36.28%, NC: 38.48%
Poly deg 9 with n_components=1500, coef0=1: KNN: 36.20%, NC: 37.12%
Poly deg 9 with n_components=2000, coef0=1: KNN: 34.16%, NC: 34.40%
Poly deg 9 with n_components=2500, c

`n_components=1000, coef0=1: KNN: 36.28%, NC: 38.48%`

In [98]:
for coef0 in [ .8, .9, 1, 1.1, 1.2 ]:
    for n_components in [ 800, 900, 1000, 1100, 1200 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=9, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 9 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 9 with n_components=800, coef0=0.8: KNN: 35.40%, NC: 38.12%
Poly deg 9 with n_components=900, coef0=0.8: KNN: 35.16%, NC: 38.36%
Poly deg 9 with n_components=1000, coef0=0.8: KNN: 35.64%, NC: 38.40%
Poly deg 9 with n_components=1100, coef0=0.8: KNN: 35.00%, NC: 38.08%
Poly deg 9 with n_components=1200, coef0=0.8: KNN: 36.28%, NC: 38.28%
Poly deg 9 with n_components=800, coef0=0.9: KNN: 36.28%, NC: 38.24%
Poly deg 9 with n_components=900, coef0=0.9: KNN: 35.76%, NC: 38.44%
Poly deg 9 with n_components=1000, coef0=0.9: KNN: 35.72%, NC: 38.56%
Poly deg 9 with n_components=1100, coef0=0.9: KNN: 35.32%, NC: 38.12%
Poly deg 9 with n_components=1200, coef0=0.9: KNN: 36.64%, NC: 38.40%
Poly deg 9 with n_components=800, coef0=1: KNN: 36.00%, NC: 37.88%
Poly deg 9 with n_components=900, coef0=1: KNN: 35.08%, NC: 38.36%
Poly deg 9 with n_components=1000, coef0=1: KNN: 36.28%, NC: 38.48%
Poly deg 9 with n_components=1100, coef0=1: KNN: 35.28%, NC: 38.28%
Poly deg 9 with n_components=1200,

`n_components=1200, coef0=1.2: KNN: 36.76%, NC: 38.48%`

### Poly deg 11

In [94]:
for coef0 in [ 100, 10, 1, 1e-1, 1e-2, 1e-3 ]:
    for n_components in [ 500, 1000, 1500, 2000, 2500 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=11, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 11 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 11 with n_components=500, coef0=100: KNN: 30.56%, NC: 31.64%
Poly deg 11 with n_components=1000, coef0=100: KNN: 29.04%, NC: 30.24%
Poly deg 11 with n_components=1500, coef0=100: KNN: 28.88%, NC: 29.04%
Poly deg 11 with n_components=2000, coef0=100: KNN: 27.44%, NC: 27.92%
Poly deg 11 with n_components=2500, coef0=100: KNN: 16.28%, NC: 22.72%
Poly deg 11 with n_components=500, coef0=10: KNN: 32.68%, NC: 34.80%
Poly deg 11 with n_components=1000, coef0=10: KNN: 33.20%, NC: 34.72%
Poly deg 11 with n_components=1500, coef0=10: KNN: 31.88%, NC: 33.56%
Poly deg 11 with n_components=2000, coef0=10: KNN: 32.72%, NC: 33.08%
Poly deg 11 with n_components=2500, coef0=10: KNN: 15.56%, NC: 22.08%
Poly deg 11 with n_components=500, coef0=1: KNN: 34.60%, NC: 37.40%
Poly deg 11 with n_components=1000, coef0=1: KNN: 35.20%, NC: 38.28%
Poly deg 11 with n_components=1500, coef0=1: KNN: 36.44%, NC: 37.16%
Poly deg 11 with n_components=2000, coef0=1: KNN: 33.84%, NC: 34.12%
Poly deg 11 with n_com

`n_components=1500, coef0=1: KNN: 36.44%, NC: 37.16%`

In [100]:
for coef0 in [ .8, .9, 1, 1.1, 1.2 ]:
    for n_components in [ 1300, 1500, 1700 ]:
        kpca_model = KernelPCA(n_components=n_components, kernel='poly', degree=11, coef0=coef0)
        knn_acc, nc_acc = kpca_plus_lda(kpca_model)

        print(f'Poly deg 11 with n_components={n_components}, coef0={coef0}: KNN: {100 * knn_acc:.2f}%, NC: {100 * nc_acc:.2f}%')

Poly deg 11 with n_components=1300, coef0=0.8: KNN: 36.08%, NC: 37.44%
Poly deg 11 with n_components=1500, coef0=0.8: KNN: 35.96%, NC: 37.44%
Poly deg 11 with n_components=1700, coef0=0.8: KNN: 34.64%, NC: 35.88%
Poly deg 11 with n_components=1300, coef0=0.9: KNN: 36.08%, NC: 37.76%
Poly deg 11 with n_components=1500, coef0=0.9: KNN: 34.84%, NC: 37.24%
Poly deg 11 with n_components=1700, coef0=0.9: KNN: 35.08%, NC: 35.68%
Poly deg 11 with n_components=1300, coef0=1: KNN: 36.68%, NC: 38.32%
Poly deg 11 with n_components=1500, coef0=1: KNN: 36.44%, NC: 37.16%
Poly deg 11 with n_components=1700, coef0=1: KNN: 35.76%, NC: 35.92%
Poly deg 11 with n_components=1300, coef0=1.1: KNN: 36.88%, NC: 38.60%
Poly deg 11 with n_components=1500, coef0=1.1: KNN: 35.64%, NC: 37.16%
Poly deg 11 with n_components=1700, coef0=1.1: KNN: 35.28%, NC: 35.88%
Poly deg 11 with n_components=1300, coef0=1.2: KNN: 36.84%, NC: 38.56%
Poly deg 11 with n_components=1500, coef0=1.2: KNN: 36.20%, NC: 37.52%
Poly deg 11 

`n_components=1300, coef0=1.1: KNN: 36.88%, NC: 38.60%`

We do not observe large improvements over the lower-degree kernels, so I do not explore polynomials of higher degree.

## Summary
I collect the best parameters from all kernels and run them on the test set.

In [103]:
def kpca_plus_lda_test(kpca_model, lda_model_in=None):
    X_train_kpca = kpca_model.fit_transform(X_train)
    X_test_kpca = kpca_model.transform(X_test)
    
    lda_model = lda_model_in if lda_model_in is not None else LinearDiscriminantAnalysis()
    
    X_train_lda = lda_model.fit_transform(X_train_kpca, y_train)
    X_test_lda = lda_model.transform(X_test_kpca)
    
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_train_lda, y_train)

    y_pred = knn.predict(X_test_lda)
    knn_acc = accuracy_score(y_pred, y_test)

    nc = NearestCentroid()
    nc.fit(X_train_lda, y_train)

    y_pred = nc.predict(X_test_lda)
    nc_acc = accuracy_score(y_pred, y_test)
    
    return knn_acc, nc_acc

print('Accuracies on the test data.')

kpca_rbf = KernelPCA(n_components=700, kernel='rbf', gamma=0.005)
knn_acc, nc_acc = kpca_plus_lda_test(kpca_rbf)
print(f'KPCA RBF: KNN: {100 * knn_acc:.2f}%, NCC: {100 * nc_acc:.2f}%')

kpca_poly2 = KernelPCA(n_components=1000, kernel='poly', degree=2, coef0=0.01)
knn_acc, nc_acc = kpca_plus_lda_test(kpca_poly2)
print(f'KPCA poly deg 2: KNN: {100 * knn_acc:.2f}%, NCC: {100 * nc_acc:.2f}%')

kpca_poly3 = KernelPCA(n_components=1000, kernel='poly', degree=3, coef0=0.001)
knn_acc, nc_acc = kpca_plus_lda_test(kpca_poly3)
print(f'KPCA poly deg 3: KNN: {100 * knn_acc:.2f}%, NCC: {100 * nc_acc:.2f}%')

kpca_poly5 = KernelPCA(n_components=500, kernel='poly', degree=5, coef0=0.1)
knn_acc, nc_acc = kpca_plus_lda_test(kpca_poly5)
print(f'KPCA poly deg 5: KNN: {100 * knn_acc:.2f}%, NCC: {100 * nc_acc:.2f}%')

kpca_poly7 = KernelPCA(n_components=1100, kernel='poly', degree=7, coef0=0.9)
knn_acc, nc_acc = kpca_plus_lda_test(kpca_poly7)
print(f'KPCA poly deg 7: KNN: {100 * knn_acc:.2f}%, NCC: {100 * nc_acc:.2f}%')

kpca_poly9 = KernelPCA(n_components=1200, kernel='poly', degree=9, coef0=1.2)
knn_acc, nc_acc = kpca_plus_lda_test(kpca_poly9)
print(f'KPCA poly deg 9: KNN: {100 * knn_acc:.2f}%, NCC: {100 * nc_acc:.2f}%')

kpca_poly11 = KernelPCA(n_components=1300, kernel='poly', degree=11, coef0=1.1)
knn_acc, nc_acc = kpca_plus_lda_test(kpca_poly11)
print(f'KPCA poly deg 11: KNN: {100 * knn_acc:.2f}%, NCC: {100 * nc_acc:.2f}%')



Accuracies on the test data.
KPCA RBF: KNN: 39.38%, NCC: 41.26%
KPCA poly deg 2: KNN: 35.22%, NCC: 37.19%
KPCA poly deg 3: KNN: 35.88%, NCC: 36.50%
KPCA poly deg 5: KNN: 34.74%, NCC: 38.06%
KPCA poly deg 7: KNN: 35.86%, NCC: 37.12%
KPCA poly deg 9: KNN: 35.71%, NCC: 36.77%
KPCA poly deg 11: KNN: 35.64%, NCC: 36.97%


Results table for CIFAR-10 on the test data:

| Method                | KNN acc. | NCC acc. |
| :---                  | ---:     | ---:     |
| Original space        | 25.22%   | 26.94%   |
| PCA (90% variance)    | 27.45%   | 26.97%   |
| PCA+LDA               | 30.72%   | 35.44%   |
| KPCA+LDA, poly deg 2  | 35.22%   | 37.19%   |
| KPCA+LDA, poly deg 3  | 35.88%   | 36.50%   |
| KPCA+LDA, poly deg 5  | 34.74%   | 38.06%   |
| KPCA+LDA, poly deg 7  | 35.86%   | 37.12%   |
| KPCA+LDA, poly deg 9  | 35.71%   | 36.77%   |
| KPCA+LDA, poly deg 11 | 35.64%   | 36.97%   |
| KPCA+LDA, RBF         | 39.38%   | 41.26%   |

## Comparison with SVM
I try a few SVMs for comparison:

In [107]:
from sklearn.svm import SVC

In [114]:
def train_test_SVM(model):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    
    return accuracy_score(y_pred, y_val)

In [112]:
linear_svc = SVC(kernel='linear')
linear_svc.fit(X_train, y_train) # same results for any C

y_pred = linear_svc.predict(X_test)
acc = accuracy_score(y_pred, y_test)
print(f'Linear SVM: {100 * acc:.2f}%')

Linear SVM: 30.58%


In [120]:
for gamma in ['scale', 'auto']:
    for C in [1e-3, 1e-2, .1, 1, 10, 100, 1e3]:
        rbf_svc = SVC(kernel='rbf', gamma=gamma, C=C)
        
        acc = train_test_SVM(rbf_svc)
        print(f'RBF SVM with gamma={gamma}, C={C}: {100 * acc:.2f}%')        


RBF SVM with gamma=scale, C=0.001: 10.08%
RBF SVM with gamma=scale, C=0.01: 10.08%
RBF SVM with gamma=scale, C=0.1: 29.56%
RBF SVM with gamma=scale, C=1: 40.76%
RBF SVM with gamma=scale, C=10: 40.92%
RBF SVM with gamma=scale, C=100: 40.72%
RBF SVM with gamma=scale, C=1000.0: 40.72%
RBF SVM with gamma=auto, C=0.001: 10.08%
RBF SVM with gamma=auto, C=0.01: 10.08%
RBF SVM with gamma=auto, C=0.1: 16.24%
RBF SVM with gamma=auto, C=1: 33.48%
RBF SVM with gamma=auto, C=10: 37.52%
RBF SVM with gamma=auto, C=100: 36.92%
RBF SVM with gamma=auto, C=1000.0: 35.16%
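
The two string options for `gamma` in `SVC` are shorthand for data-dependent values: per the scikit-learn documentation, `'auto'` means `1 / n_features` and `'scale'` means `1 / (n_features * X.var())`. The sketch below computes both on a random placeholder matrix with 3072 columns; the values for the actual training set depend on its pixel variance.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.random((100, 3072))  # placeholder for scaled image data

n_features = X_demo.shape[1]
gamma_auto = 1 / n_features                    # gamma='auto'
gamma_scale = 1 / (n_features * X_demo.var())  # gamma='scale'

print(f'auto: {gamma_auto:.6f}, scale: {gamma_scale:.6f}')
```

For data scaled to $[0, 1]$ the variance is below 1, so `'scale'` yields a larger `gamma` than `'auto'`.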


In [119]:
for gamma in [.1, 1e-2, 1e-3, 1e-4, 1e-5]:
    for C in [1, 10, 100, 1e3, 1e4]:
        rbf_svc = SVC(kernel='rbf', gamma=gamma, C=C)
        
        acc = train_test_SVM(rbf_svc)
        print(f'RBF SVM with gamma={gamma}, C={C}: {100 * acc:.2f}%')    

RBF SVM with gamma=0.1, C=1: 11.32%
RBF SVM with gamma=0.1, C=10: 11.68%
RBF SVM with gamma=0.1, C=100: 11.68%
RBF SVM with gamma=0.1, C=1000.0: 11.68%
RBF SVM with gamma=0.1, C=10000.0: 11.68%
RBF SVM with gamma=0.01, C=1: 40.32%
RBF SVM with gamma=0.01, C=10: 41.24%
RBF SVM with gamma=0.01, C=100: 41.08%
RBF SVM with gamma=0.01, C=1000.0: 41.08%
RBF SVM with gamma=0.01, C=10000.0: 41.08%
RBF SVM with gamma=0.001, C=1: 37.16%
RBF SVM with gamma=0.001, C=10: 39.56%
RBF SVM with gamma=0.001, C=100: 38.40%
RBF SVM with gamma=0.001, C=1000.0: 38.48%
RBF SVM with gamma=0.001, C=10000.0: 38.48%
RBF SVM with gamma=0.0001, C=1: 27.32%
RBF SVM with gamma=0.0001, C=10: 35.04%
RBF SVM with gamma=0.0001, C=100: 36.64%
RBF SVM with gamma=0.0001, C=1000.0: 34.12%
RBF SVM with gamma=0.0001, C=10000.0: 33.40%
RBF SVM with gamma=1e-05, C=1: 10.08%
RBF SVM with gamma=1e-05, C=10: 27.28%
RBF SVM with gamma=1e-05, C=100: 35.12%
RBF SVM with gamma=1e-05, C=1000.0: 35.08%
RBF SVM with gamma=1e-05, C=10000.

`gamma=0.01, C=10: 41.24%`

In [122]:
for gamma in [.008, .009, .01, .011, .012]:
    for C in [6, 8, 10, 12, 14]:
        rbf_svc = SVC(kernel='rbf', gamma=gamma, C=C)
        
        acc = train_test_SVM(rbf_svc)
        print(f'RBF SVM with gamma={gamma}, C={C}: {100 * acc:.2f}%') 

RBF SVM with gamma=0.008, C=6: 41.44%
RBF SVM with gamma=0.008, C=8: 41.48%
RBF SVM with gamma=0.008, C=10: 41.48%
RBF SVM with gamma=0.008, C=12: 41.60%
RBF SVM with gamma=0.008, C=14: 41.64%
RBF SVM with gamma=0.009, C=6: 41.76%
RBF SVM with gamma=0.009, C=8: 41.60%
RBF SVM with gamma=0.009, C=10: 41.84%
RBF SVM with gamma=0.009, C=12: 41.80%
RBF SVM with gamma=0.009, C=14: 41.72%
RBF SVM with gamma=0.01, C=6: 41.52%
RBF SVM with gamma=0.01, C=8: 41.40%
RBF SVM with gamma=0.01, C=10: 41.24%
RBF SVM with gamma=0.01, C=12: 41.20%
RBF SVM with gamma=0.01, C=14: 41.08%
RBF SVM with gamma=0.011, C=6: 41.80%
RBF SVM with gamma=0.011, C=8: 41.48%
RBF SVM with gamma=0.011, C=10: 41.44%
RBF SVM with gamma=0.011, C=12: 41.32%
RBF SVM with gamma=0.011, C=14: 41.32%
RBF SVM with gamma=0.012, C=6: 41.28%
RBF SVM with gamma=0.012, C=8: 41.08%
RBF SVM with gamma=0.012, C=10: 41.08%
RBF SVM with gamma=0.012, C=12: 41.04%
RBF SVM with gamma=0.012, C=14: 41.04%


`gamma=0.011, C=6: 41.80%`

In [125]:
rbf_svc = SVC(kernel='rbf', gamma=0.011, C=6)
rbf_svc.fit(X_train, y_train)

y_pred = rbf_svc.predict(X_test)
acc = accuracy_score(y_pred, y_test)
print(f'RBF SVM with gamma=0.011, C=6: {100 * acc:.2f}%')

RBF SVM with gamma=0.011, C=6: 41.19%
