# Challenge Scratchbook

* This notebook explores methods for the Kernel Methods for Machine Learning Kaggle [challenge](https://www.kaggle.com/c/kernel-methods-for-machine-learning-2018-2019/data).

* Note that this is a binary classification challenge.

Our first goal is to implement three baseline methods:
1. Random classification
2. All-zeros prediction (this gives an idea of the proportion of 0s in the public test set)
3. The Simple Pattern Recognition Algorithm (SPR) from *Learning with Kernels*

Before that, we need data loaders; they are imported from `utils.data` below.

## Imports

In [1]:
import csv
import os
import numpy as np
from scipy import optimize
from itertools import product
import matplotlib.pyplot as plt
from tqdm import tqdm_notebook
import scipy as sp
from time import time
from utils.data import load_data, save_results
from utils.models import SVM, SPR, PCA
from utils.kernels import GaussianKernel

## Paths and Globals

In [2]:
CWD = os.getcwd()
DATA_DIR = os.path.join(CWD, "data")
RESULT_DIR = os.path.join(CWD, "results")

FILES = {0: {"train_mat": "Xtr0_mat100.csv",
             "train": "Xtr0.csv",
             "test_mat": "Xte0_mat100.csv",
             "test": "Xte0.csv",
             "label": "Ytr0.csv"},
         1: {"train_mat": "Xtr1_mat100.csv",
             "train": "Xtr1.csv",
             "test_mat": "Xte1_mat100.csv",
             "test": "Xte1.csv",
             "label": "Ytr1.csv"},
         2: {"train_mat": "Xtr2_mat100.csv",
             "train": "Xtr2.csv",
             "test_mat": "Xte2_mat100.csv",
             "test": "Xte2.csv",
             "label": "Ytr2.csv"}}

## All-zeros baseline

In [3]:
#with open(os.path.join(RESULT_DIR, "results.csv"), 'w', newline='') as csvfile:
 #   writer = csv.writer(csvfile, delimiter=',')
    
  #  writer.writerow(["Id", "Bound"])
   # for i in range(3000):
    #    writer.writerow([i, 0])

**Comment:**

* The all-zeros submission scores 0.51266 on the public test set, which means that the classes are roughly balanced.
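
As a minimal sketch of the two trivial baselines, the snippet below uses synthetic labels in place of the hidden test labels (the array `y_true` and the seed are made up for illustration). It shows why the all-zeros score can be read as the class balance: the accuracy of a constant-zero predictor equals the fraction of 0s.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic labels standing in for the hidden test labels (assumption).
y_true = rng.integers(0, 2, size=3000)

def accuracy(y_pred, y):
    """Fraction of predictions that match the labels."""
    return float(np.mean(y_pred == y))

# Baseline 1: uniformly random predictions.
y_random = rng.integers(0, 2, size=y_true.shape[0])

# Baseline 2: predict 0 everywhere.
y_zeros = np.zeros_like(y_true)

acc_random = accuracy(y_random, y_true)
acc_zeros = accuracy(y_zeros, y_true)
frac_zeros = float(np.mean(y_true == 0))

print(f"random: {acc_random:.3f}, all zeros: {acc_zeros:.3f}, fraction of 0s: {frac_zeros:.3f}")
```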

## SPR: A Simple Pattern Recognition Algorithm

In [None]:
γ = 500
λ = 5e-5
kernel = GaussianKernel(γ)

results = np.zeros(3000)
len_files = len(FILES)
for i in range(len_files):
    X_train, Y_train, X_test = load_data(i, data_dir=DATA_DIR, files_dict=FILES)
    X_val = X_train[1600:]
    Y_val = Y_train[1600:]
    X_train = X_train[:1600]
    Y_train = Y_train[:1600]
    clf = SPR(kernel)
    clf.fit(X_train, Y_train)
    y_pred_train =clf.predict(X_train)
    y_pred_val = clf.predict(X_val)
    score_train = clf.score(y_pred_train, Y_train)
    score_val = clf.score(y_pred_val, Y_val)
    #results[i*1000:i*1000 + 1000] = y_pred_test
    print(f"Accuracy on train set / val set {i} : {score_train} / {score_val} (λ: {λ},γ: {γ})")
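
The `SPR` class lives in `utils.models` and is not shown here. As a rough sketch of the idea only (not necessarily how that class is written), the simple pattern recognition rule assigns a point to the class whose mean in feature space is closer, which can be computed from kernel values alone. The helper names and the toy data below are made up for illustration, and labels are assumed to be in {0, 1}.

```python
import numpy as np

def gaussian_gram(A, B, gamma):
    """Gram matrix with entries exp(-gamma * ||a - b||^2)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

def spr_predict(X_train, y_train, X_new, gamma):
    """Predict 1 when the positive class mean is closer in feature space, else 0."""
    pos = X_train[y_train == 1]
    neg = X_train[y_train == 0]
    # Offset: half the difference between the squared norms of the two class means.
    b = 0.5 * (gaussian_gram(neg, neg, gamma).mean()
               - gaussian_gram(pos, pos, gamma).mean())
    # Mean similarity to each class.
    sim_pos = gaussian_gram(X_new, pos, gamma).mean(axis=1)
    sim_neg = gaussian_gram(X_new, neg, gamma).mean(axis=1)
    return (sim_pos - sim_neg + b > 0).astype(int)

# Toy data: two well-separated blobs (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

y_pred = spr_predict(X, y, X, gamma=0.5)
print("train accuracy:", np.mean(y_pred == y))
```

There is no regularization parameter in this rule, which is why `λ` plays no role when fitting SPR.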

## Define Kernels (to delete)

In [17]:
def pol_kernel(x,y,c): #c=0
    return (x.dot(y) + c)**2

def gaussian_kernel(x,y, gamma): #c=100
    return np.exp(-gamma*np.linalg.norm(x-y)**2)

def linear_kernel(x,y,c): 
    return x.dot(y)

def laplace_kernel(x,y,gamma):
    return np.exp(-gamma*np.linalg.norm(x-y,1))
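
These functions evaluate one pair at a time, so a Gram matrix over n points needs n² Python calls. A possible vectorized version for the Gaussian kernel is sketched below and checked against the pairwise function; `gaussian_gram` is a hypothetical helper, not part of `utils.kernels`.

```python
import numpy as np

def gaussian_kernel(x, y, gamma):
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

def gaussian_gram(X, Y, gamma):
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>, computed for all pairs at once.
    sq_dists = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

G_fast = gaussian_gram(X, X, gamma=0.7)
G_slow = np.array([[gaussian_kernel(a, b, 0.7) for b in X] for a in X])

print(np.abs(G_fast - G_slow).max())
```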

## SVM

In [11]:
γ = 400
λ = 5e-5
kernel = GaussianKernel(γ)

results = np.zeros(3000)
len_files = len(FILES)
for i in range(len_files):
    X_train, Y_train, X_test = load_data(i, data_dir=DATA_DIR, files_dict=FILES)
    X_val = X_train[1600:]
    Y_val = Y_train[1600:]
    X_train = X_train[:1600]
    Y_train = Y_train[:1600]
    clf = SVM(_lambda=λ, kernel=kernel)
    clf.fit(X_train, Y_train)
    y_pred_train =clf.predict(X_train)
    y_pred_val = clf.predict(X_val)
    score_train = clf.score(y_pred_train, Y_train)
    score_val = clf.score(y_pred_val, Y_val)
    #results[i*1000:i*1000 + 1000] = y_pred_test
    print(f"Accuracy on train set / val set {i} : {score_train} / {score_val} (λ: {λ},γ: {γ})")

Accuracy on train set / val set 0 : 1.0 / 0.5775 (λ: 5e-05,γ: 400)
Accuracy on train set / val set 1 : 1.0 / 0.7425 (λ: 5e-05,γ: 400)
Accuracy on train set / val set 2 : 1.0 / 0.6275 (λ: 5e-05,γ: 400)


## Tuning SVM

In [13]:
γ = 500
λ = 1e-4
gamma_list = np.linspace(300,γ,5, endpoint = True)
lambda_list = np.linspace(5e-6, λ, 5, endpoint = True)
settings = list(product(gamma_list,lambda_list))
best_score = {i: 0 for i in range(3)}
best_lambda = {i: 0 for i in range(3)}
best_gamma = {i: 0 for i in range(3)}

for k, tup in enumerate(settings):
    
    γ, λ = tup
    
    kernel = GaussianKernel(γ)

    results = np.zeros(3000)
    len_files = len(FILES)
    for i in range(len_files):
        X_train, Y_train, X_test = load_data(i, data_dir=DATA_DIR, files_dict=FILES)
        X_val = X_train[1600:]
        Y_val = Y_train[1600:]
        X_train = X_train[:1600]
        Y_train = Y_train[:1600]
        clf = SVM(_lambda=λ, kernel=kernel)
        clf.fit(X_train, Y_train)
        y_pred_train =clf.predict(X_train)
        y_pred_val = clf.predict(X_val)
        score_train = clf.score(y_pred_train, Y_train)
        score_val = clf.score(y_pred_val, Y_val)
        #results[i*1000:i*1000 + 1000] = y_pred_test
        print(f"Accuracy on train set / val set {i} : {score_train} / {score_val} (λ: {λ},γ: {γ})")
        
    print('\n')

Accuracy on train set / val set 0 : 1.0 / 0.5575 (λ: 5e-06,γ: 300.0)
Accuracy on train set / val set 1 : 1.0 / 0.745 (λ: 5e-06,γ: 300.0)
Accuracy on train set / val set 2 : 1.0 / 0.6325 (λ: 5e-06,γ: 300.0)


Accuracy on train set / val set 0 : 1.0 / 0.5575 (λ: 2.875e-05,γ: 300.0)
Accuracy on train set / val set 1 : 1.0 / 0.745 (λ: 2.875e-05,γ: 300.0)
Accuracy on train set / val set 2 : 1.0 / 0.6325 (λ: 2.875e-05,γ: 300.0)


Accuracy on train set / val set 0 : 1.0 / 0.5575 (λ: 5.25e-05,γ: 300.0)
Accuracy on train set / val set 1 : 1.0 / 0.745 (λ: 5.25e-05,γ: 300.0)
Accuracy on train set / val set 2 : 1.0 / 0.6325 (λ: 5.25e-05,γ: 300.0)


Accuracy on train set / val set 0 : 1.0 / 0.5575 (λ: 7.625e-05,γ: 300.0)
Accuracy on train set / val set 1 : 1.0 / 0.745 (λ: 7.625e-05,γ: 300.0)
Accuracy on train set / val set 2 : 1.0 / 0.6325 (λ: 7.625e-05,γ: 300.0)


Accuracy on train set / val set 0 : 1.0 / 0.56 (λ: 0.0001,γ: 300.0)
Accuracy on train set / val set 1 : 1.0 / 0.745 (λ: 0.0001,γ: 300.0

Run 1:
- score : {0: 0.6, 1: 0.6775, 2: 0.6425}
- lambda : {0: 0.0002575, 1: 1e-05, 2: 0.000505}
- gamma : {0: 77.5, 1: 100.0, 2: 32.5}

Run 2:
- score : {0: 0.61, 1: 0.7075, 2: 0.6425}
- lambda : {0: 5.5e-05, 1: 1e-05, 2: 1e-05}
- gamma : {0: 70.0, 1: 300.0, 2: 300.0}

Run 3:
- score : {0: 0.6075, 1: 0.73, 2: 0.66}
- lambda : {0: 1e-05, 1: 1e-05, 2: 1e-05}
- gamma : {0: 350.0, 1: 400.0, 2: 500.0}

Run 4:
- score : {0: 0.59, 1: 0.73, 2: 0.6575}
- lambda : {0: 5e-06, 1: 5e-06, 2: 5e-06}
- gamma : {0: 400.0, 1: 400.0, 2: 400.0}

Run 5:
- score : {0: 0.59, 1: 0.73, 2: 0.6575}
- lambda : {0: 5e-05, 1: 5e-05, 2: 5e-05}
- gamma : {0: 400.0, 1: 400.0, 2: 400.0}

Run 6 (after shuffling):
- score : {0: 0.5775, 1: 0.745, 2: 0.6375}
- lambda : {0: 1e-05, 1: 1e-05, 2: 1e-05}
- gamma : {0: 400.0, 1: 300.0, 2: 500.0}
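
The tuning cell initializes `best_score`, `best_lambda` and `best_gamma` but never updates them inside the loop. A minimal sketch of how the per-dataset best settings could be tracked automatically is given below. It assumes a hypothetical `evaluate(i, gamma, lam)` function that would fit an SVM on dataset `i` and return its validation accuracy; here it is replaced by a made-up toy function so that the snippet runs on its own.

```python
from itertools import product
import numpy as np

def evaluate(i, gamma, lam):
    """Hypothetical stand-in for: fit an SVM on dataset i, return validation accuracy."""
    # Toy surface peaked at gamma = 400, with a mild preference for small lam.
    return 0.7 - 1e-6 * (gamma - 400.0) ** 2 - 10.0 * lam + 0.01 * i

gamma_list = np.linspace(300, 500, 5)
lambda_list = np.linspace(5e-6, 1e-4, 5)

best_score = {i: -np.inf for i in range(3)}
best_gamma = {i: None for i in range(3)}
best_lambda = {i: None for i in range(3)}

for gamma, lam in product(gamma_list, lambda_list):
    for i in range(3):
        score = evaluate(i, gamma, lam)
        # Keep the setting only if it improves on the best score seen so far.
        if score > best_score[i]:
            best_score[i], best_gamma[i], best_lambda[i] = score, gamma, lam

print(best_score)
print(best_gamma)
print(best_lambda)
```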


## Biological Sequence Modeling with Convolutional Kernel Networks

**Define a function to encode the k-mer of x centered at position i**

In [7]:
ENCODING = {'A': [1.,0.,0.,0.],
                'C': [0.,1.,0.,0.],
                'G': [0.,0.,1.,0.],
                'T': [0.,0.,0.,1.]
               }

def P(i, x, k):
    
    not_in = True # True when the k-mer wraps around an edge of the sequence x
    if i-(k+1)//2 + 1 < 0:
        k_mer_i = x[len(x) + i-(k+1)//2 + 1:] + x[0 :  i + (k+2)//2]
    elif i + (k+2)//2 > len(x):
        k_mer_i = x[i-(k+1)//2 + 1 : ] +  x[:i + (k+2)//2 - len(x)]
    else:
        k_mer_i = x[i-(k+1)//2 + 1 :  i + (k+2)//2]
        not_in = False
        
    # concatenate one hot encoding
    L = []
    for c in k_mer_i:
        L += ENCODING[c]
    
    return np.array(L), not_in
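
To make the window arithmetic concrete, here is a small sketch that extracts the same window with modular indexing. The helpers `kmer_at` and `one_hot` are hypothetical names introduced for illustration; as long as `k <= len(x)`, `kmer_at` should select the same letters as `P`, and its flag plays the role of `not_in`.

```python
import numpy as np

ENCODING = {'A': [1., 0., 0., 0.],
            'C': [0., 1., 0., 0.],
            'G': [0., 0., 1., 0.],
            'T': [0., 0., 0., 1.]}

def kmer_at(x, i, k):
    """Return the k-mer of x centered at i (wrapping around) and whether it wrapped."""
    start = i - (k + 1) // 2 + 1
    end = i + (k + 2) // 2          # exclusive, so end - start == k
    wrapped = start < 0 or end > len(x)
    kmer = "".join(x[j % len(x)] for j in range(start, end))
    return kmer, wrapped

def one_hot(kmer):
    """Concatenate the 4-dimensional one-hot codes of each letter."""
    return np.array([v for c in kmer for v in ENCODING[c]])

x = "ACGTAC"
print(kmer_at(x, 0, 3))            # window crosses the left edge
print(kmer_at(x, 2, 3))            # window fully inside the sequence
print(one_hot(kmer_at(x, 2, 3)[0]))
```

Each encoded k-mer has length `4 * k` and contains exactly `k` ones, so its Euclidean norm is `sqrt(k)`.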
    

**Define Kernels**

In [8]:
def K(u, σ):
    return np.exp((u-1)/σ**2)

# define the kernel on k-mers with the norm
def K0(z1, z2, σ):
    z1_norm = np.linalg.norm(z1)
    z2_norm = np.linalg.norm(z2)
    z1z2_norm = z1_norm*z2_norm
    return z1z2_norm*np.exp(-(1/(2*z1z2_norm*(σ**2)))*np.linalg.norm(z1-z2)**2)

# same kernel but with the scalar product
def K1(z1, z2, σ):
    z1_norm = np.linalg.norm(z1)
    z2_norm = np.linalg.norm(z2)
    z1z2_norm = z1_norm*z2_norm
    u = z1.dot(z2)/z1z2_norm
    return z1z2_norm*K(u,σ)

# define the kernel on sequences of k-mers
def conv_kernel(x,y,k, σ):
    mx = len(x)
    my = len(y)    
    Px = np.array([P(i,x,k)[0] for i in range(mx)])
    Py = np.array([P(i,y,k)[0] for i in range(my)])
    PxPyt = Px.dot(Py.T)/k
    s = k*np.exp((1/(σ**2))*(PxPyt-1))
        
    return np.sum(s)/(mx*my)
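
`K0` and `K1` are two ways of writing the same quantity when both inputs have the same norm, which is the case for one-hot encoded k-mers (norm `sqrt(k)`): then `||z1 - z2||^2 = 2k - 2<z1, z2>`, so the exponent `-||z1 - z2||^2 / (2 k σ^2)` equals `(<z1, z2>/k - 1) / σ^2`. The sketch below checks this numerically on random one-hot k-mers; `random_one_hot_kmer` is a made-up helper.

```python
import numpy as np

def K0(z1, z2, sigma):
    n = np.linalg.norm(z1) * np.linalg.norm(z2)
    return n * np.exp(-np.linalg.norm(z1 - z2) ** 2 / (2 * n * sigma**2))

def K1(z1, z2, sigma):
    n = np.linalg.norm(z1) * np.linalg.norm(z2)
    return n * np.exp((z1.dot(z2) / n - 1) / sigma**2)

rng = np.random.default_rng(0)
k, sigma = 10, 0.4

def random_one_hot_kmer():
    """One-hot encoding of a random k-mer: k blocks of size 4, a single 1 per block."""
    z = np.zeros((k, 4))
    z[np.arange(k), rng.integers(0, 4, size=k)] = 1.0
    return z.ravel()

z1, z2 = random_one_hot_kmer(), random_one_hot_kmer()

print(K0(z1, z2, sigma), K1(z1, z2, sigma))  # the two forms agree
print(K1(z1, z1, sigma))                      # equals ||z1||^2 = k up to rounding
```

This also explains why the kernel of a k-mer with itself evaluates to `k`.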

## Unsupervised learning of the anchor points

**Choose the k-mer length k here**

In [9]:
k = 10

**Load all the k-mers**

In [13]:
def compute_kmers_list(idx, k):
    """Compute all the k-mers of the training sequences that do not wrap around an edge
    
    Parameters
    ------------
    - idx : int
        index of the dataset (0,1, or 2)
    - k : int
        length of the k-mers
    """
    X_train, Y_train, X_test = load_data(idx, data_dir=DATA_DIR, files_dict=FILES, mat = False)
    n = len(X_train)
    m = len(X_train[0])
    
    kmers = []
    for x in X_train:
        for i in range(m):
            p = P(i,x,k)
            if not p[1]:
                kmers.append(p[0])
            
    kmers = np.array(kmers)
    
    return kmers

idx= 0
kmers = compute_kmers_list(idx, k)

print(f"Number of k-mers : {len(kmers)}")

Number of k-mers : 184000
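
This count is consistent with keeping only the positions whose window does not wrap: a sequence of length m has m - k + 1 of them. The quick check below assumes 2000 training sequences (1600 + 400, as in the train/validation split used in this notebook) of length 101; the length is inferred from the printed count, not read from the data files.

```python
k = 10
n_sequences = 2000   # assumption: 1600 training + 400 validation sequences
m = 101              # assumption: sequence length inferred from the printed count

# Position i is kept when the window [i - (k+1)//2 + 1, i + (k+2)//2) stays inside [0, m).
interior = [i for i in range(m)
            if i - (k + 1) // 2 + 1 >= 0 and i + (k + 2) // 2 <= m]

print(len(interior), m - k + 1, n_sequences * len(interior))
```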


**Test some $\sigma$**

In [25]:
σ0 = 0.4

In [26]:
X_train, Y_train, X_test = load_data(0, data_dir=DATA_DIR, files_dict=FILES, mat = False)


print("Check the kernel on k-mers:")

print('- Value of the kernel with two identical k-mer as input')
print(K0(kmers[0],kmers[0], σ0 ), K0(kmers[1000],kmers[1000],σ0), K0(kmers[2000],kmers[2000],σ0), 
      K0(kmers[3000],kmers[3000],σ0))

print('- Value of the kernel with two random different k-mer as input')
print(K0(kmers[0],kmers[1],σ0), K0(kmers[1000],kmers[1],σ0), K0(kmers[2000],kmers[1],σ0), K0(kmers[3000],kmers[1],σ0))

print("\n Check the kernel on sequences:")
print('- Value of the kernel with two identical sequences as input')
print(conv_kernel(X_train[10],X_train[10],k, σ0))
print(conv_kernel(X_train[100],X_train[100],k, σ0))
print(conv_kernel(X_train[1000],X_train[1000],k, σ0))
print(conv_kernel(X_train[1100],X_train[1100],k, σ0))
print(conv_kernel(X_train[1200],X_train[1200],k, σ0))

print('- Value of the kernel with two random different sequences as input')
print(conv_kernel(X_train[0],X_train[10],k, σ0))
print(conv_kernel(X_train[10],X_train[100],k, σ0))
print(conv_kernel(X_train[100],X_train[1000],k, σ0))
print(conv_kernel(X_train[1000],X_train[1100],k, σ0))
print(conv_kernel(X_train[1100],X_train[1200],k, σ0))

Check the kernel on k-mers:
- Value of the kernel with two identical k-mer as input
10.000000000000002 10.000000000000002 10.000000000000002 10.000000000000002
- Value of the kernel with two random different k-mer as input
0.4393693362340744 0.23517745856009153 0.036065631360157405 0.019304541362277113

 Check the kernel on sequences:
- Value of the kernel with two identical sequences as input
0.25956464520581596
0.24119415486607676
0.23624943944397556
0.23388930165037605
0.24253323018997613
- Value of the kernel with two random different sequences as input
0.14588082760917187
0.13881732912341568
0.13785101515169185
0.14012321574935283
0.1362135709297253


## K-means

In [27]:
def Kmeans(X, K, max_iter):
    
    #step 0 (initialize centroids mu)
    idx = np.random.randint(len(X), size=K)
    mu = X[idx]    
    
    n_iter = 0
    stop = False
    d = 1e6
    while stop != True:
        
        #create clusters
        clustering = np.zeros(len(X))
            
        #step 1 (minimizing by assigning a cluster to each point)
        for i in range(len(X)):
            clustering[i] = np.argmin(np.linalg.norm(X[i]-mu, axis=1))

        
        #step 2 (minimizing w.r.t mu)
        for k in range(K):
            if np.sum([clustering==k]) != 0:
                mu[k] = np.mean(X[clustering==k], axis=0)

        
        d_new = distortion(X, mu, clustering)
        #print(d_new)
        
        if(d_new == d) or n_iter > max_iter:
            stop = True
        d = d_new
        n_iter +=1
        
        
        
    return mu, clustering


def distortion(X, mu, clustering):
    dis = 0
    for k in range(len(mu)):
        dis = dis + np.linalg.norm(X[clustering==k] - mu[k])**2
    return dis

# We try several random initializations and keep the partition that minimizes the distortion.
def Kmeans_try(X, n_try, n_cluster, max_iter):

    for i in range(n_try):
        
        mu, cl = Kmeans(X, n_cluster, max_iter)
    
        if i == 0:
            dist_min = distortion(X, mu, cl)
            mu_min, cl_min = mu, cl
        else:
            if distortion(X, mu, cl) < dist_min:
                dist_min = distortion(X, mu, cl)
                mu_min, cl_min = mu, cl
                
                
    return mu_min, cl_min, dist_min
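
The assignment step above loops over the points in Python, which is slow for large inputs. One possible vectorized variant of the same Lloyd iterations is sketched below. It is not a drop-in replacement: it samples distinct initial centroids and stops when the assignments no longer change, and the function name and toy data are made up for illustration.

```python
import numpy as np

def kmeans_vectorized(X, n_clusters, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Sample distinct points as initial centroids (replace=False avoids duplicates).
    mu = X[rng.choice(len(X), size=n_clusters, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Squared distances between every point and every centroid, shape (n, n_clusters).
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Move each non-empty centroid to the mean of its points.
        for c in range(n_clusters):
            if np.any(labels == c):
                mu[c] = X[labels == c].mean(axis=0)
    return mu, labels

# Toy data: two well-separated blobs of 20 and 30 points (illustrative only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=-5.0, scale=0.3, size=(20, 2)),
               rng.normal(loc=5.0, scale=0.3, size=(30, 2))])

mu, labels = kmeans_vectorized(X, n_clusters=2)
print(np.bincount(labels))
```

The broadcasted distance array has shape `(n, n_clusters, d)`, so for very large inputs it may need to be computed in chunks.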

## Spectral Clustering

In [31]:
def spec_cl(n_cl, kmers, σ):
    
    # compute Gram matrix
    n = len(kmers)
    K = np.zeros((n,n))
    for j in range(n):
        for i in range(j+1):
            K[i,j] = K1(kmers[i],kmers[j], σ)
    K =  K + K.T
    np.fill_diagonal(K, np.diagonal(K)/2)
     
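    # Compute the eigendecomposition. K is symmetric, so eigh returns real
    # eigenvalues in ascending order; reorder so the leading eigenvectors come first.
    λ, v = np.linalg.eigh(K)
    order = np.argsort(λ)[::-1]
    λ, v = λ[order], v[:, order]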
    
    # compute the maximum entry of a row
    #cluster_idx = np.argmax(v[:,:n_cl], axis = 1)
    # Kmeans on the rows
    t = time()
    n_try = 1
    max_iter = 20
    Z = v[:,:n_cl]/np.sum(v[:,:n_cl],axis=1).reshape(-1,1) # normalize v
    
    mu_rows, cluster_idx, dist = Kmeans_try(Z, n_try, n_cl, max_iter)
    print(f"time kmeans {time() - t}")
    
    
    # compute the barycenter
    mu = []
    for i in range(n_cl):
        if np.sum([cluster_idx==i]) != 0:
            bary = np.mean(kmers[cluster_idx==i], axis = 0)
            mu.append(bary)
    
    return mu
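
The embedding step of spectral clustering can be illustrated on a toy problem. The sketch below is a simplified variant, not the exact procedure of `spec_cl`: it uses `np.linalg.eigh` with an explicit descending order, and it normalizes rows to unit Euclidean norm (the cell above divides by the row sum instead). The data and the Gaussian Gram matrix are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two tight, well-separated groups of 5 and 7 points (illustrative only).
X = np.vstack([rng.normal(loc=0.0, scale=0.1, size=(5, 2)),
               rng.normal(loc=10.0, scale=0.1, size=(7, 2))])

# Gaussian Gram matrix (symmetric positive semi-definite).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-sq)

n_cl = 2
# eigh is meant for symmetric matrices: real eigenvalues, returned in ascending order.
eigvals, eigvecs = np.linalg.eigh(K)
top = eigvecs[:, ::-1][:, :n_cl]   # eigenvectors of the n_cl largest eigenvalues

# Normalize each row to unit Euclidean norm before clustering the rows.
Z = top / np.linalg.norm(top, axis=1, keepdims=True)

# Rows from the same group are (almost) parallel, rows from different groups orthogonal.
print(np.round(np.abs(Z @ Z.T), 3))
```

After this step the rows of `Z` are trivially separable, which is what makes running k-means on them effective.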

**Choose parameter $\sigma$ here**

In [32]:
σc = 0.4

In [33]:
kmers = compute_kmers_list(0, k)

n_cl = 200
n_kmers = 10000
# choose random kmers to do clustering
idx = np.random.choice(range(len(kmers)), size = n_kmers, replace = False)
mu = spec_cl(n_cl, kmers[idx], σc)

time kmeans 118.43829107284546


**Compute the mapping approximation $\psi$**

In [34]:
σ = σc

Z = np.einsum('ij, i -> ij',mu, 1/np.linalg.norm(mu, axis = 1)) # normalized mu
#Z = skm.cluster_centers_

# compute K_ZZ
p = len(mu)
K_zz = np.zeros((p,p))
for j in range(p):
    for i in range(j+1):
        K_zz[i,j] = K1(Z[i],Z[j], σ)
K_zz =  K_zz + K_zz.T
np.fill_diagonal(K_zz, np.diagonal(K_zz)/2)

# compute K_ZZ^{-1/2}
K_ZZ_inv_sqr = sp.linalg.sqrtm(np.linalg.inv(K_zz))


# define ψ_0
def ψ_0(z,Z_anchor, k , σ):
    return(K_ZZ_inv_sqr.dot(np.array([K1(z,z_a, σ) for z_a in Z_anchor])))


# define ψ
def ψ(x, Z_anchor, k , σ):
    P_x = [P(i,x,k)[0] for i in range(len(x))]
    L = np.array([ψ_0(z, Z_anchor, k, σ) for z in P_x])
    return np.sum(L, axis=0)/len(L)


# define approx kernel
def approx_K(x,y, mu_min, k, σ):
    return ψ(x, mu_min, k, σ).dot(ψ(y, mu_min, k, σ))
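
The mapping above is a Nyström-type approximation: the inner product of two embeddings equals `K_Z(z)ᵀ K_ZZ⁻¹ K_Z(z')`. The sketch below illustrates this with a plain Gaussian kernel on made-up points rather than `K1` on k-mers. It computes the inverse square root through `np.linalg.eigh`, which is an alternative to `sqrtm(inv(...))` that keeps the result real for a symmetric matrix; all names are hypothetical.

```python
import numpy as np

def gaussian_gram(A, B, gamma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

def inv_sqrt_psd(K, eps=1e-12):
    """Inverse square root of a symmetric PSD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(K)
    w = np.maximum(w, eps)               # guard against tiny or negative eigenvalues
    return (V / np.sqrt(w)) @ V.T

rng = np.random.default_rng(0)
anchors = rng.normal(size=(6, 3))        # made-up anchor points
points = rng.normal(size=(4, 3))         # made-up query points

K_zz = gaussian_gram(anchors, anchors)
K_zz_inv_sqrt = inv_sqrt_psd(K_zz)

def psi(P):
    """Rows are the embeddings K_ZZ^{-1/2} K_Z(p) for each row p of P."""
    return gaussian_gram(P, anchors) @ K_zz_inv_sqrt

# On the anchors themselves the approximation reproduces the kernel.
print(np.allclose(psi(anchors) @ psi(anchors).T, K_zz))
# Elsewhere it is a projection, so the approximate k(p, p) cannot exceed the true value 1.
print(np.diag(psi(points) @ psi(points).T))
```

The quality of the approximation away from the anchors depends on how well the anchors cover the data, which is the purpose of the clustering step.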

**Check Kernel approximation**

In [35]:
x = np.random.choice(X_train)
P_x = [P(i,x,k)[0] for i in range(len(x))]
L = np.array([ψ_0(z, Z, k, σ) for z in P_x])


i = 0
j = 0
L[i].dot(L[j]).real, K0(P_x[i],P_x[j], σ)

(0.31779230005193637, 10.000000000000002)

In [36]:
n_mean = 10

mu = Z

mean_error = 0
mean_value = 0
var = 0
for i in tqdm_notebook(range(n_mean)):
    x = np.random.choice(X_train)
    y = np.random.choice(X_train)

    true_value = conv_kernel(x,y,k, σ)
    approx_value = approx_K(x, y, mu, k, σ)
    
    mean_error += np.abs(true_value - approx_value)
    mean_value += true_value
    var += true_value**2
    
    if (i<10):
        print(true_value, approx_value)
    
mean_error = mean_error/n_mean
mean_value = mean_value/n_mean
var = var/n_mean- mean_value**2
standard_deviation = np.sqrt(var)    

print(f"% error =  {mean_error/standard_deviation*100}")
print(f"Mean Approximation Error: {mean_error} / True Kernel sd : {standard_deviation} / Mean true Kernel Value : {mean_value}")

0.14202739078855803 0.10633018876535529
0.13778754157098014 0.10853707149094022
0.1389965102874467 0.11198354172452432
0.13815820134531406 0.11298679488309496
0.12557085671272164 0.10142182274795869
0.13161648558104905 0.10472830089887775
0.17005293194695367 0.12792052954771616
0.1502408256429319 0.1205936049210454
0.18256109306674484 0.13280988919489078
0.13857299125593983 0.10687512176809041

% error =  191.91933658251398
Mean Approximation Error: 0.03213979622561459 / True Kernel sd : 0.016746512778714393 / Mean true Kernel Value : 0.145558482819864


**Compute embeddings and Gram matrix**

In [38]:
i = 0
X_train, Y_train, X_test = load_data(i, data_dir=DATA_DIR, files_dict=FILES, mat = False)
X_val = X_train[1600:]
Y_val = Y_train[1600:]
X_train = X_train[:1600]
Y_train = Y_train[:1600]


embed_train = []
for x in tqdm_notebook(X_train):
    embed_train.append(ψ(x,mu,k,σ))
embed_val = []
for x in tqdm_notebook(X_val):
    embed_val.append(ψ(x,mu,k,σ))
    
E_train = np.array(embed_train)
E_val = np.array(embed_val)
    
# centering ?
centering = False
if centering:
    # compute the per-feature training mean once and apply it to both sets
    train_mean = E_train.mean(axis=0)
    E_train = E_train - train_mean
    E_val = E_val - train_mean
        
# compute Gram matrix
#K = E_train.dot(E_train.T)


**Run SVM!**

Dataset 0

In [58]:
γ = 15.5
kernel = GaussianKernel(γ)
λ = 6.6e-6
#kernel = linear_kernel


clf = SVM(λ, kernel)
clf.fit(E_train, Y_train)
y_pred_train =clf.predict(E_train)
y_pred_val =clf.predict(E_val)
score_train = clf.score(y_pred_train, Y_train)
score_val = clf.score(y_pred_val, Y_val)
print(f"Accuracy on train set / val set {i} : {score_train} / {score_val} (λ: {λ},γ: {γ})")

Accuracy on train set / val set 0 : 0.845625 / 0.665 (λ: 6.6e-06,γ: 15.5)


Dataset 1

In [122]:
kernel = gaussian_kernel
γ = 2000
λ = 20
#kernel = linear_kernel


clf = SVM(γ, λ, kernel)
clf.fit(E_train, Y_train)
y_pred_train =clf.predict(E_train)
y_pred_val =clf.predict(E_val)
score_train = clf.score(y_pred_train, Y_train)
score_val = clf.score(y_pred_val, Y_val)
print(f"Accuracy on train set / val set {i} : {score_train} / {score_val} (λ: {λ},γ: {γ})")

Accuracy on train set / val set 1 : 1.0 / 0.6175 (λ: 20,γ: 2000)


Dataset 2

In [91]:
kernel = gaussian_kernel
γ = 1720
λ = 1
#kernel = linear_kernel


clf = SVM(γ, λ, kernel)
clf.fit(E_train, Y_train)
y_pred_train =clf.predict(E_train)
y_pred_val =clf.predict(E_val)
score_train = clf.score(y_pred_train, Y_train)
score_val = clf.score(y_pred_val, Y_val)
print(f"Accuracy on train set / val set {i} : {score_train} / {score_val} (λ: {λ},γ: {γ})")

Accuracy on train set / val set 2 : 1.0 / 0.6475 (λ: 1,γ: 1720)


**Tuning**

In [52]:
γ = 18
λ = 2e-5

gamma_list = np.linspace(13,γ,15, endpoint = True)
lambda_list = np.linspace(3e-6, λ, 10, endpoint = True)
settings = list(product(gamma_list,lambda_list))


for j, tup in enumerate(settings):
    
    γ, λ = tup
    
    kernel = GaussianKernel(γ)
    clf = SVM(_lambda=λ, kernel=kernel)
    clf.fit(E_train, Y_train)
    y_pred_train =clf.predict(E_train)
    y_pred_val =clf.predict(E_val)
    score_train = clf.score(y_pred_train, Y_train)
    score_val = clf.score(y_pred_val, Y_val)
    print(f"Accuracy on train set / val set {i} : {score_train} / {score_val} (λ: {λ},γ: {γ})")


Accuracy on train set / val set 0 : 0.876875 / 0.65 (λ: 3e-06,γ: 13.0)
Accuracy on train set / val set 0 : 0.843125 / 0.655 (λ: 4.888888888888889e-06,γ: 13.0)
Accuracy on train set / val set 0 : 0.818125 / 0.6425 (λ: 6.777777777777778e-06,γ: 13.0)
Accuracy on train set / val set 0 : 0.799375 / 0.6375 (λ: 8.666666666666666e-06,γ: 13.0)
Accuracy on train set / val set 0 : 0.790625 / 0.6425 (λ: 1.0555555555555555e-05,γ: 13.0)
Accuracy on train set / val set 0 : 0.781875 / 0.6475 (λ: 1.2444444444444445e-05,γ: 13.0)
Accuracy on train set / val set 0 : 0.7725 / 0.6475 (λ: 1.4333333333333332e-05,γ: 13.0)
Accuracy on train set / val set 0 : 0.766875 / 0.65 (λ: 1.622222222222222e-05,γ: 13.0)
Accuracy on train set / val set 0 : 0.761875 / 0.65 (λ: 1.8111111111111112e-05,γ: 13.0)
Accuracy on train set / val set 0 : 0.758125 / 0.645 (λ: 2e-05,γ: 13.0)
Accuracy on train set / val set 0 : 0.880625 / 0.655 (λ: 3e-06,γ: 13.357142857142858)
Accuracy on train set / val set 0 : 0.84625 / 0.6475 (λ: 4.888

## Save results

In [45]:
def save_results(filename, results):
    """
    Save results in a csv file
    
    Parameters
    -----------
    - filename : string
        Name of the file to be saved under the ``results`` folder
        
    - results : numpy.array
        Resulting array (0 and 1's)
    """
    
    assert filename.endswith(".csv"), "filename must have a .csv extension"
    # Check the size before opening the file, so a failure leaves no partial file
    assert len(results) == 3000, "expected exactly 3000 predictions"
    # Convert results to int
    results = results.astype("int")
    
    with open(os.path.join(RESULT_DIR, filename), 'w', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',')

        # Write header
        writer.writerow(["Id", "Bound"])
        # Write results
        for i in range(len(results)):
            writer.writerow([i, results[i]])

In [50]:
# Test the save results function
save_results("results5.csv", results)
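
In the training loops above, the line that fills `results` is commented out. As a sketch of how the final submission could be assembled, the snippet below places 1000 test predictions per dataset into consecutive blocks and writes them in the same `Id,Bound` format. The predictions are random placeholders and the file goes to a temporary folder, so nothing here touches `RESULT_DIR`.

```python
import csv
import os
import tempfile
import numpy as np

# Hypothetical per-dataset test predictions (1000 labels each), for illustration only.
rng = np.random.default_rng(0)
predictions = [rng.integers(0, 2, size=1000) for _ in range(3)]

# Dataset i fills rows i*1000 ... i*1000 + 999 of the submission.
results = np.zeros(3000, dtype=int)
for i, y_pred_test in enumerate(predictions):
    results[i * 1000:(i + 1) * 1000] = y_pred_test

# Write to a temporary folder and read the file back as a sanity check.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "submission.csv")
    with open(path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile, delimiter=",")
        writer.writerow(["Id", "Bound"])
        writer.writerows(enumerate(results.tolist()))
    with open(path, newline="") as csvfile:
        rows = list(csv.reader(csvfile))

print(rows[0], len(rows) - 1)
```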