## Tuning the One-Class SVM parameters

For a Gaussian kernel, the Gram matrix has the form:

K(xi, xj) = exp ( - gamma * d(xi, xj)^2  )

Three remarks:
1. whatever the value of nu, the Gram matrix does not change
2. you can start from a matrix of squared distances d(xi, xj)^2 and derive the Gram matrix for each gamma fairly quickly
3. scikit-learn accepts a precomputed Gram matrix through the "precomputed" kernel (look at examples to understand how to build the training and test matrices)
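Remark 2 can be sketched as follows. This is a minimal illustration on synthetic data (`X_train` and the helper `squared_distances` are stand-ins introduced here, not part of the original notebook): the squared-distance matrix is computed once, and each Gram matrix is then a single element-wise exponential.

```python
import numpy as np

def squared_distances(A, B):
    """Pairwise squared Euclidean distances between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(sq, 0.0)  # clip tiny negative values caused by rounding

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 4))          # synthetic stand-in for the real data

D2 = squared_distances(X_train, X_train)    # computed once, independent of gamma and nu

gammas = [1e-4, 1e-3, 1e-2]
grams = {g: np.exp(-g * D2) for g in gammas}  # one cheap element-wise operation per gamma
```

Since nu never enters the Gram matrix, the same `grams[g]` can be reused for every value of nu tried with that gamma.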

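Interpretation of nu:

nu is an upper bound on the fraction of training errors (training points flagged as outliers) and a lower bound on the fraction of support vectors. It must lie in (0, 1]. OneClassSVM has no C parameter; nu is its regularization parameter. For comparison, in a standard SVC a low C makes the decision surface smooth, while a high C aims at classifying all training examples correctly.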
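To see how nu behaves in practice for OneClassSVM, here is a small sketch on synthetic Gaussian data (the data, the gamma value and the nu grid are arbitrary choices for illustration). It reports, for each nu, the fraction of training points flagged as outliers and the fraction of support vectors.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))   # synthetic training data

results = {}
for nu in (0.05, 0.2, 0.5):
    model = OneClassSVM(kernel="rbf", gamma=0.5, nu=nu).fit(X)
    frac_outliers = float(np.mean(model.predict(X) == -1))  # training points labelled -1
    frac_sv = len(model.support_) / len(X)                  # share of support vectors
    results[nu] = (frac_outliers, frac_sv)
    print(f"nu={nu:.2f}  flagged={frac_outliers:.2f}  support vectors={frac_sv:.2f}")
```

The fraction of support vectors should not drop below nu, and the flagged fraction should stay close to or below nu, up to the solver tolerance.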

Interpretation of gamma:

gamma defines how much influence a single training example has. 
The larger gamma is, the closer other examples must be to be affected.
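A quick numerical illustration of this effect, using the kernel formula K = exp(-gamma * d^2) with arbitrary example distances:

```python
import numpy as np

distances = np.array([0.5, 1.0, 2.0, 5.0])   # illustrative values of d(xi, xj)

kernel_values = {}
for gamma in (0.01, 0.1, 1.0, 10.0):
    kernel_values[gamma] = np.exp(-gamma * distances ** 2)
    print(f"gamma={gamma:<5} K={np.round(kernel_values[gamma], 4)}")
```

With a small gamma even distant points keep a kernel value near 1, so every training example influences a wide region. With a large gamma the kernel value collapses towards 0 as soon as two points are not very close.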

### Custom kernel

You can define your own kernels by either giving the kernel as a python function or by precomputing the Gram matrix.

Classifiers with custom kernels behave the same way as any other classifiers, except that:

1. Field support_vectors_ is now empty, only indices of support vectors are stored in support_.
2. A reference (and not a copy) of the first argument in the fit() method is stored for future reference. If that array changes between the use of fit() and predict() you will have unexpected results.
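As a sketch of the first option (passing the kernel as a Python function), the Gaussian kernel can be written as a callable and handed to OneClassSVM. The function name `gaussian_kernel`, the constant `GAMMA` and the synthetic data are illustrative assumptions, not taken from the notebook.

```python
import numpy as np
from sklearn.svm import OneClassSVM

GAMMA = 0.5  # illustrative value

def gaussian_kernel(A, B):
    """Return the Gram matrix K[i, j] = exp(-GAMMA * ||A[i] - B[j]||^2)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-GAMMA * np.maximum(sq, 0.0))

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))   # synthetic data
X_test = rng.normal(size=(20, 3))

# scikit-learn calls gaussian_kernel(X, X_train) internally at fit and predict time
model = OneClassSVM(kernel=gaussian_kernel, nu=0.1).fit(X_train)
pred = model.predict(X_test)          # +1 = inlier, -1 = outlier
```

Because a reference to the training array is kept, `X_train` should not be modified between `fit` and `predict`.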
    
### Use the Gram matrix

Set kernel='precomputed' and pass the Gram matrix instead of X in the fit method. At the moment, the kernel values between all training vectors and the test vectors must be provided, so predict expects a matrix of shape (n_test, n_train).

In [None]:
>>> import numpy as np
>>> from sklearn import svm
>>> X = np.array([[0, 0], [1, 1]])
>>> y = [0, 1]
>>> clf = svm.SVC(kernel='precomputed')
>>> # linear kernel computation
>>> gram = np.dot(X, X.T)
>>> clf.fit(gram, y) 
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='precomputed', max_iter=-1, probability=False,
    random_state=None, shrinking=True, tol=0.001, verbose=False)
>>> # predict on training examples
>>> clf.predict(gram)
array([0, 1])
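The example above only predicts on the training points. For new points, the matrix passed to `predict` holds the kernel between each test point (rows) and each training point (columns). A minimal sketch with the same linear kernel, where `X_test` is a hypothetical set of new points:

```python
import numpy as np
from sklearn import svm

X_train = np.array([[0, 0], [1, 1]])
y_train = [0, 1]
X_test = np.array([[2, 2], [-1, -1], [3, 0]])   # hypothetical new points

clf = svm.SVC(kernel="precomputed")
clf.fit(X_train @ X_train.T, y_train)           # training Gram: shape (n_train, n_train)

gram_test = X_test @ X_train.T                  # test Gram: shape (n_test, n_train)
pred = clf.predict(gram_test)
print(gram_test.shape, pred)
```

Passing a matrix of shape (n_test, n_test), or transposing it, is the most common mistake with precomputed kernels.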

### Test on our dataset

In our case, we want to use the Gaussian kernel, so the Gram matrix is:
    K(xi, xj) = exp ( - gamma * d(xi, xj)^2 )

In [8]:
import pandas as pd
import numpy as np
from sklearn import svm

In [9]:
data_train = pd.read_hdf('train.hdf5')

In [10]:
data_valid = pd.read_hdf('validation.hdf5')

In [11]:
from collections import Counter
from scipy.spatial.distance import cdist

value_gamma = 0.001
# gamma is ignored by the estimator when kernel='precomputed'; it only acts through the Gram matrix below
SVM_oneclass_kernelPrecomputed = svm.OneClassSVM(kernel='precomputed', gamma=value_gamma)
# RBF kernel on the training set: shape (n_train, n_train)
gram = np.exp(-value_gamma * cdist(data_train, data_train, 'euclidean')**2)
SVM_oneclass_kernelPrecomputed.fit(gram)

# predict on the validation set: rows are validation samples, columns are training samples
gram_valid = np.exp(-value_gamma * cdist(data_valid, data_train, 'euclidean')**2)
pred_anomaly = SVM_oneclass_kernelPrecomputed.predict(gram_valid)

# +1 = inlier (normal), -1 = outlier (anomaly)
Counter(pred_anomaly)

Counter({1: 116, -1: 478})

In [15]:
value_gamma = 0.0001
# gamma is ignored by the estimator when kernel='precomputed'; it only acts through the Gram matrix below
SVM_oneclass_kernelPrecomputed = svm.OneClassSVM(kernel='precomputed', gamma=value_gamma)
# RBF kernel on the training set: shape (n_train, n_train)
gram = np.exp(-value_gamma * cdist(data_train, data_train, 'euclidean')**2)
SVM_oneclass_kernelPrecomputed.fit(gram)

# predict on the validation set: rows are validation samples, columns are training samples
gram_valid = np.exp(-value_gamma * cdist(data_valid, data_train, 'euclidean')**2)
pred_anomaly = SVM_oneclass_kernelPrecomputed.predict(gram_valid)

# +1 = inlier (normal), -1 = outlier (anomaly)
Counter(pred_anomaly)

Counter({1: 246, -1: 348})
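The two cells above differ only in gamma, so the search can be written as a loop that reuses the squared distances. The sketch below runs on synthetic data (`X_train` and `X_valid` are stand-ins for `data_train` and `data_valid`, and the gamma and nu grids are arbitrary), so its counts are not comparable to the ones reported here.

```python
import numpy as np
from collections import Counter
from scipy.spatial.distance import cdist
from sklearn import svm

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))                       # stand-in for data_train
X_valid = np.vstack([rng.normal(size=(40, 5)),
                     rng.normal(loc=6.0, size=(10, 5))])  # 10 shifted points as anomalies

# Squared distances are computed once and reused for every gamma.
d2_train = cdist(X_train, X_train, "sqeuclidean")
d2_valid = cdist(X_valid, X_train, "sqeuclidean")   # rows: validation, columns: training

counts = {}
for gamma in (1e-3, 1e-2, 1e-1):
    gram_train = np.exp(-gamma * d2_train)
    gram_valid = np.exp(-gamma * d2_valid)
    for nu in (0.05, 0.2):                           # the Gram matrices do not depend on nu
        model = svm.OneClassSVM(kernel="precomputed", nu=nu).fit(gram_train)
        counts[(gamma, nu)] = Counter(model.predict(gram_valid).tolist())

for key, c in counts.items():
    print(key, dict(c))
```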

### With gamma = 0.001:

In [16]:
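# pred_anomaly must hold the predictions of the gamma = 0.001 run
# (re-run that cell first if the gamma = 0.0001 cell was executed last)
# OneClassSVM labels: +1 = inlier, -1 = outlier; map them to 0 = normal, 1 = anomaly
pred_anomaly2 = pd.DataFrame(pred_anomaly)
pred_anomaly2 = pred_anomaly2.replace(1, 0)
pred_anomaly2 = pred_anomaly2.replace(-1, 1)

# one row per validation sequence: its index and its 0/1 anomaly flag
data = {'seqID': np.arange(0,len(data_valid)), 'anomaly': pred_anomaly2[0]}
y_test_template_gram = pd.DataFrame(data)
y_test_template_gram["seqID"] = y_test_template_gram["seqID"].astype(int)
y_test_template_gram["anomaly"] = y_test_template_gram["anomaly"].astype(int)

y_test_template_gram.to_csv('y_test_template_gram.csv',  index = False, sep=";")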

4th submission:
    F1 Score: 0.67871
    Precision: 0.55021
    Recall: 0.88552
The model had flagged 478 anomalies and 116 normal sequences.


### With gamma = 0.0001:

In [None]:
# pred_anomaly must hold the predictions of the gamma = 0.0001 run
# OneClassSVM labels: +1 = inlier, -1 = outlier; map them to 0 = normal, 1 = anomaly
pred_anomaly2 = pd.DataFrame(pred_anomaly)
pred_anomaly2 = pred_anomaly2.replace(1, 0)
pred_anomaly2 = pred_anomaly2.replace(-1, 1)

# one row per validation sequence: its index and its 0/1 anomaly flag
data = {'seqID': np.arange(0,len(data_valid)), 'anomaly': pred_anomaly2[0]}
y_test_template_gram = pd.DataFrame(data)
y_test_template_gram["seqID"] = y_test_template_gram["seqID"].astype(int)
y_test_template_gram["anomaly"] = y_test_template_gram["anomaly"].astype(int)

y_test_template_gram.to_csv('y_test_template_gram0001.csv',  index = False, sep=";")

5th submission:
    F1 Score: 0.80000
    Precision: 0.74138
    Recall: 0.86869
The model had flagged 348 anomalies and 246 normal sequences.
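As a sanity check on the two submissions, the F1 score can be recomputed as the harmonic mean of precision and recall. The sketch below also backs out the implied number of true positives, under the assumption that precision is computed over the sequences flagged as anomalies and recall over the actual anomalies (the scoring definition is not stated in the notebook).

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, number of sequences flagged as anomalies), as reported above
submissions = {
    "gamma=0.001": (0.55021, 0.88552, 478),
    "gamma=0.0001": (0.74138, 0.86869, 348),
}

summary = {}
for name, (p, r, flagged) in submissions.items():
    f1 = f1_from(p, r)
    true_pos = p * flagged            # assumes precision = TP / flagged
    actual_anomalies = true_pos / r   # assumes recall = TP / actual anomalies
    summary[name] = (f1, true_pos, actual_anomalies)
    print(f"{name}: F1={f1:.5f}  TP~{true_pos:.1f}  actual anomalies~{actual_anomalies:.1f}")
```

Under these assumptions both runs point to about 297 actual anomalies in the validation set, and the lower gamma keeps almost the same number of true positives (about 258 versus 263) while flagging 130 fewer sequences, which is where the precision gain comes from.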