# Support vector machines

## 1. Kernels

So far we have only used the RBF kernel, but other kernel types exist too. Their mathematical formulation is described in the scikit-learn documentation:

http://scikit-learn.org/stable/modules/svm.html#svm-mathematical-formulation

The idea is that kernels are inner products in a transformed space. 

### 1.1 The linear kernel

The linear kernel simply creates linear decision boundaries. In scikit-learn you select it with `kernel='linear'`; note that the default kernel of `SVC` is `'rbf'`, not the linear one. The linear kernel is the plain inner product $\langle x, x' \rangle$ of two samples $x$ and $x'$. You do not need to understand the details here; we give the expression because the other kernels below are written in the same notation. Some kernels take additional parameters, so it helps to know where those parameters appear in the formula.
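To make the notation concrete, here is a minimal sketch of the inner product $\langle x, x' \rangle$ in plain Python. The function name `linear_kernel` and the sample vectors are made up for illustration; scikit-learn computes this internally.

```python
def linear_kernel(x, x_prime):
    """Inner product <x, x'> of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, x_prime))


# Hypothetical samples: 1*3 + 2*4
k = linear_kernel([1.0, 2.0], [3.0, 4.0])
print(k)
```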

### 1.2 The RBF kernel

There are two parameters when training an SVM with the Radial Basis Function (RBF) kernel: C and gamma.

- The parameter C is common to all SVM kernels. It sets the trade-off between misclassification of training samples and simplicity of the decision function. A high C tries to classify as many training samples correctly as possible (and might lead to overfitting); a low C tolerates more misclassifications in exchange for a smoother decision boundary.

- Gamma defines how far the influence of a single training example reaches. The larger gamma is, the closer other examples must be to be affected.
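The effect of C can be explored with a small experiment. The sketch below assumes a synthetic dataset from `make_moons` and an arbitrary fixed `gamma=1.0`; both are illustrative choices, not part of the lesson's data. It records the training accuracy and the number of support vectors for a few values of C.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Synthetic two-class data, chosen only for illustration.
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

train_acc = {}
n_support = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", C=C, gamma=1.0).fit(X, y)
    train_acc[C] = clf.score(X, y)             # accuracy on the training set
    n_support[C] = int(clf.n_support_.sum())   # support vectors over all classes

for C in train_acc:
    print(f"C={C}: training accuracy={train_acc[C]:.2f}, "
          f"support vectors={n_support[C]}")
```

Training accuracy alone does not reveal overfitting; comparing it with the accuracy on held-out data is the usual way to pick C.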

The RBF kernel is specified as 

$$\exp{(-\gamma \lVert  x -  x' \rVert^2)} $$

Gamma has a strong effect on the results: a gamma that is too large leads to overfitting, while a gamma that is too small leads to underfitting (much like fitting a simple linear boundary to a complex problem).
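A quick way to see why gamma matters is to evaluate the kernel directly. This is a minimal plain-Python sketch; the function name `rbf_kernel` and the two points are assumptions made for illustration. With a large gamma, the kernel value between two moderately distant points drops to almost zero, so each training example only influences its immediate neighbourhood.

```python
import math


def rbf_kernel(x, x_prime, gamma):
    """exp(-gamma * ||x - x'||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-gamma * sq_dist)


x, x_prime = [0.0, 0.0], [1.0, 1.0]            # squared distance is 2
k_small = rbf_kernel(x, x_prime, gamma=0.1)    # exp(-0.2): still similar
k_large = rbf_kernel(x, x_prime, gamma=10.0)   # exp(-20): practically zero
print(k_small, k_large)
```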

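You can specify a value for gamma using the parameter `gamma`. With `gamma='auto'`, gamma is set to 1/number_of_features (so 0.5 with 2 features, 0.333 with 3 features, etc.). Note that `'auto'` was the default in older scikit-learn versions; since version 0.22 the default is `'scale'`, which uses 1/(number_of_features * X.var()).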
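The sketch below shows how a numeric gamma relates to the string options `"auto"` and `"scale"` that scikit-learn accepts. The dataset is a synthetic one from `make_moons`, used only as an assumption for illustration. It checks that passing 1/number_of_features by hand gives the same decision function as `gamma="auto"`.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Synthetic data with 2 features, chosen only for illustration.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

n_features = X.shape[1]
gamma_auto = 1.0 / n_features               # what gamma="auto" uses
gamma_scale = 1.0 / (n_features * X.var())  # what gamma="scale" uses

clf_manual = SVC(kernel="rbf", gamma=gamma_auto).fit(X, y)
clf_auto = SVC(kernel="rbf", gamma="auto").fit(X, y)

# Both models should produce the same decision values.
same = np.allclose(clf_manual.decision_function(X),
                   clf_auto.decision_function(X))
print(gamma_auto, gamma_scale, same)
```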

### 1.3 The Polynomial kernel

The Polynomial kernel is specified as 

$$(\gamma \langle x, x' \rangle + r)^d $$

- d can be specified by the keyword `degree`. The default degree is 3. 
- r can be specified by the keyword `coef0`. The default is 0.
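As a rough sketch of how these keywords are passed, the example below fits a degree-2 polynomial kernel on concentric circles. The dataset (`make_circles`) and the chosen values `degree=2`, `coef0=1.0` and `gamma="auto"` are assumptions for illustration only.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: not linearly separable, but separable
# with quadratic terms. Synthetic data, for illustration only.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.4, random_state=0)

# d -> degree, r -> coef0; gamma is the same parameter as for RBF.
clf = SVC(kernel="poly", degree=2, coef0=1.0, gamma="auto").fit(X, y)
train_score = clf.score(X, y)
params = clf.get_params()
print(params["degree"], params["coef0"], train_score)
```

Setting `coef0` to a non-zero value lets the kernel mix lower-order terms with the degree-`d` terms, which is often worth trying alongside `degree`.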

### 1.4 The Sigmoid kernel

The sigmoid kernel is specified as 

$$\tanh ( \gamma \langle x, x' \rangle + r) $$

Here $r$ is again set with the keyword `coef0`. This kernel is similar to the sigmoid function in logistic regression.
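For reference, the sigmoid kernel can be evaluated by hand in a few lines. This is a plain-Python sketch; the function name `sigmoid_kernel` and the sample vectors are made up for illustration. Like the logistic sigmoid, `tanh` squashes its input into a bounded range, here between -1 and 1.

```python
import math


def sigmoid_kernel(x, x_prime, gamma, r=0.0):
    """tanh(gamma * <x, x'> + r) for two equal-length vectors."""
    inner = sum(a * b for a, b in zip(x, x_prime))
    return math.tanh(gamma * inner + r)


# Hypothetical samples with inner product 11, so this is tanh(1.1).
k = sigmoid_kernel([1.0, 2.0], [3.0, 4.0], gamma=0.1)
print(k)
```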

## 2. SVC, NuSVC and LinearSVC

Besides `SVC`, scikit-learn provides other support vector classifiers.

### 2.1 NuSVC

NuSVC is similar to SVC, but accepts slightly different parameters, and the mathematical formulation is also altered slightly.

A new parameter $\nu$ is introduced, which takes values in $(0, 1]$. It controls the number of support vectors and training errors: $\nu$ is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors.


Just like SVC, NuSVC implements the "one-versus-one" approach when there are more than 2 classes. This means that when there are $n$ classes, $\dfrac{n(n-1)}{2}$ classifiers are created, and each one separates the samples of 2 classes.
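The number of pairwise classifiers can be checked on a toy problem. The sketch below assumes 4 synthetic classes from `make_blobs` and sets `decision_function_shape="ovo"` so that the decision function exposes one column per pair of classes; the value `nu=0.5` is simply the scikit-learn default written out.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import NuSVC

# Four synthetic, balanced classes, for illustration only.
n_classes = 4
X, y = make_blobs(n_samples=200, centers=n_classes, random_state=0)

clf = NuSVC(nu=0.5, decision_function_shape="ovo").fit(X, y)
scores = clf.decision_function(X)

n_pairwise = n_classes * (n_classes - 1) // 2   # 4 * 3 / 2 = 6
print(scores.shape, n_pairwise)
```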

### 2.2 LinearSVC

LinearSVC is similar to SVC, but it only supports the linear kernel, and instead of the "one-versus-one" method it uses a "one-versus-rest" method. When there are $n$ classes, just $n$ classifiers are created, and each one separates the class of interest from all the other classes. SVC therefore generates more classifiers, so in cases with many classes LinearSVC tends to scale better.
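The one-versus-rest structure shows up in the shape of the fitted model. In this sketch (4 synthetic classes from `make_blobs`, with `max_iter` raised as a precaution against convergence warnings; both are illustrative assumptions) there is one weight vector and one decision score per class.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Four synthetic classes with 2 features, for illustration only.
X, y = make_blobs(n_samples=200, centers=4, random_state=0)

clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
scores = clf.decision_function(X)

# One row of coefficients and one score column per class.
print(clf.coef_.shape, scores.shape)
```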


## 3. Probabilities and predictions 

You can make predictions using support vector machines with `predict`. The SVC decision function (`decision_function`) gives a score per class, but these scores are not probabilities. To get a probability per class, set the `probability` argument to `True` when creating the classifier and then call `predict_proba`. This is not done by default because scikit-learn internally uses 5-fold cross-validation to calibrate the probabilities, so setting `probability` to `True` makes fitting slower. For large data sets, the extra computation time can be considerable.
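A minimal sketch of the difference between the two kinds of output, assuming the iris dataset bundled with scikit-learn as example data:

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 3 classes, used only as example data

# probability=True triggers the extra internal cross-validation at fit time.
clf = SVC(kernel="rbf", gamma="scale", probability=True, random_state=0)
clf.fit(X, y)

labels = clf.predict(X[:5])            # hard class predictions
scores = clf.decision_function(X[:5])  # uncalibrated scores, one per class
proba = clf.predict_proba(X[:5])       # probabilities, each row sums to 1
print(labels, proba.round(3))
```

Calling `predict_proba` on a classifier created without `probability=True` raises an error, so the choice has to be made before fitting.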

# Sources

http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html
    
http://crsouza.com/2010/03/17/kernel-functions-for-machine-learning-applications/#linear

https://jakevdp.github.io/PythonDataScienceHandbook/05.07-support-vector-machines.html

http://scikit-learn.org/stable/modules/svm.html#svm-kernels (main source)