### INTRO

* Use cases: classification, regression, outlier detection
* Effective in high-dimensional spaces
* Uses a subset of the training data (the support vectors) in the decision function, so it is memory efficient
* Different kernel functions available
* If #features >> #samples, overfitting is likely: choose the kernel and regularization term carefully
* Does not directly provide probability estimates; these are computed with an expensive internal 5-fold CV (`probability=True`)
* Optimal performance: use a C-ordered numpy.ndarray (dense) or scipy.sparse.csr_matrix (sparse) with dtype=float64
* NuSVC: param `nu` sets an upper bound on the fraction of training (margin) errors and a lower bound on the fraction of support vectors
* Implemented with [libsvm](http://www.csie.ntu.edu.tw/~cjlin/libsvm/), [liblinear](http://www.csie.ntu.edu.tw/~cjlin/liblinear/) (wrapped using C & Cython)
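
A rough sketch of the NuSVC point above: the fitted model keeps only some of the training rows as support vectors, and raising `nu` raises the minimum number kept. The dataset (`make_blobs`) and the two `nu` values are illustrative assumptions, not part of the original notes.

```python
from sklearn import svm
from sklearn.datasets import make_blobs

# Toy two-class data (assumed for illustration): 100 samples, 50 per class
X, y = make_blobs(n_samples=100, centers=2, random_state=0)
n_samples = X.shape[0]

n_sv = {}
for nu in (0.1, 0.5):
    clf = svm.NuSVC(nu=nu, kernel='rbf', gamma='scale').fit(X, y)
    # support_ holds the indices of the training rows kept as support vectors
    n_sv[nu] = clf.support_.size
    print("nu=%.1f: %d of %d samples are support vectors" % (nu, n_sv[nu], n_samples))
```

The count should be at least `nu * n_samples` in each case and never more than `n_samples`.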

### CLASSIFICATION

[SVC](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC) |
[NuSVC](http://scikit-learn.org/stable/modules/generated/sklearn.svm.NuSVC.html#sklearn.svm.NuSVC) |
[LinearSVC](http://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html#sklearn.svm.LinearSVC)

* inputs = X [#samples,#features], y [#samples] (class labels)

[Plot diff SVM classifiers: IRIS dataset](plot_iris.ipynb) | 
[SVM: max margin separating hyperplanes](plot_separating_hyperplane.ipynb) | 
[SVM: unbalanced classes](plot_separating_hyperplane_unbalanced.ipynb) | 
[SVM: ANOVA (univariate feature select)](plot_svm_anova.ipynb) | 
[SVM: Binary classification, RBF kernel = predict XOR of inputs](plot_svm_nonlinear.ipynb)

In [5]:
# CLASSIFICATION (SVC)

from sklearn import svm
X = [[0, 0], [1, 1]]
y = [0, 1]
clf = svm.SVC()
clf.fit(X, y)  
clf.predict([[2., 2.]])

print("support vectors: ",clf.support_vectors_)
print("support vector indices: ",clf.support_ )
print("#support vectors, each class: ",clf.n_support_ )

support vectors:  [[ 0.  0.]
 [ 1.  1.]]
support vector indices:  [0 1]
#support vectors, each class:  [1 1]


In [6]:
# CLASSIFICATION (MULTICLASS = SVC, decision_function_shape='ovo')

X = [[0], [1], [2], [3]]
Y = [0, 1, 2, 3]
clf = svm.SVC(decision_function_shape='ovo')
clf.fit(X, Y) 

dec = clf.decision_function([[1]])
dec.shape[1] # 4 classes: 4*3/2 = 6

clf.decision_function_shape = "ovr"
dec = clf.decision_function([[1]])
dec.shape[1] # 4 classes

4

In [7]:
# CLASSIFICATION (MULTICLASS = LinearSVC, decision function = 'ovr', i.e. one-vs-rest)

X = [[0], [1], [2], [3]]
Y = [0, 1, 2, 3]
lin_clf = svm.LinearSVC()
lin_clf.fit(X, Y) 
dec = lin_clf.decision_function([[1]])
dec.shape[1]

4

### UNBALANCED PROBLEMS

* SVC takes `class_weight` (a dictionary of {class_label: value}); each value scales `C` for that class

[demo](plot_separating_hyperplane_unbalanced.ipynb)
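
A minimal sketch of `class_weight` on an unbalanced problem. The synthetic data (200 vs 20 points) and the weight of 10 are assumptions chosen only to make the effect visible.

```python
import numpy as np
from sklearn import svm

rng = np.random.RandomState(0)
# 200 majority-class points around (0, 0), 20 minority-class points around (2, 2)
X = np.vstack([rng.randn(200, 2), rng.randn(20, 2) + [2, 2]])
y = np.array([0] * 200 + [1] * 20)

plain = svm.SVC(kernel='linear').fit(X, y)
# Errors on class 1 are penalized 10x more (C is multiplied by 10 for that class)
weighted = svm.SVC(kernel='linear', class_weight={1: 10}).fit(X, y)

n_plain = int((plain.predict(X) == 1).sum())
n_weighted = int((weighted.predict(X) == 1).sum())
print("predicted as minority class: plain=%d, weighted=%d" % (n_plain, n_weighted))
print("per-class multipliers of C:", weighted.class_weight_)
```

Up-weighting the minority class is expected to shift the separating hyperplane toward the majority class, so more points are assigned to class 1.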

### REGRESSION (SVR)

[SVR](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html#sklearn.svm.SVR) |
[NuSVR](http://scikit-learn.org/stable/modules/generated/sklearn.svm.NuSVR.html#sklearn.svm.NuSVR) |
[LinearSVR](http://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVR.html#sklearn.svm.LinearSVR) | 
[demo: linear vs nonlinear kernels](plot_svm_regression.ipynb)

In [8]:
from sklearn import svm
X = [[0, 0], [2, 2]]
y = [0.5, 2.5]
clf = svm.SVR()
clf.fit(X, y) 
clf.predict([[1, 1]])

array([ 1.5])

### DENSITY ESTIMATION / NOVELTY DETECTION (1-CLASS SVM)

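* Given a set of samples, the one-class SVM (1cSVM) learns a soft boundary around them, so new points can be classified as inside (inlier) or outside (novelty).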

[API](http://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html#sklearn.svm.OneClassSVM) | [1cSVM: RBF kernel](plot_oneclass.ipynb) | [1cSVM: species distribution](plot_species_distribution_modeling.ipynb)
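
A small sketch of novelty detection with `OneClassSVM`. The training cloud, the uniformly scattered "novel" points, and `nu=0.1` are assumptions for illustration.

```python
import numpy as np
from sklearn import svm

rng = np.random.RandomState(42)
X_train = 0.3 * rng.randn(100, 2)                        # tight cloud around the origin
X_novel = rng.uniform(low=-4, high=4, size=(20, 2))      # points scattered far more widely

# nu is roughly the fraction of training points allowed outside the boundary
clf = svm.OneClassSVM(nu=0.1, kernel='rbf', gamma='scale')
clf.fit(X_train)                                         # unsupervised: no y

pred_train = clf.predict(X_train)   # +1 = inside the boundary, -1 = outside
pred_novel = clf.predict(X_novel)
print("training points flagged as outliers:", int((pred_train == -1).sum()))
print("novel points flagged as outliers:", int((pred_novel == -1).sum()))
```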

### KERNEL OPTIONS

* linear, polynomial (`kernel='poly'`), rbf, sigmoid

In [9]:
linear_svc = svm.SVC(kernel='linear')
linear_svc.kernel
rbf_svc = svm.SVC(kernel='rbf')
rbf_svc.kernel

'rbf'
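
To see the four built-in kernels side by side, one option is to fit each on the same data. This sketch assumes the iris dataset and reports training accuracy only, so it says nothing about generalization.

```python
from sklearn import svm
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

kernels = ['linear', 'poly', 'rbf', 'sigmoid']
scores = {}
for name in kernels:
    clf = svm.SVC(kernel=name, gamma='scale')
    clf.fit(X, y)
    scores[name] = clf.score(X, y)   # accuracy on the training data itself
    print(name, scores[name])
```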

### COMPLEXITY

* QP solver, libsvm-based implementation
* O(#features x #samples^2) to O(#features x #samples^3), dataset-dependent

### CUSTOM KERNELS (PYTHON)

[3-class SVM](plot_custom_kernel.ipynb)
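
A minimal sketch of a custom kernel passed as a Python function. `my_kernel` is a hypothetical name; it just reproduces the linear kernel, so the result can be checked against `kernel='linear'`. The toy data are assumed.

```python
import numpy as np
from sklearn import svm

def my_kernel(A, B):
    # Must return the kernel matrix of shape (A.shape[0], B.shape[0])
    return np.dot(A, B.T)

X = np.array([[0, 0], [1, 1], [2, 2], [3, 3]], dtype=float)
y = [0, 0, 1, 1]

custom_clf = svm.SVC(kernel=my_kernel).fit(X, y)
linear_clf = svm.SVC(kernel='linear').fit(X, y)

pred_custom = custom_clf.predict(X)
pred_linear = linear_clf.predict(X)
print(pred_custom, pred_linear)
```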

#### SVM WITH PRECOMPUTED 'GRAM' MATRIX

In [10]:
import numpy as np
from sklearn import svm
X = np.array([[0, 0], [1, 1]])
y = [0, 1]
clf = svm.SVC(kernel='precomputed')
# linear kernel computation
gram = np.dot(X, X.T)
clf.fit(gram, y) 
# predict on training examples
clf.predict(gram)

array([0, 1])
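
The cell above predicts only on the training examples. For new points, the matrix passed to `predict` has to hold the kernel between each test point and each training point, giving shape [#test_samples, #train_samples]. A sketch with assumed toy data:

```python
import numpy as np
from sklearn import svm

X_train = np.array([[0, 0], [1, 1], [2, 2], [3, 3]], dtype=float)
y_train = [0, 0, 1, 1]
X_test = np.array([[0.5, 0.5], [2.5, 2.5]])

clf = svm.SVC(kernel='precomputed')
gram_train = np.dot(X_train, X_train.T)   # (4, 4): train vs train
clf.fit(gram_train, y_train)

gram_test = np.dot(X_test, X_train.T)     # (2, 4): test vs train
pred_test = clf.predict(gram_test)
print(pred_test)
```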

[SVM vs gamma,C params on RBF function](plot_rbf_parameters.ipynb)