## 3.1.4 Support Vector Machines

Support vector machines (SVMs) are a set of supervised learning methods used for:
- classification
- regression
- outlier detection

Advantages:
- Effective in high-dimensional spaces
- Still effective in cases where the number of dimensions is greater than the number of samples
- Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient
- Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels
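
The custom-kernel option mentioned in the last point can be sketched as follows. This is a minimal illustration, not a recommended kernel: `my_linear_kernel`, `X_toy` and `y_toy` are hypothetical names introduced here, and the callable simply reproduces the built-in linear kernel so the two classifiers can be compared.

```python
import numpy as np
from sklearn import svm


def my_linear_kernel(A, B):
    """Hypothetical custom kernel: returns the Gram matrix of plain dot products."""
    return np.dot(A, B.T)


# Toy data (an assumption for illustration): two well-separated groups on a line.
X_toy = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y_toy = np.array([0, 0, 1, 1])

# A callable passed as `kernel` receives two sample matrices and must
# return an array of shape (n_samples_A, n_samples_B).
custom_clf = svm.SVC(kernel=my_linear_kernel).fit(X_toy, y_toy)
builtin_clf = svm.SVC(kernel='linear').fit(X_toy, y_toy)

custom_pred = custom_clf.predict(X_toy)
builtin_pred = builtin_clf.predict(X_toy)
print(custom_pred, builtin_pred)
```

Because the callable computes the same inner products as `kernel='linear'`, both classifiers should agree on these points.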

Disadvantages:
- If the number of features is much greater than the number of samples, avoiding over-fitting when choosing the kernel function and regularization term is crucial.
- SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below).
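
As a rough sketch of the probability point above, the snippet below enables probability estimates on an SVC through its `probability` constructor parameter. The synthetic two-cluster data (`X_prob`, `y_prob`) is an assumption made for illustration, and fitting is slower than usual because of the internal cross-validation.

```python
import numpy as np
from sklearn import svm

# Synthetic data (assumed for illustration): two separated Gaussian clusters.
rng = np.random.RandomState(0)
X_prob = np.r_[rng.randn(30, 2) - 2, rng.randn(30, 2) + 2]
y_prob = np.array([0] * 30 + [1] * 30)

# probability=True triggers the extra cross-validated calibration step at fit time.
prob_clf = svm.SVC(gamma='scale', probability=True, random_state=0)
prob_clf.fit(X_prob, y_prob)

# One row per query point, one column per class; each row sums to 1.
proba = prob_clf.predict_proba([[-2.0, -2.0], [2.0, 2.0]])
print(proba)
```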

### 1) Classification

SVC, NuSVC and LinearSVC are classes capable of performing multi-class classification on a dataset.

SVC and NuSVC are similar methods, but they accept slightly different sets of parameters and have different mathematical formulations (see section Mathematical formulation). LinearSVC, on the other hand, is another implementation of Support Vector Classification for the case of a linear kernel. Note that LinearSVC does not accept the keyword kernel, as the kernel is assumed to be linear. It also lacks some of the members of SVC and NuSVC, such as support_.

In [2]:
from sklearn import svm
X = [[0, 0], [1, 1]] 
y = [0, 1]
clf = svm.SVC(gamma='scale') 
clf.fit(X, y)
print(clf.predict([[2., 2.]]))
# get support vectors 
print(clf.support_vectors_)
# get indices of support vectors 
print(clf.support_)
# get number of support vectors for each class 
print(clf.n_support_)

[1]
[[0. 0.]
 [1. 1.]]
[0 1]
[1 1]
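
For comparison with the SVC cell above, the following sketch checks the two LinearSVC differences described earlier: it rejects a `kernel` keyword, and a fitted model has no `support_` member. The variable names (`X_small`, `rejects_kernel`, `has_support`) are illustrative only.

```python
from sklearn import svm

X_small = [[0, 0], [1, 1]]
y_small = [0, 1]

# LinearSVC has no `kernel` parameter, so passing one raises a TypeError.
try:
    svm.LinearSVC(kernel='linear')
    rejects_kernel = False
except TypeError:
    rejects_kernel = True

lin = svm.LinearSVC().fit(X_small, y_small)

# The linear model exposes its weights through coef_ rather than support vectors.
has_support = hasattr(lin, 'support_')
print(rejects_kernel, has_support, lin.coef_.shape)
```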


#### 1.1) Multi-class classification

In [3]:
X = [[0], [1], [2], [3]]
Y = [0, 1, 2, 3]
clf = svm.SVC(gamma='scale', decision_function_shape='ovo')
print(clf.fit(X, Y))
dec = clf.decision_function([[1]])
print(dec.shape)
clf.decision_function_shape = 'ovr'
dec = clf.decision_function([[1]])
print(dec.shape)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovo', degree=3, gamma='scale', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)
(1, 6)
(1, 4)
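
The two shapes printed above follow from the multi-class scheme: with `'ovo'` (one-vs-one) there is one column per pair of classes, that is n_class * (n_class - 1) / 2 columns, while `'ovr'` yields one column per class. With four classes this gives 6 and 4. A small sketch that checks the pattern for a few class counts, fitting a separate classifier per setting (the `shapes` dictionary and loop variables are illustrative):

```python
from sklearn import svm

shapes = {}
for k in (3, 4, 5):
    # One sample per class, placed at 0, 1, ..., k-1 on a line.
    X_k = [[i] for i in range(k)]
    y_k = list(range(k))

    ovo = svm.SVC(gamma='scale', decision_function_shape='ovo').fit(X_k, y_k)
    ovr = svm.SVC(gamma='scale', decision_function_shape='ovr').fit(X_k, y_k)

    shapes[k] = (ovo.decision_function([[1]]).shape,
                 ovr.decision_function([[1]]).shape)

for k, (ovo_shape, ovr_shape) in shapes.items():
    print(k, ovo_shape, ovr_shape)
```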


LinearSVC, on the other hand, implements a “one-vs-the-rest” multi-class strategy, thus training n_class models. If there are only two classes, only one model is trained:

In [4]:
lin_clf = svm.LinearSVC() 
print(lin_clf.fit(X, Y))
dec = lin_clf.decision_function([[1]])
print(dec.shape)

LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
          intercept_scaling=1, loss='squared_hinge', max_iter=1000,
          multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
          verbose=0)
(1, 4)
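
The "one model per class, but a single model for two classes" behaviour can be observed through the shape of `coef_`, which holds one row of weights per trained model. A minimal sketch, reusing the same four points with an assumed two-class relabelling:

```python
from sklearn import svm

X_line = [[0], [1], [2], [3]]

# Four classes: one-vs-the-rest trains one linear model per class.
multi = svm.LinearSVC().fit(X_line, [0, 1, 2, 3])

# Two classes (hypothetical relabelling): a single model is enough.
binary = svm.LinearSVC().fit(X_line, [0, 0, 1, 1])

print(multi.coef_.shape)    # rows = number of models for the 4-class problem
print(binary.coef_.shape)   # rows = number of models for the 2-class problem
print(binary.decision_function([[1]]).shape)  # one score per sample in the binary case
```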


#### 1.2) Unbalanced problems

In problems where it is desired to give more importance to certain classes or certain individual samples, the keywords class_weight and sample_weight can be used.

SVC (but not NuSVC) implements a class_weight keyword, which is set on the estimator itself (it appears in the SVC parameter listing printed above). It is a dictionary of the form {class_label : value}, where value is a floating point number > 0 that sets the parameter C of class class_label to C * value.
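
A hedged sketch of class weighting on an unbalanced problem follows. It passes `class_weight` to the SVC constructor, which is where current scikit-learn releases accept it. The synthetic data and the weight of 10 for the minority class are assumptions chosen for illustration.

```python
import numpy as np
from sklearn import svm

# Synthetic unbalanced data (assumed): 200 majority vs. 20 minority samples.
rng = np.random.RandomState(0)
X_unbal = np.r_[1.5 * rng.randn(200, 2), 0.5 * rng.randn(20, 2) + [2, 2]]
y_unbal = np.array([0] * 200 + [1] * 20)

plain = svm.SVC(kernel='linear', C=1.0).fit(X_unbal, y_unbal)

# Errors on class 1 are penalised with C * 10; class 0 keeps C * 1.
weighted = svm.SVC(kernel='linear', C=1.0, class_weight={1: 10}).fit(X_unbal, y_unbal)

n_plain = int((plain.predict(X_unbal) == 1).sum())
n_weighted = int((weighted.predict(X_unbal) == 1).sum())

print(weighted.class_weight_)   # per-class multipliers applied to C
print(n_plain, n_weighted)      # training points assigned to the minority class
```

Up-weighting the minority class typically moves the decision boundary towards the majority class, so more points are assigned to class 1.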

SVC, NuSVC, SVR, NuSVR and OneClassSVM also implement weights for individual samples through the sample_weight keyword of the fit method. Similar to class_weight, these set the parameter C for the i-th example to C * sample_weight[i].
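
Per-sample weighting can be sketched in the same way. The data and the choice to multiply the weights of the first ten samples by 10 are assumptions for illustration. The snippet only shows that passing `sample_weight` to `fit` changes the learned decision function.

```python
import numpy as np
from sklearn import svm

# Synthetic overlapping classes (assumed for illustration).
rng = np.random.RandomState(0)
X_sw = np.r_[rng.randn(10, 2) + [1, 1], rng.randn(10, 2)]
y_sw = np.array([1] * 10 + [-1] * 10)

# Sample i is trained with C * weights[i]; the first ten samples count 10x more.
weights = np.ones(len(X_sw))
weights[:10] *= 10

plain_sw = svm.SVC(gamma='scale').fit(X_sw, y_sw)
weighted_sw = svm.SVC(gamma='scale').fit(X_sw, y_sw, sample_weight=weights)

dec_plain = plain_sw.decision_function(X_sw)
dec_weighted = weighted_sw.decision_function(X_sw)
print(np.abs(dec_plain - dec_weighted).max())  # largest change in decision value
```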

### 2) Regression

The method of Support Vector Classification can be extended to solve regression problems. This method is called Support Vector Regression.
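
A minimal regression sketch, assuming the `svm.SVR` class with default settings apart from `gamma='scale'`. The two training points and the query point are illustrative values only.

```python
from sklearn import svm

# Two training points with continuous targets (illustrative values).
X_reg = [[0, 0], [2, 2]]
y_reg = [0.5, 2.5]

reg = svm.SVR(gamma='scale')
reg.fit(X_reg, y_reg)

# Predict at the midpoint between the two training points.
pred = reg.predict([[1, 1]])
print(pred)
```

The interface mirrors the classifiers: `fit` takes samples and targets, except that the targets are floating point values rather than class labels.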