# Support Vector Machines

### Advantages

* Effective in high dimensional spaces.
* Still effective in cases where the number of dimensions is greater than the number of samples.
* Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
* Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.
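
To illustrate the last point, here is a minimal sketch of how kernels can be selected. Built-in kernels are chosen by name through the `kernel` parameter of `svm.SVC`, and a custom kernel can be passed as a callable that returns the Gram matrix. The helper name `my_linear_kernel` and the variable names are hypothetical, chosen only for this example.

```python
import numpy as np
from sklearn import svm

# Tiny two-point data set, used purely for illustration
X_toy = np.array([[0.0, 0.0], [1.0, 1.0]])
y_toy = np.array([0, 1])
query = np.array([[2.0, 2.0]])

# Built-in kernels are selected by name
named_preds = {}
for name in ("linear", "poly", "rbf"):
    model = svm.SVC(kernel=name).fit(X_toy, y_toy)
    named_preds[name] = model.predict(query)
    print(name, named_preds[name])

def my_linear_kernel(A, B):
    """Hypothetical custom kernel: plain dot product, shape (len(A), len(B))."""
    return np.dot(A, B.T)

# A callable kernel is passed in place of a kernel name
clf_custom = svm.SVC(kernel=my_linear_kernel).fit(X_toy, y_toy)
custom_pred = clf_custom.predict(query)
print("custom", custom_pred)
```

Because `my_linear_kernel` computes the same Gram matrix as the built-in `"linear"` kernel, the two models should agree on this query point.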

### Disadvantages

* If the number of features is much greater than the number of samples, avoiding over-fitting through the choice of kernel function and regularization term is crucial.
* SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below).
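
The second point can be sketched as follows. The raw scores from `decision_function` are always available, while `predict_proba` requires fitting with `probability=True`, which is where the extra cross-validation cost is paid. The synthetic data from `make_blobs` and the variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_blobs

# Synthetic two-class data, assumed purely for illustration
X_demo, y_demo = make_blobs(n_samples=100, centers=2, random_state=0)

# decision_function needs no extra work at fit time: one signed score per sample
clf_plain = svm.SVC().fit(X_demo, y_demo)
scores = clf_plain.decision_function(X_demo[:5])

# probability=True enables predict_proba, at the price of a slower fit
clf_proba = svm.SVC(probability=True, random_state=0).fit(X_demo, y_demo)
proba = clf_proba.predict_proba(X_demo[:5])

print(scores.shape)
print(proba.shape)
print(proba.sum(axis=1))
```

If only a ranking or a thresholded decision is needed, the scores from `decision_function` may be enough and the slower fit can be skipped.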

In [1]:
from sklearn import svm
X = [[0, 0], [1, 1]]
y = [0, 1]
clf = svm.SVC()
clf.fit(X, y)

SVC()

After being fitted, the model can then be used to predict new values:

In [6]:
# two new points to classify
newset0 = [2., 2.]
newset1 = [-1., -1.]

prediction = clf.predict([newset0])[0]
if prediction == 1:
    print('yes, within set')
else:
    print('no, not within set')

yes, within set


In [7]:
# newset1 was defined in the previous cell as [-1., -1.]
prediction = clf.predict([newset1])[0]
if prediction == 1:
    print('yes, within set')
else:
    print('no, not within set')

no, not within set
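
The two predictions above can also be obtained in a single call. As a minimal sketch, the example below refits the same toy model so that it is self-contained, predicts both points at once, and prints the signed score from `decision_function`, whose sign determines the predicted class in the two-class case. The names `new_points`, `labels` and `scores` are illustrative only.

```python
from sklearn import svm

# Same toy training data as in the cells above
X = [[0, 0], [1, 1]]
y = [0, 1]
clf = svm.SVC().fit(X, y)

new_points = [[2.0, 2.0], [-1.0, -1.0]]
labels = clf.predict(new_points)            # one class label per row
scores = clf.decision_function(new_points)  # signed score per row

for point, label, score in zip(new_points, labels, scores):
    print(point, "-> class", label, "score", round(float(score), 3))
```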


For the mathematical formulation, see https://scikit-learn.org/stable/modules/svm.html#svm-mathematical-formulation
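
For quick reference, the standard soft-margin problem behind SVC can be written as below. Here $y_i \in \{-1, +1\}$ are the class labels, $C$ is the regularization parameter, $\zeta_i$ are slack variables, and $\phi$ is the feature map implied by the kernel $K(x, x') = \phi(x)^T \phi(x')$. The notation is a summary, and the linked page remains the authoritative statement.

$$
\min_{w, b, \zeta} \; \frac{1}{2} w^T w + C \sum_{i=1}^{n} \zeta_i
\quad \text{subject to} \quad
y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \zeta_i, \quad \zeta_i \ge 0, \quad i = 1, \ldots, n
$$

After solving the dual problem for the coefficients $\alpha_i$, the decision function depends only on the support vectors (the points with $\alpha_i > 0$):

$$
f(x) = \sum_{i \in SV} y_i \alpha_i K(x_i, x) + b
$$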

The decision function depends only on a subset of the training data, the support vectors. Some of their properties can be found in the attributes support_vectors_, support_ and n_support_:

In [12]:
# get support vectors
print('clf.support_vectors_', clf.support_vectors_)
# get indices of support vectors
print('clf.support_', clf.support_)
# get number of support vectors for each class
print('clf.n_support_', clf.n_support_)

clf.support_vectors_ [[0. 0.]
 [1. 1.]]
clf.support_ [0 1]
clf.n_support_ [1 1]
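
To show how these attributes feed into predictions, here is a minimal sketch that recomputes the decision function by hand from `support_vectors_`, `dual_coef_` and `intercept_`. The value `gamma = 2.0` is fixed explicitly as an assumption so that the RBF kernel can be recomputed outside the model; for this data it should coincide with what the default `gamma='scale'` works out to. The names `clf_fixed`, `X_new` and `manual_scores` are illustrative.

```python
import numpy as np
from sklearn import svm
from sklearn.metrics.pairwise import rbf_kernel

X_train = np.array([[0.0, 0.0], [1.0, 1.0]])
y_train = np.array([0, 1])

# gamma is set explicitly (an assumption) so the kernel can be recomputed by hand
gamma = 2.0
clf_fixed = svm.SVC(kernel="rbf", gamma=gamma).fit(X_train, y_train)

X_new = np.array([[2.0, 2.0], [-1.0, -1.0]])

# Kernel values between each support vector and each new point
K = rbf_kernel(clf_fixed.support_vectors_, X_new, gamma=gamma)

# dual_coef_ holds the signed dual coefficients of the support vectors
manual_scores = (clf_fixed.dual_coef_ @ K + clf_fixed.intercept_).ravel()

print(manual_scores)
print(clf_fixed.decision_function(X_new))
```

The two printed arrays should agree, which illustrates that only the support vectors are needed at prediction time.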
