The advantages of support vector machines are:

        Effective in high dimensional spaces.
        Still effective when the number of dimensions is greater than the number of samples.
        Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
        Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.
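
As a rough illustration of the memory point above, the sketch below fits a linear SVC on synthetic data and compares the number of support vectors with the number of training samples. The dataset (make_blobs with these arguments) and the variable names are assumptions chosen for the example and are not part of the original notebook.

```python
from sklearn import svm
from sklearn.datasets import make_blobs

# Synthetic two-class data (an assumption, for illustration only)
X_blobs, y_blobs = make_blobs(n_samples=200, centers=2,
                              cluster_std=0.6, random_state=0)

clf_blobs = svm.SVC(kernel='linear')
clf_blobs.fit(X_blobs, y_blobs)

# The decision function only needs the support vectors,
# which are a subset of the training points
n_train = X_blobs.shape[0]
n_sv = clf_blobs.support_vectors_.shape[0]
print(n_sv, "support vectors out of", n_train, "training samples")
```

How small the subset is depends on the data and on the kernel and C settings, so the count printed here should not be read as typical.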

The disadvantages of support vector machines include:

        If the number of features is much greater than the number of samples, choosing the kernel function and the regularization term carefully is crucial to avoid over-fitting.
        SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below).
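
A minimal sketch of how probability estimates can be requested, assuming a scikit-learn version in which SVC accepts probability=True. The synthetic data and the names X_p, y_p and clf_p are hypothetical; fitting is slower than usual because of the internal cross-validation mentioned above.

```python
from sklearn import svm
from sklearn.datasets import make_blobs

# Synthetic two-class data (an assumption, for illustration only)
X_p, y_p = make_blobs(n_samples=60, centers=2, random_state=0)

# probability=True enables predict_proba at the cost of a slower fit
clf_p = svm.SVC(probability=True, random_state=0)
clf_p.fit(X_p, y_p)

# One row per sample, one column per class; each row sums to 1
proba = clf_p.predict_proba(X_p[:3])
print(proba.shape)
```

The class predicted by predict is not guaranteed to agree with the largest value in predict_proba, since the two are computed differently.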



Classification

In [2]:
from sklearn import svm
# Two training samples with two features each, and their class labels
X = [[0, 0], [1, 1]]
y = [0, 1]

In [3]:
clf = svm.SVC()

In [4]:
clf.fit(X, y)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [6]:
clf.predict([[2., 2.]])

array([1])

In [7]:
clf.support_vectors_  # the support vectors themselves

array([[ 0.,  0.],
       [ 1.,  1.]])

In [8]:
clf.support_  # indices of the support vectors in the training data

array([0, 1])

In [9]:
clf.n_support_  # number of support vectors for each class

array([1, 1])
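
The three attributes above are related: support_ indexes rows of the training data, support_vectors_ contains those rows, and n_support_ counts them per class. The sketch below checks this on the same toy data; the names X_small, y_small and clf_small are introduced only for this example.

```python
import numpy as np
from sklearn import svm

X_small = np.array([[0, 0], [1, 1]])
y_small = [0, 1]
clf_small = svm.SVC().fit(X_small, y_small)

# support_ holds row indices into the training data,
# so indexing with it reproduces support_vectors_
same_rows = np.array_equal(clf_small.support_vectors_,
                           X_small[clf_small.support_])

# n_support_ counts support vectors per class, so it sums to the total
total_matches = clf_small.n_support_.sum() == len(clf_small.support_)
print(same_rows, total_matches)
```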

Multi-class Classification

In [10]:
X = [[0], [1], [2], [3]]
Y = [0, 1, 2, 3]

In [12]:
clf = svm.SVC(decision_function_shape='ovo')

In [14]:
clf.fit(X, Y)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovo', degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [15]:
# One-vs-one yields one column per pair of classes: 4 * 3 / 2 = 6
dec = clf.decision_function([[1]])
dec.shape[1]

6

In [16]:
dec

array([[-0.63212056,  0.        ,  0.3495638 ,  0.63212056,  0.98168436,
         0.3495638 ]])

In [17]:
# Switch the fitted classifier to one-vs-rest output: one column per class
clf.decision_function_shape = 'ovr'

In [18]:
dec = clf.decision_function([[1]])

In [20]:
dec.shape[1]

4
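
The widths 6 and 4 seen above follow from the number of classes: one-vs-one produces one column per pair of classes, n_classes * (n_classes - 1) / 2, while one-vs-rest produces one column per class. A small sketch that checks this on the same data, using separately fitted classifiers (the names ovo, ovr, X_mc and y_mc are introduced here for illustration):

```python
from sklearn import svm

X_mc = [[0], [1], [2], [3]]
y_mc = [0, 1, 2, 3]
n_classes = len(set(y_mc))

ovo = svm.SVC(decision_function_shape='ovo').fit(X_mc, y_mc)
ovr = svm.SVC(decision_function_shape='ovr').fit(X_mc, y_mc)

# Number of class pairs for one-vs-one
n_pairs = n_classes * (n_classes - 1) // 2

ovo_width = ovo.decision_function([[1]]).shape[1]
ovr_width = ovr.decision_function([[1]]).shape[1]
print(ovo_width == n_pairs, ovr_width == n_classes)
```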

In [21]:
lin_clf = svm.LinearSVC()
lin_clf.fit(X, Y)

LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0)

In [23]:
dec = lin_clf.decision_function([[1]])
dec.shape[1]

4

Regression

In [24]:
from sklearn import svm
X = [[0, 0], [2, 2]]
y = [0.5, 2.5]
clf = svm.SVR()
clf.fit(X, y)
clf.predict([[1, 1]])

array([ 1.5])

Kernel Functions

In [25]:
linear_svc = svm.SVC(kernel='linear')
linear_svc.kernel

'linear'

In [27]:
rbf_svc = svm.SVC(kernel='rbf')
rbf_svc.kernel

'rbf'

In [28]:
import numpy as np

In [29]:
# Custom kernel: a plain dot product, which is the linear kernel
def my_kernel(X, Y):
    return np.dot(X, Y.T)

In [30]:
clf = svm.SVC(kernel=my_kernel)

In [31]:
X = np.array([[0, 0], [1, 1]])
y = [0, 1]

In [32]:
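# Precomputed kernel: pass the Gram matrix instead of the raw samples
clf = svm.SVC(kernel="precomputed")
gram = np.dot(X, X.T)  # linear kernel between all pairs of training points
clf.fit(gram, y)
# At prediction time, pass kernel values between test and training points
clf.predict(gram)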

array([0, 1])
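
Because the custom kernel and the precomputed Gram matrix above are both plain dot products, they should behave like the built-in linear kernel. The sketch below compares the three on new points; X_new, dot_kernel and the classifier names are hypothetical and chosen only for this example. For a precomputed kernel, predict expects kernel values between the new points and the training points, with shape (n_new, n_train).

```python
import numpy as np
from sklearn import svm

X_train = np.array([[0, 0], [1, 1]])
y_train = [0, 1]
X_new = np.array([[2, 2], [-1, -1]])  # hypothetical test points

def dot_kernel(A, B):
    # Same computation as the linear kernel
    return np.dot(A, B.T)

builtin = svm.SVC(kernel='linear').fit(X_train, y_train)
custom = svm.SVC(kernel=dot_kernel).fit(X_train, y_train)
precomp = svm.SVC(kernel='precomputed').fit(np.dot(X_train, X_train.T), y_train)

pred_builtin = builtin.predict(X_new)
pred_custom = custom.predict(X_new)
# Kernel between new and training points: shape (n_new, n_train)
pred_precomp = precomp.predict(np.dot(X_new, X_train.T))

print(pred_builtin, pred_custom, pred_precomp)
```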