In [1]:
from sklearn.datasets import load_iris

In [2]:
train_input, train_output = load_iris(return_X_y = True)

In [3]:
train_input.shape

(150, 4)

In [4]:
train_output.shape

(150,)

In [5]:
train_input[0]

array([5.1, 3.5, 1.4, 0.2])

In [6]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

train_input = scaler.fit_transform(train_input)

In [7]:
train_input.mean(axis = 0)

array([-1.69031455e-15, -1.84297022e-15, -1.69864123e-15, -1.40924309e-15])

In [8]:
train_input.std(axis = 0)

array([1., 1., 1., 1.])

In [9]:
from sklearn.svm import SVC

In [10]:
model = SVC()

In [11]:
model.get_params()

{'C': 1.0,
 'break_ties': False,
 'cache_size': 200,
 'class_weight': None,
 'coef0': 0.0,
 'decision_function_shape': 'ovr',
 'degree': 3,
 'gamma': 'scale',
 'kernel': 'rbf',
 'max_iter': -1,
 'probability': False,
 'random_state': None,
 'shrinking': True,
 'tol': 0.001,
 'verbose': False}

In [12]:
model.fit(train_input, train_output)

SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

In [13]:
model.score(train_input, train_output)

0.9733333333333334

The SVC object accepts a kernel function via the 'kernel' parameter, the default being the radial basis function ('rbf'), together with the kernel hyperparameters 'gamma' (used by the 'rbf', 'poly', and 'sigmoid' kernels), 'degree' (used by 'poly' only), and 'coef0' (used by 'poly' and 'sigmoid').  User-defined kernel functions are also permitted, by passing a callable as 'kernel'.
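As a minimal sketch of the kernel options above, the cell below fits a polynomial kernel with explicit hyperparameters and a user-defined kernel passed as a callable.  The helper name `linear_kernel` and the hyperparameter values are illustrative choices, not taken from the cells above; the callable is assumed to receive two data matrices and return their Gram matrix.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Built-in polynomial kernel: 'degree' and 'coef0' only matter for some kernels.
poly_model = SVC(kernel="poly", degree=2, coef0=1.0, gamma="scale").fit(X, y)

# Hypothetical user-defined kernel: takes arrays of shape (n_a, d) and (n_b, d)
# and returns the (n_a, n_b) matrix of inner products.
def linear_kernel(A, B):
    return A @ B.T

custom_model = SVC(kernel=linear_kernel).fit(X, y)
builtin_model = SVC(kernel="linear").fit(X, y)

# Fraction of training points on which the two linear models agree.
agreement = np.mean(custom_model.predict(X) == builtin_model.predict(X))
print(poly_model.score(X, y), agreement)
```

Since the callable here reproduces the built-in 'linear' kernel, the two models should agree on essentially every training point.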

The kernel function encodes a transformation of the feature variables to a new feature space (possibly of infinite dimension, as with the rbf kernel).  The dual problem (a quadratic program) is then solved in feature space via an SMO-type algorithm (more or less, coordinate descent on pairs of dual variables).  Multiclass problems are handled internally with a one vs one scheme; the default decision_function_shape = 'ovr' only reshapes the decision function into one vs rest form.  Regularization enters through the 'C' parameter, which bounds the dual variables and whose strength is inversely proportional to C, so that C plays the role of the multiplicative inverse of the 'alpha' parameter appearing in other sklearn objects (see e.g. SGDClassifier in logistic_regression.ipynb).  By default, there is no limit on the number of algorithm iterations (max_iter = -1), with stopping occurring when the tolerance (tol = .001) is met.
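The effect of 'C' and of 'decision_function_shape' can be checked directly.  The sketch below, which assumes the same standardized iris data and uses illustrative values of C, counts support vectors for several values of C and compares the predictions made under the two decision function shapes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Smaller C means stronger regularization, a wider margin,
# and typically more support vectors.
n_sv = {}
for C in (0.01, 1.0, 100.0):
    model = SVC(C=C).fit(X, y)
    n_sv[C] = int(model.n_support_.sum())
print(n_sv)

# 'decision_function_shape' changes the form of decision_function,
# not the class predictions (with the default break_ties=False).
ovr = SVC(decision_function_shape="ovr").fit(X, y)
ovo = SVC(decision_function_shape="ovo").fit(X, y)
same_predictions = np.array_equal(ovr.predict(X), ovo.predict(X))

# With 3 classes both shapes have 3 columns: 3 classes for 'ovr',
# and 3 * (3 - 1) / 2 = 3 class pairs for 'ovo'.
print(same_predictions, ovr.decision_function(X).shape, ovo.decision_function(X).shape)
```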

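Performance of the model is indicated via the .score() method, which returns the mean classification accuracy on an (input, output) collection of data.  Here it is evaluated on the same data used for fitting, so the value above is a training accuracy.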
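To make the meaning of .score() concrete, the following sketch (again assuming the standardized iris data and a default SVC) compares it with the fraction of correct predictions computed by hand and with sklearn.metrics.accuracy_score.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

model = SVC().fit(X, y)
predictions = model.predict(X)

score = model.score(X, y)                 # argument order is (input, output)
manual = np.mean(predictions == y)        # fraction of correct predictions
via_metric = accuracy_score(y, predictions)
print(score, manual, via_metric)
```

All three quantities are the same number, computed three ways.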