# Linear SVM Classification

The idea behind a Support Vector Machine (SVM) classifier is to find the decision boundary with the widest possible separation between the classes. Because the algorithm fits the widest "street" between two classes, instances that lie off the street do not affect the location of the boundary. Instead, the boundary is fully determined by the instances located on the edges of the street, which are called the *support vectors*.
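As a rough check of the claim that only the instances on the edge of the street matter, the sketch below fits a linear SVM twice: once on all instances and once on the support vectors alone. The choice of dataset (the setosa and versicolor classes of iris, petal features only), the `SVC` class, and `C=1.0` are assumptions made for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

iris = datasets.load_iris()
mask = iris['target'] < 2            # keep setosa and versicolor only
X = iris['data'][mask][:, (2, 3)]    # petal length, petal width
y = iris['target'][mask]

# Fit on every instance and record which ones became support vectors.
full_clf = SVC(kernel='linear', C=1.0).fit(X, y)
sv_idx = full_clf.support_

# Refit using only the support vectors; all other instances are discarded.
reduced_clf = SVC(kernel='linear', C=1.0).fit(X[sv_idx], y[sv_idx])

print('training instances:', len(X), 'support vectors:', len(sv_idx))
print('full fit:    w =', full_clf.coef_[0], 'b =', full_clf.intercept_[0])
print('reduced fit: w =', reduced_clf.coef_[0], 'b =', reduced_clf.intercept_[0])
```

The two fits should produce essentially the same weights and intercept, up to the solver's numerical tolerance, even though the second one sees only a small fraction of the data.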

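It is important to keep in mind that Support Vector Machines are sensitive to feature scaling: a feature with a much larger range than the others dominates the distances that define the margin. Scaling the features first (for example with Scikit-Learn's `StandardScaler`) usually gives a much better decision boundary.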
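To see the effect of scaling, one possible experiment is to fit the same linear SVM on raw and on standardized features and compare the learned weights. The four-point dataset below is made up for illustration, and `SVC(kernel='linear', C=100)` is just one convenient way to get a nearly hard margin on it.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Made-up toy data: feature 0 spans a few units, feature 1 spans tens of units.
X_toy = np.array([[1.0, 50.0], [5.0, 20.0], [3.0, 80.0], [5.0, 60.0]])
y_toy = np.array([0, 0, 1, 1])

# Fit on the raw features.
raw_clf = SVC(kernel='linear', C=100).fit(X_toy, y_toy)

# Standardize each feature to zero mean and unit variance, then fit again.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_toy)
scaled_clf = SVC(kernel='linear', C=100).fit(X_scaled, y_toy)

print('weights on raw features:   ', raw_clf.coef_[0])
print('weights on scaled features:', scaled_clf.coef_[0])
```

Both models separate this toy set, but the direction of the weight vector (and therefore the orientation of the street) differs between the two fits, which is the sensitivity the text refers to.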

## Soft Margin Classification

Requiring that all instances be off the street and on the correct side of it is called *hard margin classification*. The problem with such a strict condition is that it only works when the classes are linearly separable. If even a single instance of one class sits close to, or among, the instances of the other class in the feature space, a separating boundary either does not exist or generalizes poorly. In other words, hard margin classification is sensitive to outliers.

To avoid this problem, *soft margin classification* is generally used instead. The idea is to limit the number of margin violations (instances that end up in the middle of the street or on the wrong side of it) while keeping the street as wide as possible. In Scikit-Learn, this trade-off is controlled with the hyperparameter `C`: a small value of `C` gives a wider street with more margin violations, while a large value gives fewer margin violations but a narrower street.
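The trade-off can be observed numerically. The sketch below is one way to do it, under a few assumptions: it uses `SVC(kernel='linear')` on the standardized petal features of iris, measures the street width as `2 / ||w||`, and counts an instance as a margin violation when `y * f(x)` falls below 1 (with a small tolerance). The helper `street_stats` is a hypothetical name introduced only for this example.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X = StandardScaler().fit_transform(iris['data'][:, (2, 3)])  # petal length, width
y = (iris['target'] == 2).astype(int)                        # 1 if Iris virginica
y_signed = 2 * y - 1                                         # labels as -1 / +1

def street_stats(C):
    """Return (street width, number of margin violations) for a given C."""
    clf = SVC(kernel='linear', C=C).fit(X, y)
    width = 2.0 / np.linalg.norm(clf.coef_[0])
    # An instance violates the margin when y * f(x) < 1 (small tolerance added).
    margins = y_signed * clf.decision_function(X)
    violations = int(np.sum(margins < 1 - 1e-2))
    return width, violations

for C in (0.01, 1, 100):
    width, violations = street_stats(C)
    print(f'C={C}: street width={width:.3f}, margin violations={violations}')
```

On this data, the smallest `C` should report the widest street and the most violations, and the largest `C` the narrowest street and the fewest.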

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris['data'][:, (2, 3)]                   # petal length, petal width
y = (iris['target'] == 2).astype(np.float64)  # 1.0 if Iris virginica, else 0.0

# Scale the features, then fit a linear SVM with the hinge loss.
# The steps are passed as a list of (name, estimator) pairs.
svm_clf = Pipeline([
        ('scaler', StandardScaler()),
        ('linear_svc', LinearSVC(C=1, loss='hinge'))
    ])

svm_clf.fit(X, y)
svm_clf.predict([[5.5, 1.7]])
```

```
array([ 1.])
```

Another way of implementing a linear SVM classifier is to use the `SVC` class with `kernel='linear'`. However, this option is much slower than `LinearSVC` on large datasets.
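A minimal sketch of the `SVC` variant, assuming the same petal features, the same binary target, and `C=1` as in the `LinearSVC` example:

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris['data'][:, (2, 3)]                   # petal length, petal width
y = (iris['target'] == 2).astype(np.float64)  # 1.0 if Iris virginica, else 0.0

# Same pipeline shape as before, with SVC(kernel='linear') as the final step.
svc_clf = Pipeline([
        ('scaler', StandardScaler()),
        ('svc', SVC(kernel='linear', C=1))
    ])

svc_clf.fit(X, y)
print(svc_clf.predict([[5.5, 1.7]]))
print('training accuracy:', svc_clf.score(X, y))
```

The two classes use different solvers and slightly different formulations, so the fitted boundary may not match the `LinearSVC` one exactly.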

Another option is to use the stochastic gradient descent classifier `SGDClassifier` with `loss='hinge'` and `alpha=1/(m*C)`, where `m` is the number of training instances. This method does not converge as fast as `LinearSVC`, but it is better at handling large datasets that do not fit in memory.
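The `SGDClassifier` variant could look like the sketch below. The values `C = 1`, `max_iter=1000`, `tol=1e-3`, and `random_state=42` are assumptions chosen for illustration; only `loss='hinge'` and the `alpha = 1 / (m * C)` relation come from the text above.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris['data'][:, (2, 3)]                   # petal length, petal width
y = (iris['target'] == 2).astype(np.float64)  # 1.0 if Iris virginica, else 0.0

m = len(X)           # number of training instances
C = 1                # assumed value, mirroring the LinearSVC example
alpha = 1 / (m * C)  # regularization strength equivalent to that C

sgd_clf = Pipeline([
        ('scaler', StandardScaler()),
        ('sgd', SGDClassifier(loss='hinge', alpha=alpha,
                              max_iter=1000, tol=1e-3, random_state=42))
    ])

sgd_clf.fit(X, y)
print(sgd_clf.predict([[5.5, 1.7]]))
print('training accuracy:', sgd_clf.score(X, y))
```

For data that does not fit in memory, `SGDClassifier` also provides a `partial_fit` method that can be called on one mini-batch at a time.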