----
Support Vector Machine
----
----

Linear SVM Classification
---

In [5]:
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

In [6]:
iris = datasets.load_iris()
X = iris["data"][: , (2,3)]
y = (iris["target"] == 2).astype(np.float64)

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge")) # Comparable to SVC(kernel="linear", C=1), though not numerically identical
])

svm_clf.fit(X, y)

Pipeline(steps=[('scaler', StandardScaler()),
                ('linear_svc', LinearSVC(C=1, loss='hinge'))])

In [5]:
svm_clf.predict([[5.5, 1.7]])

array([1.])

Nonlinear SVM Classification
---

In [8]:
from sklearn.datasets import make_moons
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVC

In [12]:
X, y = make_moons(n_samples = 100, noise = 0.15)

polynomial_svm_clf = Pipeline([
    ("poly_features", PolynomialFeatures(degree=3)),
    ("scaler", StandardScaler()),
    ("svm_clf", LinearSVC(C=10, loss="hinge"))
])

polynomial_svm_clf.fit(X, y)



Pipeline(steps=[('poly_features', PolynomialFeatures(degree=3)),
                ('scaler', StandardScaler()),
                ('svm_clf', LinearSVC(C=10, loss='hinge'))])

In [9]:
poly_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5))
])

poly_kernel_svm_clf.fit(X, y)

Pipeline(steps=[('scaler', StandardScaler()),
                ('svm_clf', SVC(C=5, coef0=1, kernel='poly'))])

Gaussian RBF Kernel
---

In [10]:
rbf_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="rbf", gamma=5, C=0.001))
])

rbf_kernel_svm_clf.fit(X,y)

Pipeline(steps=[('scaler', StandardScaler()),
                ('svm_clf', SVC(C=0.001, gamma=5))])

As a rule of thumb, we should always try LinearSVC first, especially if the training set is very large or has many features. If the training set is not too large, we should also try the Gaussian RBF kernel.
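One way to apply this rule of thumb is to cross-validate both candidates on the same data and compare. The sketch below is only an illustration: the dataset size, `random_state`, `gamma="scale"`, `C=1`, `max_iter=10000` and the 5-fold split are assumptions chosen for the example, not values from the cells above.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

# Small nonlinear toy dataset (sizes and seed are assumptions for this sketch)
X_demo, y_demo = make_moons(n_samples=200, noise=0.15, random_state=42)

candidates = {
    # Try the linear model first: it is the cheapest to train
    "LinearSVC": Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", LinearSVC(C=1, loss="hinge", max_iter=10000)),
    ]),
    # Then the Gaussian RBF kernel, affordable here because the set is small
    "SVC (RBF)": Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="rbf", gamma="scale", C=1)),
    ]),
}

# Mean 5-fold cross-validated accuracy for each candidate
scores = {
    name: cross_val_score(model, X_demo, y_demo, cv=5).mean()
    for name, model in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

If the linear model already scores well enough, there may be no need to pay for the kernelized one.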

| Class | Time Complexity | Out-of-core Support | Scaling Required | Kernel Trick |
|  :---: |  :---:  |  :---:  |  :---:  |  :---:  |
| LinearSVC | $O(m \times n)$ | No | Yes | No |
| SGDClassifier | $O(m \times n)$ | Yes | Yes | No |
| SVC | $O(m^2 \times n)$ to $O(m^3 \times n)$ | No | Yes | Yes |
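The out-of-core column can be illustrated with `SGDClassifier`, which trains a linear SVM when used with `loss="hinge"` and accepts data in mini-batches through `partial_fit`. The sketch below simulates streaming by slicing an in-memory array; the synthetic dataset, `batch_size`, `alpha` and `random_state` are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a dataset too large to fit in memory (assumption)
X_big, y_big = make_classification(n_samples=3000, n_features=10, random_state=42)
classes = np.unique(y_big)  # partial_fit must be told every class up front

scaler = StandardScaler()
# Hinge loss makes SGDClassifier optimize a linear SVM objective
sgd_svm = SGDClassifier(loss="hinge", alpha=1e-3, random_state=42)

batch_size = 500  # hypothetical chunk size

# Pass 1: accumulate the scaling statistics one batch at a time
for start in range(0, len(X_big), batch_size):
    scaler.partial_fit(X_big[start:start + batch_size])

# Pass 2: train the classifier one scaled batch at a time
for start in range(0, len(X_big), batch_size):
    stop = start + batch_size
    X_batch = scaler.transform(X_big[start:stop])
    sgd_svm.partial_fit(X_batch, y_big[start:stop], classes=classes)

train_accuracy = sgd_svm.score(scaler.transform(X_big), y_big)
print(f"Training accuracy after one pass: {train_accuracy:.3f}")
```

In a real out-of-core setting each batch would be read from disk or a stream instead of sliced from `X_big`, and several passes over the data may be needed.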

SVM Regression
---
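SVMs can be used for regression as well as classification. The trick is to reverse the objective: instead of fitting the widest possible street between two classes while limiting margin violations, SVM regression tries to fit as many instances as possible on the street while limiting margin violations (here, instances that fall off the street). The width of the street is controlled by the `epsilon` hyperparameter.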

In [13]:
from sklearn.svm import LinearSVR, SVR
svm_reg = LinearSVR(epsilon=1.5) # Tolerance or margin for the regression
svm_reg.fit(X, y)

LinearSVR(epsilon=1.5)
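To see what `epsilon` does, we can fit `LinearSVR` with two different values on data that really is linear and count how many instances end up off the street. This is a rough sketch: the synthetic data, the two `epsilon` values, `max_iter` and the helper `count_off_street` are all hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVR

# Noisy linear data (coefficients and noise level are assumptions)
rng = np.random.default_rng(42)
X_lin = 2 * rng.random((100, 1))
y_lin = 4 + 3 * X_lin.ravel() + rng.normal(size=100)

def count_off_street(epsilon):
    """Fit LinearSVR and count instances farther than epsilon from the prediction."""
    reg = LinearSVR(epsilon=epsilon, max_iter=10000, random_state=42)
    reg.fit(X_lin, y_lin)
    residuals = np.abs(y_lin - reg.predict(X_lin))
    return int(np.sum(residuals > epsilon))

narrow = count_off_street(0.5)  # narrow street
wide = count_off_street(1.5)    # wide street, same epsilon as the cell above

print(f"Instances off the street: epsilon=0.5 -> {narrow}, epsilon=1.5 -> {wide}")
```

A wider street tolerates more noise, so fewer instances count as margin violations.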

To tackle nonlinear regression tasks, we can use a kernelized SVM, for example `SVR` with a polynomial kernel.

In [15]:
svm_poly_reg = SVR(kernel="poly", degree=2, C=100, epsilon=0.1)
svm_poly_reg.fit(X, y)

SVR(C=100, degree=2, kernel='poly')
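The moons dataset used above has binary labels, so it does not show much about regression quality. As a rough check, the sketch below fits the same kind of model on synthetic quadratic data and compares it with a linear kernel. The data-generating function, noise level and the `_demo` names are assumptions made for this illustration.

```python
import numpy as np
from sklearn.svm import SVR

# Noisy quadratic data: y = 0.5 * x^2 + 2 + noise (an assumption for this sketch)
rng = np.random.default_rng(42)
X_quad = 6 * rng.random((100, 1)) - 3
y_quad = 0.5 * X_quad.ravel() ** 2 + 2 + rng.normal(scale=0.3, size=100)

# Same hyperparameters as the cell above
svm_poly_reg_demo = SVR(kernel="poly", degree=2, C=100, epsilon=0.1)
svm_poly_reg_demo.fit(X_quad, y_quad)

# Linear-kernel baseline, which cannot follow the curvature
svm_lin_reg_demo = SVR(kernel="linear", C=100, epsilon=0.1)
svm_lin_reg_demo.fit(X_quad, y_quad)

# score() returns the coefficient of determination R^2
r2_poly = svm_poly_reg_demo.score(X_quad, y_quad)
r2_linear = svm_lin_reg_demo.score(X_quad, y_quad)

print(f"R^2 with polynomial kernel: {r2_poly:.3f}")
print(f"R^2 with linear kernel:     {r2_linear:.3f}")
```

With the default `coef0=0`, a degree-2 polynomial kernel only models the purely quadratic term plus an intercept, which is why the synthetic data here has no linear term.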