
# Support Vector Machines

Support vector machines (SVMs) are machine learning models for linear or nonlinear classification, regression, and outlier detection. They are particularly well suited to complex small or medium-sized datasets.

The main idea is that two classes can often be separated by a straight line (they are linearly separable), but many different lines will do the job. An SVM looks for the one line with the widest possible "street" around it between the classes, where the street edges are usually drawn as parallel dashed lines. This is called large margin classification.
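A minimal sketch of the large margin idea, assuming scikit-learn's `SVC` with a linear kernel and a very large `C` as a stand-in for a hard margin. It uses setosa versus versicolor, a pair of Iris classes that is linearly separable on the petal features; that pair is chosen here for illustration and is not the virginica task used later in the notebook. For a linear SVM with weight vector `w`, the street width is `2 / ||w||`.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
mask = iris["target"] < 2                  # keep setosa (0) and versicolor (1)
X = iris["data"][mask][:, (2, 3)]          # petal length, petal width
y = iris["target"][mask]

X_scaled = StandardScaler().fit_transform(X)

# Assumption: a very large C approximates a hard margin on separable data
hard_clf = SVC(kernel="linear", C=1000).fit(X_scaled, y)

w = hard_clf.coef_[0]
street_width = 2.0 / np.linalg.norm(w)

# Signed scores are about 1 on the street edges and larger farther away
signed_scores = np.where(y == 1, 1, -1) * hard_clf.decision_function(X_scaled)

print("street width:", street_width)
print("support vectors:\n", hard_clf.support_vectors_)
print("smallest signed score:", signed_scores.min())
```

If the smallest signed score is close to 1, no training instance sits inside the street, which is what a hard margin requires.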

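The instances located on the edge of the street are the "support vectors". SVMs are sensitive to feature scales, so scale the features before training. Requiring every instance to be off the street and on the correct side is hard margin classification; it only works when the data is linearly separable, and it is sensitive to outliers. Soft margin classification relaxes this rule and tolerates some margin violations, meaning instances that end up on the street or on the wrong side. In scikit-learn the `C` hyperparameter controls the trade-off: a smaller `C` gives a wider street with more violations.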
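The effect of softening the margin can be sketched by fitting the same data with two values of `C` and comparing the street widths. This is an illustration under assumptions: it uses `SVC(kernel="linear")` so that `coef_` and `support_vectors_` are available, and the two `C` values are arbitrary picks.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]                # petal length, petal width
y = (iris["target"] == 2).astype(int)      # 1 = Iris virginica

# Scale first: SVMs are sensitive to feature scales
X_scaled = StandardScaler().fit_transform(X)

margin_widths = {}
for C in (0.1, 100.0):                     # arbitrary "soft" and "hard-ish" settings
    clf = SVC(kernel="linear", C=C).fit(X_scaled, y)
    margin_widths[C] = 2.0 / np.linalg.norm(clf.coef_[0])
    # With a soft margin, instances inside the street or on the wrong side
    # also count as support vectors, not only those on the street edges
    print(f"C={C}: street width={margin_widths[C]:.3f}, "
          f"support vectors={len(clf.support_vectors_)}")
```

The smaller `C` should report the wider street, since violations are penalized less.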

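By default, SVM classifiers do not output a probability for each class. They output a decision score, and the sign of that score determines the predicted class.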
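A short sketch of what a linear SVM does expose in place of probabilities, assuming scikit-learn's `LinearSVC`: the signed score from `decision_function`. The `max_iter` and `random_state` values are arbitrary choices made here to keep the run stable.

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]                # petal length, petal width
y = (iris["target"] == 2).astype(int)      # 1 = Iris virginica

clf = make_pipeline(
    StandardScaler(),
    LinearSVC(C=1, loss="hinge", max_iter=10000, random_state=42),
)
clf.fit(X, y)

sample = [[5.5, 1.7]]
score = clf.decision_function(sample)      # signed score relative to the boundary
label = clf.predict(sample)                # class 1 when the score is positive

# LinearSVC itself has no predict_proba method
has_proba = hasattr(clf.steps[-1][1], "predict_proba")

print("score:", score, "label:", label, "has predict_proba:", has_proba)
```

The magnitude of the score grows with distance from the decision boundary, but it is not a calibrated probability.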


In [2]:
# Simple linear SVM to detect Iris virginica flowers in the Iris dataset
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)  # 1.0 if Iris virginica, else 0.0

svm_clf = Pipeline([
                    ("scaler", StandardScaler()),
                    ("linear_svc", LinearSVC(C=1, loss="hinge")),
])

svm_clf.fit(X,y)

Pipeline(memory=None,
         steps=[('scaler',
                 StandardScaler(copy=True, with_mean=True, with_std=True)),
                ('linear_svc',
                 LinearSVC(C=1, class_weight=None, dual=True,
                           fit_intercept=True, intercept_scaling=1,
                           loss='hinge', max_iter=1000, multi_class='ovr',
                           penalty='l2', random_state=None, tol=0.0001,
                           verbose=0))],
         verbose=False)

In [3]:
svm_clf.predict([[5.5, 1.7]])

array([1.])

In [4]:
# Nonlinear SVM classification
# make_moons generates a two-feature toy dataset of two interleaving half circles,
# which is not linearly separable
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

X, y = make_moons(n_samples=100, noise=0.15)
polynomial_svm_clf = Pipeline([
                               ("poly_features", PolynomialFeatures(degree=3)),
                               ("scaler", StandardScaler()),
                               ("svm_clf", LinearSVC(C=10, loss="hinge"))
])

polynomial_svm_clf.fit(X,y)

Pipeline(memory=None,
         steps=[('poly_features',
                 PolynomialFeatures(degree=3, include_bias=True,
                                    interaction_only=False, order='C')),
                ('scaler',
                 StandardScaler(copy=True, with_mean=True, with_std=True)),
                ('svm_clf',
                 LinearSVC(C=10, class_weight=None, dual=True,
                           fit_intercept=True, intercept_scaling=1,
                           loss='hinge', max_iter=1000, multi_class='ovr',
                           penalty='l2', random_state=None, tol=0.0001,
                           verbose=0))],
         verbose=False)

In [7]:
polynomial_svm_clf.predict([[3.3, 22.1]])

array([0])
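The cost of explicit polynomial features can be made concrete by counting the columns that `PolynomialFeatures` generates. The sketch below uses a hypothetical helper, `n_poly_features`, which is not part of scikit-learn; it applies the standard count of monomials up to a given degree, bias column included.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures


def n_poly_features(n_features, degree):
    """Hypothetical helper: number of monomials of total degree <= degree."""
    return comb(n_features + degree, degree)


counts = {}
for n_features in (2, 10, 100):
    # Fitting on a single dummy row is enough to read off the output width
    poly = PolynomialFeatures(degree=3).fit(np.zeros((1, n_features)))
    counts[n_features] = poly.n_output_features_
    print(n_features, "input features ->", counts[n_features], "polynomial features")
```

With only two inputs, as in the moons data, degree 3 is cheap; the count grows quickly as the number of input features or the degree increases.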

In [10]:
# Polynomial kernel - explicit polynomial features add too many columns at high degree
# Use the kernel trick to get the same result without actually adding the extra features

from sklearn.svm import SVC
poly_kernal_svm_clf = Pipeline([
                                ("scaler", StandardScaler()),
                                ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5))
                                ])
poly_kernal_svm_clf.fit(X,y)

Pipeline(memory=None,
         steps=[('scaler',
                 StandardScaler(copy=True, with_mean=True, with_std=True)),
                ('svm_clf',
                 SVC(C=5, break_ties=False, cache_size=200, class_weight=None,
                     coef0=1, decision_function_shape='ovr', degree=3,
                     gamma='scale', kernel='poly', max_iter=-1,
                     probability=False, random_state=None, shrinking=True,
                     tol=0.001, verbose=False))],
         verbose=False)

In [12]:
poly_kernal_svm_clf.predict([[3.3, 2.1]])

array([1])
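The kernel trick can be checked numerically on a tiny case. This sketch uses degree 2 instead of the degree 3 used above, only to keep the explicit feature map short, and fixes `gamma=1` and `coef0=1`; the function `phi` is written out here for illustration and is not something scikit-learn computes.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(x):
    """Explicit degree-2 feature map matching the kernel (x . z + 1) ** 2 in 2D."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, s * x1 * x2, x2 ** 2])


a = np.array([1.0, 2.0])
b = np.array([0.5, -1.0])

# Route 1: map both points to 6 dimensions, then take the dot product
k_explicit = phi(a) @ phi(b)

# Route 2: evaluate the kernel directly on the original 2D points
k_kernel = polynomial_kernel(
    a.reshape(1, -1), b.reshape(1, -1), degree=2, gamma=1.0, coef0=1.0
)[0, 0]

print(k_explicit, k_kernel)
```

Both routes give the same number, but the second never builds the expanded features, which is why `SVC(kernel="poly")` stays cheap as the degree increases.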