# Support Vector Machine

Let's draw two plots of the same data and try to answer this question:

- Which line separates the points better?

As a preview, this is how a Support Vector Machine (SVM) works: it finds the line that separates the points while staying as far as possible from the nearest points of each class.

SVM extends the classification criterion a bit further: instead of accepting any line that separates the points, it looks for the one that is as far away from the points as possible. We do this by adding two more lines, parallel to the main line and equidistant from it, and **maximising** the distance between them, known as the **margin**.

![](../../assets/img/svm-boundary.png)
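To make the margin concrete, here is a minimal sketch on a made-up toy dataset (the points `X_toy` and `y_toy` are hypothetical and not taken from this notebook's data). For a linear SVM with weight vector $w$, the two parallel lines lie a distance $2 / \lVert w \rVert$ apart, so the margin can be read off a fitted `SVC` through its `coef_` attribute. The very large `C` is an assumption used here to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable clusters
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations
clf = SVC(kernel="linear", C=1e6)
clf.fit(X_toy, y_toy)

w = clf.coef_[0]        # normal vector of the separating line
b = clf.intercept_[0]   # offset of the separating line

# The two margin lines are w.x + b = +1 and w.x + b = -1,
# so the distance between them is 2 / ||w||
margin_width = 2.0 / np.linalg.norm(w)

print("w:", w, "b:", b)
print("margin width:", margin_width)
print("support vectors:\n", clf.support_vectors_)
```

The points listed in `support_vectors_` are the ones that sit on the two margin lines; moving any other point slightly would leave the fitted line unchanged.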

The error function now becomes a bit more complex:
$$
\text{error} = \text{classification error} + \text{margin error}
$$
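The text does not spell out the two terms, so the following is only a sketch under one common assumption: the classification error is the sum of hinge losses $\max(0, 1 - y_i (w \cdot x_i + b))$ with labels in $\{-1, +1\}$, and the margin error is $\tfrac{1}{2} \lVert w \rVert^2$, which grows as the margin shrinks. The helper `svm_error`, its weight `C`, and the sample points are hypothetical names and values chosen for illustration.

```python
import numpy as np

def svm_error(w, b, X, y, C=1.0):
    """Return (total, classification_error, margin_error) for a linear SVM.

    Assumes labels y are in {-1, +1}. C weights the classification
    error against the margin error.
    """
    scores = X @ w + b
    # Hinge loss: zero for points beyond the margin on the correct side
    classification_error = np.maximum(0.0, 1.0 - y * scores).sum()
    # A small ||w|| means a wide margin, so penalise large ||w||
    margin_error = 0.5 * np.dot(w, w)
    total = C * classification_error + margin_error
    return total, classification_error, margin_error

# Hypothetical example: the third point is on the wrong side of the line
X_demo = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y_demo = np.array([1, -1, -1])
w_demo = np.array([1.0, 0.0])
b_demo = 0.0

total, cls_err, mar_err = svm_error(w_demo, b_demo, X_demo, y_demo)
print("classification error:", cls_err)
print("margin error:", mar_err)
print("total error:", total)
```

Raising `C` makes misclassified or margin-violating points more expensive relative to the margin term, which is the same trade-off the `C` parameter of scikit-learn's `SVC` controls.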

## SVM in Scikit-Learn

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

In [None]:
data = pd.read_csv("../../data/data.csv", names=["x1", "x2", "label"])

In [None]:
X = data[["x1", "x2"]].values
y = data["label"].values  # 1-D label vector, the shape scikit-learn expects

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.25, random_state=111)

In [None]:
linear_svc = SVC(kernel='linear')
poly_svc = SVC(kernel='poly')
rbf_svc = SVC(kernel='rbf')

In [None]:
for model in [linear_svc, poly_svc, rbf_svc]:
    # fit each kernel on the training split, then report accuracy on both splits
    model.fit(X_train, y_train)
    print(f"model: SVC(kernel={model.kernel!r})")
    print(f"\ttraining accuracy: {accuracy_score(y_train, model.predict(X_train))}")
    print(f"\ttest accuracy: {accuracy_score(y_test, model.predict(X_test))}\n")

In [None]:
# a large C penalises misclassified points heavily; degree sets the polynomial order
model = SVC(kernel='poly', C=10000., degree=4, random_state=1111)
model.fit(X_train, y_train)
print("training accuracy:", accuracy_score(y_train, model.predict(X_train)))
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

In [None]:
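# LinearSVC uses liblinear and a squared hinge loss by default,
# so its results can differ from SVC(kernel='linear')
model = LinearSVC(random_state=1111)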
model.fit(X_train, y_train)
print("training accuracy:", accuracy_score(y_train, model.predict(X_train)))
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))