# Working With Support Vector Machines (SVM) for Classification and Detection
*Curtis Miller*

**Support vector machines (SVMs)** are linear classifiers. An SVM can be understood as a hyperplane (think: line) that, ideally, has only data from one class on one side and only data from the other class on the other side. Prediction amounts to determining on which side of the hyperplane a data point lies.
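The "which side of the hyperplane" rule can be sketched in a few lines. This is only an illustration: the weights `w` and offset `b` below are made-up values chosen by hand, not the result of training, and `predict_side` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical hyperplane parameters, picked by hand for illustration only
w = np.array([1.0, -1.0])   # normal vector of the hyperplane
b = 0.0                     # offset of the hyperplane from the origin

def predict_side(X, w, b):
    """Return 1 for points on the positive side of the hyperplane, 0 otherwise."""
    scores = X @ w + b              # signed score of each row of X
    return (scores > 0).astype(int)

points = np.array([[2.0, 1.0],      # score =  1.0, positive side
                   [1.0, 3.0]])     # score = -2.0, negative side
labels = predict_side(points, w, b)
print(labels)
```

A trained linear SVM applies the same rule, with `w` and `b` learned from the data.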

Training an SVM involves finding a hyperplane that best separates data from different classes while maximizing the margin between the plane and the nearest data points from each class. Because of this, SVMs tend to generalize well from training data to future data; they are not known to be prone to overfitting.

A hyperparameter common to all SVMs is $C$, known as the error penalty parameter. Smaller $C$ means stronger regularization, which tends to combat overfitting.
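To make the role of $C$ explicit, the standard soft-margin formulation is given here. It is the textbook form, written under the assumption that labels are coded as $y_i \in \{-1, +1\}$ (not the 0/1 coding used for the data later on); the symbols $w$, $b$, and the slack variables $\xi_i$ are notation introduced for this sketch.

$$
\min_{w,\, b,\, \xi} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad y_i \left( w^\top x_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0 .
$$

The first term favors a wide margin (the margin width is $2 / \lVert w \rVert$), while the second charges a cost of $C$ per unit of margin violation. With small $C$ violations are cheap, so the optimizer accepts some misclassified training points in exchange for a wider margin.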

SVMs can be trained using the `SVC` class provided in **scikit-learn**.

## Kernel Methods

The linearity assumption seems restrictive. Analysts overcome it by choosing a **kernel**, a mathematical function that alters the feature space an SVM is trained on. Choosing different kernels allows the boundary between classes to take different shapes, which may lead to better predictions.
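As a rough illustration of how the kernel changes the boundary shape, the sketch below compares a linear and an RBF kernel on a synthetic "circles" dataset, where one class sits inside a ring formed by the other. The dataset and its settings are assumptions chosen for the example; the scores are training accuracies, so they show only what boundary each kernel can express, not how well it generalizes.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Synthetic data: an inner circle (one class) surrounded by an outer ring (the other)
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Same data, two kernels; score() returns accuracy on the data passed in
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)

print("linear kernel training accuracy:", linear_acc)
print("RBF kernel training accuracy:   ", rbf_acc)
```

No straight line can separate a circle from the ring around it, so the linear kernel struggles here, while the RBF kernel can draw a closed boundary around the inner class.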

Again, we can consider choice of kernel as a hyperparameter to optimize. However, we may also use **domain knowledge** (our understanding of the phenomenon being learned) to pick a kernel.

Let's load in the *Titanic* dataset again.

In [None]:
import pandas as pd
from pandas import DataFrame
from sklearn.model_selection import train_test_split, cross_validate
from sklearn.metrics import classification_report
from random import seed

In [None]:
seed(110717)

titanic = pd.read_csv("titanic.csv")
titanic["Sex"] = titanic["Sex"].map({'male': 0, 'female': 1})    # Encode sex as 0 (male) or 1 (female)
titanic.drop("Name", axis=1, inplace=True)
titanic.head()

Here we would be wise to handle passenger class with more care. Although written as a number, it is really a categorical (or ordinal) variable; the numeric values themselves don't matter. We should encode it as binary indicator variables, one for each class.

In [None]:
pd.get_dummies(titanic.Pclass).head()

In [None]:
titanic = titanic.join(pd.get_dummies(titanic.Pclass, prefix='Pclass')).drop("Pclass", axis=1)
titanic.head()

In [None]:
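# random.seed() does not control scikit-learn's shuffling, so fix the split with random_state
titanic_train, titanic_test = train_test_split(titanic, random_state=110717)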
titanic_train.head()

## Training an SVM

We can train an SVM like so:

In [None]:
from sklearn.svm import SVC

In [None]:
svm1 = SVC(C=1.0,              # Penalty parameter C
           kernel='linear')    # Using a linear kernel
svm1.fit(X=titanic_train.drop("Survived", axis=1), y=titanic_train.Survived)

# Predict whether a 26-year-old male in second class, with no family aboard,
# who paid a $30 fare would survive
svm1.predict([[0, 26, 0, 0, 30, 0, 1, 0]])

Choosing the kernel and $C$ could be done with cross-validation, but I will not demonstrate this (it would take too long for this video).
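For reference, a cross-validated search over kernel and $C$ might look like the following minimal sketch. It runs on synthetic data from `make_classification` rather than the *Titanic* data so that it stays fast and self-contained; the candidate values in `param_grid` are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption for this sketch, not the Titanic data)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Candidate hyperparameter values; these particular values are arbitrary
param_grid = {"kernel": ["linear", "rbf"], "C": [0.1, 1.0, 10.0]}

# 5-fold cross-validation over every kernel/C combination
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
print("best parameters:", best_params)
print("mean cross-validated accuracy:", search.best_score_)
```

After fitting, `search.best_estimator_` holds an `SVC` refit on all of the supplied data with the selected settings, and it can be used for prediction like any other fitted model.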

In [None]:
print(classification_report(titanic_train.Survived, svm1.predict(titanic_train.drop("Survived", axis=1))))

The SVM does reasonably well on the training data. Let's see how it does on the test data.

In [None]:
survived_test_predict = svm1.predict(titanic_test.drop("Survived", axis=1))
print(classification_report(titanic_test.Survived, survived_test_predict))

Performance on test data is slightly worse.