# SVM: Support Vector Machines

A relatively young algorithm.

Initial defn: SVM **finds (outputs) a separating line (hyperplane) between data of two classes.**

Q: What makes a good separating line?

A: Maximises **margin**: distance between the line and the nearest point of either of two classes.

Underlying concept is to **maximise robustness**.
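
As a rough sketch of the margin idea, the snippet below fits a linear SVM to four made-up points and measures the distance from the separating line to the nearest point. The data and the very large `C` (used here to approximate a hard margin) are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (made up): two classes separated along the first axis.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([0, 0, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]        # normal vector of the separating line
b = clf.intercept_[0]   # offset of the separating line

# Distance of every training point to the line w.x + b = 0.
distances = np.abs(X @ w + b) / np.linalg.norm(w)

# The margin is the distance to the nearest point of either class.
margin = distances.min()
print(margin)
```

For this layout the separating line should sit midway between the two columns of points, so the printed margin should be close to 1.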

Question diagram.

**SVM prioritises correct classification over maximising margin.**

BUT **outliers**: an SVM may ignore individual outliers so that it can construct the best decision surface possible. **SVM is somewhat robust to outliers.** It balances the attempt to find the maximum-margin separator against the ability to ignore outliers. This is a tradeoff: parameters determine how willing it is to ignore outliers.
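
One way to see this tradeoff in scikit-learn is the `C` parameter of `SVC`. The sketch below uses made-up data with one class-0 point sitting close to class 1, and compares a large `C` (insists on classifying every training point) with a small `C` (tolerates violations in exchange for a wider margin). The helper `margin_width` is a hypothetical name introduced here.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up data: class 0 on the left, class 1 on the right,
# and one class-0 point (the last row) lying very close to class 1.
X = np.array([[0.0, 0.0], [0.5, 1.0], [1.0, 0.0],
              [4.0, 0.0], [4.5, 1.0], [5.0, 0.0],
              [3.8, 0.5]])
y = np.array([0, 0, 0, 1, 1, 1, 0])

def margin_width(clf):
    # For a linear SVM the margin on each side of the boundary is 1 / ||w||.
    return 1.0 / np.linalg.norm(clf.coef_[0])

# Large C: training errors are penalised heavily, so the outlier dictates the boundary.
strict = SVC(kernel="linear", C=1e4).fit(X, y)

# Small C: violations are cheap, so the SVM prefers a wide margin.
lenient = SVC(kernel="linear", C=0.01).fit(X, y)

for name, clf in [("strict", strict), ("lenient", lenient)]:
    print(name, "margin:", margin_width(clf), "training accuracy:", clf.score(X, y))
```

The strict model fits every training point but ends up with a narrow margin; the lenient model gets a much wider margin and may leave the outlier on the wrong side or inside the margin.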

From [sk-learn documentation](http://scikit-learn.org/stable/modules/svm.html):

The advantages of support vector machines are:
* Effective in high dimensional spaces.
* Still effective in cases where number of dimensions is greater than the number of samples.
* Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
* Versatile: different Kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.
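
To illustrate the "subset of training points" claim, here is a small sketch on made-up data. After fitting, `SVC` exposes the support vectors through the attributes `support_`, `support_vectors_`, and `n_support_`. The two points placed far from the boundary are not expected to be support vectors.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up data: four points near the boundary plus two far-away points
# (indices 4 and 5) that should not influence the decision function.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0],
              [-3.0, 0.0], [5.0, 1.0]])
y = np.array([0, 0, 1, 1, 0, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)

print(clf.support_)          # indices of the support vectors in X
print(clf.support_vectors_)  # the support vectors themselves
print(clf.n_support_)        # number of support vectors per class
```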

The disadvantages of support vector machines include:
* If the number of features is much greater than the number of samples, the method is likely to give poor performances.
* SVMs do not directly provide probability estimates, these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below).


In [1]:
from sklearn import svm

# Two training points, one per class
X = [[0, 0], [1, 1]]
y = [0, 1]

# Create the classifier with default parameters and fit it
clf = svm.SVC()
clf.fit(X, y)

# Predict the class of a new point
clf.predict([[2., 2.]])

In [None]:
# Exercise
import sys
from class_vis import prettyPicture
from prep_terrain_data import makeTerrainData

import matplotlib.pyplot as plt
import copy
import numpy as np
import pylab as pl


features_train, labels_train, features_test, labels_test = makeTerrainData()


########################## SVM #################################
### we handle the import statement and SVC creation for you here
from sklearn.svm import SVC
clf = SVC(kernel="linear")


#### now your job is to fit the classifier
#### using the training features/labels, and to
#### make a set of predictions on the test data
clf.fit(features_train, labels_train)

#### store your predictions in a list named pred
pred = clf.predict(features_test)


from sklearn.metrics import accuracy_score
acc = accuracy_score(labels_test, pred)  # signature is accuracy_score(y_true, y_pred)

def submitAccuracy():
    return acc
# 0.92

## Nonlinear SVMs

(Insert diagram)

An SVM is built to find a linear separator.

* Previously assume inputs x,y - SVM -> Label.
* Now have x, y, $x^2+y^2$ - SVM -> Label.

SVM makes nonlinear decision surfaces by **making new features**. In the case above, $z = x^2 + y^2$ is a new feature.
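
A minimal sketch of this idea, assuming synthetic data from `sklearn.datasets.make_circles` (an inner circle of one class surrounded by an outer circle of the other): a linear SVM on `x, y` alone struggles, while the same linear SVM with the extra column `z = x^2 + y^2` can separate the classes.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Synthetic, noise-free data chosen for illustration.
X, labels = make_circles(n_samples=200, factor=0.3, random_state=0)

# Linear SVM on the original features x, y.
plain = SVC(kernel="linear").fit(X, labels)

# Add the new feature z = x^2 + y^2 as a third column.
z = (X ** 2).sum(axis=1, keepdims=True)
X_new = np.hstack([X, z])
lifted = SVC(kernel="linear").fit(X_new, labels)

# Training accuracy, before and after adding z.
acc_plain = plain.score(X, labels)
acc_lifted = lifted.score(X_new, labels)
print(acc_plain, acc_lifted)
```

Both models are linear; only the input features differ. Accuracy is measured on the training data here to keep the sketch short.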

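Are they now linearly separable? With $z$ as an extra feature, classes separated by a circle centred on the origin in the $(x, y)$ plane can be split by a plane of constant $z$.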

### Finding New Features using the Kernel Trick

Gist:
* Map the input space X,Y into a much larger input space $X_i$ using the kernel trick.
* Separate the classes there using an SVM.
* Take the solution back to the original space to get a non-linear separation.

**One of the most central tricks in machine learning.**
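
The reason this counts as a "trick" can be sketched with a small numeric check. For the degree-2 polynomial kernel $k(a, b) = (a \cdot b)^2$ in two dimensions, the kernel value equals the dot product in a larger feature space, so the larger space never has to be built explicitly. The function names `phi` and `poly2_kernel` are hypothetical, chosen for this illustration.

```python
import numpy as np

def phi(v):
    # Explicit feature map for k(a, b) = (a . b)^2 with 2-D inputs.
    x, y = v
    return np.array([x * x, np.sqrt(2) * x * y, y * y])

def poly2_kernel(a, b):
    # Same quantity, computed directly in the original 2-D space.
    return np.dot(a, b) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

explicit = np.dot(phi(a), phi(b))   # dot product in the 3-D feature space
via_kernel = poly2_kernel(a, b)     # kernel evaluated in the 2-D input space
print(explicit, via_kernel)
```

The two printed numbers should agree up to floating-point rounding.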


(Groups kernel?)

As specified in the documentation, "different Kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels."

**SVC**: Support Vector Classifier (a type of SVM)

`kernel` is an SVC parameter. Kernels available for SVC:
* "linear"
* "poly"
* "rbf" (default)
* "sigmoid"
* "precomputed"
* a callable

In [None]:
# e.g.:
clf = SVC(kernel="linear")
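
The "callable" option can be sketched as follows. Here `my_linear_kernel` is a hypothetical function that reproduces the built-in linear kernel by returning the matrix of pairwise dot products. The data is made up.

```python
import numpy as np
from sklearn.svm import SVC

def my_linear_kernel(A, B):
    # Must return the Gram matrix of shape (n_samples_A, n_samples_B).
    return np.dot(A, B.T)

X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([0, 0, 1, 1])

clf_custom = SVC(kernel=my_linear_kernel).fit(X, y)
clf_builtin = SVC(kernel="linear").fit(X, y)

# Both classifiers should agree on new points.
X_query = np.array([[-1.0, 0.5], [3.0, 0.5]])
pred_custom = clf_custom.predict(X_query)
pred_builtin = clf_builtin.predict(X_query)
print(pred_custom, pred_builtin)
```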

### Parameters in ML
Arguments passed when you **create** your classifier. I.e. before fitting.

Sample parameters for an SVM:
* kernel
* C
* gamma
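
A short sketch of passing these parameters at creation time. The values are arbitrary picks for illustration, not recommendations. In scikit-learn, `C` is the regularisation parameter (larger values penalise training errors more heavily) and `gamma` is the kernel coefficient used by the "rbf", "poly" and "sigmoid" kernels.

```python
from sklearn.svm import SVC

# Parameters are fixed when the classifier is created, before fit() is called.
clf = SVC(kernel="rbf", C=10.0, gamma=0.1)

# get_params() returns the current parameter settings as a dict.
params = clf.get_params()
print(params["kernel"], params["C"], params["gamma"])
```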