# Support Vector Machines

## What is an SVM?
- Trained using the hinge loss and L2 regularization
![image-10](image-10.png)

If a training example falls in this "zero loss" region (the flat part of the hinge loss), it doesn't contribute to the fit; removing it would change nothing. **This is a key property of SVMs.**
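As a rough illustration of the zero-loss region, the hinge loss can be written as `max(0, 1 - m)`, where `m` is the label (coded as -1 or +1) times the raw model output. The helper name `hinge_loss` and the sample values below are made up for this sketch; they are not part of any library.

```python
import numpy as np

def hinge_loss(margin):
    """Hinge loss as a function of margin = y * (w . x + b), with y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - margin)

# Illustrative margins (assumed values):
#   negative        -> misclassified
#   between 0 and 1 -> correct but close to the boundary
#   1 or larger     -> in the flat "zero loss" region
margins = np.array([-1.0, 0.0, 0.5, 1.0, 2.0])
losses = hinge_loss(margins)

for m, l in zip(margins, losses):
    print(f"margin={m:+.1f}  loss={l:.1f}")
```

Only the examples with a margin below 1 have a nonzero loss, so only those can influence the fit.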

- Support vector: a training example **not** in the flat part of the loss diagram.
- Equivalently, a support vector is an example that is incorrectly classified **or** close to the boundary.

![image-11](image-11.png)

In the figure, support vectors are shown with yellow circles around them.
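One way to check the "removing non-support vectors changes nothing" property is to refit a model on the support vectors alone. The sketch below uses synthetic, well-separated data from `make_blobs` (an assumption for illustration, not the data in the figure) and compares predictions from the two fits; the fitted coefficients should also agree up to solver tolerance.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data, chosen only for illustration
X_demo, y_demo = make_blobs(n_samples=200, centers=[(-3, -3), (3, 3)],
                            cluster_std=1.0, random_state=0)

# Fit a linear SVM on every training example
svm_full = SVC(kernel="linear")
svm_full.fit(X_demo, y_demo)
print("Number of examples:", len(X_demo))
print("Number of support vectors:", len(svm_full.support_))

# Refit using only the support vectors (support_ holds their row indices)
X_small = X_demo[svm_full.support_]
y_small = y_demo[svm_full.support_]
svm_small = SVC(kernel="linear")
svm_small.fit(X_small, y_small)

# Compare the two models on the full training set
same_predictions = np.array_equal(svm_full.predict(X_demo),
                                  svm_small.predict(X_demo))
print("Same predictions:", same_predictions)
```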

## Kernel SVMs 
Having only a small number of support vectors makes kernel SVMs fast: there are clever algorithms whose running time scales only with the number of support vectors, rather than with the total number of training examples.

### SVM Kernel Functions
SVM algorithms use a set of mathematical functions called kernels. The job of a kernel is to take the data as input and implicitly transform it into a new feature space in which a linear boundary can separate the classes.

**In scikit-learn's `SVC`, the default is `kernel='rbf'`.**

RBF = Radial Basis Function
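To make the RBF kernel concrete: for two points it evaluates `exp(-gamma * ||x - z||^2)`, so nearby points score close to 1 and distant points score close to 0. The sketch below computes this by hand and checks it against scikit-learn's `rbf_kernel`; the helper `rbf`, the sample points, and the `gamma` value are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2) for two 1-D arrays."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

# Three illustrative 2-D points (assumed values)
points = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
gamma = 0.1

# Pairwise kernel matrix computed by hand
manual = np.array([[rbf(a, b, gamma) for b in points] for a in points])

# Same matrix from scikit-learn
library = rbf_kernel(points, gamma=gamma)

print(np.round(manual, 4))
print("Matches scikit-learn:", np.allclose(manual, library))
```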

```python
from sklearn.datasets import load_digits

# Load the handwritten digits dataset (features in X, labels in y)
digits = load_digits()
X = digits.data
y = digits.target
```

```python
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Instantiate an RBF SVM
svm = SVC()

# Instantiate the GridSearchCV object and run the search.
# gamma controls the smoothness of the boundary:
# decreasing gamma makes the boundary smoother.
parameters = {'gamma': [0.00001, 0.0001, 0.001, 0.01, 0.1]}
searcher = GridSearchCV(svm, parameters)
searcher.fit(X, y)

# Report the best parameters
print("Best CV params", searcher.best_params_)
```

```
Best CV params {'gamma': 0.001}
```
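To see why the search settles on a value in the middle of the grid, it can help to compare training and held-out accuracy at a few `gamma` values. This is a minimal sketch assuming a simple train/test split with `random_state=0` (not the cross-validation used above); the variable names are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_d, y_d = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_d, y_d, random_state=0)

# Map each gamma to (training accuracy, held-out accuracy)
gamma_scores = {}
for g in [0.00001, 0.001, 0.1]:
    model = SVC(gamma=g).fit(X_tr, y_tr)
    gamma_scores[g] = (model.score(X_tr, y_tr), model.score(X_te, y_te))
    print(f"gamma={g}: train={gamma_scores[g][0]:.3f}, test={gamma_scores[g][1]:.3f}")
```

A large gap between training and held-out accuracy at a high `gamma` is a sign of overfitting: the boundary has become too wiggly to generalize.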


```python
from sklearn.model_selection import train_test_split

# Hold out a test set so the final score is measured on unseen data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Instantiate an RBF SVM
svm = SVC()

# Instantiate the GridSearchCV object and run the search.
# The C hyperparameter controls regularization.
parameters = {'C': [0.1, 1, 10], 'gamma': [0.00001, 0.0001, 0.001, 0.01, 0.1]}
searcher = GridSearchCV(svm, parameters)
searcher.fit(X_train, y_train)

# Report the best parameters and the corresponding score
print("Best CV params", searcher.best_params_)
print("Best CV accuracy", searcher.best_score_)

# Report the test accuracy using these best parameters
print("Test accuracy of best grid search hypers:", searcher.score(X_test, y_test))
```

```
Best CV params {'C': 10, 'gamma': 0.001}
Best CV accuracy 0.988861352058378
Test accuracy of best grid search hypers: 0.9911111111111112
```
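The search above reports only the winning combination. To see how the other combinations fared, the fitted `GridSearchCV` object exposes a `cv_results_` dictionary. The sketch below is self-contained and uses a smaller grid than the one above to keep it quick; the variable names are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X_d, y_d = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_d, y_d, random_state=42)

# Reduced grid (an assumption for speed, not the grid used above)
grid = {"C": [1, 10], "gamma": [0.0001, 0.001]}
search = GridSearchCV(SVC(), grid).fit(X_tr, y_tr)

# Pair each parameter combination with its mean cross-validated accuracy
rows = list(zip(search.cv_results_["params"],
                search.cv_results_["mean_test_score"]))

# Print combinations from best to worst
for params, mean_score in sorted(rows, key=lambda r: r[1], reverse=True):
    print(params, round(mean_score, 4))
```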


## Comparing logistic regression and SVM
![image-12](image-12.png)


### SGDClassifier 
`SGDClassifier`: scales well to large datasets
- `SGDClassifier` hyperparameter `alpha` is like `1/C`

To switch between logistic regression and a linear SVM, one only has to set the `loss` hyperparameter of the `SGDClassifier`.

`logreg = SGDClassifier(loss='log_loss')` #logistic regression

`linsvm = SGDClassifier(loss='hinge')` #linear SVM

```python
from sklearn.linear_model import SGDClassifier

# We set random_state=0 for reproducibility
linear_classifier = SGDClassifier(random_state=0)

# Instantiate the GridSearchCV object and run the search
parameters = {'alpha': [0.00001, 0.0001, 0.001, 0.01, 0.1, 1],
              'loss': ['hinge', 'log_loss']}
searcher = GridSearchCV(linear_classifier, parameters, cv=10)
searcher.fit(X_train, y_train)

# Report the best parameters and the corresponding score
print("Best CV params", searcher.best_params_)
print("Best CV accuracy", searcher.best_score_)
print("Test accuracy of best grid search hypers:", searcher.score(X_test, y_test))
```

```
Best CV params {'alpha': 0.1, 'loss': 'log_loss'}
Best CV accuracy 0.9517081260364844
Test accuracy of best grid search hypers: 0.9644444444444444
```
