# Statistical Pattern Recognition - Solution 8: Support Vector Machines

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.model_selection import train_test_split
from libsvm.svmutil import svm_problem, svm_parameter, svm_train, svm_predict


## $\star$ Part 1: libsvm
In this exercise, we will use the package `libsvm-official`, a Python interface to the `LIBSVM` library, which is itself written in C++ and Java.

#### Installation
* If you installed all packages from the `requirements.txt` file, `libsvm-official` should already be installed. Installation and usage instructions can be found [here](https://github.com/cjlin1/libsvm/tree/master/python).
* If you have problems installing it on Windows, make sure you are using Python 3.8 (run `python -V`).

#### Usage
* Details on how to use the original `LIBSVM` library and its parameters can be found [here](https://github.com/cjlin1/libsvm).
* Details on how to use the Python interface can be found [here](https://github.com/cjlin1/libsvm/tree/master/python).
* Further documentation can be found on the library's [main page](http://www.csie.ntu.edu.tw/~cjlin/libsvm/).

### Data preparation
1. Load the data from `dataset.npz` and, as in previous assignments, split it evenly into a training set and a test set.
2. Visualize the data; you can reuse parts of exercise 4 for this.
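
A minimal sketch of the loading and splitting step. The key names `X` and `y` inside the archive are assumptions (inspect `data.files` to see the real keys). So that the snippet runs on its own, it first writes a small synthetic archive to a temporary path instead of reading the course's `dataset.npz`. The split uses a NumPy permutation; `train_test_split(X, y, test_size=0.5)` from scikit-learn would serve the same purpose.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the course data: 3 classes, 2 features (assumption).
X_demo = rng.normal(size=(60, 2))
y_demo = np.repeat([0, 1, 2], 20)

# Write a throwaway archive; the key names "X" and "y" are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "demo_dataset.npz")
np.savez(path, X=X_demo, y=y_demo)

# Load the archive and list its keys before indexing into it.
with np.load(path) as data:
    print(data.files)
    X, y = data["X"], data["y"]

# Shuffle the indices once, then cut them in half for an even split.
perm = rng.permutation(len(X))
half = len(X) // 2
train_idx, test_idx = perm[:half], perm[half:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```

For the real data, replace `path` with `"dataset.npz"` and the two key names with whatever `data.files` reports.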

In [None]:
# START TODO ################
raise NotImplementedError
# END TODO ################


### Using LIBSVM
Familiarize yourself with the library:
* Learn how to train SVMs using `libsvm-official`.
* Learn how to set options and hyper-parameters.
* How do you deal with the multiclass case?
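
On the multiclass question: LIBSVM handles more than two classes with a one-against-one scheme, training one binary SVM per pair of classes and letting them vote. The sketch below only illustrates that bookkeeping in plain Python. The function names and the toy vote list are made up for illustration and are not part of LIBSVM's API.

```python
from collections import Counter
from itertools import combinations


def class_pairs(labels):
    """Return every unordered pair of distinct class labels."""
    return list(combinations(sorted(set(labels)), 2))


def majority_vote(pair_winners):
    """Pick the label that wins the most pairwise duels.

    Ties go to the smallest label here; this tie-break is an assumption
    of this sketch, not a statement about LIBSVM internals.
    """
    counts = Counter(pair_winners)
    best = max(counts.values())
    return min(label for label, n in counts.items() if n == best)


pairs = class_pairs([0, 1, 2, 2, 1, 0])
print(pairs)  # [(0, 1), (0, 2), (1, 2)]

# Hypothetical outcome of the three duels for a single test point.
print(majority_vote([1, 2, 1]))  # 1
```

With $k$ classes this gives $k(k-1)/2$ binary classifiers, all trained by a single `svm_train` call.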

Implement the `libsvm_classifier` function below (hints and example at the end of the notebook):
* Given `params`, fit an SVM to the training data.
* Compute the predictions on the test data and print the model's test accuracy.
* Plot the test data and mark the wrongly classified points with a black circle around them (use `plt.scatter(..., marker="o", ec="black", fc="none", s=200)` to get black circles).
* Plot the training data and highlight the support vectors by plotting them in a bigger size. Use `sv_indices = np.array(model.get_sv_indices()) - 1` to get the support vector indices. Note that libsvm starts counting at 1, so we need to subtract 1.
* Show the model's decision boundaries with a contour plot as in exercise 4: Create a meshgrid for the 2 input features, reshape and stack it,
    predict its labels using the trained SVM model and plot the contour.
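
The array handling behind the plotting steps above can be sketched as follows. `toy_predict` is a hypothetical stand-in for the trained SVM, and the test points and support vector indices are made-up values, so the snippet runs without libsvm or matplotlib.

```python
import numpy as np


def make_grid(X, step=0.05, pad=0.5):
    """Build a meshgrid covering both features, plus an (n_points, 2) stack."""
    x_min, x_max = X[:, 0].min() - pad, X[:, 0].max() + pad
    y_min, y_max = X[:, 1].min() - pad, X[:, 1].max() + pad
    xx, yy = np.meshgrid(np.arange(x_min, x_max, step),
                         np.arange(y_min, y_max, step))
    grid_points = np.column_stack([xx.ravel(), yy.ravel()])
    return xx, yy, grid_points


def toy_predict(points):
    """Hypothetical stand-in for the SVM: label 1 where x0 + x1 > 0."""
    return (points[:, 0] + points[:, 1] > 0).astype(int)


X_test = np.array([[-1.0, -1.0], [1.0, 1.0], [0.5, -1.0], [-0.5, 2.0]])
y_test = np.array([0, 1, 1, 1])

xx, yy, grid_points = make_grid(X_test)
zz = toy_predict(grid_points).reshape(xx.shape)  # same shape as xx for contourf

# Boolean mask of the wrongly classified test points (to circle in the plot).
y_pred = toy_predict(X_test)
wrong = y_pred != y_test

# LIBSVM reports 1-based support vector indices; shift them to 0-based.
libsvm_style_indices = [1, 3]  # made-up values for illustration
sv_indices = np.array(libsvm_style_indices) - 1
```

With matplotlib, `plt.contourf(xx, yy, zz, alpha=0.3)` draws the decision regions, and `X_test[wrong]` selects the points to circle.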

In [None]:
def libsvm_classifier(data_train, data_test, params):
    """
    Trains an SVM with the given libsvm parameter string `params` on the
    training data, then predicts the test labels and computes the test accuracy.
    """
    print(f"Running SVM with parameters {params}")

    # START TODO ################
    # Train the model and predict the labels of the test data
    raise NotImplementedError
    # END TODO ################

    # START TODO ################
    # Visualize the model's predictions and decision boundary
    raise NotImplementedError
    # END TODO ################


### Kernels
Train SVMs using different kernels: start with the linear and RBF kernels.
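
For reference, the linear kernel is the inner product $u^\top v$, and the RBF kernel as LIBSVM defines it is $\exp(-\gamma \lVert u - v \rVert^2)$. The sketch below computes both Gram matrices in NumPy to show what the two options measure. The function names are illustrative only; LIBSVM evaluates its kernels internally.

```python
import numpy as np


def linear_kernel(A, B):
    """Gram matrix of inner products between rows of A and rows of B."""
    return A @ B.T


def rbf_kernel(A, B, gamma):
    """Gram matrix of exp(-gamma * ||a - b||^2) for rows a of A and b of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K_lin = linear_kernel(A, A)
K_rbf = rbf_kernel(A, A, gamma=0.5)
```

The RBF similarity of a point with itself is always 1 and decays with distance, which is why that kernel can produce curved decision boundaries where the linear kernel cannot.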

In [None]:
# START TODO ################
# Train, predict, plot
raise NotImplementedError
# END TODO ################


### Hyper-parameters
Pick the kernel that worked best in the runs above. Try out different hyper-parameter configurations available for that kernel and compare their results.
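
One way to organize the comparison is a small grid over candidate values. The sketch below only builds LIBSVM option strings and assumes, purely for illustration, that the RBF kernel was the one chosen. The candidate values are arbitrary placeholders; `-c` (cost) and `-g` (gamma) are the standard LIBSVM options.

```python
from itertools import product

# Candidate values are placeholders; pick ranges that suit your data.
costs = [0.1, 1, 10]
gammas = [0.01, 0.1, 1]

# "-t 2" selects the RBF kernel, "-c" sets C, "-g" sets gamma, "-q" is quiet.
param_grid = [f"-t 2 -c {c} -g {g} -q" for c, g in product(costs, gammas)]

for params in param_grid:
    print(params)
```

Each string can then be passed as `params` to `libsvm_classifier(data_train, data_test, params)`.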

In [None]:
# START TODO ################
raise NotImplementedError
# END TODO ################


### Margin size
Train several SVMs with different margin sizes, from large to small.
Which parameter can you use to achieve this, and how do you set it in `LIBSVM`?
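
For orientation, the soft-margin objective that C-SVC optimizes can be written in its standard textbook form, where $\phi$ denotes the feature map induced by the kernel and $\xi_i$ are the slack variables:

$$
\min_{w, b, \xi} \; \frac{1}{2} w^\top w + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad y_i \left( w^\top \phi(x_i) + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0 .
$$

The margin width is $2 / \lVert w \rVert$, so a small $C$ tolerates more slack and favours a wide margin, while a large $C$ penalizes violations more heavily and narrows it. In LIBSVM, $C$ is set with the `-c` option (default 1), e.g. `"-t 0 -c 0.01"`.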

In [None]:
# START TODO ################
raise NotImplementedError
# END TODO ################


#### Hints
* libsvm expects the model parameters as a single string, e.g. `"-t 2"` for the RBF kernel and `"-t 0"` for the linear kernel. 
* You can use the option `-q` to enable quiet mode, which suppresses libsvm's console output.
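
A minimal sketch of how these hints fit together, assuming `libsvm-official` is installed. The helper names `train_and_score` and `kernel_params` are hypothetical. The import sits inside the function so that the snippet can be loaded even where libsvm is missing.

```python
def train_and_score(x_train, y_train, x_test, y_test, params="-t 2 -q"):
    """Fit a LIBSVM model and return (predicted labels, test accuracy in %).

    x_* are lists of feature lists, y_* are lists of numeric labels.
    """
    # Imported lazily so this file can be loaded without libsvm installed.
    from libsvm.svmutil import svm_parameter, svm_predict, svm_problem, svm_train

    problem = svm_problem(y_train, x_train)  # note the order: labels first
    model = svm_train(problem, svm_parameter(params))
    # svm_predict returns (labels, (accuracy, mse, scc), decision values).
    p_labels, p_acc, _ = svm_predict(y_test, x_test, model, "-q")
    return p_labels, p_acc[0]


def kernel_params(kernel, quiet=True):
    """Map a kernel name to its LIBSVM option string."""
    kernel_ids = {"linear": 0, "polynomial": 1, "rbf": 2, "sigmoid": 3}
    options = f"-t {kernel_ids[kernel]}"
    return options + " -q" if quiet else options
```

With NumPy arrays, a call could look like `train_and_score(X_train.tolist(), y_train.tolist(), X_test.tolist(), y_test.tolist(), kernel_params("rbf"))`.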

#### Example output
Predictions and decision boundaries:

![example output](ex8_example_output.jpg)