# Classification: Support Vector Machines

Enough with the simple stuff! Now let's use SVMs.

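An SVM looks for the decision boundary (a hyperplane) with the largest margin, that is, the largest distance to the closest training points of each class. Those closest points are called the support vectors.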
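As a rough illustration of the margin idea, the sketch below fits a linear `SVC` on a tiny hand-made dataset and reads back the support vectors and the margin width, which for a linear SVM is 2 / ||w||. The six points and the large `C` value are assumptions chosen for this example only; they are not part of the notebook's data.

```python
import numpy as np
from sklearn import svm

# Toy, linearly separable data (made up for illustration)
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C leaves very little room for margin violations (close to a hard margin)
clf = svm.SVC(kernel="linear", C=1000.0)
clf.fit(X, y)

# The support vectors are the training points that sit on the edge of the margin
print("Support vectors:")
print(clf.support_vectors_)

# For a linear kernel, the margin width is 2 / ||w||
w = clf.coef_[0]
margin = 2.0 / np.linalg.norm(w)
print(f"Margin width: {margin:.3f}")
```

Only the points nearest the boundary end up in `support_vectors_`; moving any of the other points slightly would not change the fitted hyperplane.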

References: <br>
Plot Iris SVC: https://scikit-learn.org/stable/auto_examples/svm/plot_iris_svc.html <br>
Recognizing hand-written digits: https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html#sphx-glr-auto-examples-classification-plot-digits-classification-py <br>
Confusion matrix: https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py

## Installation

In [None]:
%pip install numpy
%pip install matplotlib
%pip install scikit-learn
%pip install pandas

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets, metrics
from sklearn.model_selection import train_test_split

## Classifying the Iris dataset

We've used the Iris dataset a lot by now, but we haven't yet classified it with an SVM.

### Viewing the Iris dataset

Let's view the dataset one more time.

In [None]:
iris_x, iris_y = datasets.load_iris(return_X_y=True, as_frame=True)

In [None]:
iris_x.head()

In [None]:
iris_y.head()

### Choosing our inputs

Once again, let's use the sepal length and width as our inputs, with a train/test split.

In [None]:
# Filter out the petal width and length, then convert to a numpy array

# Split the data into training and testing sets


<details><summary>Click to cheat</summary>

```python
# Filter out the petal width and length, then convert to a numpy array
iris_sepal = iris_x.filter(items=['sepal length (cm)', 'sepal width (cm)']).to_numpy()
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    iris_sepal, iris_y.to_numpy(), train_size=0.7
)
```
</details>

### Creating our models using different kernels

In [None]:
# Choose your hyperparameters
# C = 
# gamma = 
# degree = 

# Create the models
models = (
    # Create a SVC with a linear kernel

    # Create a LinearSVC with a max_iter of a big number

    # Create a SVC with a RBF kernel

    # Create a SVC with a polynomial kernel

)

# Train the models and store in a list


<details><summary>Click to cheat</summary>

```python
# Choose your hyperparameters
C = 1.0
gamma = 0.5
degree = 3

# Create the models
models = (
    # Create a SVC with a linear kernel
    svm.SVC(kernel='linear', C=C),
    # Create a LinearSVC with a max_iter of a big number
    svm.LinearSVC(C=C, max_iter=10000),
    # Create a SVC with a RBF kernel
    svm.SVC(kernel='rbf', gamma=gamma, C=C),
    # Create a SVC with a polynomial kernel
    svm.SVC(kernel="poly", degree=degree, gamma="auto", C=C)
)

# Train the models and store in a list
models2 = [model.fit(X_train, y_train) for model in models]
```
</details>
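One quick way to compare the kernels numerically is the `score` method, which returns the mean accuracy on the data you pass in. The sketch below is self-contained, so it reloads Iris and refits three of the models; `random_state=0` and the dictionary names are assumptions added here for repeatability and are not part of the cells above.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

# Sepal length and width are the first two feature columns of Iris
X, y = datasets.load_iris(return_X_y=True)
X_sepal = X[:, :2]

# random_state is fixed here only so that the numbers repeat between runs
X_tr, X_te, y_tr, y_te = train_test_split(
    X_sepal, y, train_size=0.7, random_state=0
)

# Same hyperparameter values as the cheat cell above
candidates = {
    "linear": svm.SVC(kernel="linear", C=1.0),
    "rbf": svm.SVC(kernel="rbf", gamma=0.5, C=1.0),
    "poly": svm.SVC(kernel="poly", degree=3, gamma="auto", C=1.0),
}

# Fit each model and record its accuracy on the held-out test set
scores = {}
for name, candidate in candidates.items():
    candidate.fit(X_tr, y_tr)
    scores[name] = candidate.score(X_te, y_te)
    print(f"{name:>6}: test accuracy = {scores[name]:.3f}")
```

Accuracy on a single split is noisy for a dataset this small, so treat small differences between kernels with caution.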

### Plotting the Hyperplanes

In [None]:
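def make_meshgrid(x, y, h=0.02):
    # Build a grid of points covering the data range (plus a margin of 1),
    # with a step size of h between neighbouring grid points
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy


def plot_contours(ax, clf, xx, yy, **params):
    # Predict a class for every grid point, then draw the filled decision regions
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out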

titles = (
    f"SVC with linear kernel (C = {C})",
    "LinearSVC (linear kernel)",
    f"SVC with RBF kernel (gamma = {gamma})",
    f"SVC with polynomial (degree {degree}) kernel",
)

# Set-up 2x2 grid for plotting.
fig, sub = plt.subplots(2, 2)
plt.subplots_adjust(wspace=0.4, hspace=0.4)

train_x0, train_x1 = X_train[:, 0], X_train[:, 1]
test_x0, test_x1 = X_test[:, 0], X_test[:, 1]
xx, yy = make_meshgrid(iris_sepal[:,0], iris_sepal[:,1])

for model, title, ax in zip(models2, titles, sub.flatten()):
    plot_contours(ax, model, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
    ax.scatter(train_x0, train_x1, c=y_train, cmap=plt.cm.coolwarm, s=20, edgecolors="k")
    ax.scatter(test_x0, test_x1, c=y_test, cmap=plt.cm.coolwarm, s=40, edgecolors="k", marker='*')
    ax.set_xlim(xx.min(), xx.max())
    ax.set_ylim(yy.min(), yy.max())
    ax.set_xlabel("Sepal length")
    ax.set_ylabel("Sepal width")
    ax.set_xticks(())
    ax.set_yticks(())
    ax.set_title(title)

plt.show()

### Confusion matrix of our models

In [None]:
from sklearn.metrics import ConfusionMatrixDisplay

iris = datasets.load_iris()

titles_options = [
    ("Confusion matrix, without normalization", None),
    ("Normalized confusion matrix", "true"),
]

# pick a model from our trained models
model = models2[0]

for title, normalize in titles_options:
    disp = ConfusionMatrixDisplay.from_estimator(
        model,
        X_test,
        y_test,
        display_labels=iris.target_names,
        cmap=plt.cm.Blues,
        normalize=normalize,
    )
    disp.ax_.set_title(title)

plt.show()
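If you want the numbers behind the plots, `metrics.confusion_matrix` returns the same matrix as a NumPy array. This is a minimal, self-contained sketch: it refits a linear SVC on the sepal features, and `random_state=0` is an assumption added so the output repeats.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

# Reload Iris and keep only sepal length and width (the first two columns)
X, y = datasets.load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X[:, :2], y, train_size=0.7, random_state=0
)

clf = svm.SVC(kernel="linear", C=1.0).fit(X_tr, y_tr)
y_hat = clf.predict(X_te)

# Rows are the true classes, columns are the predicted classes
cm = metrics.confusion_matrix(y_te, y_hat)
print(cm)

# normalize="true" divides each row by the number of true samples in that class
cm_norm = metrics.confusion_matrix(y_te, y_hat, normalize="true")
print(cm_norm.round(2))
```

With `normalize="true"`, each diagonal entry is the recall for that class, which makes classes of different sizes easier to compare.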


## Digits dataset

Enough of Iris! Let's start using the digits dataset.

### Loading the data

First things first, we need to load the data. Let's also view the first few samples while we're at it.

In [None]:
# Load the digits as a bunch object
# We do this to get the target names and images for plotting

# Also load the digits X as a pandas Dataframe and the y as a Series


# Plot the first few examples
_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, label in zip(axes, digits.images, digits.target):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
    ax.set_title(f"Digit {label}")

<details><summary>Click to cheat</summary>

```python
# Load the digits as a bunch object
# We do this to get the target names and images for plotting
digits = datasets.load_digits()
# Also load the digits X as a pandas Dataframe and the y as a Series
digits_X, digits_y = datasets.load_digits(return_X_y=True, as_frame=True)

# Plot the first few examples
_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, label in zip(axes, digits.images, digits.target):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
    ax.set_title(f"Digit {label}")
```
</details>

Now let's split our labelled data into training and testing sets with a 70/30 ratio.

<details><summary>Click to cheat</summary>

```python
X_train, X_test, y_train, y_test = train_test_split(
    digits_X.to_numpy(), digits_y.to_numpy(), test_size=0.3
)
```
</details>
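It is easy to mix up `train_size` and `test_size`, so it can be worth checking the resulting proportions. The sketch below is one way to do that; `random_state=0` and `stratify` are optional extras assumed here (they fix the shuffle and keep the digit classes balanced across both sets) and are not required by the exercise.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

X_all, y_all = datasets.load_digits(return_X_y=True)

# test_size=0.3 holds out 30% of the samples, leaving 70% for training
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.3, random_state=0, stratify=y_all
)

# Report how many samples landed in each set, and as what fraction of the total
n_total = len(X_all)
train_fraction = len(X_tr) / n_total
test_fraction = len(X_te) / n_total
print(f"train: {len(X_tr)} samples ({train_fraction:.0%})")
print(f"test:  {len(X_te)} samples ({test_fraction:.0%})")
```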

### Create the model

In [None]:
# Create the untrained model
# Choose whatever kernel you want


# Train the model


# get the predictions


<details><summary>Click to cheat</summary>

```python
# Create the untrained model
# Choose whatever kernel you want
model = svm.SVC(kernel='rbf', gamma=0.001, C=1.0)

# Train the model
model.fit(X_train, y_train)

# get the predictions
y_pred = model.predict(X_test)
```
</details>

### Test the model

Let's see a few examples of our predictions.

In [None]:
_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, prediction in zip(axes, X_test, y_pred):
    ax.set_axis_off()
    image = image.reshape(8, 8)
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
    ax.set_title(f"Prediction: {prediction}")

Let's also view our confusion matrix for good measure.

In [None]:
disp = metrics.ConfusionMatrixDisplay.from_predictions(y_test, y_pred)
disp.figure_.suptitle("Confusion Matrix")

plt.show()

Finally, we'll look at our measures of performance.

In [None]:
from sklearn.metrics import classification_report

target_names = [str(name) for name in digits.target_names]

print(classification_report(y_test, y_pred, target_names=target_names))
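To see where the numbers in the report come from, the per-digit precision, recall and F1 score can be derived directly from the confusion matrix. The following is a self-contained sketch under a few assumptions: it refits the same RBF model, and it uses `test_size=0.3` and `random_state=0` so that the run is repeatable.

```python
import numpy as np
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

X_all, y_all = datasets.load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.3, random_state=0
)

clf = svm.SVC(kernel="rbf", gamma=0.001, C=1.0).fit(X_tr, y_tr)
y_hat = clf.predict(X_te)

# Rows are true labels, columns are predicted labels
cm = metrics.confusion_matrix(y_te, y_hat)
true_positives = np.diag(cm)

# Precision: of everything predicted as digit d, how much really was d?
precision = true_positives / cm.sum(axis=0)
# Recall: of all true examples of digit d, how many did we find?
recall = true_positives / cm.sum(axis=1)
# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

for digit, (p, r, f) in enumerate(zip(precision, recall, f1)):
    print(f"digit {digit}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

These values should match the `precision`, `recall` and `f1-score` columns that `classification_report` prints for the same predictions.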