# Supervised Learning: More Classification

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.datasets import make_moons
from sklearn.inspection import DecisionBoundaryDisplay

In this lecture, we’ll learn how to compare **different SVM classifiers** and how kernel choice affects model performance.

You’ll walk through several versions of Support Vector Machines (SVMs) and see how they classify a fun, slightly tricky dataset. By the end, you’ll know how to set up SVMs, interpret their confusion matrices, and visually compare their decision boundaries.

**Learning goals:**
- Understand what kernels do and when to use them.
- Practice fitting LinearSVC and SVC models.
- Interpret confusion matrices and classification reports.
- Compare decision boundaries across kernel types.

## Our Example Dataset: Interlocking Moons

We'll use a simple two-class dataset shaped like interlocking moons. It’s great for visualizing how different classifiers draw boundaries.

The two classes cannot be cleanly separated by a straight line, which makes this dataset a good test of how kernels handle non-linearity.

In [None]:
X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

plt.figure(figsize=(6, 4))
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap='coolwarm', edgecolor='k')
plt.title('Training Data: Two Interlocking Moons')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()
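Before fitting anything, it can help to confirm the split sizes and class balance, since the choice of metric later depends on whether the classes are balanced. The following is a minimal sketch that regenerates the data with the same settings as the cell above and counts the labels.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Same generation and split settings as the cell above.
X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Count how many points fall in each class, overall and per split.
print("Shapes:", X_train.shape, X_test.shape)
print("Class counts (all):  ", np.bincount(y))
print("Class counts (train):", np.bincount(y_train))
print("Class counts (test): ", np.bincount(y_test))
```

The split is not stratified, so the train and test counts per class may differ slightly from an exact 50/50 balance.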

## What Is an SVM?
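An **SVM (Support Vector Machine)** is a classifier that looks for the boundary (or **hyperplane**) that separates two classes with the widest possible **margin**: the distance between the boundary and the nearest training points, which are called **support vectors**.

When the data isn’t perfectly separable, a soft-margin SVM lets some points fall inside the margin or on the wrong side of the boundary. The regularization parameter `C` sets the trade-off: a small `C` tolerates more violations in exchange for a wider margin, while a large `C` penalizes violations more heavily.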

Different **kernels** allow the SVM to fit different shapes of boundaries. Let’s explore them!
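To make the margin idea concrete, here is a small sketch that fits a linear SVM at two values of the regularization parameter `C` (the same parameter the models later in this lecture set to `1.0`) and reports the margin width and the number of support vectors. For a linear SVM with weight vector `w`, the margin width is `2 / ||w||`. The two `C` values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=42)

margins = {}
for C in (0.01, 100.0):  # illustrative values: weak vs. strong penalty on violations
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]                      # normal vector of the separating line
    margins[C] = 2.0 / np.linalg.norm(w)  # width of the margin band
    n_sv = int(clf.n_support_.sum())      # support vectors across both classes
    print(f"C={C:>6}: margin width = {margins[C]:.3f}, support vectors = {n_sv}")
```

A smaller `C` should give a wider margin, with more points lying on or inside it.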

### Question 1: LinearSVC

Let’s start with the simplest version — **LinearSVC**. This model tries to draw a straight line that separates the two classes.

**Steps:**
1. Fit a `LinearSVC` to the training data.
2. Predict on the test set.
3. Evaluate performance using a confusion matrix and classification report.

Think about whether a straight line will capture the moon shapes effectively.

In [None]:
linear_model = LinearSVC(random_state=42, max_iter=5000)
linear_model.fit(X_train, y_train)
y_pred_linear = linear_model.predict(X_test)

print("Confusion Matrix (LinearSVC):\n", confusion_matrix(y_test, y_pred_linear))
print("\nClassification Report (LinearSVC):\n", classification_report(y_test, y_pred_linear))

#### Understanding the Metrics
- **Precision** tells us how many of the predicted positives were actually correct.
- **Recall** tells us how many of the actual positives were correctly identified.

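Which metric matters more depends on the cost of each kind of error. When positives are rare and missing one is costly (such as screening for a rare disease), recall usually takes priority. This dataset has balanced classes and no asymmetric costs, so precision and recall are equally informative here.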
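The two definitions can be checked by hand from the four cells of a binary confusion matrix. The sketch below uses a small set of hypothetical labels (made up for illustration, not results from the moons data) and compares the manual calculation with scikit-learn's `precision_score` and `recall_score`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels, for illustration only.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_hat = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 0])

# For binary labels, ravel() returns the cells in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found

print(f"precision = {precision:.3f} (sklearn: {precision_score(y_true, y_hat):.3f})")
print(f"recall    = {recall:.3f} (sklearn: {recall_score(y_true, y_hat):.3f})")
```

Rows of the confusion matrix are the true classes and columns are the predicted classes, which is why the false positives sit in the top-right cell.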

Think: what does your precision/recall tell you about the model’s ability to correctly label each moon?

### Question 2: SVC with Linear Kernel

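Now, let’s try **`SVC(kernel='linear')`**. This model also draws a straight line, but it is fit by a different solver (libsvm rather than liblinear) and minimizes the hinge loss, whereas `LinearSVC` by default minimizes the squared hinge loss and also regularizes the intercept. Both expose the same regularization parameter `C`, so expect similar but not identical boundaries.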

We’ll check if there’s any performance difference.

In [None]:
svc_linear = SVC(kernel='linear', C=1.0, random_state=42)
svc_linear.fit(X_train, y_train)
y_pred_svc_linear = svc_linear.predict(X_test)

print("Confusion Matrix (SVC - Linear Kernel):\n", confusion_matrix(y_test, y_pred_svc_linear))
print("\nClassification Report (SVC - Linear Kernel):\n", classification_report(y_test, y_pred_svc_linear))
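One way to see how close the two linear models are is to compare their fitted weights and check how often their predictions agree. This is a minimal, self-contained sketch that refits both models with the settings used above; the `agreement` variable is introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

lin = LinearSVC(random_state=42, max_iter=5000).fit(X_train, y_train)
svc = SVC(kernel="linear", C=1.0).fit(X_train, y_train)

# Both models describe the boundary as w . x + b = 0.
print("LinearSVC   loss:", lin.loss, "| w:", lin.coef_[0], "| b:", lin.intercept_[0])
print("SVC(linear) w:", svc.coef_[0], "| b:", svc.intercept_[0])

# Fraction of test points on which the two models predict the same class.
agreement = np.mean(lin.predict(X_test) == svc.predict(X_test))
print(f"Prediction agreement on the test set: {agreement:.2%}")
```

Small differences in `w` and `b` are expected, because the two estimators optimize slightly different objectives.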

### Question 3: SVC with RBF Kernel

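The **Radial Basis Function (RBF)** kernel handles curved patterns by implicitly mapping the data into a higher-dimensional space, where a linear boundary corresponds to a curved one in the original features. The `gamma` parameter controls how far the influence of a single training point reaches: larger values give tighter, more wiggly boundaries. This makes RBF a good fit for cases where the relationship between classes isn’t linear, like our interlocking moons.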
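The "mapping" never has to be computed explicitly: the model only needs the kernel value between pairs of points, `exp(-gamma * ||x - x'||^2)`. As a sketch of what that means in practice, the code below rebuilds an RBF model's decision scores by hand from its support vectors, dual coefficients, and intercept, then compares them with `decision_function`. The variable names `K` and `manual_scores` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
gamma = 0.7
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

# Kernel value between every support vector and every sample:
# K[i, j] = exp(-gamma * ||sv_i - x_j||^2)
K = rbf_kernel(clf.support_vectors_, X, gamma=gamma)

# Decision score = weighted sum of kernel values plus the intercept.
manual_scores = (clf.dual_coef_ @ K + clf.intercept_).ravel()

print("Matches decision_function:",
      np.allclose(manual_scores, clf.decision_function(X), atol=1e-6))
```

Each prediction is therefore a vote by the support vectors, weighted by how close the new point is to each of them.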

Let’s train an `SVC(kernel='rbf')` and see how it performs.

In [None]:
svc_rbf = SVC(kernel='rbf', gamma=0.7, C=1.0, random_state=42)
svc_rbf.fit(X_train, y_train)
y_pred_rbf = svc_rbf.predict(X_test)

print("Confusion Matrix (SVC - RBF Kernel):\n", confusion_matrix(y_test, y_pred_rbf))
print("\nClassification Report (SVC - RBF Kernel):\n", classification_report(y_test, y_pred_rbf))

### Question 4: SVC with Polynomial Kernel

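The **polynomial kernel** lets the model draw curved boundaries built from products of the input features up to a chosen `degree`. It’s useful when the classes depend on interactions between features. Note that scikit-learn’s default `coef0=0` keeps only the highest-degree terms, which can limit how well the curve fits.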

Let’s test `SVC(kernel='poly', degree=3)` and compare it to the others.

In [None]:
svc_poly = SVC(kernel='poly', degree=3, C=1.0, random_state=42)
svc_poly.fit(X_train, y_train)
y_pred_poly = svc_poly.predict(X_test)

print("Confusion Matrix (SVC - Polynomial Kernel):\n", confusion_matrix(y_test, y_pred_poly))
print("\nClassification Report (SVC - Polynomial Kernel):\n", classification_report(y_test, y_pred_poly))
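scikit-learn's polynomial kernel is `(gamma * <x, x'> + coef0) ** degree`, and the model above leaves `coef0` at its default of `0`. As a hedged sketch, the code below compares test accuracy for `coef0=0` and `coef0=1` (an arbitrary alternative chosen for illustration) to see how much the lower-order terms matter on this dataset.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

poly_scores = {}
for coef0 in (0.0, 1.0):  # 0.0 is the default; 1.0 adds lower-degree terms
    clf = SVC(kernel="poly", degree=3, coef0=coef0, C=1.0)
    clf.fit(X_train, y_train)
    poly_scores[coef0] = clf.score(X_test, y_test)  # mean accuracy on the test set
    print(f"coef0={coef0}: test accuracy = {poly_scores[coef0]:.3f}")
```

Results will depend on the data and the other hyperparameters, so treat any difference as a prompt for further tuning rather than a general rule.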

## Visualizing Decision Boundaries
Now that we’ve trained all four models, let’s plot their decision boundaries side by side. This helps us visually understand how each kernel “thinks” about the problem.

Notice how the linear models will likely form a straight separation, while RBF and polynomial create curves around the moon shapes.

In [None]:
models = {
    'LinearSVC': linear_model,
    'SVC (Linear)': svc_linear,
    'SVC (RBF)': svc_rbf,
    'SVC (Polynomial)': svc_poly
}

fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes = axes.ravel()

for ax, (name, model) in zip(axes, models.items()):
    DecisionBoundaryDisplay.from_estimator(
        model, X_train, response_method='predict', cmap='coolwarm', alpha=0.8, ax=ax
    )
    ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap='coolwarm', edgecolor='k')
    ax.set_title(name)
    ax.set_xlabel('Feature 1')
    ax.set_ylabel('Feature 2')

plt.suptitle('Comparing Decision Boundaries Across Kernels', fontsize=14)
plt.tight_layout()
plt.show()
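A numeric summary can complement the plots. The sketch below refits fresh copies of the four models (named `fresh_models` here so they don't overwrite the `models` dictionary above) and prints train and test accuracy for each. A large gap between the two would suggest overfitting.

```python
from sklearn.datasets import make_moons
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Same hyperparameters as the models trained earlier in the lecture.
fresh_models = {
    "LinearSVC": LinearSVC(random_state=42, max_iter=5000),
    "SVC (Linear)": SVC(kernel="linear", C=1.0),
    "SVC (RBF)": SVC(kernel="rbf", gamma=0.7, C=1.0),
    "SVC (Polynomial)": SVC(kernel="poly", degree=3, C=1.0),
}

results = {}
for name, model in fresh_models.items():
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    results[name] = (train_acc, test_acc)
    print(f"{name:<18} train = {train_acc:.3f}   test = {test_acc:.3f}")
```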

### Reflection Questions
Take a few minutes to discuss with your group:

1. Which kernel produced the most accurate model? Why?
2. How do the shapes of the decision boundaries differ across kernels?
3. Which kernel might you choose if your data looked more like a spiral, or like well-separated lines?
4. Do simpler models ever have advantages even if they perform slightly worse on accuracy?