# Supervised Learning: More Classification

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.datasets import make_classification, make_moons
from sklearn.inspection import DecisionBoundaryDisplay

np.random.seed(42)

## Overview
- Practice implementing classifier algorithms
- Work with two small, clean datasets
- Learn to assemble and evaluate a **LinearSVC** classifier
- Compare results and decision boundaries from multiple **SVM kernels**

### Roadmap
1. **Dataset 1 – LinearSVC Skeleton**: Basic classification workflow and metrics.
2. **Dataset 2 – Comparing Kernels**: Visualize how different SVM kernels (linear, RBF, polynomial) separate non-linear data.

## LinearSVC Problem (Dataset 1)
We’ll start with data that’s **mostly linearly separable**.  The goal is to practice the standard steps of training, predicting, and evaluating using `LinearSVC`.

### Why LinearSVC?
A linear SVM is fast and effective when the classes can be divided by a straight line (or, in higher dimensions, a hyperplane). It maximizes the margin between classes, and its soft-margin penalty `C` lets it tolerate a small amount of overlap.
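
To make the margin idea concrete, here is a minimal sketch. It assumes a hypothetical pair of 2-D blobs from `make_blobs` rather than Dataset 1, and the names `X_demo`, `y_demo`, and `margin_widths` are illustrative only. For a linear SVM, the margin width is 2 / ||w||, where w is the fitted `coef_`.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Hypothetical toy data: two well-separated 2-D blobs (not the notebook's Dataset 1)
X_demo, y_demo = make_blobs(
  n_samples=100,
  centers=[[-3, -3], [3, 3]],
  cluster_std=1.0,
  random_state=0
)

# Fit the same model with different C values and record the margin width 2 / ||w||
margin_widths = {}
for C in (0.01, 1.0, 100.0):
  clf = LinearSVC(C=C, max_iter=10000, random_state=0).fit(X_demo, y_demo)
  w = clf.coef_.ravel()
  margin_widths[C] = 2.0 / np.linalg.norm(w)
  print(f"C={C:>6}: margin width = {margin_widths[C]:.3f}")
```

On data like this, a smaller `C` should give a wider margin, because the model accepts more points inside the margin in exchange for a smaller weight vector.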

### Evaluation Metrics
- **Precision** = TP / (TP + FP) → Of the points predicted as positive, how many were correct?
- **Recall** = TP / (TP + FN) → Of all points that are actually positive, how many did we catch?
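
As a quick check of these two formulas, the sketch below computes them by hand from a confusion matrix and compares the results with scikit-learn's `precision_score` and `recall_score`. The label arrays `y_true_demo` and `y_pred_demo` are made-up values for illustration, not output from any model in this notebook.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Made-up labels for illustration: 4 actual positives, 6 actual negatives
y_true_demo = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred_demo = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

precision = tp / (tp + fp)  # 3 / (3 + 2)
recall = tp / (tp + fn)     # 3 / (3 + 1)

print("Precision by hand:", precision, "| sklearn:", precision_score(y_true_demo, y_pred_demo))
print("Recall by hand:   ", recall, "| sklearn:", recall_score(y_true_demo, y_pred_demo))
```

Here precision (0.6) is lower than recall (0.75) because the two false positives outnumber the single false negative.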

### Dataset 1 – Skeleton Code
Follow this outline to build and evaluate your LinearSVC model.

In [None]:
# Create a simple, mostly linearly separable dataset
X, y = make_classification(
  n_samples=300,
  n_features=2,
  n_informative=2,
  n_redundant=0,
  class_sep=1.5,
  random_state=42
)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Initialize model
model = LinearSVC(random_state=42, max_iter=5000)

# Fit the model
model.fit(X_train, y_train)

# Predict on test set
y_pred = model.predict(X_test)

# Evaluate
print("Confusion Matrix (LinearSVC):\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report (LinearSVC):\n", classification_report(y_test, y_pred))

# Visualize
plt.scatter(X_train[:,0], X_train[:,1], c=y_train, edgecolor='k')
plt.title('Dataset 1 – Linearly Separable Example')
plt.xlabel('Feature 1'); plt.ylabel('Feature 2')
plt.show()

## Transition → Non-Linear Data
Now that we’ve seen a linear example, let’s move to a dataset where a straight line doesn’t work so well. We’ll use the classic **interlocking moons** pattern to compare different SVM kernels.

## Comparing Kernels (Dataset 2)
A **kernel** function lets an SVM separate data that is not linearly separable. It implicitly maps the points into a higher-dimensional space where a linear boundary can work, without ever computing the new coordinates.

| Kernel | Shape of Boundary | When Useful |
|--|--|--|
| Linear | Straight | Data is mostly linearly separable |
| Polynomial | Curved, medium complexity | Patterns with smooth curves |
| RBF | Flexible, non-linear | Highly curved data like circles or moons |
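
The "higher-dimensional space" idea can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a degree-2 polynomial kernel with no constant term, and a hand-written feature map `phi` (a hypothetical helper, not a scikit-learn function). It shows that the kernel value equals an ordinary dot product after mapping two 2-D points into 3-D.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

def phi(x):
  """Explicit degree-2 feature map for a 2-D point (illustrative helper)."""
  x1, x2 = x
  return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Dot product computed in the mapped 3-D space
explicit = phi(x) @ phi(z)

# Same value computed in the original 2-D space: K(x, z) = (x . z)^2
implicit = (x @ z) ** 2

# scikit-learn's polynomial kernel with matching settings
sk_value = polynomial_kernel(
  x.reshape(1, -1), z.reshape(1, -1), degree=2, gamma=1.0, coef0=0.0
)[0, 0]

print(explicit, implicit, sk_value)
```

All three values agree, which is why an SVM can work with the kernel alone and never build the mapped features.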

In [None]:
# Generate non-linear dataset
X2, y2 = make_moons(n_samples=300, noise=0.25, random_state=42)
X2_train, X2_test, y2_train, y2_test = train_test_split(X2, y2, test_size=0.3, random_state=42)

plt.scatter(X2_train[:,0], X2_train[:,1], c=y2_train, cmap='coolwarm', edgecolor='k')
plt.title('Dataset 2 – Interlocking Moons')
plt.xlabel('Feature 1'); plt.ylabel('Feature 2')
plt.show()

### Question A – `LinearSVC`
Train a LinearSVC on this curved data. What do you predict will happen when a straight line tries to separate two moon-shaped classes?

In [None]:
linear_moons = LinearSVC(random_state=42, max_iter=5000)
linear_moons.fit(X2_train, y2_train)
y2_pred_linear = linear_moons.predict(X2_test)

print('Confusion Matrix (LinearSVC):\n', confusion_matrix(y2_test, y2_pred_linear))
print('\nClassification Report (LinearSVC):\n', classification_report(y2_test, y2_pred_linear))

### Question B – `SVC(kernel='linear')`
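Fit a linear-kernel SVC and compare it to `LinearSVC`. Check whether the results are similar. Small differences are expected: by default `LinearSVC` uses the liblinear solver with squared hinge loss and also regularizes the intercept, while `SVC` uses libsvm with the standard hinge loss.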
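
One way to inspect the two linear models directly is to compare their learned weights. This is a rough sketch that refits both estimators on a fresh copy of the moons data; the names `X_cmp`, `y_cmp`, `lin_a`, `lin_b`, and `agreement` are assumptions made for this example.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC, LinearSVC

# Fresh copy of the moons data so this sketch runs on its own
X_cmp, y_cmp = make_moons(n_samples=300, noise=0.25, random_state=42)

lin_a = LinearSVC(C=1.0, max_iter=5000, random_state=42).fit(X_cmp, y_cmp)
lin_b = SVC(kernel='linear', C=1.0).fit(X_cmp, y_cmp)

# Both models expose a weight vector and an intercept for the linear boundary
print("LinearSVC   coef:", lin_a.coef_, "intercept:", lin_a.intercept_)
print("SVC(linear) coef:", lin_b.coef_, "intercept:", lin_b.intercept_)

# Fraction of points on which the two models make the same prediction
agreement = (lin_a.predict(X_cmp) == lin_b.predict(X_cmp)).mean()
print("Prediction agreement:", agreement)
```

The weights are usually close but not identical, which is consistent with the two estimators optimizing slightly different objectives.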

In [None]:
svc_lin = SVC(kernel='linear', C=1.0, random_state=42)
svc_lin.fit(X2_train, y2_train)
y2_pred_lin = svc_lin.predict(X2_test)

print('Confusion Matrix (SVC Linear):\n', confusion_matrix(y2_test, y2_pred_lin))
print('\nClassification Report (SVC Linear):\n', classification_report(y2_test, y2_pred_lin))

### Question C – `SVC(kernel='rbf')`
Try an RBF kernel that can form curved boundaries. Does this improve accuracy for the moon-shaped pattern?
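
How curved the RBF boundary becomes depends heavily on `gamma`. The sketch below is a small experiment, not a tuning recipe: it regenerates the moons data under its own names (`X_m`, `X_tr`, and so on) and uses three arbitrarily chosen `gamma` values to compare train and test accuracy.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same generator settings as Dataset 2, under separate names so nothing is overwritten
X_m, y_m = make_moons(n_samples=300, noise=0.25, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_m, y_m, test_size=0.3, random_state=42)

# gamma values picked only for illustration: very smooth, moderate, very wiggly
gamma_scores = {}
for gamma in (0.01, 0.7, 100.0):
  clf = SVC(kernel='rbf', gamma=gamma, C=1.0).fit(X_tr, y_tr)
  gamma_scores[gamma] = (clf.score(X_tr, y_tr), clf.score(X_te, y_te))
  print(f"gamma={gamma:>6}: train acc = {gamma_scores[gamma][0]:.3f}, "
        f"test acc = {gamma_scores[gamma][1]:.3f}")
```

A very small `gamma` tends to behave almost like a linear model, while a very large one can fit the training points closely and generalize worse. Comparing the train and test columns is one way to spot that gap.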

In [None]:
svc_rbf = SVC(kernel='rbf', gamma=0.7, C=1.0, random_state=42)
svc_rbf.fit(X2_train, y2_train)
y2_pred_rbf = svc_rbf.predict(X2_test)

print('Confusion Matrix (SVC RBF):\n', confusion_matrix(y2_test, y2_pred_rbf))
print('\nClassification Report (SVC RBF):\n', classification_report(y2_test, y2_pred_rbf))

### Question D – `SVC(kernel='poly', degree=3)`
Test a polynomial kernel. Does it capture the curvature well without overfitting?

In [None]:
svc_poly = SVC(kernel='poly', degree=3, C=1.0, random_state=42)
svc_poly.fit(X2_train, y2_train)
y2_pred_poly = svc_poly.predict(X2_test)

print('Confusion Matrix (SVC Polynomial):\n', confusion_matrix(y2_test, y2_pred_poly))
print('\nClassification Report (SVC Polynomial):\n', classification_report(y2_test, y2_pred_poly))

## Visualize All Four Boundaries
We’ll use `DecisionBoundaryDisplay` to compare decision regions for each kernel side by side.

In [None]:
# Collect the four fitted models under display names
models = {
  'LinearSVC': linear_moons,
  'SVC (Linear)': svc_lin,
  'SVC (RBF)': svc_rbf,
  'SVC (Polynomial)': svc_poly
}

# 2x2 grid of axes, flattened so we can loop over them
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes = axes.ravel()

# Draw each model's decision regions with the training points on top
for ax, (name, mdl) in zip(axes, models.items()):
  DecisionBoundaryDisplay.from_estimator(
    mdl, X2_train, response_method='predict', cmap='coolwarm', alpha=0.8, ax=ax
  )
  ax.scatter(X2_train[:,0], X2_train[:,1], c=y2_train, cmap='coolwarm', edgecolor='k')
  ax.set_title(name)
  ax.set_xlabel('Feature 1'); ax.set_ylabel('Feature 2')

plt.suptitle('Decision Boundaries Across SVM Kernels', fontsize=14)
plt.tight_layout()
plt.show()

## Reflection Questions
1. Which kernel achieved the highest accuracy and why?
2. How do the decision boundaries visually differ across kernels?
3. Did you notice a trade-off between precision and recall for any model?
4. If you had a spiral-shaped dataset, which kernel would you try first and why?

Discuss briefly before we review as a class.