# Lesson 08 - Support Vector Machines and Hinge Loss


## Objectives
- Implement a linear SVM with hinge loss.
- Visualize margins and support vectors.
- Compare to logistic regression boundaries.


## From the notes

**Soft-margin SVM**
- Objective: minimize $\frac{1}{2}\|w\|^2 + C \sum_i \max(0, 1 - y^{(i)}(w^T x^{(i)} + b))$.
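
The objective above can be evaluated directly, which is handy for checking that training actually decreases it. The following is a minimal sketch; the helper name `svm_objective` and the three demo points are made up for illustration and are not part of the lesson's dataset.

```python
import numpy as np

def svm_objective(w, b, X, y, C=1.0):
    """Soft-margin objective: 0.5 * ||w||^2 + C * sum of hinge losses."""
    margins = y * (X @ w + b)               # functional margins y_i (w^T x_i + b)
    hinge = np.maximum(0.0, 1.0 - margins)  # zero whenever the margin is >= 1
    return 0.5 * np.dot(w, w) + C * hinge.sum()

# Tiny hand-checkable example (hypothetical points).
X_demo = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]])
y_demo = np.array([1.0, 1.0, -1.0])
w_demo = np.array([1.0, 0.0])
b_demo = 0.0

# Margins are 2.0, 0.5, 1.0, so only the second point pays a hinge penalty of 0.5.
obj_demo = svm_objective(w_demo, b_demo, X_demo, y_demo, C=1.0)
print(obj_demo)  # 0.5 * 1 + 1.0 * 0.5 = 1.0
```

With the lesson's parameterization, `w` would correspond to `theta[1:]` and `b` to `theta[0]`.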

_TODO: Validate the SVM objective with CS229 main notes PDF._


## Intuition
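A soft-margin SVM trades off a wide margin against hinge-loss penalties for points that are misclassified or fall inside the margin. Only the support vectors, meaning the points on or inside the margin (functional margin at most 1), influence the solution.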
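
One way to make this concrete is to flag the points whose functional margin is at most 1, since those are the only ones that can act as support vectors in the soft-margin setting. The sketch below assumes a hand-picked `w` and `b` and made-up points; the helper `support_vector_mask` and the tolerance `tol` are illustrative choices, not something defined in the notes.

```python
import numpy as np

def support_vector_mask(w, b, X, y, tol=1e-8):
    """Flag points on or inside the margin: y * (w^T x + b) <= 1 (up to tol)."""
    margins = y * (X @ w + b)
    return margins <= 1.0 + tol

# Hypothetical 1D-like data embedded in 2D, with a hand-set separator.
X_demo = np.array([[3.0, 0.0], [1.0, 0.0], [0.2, 0.0], [-1.0, 0.0], [-4.0, 0.0]])
y_demo = np.array([1.0, 1.0, 1.0, -1.0, -1.0])
w_demo, b_demo = np.array([1.0, 0.0]), 0.0

# Margins are 3.0, 1.0, 0.2, 1.0, 4.0: the middle three points are flagged.
sv_mask = support_vector_mask(w_demo, b_demo, X_demo, y_demo)
print(sv_mask)
```

For a model trained by an iterative method, points rarely land at margin exactly 1, so a looser tolerance may be needed; the right value is a judgment call.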


## Data
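We use a 2D dataset of two well-separated Gaussian clusters to show the margin geometry. The points are sampled at random, so the classes are nearly, but not guaranteed to be exactly, linearly separable.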


In [None]:
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)

X_pos = np.random.multivariate_normal([2, 2], np.eye(2), 60)
X_neg = np.random.multivariate_normal([-2, -1], np.eye(2), 60)
X = np.vstack([X_pos, X_neg])
y = np.hstack([np.ones(len(X_pos)), -np.ones(len(X_neg))])
Xb = np.c_[np.ones(len(X)), X]

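def svm_train(X, y, C=1.0, lr=0.01, iters=2000):
    """Batch subgradient descent on the soft-margin objective.

    X must include a leading column of ones, so theta[0] is the bias b.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        margins = y * (X @ theta)
        violators = margins < 1  # margin violations, not only misclassified points
        grad = theta.copy()      # gradient of 0.5 * ||w||^2
        grad[0] = 0  # no regularization on bias
        grad -= C * (X[violators].T @ y[violators])  # hinge subgradient
        theta -= lr * grad / len(y)  # dividing by n only rescales the step size
    return theta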

theta = svm_train(Xb, y)


## Experiments


In [None]:
preds = np.sign(Xb @ theta)
(preds == y).mean()
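
For the comparison with logistic regression named in the objectives, a minimal sketch follows. It regenerates the same data so that it runs on its own, and fits an unregularized logistic model by plain gradient descent. The function `logreg_train` and its `lr` and `iters` values are assumptions chosen for illustration, not settings taken from the notes.

```python
import numpy as np

# Recreate the lesson's dataset so the block runs on its own.
np.random.seed(42)
X_pos = np.random.multivariate_normal([2, 2], np.eye(2), 60)
X_neg = np.random.multivariate_normal([-2, -1], np.eye(2), 60)
X = np.vstack([X_pos, X_neg])
y = np.hstack([np.ones(len(X_pos)), -np.ones(len(X_neg))])
Xb = np.c_[np.ones(len(X)), X]

def logreg_train(X, y, lr=0.1, iters=2000):
    """Gradient descent on the mean logistic loss log(1 + exp(-y * theta^T x))."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        margins = y * (X @ theta)
        # sigmoid(-margin), computed via logaddexp to avoid overflow
        weights = np.exp(-np.logaddexp(0.0, margins))
        grad = -(X.T @ (y * weights)) / len(y)
        theta -= lr * grad
    return theta

theta_lr = logreg_train(Xb, y)
acc_lr = (np.sign(Xb @ theta_lr) == y).mean()
# Unit normal of the boundary, so it can be compared across models.
direction_lr = theta_lr[1:] / np.linalg.norm(theta_lr[1:])
print("logistic accuracy:", acc_lr)
print("unit normal of logistic boundary:", direction_lr)
```

The unit normal can be compared with `theta[1:] / np.linalg.norm(theta[1:])` from the SVM cell. Unlike the hinge loss, the logistic loss gives every point a nonzero weight, so points far from the boundary still pull on the fit.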


## Visualizations


In [None]:
plt.figure(figsize=(6,4))
plt.scatter(X_pos[:,0], X_pos[:,1], label="+1")
plt.scatter(X_neg[:,0], X_neg[:,1], label="-1")
x1 = np.linspace(-4, 4, 100)
x2 = -(theta[0] + theta[1]*x1) / theta[2]
plt.plot(x1, x2, color="black", label="boundary")
plt.title("Linear SVM boundary")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()
plt.show()

margins = y * (Xb @ theta)
plt.figure(figsize=(6,4))
plt.hist(margins, bins=20, alpha=0.7)
plt.title("Functional margins")
plt.xlabel("y * (w^T x + b)")
plt.ylabel("count")
plt.show()


## Takeaways
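- The hinge loss is zero for points with functional margin at least 1, so it penalizes only margin violations, including correctly classified points that sit inside the margin.
- The support vectors alone determine the decision boundary; at the optimum, removing any other point leaves the solution unchanged.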


## Explain it in an interview
- Describe the tradeoff controlled by C in a soft-margin SVM.
- Explain how hinge loss differs from logistic loss.
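
To ground the second point, the two losses can be evaluated side by side as functions of the margin $m = y(w^T x + b)$. This is a small illustrative sketch; the function names and the sample margins are chosen here for the example.

```python
import numpy as np

def hinge_loss(m):
    """max(0, 1 - m): exactly zero once the margin reaches 1."""
    return np.maximum(0.0, 1.0 - m)

def logistic_loss(m):
    """log(1 + exp(-m)): positive for every finite margin."""
    return np.logaddexp(0.0, -m)

margins_demo = np.array([-2.0, 0.0, 1.0, 3.0])
for m, h, l in zip(margins_demo, hinge_loss(margins_demo), logistic_loss(margins_demo)):
    print(f"margin={m:+.1f}  hinge={h:.3f}  logistic={l:.3f}")
```

The hinge loss drops to exactly zero at margin 1 and stays there, while the logistic loss decays smoothly but never reaches zero. That is why well-classified points stop affecting an SVM yet keep a small influence on logistic regression.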


## Exercises
- Implement SVM with stochastic subgradient descent (one sample per update) and compare to this full-batch version.
- Add polynomial features and observe boundary changes.
- Track how many points become support vectors as C changes.
