# Support Vector Machines (SVM)

### Intuition
- SVMs are classification algorithms that find the **best decision boundary** (hyperplane) by maximizing the **margin**, the distance between the boundary and the closest training points of each class.
- Effective for both **linear** and **non-linear** classification (via kernels).
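
As a minimal sketch of the margin idea, the snippet below fits a linear `SVC` and reads the hyperplane's normal vector `w` from `coef_`; the margin width is then \(2 / \|w\|\). The four data points are made up for illustration, and the very large `C` is an assumption used to approximate a hard margin on separable data.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for illustration):
# class -1 sits on the line x2 = 0, class +1 on the line x2 = 2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 2.0]])
y = np.array([-1, -1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]                        # normal vector of the hyperplane
b = clf.intercept_[0]                   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)  # distance between the two margin lines

print("w =", w, "b =", b)
print("margin width =", margin_width)
print("support vectors:\n", clf.support_vectors_)
```

For this layout the widest separating band is bounded by the lines x2 = 0 and x2 = 2, so the margin width should come out close to 2.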

---

### Key Concepts
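- **Maximum Margin Classifier**: Finds the hyperplane with the **largest margin**, measured to the closest training points (the **support vectors**) → better generalization.
- **Classification with Inseparable Classes**: Introduces **slack variables** that let some points fall inside the margin or on the wrong side of the boundary, at a cost, so the model tolerates overlap and outliers.
- **Error Functions & Perceptron**: The perceptron error penalizes only misclassified points; the SVM's **hinge loss** \(\max(0,\, 1 - y\,f(x))\) is convex and also penalizes correctly classified points that lie inside the margin, which pushes the boundary away from the data.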
- **C Parameter**: Weight on margin violations; sets the trade-off between margin size & classification error.  
  - Large **C** → narrow margin, fewer training errors → risk of overfitting.  
  - Small **C** → wider margin, more tolerance for violations → risk of underfitting.
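
The loss functions and the role of **C** can be sketched in a few lines of NumPy. The function names, the data, and the fixed hyperplane `w`, `b` below are hypothetical and chosen only for illustration; labels are assumed to be in {-1, +1}. The objective shown is the standard soft-margin form, \(\tfrac{1}{2}\|w\|^2 + C \sum_i \max(0,\, 1 - y_i f(x_i))\).

```python
import numpy as np

def hinge_loss(y, scores):
    """Per-sample hinge loss max(0, 1 - y * f(x)), labels in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * scores)

def perceptron_loss(y, scores):
    """Per-sample perceptron loss max(0, -y * f(x)): zero for any correct point."""
    return np.maximum(0.0, -y * scores)

def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * (sum of hinge losses)."""
    scores = X @ w + b
    return 0.5 * np.dot(w, w) + C * hinge_loss(y, scores).sum()

# Made-up points and a hypothetical fixed hyperplane, for illustration only.
X = np.array([[2.0, 0.0], [0.5, 0.0], [0.3, 0.0], [-1.0, 0.0]])
y = np.array([1, -1, 1, -1])
w, b = np.array([1.0, 0.0]), 0.0

scores = X @ w + b
print("hinge:     ", hinge_loss(y, scores))
print("perceptron:", perceptron_loss(y, scores))
print("objective, C=1: ", soft_margin_objective(w, b, X, y, C=1.0))
print("objective, C=10:", soft_margin_objective(w, b, X, y, C=10.0))
```

The third point is classified correctly but lies inside the margin, so it has zero perceptron loss and a positive hinge loss. Raising `C` multiplies the cost of every such violation, which is why a large `C` favors a narrower margin with fewer training errors.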

---

### Kernel Methods
- **Polynomial Kernel**  
  $$
  K(x, z) = (x^\top z + c)^d
  $$  
  → Implicitly maps data to a polynomial feature space (curved boundaries).  
  - Degree **d** and offset **c** = hyperparameters (`degree` and `coef0` in scikit-learn's `SVC`).

- **RBF (Gaussian) Kernel**  
  $$
  K(x, z) = \exp(-\gamma \|x - z\|^2)
  $$  
  → Measures similarity by distance, creates “mountains” around points.  
  - Large **γ (gamma)** → narrow, sharp peaks → risk of overfitting.  
  - Small **γ (gamma)** → wide, smooth peaks → risk of underfitting.  
  - Remember: \(\gamma = \frac{1}{2\sigma^2}\), where \(\sigma\) is the width (standard deviation) of the Gaussian.
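
Both kernel formulas can be checked numerically. The sketch below is illustrative only: the helper names (`poly_kernel`, `phi`, `rbf`, `gamma_from_sigma`) and the sample points are made up, and the explicit feature map `phi` covers just the special case of 2-D inputs with d = 2 and c = 1. The scikit-learn helpers `polynomial_kernel` and `rbf_kernel` are used as a cross-check; note that scikit-learn's polynomial kernel is \((\gamma\, x^\top z + \text{coef0})^d\), so `gamma=1` and `coef0=c` are needed to match the formula above.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel

def poly_kernel(x, z, c=1.0, d=2):
    """K(x, z) = (x . z + c)^d, computed directly from the inner product."""
    return (np.dot(x, z) + c) ** d

def phi(x):
    """Explicit feature map matching poly_kernel for 2-D inputs, d = 2, c = 1."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([x1 * x1, x2 * x2, s * x1 * x2, s * x1, s * x2, 1.0])

def rbf(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def gamma_from_sigma(sigma):
    """gamma = 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])
X_row, Z_row = x.reshape(1, -1), z.reshape(1, -1)  # sklearn expects 2-D arrays

# Polynomial kernel: the kernel value equals an inner product in feature space.
k_direct = poly_kernel(x, z, c=1.0, d=2)
k_mapped = np.dot(phi(x), phi(z))
k_poly_sk = polynomial_kernel(X_row, Z_row, degree=2, gamma=1.0, coef0=1.0)[0, 0]
print("poly:", k_direct, k_mapped, k_poly_sk)

# RBF kernel: similarity decays with distance, faster for larger gamma.
for gamma in (0.01, 0.1, 1.0):
    print("rbf, gamma =", gamma, "->", rbf(x, z, gamma))

k_sigma = rbf(x, z, gamma_from_sigma(2.0))          # sigma = 2 -> gamma = 0.125
k_rbf_sk = rbf_kernel(X_row, Z_row, gamma=0.125)[0, 0]
print("rbf with sigma = 2:", k_sigma, k_rbf_sk)
```

The polynomial check shows the point of the kernel trick: `k_direct` needs only a 2-D inner product, yet it matches the inner product of the 6-D mapped vectors. The RBF loop shows the same pair of points looking less similar as `gamma` grows, which is the "narrow peaks" behavior described above.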

---

### Scikit-learn Example
```python
from sklearn.svm import SVC

# Linear SVM
model_linear = SVC(kernel='linear', C=1)

# Polynomial SVM
model_poly = SVC(kernel='poly', degree=3, C=1)

# RBF SVM
model_rbf = SVC(kernel='rbf', gamma=0.1, C=1)

# X_train, y_train and X_test are assumed to come from a train/test split
model_linear.fit(X_train, y_train)
y_pred = model_linear.predict(X_test)

```

Worked example on a small dataset with two feature columns and one label column:

```python
# Import statements
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np

# Read the data
data = np.asarray(pd.read_csv('./data/data.csv', header=None))
X = data[:, 0:2]  # features: first two columns
y = data[:, 2]    # labels: third column

# Hold out 33% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Fit an RBF SVM with a large gamma and score it on the training set
model = SVC(kernel='rbf', gamma=27)
model.fit(X_train, y_train)
y_pred = model.predict(X_train)
acc = accuracy_score(y_train, y_pred)
acc
```

```
1.0
```

```python
# Score the same model on the held-out test set
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
acc
```

```
0.90625
```

Training accuracy is 1.0 while test accuracy is 0.90625, which is consistent with the large `gamma` fitting the training points very closely.