# Support Vector Machine [SVM] - Classification:

**Usage**: both classification & regression [supervised].

**Goal**: Finding the best decision boundary (hyperplane) separating the classes with maximum margin.

- **Hyperplane**: Decision boundary - separating classes with max margin. 
    - Equation: $w^T\cdot x+b = 0$

- **Support Vectors**: Data points closest to the hyperplane. [they alone determine the hyperplane & margin in SVM].

- **Margin**: Width of the band between the classes, i.e. twice the distance from the hyperplane to the nearest support vectors. SVM aims to maximize the margin to improve generalization.

- **Kernel**: Function to map the data to a higher-dimensional space enabling SVM to handle non-linearly separable data.

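- **Hard Margin**: max-margin hyperplane that perfectly separates the data. [no misclassifications]

- **Soft Margin**: allows some misclassifications when the data is not perfectly separable.

- **C**: regularization term balancing margin maximization & misclassification penalties. $\uparrow C \Rightarrow$ *stricter penalties* (narrower margin, fewer violations).

- **Hinge Loss**: loss function used by the soft-margin SVM. It is zero for points classified correctly and outside the margin, and grows linearly with the size of the margin violation otherwise.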
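To make the **Kernel** idea concrete, here is a minimal sketch with an explicit feature map. The toy data, the map $\phi(x) = (x, x^2)$, and the hand-picked `w` and `b` are all assumptions chosen for illustration. A real kernel computes inner products in the higher-dimensional space without ever building the mapped features.

```python
import numpy as np

# Toy 1-D data (hypothetical): outer points are +1, inner points are -1.
# No single threshold on x can separate these classes.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([1, -1, -1, 1])

def feature_map(x):
    """Map each scalar x to the 2-D point (x, x^2)."""
    return np.column_stack([x, x ** 2])

Z = feature_map(x)

# Hand-picked hyperplane in the mapped space: x^2 - 2.5 = 0.
w = np.array([0.0, 1.0])
b = -2.5

scores = Z @ w + b
separated = np.all(y * scores > 0)  # True when every point is on its own side
print(separated)
```

In the mapped space the classes sit on opposite sides of a straight line, which is exactly what a linear SVM needs.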


## SVM Functionality:

- Hyperplane:

    $w^T \cdot x + b = 0$
    - $w$: weight (normal to hyperplane)
    - $b$: bias (shifting the plane)
    - $x$: features (inputs)

- To predict / margin-region:
    
    $y^{(k)} =
    \begin{cases}
    +1 & \text{if } \mathbf{w}^T \mathbf{x} + b \geq 0 \\
    -1 & \text{if } \mathbf{w}^T \mathbf{x} + b < 0
    \end{cases}
    $
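The prediction rule above can be sketched in a few lines of NumPy. The function name `predict` and the values of `w`, `b`, and `X` are hypothetical, chosen only to show the sign test.

```python
import numpy as np

def predict(w, b, X):
    """Return +1 where w^T x + b >= 0 and -1 otherwise, for each row of X."""
    scores = X @ w + b
    return np.where(scores >= 0, 1, -1)

# Hypothetical hyperplane: x1 + x2 - 3 = 0
w = np.array([1.0, 1.0])
b = -3.0

X = np.array([[3.0, 2.0],   # score  2.0
              [0.5, 1.0],   # score -1.5
              [1.0, 2.0]])  # score  0.0 (on the hyperplane, assigned +1)

labels = predict(w, b, X)
print(labels)
```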

**Optimization**:

$||\text{w}|| = \sqrt{w_1^2 + w_2^2 + ... + w_n^2}$

> where, $||\text{w}||$ - Norm of Vector (Euclidean norm)

1. $\text{Margin} = \frac{2}{||\text{w}||}$

2. $\text{Hard-Margin} = \min_{w,b} \frac{1}{2} {||\text{w}||^2}$, 
    - subject to: $y_i(w^T \cdot x_i + b) \geq 1$
    - No misclassifications allowed

3. $\text{soft-margin} = \min_{w,b,\xi} \frac{1}{2}{||\text{w}||^2} + C\cdot \sum_{i=1}^{n}{\xi_i}$

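    - subject to: $y_i(w^T \cdot x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0$
    - allows some misclassifications.
    - $\xi_i$ is the slack variable: how far point $i$ falls short of the margin ($0 < \xi_i \leq 1$: inside the margin but correctly classified; $\xi_i > 1$: misclassified).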

4. Hinge Loss:

    $f(x_i) = w^T \cdot x_i + b$

    $L(y_i, f(x_i)) = \max(0, 1 - y_i\cdot f(x_i))$
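The quantities in the list above can be checked numerically. This is a minimal sketch on made-up data: `w`, `b`, `X`, and `y` are hypothetical values, not a fitted model. For a fixed $w$ and $b$, the smallest slack $\xi_i$ that satisfies the soft-margin constraint equals the hinge loss of point $i$.

```python
import numpy as np

# Hypothetical hyperplane and toy points
w = np.array([1.0, 1.0])
b = -3.0
X = np.array([[3.0, 2.0],
              [0.5, 1.0],
              [2.0, 1.5],
              [2.5, 1.5]])
y = np.array([1, -1, 1, -1])

# 1. Margin width: 2 / ||w||
margin_width = 2.0 / np.linalg.norm(w)

# 2. Hard-margin constraint: y_i (w^T x_i + b) >= 1
functional_margin = y * (X @ w + b)
hard_margin_ok = functional_margin >= 1

# 3./4. Hinge loss, which is also the smallest feasible slack xi_i
hinge = np.maximum(0.0, 1.0 - functional_margin)

print(margin_width)    # sqrt(2), about 1.414
print(hard_margin_ok)  # the last two points violate the constraint
print(hinge)           # 0 for the first two points, 0.5 and 2.0 for the others
```

The third point is on the correct side but inside the margin (loss 0.5), while the fourth is misclassified (loss 2.0).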


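Optimization Objective:

- $\min_{w,b} \; \underbrace{\frac{1}{2}||\text{w}||^2}_{\text{maximize margin}} + C \cdot \sum_{i=1}^{n} L(y_i, f(x_i))$

This is the soft-margin problem with each slack $\xi_i$ replaced by the hinge loss of point $i$.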
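One simple way to minimize this objective is subgradient descent. The sketch below is an illustration under stated assumptions, not how production SVM solvers work: the helper names `soft_margin_objective` and `fit_linear_svm`, the learning rate, the epoch count, and the toy data are all hypothetical.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum of hinge losses."""
    hinge = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * (w @ w) + C * hinge.sum()

def fit_linear_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Full-batch subgradient descent on the soft-margin objective."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Points with non-zero hinge loss: y_i (w^T x_i + b) < 1
        violating = y * (X @ w + b) < 1
        # Subgradient of the objective with respect to w and b
        grad_w = w - C * (X[violating].T @ y[violating])
        grad_b = -C * y[violating].sum()
        w = w - lr * grad_w
        b = b - lr * grad_b
    return w, b

# Toy data, symmetric about the origin, so the learned bias stays at 0
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [-1.0, -1.0], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1, 1, 1, -1, -1, -1])

before = soft_margin_objective(np.zeros(2), 0.0, X, y, C=1.0)
w, b = fit_linear_svm(X, y)
after = soft_margin_objective(w, b, X, y, C=1.0)
predictions = np.where(X @ w + b >= 0, 1, -1)

print(before, after)
print(w, b, predictions)
```

With a constant step size the weights hover near the optimum rather than settling on it exactly; a decaying learning rate is a common refinement.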

```python
import numpy as np, pandas as pd
import seaborn as sns, matplotlib.pyplot as plt
import plotly.graph_objects as go
import plotly.express as px
```