## Support Vector Machine (SVM) – Complete Notes

### 1. What is SVM?

Support Vector Machine (SVM) is a supervised learning algorithm used for:
- Classification
- Regression (SVR)

> SVM works by finding an optimal hyperplane that maximizes the margin between different classes.

### 2. Key Idea Behind SVM

- Find the best decision boundary
- Boundary should be as far as possible from nearest data points
- Nearest points are called Support Vectors

> Only support vectors influence the decision boundary.
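To make the idea concrete, here is a minimal sketch of a linear SVM trained on a toy dataset. It uses full-batch subgradient descent on the hinge loss, which is a simplification chosen for readability; production SVM libraries typically solve a quadratic program instead. The function names, the toy data, and the values of `lam`, `lr`, and `epochs` are all illustrative assumptions.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on lam/2 * ||w||^2 + mean hinge loss (illustrative only)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Class label from the sign of w . x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data with labels in {+1, -1}
X = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

Only points with `margin < 1` contribute to the data term of the update, which mirrors the statement that points far from the boundary do not shape it.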

### 3. What is a Hyperplane?
A hyperplane is a flat decision boundary with one dimension fewer than the space it divides:
- 1D → Point
- 2D → Line
- 3D → Plane
- nD → Hyperplane

`Equation: w · x + b = 0` (w = weight vector normal to the hyperplane, b = bias)
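As a small sketch of how the hyperplane equation is used for classification: the sign of `w · x + b` tells which side of the hyperplane a point lies on. The weights and points below are made-up values, not taken from any trained model.

```python
def decision_function(w, b, x):
    """Signed score w . x + b; it is zero exactly on the hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def classify(w, b, x):
    """Assign +1 on one side of the hyperplane and -1 on the other."""
    return 1 if decision_function(w, b, x) >= 0 else -1


# Hypothetical 2D hyperplane: the line x1 = x2
w, b = (1.0, -1.0), 0.0

on_plane = decision_function(w, b, (2.0, 2.0))  # point lies on the line
side_a = classify(w, b, (3.0, 1.0))             # x1 > x2
side_b = classify(w, b, (1.0, 3.0))             # x1 < x2
```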

### 4. Margin in SVM
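- Margin = Distance between the hyperplane and the closest data points
- With the scaling |w · x + b| = 1 at those closest points, this distance is 1/‖w‖ (total width 2/‖w‖)
- SVM aims to maximize the margin, i.e. minimize ‖w‖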
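A short sketch of the geometry, assuming the usual scaling in which the closest points satisfy |w · x + b| = 1: the distance from a point to the hyperplane is |w · x + b| / ‖w‖, and the full width between the two margin boundaries is 2 / ‖w‖. The numbers are hypothetical.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Geometric distance |w . x + b| / ||w|| from point x to the hyperplane."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return abs(sum(wj * xj for wj, xj in zip(w, x)) + b) / norm


def margin_width(w):
    """Width between the boundaries w . x + b = +1 and w . x + b = -1."""
    return 2.0 / math.sqrt(sum(wj * wj for wj in w))


w, b = (3.0, 4.0), -5.0                        # hypothetical weights, ||w|| = 5
d = distance_to_hyperplane(w, b, (2.0, 1.0))   # |6 + 4 - 5| / 5
width = margin_width(w)                        # 2 / 5
```

A smaller ‖w‖ gives a wider margin, which is why maximizing the margin is usually written as minimizing ‖w‖².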

**Types of Margin:**
- Hard Margin SVM
- Soft Margin SVM

### 5. Hard Margin vs Soft Margin
- **Misclassification**
    - Hard Margin: Not allowed
    - Soft Margin: Allowed
- **Outliers**
    - Hard Margin: Not handled
    - Soft Margin: Handled
- **Data Type**
    - Hard Margin: Perfectly separable data
    - Soft Margin: Real-world data
- **Use Case**
    - Hard Margin: Theoretical scenarios
    - Soft Margin: Practical applications
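The difference can be sketched with slack variables. Hard margin requires every point to satisfy y (w · x + b) ≥ 1, while soft margin lets a point violate this by a slack ξ = max(0, 1 − y (w · x + b)) and charges C · ξ for it. The candidate hyperplane and the data below are illustrative assumptions; the last point is a deliberate outlier sitting between two positive points, so no hyperplane can separate the classes perfectly.

```python
def slacks(w, b, X, y):
    """Slack xi_i = max(0, 1 - y_i (w . x_i + b)) for every point."""
    return [
        max(0.0, 1.0 - yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b))
        for xi, yi in zip(X, y)
    ]


def hard_margin_satisfied(w, b, X, y):
    """Hard margin allows no violations: every slack must be zero."""
    return all(s == 0.0 for s in slacks(w, b, X, y))


def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * (sum of slacks)."""
    return 0.5 * sum(wj * wj for wj in w) + C * sum(slacks(w, b, X, y))


X = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (2.5, 0.5)]
y = [1, 1, -1, -1]           # last point: outlier labeled -1 among positives
w, b = (1.0, 0.0), 0.0       # hypothetical candidate hyperplane x1 = 0

xi = slacks(w, b, X, y)
feasible = hard_margin_satisfied(w, b, X, y)
cost_low_c = soft_margin_objective(w, b, X, y, C=1.0)
cost_high_c = soft_margin_objective(w, b, X, y, C=10.0)
```

With a larger `C`, the same violation costs more, which pushes the optimizer toward fitting the outlier rather than keeping a wide margin.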

### 6. Support Vectors
- Data points closest to the hyperplane
- Determine position of the decision boundary
- Removing non-support-vector points and retraining leaves the decision boundary unchanged
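For a hard-margin solution, the support vectors are the points that lie exactly on the margin, i.e. where y (w · x + b) = 1. The sketch below picks them out for a toy dataset; `w` and `b` are assumed to be the max-margin solution for this particular data (the line x1 = 0) rather than the output of a solver. In the soft-margin case, points inside the margin also count as support vectors, which this sketch does not cover.

```python
def functional_margins(w, b, X, y):
    """y_i (w . x_i + b) for every training point."""
    return [
        yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        for xi, yi in zip(X, y)
    ]


def support_vector_indices(w, b, X, y, tol=1e-9):
    """Indices of points lying on the margin (functional margin equal to 1)."""
    margins = functional_margins(w, b, X, y)
    return [i for i, m in enumerate(margins) if abs(m - 1.0) <= tol]


X = [(1.0, 0.0), (3.0, 1.0), (-1.0, 0.0), (-2.0, 2.0)]
y = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0   # assumed max-margin hyperplane for this toy set

margins = functional_margins(w, b, X, y)
sv = support_vector_indices(w, b, X, y)
```

The two points with margin greater than 1 could be moved further away or dropped without changing which hyperplane is optimal.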

### 7. Kernel Trick (Very Important)
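- Used when data is not linearly separable in the original feature space
- Implicitly maps data into a higher-dimensional space
- Makes linear separation possible in that space
- The kernel computes dot products in the higher-dimensional space directly from the original inputs, so the mapping is never computed explicitly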
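A small check of the trick, using the degree-2 polynomial kernel (x · z)² on 2D inputs as an example: the kernel value equals the dot product of explicitly mapped 3D feature vectors, yet the kernel never builds those vectors. The feature map `phi` below is the standard one for this specific kernel; the input values are arbitrary.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit feature map for the kernel (x . z)^2 on 2D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """Same quantity, computed directly in the original 2D space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # dot product after mapping to 3D
implicit = poly2_kernel(x, z)    # (1*3 + 2*4)^2 = 121
```

For the RBF kernel the corresponding feature space is infinite-dimensional, so computing the kernel directly is the only practical option.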

### 8. Common Kernels
- **Linear Kernel**
    - Formula: (x · x′)
    - Use Case: Linearly separable data
- **Polynomial Kernel**
    - Formula: (x · x′ + c)ᵈ
    - Use Case: Curved decision boundaries
- **RBF (Gaussian) Kernel**
    - Formula: e^(−γ‖x − x′‖²)
    - Use Case: Common default for non-linear data when little is known about its structure
- **Sigmoid Kernel**
    - Formula: tanh(γ x · x′ + c)
    - Use Case: Neural-network–like behavior
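The four kernels can be sketched as plain functions. The default values for `c`, `d`, and `gamma` are arbitrary choices for illustration, and the sigmoid kernel is written in its general form with a scale `gamma` and offset `c`.

```python
import math


def dot(x, z):
    return sum(xi * zi for xi, zi in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, c=1.0, d=3):
    return (dot(x, z) + c) ** d


def rbf_kernel(x, z, gamma=0.5):
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, z, gamma=0.1, c=0.0):
    return math.tanh(gamma * dot(x, z) + c)


x, z = (1.0, 2.0), (2.0, 0.0)
k_lin = linear_kernel(x, z)                  # 1*2 + 2*0
k_poly = polynomial_kernel(x, z, c=1.0, d=2) # (2 + 1)^2
k_rbf_same = rbf_kernel(x, x)                # identical points
k_rbf = rbf_kernel(x, z)                     # squared distance 5, gamma 0.5
k_sig = sigmoid_kernel(x, z)                 # tanh(0.2)
```

Note that the RBF kernel of a point with itself is always 1 and decays toward 0 as the points move apart.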

### 9. Multiclass SVM
- SVM is inherently a binary classifier
- Multiclass problems are handled by combining binary SVMs:
  - One-vs-Rest (OvR)
  - One-vs-One (OvO)
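A rough sketch of how the two schemes combine binary classifiers: OvR trains one classifier per class and picks the class with the highest decision score, while OvO trains one classifier per pair of classes and picks the class with the most pairwise wins. The class names, scores, and pairwise winners below are made-up stand-ins for the outputs of trained binary SVMs.

```python
def n_classifiers(k):
    """Number of binary classifiers each scheme needs for k classes."""
    return {"ovr": k, "ovo": k * (k - 1) // 2}


def ovr_predict(scores):
    """scores: class -> decision value of that class's 'class vs rest' SVM."""
    return max(scores, key=scores.get)


def ovo_predict(pair_winners):
    """pair_winners: (class_a, class_b) -> class that won that pairwise SVM."""
    votes = {}
    for winner in pair_winners.values():
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)


counts = n_classifiers(4)

# Hypothetical outputs for a single sample with three classes
ovr_label = ovr_predict({"cat": -0.4, "dog": 1.3, "bird": 0.2})
ovo_label = ovo_predict({
    ("cat", "dog"): "dog",
    ("cat", "bird"): "cat",
    ("dog", "bird"): "dog",
})
```

OvO needs more classifiers as the number of classes grows, but each one is trained on only the two classes involved.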

### 10. Evaluation Metrics
Same metrics as for other classifiers such as logistic regression:
- Accuracy
- Precision
- Recall
- F1-Score
- ROC-AUC (for binary; computed from decision-function scores, since SVM does not output probabilities by default)
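These metrics can be computed by hand from labels and SVM decision scores, as in the sketch below. The scores are invented for illustration; predicted labels are taken from the sign of the score, and ROC-AUC is computed as the fraction of (positive, negative) pairs that the scores rank correctly.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores, positive=1):
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, -1, -1, -1, -1, 1]
scores = [2.1, 0.8, -0.3, -1.5, -0.9, 0.4, -2.0, 1.2]   # hypothetical w . x + b values
y_pred = [1 if s >= 0 else -1 for s in scores]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```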

### 11. Advantages of SVM
- Works well in high-dimensional spaces
- Effective with small datasets
- Resistant to overfitting when C and γ are tuned properly
- Strong theoretical foundation

### 12. Disadvantages of SVM
- Computationally expensive for large datasets (kernel SVM training time grows roughly quadratically or worse with the number of samples)
- Hard to tune hyperparameters
- Less interpretable
- Slow training with complex kernels

### 13. Important Hyperparameters
- **C**
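    - Meaning: Penalty for margin violations (inverse of regularization strength); small C → wider margin, more violations tolerated; large C → fewer violations, higher risk of overfitting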
- **kernel**
    - Meaning: Type of kernel used
- **gamma**
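    - Meaning: Kernel coefficient (RBF, polynomial, sigmoid); controls how far the influence of a single training point reaches (high γ → short reach, more complex boundary)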
- **degree**
    - Meaning: Degree of the polynomial kernel
- **epsilon**
    - Meaning: Width of the tolerance tube in SVR; errors smaller than ε are not penalized
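Two of these hyperparameters are easy to illustrate numerically. The sketch below shows how `gamma` changes the RBF similarity of two points at a fixed distance, and how `epsilon` zeroes out small regression errors in the SVR loss. All values are made up for illustration.

```python
import math


def rbf_similarity(sq_dist, gamma):
    """RBF kernel value for two points at the given squared distance."""
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """SVR loss per sample: errors within epsilon of the target cost nothing."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Same pair of points (squared distance 1) under increasing gamma
similarities = [rbf_similarity(1.0, g) for g in (0.1, 1.0, 10.0)]

# Hypothetical regression targets and predictions
y_true = [1.0, 2.0, 3.0]
y_pred = [1.05, 2.5, 2.0]
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
```

As `gamma` grows, the similarity drops off faster, so each training point influences a smaller neighborhood. In scikit-learn these hyperparameters correspond to the `C`, `kernel`, `gamma`, and `degree` arguments of `SVC`, and `epsilon` is an argument of `SVR`.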

### 14. When to Use SVM?
- High-dimensional data
- Small to medium dataset
- Non-linear boundaries
- Text & image classification