# **📘 Support Vector Machines (SVM)**

## 🔍 What is SVM?
Support Vector Machine (SVM) is a **supervised learning algorithm** used for **classification** and **regression** tasks. For classification, it finds the **optimal hyperplane**: the decision boundary that separates the classes with the largest possible margin.

---

## ⚙️ The SVM Algorithm
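- The goal of SVM is to **maximize the margin**, i.e. the distance between the separating hyperplane and the nearest data points of each class.
- The data points closest to the hyperplane are called **support vectors**. They alone determine the position and orientation of the hyperplane; moving any other point (without crossing the margin) leaves the boundary unchanged.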
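
A minimal sketch of these two ideas, assuming synthetic `make_blobs` data and a very large `C` to approximate a hard margin (both are illustrative choices, not part of the notebook cells further down). For a linear SVM with weight vector `w`, the margin width is `2 / ||w||`, and the support vectors sit on the margin boundaries where the decision function is close to ±1.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated synthetic clusters (illustrative data only)
X_demo, y_demo = make_blobs(
    n_samples=60, centers=[[-3, -3], [3, 3]], cluster_std=0.8, random_state=0
)

# A very large C approximates a hard-margin SVM on separable data
clf = SVC(kernel="linear", C=1e3)
clf.fit(X_demo, y_demo)

w = clf.coef_[0]                          # normal vector of the hyperplane
b = clf.intercept_[0]                     # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)    # distance between the two margin boundaries

print("Number of support vectors:", len(clf.support_vectors_))
print("Margin width:", margin_width)

# Support vectors lie on the margin, so |w.x + b| should be close to 1 for each
sv_decision = clf.decision_function(clf.support_vectors_)
print("Decision values at support vectors:", sv_decision)
```

Only a handful of the 60 points should end up as support vectors here; the rest could be removed without changing the fitted boundary.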

---

## 🧠 Kernel Trick
- Real-world data is often **not linearly separable**.
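- The **kernel trick** lets SVM behave as if the data had been **mapped into a higher-dimensional space**, where a **linear separator (hyperplane)** may exist, without ever computing that mapping explicitly: the algorithm only needs inner products between points, and a kernel function supplies them directly.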
- Popular kernels include:
  - **Linear Kernel**
  - **Polynomial Kernel**
  - **RBF (Radial Basis Function)** / Gaussian Kernel
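
The idea can be sketched in two parts. First, for a degree-2 polynomial kernel on 2-D inputs, the kernel value computed in the original space matches the inner product after an explicit 3-D feature map (`phi` and `poly2_kernel` are hypothetical helpers written for this illustration). Second, on concentric circles generated with `make_circles` (an assumed toy dataset), an RBF kernel should separate what a linear kernel cannot.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

def phi(v):
    """Explicit degree-2 feature map for a 2-D input (hypothetical helper)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly2_kernel(a, b):
    """Homogeneous degree-2 polynomial kernel: K(a, b) = (a . b)^2."""
    return np.dot(a, b) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

explicit = np.dot(phi(a), phi(b))   # inner product after mapping to 3-D
implicit = poly2_kernel(a, b)       # same quantity, computed in the original 2-D space
print("explicit:", explicit, "implicit:", implicit)

# Concentric circles: not linearly separable in the original 2-D space
X_c, y_c = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
linear_acc = SVC(kernel="linear").fit(X_c, y_c).score(X_c, y_c)
rbf_acc = SVC(kernel="rbf").fit(X_c, y_c).score(X_c, y_c)
print("linear kernel training accuracy:", linear_acc)
print("RBF kernel training accuracy:", rbf_acc)
```

The accuracies are measured on the training data purely to show separability, not as an estimate of generalization.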

---

## 🧩 Soft Margin SVM and Regularization
- Perfect separation isn't always possible or desirable (due to noise).
- **Soft Margin SVM** allows some misclassifications but still finds a good decision boundary.
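- The **C parameter** controls the trade-off between a wide margin and few training errors (regularization): a small `C` tolerates more margin violations in exchange for a wider margin, while a large `C` penalizes violations heavily and yields a narrower margin that can overfit noisy data.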
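
To make the effect of `C` concrete, the following sketch fits a linear SVM on two overlapping synthetic clusters for a few values of `C` (the data and the specific `C` values are assumptions chosen for illustration). As `C` grows, the margin should shrink and fewer points should fall inside it as support vectors.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters, so perfect separation is not possible
X_noisy, y_noisy = make_blobs(
    n_samples=200, centers=[[-1, -1], [1, 1]], cluster_std=1.0, random_state=0
)

c_values = [0.001, 0.1, 10]
n_support = []
margins = []

for c in c_values:
    clf = SVC(kernel="linear", C=c).fit(X_noisy, y_noisy)
    n_support.append(int(clf.n_support_.sum()))        # support vectors across both classes
    margins.append(2.0 / np.linalg.norm(clf.coef_))    # margin width 2 / ||w||
    print(f"C={c}: support vectors={n_support[-1]}, "
          f"margin width={margins[-1]:.3f}, "
          f"train accuracy={clf.score(X_noisy, y_noisy):.3f}")
```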

---

## 🧪 Evaluating SVM Models
- Use metrics such as:
  - ✅ Accuracy
  - 📉 Precision, Recall, F1-Score
  - 📊 Confusion Matrix
- **Cross-validation** helps ensure the model generalizes well to unseen data.
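
A short sketch of these metrics on the Iris dataset, assuming a `make_pipeline` of `StandardScaler` and `SVC` so that scaling is refit inside every cross-validation fold (the variable names and the stratified 70/30 split are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris_data = load_iris()
X_iris, y_iris = iris_data.data, iris_data.target
X_tr, X_te, y_tr, y_te = train_test_split(
    X_iris, y_iris, test_size=0.3, random_state=42, stratify=y_iris
)

# Scaling lives inside the pipeline, so each CV fold fits its own scaler
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))

# 5-fold cross-validated accuracy on the training split
cv_scores = cross_val_score(clf, X_tr, y_tr, cv=5)
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))

# Final check on the held-out test split
clf.fit(X_tr, y_tr)
y_hat = clf.predict(X_te)
cm = confusion_matrix(y_te, y_hat)
print(cm)   # rows are true classes, columns are predicted classes
print(classification_report(y_te, y_hat, target_names=iris_data.target_names))
```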

---

## 🧱 Practical Considerations and Challenges
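- 🌀 **Scaling**: SVMs are sensitive to feature scaling, so standardize features with `StandardScaler` or `MinMaxScaler`, fitting the scaler on the training data only.
- 🧠 **Kernel choice** is critical and depends on the data: a linear kernel is a reasonable first try, and RBF is a common default when the boundary is non-linear.
- 🧮 **Computational cost** grows quickly with dataset size (especially with non-linear kernels): training time for kernel SVMs scales at least quadratically with the number of samples.
- ⚖️ **Choosing the right hyperparameters** (`C`, and `gamma` for the RBF and polynomial kernels) is essential and is often done via **GridSearchCV**.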
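
One way to combine these points is to put scaling and the classifier in a single `Pipeline` and tune it with `GridSearchCV`. The sketch below is an assumption about how this might be set up, not the only option: the step names `scaler` and `svc` are arbitrary, and the grid is split per kernel so that `gamma` is only searched where it has an effect.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.3, random_state=42, stratify=y_all
)

# The scaler is refit on the training part of every CV fold
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Parameters are addressed as <step name>__<parameter name>
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.1, 1]},
]

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_tr, y_tr)

print("Best parameters:", search.best_params_)
print("Best CV accuracy:", search.best_score_)
test_acc = search.score(X_te, y_te)   # uses the refit best pipeline
print("Test accuracy:", test_acc)
```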

---

## 🌍 Real-world Applications of SVM
- 📧 **Spam Detection**
- 🧬 **Bioinformatics** (e.g., cancer detection from gene expression data)
- 📸 **Image Classification**
- 💳 **Fraud Detection**
- 🧠 **Face Detection**

---

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

In [None]:
# Dataset

iris = datasets.load_iris()
x, y = iris.data, iris.target

In [None]:
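# Split first, then scale

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=42)

# Fit the scaler on the training split only, so no test-set statistics leak into training
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

In [None]:
# SVM

model = svm.SVC(kernel='rbf', C=1.0, gamma='scale')
model.fit(x_train, y_train)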

In [None]:
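# Hyperparameter search with 5-fold cross-validation on the training split
# (gamma is ignored by the linear kernel, so those combinations are redundant fits)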
param_grid = {
    'C': [0.1, 1, 10],
    'gamma': ['scale', 0.1, 1, 10],
    'kernel': ['rbf', 'linear', 'poly']
}
grid = GridSearchCV(svm.SVC(), param_grid, refit=True, verbose=0, cv=5)
grid.fit(x_train, y_train)

print("\nBest Parameters from GridSearchCV:", grid.best_params_)

In [None]:
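# Re-evaluate the tuned model on the held-out test set

best_model = grid.best_estimator_
y_pred_best = best_model.predict(x_test)
print("Accuracy (Best Model):", accuracy_score(y_test, y_pred_best))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_best))
print(classification_report(y_test, y_pred_best, target_names=iris.target_names))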

In [None]:
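# Plot test-set predictions over the first two (standardized) features

plt.figure(figsize=(8, 6))
plt.scatter(x_test[:, 0], x_test[:, 1], c=y_pred_best, cmap='viridis', edgecolor='k', s=100)
plt.title("SVM Predictions on the Test Set")
plt.xlabel(f"{iris.feature_names[0]} (standardized)")
plt.ylabel(f"{iris.feature_names[1]} (standardized)")
plt.grid()
plt.show()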