## Decision Rule

![image.png](attachment:image.png)

Here $\vec{u}$ is the vector of the point we want to classify (written as $x$ in the formulas below).

### 🧠 Decision Rule in Support Vector Machine (SVM)

The **decision rule** is what determines how a new data point is classified once the SVM model is trained.

---

## 🔹 Linear SVM (Without Kernel)

The decision function is:

```math
f(x) = \vec{w}^T x + b
```

Where:

* $\vec{w}$ is the weight vector (normal to the hyperplane)
* $b$ is the bias (offset from origin)
* $x$ is the input vector

### ✅ Decision Rule:

```math
\text{Predicted Class} =
\begin{cases}
+1 & \text{if } f(x) \geq 0 \\
-1 & \text{if } f(x) < 0
\end{cases}
```
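The rule above can be sketched in a few lines of NumPy. The function name `linear_svm_predict` and the values of `w`, `b`, and `X` are made up for illustration; they do not come from a trained model.

```python
import numpy as np

def linear_svm_predict(X, w, b):
    """Apply the sign rule to f(x) = w^T x + b for each row of X."""
    scores = X @ w + b                      # f(x) for every point
    labels = np.where(scores >= 0, 1, -1)   # ties at f(x) = 0 go to +1
    return scores, labels

# Hypothetical parameters, chosen only for illustration
w = np.array([1.0, -2.0])
b = 0.5
X = np.array([[3.0, 1.0],    # f = 3 - 2 + 0.5 = 1.5
              [0.0, 1.0],    # f = 0 - 2 + 0.5 = -1.5
              [1.5, 1.0]])   # f = 1.5 - 2 + 0.5 = 0.0

scores, labels = linear_svm_predict(X, w, b)
print(scores.tolist())   # [1.5, -1.5, 0.0]
print(labels.tolist())   # [1, -1, 1]
```

The third point sits exactly on the boundary and is assigned to class +1, matching the $\geq$ in the rule.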

---

## 🔹 Geometric Interpretation

* $f(x) = 0$ defines the **decision boundary (hyperplane)**
* $f(x) > 0$ ⇒ point lies on the **positive** side
* $f(x) < 0$ ⇒ point lies on the **negative** side

---

## 🔹 With Kernels (Non-linear SVM)

The decision function becomes:

```math
f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b
```

Where:

* $\alpha_i$: Lagrange multipliers (nonzero only for support vectors)
* $x_i$: training points (only the support vectors contribute, because $\alpha_i = 0$ for all others)
* $y_i$: labels (+1 or -1)
* $K(x_i, x)$: kernel function (e.g., RBF, polynomial)

### ✅ Decision Rule (Kernelized):

```math
\text{Predicted Class} =
\begin{cases}
+1 & \text{if } f(x) \geq 0 \\
-1 & \text{if } f(x) < 0
\end{cases}
```
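As a rough sketch, the kernelized score can be computed by hand with an RBF kernel. The support vectors, multipliers, bias, and `gamma` below are hypothetical values picked for readability, not the output of a real training run.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def kernel_decision(x, support_vectors, alphas, labels, b, kernel=rbf_kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * kernel(sv, x)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b

# Hypothetical values, not the output of a real training run
support_vectors = np.array([[0.0, 0.0], [2.0, 0.0]])
alphas = np.array([1.0, 1.0])
labels = np.array([-1, 1])
b = 0.0

x_new = np.array([2.0, 0.0])   # sits on the positive support vector
f = kernel_decision(x_new, support_vectors, alphas, labels, b)
print(f, 1 if f >= 0 else -1)  # f = 1 - exp(-2), so the class is +1
```

A point halfway between the two support vectors, such as `[1.0, 0.0]`, scores exactly 0 here because both kernel terms cancel.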

---

## 🔸 Confidence of Prediction

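* The **absolute value of $f(x)$** is proportional to the **distance from the hyperplane**: for a linear SVM the Euclidean distance is $|f(x)| / \|\vec{w}\|$.
* Larger magnitude ⇒ more confident prediction, although $f(x)$ is a raw score, not a probability.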
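For a linear SVM, dividing the raw score by $\|\vec{w}\|$ turns it into a signed Euclidean distance. A minimal sketch, with `w`, `b`, and both test points invented for illustration:

```python
import numpy as np

# Hypothetical linear SVM parameters, for illustration only
w = np.array([3.0, 4.0])     # ||w|| = 5
b = -5.0

def signed_distance(x, w, b):
    """Signed Euclidean distance from x to the hyperplane w^T x + b = 0."""
    return (w @ x + b) / np.linalg.norm(w)

near = np.array([1.0, 1.0])  # f = 3 + 4 - 5 = 2    -> distance 0.4
far = np.array([5.0, 5.0])   # f = 15 + 20 - 5 = 30 -> distance 6.0

print(signed_distance(near, w, b), signed_distance(far, w, b))
```

Both points land on the positive side, but `far` is much deeper inside its class region than `near`.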

---

## 🔹 Example

Let’s say after training a linear SVM we get:

```math
\vec{w} = [2, -1],\quad b = -3
```

For a test point $x = [2, 4]$:

```math
f(x) = 2(2) + (-1)(4) - 3 = 4 - 4 - 3 = -3
```

So:

```math
\text{Predicted Class} = -1
```

---


### 🧠 What Are Support Vectors in SVM?

**Support Vectors** are the **data points that lie closest to the decision boundary (hyperplane)**.
They are *critical* for defining the margin — the SVM's entire model depends on them.

---

## 🔹 Intuition

In a binary classification problem:

* SVM looks for a **hyperplane** that separates the two classes with the **maximum margin**.
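* In the hard-margin case, the data points that **lie exactly on the margin boundaries** are the **support vectors**. With a soft margin, points inside the margin or on the wrong side of it are support vectors too.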

They are the **“bare minimum”** points needed to define the optimal separating hyperplane.

---

## 🔸 Why Are They Important?

* Only support vectors determine the **position and orientation** of the hyperplane.
* If you **remove a non-support vector**, the hyperplane **won’t change**.
* If you **remove a support vector**, the margin or the hyperplane **can shift significantly**.
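The second bullet can be checked empirically. The sketch below assumes scikit-learn is installed. The toy data, the helper `fit_hyperplane`, and the large `C` (used to approximate a hard margin) are all illustrative choices.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for this illustration)
X = np.array([[2.0, 2.0], [3.0, 3.0], [4.0, 1.0],     # class +1
              [0.0, 0.0], [-1.0, 0.0], [0.0, -1.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

def fit_hyperplane(X, y):
    """Fit a linear SVM and return (w, b, indices of support vectors)."""
    model = SVC(kernel="linear", C=1e3).fit(X, y)
    return model.coef_[0], model.intercept_[0], model.support_

w_full, b_full, sv_idx = fit_hyperplane(X, y)

# Drop one point that is NOT a support vector and refit
non_sv = [i for i in range(len(X)) if i not in sv_idx][0]
keep = np.arange(len(X)) != non_sv
w_less, b_less, _ = fit_hyperplane(X[keep], y[keep])

print("support vector indices:", sv_idx)
print("hyperplane unchanged:",
      np.allclose(w_full, w_less, atol=1e-2)
      and np.isclose(b_full, b_less, atol=1e-2))
```

The comparison uses a small tolerance because the solver is iterative, so the two fits agree only up to numerical precision.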

---

## 🔸 Mathematical Insight

In the dual form of SVM optimization, the decision function is:

```math
f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b
```

* Only those points with **$\alpha_i > 0$** are **support vectors**.
* All others have $\alpha_i = 0$ and **don’t influence** the classifier.
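One way to connect this dual form to the linear rule $f(x) = \vec{w}^T x + b$ is to plug in the linear kernel $K(x_i, x) = x_i^T x$. The sum then collapses into a single weight vector:

```math
\begin{aligned}
f(x) &= \sum_{i=1}^{n} \alpha_i y_i \, x_i^T x + b \\
     &= \Big( \sum_{i=1}^{n} \alpha_i y_i x_i \Big)^T x + b \\
     &= \vec{w}^T x + b,
\qquad \vec{w} = \sum_{i:\,\alpha_i > 0} \alpha_i y_i x_i
\end{aligned}
```

Points with $\alpha_i = 0$ drop out of the sum, so $\vec{w}$ is built from the support vectors alone.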

---

## 🔸 Visualization

```
Class +1:      ○       ○
                   ●              <- margin boundary
          ───────────────────     <- hyperplane
                   ●              <- margin boundary
Class -1:      ○       ○

● = support vectors (lie on a margin boundary)
○ = other training points
```

---

## 🔹 Key Properties

| Property                | Description                                              |
| ----------------------- | -------------------------------------------------------- |
| Distance to hyperplane  | Exactly **equal to the margin** (hard-margin case)       |
| Role in optimization    | Their constraints are **active** (hold with equality)    |
| Influence on hyperplane | **Directly define** its position                         |
| Count                   | Usually **a small fraction** of the dataset              |

---

## 🔸 Scikit-Learn Access

After training an SVM:

```python
model.support_vectors_
```

This returns the **coordinates** of all support vectors.

```python
model.n_support_
```

This gives the **number of support vectors** per class.
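A minimal end-to-end sketch of these attributes, assuming scikit-learn is installed. The toy data and the choice of `C` are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data, made up for this illustration
X = np.array([[2.0, 2.0], [3.0, 3.0], [4.0, 1.0],     # class +1
              [0.0, 0.0], [-1.0, 0.0], [0.0, -1.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

model = SVC(kernel="linear", C=1e3)
model.fit(X, y)

print(model.support_vectors_)  # coordinates, shape (n_SV, n_features)
print(model.support_)          # indices of those rows in X
print(model.n_support_)        # counts per class, ordered like model.classes_
print(model.classes_)          # class labels in sorted order
```

`model.support_` is handy when you need to map the support vectors back to rows of the original training set.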

---

## 🧠 Final Takeaway

A trained SVM is determined entirely by a **small subset of the training data**, the **support vectors**. Prediction therefore only evaluates the kernel against those points, which keeps the model compact even in high dimensions.



![image.png](attachment:image.png)

### Optimization Function
![image.png](attachment:image.png)

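This is the hard-margin SVM. It works only when the data are linearly separable. For data that are not linearly separable (overlapping or noisy classes), use a soft-margin SVM; for non-linear decision boundaries, combine it with a kernel.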
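For reference, the standard textbook forms of the two problems are sketched below. The notation ($\xi_i$ for slack variables, $C$ for the penalty weight) is an assumption and may differ from the image above.

```math
\text{Hard margin:}\quad
\min_{\vec{w},\, b} \ \tfrac{1}{2}\|\vec{w}\|^2
\quad \text{s.t.} \quad y_i(\vec{w}^T x_i + b) \geq 1, \quad i = 1, \dots, n
```

```math
\text{Soft margin:}\quad
\min_{\vec{w},\, b,\, \xi} \ \tfrac{1}{2}\|\vec{w}\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i(\vec{w}^T x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0
```

A larger $C$ penalizes margin violations more heavily, so the soft-margin solution approaches the hard-margin one when the data are separable.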