# 🔹 Cost Function in SVC (Support Vector Classifier)

SVC is built on the idea of finding a **decision boundary (hyperplane)** that separates classes with the **maximum margin** while allowing some misclassifications (soft margin).

The **cost function** (also called the **objective function**) balances two goals:

1. **Maximize the margin** → keep the separating hyperplane as far as possible from the nearest points.
2. **Minimize misclassification errors** → penalize points that lie inside the margin or are misclassified.

---

## 1. **Hard-Margin SVC (No Misclassifications)**

When data is **linearly separable** (perfect split possible), the objective is:

$$
\min_{w,b} \ \frac{1}{2} \|w\|^2
$$

subject to:

$$
y_i (w \cdot x_i + b) \geq 1 \quad \forall i
$$

* Here, the margin width is $2 / \|w\|$, so minimizing $\|w\|$ (equivalently $\frac{1}{2}\|w\|^2$) maximizes the margin.
* No slack variables → strict separation.
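
To make the link between $\|w\|$ and the margin concrete, here is a minimal sketch in plain Python. The toy data, the candidate $(w, b)$, and the helper names `margin_width` and `satisfies_hard_margin` are illustrative assumptions, not part of any library.

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(dot(w, w))

def satisfies_hard_margin(w, b, X, y):
    """True if every point meets the constraint y_i (w.x_i + b) >= 1."""
    return all(yi * (dot(w, xi) + b) >= 1 for xi, yi in zip(X, y))

# Toy, linearly separable 2-D data (illustrative values only)
X = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
y = [1, 1, -1, -1]

w, b = (0.5, 0.5), -1.0
print(satisfies_hard_margin(w, b, X, y))  # True: every constraint holds
print(margin_width(w))                    # about 2.83

# Doubling w and b keeps the same hyperplane and stays feasible,
# but the larger ||w|| halves the margin width.
print(margin_width((1.0, 1.0)))           # about 1.41
```

Both candidates describe the same hyperplane and satisfy every constraint, but the one with the smaller $\|w\|$ has the wider margin, which is why the objective minimizes $\frac{1}{2}\|w\|^2$.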

---

## 2. **Soft-Margin SVC (Realistic Case)**

When data is **not perfectly separable**, slack variables $\xi_i \geq 0$ are introduced to allow violations.

The cost function becomes:

$$
\min_{w,b,\xi} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^n \xi_i
$$

subject to:

$$
y_i (w \cdot x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0 \quad \forall i
$$

* First term → $\frac{1}{2} \|w\|^2$: keeps the margin wide (regularization).
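* Second term → $C \sum \xi_i$: penalizes margin violations (points inside the margin or misclassified).
* $C$ = **penalty (regularization) parameter**; the strength of regularization is inversely proportional to $C$:

  * Large $C$: violations are costly, so the model fits the training data more tightly (narrower margin, higher risk of overfitting).
  * Small $C$: violations are cheap, so the margin is wider and the model is more strongly regularized (risk of underfitting if $C$ is too small).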
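
The sketch below shows how the slack variables and $C$ enter the objective for a fixed, hand-picked $(w, b)$. It assumes that each $\xi_i$ takes the smallest value allowed by the constraints. The data and the helper names `slacks` and `soft_margin_objective` are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def slacks(w, b, X, y):
    """Smallest xi_i >= 0 that satisfies y_i (w.x_i + b) >= 1 - xi_i."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]

def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum(xi_i) for the given (w, b)."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, X, y))

# Toy data: the last negative point sits among the positives,
# so no hyperplane separates the classes perfectly.
X = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (1.5, 1.5)]
y = [1, 1, -1, -1]
w, b = (0.5, 0.5), -1.0

print(slacks(w, b, X, y))  # [0.0, 0.0, 0.0, 1.5]
for C in (0.1, 10.0):
    # The same violation costs 1.5 * C, so it weighs far more when C is large.
    print(C, soft_margin_objective(w, b, X, y, C))
```

Only the overlapping point needs slack. Its contribution scales linearly with $C$, so a large $C$ pushes the optimizer to shrink that violation even at the cost of a narrower margin.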

---

## 3. **Hinge Loss (Common Interpretation)**

The misclassification penalty in SVC is based on the **hinge loss**:

$$
L(y, f(x)) = \max(0, 1 - y \cdot f(x))
$$

where $f(x) = w \cdot x + b$.

* If the point is correctly classified and on or outside the margin ($y \cdot f(x) \geq 1$) → loss = 0.
* If it is inside the margin or misclassified ($y \cdot f(x) < 1$) → loss > 0, growing linearly with the size of the violation.
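
The three cases can be checked with a few lines of Python. The function name `hinge_loss` and the scores passed to it are illustrative assumptions.

```python
def hinge_loss(y, fx):
    """Hinge loss for a label y in {-1, +1} and a raw score fx = w.x + b."""
    return max(0.0, 1.0 - y * fx)

print(hinge_loss(+1, 2.5))   # correct, outside the margin → 0.0
print(hinge_loss(+1, 0.5))   # correct, but inside the margin → 0.5
print(hinge_loss(+1, -1.0))  # misclassified → 2.0
```

Note that a correctly classified point can still incur a loss if its score falls inside the margin.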

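At the optimum of the soft-margin problem, each slack variable takes its smallest feasible value, $\xi_i = \max(0, 1 - y_i(w \cdot x_i + b))$, which is exactly the hinge loss of point $i$. Substituting this into the objective removes the constraints, so the **primal cost function** of SVC is often written as: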

$$
J(w, b) = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^n \max(0, 1 - y_i(w \cdot x_i + b))
$$
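
Because this form has no constraints, it can be minimized directly. The sketch below uses plain subgradient descent on a toy dataset. This is only one possible approach, chosen for brevity; practical SVM solvers typically work differently (for example on the dual problem). The data, learning rate, iteration count, and the helper names `primal_cost` and `subgradient_step` are all assumptions made for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def primal_cost(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum of hinge losses."""
    hinge = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * hinge

def subgradient_step(w, b, X, y, C, lr):
    """One subgradient descent update of (w, b) on the primal cost."""
    grad_w = list(w)   # gradient of 0.5 * ||w||^2 is w itself
    grad_b = 0.0
    for xi, yi in zip(X, y):
        if yi * (dot(w, xi) + b) < 1:   # hinge term is active for this point
            grad_w = [g - C * yi * xj for g, xj in zip(grad_w, xi)]
            grad_b -= C * yi
    w_new = [wj - lr * g for wj, g in zip(w, grad_w)]
    return w_new, b - lr * grad_b

# Toy, linearly separable 2-D data (illustrative values only)
X = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
y = [1, 1, -1, -1]

w, b, C = [0.0, 0.0], 0.0, 1.0
initial = primal_cost(w, b, X, y, C)   # every hinge term equals 1 → 4.0
for _ in range(200):
    w, b = subgradient_step(w, b, X, y, C, lr=0.01)
final = primal_cost(w, b, X, y, C)
print(initial, final)   # the final cost is lower than the initial one
```

The hinge loss is not differentiable at $y_i f(x_i) = 1$, which is why the update uses a subgradient: a point contributes to the update only while its margin constraint is violated.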

---

## 4. **Summary**

* **Hard Margin**: Minimize $\frac{1}{2}\|w\|^2$ subject to $y_i(w \cdot x_i + b) \geq 1$ (no margin violations allowed).
* **Soft Margin**: Minimize $\frac{1}{2}\|w\|^2 + C \sum \xi_i$ (allows some errors).
* **Hinge Loss view**: Minimize

  $$
  \frac{1}{2}\|w\|^2 + C \sum \max(0, 1 - y_i f(x_i))
  $$
* $C$ controls the tradeoff between margin maximization and error tolerance.

---

👉 In simple words:
The **cost function in SVC** tries to make the margin as wide as possible while penalizing points that lie on the wrong side of the margin (using **hinge loss**).

