In Support Vector Machines (SVM), the decision function is defined as


$f(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + b.$


The decision rule for classifying a point $\mathbf{x}$ is based on the sign of $f(\mathbf{x})$: 


- If $f(\mathbf{x}) > 0$, then $\mathbf{x}$ is classified as one class (say, +1).
- If $f(\mathbf{x}) < 0$, then $\mathbf{x}$ is classified as the other class (say, -1).

The hyperplane given by


$\mathbf{w}^\top \mathbf{x} + b = 0$


is the set of all points where $f(\mathbf{x})$ is exactly zero. This hyperplane acts as the decision boundary that separates the two classes.
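As a minimal sketch of the decision rule above, the following pure-Python snippet evaluates $f(\mathbf{x})$ and returns its sign. The function names and the values of `w` and `b` are hypothetical, chosen only for illustration; they are not the result of training.

```python
def decision_function(w, x, b):
    """Return f(x) = w^T x + b for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Return +1 or -1 from the sign of f(x); 0 means x lies on the hyperplane."""
    f = decision_function(w, x, b)
    if f > 0:
        return 1
    if f < 0:
        return -1
    return 0


# Hypothetical parameters, assumed purely for illustration.
w = [2.0, -1.0]
b = -1.0

print(classify(w, [2.0, 1.0], b))  # f = 4 - 1 - 1 = 2, positive side
print(classify(w, [0.0, 1.0], b))  # f = 0 - 1 - 1 = -2, negative side
print(classify(w, [1.0, 1.0], b))  # f = 2 - 1 - 1 = 0, on the decision boundary
```

The third point shows the case the two bullets leave out: a point with $f(\mathbf{x}) = 0$ lies exactly on the hyperplane and belongs to neither side.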


**Margins**

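Consider the two parallel hyperplanes $\mathbf{w}^\top \mathbf{x} + b = +1$ and $\mathbf{w}^\top \mathbf{x} + b = -1$, one on each side of the decision boundary, with $\mathbf{w}$ and $b$ scaled so that the closest points of each class lie on them. Each of these margin boundaries is at distance $\frac{1}{\|\mathbf{w}\|}$ from the decision boundary, so the distance between the two is the margin: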


$\text{Margin} = \frac{2}{\|\mathbf{w}\|}$
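To make the formula concrete, here is a small sketch that computes the margin width for a given weight vector. The helper name `margin_width` and the example vectors are assumptions made for illustration.

```python
import math


def margin_width(w):
    """Return 2 / ||w||, the distance between the +1 and -1 margin boundaries."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return 2.0 / norm


# Hypothetical weight vectors: the second is the first scaled by 2.
print(margin_width([3.0, 4.0]))  # ||w|| = 5, so the margin is 0.4
print(margin_width([6.0, 8.0]))  # ||w|| = 10, so the margin is 0.2
```

Doubling the norm of $\mathbf{w}$ halves the margin, which is the inverse relationship the formula expresses.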



**Since the margin is inversely proportional to $\|\mathbf{w}\|$, maximising the margin is equivalent to minimising $\|\mathbf{w}\|$:**


$\text{Maximise } \frac{2}{\|\mathbf{w}\|} \quad \Longleftrightarrow \quad \text{Minimise } \|\mathbf{w}\|$


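But instead of minimising $\|\mathbf{w}\|$, we minimise $\frac{1}{2} \|\mathbf{w}\|^2$. Both have the same minimiser, because squaring is monotonically increasing on non-negative values. The squared form is preferred because:
- $\|\mathbf{w}\|^2$ is differentiable everywhere, whereas $\|\mathbf{w}\|$ is not differentiable at $\mathbf{w} = \mathbf{0}$.
- The factor $\frac{1}{2}$ cancels the 2 produced by differentiation, so the gradient is simply $\mathbf{w}$.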

### Here’s the complete optimization formulation:  
Objective (Minimise the Weights):


$\min_{\mathbf{w}, b} \quad \frac{1}{2} \|\mathbf{w}\|^2$


Subject to Constraints (Correct Classification with Margin):


$t^{(i)} (\mathbf{w}^\top \mathbf{x}^{(i)} + b) \geq 1 \quad \forall i$


Where:
- $t^{(i)} = +1$  for positive class
- $t^{(i)} = -1$  for negative class

This ensures:
- Positive points are on or beyond the  +1  boundary.
- Negative points are on or beyond the  -1  boundary.
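The constraints can be checked directly on a small data set. The sketch below assumes a hypothetical toy data set and hand-picked values of `w` and `b`; it does not solve the optimization problem, it only tests whether a candidate solution is feasible and evaluates the objective.

```python
def functional_margins(w, b, X, t):
    """Return t_i * (w^T x_i + b) for every training point."""
    return [
        ti * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        for xi, ti in zip(X, t)
    ]


def satisfies_hard_margin(w, b, X, t):
    """True if every point meets the constraint t_i * (w^T x_i + b) >= 1."""
    return all(m >= 1 for m in functional_margins(w, b, X, t))


def objective(w):
    """Return the hard-margin objective 0.5 * ||w||^2."""
    return 0.5 * sum(wj * wj for wj in w)


# Hypothetical toy data: two positive and two negative points.
X = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
t = [1, 1, -1, -1]

# Candidate A: feasible, with the closest point of each class exactly on its boundary.
print(functional_margins([0.5, 0.5], -1.0, X, t))
print(satisfies_hard_margin([0.5, 0.5], -1.0, X, t), objective([0.5, 0.5]))

# Candidate B: same direction with a smaller norm, but it violates the constraints.
print(satisfies_hard_margin([0.25, 0.25], -0.5, X, t), objective([0.25, 0.25]))
```

Candidate B has a lower objective value but is infeasible, which illustrates why the constraints are needed: without them, shrinking $\mathbf{w}$ toward zero would always reduce the objective.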

## Soft-Margin SVM

To allow misclassifications or points inside the margin, we introduce slack variables for each instance:


$\xi_i \geq 0 \quad \forall i$

The slack variable $\xi_i$ measures how much the $i$-th instance is allowed to violate the margin.

The new constraint becomes:


$t^{(i)} (\mathbf{w}^\top \mathbf{x}^{(i)} + b) \geq 1 - \xi_i$


Where:
- $\xi_i = 0$ means the point is correctly classified and lies on or outside the margin (as in hard-margin SVM).
- $0 < \xi_i < 1$ means the point is inside the margin but still correctly classified.
- $\xi_i = 1$ means the point lies exactly on the decision boundary.
- $\xi_i > 1$ means the point is misclassified.
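The cases above can be illustrated numerically. For fixed $\mathbf{w}$ and $b$, the smallest slack that satisfies both $\xi_i \geq 0$ and the relaxed constraint is $\xi_i = \max(0,\ 1 - t^{(i)} (\mathbf{w}^\top \mathbf{x}^{(i)} + b))$. The sketch below uses that expression; the function names, parameters, and sample points are hypothetical.

```python
def slack(w, b, x, t):
    """Smallest xi >= 0 satisfying t * (w^T x + b) >= 1 - xi."""
    f = sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(0.0, 1.0 - t * f)


def describe(xi):
    """Map a slack value to the region the point falls in."""
    if xi == 0:
        return "on or outside the margin"
    if xi < 1:
        return "inside the margin, correctly classified"
    if xi == 1:
        return "on the decision boundary"
    return "misclassified"


# Hypothetical parameters and labelled points, assumed for illustration.
w, b = [1.0, 0.0], 0.0
points = [
    ([2.0, 0.0], 1),     # well on the positive side
    ([0.5, 0.0], 1),     # positive point inside the margin
    ([-1.0, 0.0], 1),    # positive point on the wrong side
    ([-0.25, 0.0], -1),  # negative point inside the margin
]

for x, t in points:
    xi = slack(w, b, x, t)
    print(x, t, xi, describe(xi))
```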