# Elastic Net Regression

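Elastic Net is a regularized linear regression that combines the L1 (Lasso) and L2 (Ridge) penalties.

It adds both the sum of absolute values and the sum of squares of the weights to the cost function, each scaled by its own coefficient ($\lambda_1$ and $\lambda_2$).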

---

## Original Cost Function (MSE)

$$
\text{MSE}(w) = \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^T w)^2
$$

Here $n$ is the number of samples, $x_i$ is the feature vector of sample $i$, $y_i$ is its target, and $w$ is the weight vector with $d$ entries.

---

## New Cost Function (Elastic Net)

$$
J(w) = \text{MSE}(w) 
+ \lambda_1 \sum_{j=1}^{d} |w_j|
+ \lambda_2 \sum_{j=1}^{d} w_j^2
$$

---

## Compact Vector Form

$$
J(w) = \frac{1}{n} \| y - Xw \|_2^2 
+ \lambda_1 \| w \|_1
+ \lambda_2 \| w \|_2^2
$$

where $X$ is the $n \times d$ matrix whose rows are $x_i^T$, $y$ is the vector of targets $y_i$, and

$$
\| w \|_1 = \sum_{j=1}^{d} |w_j|
$$

$$
\| w \|_2^2 = \sum_{j=1}^{d} w_j^2
$$
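
As a rough sketch, the cost can be evaluated in NumPy in both the summation form and the vector form. The function names and the small data set below are made up for illustration; the only point is that the two forms return the same number.

```python
import numpy as np

def elastic_net_cost(X, y, w, lam1, lam2):
    """Summation form: MSE + lam1 * sum|w_j| + lam2 * sum w_j^2."""
    residual = y - X @ w
    mse = np.mean(residual ** 2)
    return mse + lam1 * np.sum(np.abs(w)) + lam2 * np.sum(w ** 2)

def elastic_net_cost_vector(X, y, w, lam1, lam2):
    """Vector form: (1/n)||y - Xw||_2^2 + lam1*||w||_1 + lam2*||w||_2^2."""
    n = X.shape[0]
    return (np.linalg.norm(y - X @ w) ** 2 / n
            + lam1 * np.linalg.norm(w, 1)
            + lam2 * np.linalg.norm(w) ** 2)

# Illustrative data (hypothetical values, not from any real data set).
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.25])

print(elastic_net_cost(X, y, w, lam1=0.1, lam2=0.2))
print(elastic_net_cost_vector(X, y, w, lam1=0.1, lam2=0.2))
```

Both calls should print the same value up to floating-point rounding.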

---

## Meaning of Each Term

- First term → Data fitting (MSE)
- Second term → L1 penalty (drives some weights exactly to zero, creating sparsity)
- Third term → L2 penalty (shrinks all weights toward zero, stabilizing them)
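
The effect of each penalty is easiest to see on a toy problem with a single weight. The sketch below assumes the data-fitting term is simplified to $(w - z)^2$ for some target value $z$ (a simplification for illustration, not the full MSE). Under that assumption the minimizer has a closed form: the L1 part subtracts $\lambda_1 / 2$ from $|z|$ and clips at zero, and the L2 part divides the result by $1 + \lambda_2$. The function name is hypothetical.

```python
import math

def scalar_elastic_net(z, lam1, lam2):
    """Minimizer of (w - z)^2 + lam1*|w| + lam2*w^2 over a single weight w."""
    # L1 part: soft-thresholding, which returns exactly 0 when |z| <= lam1 / 2.
    shrunk = max(abs(z) - lam1 / 2.0, 0.0)
    # L2 part: proportional shrinkage, which never produces an exact zero by itself.
    return math.copysign(shrunk, z) / (1.0 + lam2)

print(scalar_elastic_net(0.3, lam1=1.0, lam2=0.0))  # 0.0  (L1 zeroes a small weight)
print(scalar_elastic_net(2.0, lam1=1.0, lam2=0.0))  # 1.5  (L1 only)
print(scalar_elastic_net(2.0, lam1=0.0, lam2=1.0))  # 1.0  (L2 only)
print(scalar_elastic_net(2.0, lam1=1.0, lam2=1.0))  # 0.75 (both penalties)
```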

---

## Special Cases

If:

- $\lambda_2 = 0$ → Elastic Net becomes **Lasso**
- $\lambda_1 = 0$ → Elastic Net becomes **Ridge**
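
Substituting these values into $J(w)$ gives the two reduced cost functions (the subscripts are labels added here for readability):

$$
J_{\text{Lasso}}(w) = \text{MSE}(w) + \lambda_1 \sum_{j=1}^{d} |w_j|
$$

$$
J_{\text{Ridge}}(w) = \text{MSE}(w) + \lambda_2 \sum_{j=1}^{d} w_j^2
$$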

---

## Why Use Elastic Net?

- When features are highly correlated (Lasso alone tends to keep one feature from a correlated group and drop the others)
- When you want sparsity (like L1)
- When you also want stability (like L2)

It balances feature selection and coefficient shrinkage.
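
The correlated-feature case can be illustrated with a small experiment. The sketch below is a plain cyclic coordinate descent for $J(w)$, written for illustration only; it is not how any particular library implements Elastic Net, and the function names, data, and penalty values are assumptions. The two features are exact copies of each other, which is the extreme case of correlation.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0 when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net_cd(X, y, lam1, lam2, n_sweeps=500):
    """Cyclic coordinate descent for
    J(w) = (1/n)||y - Xw||^2 + lam1*||w||_1 + lam2*||w||_2^2."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_sweeps):
        for j in range(d):
            # Residual with feature j's current contribution added back.
            r_j = y - X @ w + X[:, j] * w[j]
            rho = (2.0 / n) * (X[:, j] @ r_j)
            denom = (2.0 / n) * (X[:, j] @ X[:, j]) + 2.0 * lam2
            # Exact minimizer of J along coordinate j.
            w[j] = soft_threshold(rho, lam1) / denom
    return w

x = np.array([1.0, -1.0, 1.0, -1.0])
X = np.column_stack([x, x])  # two identical (perfectly correlated) features
y = 2.0 * x

w_lasso = elastic_net_cd(X, y, lam1=0.25, lam2=0.0)   # Lasso special case
w_enet = elastic_net_cd(X, y, lam1=0.25, lam2=0.125)  # Elastic Net

print("Lasso:      ", w_lasso)
print("Elastic Net:", w_enet)
```

In this setup the Lasso run puts all of the weight on the first feature (it is simply the one updated first) and leaves the second at zero, while the Elastic Net run splits the weight equally between the two copies. Which feature Lasso keeps depends on the update order, which is one way to see the instability that the L2 term removes.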
