```{contents}
```

## Elastic Net Regression

**Elastic Net Regression** is a **regularized linear regression** technique that combines the penalties of:

* **Lasso (L1)** → drives some coefficients to **zero** (feature selection).
* **Ridge (L2)** → shrinks coefficients (handles multicollinearity).

It is especially useful when you have **many correlated features**.

---

### Elastic Net Loss Function

For a regression model:

$$
y = X\beta + \epsilon
$$

The Elastic Net objective function is:

$$
L(\beta) = \frac{1}{2n} \sum_{i=1}^n (y_i - \hat{y}_i)^2 + \lambda \left( \alpha \sum_{j=1}^p |\beta_j| + (1-\alpha)\sum_{j=1}^p \beta_j^2 \right)
$$

Where:

* $n$ → number of samples, $p$ → number of features
* $y_i$ → actual value
* $\hat{y}_i = x_i^\top \beta$ → predicted value
* $\beta_j$ → regression coefficients
* $\lambda \ge 0$ → overall regularization strength
* $\alpha \in [0, 1]$ → mixing parameter between L1 and L2

  * If $\alpha = 1$: becomes **Lasso**
  * If $\alpha = 0$: becomes **Ridge**
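
As a minimal sketch, the objective above can be evaluated directly with NumPy. The function name `elastic_net_objective` and the toy arrays are illustrative assumptions, not part of any library, and the model has no intercept for simplicity. Note that scikit-learn's `ElasticNet` names $\lambda$ `alpha` and $\alpha$ `l1_ratio`, and puts an extra factor of $\tfrac{1}{2}$ on the L2 term, so its values will not match this formula exactly.

```python
import numpy as np

def elastic_net_objective(X, y, beta, lam, alpha):
    """Evaluate the Elastic Net objective as written above (no intercept).

    lam   -> overall regularization strength (lambda in the formula)
    alpha -> L1/L2 mixing parameter in [0, 1]
    """
    n = X.shape[0]
    residuals = y - X @ beta
    fit_term = (residuals @ residuals) / (2 * n)  # (1 / 2n) * sum of squared errors
    l1 = np.sum(np.abs(beta))                     # sum_j |beta_j|
    l2 = np.sum(beta ** 2)                        # sum_j beta_j^2
    return fit_term + lam * (alpha * l1 + (1 - alpha) * l2)

# Toy data, chosen only for illustration
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
beta = np.array([0.5, -0.25])

lasso_like = elastic_net_objective(X, y, beta, lam=0.1, alpha=1.0)  # pure L1 penalty
ridge_like = elastic_net_objective(X, y, beta, lam=0.1, alpha=0.0)  # pure L2 penalty
mixed = elastic_net_objective(X, y, beta, lam=0.1, alpha=0.5)       # equal mix

print(lasso_like, ridge_like, mixed)
```

Because the penalty is linear in $\alpha$, the `mixed` value sits exactly halfway between the Lasso-style and Ridge-style values for the same $\beta$ and $\lambda$.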

---

### Why Use Elastic Net?

* **Lasso issue**: If features are highly correlated, it tends to pick one of them, somewhat arbitrarily, and zero out the others → the selected set can change with small changes in the data.
* **Ridge issue**: Keeps all features but doesn’t perform feature selection.
* **Elastic Net**: Combines both strengths →

  * ✅ Keeps the model stable with correlated features.
  * ✅ Still performs feature selection.

---

### Example Intuition

Suppose you’re predicting **house price** using:

* `square_feet`
* `bedrooms`
* `bathrooms`
* `location_score`

Since `square_feet` and `bedrooms` are highly correlated:

* **Lasso** may drop `bedrooms` entirely.
* **Ridge** will keep both but shrink their weights.
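* **Elastic Net** → tends to keep both with shrunken, more evenly shared weights (the L2 part), while the L1 part can still zero out features that add little → better balance.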
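
The comparison above can be tried on synthetic data. The sketch below assumes scikit-learn is installed and uses made-up housing numbers (the coefficients in `price` and the penalty strengths are arbitrary choices, not tuned values). Which features Lasso actually zeroes out depends on the data and the penalty, so the output is printed rather than stated here.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200

# Synthetic (made-up) housing data; bedrooms is built from square_feet,
# so the two features are strongly correlated by construction.
square_feet = rng.normal(2000, 500, n)
bedrooms = square_feet / 600 + rng.normal(0, 0.3, n)
bathrooms = rng.normal(2, 0.7, n)
location_score = rng.uniform(0, 10, n)
price = (150 * square_feet + 10_000 * bedrooms + 15_000 * bathrooms
         + 20_000 * location_score + rng.normal(0, 20_000, n))

names = ["square_feet", "bedrooms", "bathrooms", "location_score"]
# Standardize features and target so one penalty scale applies to all of them
X = StandardScaler().fit_transform(
    np.column_stack([square_feet, bedrooms, bathrooms, location_score]))
y = (price - price.mean()) / price.std()

corr = np.corrcoef(square_feet, bedrooms)[0, 1]

# In scikit-learn, `alpha` is the overall strength (lambda above) and
# `l1_ratio` is the mixing parameter (alpha above). Ridge's `alpha` is on a
# different scale because its loss is not divided by n.
models = {
    "lasso": Lasso(alpha=0.1),
    "ridge": Ridge(alpha=10.0),
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}
coefs = {name: model.fit(X, y).coef_ for name, model in models.items()}

print(f"corr(square_feet, bedrooms) = {corr:.2f}")
for name, c in coefs.items():
    print(name, {k: round(float(v), 3) for k, v in zip(names, c)})
```

Comparing the `square_feet` and `bedrooms` entries across the three printed rows shows how each penalty treats the correlated pair on this particular dataset.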

---

### Pros & Cons

* ✅ Handles multicollinearity
* ✅ Performs feature selection
* ✅ Works well when $p > n$ (more features than samples)
* ❌ Needs tuning of two hyperparameters ($\lambda, \alpha$)
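
The two hyperparameters are usually tuned by cross-validation. As a hedged sketch, assuming scikit-learn is available, `ElasticNetCV` searches a grid of strengths for each candidate mixing value. The dataset from `make_regression` is synthetic with $p > n$, and the candidate list `l1_ratios` is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV

# Synthetic problem with more features (p = 100) than samples (n = 60)
X, y = make_regression(n_samples=60, n_features=100, n_informative=10,
                       noise=10.0, random_state=0)

# Candidate mixing values (alpha in the formula, `l1_ratio` in scikit-learn)
l1_ratios = [0.1, 0.5, 0.9, 1.0]

# For each l1_ratio, a path of strengths (lambda, called `alpha` here)
# is evaluated with 5-fold cross-validation.
model = ElasticNetCV(l1_ratio=l1_ratios, cv=5, max_iter=10_000)
model.fit(X, y)

n_selected = int(np.sum(model.coef_ != 0))
print("chosen strength (lambda):", model.alpha_)
print("chosen mixing (alpha):", model.l1_ratio_)
print("non-zero coefficients:", n_selected, "of", X.shape[1])
```

The chosen values depend on the data and the folds, so they should be read as a starting point rather than a definitive setting.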
