# Elastic Net Regression

**Elastic Net** is a hybrid regularization technique that combines the penalties of **Lasso (L1)** and **Ridge (L2)** methods.

It is designed to solve specific limitations of Lasso and Ridge, offering the "best of both worlds": **Feature Selection** (from Lasso) and **Stability** (from Ridge).

### Core Concept
$$\text{Elastic Net} = \text{L1 (Lasso)} + \text{L2 (Ridge)}$$

It is particularly useful when:
* **Lasso alone is unstable** (e.g., with highly correlated features, where it keeps one and drops the rest almost arbitrarily).
* **Ridge alone is not enough** (you need to eliminate useless features, and Ridge never sets a coefficient exactly to zero).
* You want both **shrinkage** (coefficient reduction) and **feature selection** (setting coefficients to zero).

---

## 1. The Mathematical Formula

Elastic Net adds both L1 and L2 penalties to the standard Mean Squared Error (MSE) cost function:

$$
J(\beta) = \underbrace{\text{MSE}}_{\text{Error}} + \underbrace{\lambda_1 \sum_{j=1}^{p} |\beta_j|}_{\text{L1 Penalty}} + \underbrace{\lambda_2 \sum_{j=1}^{p} \beta_j^2}_{\text{L2 Penalty}}
$$

### The `sklearn` Implementation
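In Python's `scikit-learn`, the formula is reparameterized using `alpha` (total strength) and `l1_ratio` (mix percentage). Note that the data-fit term is the squared error divided by $2n$, i.e. half the MSE:

$$
\text{Loss} = \frac{1}{2n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \alpha \left( \rho \sum_{j=1}^{p} |\beta_j| + \frac{(1-\rho)}{2} \sum_{j=1}^{p} \beta_j^2 \right)
$$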

* **$\alpha$ (alpha):** Controls the overall strength of regularization.
* **$\rho$ (`l1_ratio`):** Controls the balance between L1 and L2.
    * If `l1_ratio = 1` $\rightarrow$ Pure Lasso.
    * If `l1_ratio = 0` $\rightarrow$ Pure Ridge.
    * If `0 < l1_ratio < 1` $\rightarrow$ Elastic Net.
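
To make the parameterization concrete, here is a minimal sketch that evaluates the objective by hand and compares it against a fitted model. The helper `enet_objective` and the synthetic data from `make_regression` are assumptions introduced for illustration; the formula follows the objective stated in the `ElasticNet` documentation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso


def enet_objective(X, y, coef, intercept, alpha, l1_ratio):
    """Hypothetical helper: the Elastic Net objective in sklearn's parameterization."""
    n_samples = X.shape[0]
    residual = y - (X @ coef + intercept)
    data_fit = residual @ residual / (2 * n_samples)      # half the MSE
    l1_term = alpha * l1_ratio * np.abs(coef).sum()
    l2_term = 0.5 * alpha * (1 - l1_ratio) * (coef @ coef)
    return data_fit + l1_term + l2_term


# Synthetic data, used only for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)

alpha, l1_ratio = 0.5, 0.5
model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio).fit(X, y)

# The fitted coefficients should minimize the objective, so shifting them
# should not produce a lower value.
fitted_loss = enet_objective(X, y, model.coef_, model.intercept_, alpha, l1_ratio)
perturbed_loss = enet_objective(
    X, y, model.coef_ + 1.0, model.intercept_, alpha, l1_ratio
)
print("Objective at the fitted coefficients:", fitted_loss)
print("Objective after shifting every coefficient by 1:", perturbed_loss)

# With l1_ratio=1 the L2 term vanishes, so the fit should match Lasso
# at the same alpha.
enet_as_lasso = ElasticNet(alpha=alpha, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=alpha).fit(X, y)
same_as_lasso = np.allclose(enet_as_lasso.coef_, lasso.coef_)
print("l1_ratio=1 matches Lasso:", same_as_lasso)
```

The Ridge end of the scale is not compared here because `sklearn.linear_model.Ridge` scales its data-fit term differently, so the same `alpha` value does not mean the same penalty strength.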

---



## 2. Why Elastic Net? (The "Grouping Effect")

This is the most critical reason to use Elastic Net over Lasso.

### The Problem with Lasso
When features are **highly correlated** (e.g., Feature A and Feature B have 0.99 correlation):
* **Lasso** tends to pick just **one** feature arbitrarily and sets the other to zero.
* This causes **instability** (small data changes result in different features being selected).

### The Elastic Net Solution
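Elastic Net fixes this using the L2 term, which makes the objective strictly convex, so the solution is unique and correlated features receive similar coefficients.
* It creates a **Grouping Effect**: It tends to select groups of correlated features together rather than picking just one.
* It keeps the feature selection capability (setting some coefficients to exactly zero), but its selections are more stable because correlated features tend to enter or leave the model together.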
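
The grouping effect is easiest to see in an extreme case. The sketch below is an illustration on made-up data, not a benchmark: `feature_b` is an exact copy of `feature_a` (correlation 1.0 rather than 0.99), and all variable names and penalty values are assumptions chosen for the demo.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n_samples = 200

# Hypothetical data: feature_b duplicates feature_a exactly, and
# noise_feature is unrelated to the target.
feature_a = rng.normal(size=n_samples)
feature_b = feature_a.copy()
noise_feature = rng.normal(size=n_samples)
X = np.column_stack([feature_a, feature_b, noise_feature])
y = 4.0 * feature_a + rng.normal(scale=0.5, size=n_samples)

lasso = Lasso(alpha=0.5).fit(X, y)
enet = ElasticNet(alpha=0.5, l1_ratio=0.5).fit(X, y)

print("Lasso coefficients:      ", lasso.coef_)
print("Elastic Net coefficients:", enet.coef_)

# How differently does each model treat the two identical columns?
lasso_gap = abs(lasso.coef_[0] - lasso.coef_[1])
enet_gap = abs(enet.coef_[0] - enet.coef_[1])
print("Coefficient gap within the duplicated pair (Lasso):      ", lasso_gap)
print("Coefficient gap within the duplicated pair (Elastic Net):", enet_gap)
```

Because the two columns are interchangeable, the L1 penalty alone does not care how the weight is split between them, and the solver is free to load it onto one copy. The L2 term penalizes uneven splits, so Elastic Net should assign the pair nearly equal coefficients, up to solver tolerance.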

---

## 3. When Should You Use Elastic Net?

| Scenario | Recommendation |
| :--- | :--- |
| **Highly Correlated Features** | **YES.** Essential for stability (avoids arbitrary selection among correlated features). |
| **High Dimensions ($p > n$)** | **YES.** Lasso can select at most $n$ features when features > samples; Elastic Net has no such cap. |
| **Need Feature Selection** | **YES.** It will zero out irrelevant features. |
| **Unsure between Lasso/Ridge** | **YES.** Just tune `l1_ratio`. It's the safe middle ground. |
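
Tuning `l1_ratio` can be sketched with `ElasticNetCV`, which cross-validates over a list of candidate mixes and a path of `alpha` values. This is one possible setup, not a recommended recipe: the synthetic data, the candidate grid, and the choice of 5 folds are all assumptions made for illustration. The features are standardized first because both penalties depend on feature scale.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption): 20 features, only 5 of which drive the target
X, y = make_regression(
    n_samples=300, n_features=20, n_informative=5, noise=10.0, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate L1/L2 mixes; the alpha values are generated automatically
l1_ratio_grid = [0.1, 0.5, 0.9, 1.0]

pipeline = make_pipeline(
    StandardScaler(),  # put all features on a comparable scale
    ElasticNetCV(l1_ratio=l1_ratio_grid, cv=5),
)
pipeline.fit(X_train, y_train)

search = pipeline[-1]  # the fitted ElasticNetCV step
test_r2 = r2_score(y_test, pipeline.predict(X_test))
print("Selected l1_ratio:", search.l1_ratio_)
print("Selected alpha:", search.alpha_)
print("Test R2:", test_r2)
```

Putting the scaler inside the pipeline means the scaling statistics are computed from the training split only, so the held-out test set does not influence them.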

---

## Summary Comparison

| Method | Penalty | Feature Selection? | Handles Correlated Data? |
| :--- | :--- | :--- | :--- |
| **Ridge** | L2 ($\beta^2$) | No (Coefficients $\to$ small, not 0) | Yes (Shrinks them together) |
| **Lasso** | L1 ($\lvert\beta\rvert$) | **Yes** (Coefficients $\to$ 0) | **No** (Picks 1, drops others) |
| **Elastic Net** | **L1 + L2** | **Yes** | **Yes** (Selects groups) |

```python
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_california_housing
from sklearn.metrics import mean_squared_error, r2_score

# Load dataset
data = fetch_california_housing()
X = data.data
y = data.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Elastic Net model
model = ElasticNet(alpha=0.1, l1_ratio=0.5)  # 50% L1 + 50% L2
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)

# Evaluation
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))
```


Output:

```text
Coefficients: [ 3.86285633e-01  1.29868965e-02  0.00000000e+00  0.00000000e+00
  7.93284174e-06 -3.27879039e-03 -2.40098262e-01 -2.33727233e-01]
Intercept: -19.191606736761383
MSE: 0.5730994198028208
R2 Score: 0.5626560643897964
```
