# 🔗 Elastic Net Regression (L1 + L2 Regularization)

**Elastic Net Regression** is a regularized linear model that combines the **Lasso (L1)** and **Ridge (L2)** penalties to reduce overfitting while enabling **feature selection** and **multicollinearity handling**.

---

## 🧠 Motivation

- **Ridge Regression**: Shrinks all coefficients but **none go to zero**
- **Lasso Regression**: Can shrink some coefficients **exactly to zero** (sparse)
- **Elastic Net**: A hybrid approach that leverages both Ridge and Lasso benefits ✅
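
The difference in sparsity can be checked directly. The sketch below is a minimal illustration on synthetic data; the dataset shape and the `alpha` values are assumptions chosen for the example, and `alpha` is not on the same scale across the three estimators.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data (an assumption for illustration): 20 features, only 5 informative.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

models = {
    "Ridge": Ridge(alpha=5.0),
    "Lasso": Lasso(alpha=5.0),
    "ElasticNet": ElasticNet(alpha=5.0, l1_ratio=0.5),
}

# Count coefficients that are exactly zero after fitting each model.
zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    zero_counts[name] = int(np.sum(model.coef_ == 0))
    print(f"{name:<10} zero coefficients: {zero_counts[name]} of {X.shape[1]}")
```

Ridge should report no exact zeros, while Lasso should drop most of the uninformative features; how many Elastic Net drops depends on `alpha` and `l1_ratio`.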

---

## 🧮 Loss Function

$$
\text{Loss} = \text{MSE} + A \sum |w_i| + B \sum w_i^2
$$

Where:
- `MSE` is the Mean Squared Error
- `A`: Weight for Lasso (L1 penalty)
- `B`: Weight for Ridge (L2 penalty)

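In **scikit-learn**, `ElasticNet` uses the same two penalties but adds a factor of 1/2 to the squared-error and L2 terms, and expresses the weights as a total strength `alpha` and a mixing ratio `l1_ratio`:

$$
\frac{1}{2n} \sum_j (y_j - \hat{y}_j)^2 + \text{alpha} \cdot \text{l1\_ratio} \sum |w_i| + \frac{1}{2}\,\text{alpha}\,(1 - \text{l1\_ratio}) \sum w_i^2
$$

Writing this penalty as $a \sum |w_i| + \tfrac{1}{2} b \sum w_i^2$ gives:

$$
\text{alpha} = a + b,\quad \text{l1\_ratio} = \frac{a}{a + b}
$$

Compared with the loss above (divided by 2), $a = A/2$ and $b = B$.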

| Parameter  | Meaning                                                                   | Range  |
|------------|---------------------------------------------------------------------------|--------|
| `alpha`    | Total regularization strength (λ)                                         | > 0    |
| `l1_ratio` | Proportion of L1 (Lasso) vs. L2 (Ridge); `0`: pure Ridge, `1`: pure Lasso | [0, 1] |
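
To make the parameterization concrete, the sketch below maps two penalty weights to `alpha` and `l1_ratio` and evaluates the objective that scikit-learn documents for `ElasticNet`, which puts a factor of 1/2 on the squared-error and L2 terms. The helper names `to_sklearn_params` and `objective_terms` and the weights `a`, `b` are hypothetical, introduced only for this example.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet


def to_sklearn_params(a, b):
    """Map hypothetical penalty weights (a, b) to (alpha, l1_ratio)."""
    alpha = a + b
    return alpha, a / alpha


def objective_terms(X, y, coef, intercept, alpha, l1_ratio):
    """Return the (data, L1, L2) terms of the documented ElasticNet objective."""
    residual = y - (X @ coef + intercept)
    data_term = residual @ residual / (2 * X.shape[0])
    l1_term = alpha * l1_ratio * np.abs(coef).sum()
    l2_term = 0.5 * alpha * (1 - l1_ratio) * np.sum(coef ** 2)
    return data_term, l1_term, l2_term


X, y = load_diabetes(return_X_y=True)
alpha, l1_ratio = to_sklearn_params(a=0.0045, b=0.0005)

model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio).fit(X, y)
fitted_terms = objective_terms(X, y, model.coef_, model.intercept_, alpha, l1_ratio)

# Baseline: all coefficients zero, intercept equal to the target mean.
null_terms = objective_terms(X, y, np.zeros(X.shape[1]), y.mean(), alpha, l1_ratio)

print(f"alpha={alpha:.4f}, l1_ratio={l1_ratio:.2f}")
print("fitted (data, L1, L2):", np.round(fitted_terms, 3))
print("null   (data, L1, L2):", np.round(null_terms, 3))
```

The fitted model pays a small penalty in exchange for a much lower data term than the all-zero baseline, which is the trade the objective encodes.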

---

## ⚙️ When to Use Elastic Net

- You have **many features**, some of which may be irrelevant
- There is **high multicollinearity** among predictors
- You are **unsure** whether Ridge or Lasso alone is the better fit

> 🔄 Elastic Net balances L1's sparsity against L2's stability: correlated predictors tend to be kept or dropped together instead of one being picked arbitrarily.
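
The multicollinearity case can be sketched with two nearly identical predictors. The data below is synthetic and the `alpha` values are illustrative assumptions; the point is only to compare how each penalty distributes weight across the correlated pair.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n_samples = 500

# Two nearly identical predictors built from one latent signal z (synthetic data).
z = rng.normal(size=n_samples)
X = np.column_stack([
    z + 0.01 * rng.normal(size=n_samples),
    z + 0.01 * rng.normal(size=n_samples),
])
y = 3.0 * z + 0.5 * rng.normal(size=n_samples)

lasso_coef = Lasso(alpha=0.1).fit(X, y).coef_
enet_coef = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y).coef_

print("Lasso coefficients:     ", np.round(lasso_coef, 3))
print("ElasticNet coefficients:", np.round(enet_coef, 3))
```

With the L2 term present, Elastic Net should assign the two columns nearly equal weights. For Lasso, the split between the two columns is essentially determined by noise, so it may look very uneven.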

---

## 📈 Bias-Variance Trade-off

- **Low `alpha`** → weaker regularization → higher variance (risk of overfitting)
- **High `alpha`** → stronger regularization → higher bias (risk of underfitting)
- **l1_ratio** controls:
  - Feature selection (via L1)
  - Smooth coefficient shrinkage (via L2)

### Trade-off Illustration:

| l1_ratio   | Effect                         |
|------------|--------------------------------|
| 0.0        | Behaves like Ridge             |
| 0.5        | Balanced Lasso + Ridge         |
| 1.0        | Behaves like Lasso             |
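
The effect of `alpha` can be observed by sweeping it at a fixed `l1_ratio`. This is a rough sketch on the diabetes dataset with the same split used in the cells at the end of this document; the grid of `alpha` values is an assumption chosen to span weak to very strong regularization.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

alphas = [0.001, 0.01, 0.1, 1.0, 100.0]  # illustrative grid, not tuned
results = []
for alpha in alphas:
    model = ElasticNet(alpha=alpha, l1_ratio=0.5, max_iter=10_000)
    model.fit(X_train, y_train)
    results.append({
        "alpha": alpha,
        "train_r2": r2_score(y_train, model.predict(X_train)),
        "test_r2": r2_score(y_test, model.predict(X_test)),
        "n_nonzero": int(np.count_nonzero(model.coef_)),
    })

for row in results:
    print(
        f"alpha={row['alpha']:<7} train R2={row['train_r2']:.3f} "
        f"test R2={row['test_r2']:.3f} nonzero coefs={row['n_nonzero']}"
    )
```

At the largest `alpha` every coefficient is driven to zero and the model predicts the training mean, which is the underfitting extreme of the trade-off.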

---

## ✅ Summary

| Concept              | Elastic Net Insight                                |
|----------------------|----------------------------------------------------|
| Regularization Type  | Combined L1 + L2 (Ridge + Lasso)                   |
| Strength             | Controlled by `alpha`                             |
| Balance              | Controlled by `l1_ratio`                          |
| Feature Selection    | ✅ Yes (through L1 component)                      |
| Handles Multicollinearity | ✅ Better than Lasso alone; the L2 term stabilizes correlated coefficients |
| Implementation       | `sklearn.linear_model.ElasticNet` or `SGDRegressor(penalty="elasticnet")` |
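
For the `SGDRegressor` route, the Elastic Net penalty has to be requested explicitly, since the estimator's default penalty is L2. A minimal sketch, with hyperparameter values that are illustrative assumptions and not tuned:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)

# SGD is sensitive to feature scale, so the scaler is part of the pipeline.
sgd_enet = make_pipeline(
    StandardScaler(),
    SGDRegressor(
        penalty="elasticnet",  # L1 + L2 penalty, mixed by l1_ratio
        alpha=0.01,            # illustrative value, not tuned
        l1_ratio=0.5,
        max_iter=5000,
        random_state=0,
    ),
)
sgd_enet.fit(X, y)
print(f"training R2: {sgd_enet.score(X, y):.3f}")
```

`SGDRegressor` fits by stochastic gradient descent, so it is mainly of interest for large datasets; on small data like this, `ElasticNet` with coordinate descent is the more direct choice.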

---

## 📚 Best Practices

- Use **GridSearchCV** or **RandomizedSearchCV** to tune:
  - `alpha`
  - `l1_ratio`

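- Always **standardize** features before applying Elastic Net: the penalty acts on coefficient magnitudes, so features on different scales would otherwise be penalized unevenly.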
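
Both practices can be combined by placing the scaler and the model in one pipeline and tuning over it. The sketch below assumes the diabetes dataset and a small illustrative grid; the step names `scale` and `enet` are arbitrary labels chosen for this example.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

# Scaling lives inside the pipeline so it is refit on each CV training fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("enet", ElasticNet(max_iter=10_000)),
])

# Illustrative grid (an assumption, not a recommendation).
param_grid = {
    "enet__alpha": [0.001, 0.01, 0.1, 1.0, 10.0],
    "enet__l1_ratio": [0.1, 0.5, 0.9, 1.0],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print(f"best CV R2:  {search.best_score_:.3f}")
print(f"test R2:     {search.score(X_test, y_test):.3f}")
```

Keeping the scaler inside the pipeline avoids leaking statistics from validation folds into training. Note that `load_diabetes` returns features that are already mean-centered and scaled to a common norm, so here the scaler mostly changes which `alpha` values are sensible.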

---

## 🔚 Conclusion

Elastic Net is a **versatile** and **robust** regularization technique for models with:
- **many input features**
- **correlated predictors**
- a need for **both feature selection and shrinkage**



In [1]:


from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score


# Load the diabetes regression dataset as (features, target) arrays
X, y = load_diabetes(return_X_y=True)

# Hold out 20% of the samples for testing; fixed seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

In [2]:


# Linear Regression (unregularized baseline)
reg = LinearRegression()
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)
r2_score(y_test, y_pred)  # R^2 on the held-out test set



0.4399338661568968

In [3]:


# Ridge (L2 penalty only)
reg = Ridge(alpha=0.1)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)
r2_score(y_test, y_pred)  # R^2 on the held-out test set



0.45199494197195456

In [4]:


# Lasso (L1 penalty only)
reg = Lasso(alpha=0.01)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)
r2_score(y_test, y_pred)  # R^2 on the held-out test set



0.44111855963110613

In [5]:


# ElasticNet (penalty mix: l1_ratio=0.9, i.e. mostly L1)
reg = ElasticNet(alpha=0.005, l1_ratio=0.9)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)
r2_score(y_test, y_pred)  # R^2 on the held-out test set



0.4531474541554823