# Tutorial 2: Ridge and Lasso Regularization

In this tutorial, we apply **Ridge** and **Lasso** regression to the *Medical Insurance Charges* dataset.

---

### Why Regularization?

Regularization helps prevent **overfitting** by penalizing large coefficients.

- **Ridge Regression (L2 penalty):**
  - Shrinks all coefficients towards zero.
  - Doesn't eliminate any features completely.

- **Lasso Regression (L1 penalty):**
  - Can shrink some coefficients **exactly to zero**.
  - Performs both **shrinkage and feature selection**.

We load the **preprocessed training/testing data** from Tutorial 1 and apply Ridge and Lasso regression to it.
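
The contrast between the two penalties can be made concrete in a simplified textbook setting. The sketch below assumes orthonormal features, a Ridge objective of `||y - Xw||^2 + alpha * ||w||^2`, and a Lasso objective of `0.5 * ||y - Xw||^2 + alpha * ||w||_1`. Under those assumptions each coefficient can be computed from its unpenalized least-squares value alone. The function names and the example values are illustrative, and this is not what scikit-learn computes for the insurance data.

```python
def ridge_shrink(w_ols, alpha):
    """Ridge under the orthonormal-feature assumption: proportional shrinkage."""
    return w_ols / (1.0 + alpha)


def lasso_soft_threshold(w_ols, alpha):
    """Lasso under the same assumption: subtract alpha from the magnitude, floor at zero."""
    magnitude = abs(w_ols) - alpha
    if magnitude <= 0.0:
        return 0.0  # small coefficients are set exactly to zero
    return magnitude if w_ols > 0 else -magnitude


# Hypothetical unpenalized coefficients, chosen only for illustration.
w_ols_demo = [5.0, 0.8, -0.3, -4.0]
alpha_demo = 1.0

ridge_demo = [ridge_shrink(w, alpha_demo) for w in w_ols_demo]
lasso_demo = [lasso_soft_threshold(w, alpha_demo) for w in w_ols_demo]

print("OLS:  ", w_ols_demo)
print("Ridge:", ridge_demo)
print("Lasso:", lasso_demo)
```

In this setting Ridge rescales every coefficient by the same factor, so none reaches zero, while Lasso zeroes any coefficient whose magnitude is at most `alpha`.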

---

We will:
1. Fit Ridge and Lasso models
2. Print and compare coefficients

In [16]:
import joblib
from sklearn.linear_model import Ridge, Lasso

# Load the preprocessed train/test split saved in Tutorial 1
X_train, X_test, y_train, y_test = joblib.load("insurance_data_split.pkl")

### Ridge Regression

In [28]:
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)
print("Ridge Coefficients:")
print(ridge.coef_)

Ridge Coefficients:
[ 3.60327080e+03  2.05193836e+03  5.13153656e+02 -1.06355787e+01
  2.35146348e+04 -3.66152625e+02 -6.43009456e+02 -8.02913351e+02]


### Lasso Regression

In [26]:
lasso = Lasso(alpha=1.0)
lasso.fit(X_train, y_train)
print("\nLasso Coefficients:")
print(lasso.coef_)


Lasso Coefficients:
[ 3.60821741e+03  2.05269422e+03  5.11517339e+02 -1.40450990e+01
  2.36446476e+04 -3.54189843e+02 -6.40187430e+02 -7.93012399e+02]


### Observations from Output

- **Ridge Regression Coefficients:**
  
  [ 3603.27, 2051.94, 513.15, -10.64, 23514.63, -366.15, -643.01, -802.91 ]

- **Lasso Regression Coefficients:**

  [ 3608.22, 2052.69, 511.52, -14.05, 23644.65, -354.19, -640.19, -793.01 ]
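
To make "similar" more precise, the two coefficient vectors can be compared position by position. The sketch below copies the rounded values listed above; the helper name `compare_coefficients` is illustrative, and positions are shown as indices because the feature order is not stated here.

```python
# Rounded coefficients copied from the output above.
ridge_coefs = [3603.27, 2051.94, 513.15, -10.64, 23514.63, -366.15, -643.01, -802.91]
lasso_coefs = [3608.22, 2052.69, 511.52, -14.05, 23644.65, -354.19, -640.19, -793.01]


def compare_coefficients(ridge, lasso):
    """Return (index, ridge, lasso, difference, relative difference) per position."""
    rows = []
    for i, (r, l) in enumerate(zip(ridge, lasso)):
        diff = l - r
        rel = diff / abs(r)  # assumes the Ridge coefficient is non-zero
        rows.append((i, r, l, diff, rel))
    return rows


rows = compare_coefficients(ridge_coefs, lasso_coefs)

print(f"{'idx':>3} {'ridge':>10} {'lasso':>10} {'diff':>9} {'rel':>8}")
for i, r, l, diff, rel in rows:
    print(f"{i:>3} {r:>10.2f} {l:>10.2f} {diff:>9.2f} {rel:>8.2%}")

# Count coefficients that Lasso set exactly to zero.
n_zero_lasso = sum(1 for c in lasso_coefs if c == 0.0)
print("Lasso coefficients equal to zero:", n_zero_lasso)
```

Looking at both the absolute and the relative difference is useful because a small coefficient can change a lot in relative terms while barely moving in absolute terms.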

---

#### Interpretation

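- Both **Ridge** and **Lasso** produce very similar coefficients. This suggests that at `alpha=1.0` the penalty is weak relative to the data-fit term, and it is consistent with **low multicollinearity** among the features. Note that scikit-learn scales `alpha` differently in the `Ridge` and `Lasso` objectives, so equal `alpha` values do not imply equal regularization strength.
- **Ridge** shrinks all coefficients slightly but **retains all features** (none become zero).
- **Lasso** also shrinks the coefficients, but in this case **none are exactly zero**, so it did not drop any features.
- With a higher regularization strength (`alpha`), **Lasso** could reduce some coefficients **to zero**, performing **automatic feature selection**.
- In general, these regularization techniques can improve **generalization** and reduce the risk of **overfitting**. Confirming that for this model would require comparing scores on the held-out test set.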
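
The effect of a larger `alpha` on Lasso can be checked by fitting the model at several strengths and counting the non-zero coefficients. The sketch below is a minimal illustration on synthetic data from `make_regression`, used as a stand-in because the insurance split file is not reproduced here. The helper `count_nonzero_by_alpha`, the `alpha` grid, and the synthetic data settings are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def count_nonzero_by_alpha(X, y, alphas):
    """Fit one Lasso model per alpha and return (alpha, n_nonzero) pairs."""
    results = []
    for alpha in alphas:
        model = Lasso(alpha=alpha, max_iter=10_000)
        model.fit(X, y)
        results.append((alpha, int(np.count_nonzero(model.coef_))))
    return results


# Synthetic stand-in data: 8 features, of which only 3 carry signal.
X_demo, y_demo = make_regression(
    n_samples=200, n_features=8, n_informative=3, noise=10.0, random_state=0
)
X_demo = StandardScaler().fit_transform(X_demo)

sweep = count_nonzero_by_alpha(X_demo, y_demo, alphas=[0.01, 1.0, 10.0, 1e6])
for alpha, n_nonzero in sweep:
    print(f"alpha={alpha:g}: {n_nonzero} non-zero coefficients")
```

To run the same sweep on the tutorial data, pass `X_train` and `y_train` to `count_nonzero_by_alpha` in place of the synthetic arrays. The `alpha` values at which coefficients drop out depend on the scale of the target, so the grid would likely need adjusting.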