## What are L1 and L2 Regularization?
Regularization is a technique used in machine learning to prevent overfitting by adding a penalty term to the loss function. It discourages complex models by reducing the magnitude of the model coefficients.

🔹 L1 Regularization (Lasso)
Name: Least Absolute Shrinkage and Selection Operator (LASSO)

Penalty term added:
$
\text{Loss}_{\text{L1}} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
$

**Effect:**

- Adds the sum of the absolute values of the coefficients as a penalty.
- Encourages sparsity: some coefficients become exactly zero.
- Useful for feature selection.

**When to use:**

- When you suspect many features are irrelevant.
- When you want to simplify the model by removing features.
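The sparsity effect described above can be seen on a small synthetic problem. The sketch below is an illustration only: the data, the names `X_demo`, `y_demo`, and `lasso_demo`, and the choice `alpha=0.1` are assumptions made for this example, not part of the housing walkthrough.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (assumed for illustration): 10 features,
# but only the first two actually influence the target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

lasso_demo = Lasso(alpha=0.1)
lasso_demo.fit(X_demo, y_demo)

# Coefficients of the eight irrelevant features are set to exactly zero,
# while the two informative ones are kept (slightly shrunk).
print(np.round(lasso_demo.coef_, 3))
print("Non-zero coefficients:", np.count_nonzero(lasso_demo.coef_))
```

Reading off which entries of `coef_` are non-zero is what makes Lasso usable as a feature selector.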

🔹 L2 Regularization (Ridge)
Name: Ridge Regression

Penalty term added:
$
\text{Loss}_{\text{L2}} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} \beta_j^2
$

**Effect:**

- Adds the sum of the squared coefficients as a penalty.
- Shrinks coefficients toward zero, but does not set them exactly to zero.
- Prevents large coefficients, which improves generalization.

**When to use:**

- When you have many features with small or medium effects.
- When all features are useful, but you want to reduce overfitting.
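Why L1 produces exact zeros while L2 does not is easiest to see in a simplified setting. Assuming the features are orthonormal (so that $X^\top X / n = I$), each coefficient can be solved for separately under the two loss functions above: L1 gives soft-thresholding by $\lambda/2$, and L2 gives proportional shrinkage by $1/(1+\lambda)$. The helper names and the example coefficients below are hypothetical.

```python
import numpy as np

def lasso_shrink(b, lam):
    """Soft-thresholding: minimizer of (beta - b)^2 + lam * |beta| per coefficient."""
    return np.sign(b) * np.maximum(np.abs(b) - lam / 2.0, 0.0)

def ridge_shrink(b, lam):
    """Proportional shrinkage: minimizer of (beta - b)^2 + lam * beta^2 per coefficient."""
    return b / (1.0 + lam)

# Hypothetical unpenalized (least-squares) coefficients.
b_ols = np.array([3.0, -2.0, 0.3, -0.1])
lam = 1.0

beta_l1 = lasso_shrink(b_ols, lam)
beta_l2 = ridge_shrink(b_ols, lam)

print("L1:", beta_l1)  # small coefficients are clipped to zero
print("L2:", beta_l2)  # every coefficient is scaled down, none reaches zero
```

Under L1, any coefficient whose magnitude is below the threshold is removed entirely; under L2, a coefficient only becomes zero if it was zero to begin with.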

###  Importing Libraries
In this step, we import the libraries needed for:
- loading the dataset (`fetch_california_housing`),
- splitting data into training and testing sets (`train_test_split`),
- defining regression models (`Ridge`, `Lasso`, `LinearRegression`),
- evaluating the models (`mean_squared_error`, `r2_score`),
- scaling the features (`StandardScaler`),
- working with data structures (`pandas`).


In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge, Lasso, LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler
import pandas as pd



###  Load and Prepare Data
Here, we load the **California Housing Dataset** using `fetch_california_housing`. Each row describes one census block group in California, based on 1990 U.S. census data.
- `X` contains the eight features (e.g., median income, average rooms, population),
- `y` contains the target values (median house value, in units of 100,000 US dollars).


In [None]:
# Step 1: Load the dataset
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

###  Splitting Data into Training and Test Sets
In this step, we divide our dataset into training and test sets. We use `train_test_split` to allocate 80% of the data for training and 20% for testing. The `random_state=42` ensures that the split is reproducible each time the code is run.


In [None]:
# Step 2: Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
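To make the 80/20 split and the role of `random_state` concrete, here is a small sketch on placeholder arrays. The arrays `X_demo` and `y_demo` are stand-ins invented for this example; they are not the housing data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data (assumed for illustration): 1000 rows, 2 features.
X_demo = np.arange(2000).reshape(1000, 2)
y_demo = np.arange(1000)

split_a = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)
split_b = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

X_tr, X_te, y_tr, y_te = split_a
print("Train shape:", X_tr.shape)
print("Test shape: ", X_te.shape)

# With the same random_state, both calls return exactly the same rows.
same_split = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print("Identical splits:", same_split)
```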

### Feature Scaling
Before fitting the models, we standardize the features so that all variables are on the same scale. This matters for Ridge and Lasso because the penalty acts on coefficient magnitudes, which depend on each feature's units. `StandardScaler` removes the mean and scales each feature to unit variance. The scaler is fitted on the training set only; the test set is then transformed with that same fitted scaler, so no information from the test data leaks into preprocessing.


In [None]:
# Step 3: Feature Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
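The fit-on-train, transform-on-test pattern can be checked on synthetic data. In this sketch the two features and all variable names are assumptions chosen for illustration; it shows that the scaled training set has mean 0 and standard deviation 1, and that the test set is scaled with the training statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two synthetic features on very different scales (assumed for illustration).
X_demo = np.column_stack([
    rng.normal(5.0, 2.0, size=500),
    rng.normal(1000.0, 300.0, size=500),
])
X_tr, X_te = train_test_split(X_demo, test_size=0.2, random_state=42)

scaler_demo = StandardScaler()
X_tr_s = scaler_demo.fit_transform(X_tr)  # learns mean and std from the training set
X_te_s = scaler_demo.transform(X_te)      # reuses the training mean and std

print("Train mean:", X_tr_s.mean(axis=0))  # approximately 0
print("Train std: ", X_tr_s.std(axis=0))   # approximately 1
print("Test mean: ", X_te_s.mean(axis=0))  # close to 0, but not exactly 0

# The same transformation written out by hand.
X_te_manual = (X_te - X_tr.mean(axis=0)) / X_tr.std(axis=0)
print("Matches manual scaling:", np.allclose(X_te_manual, X_te_s))
```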

###  Applying L2 Regularization (Ridge Regression)
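In this step, we apply Ridge regression, which uses L2 regularization. Ridge regression controls model complexity by adding a penalty proportional to the sum of the squared coefficients. The `alpha=1.0` parameter sets the strength of the regularization. Note that scikit-learn's `Ridge` minimizes $\|y - Xw\|_2^2 + \alpha \|w\|_2^2$, with the squared error summed rather than averaged over samples, so `alpha` is not on the same scale as $\lambda$ in the formula above. With roughly 16,500 training rows, `alpha=1.0` is therefore a fairly weak penalty.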
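For intuition about what `alpha` does, the Ridge solution has a closed form when no intercept is fitted: $w = (X^\top X + \alpha I)^{-1} X^\top y$. The sketch below compares that formula with scikit-learn's `Ridge` on synthetic data. The data and variable names are assumptions, and `fit_intercept=False` is used only so the two results are directly comparable.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
# Synthetic data (assumed for illustration): 100 rows, 3 features.
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

alpha = 1.0

# Closed form: solve (X^T X + alpha * I) w = X^T y
w_closed = np.linalg.solve(
    X_demo.T @ X_demo + alpha * np.eye(X_demo.shape[1]),
    X_demo.T @ y_demo,
)

ridge_demo = Ridge(alpha=alpha, fit_intercept=False).fit(X_demo, y_demo)

print("Closed form: ", w_closed)
print("scikit-learn:", ridge_demo.coef_)
```

Because `alpha` is added to the diagonal of $X^\top X$, whose entries grow with the number of rows, a fixed `alpha` has less influence as the training set gets larger.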

After fitting the model on the scaled training data, we predict the house prices on the test data and evaluate the model using:
- **MSE (Mean Squared Error)**: A measure of prediction error.
- **R2 Score**: The proportion of variance explained by the model.


In [None]:
# Step 4: Apply L2 Regularization (Ridge Regression)
ridge = Ridge(alpha=1.0)
ridge.fit(X_train_scaled, y_train)
y_pred_ridge = ridge.predict(X_test_scaled)

print("Ridge Regression")
print("MSE:", mean_squared_error(y_test, y_pred_ridge))
print("R2 Score:", r2_score(y_test, y_pred_ridge))

Ridge Regression
MSE: 0.5558548589435988
R2 Score: 0.575815742891367
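Both metrics are simple to compute by hand, which helps when interpreting the numbers above. The sketch below uses four made-up values (`y_true` and `y_hat` are assumptions for illustration) and checks the manual results against scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up targets and predictions, for illustration only.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

# MSE: average squared difference between targets and predictions.
mse_manual = np.mean((y_true - y_hat) ** 2)

# R2: 1 minus the ratio of residual variation to total variation around the mean.
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print("MSE (manual): ", mse_manual)
print("MSE (sklearn):", mean_squared_error(y_true, y_hat))
print("R2 (manual):  ", r2_manual)
print("R2 (sklearn): ", r2_score(y_true, y_hat))
```

An R2 of 0 corresponds to always predicting the mean of the targets, and 1 corresponds to perfect predictions.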


### Applying L1 Regularization (Lasso Regression)
Now, we apply Lasso regression, which uses L1 regularization. Lasso regression adds a penalty proportional to the absolute value of the coefficients. This encourages sparsity, meaning it may shrink some coefficients to zero, effectively performing feature selection.

We evaluate the model's performance using MSE and R2 Score just like with Ridge regression.


In [None]:
# Step 5: Apply L1 Regularization (Lasso Regression)
lasso = Lasso(alpha=0.1)
lasso.fit(X_train_scaled, y_train)
y_pred_lasso = lasso.predict(X_test_scaled)

print("\nLasso Regression")
print("MSE:", mean_squared_error(y_test, y_pred_lasso))
print("R2 Score:", r2_score(y_test, y_pred_lasso))


Lasso Regression
MSE: 0.6796290284328825
R2 Score: 0.48136113250290735
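How many coefficients Lasso keeps depends strongly on `alpha`. As a rough sketch, the loop below refits Lasso for several values on synthetic data in which only two of ten features matter. The data, the alpha grid, and the names are assumptions for illustration; the counts on the housing data will differ.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
# Synthetic data (assumed for illustration): only features 0 and 1 are informative.
X_demo = rng.normal(size=(200, 10))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

nonzero_counts = {}
for alpha in [0.001, 0.1, 1.0, 10.0]:
    model = Lasso(alpha=alpha).fit(X_demo, y_demo)
    nonzero_counts[alpha] = int(np.count_nonzero(model.coef_))
    print(f"alpha={alpha:<6} non-zero coefficients: {nonzero_counts[alpha]}")
```

A very small `alpha` behaves almost like unregularized regression, while a large enough `alpha` removes every feature and leaves only the intercept, so a poor score can simply mean the penalty is too strong for the data.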


###  Applying Linear Regression
Finally, we apply standard Linear Regression without any regularization (L1 or L2). Linear Regression models the relationship between the target and the features without adding any penalty term.

We evaluate this model's performance using MSE and R2 Score.


In [None]:
# Linear Regression
LR = LinearRegression()
LR.fit(X_train_scaled, y_train)
y_pred_lr = LR.predict(X_test_scaled)

print("\nLinear Regression")
print("MSE:", mean_squared_error(y_test, y_pred_lr))
print("R2 Score:", r2_score(y_test, y_pred_lr))


Linear Regression
MSE: 0.5558915986952442
R2 Score: 0.575787706032451
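To compare the three models side by side, the same fit-and-score steps can be wrapped in a loop and collected into a `pandas` table. This is a minimal sketch on synthetic data from `make_regression`, which is an assumption so the example runs without downloading anything; swapping in the scaled housing arrays should work the same way.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumed for illustration).
X_demo, y_demo = make_regression(
    n_samples=500, n_features=8, n_informative=4, noise=10.0, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

scaler_demo = StandardScaler()
X_tr_s = scaler_demo.fit_transform(X_tr)
X_te_s = scaler_demo.transform(X_te)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge (alpha=1.0)": Ridge(alpha=1.0),
    "Lasso (alpha=0.1)": Lasso(alpha=0.1),
}

rows = []
for name, model in models.items():
    model.fit(X_tr_s, y_tr)
    y_hat = model.predict(X_te_s)
    rows.append({
        "model": name,
        "MSE": mean_squared_error(y_te, y_hat),
        "R2": r2_score(y_te, y_hat),
    })

results = pd.DataFrame(rows).set_index("model")
print(results)
```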
