# **Problem Statement**  
## **16. Apply regularization (L1, L2) to a regression model using sklearn.**

Apply L1 (Lasso) and L2 (Ridge) regularization to a regression model using scikit-learn, compare them with ordinary linear regression, and observe the effect on coefficients and model performance.

### Constraints & Example Inputs/Outputs

### Constraints
- All features must be numeric
- The target variable must be continuous
- Regularization strength is controlled by `alpha` (larger values shrink coefficients more)
- Use scikit-learn regression models only

### Example Input:
```python
# X: feature matrix of shape (n_samples, n_features)
# y: continuous target values of shape (n_samples,)
```

### Expected Output:
- Model coefficients
- Mean Squared Error (MSE)
- Observations on coefficient shrinkage


### Solution Approach

### Why Regularization?
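Regularization adds a penalty on coefficient magnitude to the least-squares loss, which helps:
- Prevent overfitting by discouraging large coefficients that fit noise
- Reduce model complexity
- Handle multicollinearity by stabilizing coefficient estimates when features are correlated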
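
As a rough illustration of the multicollinearity point, the sketch below builds two nearly identical features and fits both an unregularized model and a ridge model. The toy dataset, the variable names, and the choice of `alpha=1.0` are assumptions made for this example and are not part of the notebook's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# Two nearly identical features: x2 is x1 plus a tiny perturbation (toy data).
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=1e-3, size=200)
X_collinear = np.column_stack([x1, x2])

# The target depends only on the signal shared by both features.
y_collinear = 3.0 * x1 + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X_collinear, y_collinear)
ridge_demo = Ridge(alpha=1.0).fit(X_collinear, y_collinear)

# Compare the overall size of the two coefficient vectors.
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge_demo.coef_)

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge_demo.coef_)
print("Coefficient norms (OLS, Ridge):", ols_norm, ridge_norm)
```

With features this correlated, the unregularized coefficients typically come out large with opposite signs, while ridge tends to split the effect roughly evenly between the two columns.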

### Types of Regularization
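L1 Regularization (Lasso)
- Penalizes the sum of absolute coefficient values
- Can shrink coefficients to exactly zero
- Performs feature selection as a side effect

L2 Regularization (Ridge)
- Penalizes the sum of squared coefficient values
- Shrinks coefficients smoothly toward zero, but rarely to exactly zero
- Keeps all features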
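
To make the two penalties concrete: scikit-learn's `Ridge` minimizes `||y - Xw||² + alpha * ||w||²`, while `Lasso` minimizes `(1 / (2 * n_samples)) * ||y - Xw||² + alpha * ||w||₁`. Because the loss terms are scaled differently, the same `alpha` value does not mean the same penalty strength in both models. The sketch below checks the ridge objective against its closed-form solution on an invented toy dataset; the helper names `l1_penalty` and `l2_penalty` are hypothetical and exist only for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Toy data (an assumption for this example, not the notebook's dataset).
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([2.0, 0.0, -1.0]) + rng.normal(scale=0.1, size=50)

alpha = 1.0


def l1_penalty(w, alpha):
    """Lasso-style penalty: alpha times the sum of absolute coefficients."""
    return alpha * np.sum(np.abs(w))


def l2_penalty(w, alpha):
    """Ridge-style penalty: alpha times the sum of squared coefficients."""
    return alpha * np.sum(w ** 2)


# Closed-form ridge solution without an intercept:
# w = (X^T X + alpha * I)^(-1) X^T y
w_closed = np.linalg.solve(
    X_demo.T @ X_demo + alpha * np.eye(X_demo.shape[1]),
    X_demo.T @ y_demo,
)

# scikit-learn's Ridge with fit_intercept=False should agree with the formula.
ridge_no_intercept = Ridge(alpha=alpha, fit_intercept=False).fit(X_demo, y_demo)

print("Closed form:", w_closed)
print("sklearn    :", ridge_no_intercept.coef_)
print("L1 penalty :", l1_penalty(w_closed, alpha))
print("L2 penalty :", l2_penalty(w_closed, alpha))
```

Lasso has no comparable closed form in general, which is why it is fitted iteratively.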

### Models Used
- Linear Regression (No regularization)
- Ridge Regression (L2)
- Lasso Regression (L1)

### Solution Code

In [1]:
# Step 1: Import Libraries
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

In [2]:
# Step 2: Generate Sample Dataset
np.random.seed(42)

X = np.random.randn(100, 5)
true_coeffs = np.array([5, 0, 3, 0, 2])
y = X @ true_coeffs + np.random.randn(100) * 2

In [3]:
# Step 3: Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

In [4]:
# Baseline: Linear Regression (No Regularization)
lr = LinearRegression()
lr.fit(X_train, y_train)

y_pred_lr = lr.predict(X_test)
mse_lr = mean_squared_error(y_test, y_pred_lr)

print("Linear Regression Coefficients:", lr.coef_)
print("Linear Regression MSE:", mse_lr)


Linear Regression Coefficients: [5.30744183 0.19042551 3.06806409 0.09101866 1.99500772]
Linear Regression MSE: 4.035770260017848


### Alternative Solution

In [6]:
# Regularized Model 1: Ridge Regression (L2)
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)

y_pred_ridge = ridge.predict(X_test)
mse_ridge = mean_squared_error(y_test, y_pred_ridge)

print("Ridge Coefficients:", ridge.coef_)
print("Ridge MSE:", mse_ridge)


Ridge Coefficients: [5.21559945 0.185848   3.02961452 0.08955526 1.9699969 ]
Ridge MSE: 3.8407881941943236


In [5]:
# Regularized Model 2: Lasso Regression (L1)
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)

y_pred_lasso = lasso.predict(X_test)
mse_lasso = mean_squared_error(y_test, y_pred_lasso)

print("Lasso Coefficients:", lasso.coef_)
print("Lasso MSE:", mse_lasso)


Lasso Coefficients: [5.1474194  0.09055187 2.9786173  0.         1.91561064]
Lasso MSE: 3.994587043748342


### Alternative Approaches

- ElasticNet (L1 + L2 combined)
- Cross-validated regularization (RidgeCV, LassoCV)
- Bayesian Ridge Regression
- Feature selection before regression
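
The first two alternatives can be sketched as follows. This is a minimal example, assuming the same synthetic data recipe as the solution code; the `alpha_grid` values and the `l1_ratio=0.5` setting are arbitrary choices for illustration rather than tuned recommendations.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, LassoCV, RidgeCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Recreate the synthetic dataset used in the solution code.
np.random.seed(42)
X = np.random.randn(100, 5)
true_coeffs = np.array([5, 0, 3, 0, 2])
y = X @ true_coeffs + np.random.randn(100) * 2
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate regularization strengths (an assumed grid).
alpha_grid = np.logspace(-3, 1, 20)

# RidgeCV and LassoCV choose alpha by cross-validation on the training data.
ridge_cv = RidgeCV(alphas=alpha_grid).fit(X_train, y_train)
lasso_cv = LassoCV(alphas=alpha_grid, cv=5).fit(X_train, y_train)

# ElasticNet mixes the L1 and L2 penalties; l1_ratio sets the balance.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_train, y_train)

print("RidgeCV alpha:", ridge_cv.alpha_)
print("LassoCV alpha:", lasso_cv.alpha_)
print("ElasticNet coefficients:", enet.coef_)
print("ElasticNet MSE:", mean_squared_error(y_test, enet.predict(X_test)))
```

Selecting `alpha` by cross-validation avoids hand-picking values such as `1.0` or `0.1`, which is useful because the two models scale their penalties differently.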

### Test Case

In [7]:
# Test Case 1: Compare Coefficients

print("True Coefficients:", true_coeffs)
print("Linear:", lr.coef_)
print("Ridge :", ridge.coef_)
print("Lasso :", lasso.coef_)


True Coefficients: [5 0 3 0 2]
Linear: [5.30744183 0.19042551 3.06806409 0.09101866 1.99500772]
Ridge : [5.21559945 0.185848   3.02961452 0.08955526 1.9699969 ]
Lasso : [5.1474194  0.09055187 2.9786173  0.         1.91561064]
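
The printed arrays can be hard to compare by eye. One possible way to summarize them, sketched below under the same data recipe as the solution code, is to report each model's distance from the true coefficients and its number of exactly-zero coefficients. The `summary` dictionary and its keys are names invented for this example.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Recreate the synthetic dataset used in the solution code.
np.random.seed(42)
X = np.random.randn(100, 5)
true_coeffs = np.array([5, 0, 3, 0, 2])
y = X @ true_coeffs + np.random.randn(100) * 2
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
}

summary = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    summary[name] = {
        # Euclidean distance between estimated and true coefficients.
        "l2_error": float(np.linalg.norm(model.coef_ - true_coeffs)),
        # Number of coefficients that are exactly zero.
        "n_zero": int(np.sum(model.coef_ == 0)),
    }
    print(f"{name:6s} l2_error={summary[name]['l2_error']:.3f} "
          f"n_zero={summary[name]['n_zero']}")
```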


In [8]:
# Test Case 3: Compare MSE

print("MSE Comparison")
print("Linear:", mse_lr)
print("Ridge :", mse_ridge)
print("Lasso :", mse_lasso)


MSE Comparison
Linear: 4.035770260017848
Ridge : 3.8407881941943236
Lasso : 3.994587043748342
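
These three values come from a single test split of 20 samples, so the gaps between them are within what a different split could produce. A hedged way to get a steadier comparison is k-fold cross-validation, sketched below with the same data recipe; the 5-fold setup and `random_state=0` are assumptions for this example.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score

# Recreate the synthetic dataset used in the solution code.
np.random.seed(42)
X = np.random.randn(100, 5)
true_coeffs = np.array([5, 0, 3, 0, 2])
y = X @ true_coeffs + np.random.randn(100) * 2

cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_mse = {}
for name, model in [
    ("Linear", LinearRegression()),
    ("Ridge", Ridge(alpha=1.0)),
    ("Lasso", Lasso(alpha=0.1)),
]:
    # scikit-learn reports negated MSE so that higher is better; flip the sign.
    scores = cross_val_score(
        model, X, y, cv=cv, scoring="neg_mean_squared_error"
    )
    cv_mse[name] = -scores.mean()
    print(f"{name:6s} mean CV MSE: {cv_mse[name]:.3f}")
```

Since the noise added to `y` has variance 4, no model should be expected to score much below an MSE of 4 on unseen data.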


In [9]:
# Test Case 4: Effect of Alpha

for a in [0.01, 0.1, 1, 10]:
    model = Lasso(alpha=a)
    model.fit(X_train, y_train)
    print(f"Alpha={a}, Coefficients={model.coef_}")


Alpha=0.01, Coefficients=[5.29145691 0.18050038 3.05903193 0.0799396  1.98712135]
Alpha=0.1, Coefficients=[5.1474194  0.09055187 2.9786173  0.         1.91561064]
Alpha=1, Coefficients=[3.79423926 0.         2.09474652 0.         1.16839496]
Alpha=10, Coefficients=[0. 0. 0. 0. 0.]
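
The same sweep can be run for both penalties to contrast their behavior: Lasso zeroes out more coefficients as `alpha` grows, while Ridge only shrinks their magnitude. The sketch below assumes the same data recipe as the solution code and adds `alpha=100` to the grid as an extra illustrative value.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split

# Recreate the synthetic dataset and split used in the solution code.
np.random.seed(42)
X = np.random.randn(100, 5)
true_coeffs = np.array([5, 0, 3, 0, 2])
y = X @ true_coeffs + np.random.randn(100) * 2
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

alphas = [0.01, 0.1, 1, 10, 100]
lasso_zeros, ridge_zeros, ridge_norms = [], [], []

for a in alphas:
    lasso_coef = Lasso(alpha=a).fit(X_train, y_train).coef_
    ridge_coef = Ridge(alpha=a).fit(X_train, y_train).coef_

    # Count exactly-zero coefficients and track the ridge coefficient norm.
    lasso_zeros.append(int(np.sum(lasso_coef == 0)))
    ridge_zeros.append(int(np.sum(ridge_coef == 0)))
    ridge_norms.append(float(np.linalg.norm(ridge_coef)))

    print(f"alpha={a:>6}: lasso zeros={lasso_zeros[-1]}, "
          f"ridge zeros={ridge_zeros[-1]}, ridge norm={ridge_norms[-1]:.3f}")
```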


### Expected Outputs
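1. Linear Regression
- Largest coefficients of the three models, since nothing is shrunk
- Small non-zero estimates (0.19 and 0.09) for the two features whose true coefficient is 0
- Highest test MSE in this run (4.04)

2. Ridge
- Slightly smaller coefficients, all still non-zero
- Lowest test MSE in this run (3.84)

3. Lasso
- Some coefficients exactly zero: one at `alpha=0.1`, both noise features at `alpha=1`, and all five at `alpha=10`
- Built-in feature selection
- Test MSE of 3.99 at `alpha=0.1`

The MSE differences come from a single 20-sample test split, so they are small and may not hold on other splits.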

### Complexity Analysis

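Here n is the number of samples, d the number of features, and k the number of coordinate-descent passes. Space complexity refers to the stored coefficients.

| Model             | Time Complexity                       | Space Complexity |
| ----------------- | ------------------------------------- | ---------------- |
| Linear Regression | O(n·d²)                               | O(d)             |
| Ridge             | O(n·d²)                               | O(d)             |
| Lasso             | O(k·n·d) (iterative, k passes)        | O(d)             |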

