# Ridge and Lasso Regression with Regularization

## 📚 Learning Objectives

By completing this notebook, you will:
- Understand L1 (Lasso) and L2 (Ridge) regularization
- Implement Ridge and Lasso regression using scikit-learn
- Compare regularization effects on model coefficients
- Select optimal regularization parameters

## 🔗 Prerequisites

- ✅ Understanding of linear regression
- ✅ Python 3.8+ installed
- ✅ scikit-learn, pandas, numpy, matplotlib

---

## Official Structure Reference

This notebook covers practical activities from **Course 04, Unit 1**:
- Implementing Ridge and Lasso regression with regularization
- **Source:** `DETAILED_UNIT_DESCRIPTIONS.md` - Unit 1 Practical Content

---

## Introduction to Regularization

**Regularization** reduces overfitting by adding a penalty on coefficient size to the least-squares loss:
- **Ridge (L2)**: Penalizes the sum of squared coefficients, shrinking all of them toward zero
- **Lasso (L1)**: Penalizes the sum of absolute coefficient values, which can set some coefficients exactly to zero
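
The two penalties can be written out as objective functions. The forms below follow the objectives documented for scikit-learn's `Ridge` and `Lasso`; the symbols are introduced here for illustration: $w$ is the coefficient vector, $X$ the feature matrix, $y$ the target, $n$ the number of training samples, and $\alpha \ge 0$ the regularization strength (the `alpha` argument).

$$
\text{Ridge:}\quad \min_{w}\; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2
$$

$$
\text{Lasso:}\quad \min_{w}\; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
$$

Because the squared-error term is scaled differently in the two objectives, the same numeric `alpha` does not mean the same amount of regularization for Ridge and Lasso. In both estimators the intercept is fitted separately and is not penalized.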


In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt

print("✅ Libraries imported successfully!")


In [None]:
# Generate sample data with multiple features
np.random.seed(42)
n_samples = 100

# Create five independent standard-normal features
X = np.random.randn(n_samples, 5)
# Target: linear combination of the first three features plus Gaussian noise
y = 2 * X[:, 0] + 1.5 * X[:, 1] - X[:, 2] + 0.5 * np.random.randn(n_samples)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features (important for regularization).
# Fit the scaler on the training set only, then reuse it on the test set.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(f"Training set: {X_train_scaled.shape}")
print(f"Test set: {X_test_scaled.shape}")


In [None]:
# Compare: Ordinary Least Squares (OLS), Ridge, Lasso
print("=" * 60)
print("Comparing Regression Models:")
print("=" * 60)

# OLS (no regularization)
ols = LinearRegression()
ols.fit(X_train_scaled, y_train)
ols_pred = ols.predict(X_test_scaled)
ols_mse = mean_squared_error(y_test, ols_pred)
ols_r2 = r2_score(y_test, ols_pred)

# Ridge Regression (alpha = 1.0)
ridge = Ridge(alpha=1.0)
ridge.fit(X_train_scaled, y_train)
ridge_pred = ridge.predict(X_test_scaled)
ridge_mse = mean_squared_error(y_test, ridge_pred)
ridge_r2 = r2_score(y_test, ridge_pred)

# Lasso Regression (alpha = 0.1)
lasso = Lasso(alpha=0.1)
lasso.fit(X_train_scaled, y_train)
lasso_pred = lasso.predict(X_test_scaled)
lasso_mse = mean_squared_error(y_test, lasso_pred)
lasso_r2 = r2_score(y_test, lasso_pred)

print(f"\nOLS - MSE: {ols_mse:.4f}, R²: {ols_r2:.4f}")
print(f"Ridge - MSE: {ridge_mse:.4f}, R²: {ridge_r2:.4f}")
print(f"Lasso - MSE: {lasso_mse:.4f}, R²: {lasso_r2:.4f}")

# Compare coefficients
print("\nCoefficients Comparison:")
print(f"OLS coefficients: {ols.coef_}")
print(f"Ridge coefficients: {ridge.coef_}")
print(f"Lasso coefficients: {lasso.coef_} (note: some may be zero)")
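
Printed arrays are hard to compare by eye, so a small table can help. The sketch below is one possible way to line up the three coefficient vectors and summarize how much each penalty shrinks them. It regenerates the same kind of synthetic data so it runs on its own. The names `coef_table`, `l1_norms`, `l2_norms`, and `n_zero` are illustrative, and for brevity the models are fitted on all 100 samples rather than on the training split.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data in the style of the notebook: only the first three
# of five features influence the target.
rng = np.random.RandomState(42)
X = rng.randn(100, 5)
y = 2 * X[:, 0] + 1.5 * X[:, 1] - X[:, 2] + 0.5 * rng.randn(100)
X_scaled = StandardScaler().fit_transform(X)

models = {
    "OLS": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
}

# One column per model, one row per feature.
coef_table = pd.DataFrame(
    {name: model.fit(X_scaled, y).coef_ for name, model in models.items()},
    index=[f"x{i}" for i in range(X.shape[1])],
)

l1_norms = coef_table.abs().sum()            # sum of |coefficients| per model
l2_norms = np.sqrt((coef_table ** 2).sum())  # Euclidean norm per model
n_zero = (coef_table == 0).sum()             # coefficients that are exactly zero

print(coef_table.round(3))
print("\nL1 norm per model:\n", l1_norms.round(3))
print("\nL2 norm per model:\n", l2_norms.round(3))
print("\nExactly-zero coefficients per model:\n", n_zero)
```

Ridge should show a smaller L2 norm than OLS, and Lasso a smaller L1 norm. Whether Lasso zeroes out the two irrelevant features depends on `alpha` and on the noise in the sample.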


## Part 2: Tuning Regularization Parameter (Alpha)

Let's see how different alpha values affect model performance and coefficients.


In [None]:
# Test different alpha values
alphas = np.logspace(-4, 2, 50)
ridge_coefs = []
lasso_coefs = []
ridge_scores = []
lasso_scores = []

for alpha in alphas:
    # Ridge
    ridge = Ridge(alpha=alpha)
    ridge.fit(X_train_scaled, y_train)
    ridge_coefs.append(ridge.coef_)
    ridge_scores.append(ridge.score(X_test_scaled, y_test))

    # Lasso
    lasso = Lasso(alpha=alpha)
    lasso.fit(X_train_scaled, y_train)
    lasso_coefs.append(lasso.coef_)
    lasso_scores.append(lasso.score(X_test_scaled, y_test))

ridge_coefs = np.array(ridge_coefs)
lasso_coefs = np.array(lasso_coefs)

# Note: these R² values are measured on the test set, so the "best" alpha is
# illustrative only. In practice, choose alpha by cross-validation on the training data.
print(f"Tested {len(alphas)} alpha values from {alphas[0]:.4f} to {alphas[-1]:.2f}")
print(f"Best Ridge R²: {max(ridge_scores):.4f} at alpha={alphas[np.argmax(ridge_scores)]:.4f}")
print(f"Best Lasso R²: {max(lasso_scores):.4f} at alpha={alphas[np.argmax(lasso_scores)]:.4f}")
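
The sweep above scores every alpha on the test set, so the test set ends up both choosing alpha and evaluating the result. A common alternative is to pick alpha by cross-validation on the training data and touch the test set only once. The sketch below shows one way to do that with scikit-learn's `RidgeCV` and `LassoCV`. The 5-fold setting and `max_iter` value are assumptions for illustration, not part of the original notebook, and the data is regenerated so the block runs on its own.

```python
import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data in the style of the notebook.
rng = np.random.RandomState(42)
X = rng.randn(100, 5)
y = 2 * X[:, 0] + 1.5 * X[:, 1] - X[:, 2] + 0.5 * rng.randn(100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

alphas = np.logspace(-4, 2, 50)

# The scaler is fitted on the training split only. The internal CV folds
# share that scaler, which is a small simplification.
ridge_cv = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas, cv=5))
lasso_cv = make_pipeline(
    StandardScaler(), LassoCV(alphas=alphas, cv=5, max_iter=10_000)
)
ridge_cv.fit(X_train, y_train)
lasso_cv.fit(X_train, y_train)

# The last pipeline step is the fitted CV estimator; alpha_ is its chosen value.
best_ridge_alpha = ridge_cv[-1].alpha_
best_lasso_alpha = lasso_cv[-1].alpha_

# The test set is used once, for the final estimate.
ridge_test_r2 = ridge_cv.score(X_test, y_test)
lasso_test_r2 = lasso_cv.score(X_test, y_test)

print(f"RidgeCV alpha: {best_ridge_alpha:.4f}, test R²: {ridge_test_r2:.4f}")
print(f"LassoCV alpha: {best_lasso_alpha:.4f}, test R²: {lasso_test_r2:.4f}")
```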


## Summary

### Key Concepts:
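1. **Ridge Regression (L2)**: Shrinks coefficients toward zero but does not set them exactly to zero
2. **Lasso Regression (L1)**: Can set coefficients exactly to zero, which acts as feature selection
3. **Alpha parameter**: Controls regularization strength (higher = more regularization; alpha = 0 corresponds to ordinary least squares)
4. **Feature scaling**: Standardize features before regularizing, because the penalty acts on coefficient size and a coefficient's size depends on the units of its feature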
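
The point about feature scaling can be checked directly. The sketch below, under the assumption of a small synthetic dataset, fits Ridge twice: once on the original features and once after multiplying one feature by 1000 (the same information in different units). Without scaling the predictions change. With a `StandardScaler` in a pipeline they should agree up to floating-point error. The variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = 2 * X[:, 0] + 1.5 * X[:, 1] - X[:, 2] + 0.5 * rng.randn(100)

# Same information, different units for feature 0.
X_rescaled = X.copy()
X_rescaled[:, 0] *= 1000

# Ridge on raw features: the penalty hits feature 0 differently in each version.
raw_a = Ridge(alpha=10.0).fit(X, y).predict(X)
raw_b = Ridge(alpha=10.0).fit(X_rescaled, y).predict(X_rescaled)
raw_gap = np.max(np.abs(raw_a - raw_b))

# Ridge after standardization: both versions map to the same scaled features.
scaled_a = make_pipeline(StandardScaler(), Ridge(alpha=10.0)).fit(X, y).predict(X)
scaled_b = (
    make_pipeline(StandardScaler(), Ridge(alpha=10.0))
    .fit(X_rescaled, y)
    .predict(X_rescaled)
)
scaled_gap = np.max(np.abs(scaled_a - scaled_b))

print(f"Largest prediction difference without scaling: {raw_gap:.4f}")
print(f"Largest prediction difference with scaling:    {scaled_gap:.2e}")
```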

### Applications:
- Preventing overfitting
- Handling multicollinearity
- Feature selection (Lasso)
- Improving model generalization
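
To illustrate the multicollinearity item, the following sketch builds two nearly identical features (an assumption made purely for demonstration) and compares OLS with Ridge. With features this correlated, OLS has little information to decide how to split the effect between them, so its individual coefficients can be unstable. Ridge tends to spread the effect across both while keeping their sum close to the true value of 3.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.RandomState(0)
n = 200
x1 = rng.randn(n)
x2 = x1 + 0.01 * rng.randn(n)      # almost an exact copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + 0.5 * rng.randn(n)    # only x1 enters the true model

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)

print(f"OLS coefficients:   {ols.coef_}  (norm {ols_norm:.3f})")
print(f"Ridge coefficients: {ridge.coef_}  (norm {ridge_norm:.3f})")
print(f"Sum of Ridge coefficients: {ridge.coef_.sum():.3f}")
```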

**Reference:** Course 04, Unit 1: "Implementing Ridge and Lasso regression with regularization"
