## Reducing Overfitting in ML Models with L1/L2 Regularization

Overfitting happens when a model learns the noise and random fluctuations in the training data instead of the underlying patterns. One of the standard ways to address this is regularization, especially L1 (Lasso) and L2 (Ridge) penalties.

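L1 Regularization (Lasso): Adds a penalty proportional to the sum of the absolute weights. This can shrink some weights to exactly zero, performing feature selection.
L2 Regularization (Ridge): Adds a penalty proportional to the sum of the squared weights. This shrinks weights toward zero smoothly, keeping all features but reducing their impact.
ElasticNet: Combines both penalties (L1 + L2).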
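
To make the three penalties concrete, here is a minimal sketch of the term each method adds to the loss. The helper names (`l1_penalty`, `l2_penalty`, `elastic_net_penalty`) and the `l1_ratio` mixing parameter are illustrative assumptions. scikit-learn scales its objectives somewhat differently, so this shows the idea rather than the library's exact formula.

```python
import numpy as np

def l1_penalty(w, alpha):
    """L1 (Lasso) term: alpha times the sum of absolute weights."""
    return alpha * np.sum(np.abs(w))

def l2_penalty(w, alpha):
    """L2 (Ridge) term: alpha times the sum of squared weights."""
    return alpha * np.sum(np.square(w))

def elastic_net_penalty(w, alpha, l1_ratio=0.5):
    """Weighted mix of both terms; l1_ratio=1 is pure L1, 0 is pure L2."""
    return l1_ratio * l1_penalty(w, alpha) + (1 - l1_ratio) * l2_penalty(w, alpha)

w = np.array([3.0, -4.0, 0.0])
print(l1_penalty(w, alpha=1.0))           # 7.0
print(l2_penalty(w, alpha=1.0))           # 25.0
print(elastic_net_penalty(w, alpha=1.0))  # 16.0
```

Note how the L2 term grows much faster for large weights (25 vs. 7 here), which is why Ridge mainly tames large coefficients while the L1 term keeps pushing small ones all the way to zero.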

In [1]:
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

In [5]:
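# Synthetic data generation: 100 samples, 10 features
np.random.seed(42)
X = np.random.randn(100, 10)
# Only the first feature carries signal; the other nine are pure noise
y = X[:, 0] * 5 + np.random.randn(100) * 5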

In [11]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Without regularization
lr = LinearRegression()
lr.fit(X_train,y_train)

print("Linear Regression MSE:", mean_squared_error(y_test, lr.predict(X_test)))
print("Linear Regression R2 Score:", r2_score(y_test, lr.predict(X_test)))

Linear Regression MSE: 29.512824624014975
Linear Regression R2 Score: 0.23523598629552778
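
A test score on its own does not show whether the model overfits. One common check is to compare the error on the training set with the error on the test set. The following is a minimal sketch assuming the same synthetic data and split as above; the names `train_mse` and `test_mse` are introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Same synthetic setup as in the cells above
np.random.seed(42)
X = np.random.randn(100, 10)
y = X[:, 0] * 5 + np.random.randn(100) * 5
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

lr = LinearRegression().fit(X_train, y_train)

# Error on data the model has seen vs. data it has not seen
train_mse = mean_squared_error(y_train, lr.predict(X_train))
test_mse = mean_squared_error(y_test, lr.predict(X_test))

print("Train MSE:", train_mse)
print("Test MSE:", test_mse)
print("Gap (test - train):", test_mse - train_mse)
```

A test error that is clearly larger than the training error suggests the model has fitted some of the noise in the training data.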


In [19]:
# With L2 regularization (Ridge)
ridge = Ridge(alpha=1.0)  # alpha = λ
ridge.fit(X_train, y_train)
print("Ridge Regression MSE:", mean_squared_error(y_test, ridge.predict(X_test)))
print("Ridge Regression R2 Score:", r2_score(y_test, ridge.predict(X_test)))


Ridge Regression MSE: 29.272298611027974
Ridge Regression R2 Score: 0.24146872211243742
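
To see how the penalty shrinks the weights, here is a sketch of the ridge closed-form solution, w = (XᵀX + αI)⁻¹Xᵀy, written in plain NumPy. It omits the intercept for brevity (an assumption; `Ridge` fits one by default), `ridge_closed_form` is a hypothetical helper name, and the alpha values are arbitrary illustrative choices.

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y for w (no intercept term)."""
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Same synthetic data as in the cells above
np.random.seed(42)
X = np.random.randn(100, 10)
y = X[:, 0] * 5 + np.random.randn(100) * 5

# alpha = 0 is ordinary least squares; larger alpha means stronger shrinkage
alphas = [0.0, 1.0, 10.0, 100.0, 1000.0]
norms = [np.linalg.norm(ridge_closed_form(X, y, a)) for a in alphas]

for a, n in zip(alphas, norms):
    print(f"alpha={a:>7}: ||w|| = {n:.3f}")
```

The norm of the weight vector decreases as alpha grows, but the individual weights generally do not become exactly zero.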


In [21]:
# With L1 regularization (Lasso)
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)
print("Lasso Regression MSE:", mean_squared_error(y_test, lasso.predict(X_test)))
print("Lasso Regression R2 Score:", r2_score(y_test, lasso.predict(X_test)))
print("Lasso selected features:", np.sum(lasso.coef_ != 0))

Lasso Regression MSE: 28.504731313528236
Lasso Regression R2 Score: ≈ 0.2614 (recomputed as 1 - MSE / Var(y_test) from the MSE and R2 values printed above)
Lasso selected features: 8
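
How many features Lasso keeps depends strongly on `alpha`. The sketch below refits Lasso on the same synthetic data for a few values and counts the non-zero coefficients. The alpha grid and the name `nonzero_counts` are illustrative choices, not part of the original experiment.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# Same synthetic setup as in the cells above
np.random.seed(42)
X = np.random.randn(100, 10)
y = X[:, 0] * 5 + np.random.randn(100) * 5
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

alphas = [0.01, 0.1, 0.5, 1.0, 100.0]
nonzero_counts = []
for alpha in alphas:
    model = Lasso(alpha=alpha).fit(X_train, y_train)
    # Count the coefficients that Lasso did not shrink to exactly zero
    nonzero_counts.append(int(np.sum(model.coef_ != 0)))
    print(f"alpha={alpha}: {nonzero_counts[-1]} non-zero coefficients")
```

A small alpha leaves most coefficients in place, while a very large alpha removes every feature and the model falls back to predicting a constant.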


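Summary:
Linear Regression has no penalty, so with nine pure-noise features it can fit noise in the training data (test MSE 29.51, R2 ≈ 0.235 here).
Ridge reduces overfitting by penalizing large weights. On this data the gain is small (test MSE 29.27 vs. 29.51).
Lasso goes further and can eliminate irrelevant features by shrinking their weights to exactly zero. With alpha=0.1 it kept 8 of the 10 features and reached the lowest test MSE of the three models (28.50).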

Use Cases:
L1 (Lasso): High-dimensional data, when you want feature selection.
L2 (Ridge): When all features are useful but you want to reduce overfitting.
ElasticNet: When you want the benefits of both, for example sparse models on data with many correlated features.
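
ElasticNet is described above but not run in the cells. A minimal sketch on the same synthetic data could look like the following. The settings `alpha=0.1` and `l1_ratio=0.5` are arbitrary illustrative values, not tuned ones, and the names `enet`, `enet_mse` and `enet_r2` are introduced here.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same synthetic setup as in the cells above
np.random.seed(42)
X = np.random.randn(100, 10)
y = X[:, 0] * 5 + np.random.randn(100) * 5
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# l1_ratio controls the mix: 1.0 is pure L1 (Lasso), 0.0 is pure L2
enet = ElasticNet(alpha=0.1, l1_ratio=0.5)
enet.fit(X_train, y_train)

y_pred = enet.predict(X_test)
enet_mse = mean_squared_error(y_test, y_pred)
enet_r2 = r2_score(y_test, y_pred)

print("ElasticNet MSE:", enet_mse)
print("ElasticNet R2 Score:", enet_r2)
print("ElasticNet selected features:", np.sum(enet.coef_ != 0))
```

In practice both `alpha` and `l1_ratio` are usually chosen by cross-validation rather than fixed by hand.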