# 🔹 PART E: REGULARIZATION TECHNIQUES (RIDGE & LASSO REGRESSION)

The objective of this section is to apply two regularization techniques, Ridge and Lasso regression, to improve model generalization and handle multicollinearity among input features. The performance of these models is compared with standard multiple linear regression.

## Why Regularization is Required

In multiple linear regression, the presence of correlated input features can lead to unstable coefficient estimates and overfitting. Regularization addresses this issue by adding a penalty term to the loss function, thereby constraining the magnitude of regression coefficients.
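
The penalty term can be written out explicitly. As a sketch, using the form in which scikit-learn documents its `Ridge` and `Lasso` objectives (the symbols $w$ for the coefficient vector, $\alpha$ for the regularization strength, and $n$ for the number of training samples are introduced here for illustration; the unpenalized intercept is omitted):

$$
\text{Ridge:}\quad \min_{w}\; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2
$$

$$
\text{Lasso:}\quad \min_{w}\; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
$$

Setting $\alpha = 0$ recovers ordinary least squares in both cases. Because the data-fit term is scaled differently in the two objectives, the same numeric value of $\alpha$ does not represent the same penalty strength for Ridge and Lasso.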

In [1]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import mean_squared_error, r2_score

In [2]:
df = pd.read_csv("data/Student_Performance.csv")
df.head()

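| | Hours Studied | Previous Scores | Extracurricular Activities | Sleep Hours | Sample Question Papers Practiced | Performance Index |
|---|---|---|---|---|---|---|
| 0 | 7 | 99 | Yes | 9 | 1 | 91.0 |
| 1 | 4 | 82 | No | 4 | 2 | 65.0 |
| 2 | 8 | 51 | Yes | 7 | 2 | 45.0 |
| 3 | 5 | 52 | Yes | 5 | 2 | 36.0 |
| 4 | 7 | 75 | No | 8 | 5 | 66.0 |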


## Feature Selection

Only numeric features are considered in this section to clearly observe the effect of regularization:

- Hours Studied

- Previous Scores

- Sleep Hours

- Sample Question Papers Practiced

The target variable is Performance Index, which is continuous in nature.
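
The numeric columns could also be picked programmatically rather than listed by hand. The following is a minimal sketch, assuming a small frame built from the five rows shown by `df.head()` above; the names `sample`, `target`, and `numeric_features` are illustrative and do not appear in the notebook.

```python
import pandas as pd

# The five rows displayed by df.head() above, rebuilt as a small frame.
sample = pd.DataFrame({
    "Hours Studied": [7, 4, 8, 5, 7],
    "Previous Scores": [99, 82, 51, 52, 75],
    "Extracurricular Activities": ["Yes", "No", "Yes", "Yes", "No"],
    "Sleep Hours": [9, 4, 7, 5, 8],
    "Sample Question Papers Practiced": [1, 2, 2, 2, 5],
    "Performance Index": [91.0, 65.0, 45.0, 36.0, 66.0],
})

target = "Performance Index"

# Keep numeric columns only, then drop the target so it is not used as a feature.
numeric_features = sample.select_dtypes(include="number").drop(columns=[target])

print(list(numeric_features.columns))
```

The categorical column `Extracurricular Activities` is excluded automatically because its values are strings.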

In [3]:
X = df[
    [
        "Hours Studied",
        "Previous Scores",
        "Sleep Hours",
        "Sample Question Papers Practiced"
    ]
]

y = df["Performance Index"]

In [4]:
# Hold out 20% of the rows for testing; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

## Feature Scaling

Before applying Ridge and Lasso regression, the input features are standardized using StandardScaler.
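This step is necessary because the penalty depends on coefficient magnitude, and coefficient magnitude depends on the units of each feature. Without scaling, a feature measured on a small numeric scale needs a larger coefficient to have the same effect and is therefore penalized more heavily than an equally informative feature measured on a large scale.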

In [5]:
scaler = StandardScaler()

# Fit the scaler on the training split only, then reuse its statistics for the test split.
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
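
As a sketch of what the scaling cell above computes, the standardization can be reproduced by hand with NumPy. The data here are synthetic (two hypothetical features on different scales), and the names `manual_train` and `manual_test` are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical features on very different scales (for example, hours vs. scores).
X_train = np.column_stack([rng.uniform(1, 9, 200), rng.uniform(40, 99, 200)])
X_test = np.column_stack([rng.uniform(1, 9, 50), rng.uniform(40, 99, 50)])

scaler = StandardScaler().fit(X_train)

# StandardScaler computes z = (x - mean) / std with the population std (ddof=0).
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
manual_train = (X_train - mean) / std
manual_test = (X_test - mean) / std  # training statistics are reused for the test data

print(np.allclose(manual_train, scaler.transform(X_train)))
print(np.allclose(manual_test, scaler.transform(X_test)))
```

Reusing the training mean and standard deviation for the test split keeps information about the test data out of the fitted model.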

## Ridge Regression (L2 Regularization)

Ridge regression adds an L2 penalty to the loss function, which shrinks coefficient values but does not force them to become zero. This helps in reducing model variance while retaining all input features.
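
To make the shrinkage concrete, the Ridge solution has a closed form that can be sketched in NumPy. This is an illustration on synthetic data with two nearly collinear columns, assuming the objective "sum of squared errors plus α times the squared L2 norm" with an unpenalized intercept; `ridge_closed_form` is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    """Solve min ||y - b - Xw||^2 + alpha * ||w||^2 with an unpenalized intercept b."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # centering removes the intercept from the problem
    yc = y - y_mean
    n_features = X.shape[1]
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ w
    return w, intercept

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 1] + 0.05 * rng.normal(size=200)  # make two columns nearly collinear
y = X @ np.array([3.0, 2.0, 2.0]) + rng.normal(scale=0.5, size=200)

for alpha in [0.0, 1.0, 10.0, 100.0]:
    w, b = ridge_closed_form(X, y, alpha)
    print(alpha, np.round(w, 3), round(float(np.linalg.norm(w)), 3))
```

The coefficient norm decreases as α grows, but no coefficient becomes exactly zero, which is the behavior described above.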

The model is trained using a fixed regularization parameter (α = 1.0), and its performance is evaluated using MSE, RMSE, and R² score.
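
A fixed α is one choice; the value could instead be selected by cross-validation. The following is a minimal sketch using scikit-learn's `RidgeCV` and `LassoCV` on synthetic data. The coefficient values, the candidate grid `alphas`, and the 5-fold setting are assumptions for illustration, not settings used in this notebook.

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

rng = np.random.default_rng(7)

# Synthetic, roughly standardized features with hypothetical true coefficients.
X = rng.normal(size=(400, 4))
y = 55.0 + X @ np.array([7.0, 17.0, 0.8, 0.0]) + rng.normal(scale=2.0, size=400)

# Candidate regularization strengths on a log scale.
alphas = np.logspace(-3, 2, 20)

ridge_cv = RidgeCV(alphas=alphas, cv=5).fit(X, y)
lasso_cv = LassoCV(alphas=alphas, cv=5).fit(X, y)

print("Ridge alpha:", ridge_cv.alpha_)
print("Lasso alpha:", lasso_cv.alpha_, "nonzero:", np.count_nonzero(lasso_cv.coef_))
```

Both estimators expose the selected value as `alpha_` after fitting.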

In [6]:
ridge = Ridge(alpha=1.0)
ridge.fit(X_train_scaled, y_train)

y_pred_ridge = ridge.predict(X_test_scaled)

In [7]:
ridge_mse = mean_squared_error(y_test, y_pred_ridge)
ridge_rmse = np.sqrt(ridge_mse)
ridge_r2 = r2_score(y_test, y_pred_ridge)

ridge_mse, ridge_rmse, ridge_r2

(4.182878296670286, np.float64(2.045208619351651), 0.9887127730821486)

In [8]:
ridge_coefficients = pd.Series(
    ridge.coef_,
    index=X.columns
)

ridge_coefficients

Hours Studied                        7.401455
Previous Scores                     17.635881
Sleep Hours                          0.803783
Sample Question Papers Practiced     0.548510
dtype: float64

## Lasso Regression (L1 Regularization)

Lasso regression applies an L1 penalty, which can shrink some coefficients to exactly zero. As a result, Lasso performs implicit feature selection by removing less important features from the model.
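
One way to see how an L1 penalty produces exact zeros is the textbook special case of standardized, mutually uncorrelated features, where the Lasso solution (with the squared error scaled by 1/(2n)) is the soft-thresholded least-squares solution. The sketch below illustrates that special case only; real features are rarely exactly uncorrelated, and the coefficient values are hypothetical rather than taken from the dataset.

```python
import numpy as np

def soft_threshold(w, alpha):
    """Shrink each coefficient toward zero by alpha; set it to zero if |w| <= alpha."""
    return np.sign(w) * np.maximum(np.abs(w) - alpha, 0.0)

# Illustrative unpenalized coefficients (hypothetical, not taken from the dataset).
w_ols = np.array([7.4, 17.6, 0.8, 0.05, -0.3])

for alpha in [0.1, 0.5, 1.0]:
    shrunk = soft_threshold(w_ols, alpha)
    print(alpha, shrunk, "nonzero:", np.count_nonzero(shrunk))
```

Under this approximation, a coefficient is removed only when its unpenalized magnitude is at most α; larger coefficients are reduced by α and kept.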

The model is trained using α = 0.1, a mild penalty; the coefficients printed below show whether any feature is dropped at this strength.

In [9]:
lasso = Lasso(alpha=0.1)
lasso.fit(X_train_scaled, y_train)

y_pred_lasso = lasso.predict(X_test_scaled)

In [10]:
lasso_mse = mean_squared_error(y_test, y_pred_lasso)
lasso_rmse = np.sqrt(lasso_mse)
lasso_r2 = r2_score(y_test, y_pred_lasso)

lasso_mse, lasso_rmse, lasso_r2

(4.265685489328157, np.float64(2.0653535991031067), 0.9884893232211512)

In [11]:
lasso_coefficients = pd.Series(
    lasso.coef_,
    index=X.columns
)

lasso_coefficients

Hours Studied                        7.303417
Previous Scores                     17.538040
Sleep Hours                          0.704600
Sample Question Papers Practiced     0.451049
dtype: float64

## Coefficient Comparison

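The coefficients obtained from Ridge and Lasso regression are compared to analyze the effect of regularization.
Ridge regression shrinks all coefficients without zeroing any, while Lasso regression can eliminate weak predictors by setting their coefficients to exactly zero. With α = 0.1 on this dataset, however, no coefficient is zeroed: all four features remain in the Lasso model, and each Lasso coefficient is roughly 0.1 smaller than its Ridge counterpart.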

In [12]:
coeff_df = pd.DataFrame({
    "Ridge Coefficient": ridge_coefficients,
    "Lasso Coefficient": lasso_coefficients
})

coeff_df

| | Ridge Coefficient | Lasso Coefficient |
|---|---|---|
| Hours Studied | 7.401455 | 7.303417 |
| Previous Scores | 17.635881 | 17.538040 |
| Sleep Hours | 0.803783 | 0.704600 |
| Sample Question Papers Practiced | 0.548510 | 0.451049 |
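
How many coefficients Lasso sets to zero depends on α. The following is a minimal sketch on synthetic data that sweeps α and counts the surviving coefficients; the true coefficients, noise level, and α grid are hypothetical choices for illustration, and `nonzero_counts` is an illustrative name.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples = 500

# Synthetic features drawn from a standard normal, so they are roughly standardized.
X = rng.normal(size=(n_samples, 4))
true_coef = np.array([7.0, 17.0, 0.8, 0.0])  # hypothetical; the last feature is pure noise
y = 50.0 + X @ true_coef + rng.normal(scale=2.0, size=n_samples)

nonzero_counts = {}
for alpha in [0.01, 0.1, 1.0, 5.0, 30.0]:
    model = Lasso(alpha=alpha).fit(X, y)
    nonzero_counts[alpha] = int(np.count_nonzero(model.coef_))
    print(f"alpha={alpha:<5} coef={np.round(model.coef_, 3)} nonzero={nonzero_counts[alpha]}")
```

Small penalties keep the informative features, while a penalty larger than every feature's association with the target removes all of them.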


## Performance Comparison

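Both models are evaluated on the test set using MSE, RMSE, and R². In general, Ridge regression tends to give more stable coefficient estimates when features are correlated, whereas Lasso regression can yield a simpler, more interpretable model when it zeroes out coefficients. On this dataset the two are nearly identical: Ridge has a slightly lower MSE (4.18 vs. 4.27), and Lasso retains all four features.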

In [13]:
comparison_df = pd.DataFrame({
    "Model": ["Ridge Regression", "Lasso Regression"],
    "MSE": [ridge_mse, lasso_mse],
    "RMSE": [ridge_rmse, lasso_rmse],
    "R2 Score": [ridge_r2, lasso_r2]
})

comparison_df

| | Model | MSE | RMSE | R2 Score |
|---|---|---|---|---|
| 0 | Ridge Regression | 4.182878 | 2.045209 | 0.988713 |
| 1 | Lasso Regression | 4.265685 | 2.065354 | 0.988489 |
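
For a reference point against an unregularized fit, the same metrics can be computed for plain multiple linear regression alongside Ridge and Lasso. The sketch below uses synthetic data with hypothetical coefficients rather than `Student_Performance.csv`, so its numbers are not comparable to the table above; `baseline_df` is an illustrative name.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for the four numeric features (hypothetical coefficients).
X = rng.normal(size=(1000, 4))
y = 55.0 + X @ np.array([7.0, 17.0, 0.8, 0.5]) + rng.normal(scale=2.0, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge Regression": Ridge(alpha=1.0),
    "Lasso Regression": Lasso(alpha=0.1),
}

rows = []
for name, model in models.items():
    # Fit on the scaled training split and score on the scaled test split.
    pred = model.fit(X_train_s, y_train).predict(X_test_s)
    mse = mean_squared_error(y_test, pred)
    rows.append({"Model": name, "MSE": mse, "RMSE": np.sqrt(mse), "R2 Score": r2_score(y_test, pred)})

baseline_df = pd.DataFrame(rows)
print(baseline_df)
```

Including the unregularized model in the same loop keeps the preprocessing and metrics identical across all three fits.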


## Conclusion

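Regularization is a standard way to address multicollinearity and overfitting in linear regression models. Ridge regression improves coefficient stability, while Lasso regression can aid feature selection by driving weak coefficients to zero. On this dataset both models reach a test R² of about 0.989 and Lasso with α = 0.1 retains all four features, so the two are practically equivalent here. In general, the choice between these methods depends on the trade-off between model simplicity and predictive performance.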