# Ridge Regression

Ridge regression is a linear regression method that adds an L2 penalty to the model's coefficients to reduce overfitting by shrinking the weights toward zero.
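
The shrinkage effect can be seen in a small experiment. The sketch below uses synthetic data (an assumption for illustration, not this notebook's dataset) and fits scikit-learn's `Ridge`, which minimizes `||y - Xw||^2 + alpha * ||w||^2`, at several values of `alpha`. The names `X_demo`, `y_demo`, and `true_w` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
true_w = np.array([3.0, -2.0, 0.5, 0.0, 1.5])
y_demo = X_demo @ true_w + rng.normal(scale=0.5, size=200)

# Fit Ridge with increasing penalty strength and record the coefficient norm.
alphas = [0.01, 1.0, 100.0, 10000.0]
coef_norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X_demo, y_demo)
    coef_norms.append(float(np.linalg.norm(model.coef_)))
    print(f"alpha={alpha:>8}: ||w||_2 = {coef_norms[-1]:.4f}")
```

The printed norms should fall as `alpha` grows: a larger penalty pulls the weights closer to zero, trading some bias for lower variance.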

## Step 1: Import packages

In [1]:
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
import numpy as np
import pickle

## Step 2: Load preprocessed data

Load the previously preprocessed and scaled training, validation, and test splits, along with the test dates and the fitted scaler, from a serialized pickle file.

In [2]:
with open("preprocessed_data.pkl", "rb") as f:
    data = pickle.load(f)

X_train = data["X_train"]
X_val   = data["X_val"]
X_test  = data["X_test"]
y_train = data["y_train"]
y_val   = data["y_val"]
y_test  = data["y_test"]
test_dates = data["test_dates"]
scaler  = data["scaler"]
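
A missing key in the pickle file would only surface as a `KeyError` partway through the cell above. One possible guard, sketched here, checks for all expected keys right after loading. The helper `load_splits`, the constant `REQUIRED_KEYS`, and the temporary `demo.pkl` file are hypothetical; the round trip uses placeholder values rather than real data.

```python
import os
import pickle
import tempfile

# Keys the notebook reads from the pickle file.
REQUIRED_KEYS = {
    "X_train", "X_val", "X_test",
    "y_train", "y_val", "y_test",
    "test_dates", "scaler",
}

def load_splits(path):
    """Load a pickled dict and fail early if any expected key is absent."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise KeyError(f"missing keys in {path}: {sorted(missing)}")
    return data

# Round trip with placeholder values to show the helper in use.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pkl")
    with open(demo_path, "wb") as f:
        pickle.dump({key: None for key in REQUIRED_KEYS}, f)
    loaded = load_splits(demo_path)

print(sorted(loaded))
```

Because `pickle.load` can execute arbitrary code, it should only be used on files from a trusted source, such as the output of your own preprocessing step.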


## Step 3: Check dataset shapes

Print the dataset shapes to confirm that the feature matrices and target arrays have the expected dimensions.

In [3]:
print("X_train:", X_train.shape)
print("y_train:", y_train.shape)

print("X_val:", X_val.shape)
print("y_val:", y_val.shape)

print("X_test:", X_test.shape)
print("y_test:", y_test.shape)


X_train: (997079, 21)
y_train: (997079, 1)
X_val: (213660, 21)
y_val: (213660, 1)
X_test: (213661, 21)
y_test: (213661, 1)
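
Reading the shapes by eye works, but the same check can be made automatic. The sketch below asserts that each feature matrix and its target agree on the number of rows and that the feature count is as expected. The helper `check_split_shapes` is hypothetical, and the small zero arrays are stand-ins that mirror the layout printed above (21 features, column-vector target).

```python
import numpy as np

def check_split_shapes(X, y, n_features, name):
    """Assert that X is 2-D, matches y in row count, and has n_features columns."""
    assert X.ndim == 2, f"{name}: X must be 2-D, got {X.ndim}-D"
    assert X.shape[0] == y.shape[0], (
        f"{name}: {X.shape[0]} rows in X vs {y.shape[0]} in y"
    )
    assert X.shape[1] == n_features, (
        f"{name}: expected {n_features} features, got {X.shape[1]}"
    )
    return True

# Stand-in arrays, assumed for illustration only.
X_small = np.zeros((10, 21))
y_small = np.zeros((10, 1))
shapes_ok = check_split_shapes(X_small, y_small, n_features=21, name="train")
print(shapes_ok)
```

In the notebook, the same call could be repeated for the train, validation, and test splits in place of the manual comparison.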


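## Step 4: Train and evaluate Ridge Regression

Fit a Ridge model with `alpha=1.0` on the training set and report the root mean squared error (RMSE) of its predictions on the test set.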

In [4]:
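# Fit Ridge with the default regularization strength (alpha=1.0).
ridge_model = Ridge(alpha=1.0)
ridge_model.fit(X_train, y_train)

# Predict on the held-out test set.
ridge_pred = ridge_model.predict(X_test)

# RMSE is reported in the units of y_test as stored in the pickle file.
ridge_rmse = np.sqrt(mean_squared_error(y_test, ridge_pred))
print(f"RMSE for Ridge Regression: {ridge_rmse:.4f}")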


RMSE for Ridge Regression: 0.1305
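
The validation split loaded in Step 2 is not used above. One common use for it is choosing `alpha`: fit one model per candidate value on the training set and keep the value with the lowest validation RMSE. The sketch below shows that approach on synthetic stand-in data. The arrays `X_tr`, `y_tr`, `X_va`, `y_va` and the candidate grid are assumptions, and this is not presented as the procedure used to obtain the result above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Synthetic stand-ins for the training and validation splits (assumed data).
rng = np.random.default_rng(42)
w = rng.normal(size=21)
X_tr = rng.normal(size=(500, 21))
y_tr = X_tr @ w + rng.normal(scale=0.3, size=500)
X_va = rng.normal(size=(150, 21))
y_va = X_va @ w + rng.normal(scale=0.3, size=150)

# Hypothetical grid of penalty strengths to compare.
candidate_alphas = [0.01, 0.1, 1.0, 10.0, 100.0]

# Fit on the training split, score each candidate on the validation split.
val_rmse = {}
for alpha in candidate_alphas:
    model = Ridge(alpha=alpha).fit(X_tr, y_tr)
    val_rmse[alpha] = float(np.sqrt(mean_squared_error(y_va, model.predict(X_va))))

# Keep the alpha with the lowest validation RMSE.
best_alpha = min(val_rmse, key=val_rmse.get)
print(f"best alpha: {best_alpha}, validation RMSE: {val_rmse[best_alpha]:.4f}")
```

With the notebook's data, `X_train`, `y_train`, `X_val`, and `y_val` would take the place of the synthetic arrays, and the test set would be scored only once, after `alpha` is fixed.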
