### Generate synthetic regression dataset and split into train and test

In [31]:
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=42, effective_rank=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)

### Scale features using StandardScaler

In [None]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


### Train a basic Ridge regression model

In [None]:
from sklearn.linear_model import Ridge

ridge = Ridge()
ridge.fit(X_train, y_train)

### Evaluate the basic Ridge model

In [None]:
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

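y_pred = ridge.predict(X_test)
mean_absolute_error(y_test, y_pred)  # computed but not displayed
mean_squared_error(y_test, y_pred)   # computed but not displayed
r2_score(y_test, y_pred)             # last expression in the cell, so this R^2 is the value shown below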

0.9997375827874928
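
Since a notebook cell only displays its last expression, the MAE and MSE computed above never appear. The following is a minimal sketch of one way to report all three metrics together; it rebuilds the data with the same settings as the earlier cells, and the `metrics` dictionary is an illustrative name, not part of the original notebook.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the data and the baseline model with the same settings as the cells above.
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=42, effective_rank=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

ridge = Ridge().fit(X_train, y_train)
y_pred = ridge.predict(X_test)

# Collect the metrics in a dict (illustrative name) so every value gets printed.
metrics = {
    "MAE": mean_absolute_error(y_test, y_pred),
    "MSE": mean_squared_error(y_test, y_pred),
    "R2": r2_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```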

### Define parameter grid for alpha to tune with GridSearchCV

In [None]:
param_grid = {
    'alpha': [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
}
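
To get a feel for what the alpha values in this grid do to the model, the sketch below fits one Ridge model per alpha and prints the L2 norm of its coefficient vector. A larger alpha penalizes coefficients more heavily, so the norm should shrink as alpha grows. The data is rebuilt with the same settings as the earlier cells, and `coef_norms` is an illustrative name.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Same data generation and split as the cells above.
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=42, effective_rank=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)
X_train = StandardScaler().fit_transform(X_train)

alphas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
coef_norms = []  # illustrative name: one L2 norm per alpha
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    coef_norms.append(float(np.linalg.norm(model.coef_)))
    print(f"alpha={alpha:<7g} ||coef||_2 = {coef_norms[-1]:.4f}")
```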

### Use GridSearchCV to tune alpha in Ridge regression

In [None]:
from sklearn.model_selection import GridSearchCV

ridge_cv = GridSearchCV(ridge, param_grid, cv=3, n_jobs=-1)
ridge_cv.fit(X_train, y_train)
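
After fitting, the search object exposes the selected alpha and the per-candidate cross-validation scores. The sketch below shows one way to inspect them, assuming the same data and grid as above. With no `scoring` argument, `GridSearchCV` uses the estimator's own `score` method, which is R^2 for Ridge. `n_jobs=-1` is omitted here only to keep the sketch simple.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

# Same data generation, split, and scaling as the cells above.
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=42, effective_rank=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)
X_train = StandardScaler().fit_transform(X_train)

param_grid = {"alpha": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]}
ridge_cv = GridSearchCV(Ridge(), param_grid, cv=3)
ridge_cv.fit(X_train, y_train)

print("best params:", ridge_cv.best_params_)
print("best mean CV R^2:", ridge_cv.best_score_)

# One row per candidate: mean and standard deviation of R^2 over the 3 folds.
results = ridge_cv.cv_results_
for params, mean, std in zip(results["params"], results["mean_test_score"], results["std_test_score"]):
    print(f"alpha={params['alpha']:<7g} mean R^2={mean:.6f} std={std:.6f}")
```

If the best alpha sits at either end of the grid, it may be worth extending the grid in that direction and searching again.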

### Evaluate the best Ridge model found by GridSearchCV

In [None]:
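y_pred = ridge_cv.predict(X_test)    # predicts with the refit best estimator
mean_absolute_error(y_test, y_pred)  # computed but not displayed
mean_squared_error(y_test, y_pred)   # computed but not displayed
r2_score(y_test, y_pred)             # last expression in the cell, so this R^2 is the value shown below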

0.9999138365176437
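
One caveat about the tuning step: the scaler was fit on the whole training set before cross-validation, so each validation fold contributed to the scaling statistics. A common way to avoid this is to put the scaler and the model in a `Pipeline` and tune that instead. The following is a sketch under that assumption, not the notebook's original approach; the step names `"scaler"` and `"ridge"` and the variables `pipe`, `pipe_grid`, and `pipe_cv` are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Same data generation and split as above, but the features are left unscaled here.
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=42, effective_rank=2)
X_train_raw, X_test_raw, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)

# The scaler is refit on the training part of every CV fold inside the pipeline.
pipe = Pipeline([("scaler", StandardScaler()), ("ridge", Ridge())])
pipe_grid = {"ridge__alpha": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]}

pipe_cv = GridSearchCV(pipe, pipe_grid, cv=3)
pipe_cv.fit(X_train_raw, y_train)

print("best params:", pipe_cv.best_params_)
test_r2 = pipe_cv.score(X_test_raw, y_test)  # R^2 of the refit pipeline on the test set
print("test R^2:", test_r2)
```

On a small, low-noise dataset like this one the difference is likely to be minor, but the pipeline form keeps the cross-validation estimate free of this kind of leakage.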

### Access the best estimator and manually fit Ridge with best alpha

In [None]:
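ridge_cv.best_estimator_  # already refit on the full training set (refit=True is the default)

best_alpha = ridge_cv.best_params_['alpha']  # 0.001 in this run
ridge3 = Ridge(alpha=best_alpha)
ridge3.fit(X_train, y_train)

ridge3.intercept_  # computed but not displayed
ridge3.coef_       # last expression in the cell, so these coefficients are shown below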

array([4.58795185, 4.99068649, 7.70322053, 3.02528604])
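
Because `GridSearchCV` refits the best configuration on the full training set by default (`refit=True`), the manual refit above should reproduce `best_estimator_`. The sketch below checks this under the same data and grid as the earlier cells; `manual` and `same_fit` are illustrative names.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

# Same data generation, split, and scaling as the cells above.
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=42, effective_rank=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)
X_train = StandardScaler().fit_transform(X_train)

param_grid = {"alpha": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]}
ridge_cv = GridSearchCV(Ridge(), param_grid, cv=3).fit(X_train, y_train)

# Refit by hand with the selected alpha and compare against the refit best estimator.
best = ridge_cv.best_estimator_
manual = Ridge(alpha=ridge_cv.best_params_["alpha"]).fit(X_train, y_train)
same_fit = bool(
    np.allclose(manual.coef_, best.coef_) and np.isclose(manual.intercept_, best.intercept_)
)
print("manual refit matches best_estimator_:", same_fit)
```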

### Conclusion

This analysis applies Ridge regression, which adds an L2 penalty that shrinks coefficients to control model complexity, to a synthetic regression dataset with 100 samples and 4 features. The baseline model, trained with the default alpha of 1.0, reaches a test R² of about 0.9997 and serves as the benchmark for tuning. Because the default alpha was not chosen for this data, it is not guaranteed to give the best balance between bias and variance.

GridSearchCV then evaluates six alpha values from 0.001 to 100 with 3-fold cross-validation and keeps the one with the highest mean validation score (R² by default for regressors). The selected alpha, 0.001, is the smallest value in the grid, which suggests that this low-noise dataset needs very little regularization and that the optimum may lie below the range searched. The tuned model reaches a test R² of about 0.9999, a small improvement over the baseline.

Overall, the workflow is iterative: start from a baseline model, tune its hyperparameters, and compare both models on the same held-out test set. The fitted coefficients, computed on standardized features, give a rough view of how strongly each feature contributes. The same procedure carries over from synthetic data to real-world regression problems.