**Q1 (Based on Step-by-Step Implementation of Ridge Regression using Gradient
Descent Optimization)**

Generate a dataset with atleast seven highly correlated columns and a target variable.Implement Ridge Regression using Gradient Descent Optimization. Take different values of learning rate (such as 0.0001,0.001,0.01,0.1,1,10) and regularization parameter (10^(-15),10^(-10),10^(-5),10^(-3),0,1,10,20). Choose the best parameters for which ridge regression cost function is minimum and R2_score is maximum.

In [44]:
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler

In [45]:
np.random.seed(42)
n_samples=1000
n_features=7

In [46]:
X, y = make_regression(n_samples=n_samples, n_features=n_features, noise=0.1)

In [47]:
X = np.dot(X, np.random.rand(n_features, n_features))

In [48]:
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

In [49]:
def ridge_regression(X, y, learning_rate, reg_param, num_iterations):
    m, n = X.shape
    theta = np.zeros(n)
    cost_history = []

    for _ in range(num_iterations):
        predictions = X.dot(theta)
        errors = predictions - y
        gradient = (X.T.dot(errors) + reg_param * theta) / m
        theta -= learning_rate * gradient

        cost = (1 / (2 * m)) * (errors.T.dot(errors) + reg_param * np.sum(theta**2))
        cost_history.append(cost)

    return theta, cost_history


In [50]:
learning_rates = [0.0001, 0.001, 0.01, 0.1, 1, 10]
regularization_params = [1e-15, 1e-10, 1e-5, 3, 10, 20]
num_iterations = 1000

In [51]:
best_theta = None
best_cost = float('inf')
best_r2_score = float('-inf')

for lr in learning_rates:
    for rp in regularization_params:
        theta, cost_history = ridge_regression(X_scaled, y, lr, rp, num_iterations)
        r2_score = 1 - np.sum((X_scaled.dot(theta) - y) ** 2) / np.sum((y - np.mean(y)) ** 2)
        if cost_history[-1] < best_cost and r2_score > best_r2_score:
            best_theta = theta
            best_cost = cost_history[-1]
            best_r2_score = r2_score

  cost = (1 / (2 * m)) * (errors.T.dot(errors) + reg_param * np.sum(theta**2))
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  cost = (1 / (2 * m)) * (errors.T.dot(errors) + reg_param * np.sum(theta**2))


In [52]:
print(f'Best parameters: Learning rate = {lr}, Regularization parameter = {rp}')
print(f'Best cost: {best_cost}')
print(f'Best R2 score: {best_r2_score}')

Best parameters: Learning rate = 10, Regularization parameter = 20
Best cost: 36.312743654535716
Best R2 score: 0.9948759669787856


Q2) Load the Hitters dataset from the following link
https://drive.google.com/file/d/1qzCKF6JKKMB0p7ul_lLy8tdmRk3vE_bG/view?usp=sharing

(a) Pre-process the data (null values, noise, categorical to numerical encoding)

In [53]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

In [54]:
hitters=pd.read_csv('/content/Hitters.csv')

In [55]:
hitters.dropna(inplace=True)

In [56]:
X = hitters.drop('Salary', axis=1)
y = hitters['Salary']

In [57]:
categorical_cols = X.select_dtypes(include=['object']).columns

In [58]:
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), [col for col in X.columns if col not in categorical_cols]),
        ('cat', OneHotEncoder(), categorical_cols)
    ])

In [59]:
preprocessor.fit(X)
X_preprocessed = preprocessor.transform(X)

In [60]:
print("Pre-processing completed. Data is ready for model fitting.")

Pre-processing completed. Data is ready for model fitting.


(b) Separate input and output features and perform scaling

In [61]:
from sklearn.model_selection import train_test_split

In [62]:
X_train, X_test, y_train, y_test = train_test_split(X_preprocessed, y, test_size=0.2, random_state=42)

In [63]:
print("Data has been split into training and testing sets.")

Data has been split into training and testing sets.


(c) Fit a Linear, Ridge (use regularization parameter as 0.5748), and LASSO (use
regularization parameter as 0.5748) regression function on the dataset.

In [64]:
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_squared_error

In [65]:
models = {
    'Linear Regression': LinearRegression(),
    'Ridge Regression': Ridge(alpha=0.5748),
    'Lasso Regression': Lasso(alpha=0.5748)
}

In [66]:
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f'{name} Mean Squared Error: {mse}')

Linear Regression Mean Squared Error: 128284.34549672344
Ridge Regression Mean Squared Error: 126606.39854037874
Lasso Regression Mean Squared Error: 126543.07184906653


  model = cd_fast.enet_coordinate_descent(


(d) Evaluate the performance of each trained model on test set. Which model performs
the best and Why?

**Q 3 Cross Validation for Ridge and Lasso Regression**

Explore Ridge Cross Validation (RidgeCV) and Lasso Cross Validation (LassoCV)
function of Python. Implement both on Boston House Prediction Dataset (load_boston
dataset from sklearn.datasets).

In [67]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.metrics import mean_squared_error