# Part 3 – Linear Regression

Use the dataset you loaded in Part 1 with the dataset splits from Part 2. You will implement two different solutions for linear regression with weight decay regularization:

- Using the closed-form solution (normal equations with weight decay).
- Using stochastic gradient descent.
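
For reference, the two solutions can be written out as follows. This is a sketch under an assumed convention: the symbols $\lambda$ (weight decay strength), $\eta$ (learning rate), $n$ (number of training samples) and the factor $\tfrac{1}{2}$ in the objective are choices made here for illustration, and other scalings of the penalty are equally common.

```latex
% Regularized least-squares objective and its closed-form minimizer
J(\mathbf{w}) = \tfrac{1}{2}\,\lVert \mathbf{y} - X\mathbf{w} \rVert^{2}
              + \tfrac{\lambda}{2}\,\lVert \mathbf{w} \rVert^{2}
\qquad\Longrightarrow\qquad
\mathbf{w}^{*} = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} \mathbf{y}

% Per-sample stochastic gradient step on sample (x_j, y_j),
% with the penalty split evenly across the n samples
\mathbf{w} \leftarrow \mathbf{w}
  + \eta \left[ \left( y_j - \mathbf{w}^{\top} \mathbf{x}_j \right) \mathbf{x}_j
  - \tfrac{\lambda}{n}\,\mathbf{w} \right]
```

Setting the gradient of $J$ to zero gives $(X^{\top} X + \lambda I)\,\mathbf{w} = X^{\top}\mathbf{y}$, which is where the closed form comes from.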

Your implementation will predict the value of the “price” variable from all the remaining numerical features. It will follow the scikit-learn estimator API and accept the following parameters:

- **solver**: This parameter selects which algorithm is used to learn the coefficients of the linear regression. Passing “cf” selects the closed-form solution; passing “sgd” selects stochastic gradient descent. Set the default value to “cf”.
- **max_iter**: This parameter is relevant only when the solver is “sgd”. It controls the number of epochs (full passes over the training set) of the stochastic gradient descent algorithm. Set the default value to 100.
- **learning_rate**: This parameter is relevant only when the solver is “sgd”. It sets the step size of each gradient update. Set the default value to 0.0001.
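
To make the roles of `max_iter` and `learning_rate` concrete, here is a minimal sketch of a per-sample SGD loop with weight decay on synthetic data. It is one possible reading of the task, not the reference solution: the function name `sgd_ridge` and the parameters `weight_decay` and `seed` are hypothetical, and `weight_decay` is assumed to be the penalty strength applied at every per-sample step.

```python
import numpy as np


def sgd_ridge(X, y, max_iter=100, learning_rate=0.0001, weight_decay=0.0, seed=0):
    """Per-sample SGD for squared error with an L2 (weight decay) penalty.

    weight_decay and seed are hypothetical parameters added for illustration.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)

    for _ in range(max_iter):                 # one full pass over the data per iteration
        for j in rng.permutation(n_samples):  # visit samples in a fresh random order
            residual = y[j] - w @ X[j]
            # Descent step for 0.5 * residual**2 + 0.5 * weight_decay * ||w||^2
            w += learning_rate * (residual * X[j] - weight_decay * w)

    return w


# Tiny synthetic check: noiseless targets generated from known weights.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_w

w_demo = sgd_ridge(X_demo, y_demo, max_iter=50, learning_rate=0.01)
```

With `weight_decay=0.0` and noiseless data the recovered weights should match `true_w` closely; a positive `weight_decay` shrinks them toward zero.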


In [7]:
# Loading and splitting the dataset

import pandas as pd
import numpy as np

diamonds_dataset = pd.read_csv("../data/diamonds.csv")
diamonds_dataset = diamonds_dataset.drop(columns=["cut", "color", "clarity"])

n = len(diamonds_dataset)
splits = [int(0.6 * n), int(0.8 * n)]

training, validation, testing = np.split(
    diamonds_dataset.sample(frac=1, random_state=100), splits
)


In [83]:
# Implementing the linear regression model using
# the closed form solution and the stochastic gradient descent

from sklearn.base import BaseEstimator


class LinearRegression(BaseEstimator):
    def __init__(
        self, solver: str = "cf", max_iter: int = 100, learning_rate: float = 0.0001
    ):
        self.solver = solver
        self.max_iter = max_iter
        self.learning_rate = learning_rate

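    def _fit_cf(self, X: np.ndarray, y: np.ndarray):
        # Weight decay strength (lambda), fixed to 1 in this implementation
        regularization_term = 1

        # Regularized normal equations: w = (X^T X + lambda * I)^-1 X^T y
        lambda_I = np.eye(X.shape[1]) * regularization_term
        w = np.linalg.inv(X.T @ X + lambda_I) @ X.T @ y

        return w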

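    def _fit_sgd(self, X: np.ndarray, y: np.ndarray):
        n_samples, n_features = X.shape
        w = np.zeros(n_features)

        # max_iter full passes over the data, one weight update per sample,
        # visiting the samples in their stored order
        for _ in range(self.max_iter):
            for j in range(n_samples):
                # Negative gradient of 0.5 * (y[j] - w @ X[j]) ** 2 w.r.t. w;
                # no weight decay term is applied in this update
                gradient = (y[j] - w @ X[j]) * X[j]
                w += self.learning_rate * gradient

        return w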

    def fit(self, X: np.ndarray, y: np.ndarray):
        if self.solver == "cf":
            self.w = self._fit_cf(X, y)
        elif self.solver == "sgd":
            self.w = self._fit_sgd(X, y)
        else:
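            raise ValueError(
                f"Unknown solver {self.solver!r}; expected 'cf' or 'sgd'"
            )

        # scikit-learn convention: fit returns the fitted estimator
        return self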

    def predict(self, X):
        return X @ self.w
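
As a cross-check on the closed-form branch, the sketch below solves the same regularized normal equations with `np.linalg.solve` instead of forming an explicit inverse, then verifies on synthetic data that both routes agree and that the gradient of the regularized objective vanishes at the solution. The function name `ridge_closed_form` and the `weight_decay` parameter are hypothetical; the objective is assumed to be `0.5 * ||y - Xw||^2 + 0.5 * weight_decay * ||w||^2`.

```python
import numpy as np


def ridge_closed_form(X, y, weight_decay=1.0):
    """Solve (X^T X + weight_decay * I) w = X^T y without forming an inverse."""
    n_features = X.shape[1]
    A = X.T @ X + weight_decay * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)


# Synthetic data, used only for this check.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = X_demo @ np.array([1.0, 2.0, -3.0, 0.5]) + rng.normal(scale=0.1, size=100)

w_solve = ridge_closed_form(X_demo, y_demo, weight_decay=1.0)

# Same system solved with an explicit inverse, as in the class above.
w_inv = np.linalg.inv(X_demo.T @ X_demo + np.eye(4)) @ X_demo.T @ y_demo

# Gradient of the assumed objective, evaluated at the solution.
grad = -X_demo.T @ (y_demo - X_demo @ w_solve) + 1.0 * w_solve
```

Solving the linear system is generally preferred over `np.linalg.inv` for numerical stability, although for a handful of features the two give practically identical weights.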


In [84]:
# Preparing the data

X_training = training.drop(columns=["price"]).to_numpy()
y_training = training["price"].to_numpy()

X_testing = testing.drop(columns=["price"]).to_numpy()
y_testing = testing["price"].to_numpy()
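
Note that the estimator above computes `X @ w` with no intercept term, so its predictions are forced through the origin. If an intercept is wanted, one common option is to append a constant column to the feature matrix before fitting. The helper below is a hypothetical sketch of that idea and is not part of the original notebook; the sample values are made up.

```python
import numpy as np


def add_intercept_column(X):
    """Return a copy of X with a leading column of ones (hypothetical helper)."""
    ones = np.ones((X.shape[0], 1))
    return np.hstack([ones, X])


# Two made-up rows with two features each.
X_demo = np.array([[0.5, 61.0], [0.7, 59.5]])
X_with_bias = add_intercept_column(X_demo)
```

With this approach the weight on the constant column plays the role of the intercept. Be aware that the closed-form solver as written would apply weight decay to that weight as well.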


In [87]:
# Calculating a "dumb" prediction and its
# associated MSE

from sklearn.metrics import mean_squared_error

dumb_predictions = np.full(len(y_testing), np.mean(y_training))
dumb_error = mean_squared_error(y_testing, dumb_predictions)
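
The ratio of a model's MSE to this constant-mean baseline has a familiar interpretation. When the baseline is the mean of the same targets being scored, the ratio equals `1 - R²`. The sketch below checks that identity on synthetic numbers (all values are made up for illustration). In this notebook the baseline uses the training mean on the test targets, so the identity holds only approximately.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between two 1-D arrays."""
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic targets and noisy predictions.
rng = np.random.default_rng(1)
y_true = rng.normal(loc=4000.0, scale=1000.0, size=500)
y_pred = y_true + rng.normal(scale=400.0, size=500)

# Constant baseline: always predict the mean of the targets being scored.
baseline = np.full_like(y_true, y_true.mean())
ratio = mse(y_true, y_pred) / mse(y_true, baseline)

# Coefficient of determination computed from its definition.
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

A ratio well below 1 means the model explains a large share of the variance; a ratio near or above 1 means it does no better than predicting the mean.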


In [86]:
# Predicting the value of the price variable
# using the closed form solution with weight
# decay

from sklearn.metrics import mean_squared_error

cf_lr = LinearRegression(solver="cf")

cf_lr.fit(X_training, y_training)
predictions = cf_lr.predict(X_testing)
error = mean_squared_error(y_testing, predictions)

print(f"Mean squared error:\t\t{error}")
print(f"Dumb mean squared error:\t{dumb_error}")
print(f"Ratio:\t\t\t\t{error / dumb_error}")


Mean squared error:		3752521.8955342784
Dumb mean squared error:	16385256.88953446
Ratio:				0.22901819122110179


In [88]:
# Predicting the value of the price variable
# using the stochastic gradient descent

sgd_lr = LinearRegression(solver="sgd")

sgd_lr.fit(X_training, y_training)
predictions = sgd_lr.predict(X_testing)
error = mean_squared_error(y_testing, predictions)

print(f"Mean squared error:\t\t{error}")
print(f"Dumb mean squared error:\t{dumb_error}")
print(f"Ratio:\t\t\t\t{error / dumb_error}")


Mean squared error:		2871905.205080111
Dumb mean squared error:	16385256.88953446
Ratio:				0.17527373689908063
