### Linear Regression
1. Ordinary Least Squares Method:
In this method, we find the regression coefficients (the weights) that minimize the sum of squared residuals, i.e. the squared differences between the true labels and the predicted values.

Formula: $$ \text{weights} = (X^T \cdot X)^{-1} \cdot X^T \cdot y $$
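
A brief derivation, writing $w$ for the weights vector and assuming $X^T X$ is invertible: the sum of squared residuals is

$$ S(w) = (y - X w)^T (y - X w) $$

Setting its gradient with respect to $w$ to zero gives the normal equations:

$$ \nabla_w S = -2 X^T (y - X w) = 0 \quad \Rightarrow \quad X^T X \, w = X^T y $$

Multiplying both sides by $(X^T X)^{-1}$ yields the formula above.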

To find the predicted values, we multiply the feature matrix $X$ by the weights vector.
Formula: $$ y_{pred} = X \cdot \text{weights} $$
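
The two formulas can be sketched in NumPy as follows. This is a minimal illustration, not the implementation behind `lr.LinearRegression` (which is not shown here); the function names `ols_fit` and `predict` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def ols_fit(X, y):
    """Hypothetical helper: solve the normal equations (X^T X) w = X^T y for w."""
    # Solving the linear system gives the same w as (X^T X)^{-1} X^T y
    # without forming the explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

def predict(X, weights):
    """Hypothetical helper: return y_pred = X . weights."""
    return X @ weights

# Synthetic, noiseless data (an assumption for this sketch only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_weights = np.array([[2.0], [-1.0], [0.5]])
y = X @ true_weights

weights = ols_fit(X, y)
y_pred = predict(X, weights)
```

Because the labels here are generated without noise, the fitted weights should match `true_weights` up to floating-point error.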

Mean Squared Error: $$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_{pred,i} - y_{true,i})^2 $$
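
As a small illustration, the MSE can be computed with NumPy as below. The helper name `mean_squared_error` and the sample values are hypothetical.

```python
import numpy as np

def mean_squared_error(y_pred, y_true):
    """Hypothetical helper: average of the squared differences."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))

# Squared differences are 0, 0 and 4, so the mean is 4/3
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```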

In [3]:
# importing the libraries
import mxnet as mx
from mxnet import nd, autograd, gluon
import numpy as np
import LinearRegression as lr
import time
import pandas as pd

In [4]:
# set the device contexts for the data and the model (both CPU here)
data_ctx = mx.cpu()
model_ctx = mx.cpu()
# read data with pandas
data = pd.read_csv('custom_2017_2020.csv')
# convert to numpy array
data = data.to_numpy()

Pre-processing step

In [5]:
# standardize every column (features and label) to zero mean and unit variance
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)
data = (data - mean) / std
# convert to NDArray
data = nd.array(data, ctx=data_ctx)
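
The same standardization can be sketched in plain NumPy. The helper `standardize` and its `eps` guard are assumptions added for illustration; the guard is not part of the cell above and only matters if a column is constant, where dividing by a zero standard deviation would produce NaN values.

```python
import numpy as np

def standardize(data, eps=1e-12):
    """Hypothetical helper: scale each column to zero mean and unit variance."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    # Replace (near-)zero standard deviations with 1 so constant columns map to 0
    safe_std = np.where(std < eps, 1.0, std)
    return (data - mean) / safe_std

# Toy data: the third column is constant on purpose
sample = np.array([[1.0, 10.0, 5.0],
                   [2.0, 20.0, 5.0],
                   [3.0, 30.0, 5.0]])
scaled = standardize(sample)
```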

In [6]:
# splitting into features and labels
features = nd.array(data[:, :-1], ctx=data_ctx)
labels = nd.array(data[:, -1], ctx=data_ctx)

In [7]:
X_train = features
y_train = labels
y_train = y_train.reshape((len(y_train), 1))

In [8]:
# printing the shapes of the training set to check dimensions
print(X_train.shape)
print(y_train.shape)

(14897200, 12)
(14897200, 1)


Training the model

In [9]:
start = time.time()

In [10]:
# training the model using the OLS method
# (the timer was started in the previous cell, so model construction is timed as well)
model = lr.LinearRegression()
weights = model.OLS_fit(X_train, y_train)

In [11]:
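# MXNet NDArray operations run asynchronously; if OLS_fit returns without
# synchronizing, call nd.waitall() before this line so the timing covers the computation
end = time.time()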

Printing the result

In [12]:
print(f"Time taken to train the model using CPU: {end - start} seconds")

Time taken to train the model using CPU: 0.047386884689331055 seconds
