# REGRESSION with SQUARED ERROR LOSS

This notebook is a short example showing how to use the "Linear_regression" class and the "Linear_regression_GD" class defined below.

In [5]:
import numpy as np
import matplotlib.pyplot as plt

In [6]:
import math

In [7]:
# Mounting drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Loading Data

In [8]:
# Folder on Google Drive that holds the CSV files; change it to your own location
data_path = "/content/drive/MyDrive/Colab Notebooks/PRNN_A1/Prnn_datasets/"
train_dataset = np.loadtxt(data_path + 'p1_train.csv', delimiter=',')
test_dataset = np.loadtxt(data_path + 'p1_test.csv', delimiter=',')

In [9]:
# The last column is the target variable and the rest are features
features = train_dataset.shape[1]-1 # number of features (2 for this dataset)

In [10]:
X_train = train_dataset[:,0:features]
Y_train = train_dataset[:,features]
Y_train = Y_train.reshape(Y_train.shape[0],1)

In [11]:
X_test = test_dataset[:,0:features]
Y_test = test_dataset[:,features]
Y_test = Y_test.reshape(Y_test.shape[0],1)

If the data has high variance, normalize it so that the values lie between 0 and 1 before passing it to the linear regression class.
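As a minimal sketch of such a normalization, the helpers below (hypothetical names `min_max_fit` and `min_max_apply`, not part of the original notebook) rescale each feature column using the minimum and maximum of the training data and reuse those same values for the test data. The small arrays are made-up illustration data.

```python
import numpy as np

def min_max_fit(X):
    """Return the per-column minimum and maximum of X (hypothetical helper)."""
    return X.min(axis=0), X.max(axis=0)

def min_max_apply(X, lo, hi):
    """Rescale the columns of X with the given minimum and maximum values."""
    # Use a span of 1 for constant columns to avoid division by zero
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

# Made-up data, only for illustration
X_tr = np.array([[1.0, 200.0], [3.0, 400.0], [5.0, 1000.0]])
X_te = np.array([[2.0, 600.0]])

lo, hi = min_max_fit(X_tr)                 # statistics come from the training set only
X_tr_scaled = min_max_apply(X_tr, lo, hi)  # every training column now lies in [0, 1]
X_te_scaled = min_max_apply(X_te, lo, hi)  # test data reuses the training statistics
```

Test values can fall slightly outside 0 to 1 if they exceed the range seen in training; that is usually acceptable for linear regression.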

## Finding the Weights Using the Closed-Form Formula

The following class implements linear regression; you can create an object of the class for your needs.
Here we assume that the matrix (X.T@X) is invertible, where X is the augmented data matrix, so the weights are W = inv(X.T@X)@X.T@Y.

In [28]:
class Linear_regression:
  def train(self,X,Y):    # First call this function to update W value
    # Creating augmented data matrix (a column of ones for the bias term)
    tmp = np.ones((X.shape[0],1))
    X=np.column_stack((X,tmp))
    self.W = (np.linalg.inv(X.T@X))@X.T@Y   # closed-form least-squares solution
  def test(self,X,Y):     # call this function to return mean squared error on test dataset
    tmp = np.ones((X.shape[0],1))
    X=np.column_stack((X,tmp))
    Y_pred = X@self.W
    loss =(np.linalg.norm(Y_pred-Y))**2     # use the local Y_pred, not a global variable
    MSE = loss/X.shape[0]
    return MSE
  def predict(self,X):    # call this function to predict value and return predicted value on a dataset
    tmp = np.ones((X.shape[0],1))
    X=np.column_stack((X,tmp))
    Y_pred = X@self.W
    return Y_pred

In [29]:
l = Linear_regression() 

In [30]:
l.train(X_train,Y_train)

In [31]:
y_pred = l.predict(X_test)

In [33]:
mse = l.test(X_test,Y_test)

In [35]:
print('The Mean squared error loss is: ',mse)

The Mean squared error loss is:  5.046436003951254
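The closed-form solution above relies on the invertibility assumption. When that assumption is doubtful (for example, duplicated or strongly correlated features), one possible alternative is `np.linalg.lstsq`, which solves the same least-squares problem without forming an explicit inverse. The sketch below uses a hypothetical helper `fit_least_squares` and synthetic data; it is not the method used in this notebook.

```python
import numpy as np

def fit_least_squares(X, Y):
    """Fit weights for the augmented matrix [X, 1] with lstsq (hypothetical helper)."""
    X_aug = np.column_stack((X, np.ones((X.shape[0], 1))))
    # lstsq returns (solution, residuals, rank, singular values); keep the solution
    W, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)
    return W

# Synthetic, noise-free data: y = 2*x1 - 1*x2 + 0.5
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
Y_demo = X_demo @ np.array([[2.0], [-1.0]]) + 0.5

W_demo = fit_least_squares(X_demo, Y_demo)

# Duplicate the first column so that X.T @ X is singular;
# lstsq still returns a solution that fits the data
X_dup = np.column_stack((X_demo, X_demo[:, 0]))
W_dup = fit_least_squares(X_dup, Y_demo)
Y_dup_pred = np.column_stack((X_dup, np.ones((50, 1)))) @ W_dup
```

With well-conditioned data both approaches should give the same weights; the difference only matters when (X.T@X) is singular or nearly so.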


## Using Gradient Descent

Below is the gradient descent function, with parameters such as epochs, epsilon and alpha; you can adjust them according to your needs.

This code uses a variable step size alpha, which is divided by 10 whenever the loss increases. You can replace it with your own schedule for alpha or with another gradient descent variant such as Adam or AdaGrad.

In [12]:
def grad_descent(self,X,Y):
  epochs = 15000              # maximum number of iterations
  prev_MSE = 0                # starts at 0, so the first iteration already divides alpha by 10
  epsilon = 0.001             # stop once the MSE is at or below this value
  alpha = 10                  # selected after a lot of experiments with different values
  tmp = np.ones((X.shape[0],1))
  X=np.column_stack((X,tmp))  # Augmented data matrix
  W=np.ones((X.shape[1],1))
  for i in range(epochs):
    c = X@W-Y                               # residuals
    loss = np.linalg.norm(c)
    MSE = loss*loss/X.shape[0]              # calculating loss
    grad = X.T@c                            # gradient direction (its length is normalized below)
    if prev_MSE<MSE:                        # loss went up: shrink the step size
      alpha = (alpha/10)
    W = W - alpha * grad/np.linalg.norm(grad)       # Updating weights with a step of length alpha
    prev_MSE = MSE
    if(i%2000==0):
      print('Loss in ',i,' epoch is ',MSE)
    if MSE<= epsilon:
      break
  print('Final loss is',prev_MSE)
  self.W = W
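As an illustration of swapping in a different update rule, here is a minimal sketch of the standard Adam update applied to the same squared-error objective. The function name `adam_linear_regression`, its default hyperparameters and the synthetic data are assumptions made for this example; they were not tuned for the dataset in this notebook. Unlike `grad_descent`, it uses the gradient of the MSE directly rather than a normalized gradient.

```python
import numpy as np

def adam_linear_regression(X, Y, lr=0.01, epochs=5000,
                           beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit linear regression weights with Adam (hypothetical helper)."""
    X_aug = np.column_stack((X, np.ones((X.shape[0], 1))))  # augmented data matrix
    n = X_aug.shape[0]
    W = np.zeros((X_aug.shape[1], 1))
    m = np.zeros_like(W)   # running mean of the gradients
    v = np.zeros_like(W)   # running mean of the squared gradients
    for t in range(1, epochs + 1):
        grad = (2.0 / n) * X_aug.T @ (X_aug @ W - Y)   # gradient of the MSE
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad**2
        m_hat = m / (1 - beta1**t)                     # bias correction
        v_hat = v / (1 - beta2**t)
        W = W - lr * m_hat / (np.sqrt(v_hat) + eps)
    return W

# Synthetic data: y = 3*x1 - 2*x2 + 1 plus a little noise
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 2))
Y_demo = 3.0 * X_demo[:, [0]] - 2.0 * X_demo[:, [1]] + 1.0 + 0.1 * rng.normal(size=(200, 1))

W_adam = adam_linear_regression(X_demo, Y_demo)
X_demo_aug = np.column_stack((X_demo, np.ones((200, 1))))
mse_adam = np.mean((X_demo_aug @ W_adam - Y_demo) ** 2)
print('MSE with Adam:', mse_adam)
```

To use it inside a class like the one below, the function would need a `self` parameter and would store the result in `self.W` instead of returning it.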


You can also add more features (linear or non-linear) to your dataset and pass the extended dataset to the linear regression class to predict the y value.
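One possible way to add non-linear features is to append the squares and pairwise products of the existing columns. The helper `add_quadratic_features` below is a hypothetical name used only for this sketch, and the small array is made-up illustration data.

```python
import numpy as np

def add_quadratic_features(X):
    """Append squares and pairwise products of the columns of X (hypothetical helper)."""
    n_features = X.shape[1]
    # One new column for every pair (i, j) with i <= j
    extra = [X[:, i] * X[:, j]
             for i in range(n_features)
             for j in range(i, n_features)]
    return np.column_stack([X] + extra)

# Made-up data, only for illustration
X_small = np.array([[1.0, 2.0],
                    [3.0, 4.0]])
X_quad = add_quadratic_features(X_small)
# Columns of X_quad: x1, x2, x1*x1, x1*x2, x2*x2
```

The same transformation has to be applied to both the training and the test features before calling `train`, `test` or `predict`. The bias column does not need to be added here because the classes append it themselves.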

---



In [13]:
class Linear_regression_GD:
  train = grad_descent    # reuse the gradient descent function above as the train method
  def test(self,X,Y):     # call this function to return mean squared error on test dataset
    tmp = np.ones((X.shape[0],1))
    X=np.column_stack((X,tmp))
    Y_pred = X@self.W
    loss =(np.linalg.norm(Y_pred-Y))**2     # use the local Y_pred, not a global variable
    MSE = loss/X.shape[0]
    return MSE
  def predict(self,X):    # call this function to predict value and return predicted value on a dataset
    tmp = np.ones((X.shape[0],1))
    X=np.column_stack((X,tmp))
    Y_pred = X@self.W
    return Y_pred

In [14]:
l = Linear_regression_GD() 

In [15]:
l.train(X_train,Y_train)

Loss in  0  epoch is  1731.5804704188197
Loss in  2000  epoch is  5.059684615643722
Loss in  4000  epoch is  5.059684615643722
Loss in  6000  epoch is  5.059684615643722
Loss in  8000  epoch is  5.059684615643722
Loss in  10000  epoch is  5.059684615643722
Loss in  12000  epoch is  5.059684615643722
Loss in  14000  epoch is  5.059684615643722
Final loss is 5.059684615643722


In [16]:
y_pred = l.predict(X_test)

In [17]:
mse = l.test(X_test,Y_test)

In [18]:
print('The Mean squared error loss is: ',mse)

The Mean squared error loss is:  5.046435996689454
