# Applying Gradient Descent to a Regression Problem

Let's use gradient descent to solve a real-life linear regression problem. Our goal is to build a multiple linear regression model that predicts the sales of a product from the amount of money spent on advertising it.

Build a model with three predictors: the advertising budgets for TV, radio, and newspaper.

Initialize all weights to 0, set the learning rate to 0.00005, and train the model for 500,000 iterations.

Report the resulting weights: the intercept and one coefficient for each of the three predictors.
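
For reference, the update rule implied by this setup can be written out explicitly. The notation here is an assumption made for illustration: $\theta$ is the weight vector (called `par` in the code), $\alpha$ is the learning rate, $X$ is the feature matrix with a leading column of ones, and $n$ is the number of rows. The cost being tracked is the mean squared error:

$$
J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left( x_i^\top \theta - y_i \right)^2 = \frac{1}{n} \lVert X\theta - y \rVert^2
$$

Each iteration then moves the weights against the gradient:

$$
\theta \leftarrow \theta - \alpha \, \frac{1}{n} X^\top \left( X\theta - y \right)
$$

Strictly speaking, $\frac{1}{n} X^\top (X\theta - y)$ is the gradient of $\frac{1}{2} J(\theta)$; the exact gradient of $J$ carries an extra factor of 2. That factor only rescales the effective learning rate and does not change the minimizer.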

In [1]:
import numpy as np
import pandas as pd

In [2]:
df = pd.read_csv('Advertising.csv',index_col=0)
df.head()

|   | TV | radio | newspaper | sales |
|---|---|---|---|---|
| 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 2 | 44.5 | 39.3 | 45.1 | 10.4 |
| 3 | 17.2 | 45.9 | 69.3 | 9.3 |
| 4 | 151.5 | 41.3 | 58.5 | 18.5 |
| 5 | 180.8 | 10.8 | 58.4 | 12.9 |


In [3]:
X = df[['TV', 'radio', 'newspaper']]  # features
y = df['sales']  # target variable

# Put the data into the matrix form the model expects
n = len(y)
X = np.append(np.ones((n, 1)), X.values.reshape(n, 3), axis=1)  # prepend a column of ones for the intercept
y = df['sales'].values.reshape(n, 1)  # target as a column vector of shape (n, 1)
par = np.zeros((4, 1))  # parameter vector: intercept plus three weights, all starting at 0

In [4]:
# Cost function: the mean squared error (MSE) of the predictions. This is the value gradient descent will minimize.

def cost_function(X, y, par):
    y_pred = np.dot(X, par)  # predictions for the current parameters
    error = (y_pred - y)**2  # squared error of each prediction
    cost = 1/(n)*np.sum(error)  # average over the n samples (n is the global defined above)
    return cost

In [5]:
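# Batch gradient descent: repeatedly step the parameters against the gradient of the cost

def grad_d(X, y, par, alpha, iterations):
    costs = []  # cost after each update, useful for monitoring convergence
    for i in range(iterations):
        y_pred = np.dot(X, par)
        # Gradient of half the MSE; the factor of 2 from the MSE is absorbed into alpha
        der = np.dot(X.transpose(), (y_pred - y)) / n
        par -= alpha * der  # in-place update: this also modifies the array passed in as par
        costs.append(cost_function(X, y, par))
    return par, costs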
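
A quick way to gain confidence in a hand-written gradient like the one above is to compare it with a finite-difference estimate. The following is a minimal sketch, not part of the original notebook: the helper names (`half_mse`, `analytic_gradient`, `numeric_gradient`) and the random test data are hypothetical, and the functions take the sample count from `X` rather than from a global `n`.

```python
import numpy as np

def half_mse(X, y, par):
    """Half the mean squared error; its gradient is X^T (X par - y) / n."""
    residual = X @ par - y
    return float(np.sum(residual ** 2) / (2 * X.shape[0]))

def analytic_gradient(X, y, par):
    """Same expression as `der` in grad_d, with n taken from X."""
    return X.T @ (X @ par - y) / X.shape[0]

def numeric_gradient(X, y, par, eps=1e-6):
    """Central finite differences, one parameter at a time."""
    grad = np.zeros_like(par)
    for j in range(par.shape[0]):
        step = np.zeros_like(par)
        step[j, 0] = eps
        grad[j, 0] = (half_mse(X, y, par + step) - half_mse(X, y, par - step)) / (2 * eps)
    return grad

# Hypothetical random data, used only to exercise the check
rng = np.random.default_rng(1)
X_demo = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 3))])
y_demo = rng.normal(size=(50, 1))
par_demo = rng.normal(size=(4, 1))

gap = float(np.max(np.abs(
    analytic_gradient(X_demo, y_demo, par_demo) - numeric_gradient(X_demo, y_demo, par_demo)
)))
print(gap)  # should be tiny if the two gradients agree
```

If the two gradients disagree by more than rounding noise, the usual suspects are a missing transpose or a mismatched scaling factor.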

In [6]:
# Run gradient descent and inspect the fitted parameters (intercept first, then TV, radio, newspaper)
par, costs = grad_d(X, y, par, 0.00005, 500000)
par

array([[ 2.86254595e+00],
       [ 4.59731305e-02],
       [ 1.89405798e-01],
       [-5.73781627e-04]])
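
One way to sanity-check weights like these is to compare them with the closed-form least-squares solution, which gradient descent should approach as it converges. The sketch below illustrates the comparison on synthetic data, since it has to run without `Advertising.csv`. The function `gradient_descent`, the generated features, and the chosen `alpha` and iteration count are all assumptions made for this example, not values from the notebook.

```python
import numpy as np

def gradient_descent(X, y, alpha, iterations):
    """Batch gradient descent for least squares; returns a new weight vector."""
    n = X.shape[0]
    par = np.zeros((X.shape[1], 1))
    for _ in range(iterations):
        residual = X @ par - y
        par = par - alpha * (X.T @ residual) / n  # same update direction as grad_d
    return par

# Synthetic stand-in for the advertising data (hypothetical values)
rng = np.random.default_rng(0)
n_samples = 200
features = rng.uniform(0.0, 1.0, size=(n_samples, 3))
X_demo = np.hstack([np.ones((n_samples, 1)), features])
true_par = np.array([[3.0], [0.05], [0.2], [0.0]])
y_demo = X_demo @ true_par + rng.normal(0.0, 0.1, size=(n_samples, 1))

# Iterative estimate versus the closed-form least-squares reference
par_gd = gradient_descent(X_demo, y_demo, alpha=0.1, iterations=20000)
par_ls, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)

max_gap = float(np.max(np.abs(par_gd - par_ls)))
print(max_gap)  # largest absolute difference between the two estimates
```

With the notebook's own `X` and `y`, the same `np.linalg.lstsq(X, y, rcond=None)` call would give the reference weights. A noticeable gap, particularly in the intercept, would suggest that the run has not fully converged and that more iterations or rescaled features may help, because features on very different scales tend to slow gradient descent down.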