In [3]:
import copy, math
import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(precision=2) 

2 Problem Statement
You will use the motivating example of housing price prediction. The training dataset contains three examples with four features (size, bedrooms, floors, and age), shown in the table below. Note that, unlike the earlier labs, size is in sqft rather than 1000 sqft. This causes an issue, which you will solve in the next lab!

Size (sqft)	Number of Bedrooms	Number of floors	Age of Home	Price (1000s dollars)
2104	5	1	45	460
1416	3	2	40	232
852	2	1	35	178
You will build a linear regression model using these values so that you can then predict the price of other houses, for example a house with 1200 sqft, 3 bedrooms, and 1 floor that is 40 years old.

Please run the following code cell to create your X_train and Y_train variables.

In [4]:
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
Y_train = np.array([460, 232, 178])

In [5]:
print(f"shape of X_train : {X_train.shape} and its type is {type(X_train)}")
print(f"shape of Y_train : {Y_train.shape} and its type is {type(Y_train)}")

shape of X_train : (3, 4) and its type is <class 'numpy.ndarray'>
shape of Y_train : (3,) and its type is <class 'numpy.ndarray'>


2.2 Parameter vector w, b
$\mathbf{w}$ is a vector with $n$ elements.
Each element contains the parameter associated with one feature.
In our dataset, $n$ is 4.
Notionally, we draw this as a column vector:
$$\mathbf{w} = \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_{n-1} \end{pmatrix}$$

$b$ is a scalar parameter.

In [6]:
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])
print(f"w_init shape: {w_init.shape}, b_init type: {type(b_init)}")

w_init shape: (4,), b_init type: <class 'float'>


3.2 Single Prediction, vector
Equation (1) above can be implemented using the dot product, as in (2) above, so we can make use of vector operations to speed up predictions.

Recall from the Python/NumPy lab that NumPy's np.dot() can be used to perform a vector dot product.
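As a rough illustration of what the dot product replaces, the sketch below computes the same prediction twice: once with an explicit loop over the features and once with np.dot(). The helper name predict_loop and the variable names w, b and x are assumptions made for this example only; the numbers are copied from the cells above.

```python
import numpy as np

# Parameter values and first training example, copied from this lab.
w = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])
b = 785.1811367994083
x = np.array([2104, 5, 1, 45])

def predict_loop(x, w, b):
    """Hypothetical helper: add up w[j] * x[j] one feature at a time, then add b."""
    total = 0.0
    for j in range(x.shape[0]):
        total += w[j] * x[j]
    return total + b

p_loop = predict_loop(x, w, b)   # element-by-element version
p_dot = np.dot(x, w) + b         # vectorized version

# The two results should agree up to floating-point rounding.
print(f"loop: {p_loop}, dot: {p_dot}, close: {np.isclose(p_loop, p_dot)}")
```

With only four features the speed difference is negligible; the vectorized form mainly pays off as the number of features grows.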

In [7]:
def predict(x, w, b):
    # x and w are 1-D arrays with n elements each; b is a scalar
    pred = np.dot(x, w) + b
    return pred

In [10]:
x_vec = X_train[0]
print(f"shape of x_vec : {x_vec.shape} and its type is {type(x_vec)}")

f_wb = predict(x_vec , w_init , b_init)
print(f"The predicted value is {f_wb}")

shape of x_vec : (4,) and its type is <class 'numpy.ndarray'>
The predicted value is 459.9999976194083
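The problem statement mentions a house with 1200 sqft, 3 bedrooms, 1 floor, and an age of 40 years. A minimal sketch of that prediction, using the same dot product form, might look like the following. The name x_house is an assumption for this example, and because the parameters were fit to only three training examples, the result is an illustration of the mechanics rather than a reliable estimate.

```python
import numpy as np

# Parameter values copied from the w_init / b_init cell of this lab.
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])
b_init = 785.1811367994083

# Hypothetical variable name; features are [size (sqft), bedrooms, floors, age].
x_house = np.array([1200, 3, 1, 40])

# Same model as predict(): f_wb(x) = w . x + b, in thousands of dollars.
price = np.dot(x_house, w_init) + b_init
print(f"Predicted price: {price:.2f} thousand dollars")
```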


4 Compute Cost With Multiple Variables
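The equation for the cost function with multiple variables $J(\mathbf{w},b)$ is:
$$J(\mathbf{w},b) = \frac{1}{2m} \sum_{i=0}^{m-1} \left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)^2 \tag{3}$$
where:
$$f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b \tag{4}$$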

In contrast to previous labs, $\mathbf{w}$ and $\mathbf{x}^{(i)}$ are vectors rather than scalars, which allows the model to support multiple features.

In [11]:
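def compute_cost(x, y, w, b):
    m = x.shape[0]  # number of training examples
    cost = 0.0
    for i in range(m):
        f_wb = np.dot(x[i], w) + b           # prediction for example i, equation (4)
        cost = cost + (f_wb - y[i])**2       # accumulate the squared error
    cost = cost / (2*m)                      # scale by 1/(2m), equation (3)
    return cost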



In [13]:
cost = compute_cost(X_train, Y_train, w_init, b_init)
print(f'Cost at optimal w : {cost}')

Cost at optimal w : 1.5578904045996674e-12
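Equation (3) can also be evaluated without an explicit loop over the examples. The following is a sketch under that assumption, not the lab's own implementation: compute_cost_vectorized is a hypothetical name, and it relies on the matrix-vector product X @ w producing all m predictions at once.

```python
import numpy as np

# Training data and parameters copied from this lab.
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
Y_train = np.array([460, 232, 178])
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])
b_init = 785.1811367994083

def compute_cost_vectorized(X, y, w, b):
    """Hypothetical helper: cost from equation (3) using array operations."""
    m = X.shape[0]
    errors = X @ w + b - y              # shape (m,): f_wb(x_i) - y_i for every example
    return np.sum(errors**2) / (2 * m)

cost_opt = compute_cost_vectorized(X_train, Y_train, w_init, b_init)
cost_zero = compute_cost_vectorized(X_train, Y_train, np.zeros(4), 0.0)
print(f"Cost at w_init, b_init: {cost_opt}")
print(f"Cost at w = 0, b = 0  : {cost_zero}")
```

Comparing the cost at the given parameters with the cost at all-zero parameters gives a quick sense of how well w_init and b_init fit the three training examples.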


5 Gradient Descent With Multiple Variables
Gradient descent for multiple variables:

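$$\begin{aligned}
&\text{repeat until convergence: } \lbrace \\
&\qquad w_j = w_j - \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j} \quad \text{for } j = 0..n-1 \\
&\qquad b = b - \alpha \frac{\partial J(\mathbf{w},b)}{\partial b} \\
&\rbrace
\end{aligned} \tag{5}$$

where $n$ is the number of features, the parameters $w_j$ and $b$ are updated simultaneously, and where

$$\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right) x_j^{(i)} \tag{6}$$
$$\frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right) \tag{7}$$

$m$ is the number of training examples in the data set
$f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the target value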

In [None]:
def compute_gradient(x, y, w, b):
    m, n = x.shape  # m is the number of examples and n is the number of features
    dj_dw = np.zeros(n)  # gradient of the cost with respect to each w[j], equation (6)
    dj_db = 0.0          # gradient of the cost with respect to b, equation (7)
    for i in range(m):
        err = (np.dot(x[i], w) + b) - y[i]
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err * x[i, j]
        dj_db = dj_db + err
    dj_dw = dj_dw / m
    dj_db = dj_db / m
    return dj_dw, dj_db
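The two nested loops above can also be written with array operations. The sketch below is one possible vectorized form of equations (6) and (7); the name compute_gradient_vectorized is an assumption for this example and is not part of the lab.

```python
import numpy as np

# Training data copied from this lab.
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
Y_train = np.array([460, 232, 178])

def compute_gradient_vectorized(X, y, w, b):
    """Hypothetical helper: gradients from equations (6) and (7) without loops."""
    m = X.shape[0]
    err = X @ w + b - y        # shape (m,): prediction error for each example
    dj_dw = X.T @ err / m      # shape (n,): each entry sums err_i * x_ij over i
    dj_db = np.sum(err) / m    # scalar: mean error
    return dj_dw, dj_db

# Evaluate at all-zero parameters, where the error is simply -y.
dj_dw0, dj_db0 = compute_gradient_vectorized(X_train, Y_train, np.zeros(4), 0.0)
print(f"dj_db at w = 0, b = 0: {dj_db0}")
print(f"dj_dw at w = 0, b = 0: {dj_dw0}")
```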

In [17]:
tmp_dj_dw, tmp_dj_db = compute_gradient(X_train, Y_train, w_init, b_init)
print(f'dj_db at initial w,b: {tmp_dj_db}')
print(f'dj_dw at initial w,b: \n {tmp_dj_dw}')

dj_db at initial w,b: -1.6739251122999121e-06
dj_dw at initial w,b: 
 [-2.73e-03 -6.27e-06 -2.22e-06 -6.92e-05]
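With the gradient available, the update rule in equation (5) can be turned into a loop. The following is a minimal sketch, not the lab's implementation: the names gradient_descent, cost_vec, gradient_vec, alpha and num_iters are assumptions, as are the learning rate of 5.0e-7 and the 1000 iterations. The very small learning rate is chosen because the size feature is unscaled.

```python
import numpy as np

# Training data copied from this lab.
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
Y_train = np.array([460, 232, 178])

def cost_vec(X, y, w, b):
    """Hypothetical helper: cost from equation (3)."""
    err = X @ w + b - y
    return np.sum(err**2) / (2 * X.shape[0])

def gradient_vec(X, y, w, b):
    """Hypothetical helper: gradients from equations (6) and (7)."""
    m = X.shape[0]
    err = X @ w + b - y
    return X.T @ err / m, np.sum(err) / m

def gradient_descent(X, y, w_in, b_in, alpha, num_iters):
    """Hypothetical helper: apply update rule (5) for a fixed number of steps."""
    w = np.array(w_in, dtype=float)   # copy so the caller's array is not modified
    b = float(b_in)
    for _ in range(num_iters):
        # Compute both gradients first, then update, so w and b change simultaneously.
        dj_dw, dj_db = gradient_vec(X, y, w, b)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b

w0, b0 = np.zeros(X_train.shape[1]), 0.0
cost_before = cost_vec(X_train, Y_train, w0, b0)
w_final, b_final = gradient_descent(X_train, Y_train, w0, b0, alpha=5.0e-7, num_iters=1000)
cost_after = cost_vec(X_train, Y_train, w_final, b_final)
print(f"cost before: {cost_before:.2f}, cost after: {cost_after:.2f}")
```

A fixed iteration count stands in for the "until convergence" test in (5); a real run would typically also monitor the cost and stop when it no longer decreases meaningfully.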
