# Multiple Linear Regression

#  Problem Statement

The training dataset contains three examples with four features (size, bedrooms, floors, and age), shown in the table below.

| Size (sqft) | Number of bedrooms | Number of floors | Age of home (years) | Price (1000s of dollars) |
| ----------------| ------------------- |----------------- |--------------|-------------- |  
| 2104            | 5                   | 1                | 45           | 460           |  
| 1416            | 3                   | 2                | 40           | 232           |  
| 852             | 2                   | 1                | 35           | 178           |  

Build a linear regression model from these values so that you can predict the price of other houses, for example a 40-year-old house with 1200 sqft, 3 bedrooms, and 1 floor.

In [1]:
import numpy as np
import math,copy
import matplotlib.pyplot as plt

In [2]:
x_train=np.array([[2104,5,1,45],[1416,3,2,40],[852,2,1,35]])
y_train=np.array([460,232,178])
print(f"x_train:\n{x_train}")
print(f"x_train.shape={x_train.shape} and type(x_train)={type(x_train)}")
print(f"\ny_train:{y_train}")
print(f"y_train.shape={y_train.shape} and type(y_train)={type(y_train)}")

x_train:
[[2104    5    1   45]
 [1416    3    2   40]
 [ 852    2    1   35]]
x_train.shape=(3, 4) and type(x_train)=<class 'numpy.ndarray'>

y_train:[460 232 178]
y_train.shape=(3,) and type(y_train)=<class 'numpy.ndarray'>


For demonstration, $\mathbf{w}$ and $b$ are loaded with initial values chosen to be near the optimum. $\mathbf{w}$ is a 1-D NumPy vector.
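
For reference, the model these parameters belong to can be written as follows, assuming the usual multiple linear regression form with four features. The symbol $f_{\mathbf{w},b}$ is notation introduced here to mirror the variable `f_wb` used in the code:

$$
f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b = w_0 x_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + b
$$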

In [3]:
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])
print(f"w_init shape: {w_init.shape}, b_init type: {type(b_init)}")

w_init shape: (4,), b_init type: <class 'float'>


# Predicting without vectorization


In [4]:
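def predict_single_loop(w,b,x):
    # Predict for a single example x of shape (n,) by summing w[i]*x[i] one feature at a time
    n=x.shape[0]
    f_wb=0
    for i in range(n):
        f_wb_i=w[i]*x[i]
        f_wb=f_wb+f_wb_i
    f_wb=f_wb+b              # add the bias once, after the loop
    return f_wb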

Use this function to predict the price for the first row of `x_train`, then compare the prediction with the true value.

In [5]:
x_vect=x_train[0,:]
prediction=predict_single_loop(w_init,b_init,x_vect)
print(f"the shape of x_vect = {x_vect.shape} and x_vect value = {x_vect}")
print(f"The prediction for the first row of x_train is {prediction}")

the shape of x_vect = (4,) and x_vect value = [2104    5    1   45]
The prediction for the first row of x_train is 459.9999976194083


The target value for the first row is 460, so with these near-optimal parameters the prediction differs from it by only about 2.4e-6 for that example.
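
Written out term by term with the values of `w_init`, `b_init` and the first row of `x_train`, the loop computes the sum below. The products are rounded to four decimals here for readability, and the superscript $(0)$ is notation assumed for the first training example:

$$
\begin{aligned}
f_{\mathbf{w},b}(\mathbf{x}^{(0)}) &= (0.39133535)(2104) + (18.75376741)(5) + (-53.36032453)(1) + (-26.42131618)(45) + 785.1811368 \\
&\approx 823.3696 + 93.7688 - 53.3603 - 1188.9592 + 785.1811 \\
&\approx 460.0000
\end{aligned}
$$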

# Predicting with vectorization

Vectorization replaces the explicit loop with a single NumPy dot product, which is both faster and simpler to write.

In [6]:
def predict(w,b,x):
    f_wb=np.dot(w,x)+b
    
    return f_wb

In [7]:
x_vect=x_train[0,:]
prediction=predict(w_init,b_init,x_vect)
print(f"the shape of x_vect = {x_vect.shape} and x_vect value = {x_vect}")
print(f"The prediction for the first row of x_train is {prediction}")

the shape of x_vect = (4,) and x_vect value = [2104    5    1   45]
The prediction for the first row of x_train is 459.9999976194082


Both versions give the same result up to floating-point rounding in the last digit. The vectorized version is faster and takes fewer lines of code.
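
As a small illustration of the same idea, the sketch below predicts every training row at once with a matrix-vector product and then applies the dot product to the example house from the problem statement. The names `all_predictions`, `x_house` and `house_prediction` are introduced here for illustration and are not part of the original notebook.

```python
import numpy as np

# Training data and near-optimal parameters copied from the cells above.
x_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])
b_init = 785.1811367994083
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])

# Matrix-vector product: one prediction per row of x_train, no Python loop.
all_predictions = x_train @ w_init + b_init
print(f"predictions for all rows: {all_predictions}")

# Hypothetical query: the example house from the problem statement
# (1200 sqft, 3 bedrooms, 1 floor, 40 years old).
x_house = np.array([1200, 3, 1, 40])
house_prediction = np.dot(w_init, x_house) + b_init
print(f"prediction for the example house: {house_prediction:.2f} (1000s of dollars)")
```

With these parameters the example house comes out at about 200.83 (thousand dollars). Since `w_init` and `b_init` were chosen to fit only three examples, that figure illustrates the mechanics rather than giving a reliable estimate.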

# Cost Function for Multiple Variables
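
The function in this section implements the squared-error cost. In the notation of the code (`m` examples, prediction `f_wb`), it can be written as follows, where the superscript $(i)$ for the $i$-th example is notation assumed for this sketch:

$$
J(\mathbf{w},b) = \frac{1}{2m}\sum_{i=0}^{m-1}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)^2, \qquad f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w}\cdot\mathbf{x}^{(i)} + b
$$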

In [8]:
def compute_cost(w,b,x,y):
    m=x.shape[0]
    cost_sum=0
    for i in range(m):
        f_wb=np.dot(w,x[i])+b
        cost_sum=cost_sum+(f_wb-y[i])**2
    final_cost=cost_sum/(2*m)
    return final_cost

In [9]:
predic_cost=compute_cost(w_init,b_init,x_train,y_train)
print(f"Cost at optimal values of w is {predic_cost}")

Cost at optimal values of w is 1.5578904330213735e-12


# Gradient Descent for Multiple Variables

With multiple features, `w` is a vector holding one weight per feature, and gradient descent must find a good value for each of them. The gradient `dj_dw` is therefore also a vector with one entry per weight, unlike the single-variable case, while `dj_db` remains a scalar.
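
In symbols, with $J$ the squared-error cost computed by `compute_cost` and $\alpha$ the learning rate (`alpha` in the code), the gradients and the update step can be sketched as follows. The index conventions are assumptions chosen to match the zero-based loops in the code:

$$
\begin{aligned}
\frac{\partial J}{\partial w_j} &= \frac{1}{m}\sum_{i=0}^{m-1}\left(\mathbf{w}\cdot\mathbf{x}^{(i)} + b - y^{(i)}\right)x_j^{(i)}, \quad j = 0,\dots,n-1 \\
\frac{\partial J}{\partial b} &= \frac{1}{m}\sum_{i=0}^{m-1}\left(\mathbf{w}\cdot\mathbf{x}^{(i)} + b - y^{(i)}\right) \\
w_j &\leftarrow w_j - \alpha\,\frac{\partial J}{\partial w_j}, \qquad b \leftarrow b - \alpha\,\frac{\partial J}{\partial b}
\end{aligned}
$$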

In [10]:
def compute_gradient(w,b,x,y):
    m,n = x.shape              #(number of examples, number of features)
    dj_db=0
    dj_dw=np.zeros((n,))
    for i in range(m):
        f_wb=np.dot(w,x[i])+b
        err=(f_wb-y[i])
        for j in range(n):
            dj_dw[j]=dj_dw[j]+ err*x[i,j]
        dj_db=dj_db+err  
        
    dj_dw=dj_dw/m
    dj_db=dj_db/m
    return dj_dw,dj_db


                              
        

In [11]:
#Compute and display gradient 
tmp_dj_dw, tmp_dj_db = compute_gradient(w_init, b_init,x_train, y_train)
print(f'dj_dw at initial w,b: {tmp_dj_dw}')
print(f'dj_db at initial w,b: {tmp_dj_db}')

dj_dw at initial w,b: [-2.72623577e-03 -6.27197263e-06 -2.21745571e-06 -6.92403379e-05]
dj_db at initial w,b: -1.6739251122999121e-06
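
The two nested loops in `compute_gradient` can also be expressed with matrix operations. The sketch below is one possible vectorized form. The function name `compute_gradient_vectorized` is hypothetical and not part of the original notebook; the data and parameters are copied from the cells above so the block runs on its own.

```python
import numpy as np

def compute_gradient_vectorized(w, b, x, y):
    """Hypothetical vectorized counterpart of compute_gradient."""
    m = x.shape[0]
    err = x @ w + b - y      # shape (m,): prediction error for every example
    dj_dw = x.T @ err / m    # shape (n,): mean of err * x[:, j] for each feature j
    dj_db = err.sum() / m    # scalar: mean error
    return dj_dw, dj_db

x_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])
b_init = 785.1811367994083
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])

vec_dj_dw, vec_dj_db = compute_gradient_vectorized(w_init, b_init, x_train, y_train)
print(f"dj_dw at initial w,b: {vec_dj_dw}")
print(f"dj_db at initial w,b: {vec_dj_db}")
```

The values should agree with the loop version up to floating-point rounding, since both compute the same sums in a different order.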


In [12]:
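def gradient_descent(w,b,x,y,num_of_iters,alpha):
    j_history=[]                      # cost after each update, kept for later inspection
    for i in range(num_of_iters):
        dj_dw,dj_db=compute_gradient(w,b,x,y)
        # Update both parameters using gradients computed at the current (w, b)
        b=b-alpha*dj_db
        w=w-alpha*dj_dw
        
        if i<100000:                  # cap the history length on very long runs
            j_history.append(compute_cost(w,b,x,y))
            
        # Report progress roughly ten times over the run
        if i%math.ceil(num_of_iters/10)==0:
            print(f"cost: {j_history[-1]} w={w} and b={b}")
    
    return w,b,j_history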

In [13]:
initial_w=np.zeros_like(w_init)
initial_b=0
alpha=5.0e-07
iterations=1000
final_w,final_b,j_hist=gradient_descent(initial_w,initial_b,x_train,y_train,iterations,alpha)
print(f"final_w={final_w} and final_b={final_b} with cost={j_hist[-1]}")


cost: 2529.4629522316304 w=[2.41334667e-01 5.58666667e-04 1.83666667e-04 6.03500000e-03] and b=0.000145
cost: 695.990315835203 w=[ 0.20235171  0.00079796 -0.00099658 -0.00219736] and b=-0.00011985961877688924
cost: 694.9206979323068 w=[ 0.20253446  0.00112715 -0.00214349 -0.00940619] and b=-0.00035965781839536297
cost: 693.8604297851197 w=[ 0.2027164   0.00145611 -0.00328876 -0.01658286] and b=-0.0005983240279392167
cost: 692.8094286135905 w=[ 0.20289753  0.00178484 -0.00443238 -0.02372751] and b=-0.0008358632706869378
cost: 691.767612370606 w=[ 0.20307785  0.00211335 -0.00557437 -0.03084027] and b=-0.0010722805476294606
cost: 690.7348997354993 w=[ 0.20325736  0.00244162 -0.00671473 -0.0379213 ] and b=-0.0013075808375690539
cost: 689.7112101076159 w=[ 0.20343608  0.00276967 -0.00785347 -0.04497072] and b=-0.0015417690972177685
cost: 688.6964635999458 w=[ 0.20361399  0.00309749 -0.00899059 -0.05198869] and b=-0.0017748502612954446
cost: 687.690581032794 w=[ 0.20379112  0.00342509 -0.010

Now let's compare our model's predictions with the true values.

In [15]:
m=x_train.shape[0]
print("prediction\t True value ")
for i in range(m):
    predic=np.dot(final_w,x_train[i])+final_b
    print(f"{predic}\t {y_train[i]}")

prediction	 True value 
426.18530497189204	 460
286.1674720078562	 232
171.46763087132317	 178
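
To make the size of the mismatch explicit, the sketch below reruns the same experiment in a compact, self-contained form and prints the residual for each example. It assumes the settings used above (zero initial parameters, `alpha` of 5.0e-7, 1000 iterations) and uses a vectorized update in place of `gradient_descent`. The names `predictions` and `residuals` are introduced here for illustration.

```python
import numpy as np

x_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])

# Same settings as the run above: zero start, alpha = 5.0e-7, 1000 iterations.
w = np.zeros(x_train.shape[1])
b = 0.0
alpha = 5.0e-7
m = x_train.shape[0]

for _ in range(1000):
    err = x_train @ w + b - y_train   # prediction error per example
    # Both gradients are computed from the current (w, b) before either is updated.
    dj_dw = x_train.T @ err / m
    dj_db = err.sum() / m
    w = w - alpha * dj_dw
    b = b - alpha * dj_db

predictions = x_train @ w + b
residuals = predictions - y_train     # positive means the model overestimates
for pred, target, res in zip(predictions, y_train, residuals):
    print(f"prediction={pred:.2f}\t true value={target}\t residual={res:+.2f}")
```

The residuals are roughly -34, +54 and -7 (in thousands of dollars), so the fit after 1000 iterations is still far from the near-optimal parameters used earlier. The cost was still decreasing in the last reported iterations, which suggests that the run had not yet converged.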
