In [2]:
import copy, math
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('./Deeplearning_Material/deeplearning.mplstyle')
np.set_printoptions(precision=2)  # reduced display precision on numpy arrays
print("Run Successfully")

Run Successfully


<a name="toc_15456_2"></a>
# Problem Statement

You will use the motivating example of housing price prediction. The training dataset contains three examples with four features (size, bedrooms, floors, and age), shown in the table below. Note that, unlike the earlier labs, size is in sqft rather than 1000 sqft. This causes an issue, which you will solve in the next lab!

| Size (sqft) | Number of bedrooms | Number of floors | Age of home (years) | Price (1000s of dollars) |
| ----------- | ------------------ | ---------------- | ------------------- | ------------------------ |
| 2104        | 5                  | 1                | 45                  | 460                      |
| 1416        | 3                  | 2                | 40                  | 232                      |
| 852         | 2                  | 1                | 35                  | 178                      |

You will build a linear regression model using these values so that you can then predict the price of other houses, for example a 40-year-old house with 1200 sqft, 3 bedrooms, and 1 floor.



In [44]:
# Three training examples: each row of X_Train is one house, Y_Train holds the prices
X_Train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
Y_Train = np.array([460, 232, 178])
print("Run Successfully")

Run Successfully


In [9]:
# Print the shape and contents of the training data
print(f"X_Train has shape {X_Train.shape} and its elements are:\n{X_Train}")
print(f"Y_Train has shape {Y_Train.shape} and its elements are:\n{Y_Train}")

X_Train has shape (3, 4) and its elements are:
[[2104    5    1   45]
 [1416    3    2   40]
 [ 852    2    1   35]]
Y_Train has shape (3,) and its elements are:
[460 232 178]


<a name="toc_15456_2.2"></a>
## Parameter vector w, b

* $\mathbf{w}$ is a vector with $n$ elements.
  - Each element contains the parameter associated with one feature.
  - In our dataset, $n$ is 4.
  - Notionally, we draw this as a column vector:

$$\mathbf{w} = \begin{pmatrix}
w_0 \\ 
w_1 \\
\vdots\\
w_{n-1}
\end{pmatrix}
$$
* $b$ is a scalar parameter.  

In [19]:
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])
print(f"w_init shape: {w_init.shape},  w_init type: {type(w_init)}")
print(f"b_init type: {type(b_init)}")

w_init shape: (4,),  w_init type: <class 'numpy.ndarray'>
b_init type: <class 'float'>


<a name="toc_15456_3"></a>
# Model Prediction With Multiple Variables
The model's prediction with multiple variables is given by the linear model:

$$ f_{\mathbf{w},b}(\mathbf{x}) =  w_0x_0 + w_1x_1 +... + w_{n-1}x_{n-1} + b \tag{1}$$
or in vector notation:
$$ f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b  \tag{2} $$ 
where $\cdot$ denotes the vector dot product.

To demonstrate the dot product, the prediction function below uses form (2), which computes the same sum as (1) in a single `np.dot` call.

In [28]:
def Predict(x, w, b):
    """
    Single prediction using linear regression
    Args:
      x (ndarray): Shape (n,) example with multiple features
      w (ndarray): Shape (n,) model parameters
      b (scalar):             model parameter

    Returns:
      p (scalar):  prediction
    """
    p = np.dot(x, w) + b   # equation (2): w . x + b
    return p

print("Run Successfully")

Run Successfully


In [31]:
# Take the first row of the training set and predict with the manually set w_init and b_init
temp_X = X_Train[0]
print("Row fetched from training dataset: ", temp_X)
temp_f_wb = Predict(temp_X, w_init, b_init)
print(f"Predicted value is {temp_f_wb}")
print(f"Predicted value after rounding is {temp_f_wb:0.4f}")

Row fetched from training dataset:  [2104    5    1   45]
Predicted value is 459.9999976194083
Predicted value after rounding is 460.0000
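
The prediction above uses the dot-product form (2). As a rough sketch of how it relates to the element-wise form (1), the block below computes the same prediction both ways. The helper names `predict_loop` and `predict_dot` and the `*_example` variables are illustrative assumptions, not part of the notebook; the numbers are copied from the cells above.

```python
import numpy as np

def predict_loop(x, w, b):
    """Equation (1): accumulate w_j * x_j one feature at a time (hypothetical helper)."""
    p = 0.0
    for j in range(x.shape[0]):
        p += w[j] * x[j]
    return p + b

def predict_dot(x, w, b):
    """Equation (2): the same sum computed by np.dot (hypothetical helper)."""
    return np.dot(x, w) + b

# Values copied from this notebook: first training example and the given parameters
x_example = np.array([2104, 5, 1, 45])
w_example = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])
b_example = 785.1811367994083

p_loop = predict_loop(x_example, w_example, b_example)
p_dot = predict_dot(x_example, w_example, b_example)
print(f"loop: {p_loop:0.4f}, dot: {p_dot:0.4f}")
```

Both forms should agree up to floating-point rounding. The dot-product version is usually preferred because NumPy performs the sum in compiled code rather than in a Python loop.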


<a name="toc_15456_4"></a>
# Compute Cost With Multiple Variables
The equation for the cost function with multiple variables $J(\mathbf{w},b)$ is:
$$J(\mathbf{w},b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})^2 \tag{3}$$ 
where:
$$ f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b  \tag{4} $$ 


In contrast to previous labs, $\mathbf{w}$ and $\mathbf{x}^{(i)}$ are vectors rather than scalars, which lets the model support multiple features.
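
Because $\mathbf{w}$ and $\mathbf{x}^{(i)}$ are vectors, equation (3) can also be evaluated without an explicit loop over the examples. The block below is a minimal sketch of that idea, assuming the same data as the table above; `compute_cost_vectorized`, `X_demo`, and `y_demo` are hypothetical names introduced only for this illustration.

```python
import numpy as np

def compute_cost_vectorized(X, y, w, b):
    """Equation (3) without a Python loop (hypothetical helper)."""
    m = X.shape[0]
    errors = X @ w + b - y                    # f_wb(x^(i)) - y^(i) for every example, shape (m,)
    return np.dot(errors, errors) / (2 * m)   # sum of squared errors divided by 2m

# Training data copied from the problem statement
X_demo = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_demo = np.array([460, 232, 178])

# With all parameters at zero, every prediction is 0 and the errors are just -y
cost_zero = compute_cost_vectorized(X_demo, y_demo, np.zeros(4), 0.0)
print(cost_zero)   # (460**2 + 232**2 + 178**2) / 6 = 49518.0
```

With zero parameters the result can be checked by hand, which makes this a convenient sanity test before trusting the function on other parameter values.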

In [37]:
def Compute_Cost(x, y, w, b):
    """
    Compute the cost J(w, b) over all training examples
    Args:
      x (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter

    Returns:
      total_cost (scalar): cost
    """
    sum_cost = 0
    m = x.shape[0]                       # number of training examples (rows)
    for i in range(m):
        f_wb = np.dot(x[i], w) + b       # prediction for example i, equation (4)
        sum_cost += (f_wb - y[i]) ** 2   # squared error for example i

    total_cost = (1 / (2 * m)) * sum_cost   # equation (3)
    return total_cost


print("Run Successfully")

Run Successfully


In [49]:
# Compute the cost for the manually set parameters
cost = Compute_Cost(X_Train,Y_Train,w_init,b_init)
print("The cost is",cost)

The cost is 1.5578904428966628e-12


<a name="toc_15456_5"></a>
# Gradient Descent With Multiple Variables
Gradient descent for multiple variables:

$$\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline\;
& w_j = w_j -  \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j} \tag{5}  \; & \text{for j = 0..n-1}\newline
&b\ \ = b -  \alpha \frac{\partial J(\mathbf{w},b)}{\partial b}  \newline \rbrace
\end{align*}$$

where $n$ is the number of features, the parameters $w_j$ and $b$ are updated simultaneously, and

$$
\begin{align}
\frac{\partial J(\mathbf{w},b)}{\partial w_j}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})x_{j}^{(i)} \tag{6}  \\
\frac{\partial J(\mathbf{w},b)}{\partial b}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \tag{7}
\end{align}
$$
* $m$ is the number of training examples in the data set
* $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the target value
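
Equations (6) and (7) can be sanity-checked numerically. The sketch below, which is an illustration and not part of the lab, evaluates them in vectorized form and compares the result with a central finite-difference approximation of the cost. The names `cost_demo`, `gradient_vectorized`, `gradient_numeric`, and the step size `eps` are assumptions made for this example; the data is copied from the problem statement.

```python
import numpy as np

def cost_demo(X, y, w, b):
    """Equation (3), vectorized (hypothetical helper)."""
    errors = X @ w + b - y
    return np.dot(errors, errors) / (2 * X.shape[0])

def gradient_vectorized(X, y, w, b):
    """Equations (6) and (7), vectorized (hypothetical helper)."""
    m = X.shape[0]
    errors = X @ w + b - y        # shape (m,)
    dj_dw = X.T @ errors / m      # equation (6), one entry per feature
    dj_db = np.sum(errors) / m    # equation (7)
    return dj_dw, dj_db

def gradient_numeric(X, y, w, b, eps=1e-4):
    """Central finite differences of the cost; eps is an assumed step size."""
    dj_dw = np.zeros(w.shape[0])
    for j in range(w.shape[0]):
        step = np.zeros(w.shape[0])
        step[j] = eps
        dj_dw[j] = (cost_demo(X, y, w + step, b) - cost_demo(X, y, w - step, b)) / (2 * eps)
    dj_db = (cost_demo(X, y, w, b + eps) - cost_demo(X, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

X_demo = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_demo = np.array([460, 232, 178])
w0, b0 = np.zeros(4), 0.0

dj_dw_exact, dj_db_exact = gradient_vectorized(X_demo, y_demo, w0, b0)
dj_dw_num, dj_db_num = gradient_numeric(X_demo, y_demo, w0, b0)
print(dj_dw_exact, dj_db_exact)
print(np.allclose(dj_dw_exact, dj_dw_num, rtol=1e-6), np.isclose(dj_db_exact, dj_db_num))
```

Note how much larger the gradient component for size is than the others at the starting point. This follows directly from equation (6), where each error is multiplied by the feature value, and size is by far the largest feature.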


In [64]:
def Compute_Gradient(x, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
      x (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter

    Returns:
      dj_dw (ndarray (n,)): The gradient of the cost w.r.t. the parameters w.
      dj_db (scalar):       The gradient of the cost w.r.t. the parameter b.
    """
    num_rows, num_cols = x.shape
    sum_dj_dw = np.zeros(num_cols)   # accumulates the sum in equation (6), one entry per feature
    sum_dj_db = 0.0                  # accumulates the sum in equation (7)

    for i in range(num_rows):
        f_wb = np.dot(x[i], w) + b
        difference = f_wb - y[i]          # prediction error for example i
        sum_dj_dw += difference * x[i]    # error times every feature of example i
        sum_dj_db += difference

    dj_dw = (1 / num_rows) * sum_dj_dw
    dj_db = (1 / num_rows) * sum_dj_db

    return dj_dw, dj_db

print("Run Successfully")

Run Successfully


In [1]:
def Compute_Gradient_Descent(x, y, w_in, b_in, iteration, alpha, cost_function, Gradient_Descent):
    """
    Performs batch gradient descent to learn w and b. Updates w and b by taking
    `iteration` gradient steps with learning rate alpha

    Args:
      x (ndarray (m,n))   : Data, m examples with n features
      y (ndarray (m,))    : target values
      w_in (ndarray (n,)) : initial model parameters
      b_in (scalar)       : initial model parameter
      iteration (int)     : number of iterations to run gradient descent
      alpha (float)       : Learning rate
      cost_function       : function to compute cost
      Gradient_Descent    : function to compute the gradient

    Returns:
      w (ndarray (n,)) : Updated values of parameters
      b (scalar)       : Updated value of parameter
      J_History        : History of cost function changes with iteration
    """
    J_History = []

    for i in range(iteration):

        # Compute the gradient at the current parameters
        dj_dw, dj_db = Gradient_Descent(x, y, w_in, b_in)

        # w_in is an array, so the update is applied to every element at once
        w_in = w_in - alpha * dj_dw
        b_in = b_in - alpha * dj_db

        # Save cost J at each iteration
        if i < 100000:      # prevent resource exhaustion
            J_History.append(cost_function(x, y, w_in, b_in))

        # Print the cost every 100 iterations
        if i % 100 == 0:
            print(f"Iteration {i:4d}: Cost {J_History[-1]:8.2f}   ")

    return w_in, b_in, J_History

print("Run Successfully")

Run Successfully


In [82]:
w_init = np.zeros(4)
b_init = 0
iteration = 1000
alpha = 5.0e-7

#running gradient descent 
w_init, b_init, j_history = Compute_Gradient_Descent(X_Train,Y_Train,w_init,b_init,iteration,alpha,Compute_Cost,Compute_Gradient)

print("the final w is ",w_init)
print("the final b is ",b_init)

Iteration    0: Cost  2529.46   
Iteration  100: Cost   695.99   
Iteration  200: Cost   694.92   
Iteration  300: Cost   693.86   
Iteration  400: Cost   692.81   
Iteration  500: Cost   691.77   
Iteration  600: Cost   690.73   
Iteration  700: Cost   689.71   
Iteration  800: Cost   688.70   
Iteration  900: Cost   687.69   
the final w is  [ 0.2   0.   -0.01 -0.07]
the final b is  -0.002235407530932535


### Gradient descent has found w and b automatically, but the cost is still high

The cost is still about 688 at iteration 900 and is falling only slowly, which is not a good result.

Let's test the model to see how it performs. We use the training data as input and compare each prediction with the actual value.

In [83]:
size, _ = X_Train.shape
for i in range(size):
    pred_value = Predict(X_Train[i], w_init, b_init)   # w_init and b_init now hold the learned values
    diff = Y_Train[i] - pred_value
    print(f"Example {i}: predicted {pred_value:0.4f}, actual {Y_Train[i]:0.4f}, difference {diff:0.4f}")

Example 0: predicted 426.1853, actual 460.0000, difference 33.8147
Example 1: predicted 286.1675, actual 232.0000, difference -54.1675
Example 2: predicted 171.4676, actual 178.0000, difference 6.5324
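
The three differences printed above tie back to the cost in equation (3): squaring them, summing, and dividing by $2m$ should reproduce the final training cost. The sketch below does this with the rounded values copied from the output, so the result is only approximate; the variable names are illustrative.

```python
import numpy as np

# Differences copied from the printed output above (rounded to 4 decimals)
differences = np.array([33.8147, -54.1675, 6.5324])
m = differences.shape[0]

# Equation (3): sum of squared errors divided by 2m
cost_from_differences = np.sum(differences ** 2) / (2 * m)
print(f"{cost_from_differences:0.2f}")   # about 686.70
```

This value sits just below the 687.69 reported at iteration 900, which is consistent with the slow decrease of roughly one unit per 100 iterations seen in the training log.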
