# House Price Prediction Model

### Problem statement

The training dataset contains three examples with four features (size, bedrooms, floors, and age), shown in the table below. Note that size is in sqft rather than 1000 sqft. This causes an issue, which you will solve in the next lab!


| Size (sqft) | Number of Bedrooms | Number of Floors | Age of Home (years) | Price (1000s of dollars) |
| ----------- | ------------------ | ---------------- | ------------------- | ------------------------ |
| 2104        | 5                  | 1                | 45                  | 460                      |
| 1416        | 3                  | 2                | 40                  | 232                      |
| 852         | 2                  | 1                | 35                  | 178                      |


You will build a linear regression model using these values so you can then predict the price of other houses, for example a 40-year-old house with 1200 sqft, 3 bedrooms, and 1 floor.

- n = 4 (number of features; j indexes the jth feature and is written as a subscript)
- m = 3 (number of training examples; i indexes the ith training example and is written as a superscript)

### Parameters to learn (number of features + 1)
- w = [w1, w2, w3, w4]
- b
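
For reference, with these parameters the linear model evaluated in the cells below can be written as follows. This is a restatement, not a new result; the symbols $x_1, \dots, x_4$ for the four features (size, bedrooms, floors, age) are notation assumed here to match $w_1, \dots, w_4$ above.

$$
f_{\mathbf{w},b}(\mathbf{x}) = w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + b = \mathbf{w} \cdot \mathbf{x} + b
$$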


### Import packages

In [4]:
import numpy as np
import pandas as pd

### Create a training dataset

In [5]:
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])

In [6]:
# data is stored in a NumPy array (matrix)
print(f"X Shape: {X_train.shape}, X Type: {type(X_train)}")
print(X_train)
print(f"y Shape: {y_train.shape}, y Type: {type(y_train)}")
print(y_train)

X Shape: (3, 4), X Type: <class 'numpy.ndarray'>
[[2104    5    1   45]
 [1416    3    2   40]
 [ 852    2    1   35]]
y Shape: (3,), y Type: <class 'numpy.ndarray'>
[460 232 178]


### Parameter initialization

In [10]:
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])
print(f"w_init shape: {w_init.shape}, b_init type: {type(b_init)}")

w_init shape: (4,), b_init type: <class 'float'>


### Single prediction: use w and b with a given feature vector to predict the value

In [11]:
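def single_prediction_vector(x, w, b):
    """
    Predict a single value with linear regression.
    Args:
      x (ndarray (n,)): one example with n features
      w (ndarray (n,)): model parameters
      b (scalar)      : model parameter
    Returns:
      p (scalar): prediction
    """
    p = np.dot(x, w) + b
    return p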


In [13]:
# take row 0 (all columns) and predict its value with the initialized parameters
single_prediction_vector(X_train[0, :], w_init, b_init)

459.9999976194083
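
As a quick illustration, the same dot-product prediction can be applied to the example house from the problem statement (1200 sqft, 3 bedrooms, 1 floor, 40 years old). This is a minimal sketch: `predict`, `x_house`, and `price` are names introduced here for illustration, and the parameters are the `w_init`, `b_init` values above. They fit the three training rows closely, but nothing in this notebook validates them on other houses.

```python
import numpy as np

# Parameters copied from the initialization cell above.
b_init = 785.1811367994083
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])

def predict(x, w, b):
    """Linear-regression prediction f_wb(x) = w . x + b for one example."""
    return np.dot(x, w) + b

# Hypothetical query, with features in the same order as X_train:
# size (sqft), bedrooms, floors, age (years).
x_house = np.array([1200, 3, 1, 40])
price = predict(x_house, w_init, b_init)
print(f"Predicted price: {price:.2f} (in 1000s of dollars)")
```

The feature order must match the column order of `X_train`; otherwise each weight is applied to the wrong feature.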

### Compute the cost of the model (squared-error cost)

In [18]:
def compute_cost(X, y, w, b): 
    """
    Compute the squared-error cost J(w, b) over all training examples.
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters  
      b (scalar)       : model parameter
      
    Returns:
      cost (scalar): cost
      
    """
    
    
    m = X.shape[0]
    cost = 0

    for i in range(m):
        y_hat = np.dot(X[i], w) + b            # prediction for example i
        y_actual = y[i]
        cost = cost + (y_hat - y_actual) ** 2  # accumulate squared error
    cost = cost / (2 * m)                      # divide the sum by 2m
    return cost

In [19]:
total_cost = compute_cost(X_train, y_train, w_init, b_init)

In [22]:
total_cost

1.5578904428966628e-12
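
The loop above can also be written without an explicit Python loop. The following is a sketch of one possible vectorized form, not part of the original notebook; `compute_cost_vectorized` is a hypothetical name, and the result may differ from the loop version in the last few floating-point digits.

```python
import numpy as np

def compute_cost_vectorized(X, y, w, b):
    """Squared-error cost J(w, b) = sum((X @ w + b - y)**2) / (2 * m)."""
    m = X.shape[0]
    errors = X @ w + b - y              # shape (m,): prediction minus target
    return np.sum(errors ** 2) / (2 * m)

# Same data and parameters as in the cells above.
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])
b_init = 785.1811367994083
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])

print(compute_cost_vectorized(X_train, y_train, w_init, b_init))
```

`X @ w` computes all m predictions in one matrix-vector product, so the cost is a single sum over the error vector.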

### Gradient Descent With Multiple Variables



Gradient descent for multiple variables:

$$\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline\;
& w_j = w_j -  \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j} \tag{5}  \; & \text{for j = 0..n-1}\newline
&b\ \ = b -  \alpha \frac{\partial J(\mathbf{w},b)}{\partial b}  \newline \rbrace
\end{align*}$$

where $n$ is the number of features, the parameters $w_j$ and $b$ are updated simultaneously, and

$$
\begin{align}
\frac{\partial J(\mathbf{w},b)}{\partial w_j}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})x_{j}^{(i)} \tag{6}  \\
\frac{\partial J(\mathbf{w},b)}{\partial b}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \tag{7}
\end{align}
$$
* $m$ is the number of training examples in the data set
* $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the target value
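
Equations (6) and (7) follow from the chain rule applied to the cost. As a short worked derivation, assume the cost is the squared-error cost that `compute_cost` implements and that the model is $f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b$:

$$
\begin{align*}
J(\mathbf{w},b) &= \frac{1}{2m} \sum\limits_{i = 0}^{m-1} \left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)^2 \\
\frac{\partial J(\mathbf{w},b)}{\partial w_j} &= \frac{1}{2m} \sum\limits_{i = 0}^{m-1} 2 \left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right) \frac{\partial f_{\mathbf{w},b}(\mathbf{x}^{(i)})}{\partial w_j} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} \left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right) x_{j}^{(i)}
\end{align*}
$$

The derivative with respect to $b$ is obtained the same way, except that $\partial f_{\mathbf{w},b}(\mathbf{x}^{(i)}) / \partial b = 1$, which gives equation (7). The factor $\frac{1}{2}$ in the cost cancels the 2 produced by differentiating the square.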

In [None]:
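# calculate dj_dw, dj_db

def compute_gradient(X, y, w, b):
    """
    Compute the gradients of the cost, equations (6) and (7).
    Args: X (ndarray (m,n)), y (ndarray (m,)), w (ndarray (n,)), b (scalar)
    Returns: dj_dw (ndarray (n,)), dj_db (scalar)
    """
    m = X.shape[0]
    err = np.dot(X, w) + b - y    # prediction errors, shape (m,)
    dj_dw = np.dot(X.T, err) / m  # equation (6)
    dj_db = np.sum(err) / m       # equation (7)
    return dj_dw, dj_db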