# Multiple Regression

In [12]:
import copy, math
import numpy as np
from matplotlib import pyplot as plt


## 1.3 Notation
Here is a summary of some of the notation you will encounter, updated for multiple features.  

| General Notation | Description | Python (if applicable) |
|:-----------------|:------------|:-----------------------|
| $a$ | scalar, non-bold | |
| $\mathbf{a}$ | vector, bold | |
| $\mathbf{A}$ | matrix, bold capital | |
| **Regression** | | |
| $\mathbf{X}$ | training example matrix | `x_train` |
| $\mathbf{y}$ | training example targets | `y_train` |
| $\mathbf{x}^{(i)}$, $y^{(i)}$ | $i^{th}$ training example | `x_train[i]`, `y_train[i]` |
| $m$ | number of training examples | `m` |
| $n$ | number of features in each example | `n` |
| $\mathbf{w}$ | parameter: weight | `w` |
| $b$ | parameter: bias | `b` |
| $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ | the result of the model evaluation at $\mathbf{x}^{(i)}$, parameterized by $\mathbf{w},b$: $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)}+b$ | `f_wb` |


# 2 Problem Statement

The task is housing price prediction. The training dataset contains three examples with four features (size, bedrooms, floors, and age).

| Size (sqft) | Number of Bedrooms  | Number of floors | Age of  Home | Price (1000s dollars)  |   
| ----------------| ------------------- |----------------- |--------------|-------------- |  
| 2104            | 5                   | 1                | 45           | 460           |  
| 1416            | 3                   | 2                | 40           | 232           |  
| 852             | 2                   | 1                | 35           | 178           |  


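The goal is to fit a linear regression model to these values so that it can predict the price of other houses, for example a house with 1200 sqft, 3 bedrooms, and 1 floor that is 40 years old.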



In [28]:
# The features are entered one per row, then transposed so that each row is
# one training example and each column is one feature (size, bedrooms, floors, age).
x_train = np.array([[2104, 1416, 852], [5, 3, 2], [1, 2, 1], [45, 40, 35]])
y_train = np.array([460, 232, 178])  # prices in 1000s of dollars
x_train = np.transpose(x_train)
print(x_train)
print(x_train.shape)
dimension = 3

[[2104    5    1   45]
 [1416    3    2   40]
 [ 852    2    1   35]]
(3, 4)


$$\mathbf{X} = 
\begin{pmatrix}
 x^{(0)}_0 & x^{(0)}_1 & \cdots & x^{(0)}_{n-1} \\ 
 x^{(1)}_0 & x^{(1)}_1 & \cdots & x^{(1)}_{n-1} \\
 \vdots & \vdots & \ddots & \vdots \\
 x^{(m-1)}_0 & x^{(m-1)}_1 & \cdots & x^{(m-1)}_{n-1} 
\end{pmatrix}
$$
Notation:
- $\mathbf{x}^{(i)}$ is a vector containing example $i$: $\mathbf{x}^{(i)} = (x^{(i)}_0, x^{(i)}_1, \cdots, x^{(i)}_{n-1})$
- $x^{(i)}_j$ is element $j$ of example $i$. The superscript in parentheses indicates the example number, while the subscript indicates the element.
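
As a small illustration of this indexing, the sketch below maps the notation onto NumPy. It rebuilds the same training matrix so that it runs on its own; the names `x_1` and `x_1_0` are introduced here only for illustration.

```python
import numpy as np

# Same training matrix as in this notebook: rows are examples, columns are features.
x_train = np.array([[2104, 5, 1, 45],
                    [1416, 3, 2, 40],
                    [852, 2, 1, 35]])

m, n = x_train.shape    # m training examples, n features
x_1 = x_train[1]        # vector x^(1): all features of example 1
x_1_0 = x_train[1, 0]   # scalar x^(1)_0: feature 0 (size) of example 1

print(f"m = {m}, n = {n}")
print(f"x^(1) = {x_1}")
print(f"x^(1)_0 = {x_1_0}")
```

Both the example index and the feature index start at 0, matching the $0 \ldots m-1$ and $0 \ldots n-1$ ranges in the matrix above.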

## Parameters: w and b
Here $\mathbf{w}$ is a vector with $n$ elements, one per feature, and $b$ is a scalar.

$$\mathbf{w} = \begin{pmatrix}
w_0 \\
w_1 \\
\vdots \\
w_{n-1}
\end{pmatrix}
$$

- $b$ is a scalar parameter




In [45]:
# Given in the source file, values of w and b to continue demonstration
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])
print(w_init)
print(x_train)
print(f"w_init shape: {w_init.shape}, b_init type: {type(b_init)}")
print(len(x_train[0]))
print(len(x_train))
# w has 4 elements, one per feature
# (the values of w_init and b_init are copied from the source)

[  0.39133535  18.75376741 -53.36032453 -26.42131618]
[[2104    5    1   45]
 [1416    3    2   40]
 [ 852    2    1   35]]
w_init shape: (4,), b_init type: <class 'float'>
4
3


## Linear Regression Model: Multiple Variables 
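Note: the `\mathbf` command typesets a vector in bold.

$$ f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b $$
- $\mathbf{w}$ and $b$ are the parameters
- $\mathbf{w}$ and $\mathbf{x}$ are vectors with $n$ elements each
- $\mathbf{w}\cdot\mathbf{x}$ is the dot product, $w_0x_0 + w_1x_1 + \cdots + w_{n-1}x_{n-1}$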


In [37]:
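# Single prediction using a for loop over the features
# x is one example (e.g. x_train[0]); w holds one weight per feature
def predict_single_loop(w, x, b):
    p = 0  # running sum of w[i] * x[i]
    n = x.shape[0]  # number of features
    for i in range(n):
        p_i = w[i] * x[i]
        p = p + p_i
    p = p + b  # add the bias once, after the loop
    return p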

In [40]:
x_vec = x_train[0, :]
print(x_vec)

f_wb = predict_single_loop(w_init,x_vec,b_init)
print(f_wb)

[2104    5    1   45]
459.9999976194083


In [41]:
# Single predict - vectorization
def predict(x,w,b):
    return np.dot(x,w)+b

In [43]:
x_vec = x_train[0, :]
print(x_vec)

f_wb = predict(x_vec, w_init, b_init)  # argument order matches predict(x, w, b)
print(f_wb)

[2104    5    1   45]
459.9999976194083
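
The vectorized form extends naturally to new inputs and to whole batches. The sketch below applies it to the house mentioned in the problem statement (1200 sqft, 3 bedrooms, 1 floor, 40 years old) and then to all training examples at once. The names `x_house`, `price`, and `preds` are introduced here for illustration, and using `w_init` and `b_init` as the fitted parameters is an assumption based on the values given above.

```python
import numpy as np

# Parameters and training data as given earlier in this notebook.
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])
b_init = 785.1811367994083
x_train = np.array([[2104, 5, 1, 45],
                    [1416, 3, 2, 40],
                    [852, 2, 1, 35]])

# Illustrative query: size, bedrooms, floors, age of the house from the problem statement.
x_house = np.array([1200, 3, 1, 40])
price = np.dot(x_house, w_init) + b_init
print(f"Predicted price: {price:.2f} (1000s of dollars)")

# A matrix-vector product computes one prediction per row of x_train.
preds = x_train @ w_init + b_init
print(preds)
```

Because `x_train` has shape `(m, n)` and `w_init` has shape `(n,)`, the product has shape `(m,)`, and the scalar bias is broadcast across all predictions.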


### Compute Cost Function

$$ J(\mathbf{w},b) = \frac{1}{2m} \sum\limits_{i=0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})^2 $$
where
$$ f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b $$

In [51]:
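def compute_cost(x, w, y, b):
    error = 0
    m = len(x)  # number of training examples
    for i in range(m):
        sum_i = np.dot(x[i], w) + b  # model prediction for example i
        error_i = (sum_i - y[i])**2  # squared error for example i
        error = error + error_i
    error = error/(2*m)  # divide by 2m, as in J(w,b)
    return error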
        
    


In [None]:
cost = compute_cost(x_train, w_init, y_train, b_init)
print(cost)
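
The same cost can also be computed without an explicit loop. The following is a minimal sketch, not part of the source notebook: `compute_cost_vectorized` is a hypothetical helper name, and it assumes `x` has shape `(m, n)`, `w` has shape `(n,)`, and `y` has shape `(m,)`.

```python
import numpy as np

def compute_cost_vectorized(x, w, y, b):
    """Squared-error cost J(w, b) computed with array operations."""
    m = x.shape[0]              # number of training examples
    errors = x @ w + b - y      # prediction error for every example at once
    return np.sum(errors ** 2) / (2 * m)

# Data and parameters as given earlier in this notebook.
x_train = np.array([[2104, 5, 1, 45],
                    [1416, 3, 2, 40],
                    [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])
b_init = 785.1811367994083

cost = compute_cost_vectorized(x_train, w_init, y_train, b_init)
print(f"Cost at w_init, b_init: {cost}")

# Baseline for comparison: all-zero parameters predict 0 for every house.
cost_zero = compute_cost_vectorized(x_train, np.zeros(4), y_train, 0.0)
print(f"Cost at zero parameters: {cost_zero}")
```

Since `w_init` and `b_init` reproduce the training prices almost exactly, the first cost should be very close to zero, while the all-zero baseline is large.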