# Machine Learning - Supervised Learning - Regression
# Linear Regression
## Part 2: Linear Regression

## Reading Data

In [3]:
import numpy as np
import pandas as pd

In [15]:
df = pd.read_csv("Modified_Housing.csv")
df.head()

Unnamed: 0.1,Unnamed: 0,area,bedrooms,bathrooms,stories,mainroad,guestroom,basement,hotwaterheating,airconditioning,parking,prefarea,furnishingstatus,price
0,0,0.651977,4,2,3,1,0,0,0,1,2,1,0,13300000
1,1,0.825989,4,4,4,1,0,0,0,1,3,0,0,12250000
2,2,0.938983,3,2,2,1,0,1,0,0,2,1,1,12250000
3,3,0.661017,4,2,2,1,0,1,0,1,3,1,0,12215000
4,4,0.651977,4,1,2,1,1,1,0,1,2,0,0,11410000


In [27]:
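# Drop the leftover index columns ("Unnamed: ...") written by earlier to_csv calls.
# Unlike dropping by position, this is safe to re-run.
df = df.loc[:, ~df.columns.str.startswith("Unnamed")]
df.shape[1]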

13

### 1- Regression Linear Equation

- for simplicity, the features (columns) are assigned mathematical variables such as $x_1$, $x_2$, ... as illustrated in the table below

|Index | Feature Name   | Variable Name |
|:----:|:-------------  |:--------------|
|0     |area            |$$x_1$$        |
|1     |bedrooms        |$$x_2$$        |
|2     |bathrooms       |$$x_3$$        |
|3     |stories         |$$x_4$$        |
|4     |mainroad        |$$x_5$$        |
|5     |guestroom       |$$x_6$$        |
|6     |basement        |$$x_7$$        |
|7     |hotwaterheating |$$x_8$$        |
|8     |airconditioning |$$x_9$$        |
|9     |parking         |$$x_{10}$$       |
|10    |prefarea        |$$x_{11}$$       |
|11    |furnishingstatus|$$x_{12}$$       |

- the target is the "**price**" column, denoted $f(x)$ or $y$

- number of features (columns) is: **m** = 12

- sample size is: **n**

- linear equation of one data point will be:

  $f(x) = y = a_1x_1 + a_2x_2 + a_3x_3 + \dots + a_mx_m + a_0$


- the equation above can be written in matrix notation:

  ${\begin{bmatrix} y \end{bmatrix}}_{1x1} = {\begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_m & 1\end{bmatrix}}_{1x13} . {\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_m \\ a_0 \end{bmatrix}}_{13x1} $

- the equation for all data points:

  ${\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix}}_{nx1} = {\begin{bmatrix} {x_1}_1 & {x_2}_1 & {x_3}_1 & \cdots & {x_m}_1 & 1 \\ {x_1}_2 & {x_2}_2 & {x_3}_2 & \cdots & {x_m}_2 & 1 \\ {x_1}_3 & {x_2}_3 & {x_3}_3 & \cdots & {x_m}_3 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ {x_1}_n & {x_2}_n & {x_3}_n & \cdots & {x_m}_n & 1 \end{bmatrix}}_{nx13} . {\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_m \\ a_0 \end{bmatrix}}_{13x1} $

- which is equivalent to:

  ${\begin{bmatrix} \text{Target Matrix} \end{bmatrix}}_{n \times 1} = {\begin{bmatrix} y \end{bmatrix}}_{n \times 1} = {\begin{bmatrix} \text{Features Matrix} \end{bmatrix}}_{n \times (m+1)} \cdot {\begin{bmatrix} \text{Linear Equation Coefficients Matrix} \end{bmatrix}}_{(m+1) \times 1}$
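The matrices above could be assembled with NumPy roughly as follows. This is a minimal sketch: the helper name `build_matrices` and the three-row `toy_df` are illustrative assumptions, not part of the notebook, and the column of ones is placed last to match the layout in the equations.

```python
import numpy as np
import pandas as pd

def build_matrices(frame, target="price"):
    """Return (X, y): X is n x (m+1) with a trailing column of ones, y is n x 1."""
    features = frame.drop(columns=[target]).to_numpy(dtype=float)
    ones = np.ones((features.shape[0], 1))       # multiplies the intercept a_0
    X = np.hstack([features, ones])
    y = frame[[target]].to_numpy(dtype=float)    # double brackets keep y as a column
    return X, y

# Hypothetical sample with two features instead of twelve
toy_df = pd.DataFrame({"area": [0.65, 0.83, 0.94],
                       "bedrooms": [4, 4, 3],
                       "price": [13300000, 12250000, 12250000]})
X, y = build_matrices(toy_df)
print(X.shape, y.shape)   # (3, 3) (3, 1)
```

With the full housing data, the same call would give a features matrix of shape n x 13 and a target matrix of shape n x 1.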

### 2- Assuming Linear Equation Coefficients

*The coefficient matrix can be initialized with random values generated by NumPy*
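One possible way to draw such a starting matrix is sketched below. The fixed seed is an assumption made only so the example is reproducible; any random start would do.

```python
import numpy as np

m = 12                                  # number of features, as defined above
rng = np.random.default_rng(seed=0)     # seed chosen arbitrarily for reproducibility
coeff_mat = rng.random((m + 1, 1))      # uniform values in [0, 1); the last row is a_0
print(coeff_mat.shape)                  # (13, 1)
```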


### 3- Equation Evaluation & Finding Y Prediction ($\hat{y}$)

*Using the same equation as for the target matrix, with the assumed coefficients:*

${\begin{bmatrix} \hat{y} \end{bmatrix}}_{n \times 1} = {\begin{bmatrix} \text{Features Matrix} \end{bmatrix}}_{n \times (m+1)} \cdot {\begin{bmatrix} \text{Assumed Linear Equation Coefficients Matrix} \end{bmatrix}}_{(m+1) \times 1}$
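In code, this evaluation is a single matrix product. The sketch below uses made-up numbers (two records, one feature plus the column of ones) purely to show the shapes involved.

```python
import numpy as np

def y_hat(features_mat, coeff_mat):
    """Predicted targets: (n x (m+1)) @ ((m+1) x 1) -> (n x 1)."""
    return features_mat @ coeff_mat

# Hypothetical data: each row is [x_1, 1]
features_mat = np.array([[2.0, 1.0],
                         [3.0, 1.0]])
coeff_mat = np.array([[0.5],     # a_1
                      [1.0]])    # a_0
pred = y_hat(features_mat, coeff_mat)
print(pred)   # [[2. ], [2.5]] since 0.5*2 + 1 = 2 and 0.5*3 + 1 = 2.5
```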

### 4- Calculating Mean Squared Error (MSE)
${\begin{bmatrix} error \end{bmatrix}}_{n \times 1} = {\begin{bmatrix} y \end{bmatrix}}_{n \times 1} - {\begin{bmatrix} \hat{y} \end{bmatrix}}_{n \times 1}$

$MSE = {1 \over n} \sum\limits_{i=0}^{n-1} {error_i}^2 = {1 \over n} \sum\limits_{i=0}^{n-1} {(y_i-\hat{y}_i)^2} = {1 \over n} \sum {\begin{bmatrix} error^2 \end{bmatrix}}_{n \times 1}$

In the last form the square is applied element-wise and the sum runs over all n entries of the resulting matrix.
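A small numeric check of the two formulas, assuming hypothetical values for $y$ and $\hat{y}$ (the function name `error_and_mse` is illustrative):

```python
import numpy as np

def error_and_mse(y, y_pred):
    """Return the n x 1 error matrix and the scalar mean squared error."""
    error = y - y_pred
    mse = float(np.mean(error ** 2))   # element-wise square, then average over n
    return error, mse

y = np.array([[3.0], [5.0]])
y_pred = np.array([[2.0], [2.5]])
error, mse = error_and_mse(y, y_pred)
print(mse)   # (1.0**2 + 2.5**2) / 2 = 3.625
```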

### 5- Gradient Descent Algorithm
- the following equation updates a single coefficient (index 1, for instance) using all data points; since $\hat{y}_i$ contains the term $a_1 x_{1i}$, we have ${\partial \hat{y}_i \over \partial a_1} = x_{1i}$

  $a_{new_1} = a_1 - \alpha \cdot {\partial MSE \over \partial a_1} = 
a_1 - \alpha \cdot {\partial \over \partial a_1} \left[ {1 \over n} \sum\limits_{i=0}^{n-1} (y_i - \hat{y}_i)^2 \right] =
a_1 - \alpha \cdot {- 2 \over n} \cdot \sum\limits_{i=0}^{n-1} {(y_i - \hat{y}_i) \cdot x_{1i}} = 
a_1 + {2 \alpha  \over n} \cdot \sum\limits_{i=0}^{n-1} {x_{1i}(y_i - \hat{y}_i)}$


- for any coefficient index j:
  
  $a_{new_j} = a_j - \alpha \cdot {\partial MSE \over \partial a_j} = 
a_j - \alpha \cdot {\partial \over \partial a_j} \left[ {1 \over n} \sum\limits_{i=0}^{n-1} (y_i - \hat{y}_i)^2 \right] =
a_j + {2 \alpha  \over n} \cdot \sum\limits_{i=0}^{n-1} {x_{ji}(y_i - \hat{y}_i)}$

   where,
   - $\alpha$ is the learning rate
   - i: looping index for sample records (0 to n-1)
   - j: looping index for coefficients (0 to m); $x_{ji}$ is the value of feature j in record i, with $x_{0i} = 1$ for the intercept $a_0$
          
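- the equation for all coefficients is:

  $ {\begin{bmatrix} {a_{new}}_1 \\ {a_{new}}_2 \\ \vdots \\ {a_{new}}_{12} \\ {a_{new}}_0  \end{bmatrix}}_{13 \times 1}= {\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_{12} \\ a_0  \end{bmatrix}}_{13 \times 1} + {2\alpha \over n} \cdot {\begin{bmatrix} \sum\limits_{i=0}^{n-1} {x_{1i}(y_i - \hat{y}_i)} \\ \sum\limits_{i=0}^{n-1} {x_{2i}(y_i - \hat{y}_i)} \\ \vdots \\ \sum\limits_{i=0}^{n-1} {x_{12,i}(y_i - \hat{y}_i)} \\ \sum\limits_{i=0}^{n-1} {(y_i - \hat{y}_i)}  \end{bmatrix}}_{13 \times 1} $
  
- in another way:

  ${\begin{bmatrix} \text{new coefficient matrix} \end{bmatrix}}_{(m+1) \times 1} = {\begin{bmatrix} \text{coefficient matrix} \end{bmatrix}}_{(m+1) \times 1} + {2\alpha \over n} \cdot {\begin{bmatrix} \text{derivative matrix} \end{bmatrix}}_{(m+1) \times 1}$

  where the derivative matrix is ${\begin{bmatrix} \text{features matrix} \end{bmatrix}}^T \cdot {\begin{bmatrix} error \end{bmatrix}}$, which equals $-{n \over 2}$ times the gradient of the MSE with respect to the coefficients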
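The update rule can be tried on a tiny problem. The sketch below is an illustration under assumptions: the function name `gradient_step`, the data (generated from $y = 2x + 1$), the learning rate, and the iteration count are all made up for the example.

```python
import numpy as np

def gradient_step(X, y, coeff_mat, lr):
    """One update: coeff + (2 * lr / n) * X^T (y - y_hat)."""
    n = X.shape[0]
    error = y - X @ coeff_mat
    d_mat = X.T @ error        # row j holds sum_i x_ji * (y_i - y_hat_i)
    return coeff_mat + (2 * lr / n) * d_mat

# Hypothetical data from y = 2*x + 1; the last column of X is the column of ones
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
y = np.array([[1.0], [3.0], [5.0], [7.0]])

first = gradient_step(X, y, np.zeros((2, 1)), lr=0.05)
print(first)   # [[0.85], [0.4]]: 0.025 * X^T y = 0.025 * [34, 16]

coeff = np.zeros((2, 1))
for _ in range(5000):
    coeff = gradient_step(X, y, coeff, lr=0.05)
print(coeff)   # approaches [[2.], [1.]]
```

If the learning rate is too large for the scale of the features, the iterates can diverge instead of converging, so on real data the value of `lr` usually needs some experimentation.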


---

## Class Implementation
### Class Members
#### Public Members
- **num_iterations**: number of iterations
- **lr** : learning rate
- **df** : DataFrame that contains the sample data

#### Private Members
- **n**: number of samples
- **coeff_mat**: coefficient matrix ((m+1) x 1)
- **features_mat**: features matrix (n x (m+1))
- **y** : target matrix (n x 1)

### Class Methods
#### Public Methods
- **__init__** : constructor; takes the public members and derives the private members
- **Fit**: runs the training and returns the trained model
- **Predict**: uses the trained model to predict house prices

#### Private Methods
- **y_hat** : returns the predicted values
- **ErrorCalc** : returns the error and the MSE
- **DMat**: returns the derivative matrix
- **GD**: Gradient Descent Algorithm; returns the final coefficient matrix

In [32]:
import numpy as np
import pandas as pd

In [59]:
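class LinearRegression:
    def __init__(self, df, lr=0.01, num_iterations=1000):
        # public members
        self.df = pd.DataFrame(df)
        self.lr = lr                             # learning rate (alpha)
        self.num_iterations = num_iterations
        # private members
        self.n = self.df.shape[0]                # number of samples
        self.m = self.df.shape[1] - 1            # number of features (all columns except the target)
        np.random.seed()                         # reseed from fresh entropy, so each run starts differently
        self.coeff_mat = np.random.rand(self.m + 1, 1)   # random start in [0, 1), shape (m+1, 1)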

In [60]:
test = LinearRegression(df)
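The constructor above does not yet build the features matrix or any of the listed methods. One way the remaining pieces might fit together is sketched below. Everything here is an assumption rather than the notebook's own code: the class name `LinearRegressionSketch`, the underscore-prefixed private method names, the `seed` parameter, the choice of the last column as the target, and the toy data generated from price = 3 * area + 2.

```python
import numpy as np
import pandas as pd

class LinearRegressionSketch:
    """Hypothetical completion of the outlined class, for illustration only."""

    def __init__(self, df, lr=0.01, num_iterations=1000, seed=None):
        self.df = pd.DataFrame(df)
        self.lr = lr
        self.num_iterations = num_iterations
        self.n = self.df.shape[0]
        self.m = self.df.shape[1] - 1
        # Assumption: the target is the last column of the frame
        values = self.df.to_numpy(dtype=float)
        self.features_mat = np.hstack([values[:, :-1], np.ones((self.n, 1))])
        self.y = values[:, -1:].copy()
        self.coeff_mat = np.random.default_rng(seed).random((self.m + 1, 1))

    def _y_hat(self, features_mat):
        return features_mat @ self.coeff_mat

    def _error_calc(self):
        error = self.y - self._y_hat(self.features_mat)
        return error, float(np.mean(error ** 2))

    def _d_mat(self, error):
        # Row j is sum_i x_ji * (y_i - y_hat_i)
        return self.features_mat.T @ error

    def _gd(self):
        for _ in range(self.num_iterations):
            error, _mse = self._error_calc()
            self.coeff_mat = self.coeff_mat + (2 * self.lr / self.n) * self._d_mat(error)
        return self.coeff_mat

    def Fit(self):
        return self._gd()

    def Predict(self, features):
        features = np.asarray(features, dtype=float).reshape(-1, self.m)
        ones = np.ones((features.shape[0], 1))
        return self._y_hat(np.hstack([features, ones]))

# Hypothetical data: price = 3 * area + 2
toy = pd.DataFrame({"area": [0.0, 0.25, 0.5, 0.75, 1.0]})
toy["price"] = 3 * toy["area"] + 2
model = LinearRegressionSketch(toy, lr=0.1, num_iterations=5000, seed=0)
model.Fit()
print(model.Predict([[2.0]]))   # close to [[8.]]
```

On the housing data, where prices are in the millions and most features are unscaled, the learning rate and iteration count would likely need tuning before this converges.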

In [62]:
df[:-1]

Unnamed: 0,area,bedrooms,bathrooms,stories,mainroad,guestroom,basement,hotwaterheating,airconditioning,parking,prefarea,furnishingstatus,price
0,0.651977,4,2,3,1,0,0,0,1,2,1,0,13300000
1,0.825989,4,4,4,1,0,0,0,1,3,0,0,12250000
2,0.938983,3,2,2,1,0,1,0,0,2,1,1,12250000
3,0.661017,4,2,2,1,0,1,0,1,3,1,0,12215000
4,0.651977,4,1,2,1,1,1,0,1,2,0,0,11410000
...,...,...,...,...,...,...,...,...,...,...,...,...,...
539,0.151412,2,1,1,0,0,0,0,0,1,0,2,1855000
540,0.152542,2,1,1,1,0,1,0,0,2,0,2,1820000
541,0.084746,3,1,1,0,0,0,0,0,0,0,1,1767150
542,0.222599,2,1,1,1,0,0,0,0,0,0,2,1750000
