# Simple Linear Regression

## Important Function Derivations

Note on notation:
* $x^{(i)}$ is a vector including the feature values for the $i^{th}$ datapoint
* $y^{(i)}$ is the label value for the $i^{th}$ datapoint

$X$ is the matrix whose rows are the transposed feature vectors, one row per datapoint: $$X=\begin{pmatrix}(x^{(1)})^{T} \\ (x^{(2)})^{T} \\ \vdots \\ (x^{(m)})^{T}\end{pmatrix}$$

**Hypothesis Function**:
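$$H(x)=\theta_{0}+\theta_{1}x$$

Here $\theta_{0}$ is the intercept and $\theta_{1}$ is the slope. In simple linear regression each $x^{(i)}$ holds a single feature, so $H(x^{(i)})=\theta_{0}+\theta_{1}x^{(i)}$.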

**Cost (Squared Error) Function:**
$$J(\theta_{0},\theta_{1})=\frac{1}{2m}\sum^{m}_{i=1}[H(x^{(i)})-y^{(i)}]^2$$
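
As a rough illustration of the cost function above, the sketch below evaluates $J$ with NumPy on a tiny hand-made dataset. The function name `squared_error_cost` and the toy arrays are assumptions made for this example; they are not part of the notebook's code or its dataset.

```python
import numpy as np

def squared_error_cost(theta, x, y):
    """Return J(theta_0, theta_1) = 1/(2m) * sum((H(x_i) - y_i)^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)
    # Residual H(x_i) - y_i for every datapoint at once
    residuals = theta[0] + theta[1] * x - y
    return np.sum(residuals ** 2) / (2 * m)

# Toy data lying exactly on the line y = 1 + 2x (illustrative only)
x_toy = np.array([0.0, 1.0, 2.0])
y_toy = np.array([1.0, 3.0, 5.0])

cost_at_truth = squared_error_cost(np.array([1.0, 2.0]), x_toy, y_toy)
cost_at_zero = squared_error_cost(np.zeros(2), x_toy, y_toy)
print(cost_at_truth)  # 0.0, because the line fits the toy data exactly
print(cost_at_zero)   # (1 + 9 + 25) / 6
```

The cost is zero only when the line passes through every point, and it grows with the squared residuals otherwise, which is what gradient descent tries to reduce.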

**Gradient Descent:**
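$$T_{j} := \theta_{j} - \alpha\frac{\partial}{\partial\theta_{j}}J(\theta_{0},\theta_{1}), \quad j \in \{0, 1\}$$

Here $\alpha$ is the learning rate and $T_{j}$ is a temporary value for the updated parameter. The parameter index is written $j$ to keep it distinct from the datapoint index $i$ in the sum:

$$T_{j} := \theta_{j} - \alpha\frac{\partial}{\partial\theta_{j}}\frac{1}{2m}\sum^{m}_{i=1}[H(x^{(i)})-y^{(i)}]^2$$

Taking the partial derivatives after substituting the simple linear regression hypothesis:

$$\frac{\partial}{\partial\theta_{j}}\frac{1}{2m}\sum^{m}_{i=1}[\theta_{0}+\theta_{1}x^{(i)}-y^{(i)}]^2$$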

For $\theta_{0}$, the chain rule brings down the exponent 2, which cancels the $\frac{1}{2}$, and the inner derivative is 1:

$$\frac{\partial}{\partial\theta_{0}}\frac{1}{2m}\sum^{m}_{i=1}[\theta_{0}+\theta_{1}x^{(i)}-y^{(i)}]^2 = \frac{1}{m}\sum^{m}_{i=1}[\theta_{0}+\theta_{1}x^{(i)}-y^{(i)}] = \frac{1}{m}\sum^{m}_{i=1}[H(x^{(i)})-y^{(i)}]$$

For $\theta_{1}$, the same steps apply, but the inner derivative is $x^{(i)}$:

$$\frac{\partial}{\partial\theta_{1}}\frac{1}{2m}\sum^{m}_{i=1}[\theta_{0}+\theta_{1}x^{(i)}-y^{(i)}]^2 = \frac{1}{m}\sum^{m}_{i=1}[\theta_{0}+\theta_{1}x^{(i)}-y^{(i)}]\,x^{(i)} = \frac{1}{m}\sum^{m}_{i=1}[H(x^{(i)})-y^{(i)}]\,x^{(i)}$$

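**Therefore:** both parameters are updated simultaneously. The temporaries $T_{0}$ and $T_{1}$ are computed from the current $\theta_{0}$ and $\theta_{1}$ before either parameter is overwritten:

$$T_{0} := \theta_{0} - \alpha\frac{1}{m}\sum^{m}_{i=1}[H(x^{(i)})-y^{(i)}] \longrightarrow \theta_{0} := T_{0}$$

$$T_{1} := \theta_{1} - \alpha\frac{1}{m}\sum^{m}_{i=1}[H(x^{(i)})-y^{(i)}]\,x^{(i)} \longrightarrow \theta_{1} := T_{1}$$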
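
To make the simultaneous update concrete, here is a minimal vectorized sketch of one gradient descent step, assuming the gradient of $J$ is the mean residual for $\theta_{0}$ and the mean of residual times $x^{(i)}$ for $\theta_{1}$. The name `gradient_step`, the toy data, the learning rate of 0.1, and the 2000 iterations are all assumptions for illustration.

```python
import numpy as np

def gradient_step(theta, x, y, alpha):
    """Apply one simultaneous update of theta_0 and theta_1."""
    m = len(x)
    # Residuals are computed once, from the current (old) parameters
    residuals = theta[0] + theta[1] * x - y
    # Both temporaries use the old theta; nothing is overwritten yet
    t0 = theta[0] - alpha * residuals.sum() / m
    t1 = theta[1] - alpha * (residuals * x).sum() / m
    # Returning a new array leaves the input theta untouched
    return np.array([t0, t1])

# Toy data lying exactly on y = 1 + 2x (illustrative only)
x_toy = np.array([0.0, 1.0, 2.0])
y_toy = np.array([1.0, 3.0, 5.0])

theta_fit = np.zeros(2)
for _ in range(2000):
    theta_fit = gradient_step(theta_fit, x_toy, y_toy, alpha=0.1)
print(theta_fit)  # approaches [1, 2] on this toy data
```

Computing both temporaries before assigning matters: if $\theta_{0}$ were overwritten first, the update for $\theta_{1}$ would use a mix of old and new parameters.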

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import pdb

In [2]:
os.chdir(r'C:\Users\andre.bravo\Documents\python\Python-Data-Science-and-Machine-Learning-Bootcamp\ML from Scratch\1. Simple Linear Regression\Datasets')

In [3]:
dataset = pd.read_csv('Housing_Data.csv')

In [4]:
# First column is the independent variable X
indepX = dataset.iloc[:,0].values
# Second column is the dependent variable Y
depY = dataset.iloc[:,1].values

## Train-Test Split

In [5]:
from sklearn.model_selection import train_test_split

In [6]:
indepX_train, indepX_test, depY_train, depY_test = train_test_split(indepX,
                                                                   depY,
                                                                   test_size=0.2,
                                                                   random_state=42)

## Main Function

In [7]:
def main():
    # Initial values
    init_theta = np.zeros(2)  # parameter array [theta_0, theta_1], initialised to (0, 0)
    learn_rate = 0.05
    num_iterat = 10  # may change later

    # Fit the parameters with gradient descent on the training data
    theta = gradientDescent(indepX_train,
                            depY_train,
                            init_theta,
                            learn_rate,
                            num_iterat)
    # Hypothesis vector: one prediction per training datapoint
    H = hyp(theta, indepX_train)

## Hypothesis Function

In [8]:
def hyp(theta, x):
    return theta[0] + theta[1] * x

## Gradient Descent Function

Two looping processes:
1. Loop over the entire dataset to calculate the gradient of the loss, then adjust the parameters.
2. Repeat the first loop for a fixed number of iterations, ideally enough for the parameters to converge.

These two operations are split across two functions: `grad` performs one pass over the data, and `gradientDescent` repeats that pass.

In [9]:
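def gradientDescent(indepX, depY, init_theta, learn_rate, num_iterat):
    # Start from a copy so the caller's init_theta is not modified in place
    theta = init_theta.copy()
    for i in range(num_iterat):
        # Each iteration is one full pass over the data followed by an update
        theta = grad(indepX, depY, theta, learn_rate)
    return theta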

In [10]:
def grad(indepX, depY, curr_theta, learning_rate):
    # Initialization ("gradient" avoids shadowing the function name "grad")
    gradient = np.zeros(2)
    m = len(indepX)
    
    # Loop over the dataset to accumulate the gradient
    #pdb.set_trace()
    for i in range(m):
        x = indepX[i]
        y = depY[i]
        # Residual H(x) - y for this datapoint
        error = (curr_theta[0] + curr_theta[1] * x) - y
        
        # Note that the summation is achieved through "+="
        gradient[0] += (1/m) * error
        gradient[1] += (1/m) * error * x
    
    # Simultaneous update: both temporaries are computed from the old theta
    temp0 = curr_theta[0] - (learning_rate * gradient[0])
    temp1 = curr_theta[1] - (learning_rate * gradient[1])
    
    # Return a new array instead of overwriting curr_theta in place
    return np.array([temp0, temp1])

In [20]:
def main():
    # Initial values
    init_theta = np.zeros(2)  # parameter array [theta_0, theta_1], initialised to (0, 0)
    learn_rate = 0.05
    num_iterat = 56  # may change later
    
    # Fit the parameters with gradient descent on the training data
    theta = gradientDescent(indepX_train, depY_train, init_theta, learn_rate, num_iterat)
    
    # Predict on the test set so that H[i] lines up with depY_test[i]
    H = hyp(theta, indepX_test)
    for i in range(len(depY_test)):
        print(float(H[i]))       # predicted value
        print(depY_test[i])      # actual value
        print('---------------------')

In [21]:
if __name__ == '__main__':
    main()

-inf
118525
---------------------
-inf
332981
---------------------
-inf
260935
---------------------
-inf
402892
---------------------
-inf
157694
---------------------
-inf
439039
---------------------
-inf
189318
---------------------
-inf
183504
---------------------
-inf
282501
---------------------
-inf
159783
---------------------
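
The `-inf` predictions suggest that gradient descent diverged rather than converged. A likely cause, stated here as an assumption because the contents of `Housing_Data.csv` are not shown, is that the feature is on a large scale, so a learning rate of 0.05 overshoots on every step. The sketch below reproduces that behaviour on synthetic data and shows one common remedy, standardizing the feature before fitting. The helper `run_gd`, the synthetic values, and the iteration counts are all hypothetical.

```python
import numpy as np

def run_gd(x, y, alpha, n_iter):
    """Batch gradient descent for H(x) = theta_0 + theta_1 * x."""
    theta = np.zeros(2)
    m = len(x)
    for _ in range(n_iter):
        residuals = theta[0] + theta[1] * x - y
        gradient = np.array([residuals.sum() / m, (residuals * x).sum() / m])
        theta = theta - alpha * gradient
    return theta

# Synthetic stand-in for a large-scale feature (NOT the housing dataset)
rng = np.random.default_rng(0)
x_raw = rng.uniform(500.0, 3500.0, size=50)
y_syn = 50000.0 + 100.0 * x_raw

# Unscaled feature: the parameters blow up, so overflow warnings are silenced
with np.errstate(over="ignore", invalid="ignore"):
    theta_raw = run_gd(x_raw, y_syn, alpha=0.05, n_iter=200)
print(np.isfinite(theta_raw).all())  # False: the run diverged

# Standardized feature: zero mean and unit variance
x_mean, x_std = x_raw.mean(), x_raw.std()
x_scaled = (x_raw - x_mean) / x_std
theta_scaled = run_gd(x_scaled, y_syn, alpha=0.05, n_iter=1000)

# Convert the fitted parameters back to the original feature scale
slope = theta_scaled[1] / x_std
intercept = theta_scaled[0] - slope * x_mean
print(intercept, slope)  # close to the 50000 and 100 used to build y_syn
```

If standardization is used, the same mean and standard deviation computed on the training set should be applied to the test feature before calling the hypothesis function. Lowering the learning rate by several orders of magnitude is another option, though convergence is then typically much slower.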


  
