# Implementing univariate linear regression

* Hypothesis:

  $$h(x) = \theta_{0} + \theta_{1} x$$

* Cost function:

  $$J(\theta_{0}, \theta_{1}) = \frac{1}{2m} \sum_{i=1}^{m} \left(h(x^{(i)}) - y^{(i)}\right)^{2}$$

* Gradient descent (both parameters are updated simultaneously, for $j \in \{0, 1\}$):

  $$\theta_{j} := \theta_{j} - \alpha \frac{\partial}{\partial \theta_{j}} J(\theta_{0}, \theta_{1})$$
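As a minimal sketch of the cost function above, the snippet below evaluates $J$ with NumPy on a tiny made-up dataset. The function name `cost` and the toy arrays are assumptions for illustration only; they are not part of the notebook's own code.

```python
import numpy as np

def cost(theta0, theta1, x, y):
    """Half the mean squared error J(theta0, theta1) over m examples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = theta0 + theta1 * x - y  # h(x_i) - y_i for every example
    return float(np.sum(residuals ** 2) / (2 * len(x)))

# Toy data (made up for illustration): points lying exactly on y = 2x
x_toy = np.array([1.0, 2.0, 3.0])
y_toy = np.array([2.0, 4.0, 6.0])

perfect = cost(0.0, 2.0, x_toy, y_toy)   # the line y = 2x fits exactly
all_zero = cost(0.0, 0.0, x_toy, y_toy)  # (4 + 16 + 36) / (2 * 3)
print(perfect, all_zero)
```

The cost is zero only when the line passes through every point, and it grows quadratically with the residuals otherwise.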

In [1]:
import numpy as np

In [2]:
# hypothesis function: h(x) = i + j * x, with i the intercept and j the slope
def create_model(i: float,
                 j: float,
                 x: float) -> float:
    return i + j * x

In [3]:
print(create_model(0, 2, 3))

6


In [4]:
# split data into training and test sets
def split_data(x: np.ndarray,
               y: np.ndarray,
               train_size: int,
               random_seed: int = 0):
    # note: random_seed is currently unused; the split is a plain head/tail cut with no shuffling
    
    assert len(x) == len(y), 'independent and dependent lists should contain the same number of elements.'
    assert 0 < train_size <= 100, 'train size should be in the range (0, 100]'
    
    div = int((len(x) * train_size) / 100)  # index of the first test row, from the train_size percentage
    
    x_train = np.array(x[:div])
    y_train = np.array(y[:div])
    x_test  = np.array(x[div:])
    y_test  = np.array(y[div:])
    
    return x_train, y_train, x_test, y_test
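The `random_seed` parameter of `split_data` is accepted but never used. One way it could be put to work is to shuffle the rows before cutting, sketched below. The name `split_data_shuffled` and the demo arrays are hypothetical, and this is an assumption about the intent of the parameter rather than the notebook's method.

```python
import numpy as np

def split_data_shuffled(x, y, train_size, random_seed=0):
    """Hypothetical variant of split_data that shuffles rows before cutting."""
    x = np.asarray(x)
    y = np.asarray(y)
    assert len(x) == len(y), 'x and y should contain the same number of elements.'
    assert 0 < train_size <= 100, 'train size should be in the range (0, 100]'

    rng = np.random.default_rng(random_seed)  # seeded generator, so the split is reproducible
    order = rng.permutation(len(x))           # one shuffled index order shared by x and y
    div = int(len(x) * train_size / 100)      # number of training rows

    train_idx, test_idx = order[:div], order[div:]
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]

# Demo data (made up): y is always 2 * x, so pairing is easy to check
x_demo = np.arange(10, dtype=float)
y_demo = 2 * x_demo
x_tr, y_tr, x_te, y_te = split_data_shuffled(x_demo, y_demo, 80, random_seed=0)
print(len(x_tr), len(x_te))
```

Indexing `x` and `y` with the same permutation keeps each input paired with its own target, and reusing the same seed reproduces the same split.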

# Brainstorming how to write a derivative function in Python

* Power rule:

  $$\frac{d}{dx} [x^n] = nx^{n-1}$$

* Product rule:

  $$\frac{d}{dx} [f(x) \times g(x)] = f(x) \frac{d}{dx} g(x) + g(x) \frac{d}{dx} f(x)$$

* Quotient rule:

  $$\frac{d}{dx} \left[\frac{f(x)}{g(x)}\right] = \frac{g(x) \frac{d}{dx} f(x) - f(x) \frac{d}{dx} g(x)}{g(x)^2}$$

* Chain rule:

  $$\frac{d}{dx} [f(g(x))] = f'(g(x)) \times \frac{d}{dx} g(x)$$

Gradient descent update rule:

$\theta_{j} := \theta_{j} - \alpha \frac{\partial}{\partial \theta_{j}} J(\theta_{0}, \theta_{1})$
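Applying the chain rule and the power rule to the cost function gives the two partial derivatives that the update rule needs. This is the standard derivation, written with the symbols already used here; the superscript $(i)$ marks the $i$-th training example, and the factor 2 from the power rule cancels the $\frac{1}{2}$ in $J$.

$$\frac{\partial}{\partial \theta_{0}} J(\theta_{0}, \theta_{1}) = \frac{1}{2m} \sum_{i=1}^{m} 2 \left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right) \times 1 = \frac{1}{m} \sum_{i=1}^{m} \left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right)$$

$$\frac{\partial}{\partial \theta_{1}} J(\theta_{0}, \theta_{1}) = \frac{1}{2m} \sum_{i=1}^{m} 2 \left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right) \times x^{(i)} = \frac{1}{m} \sum_{i=1}^{m} \left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right) x^{(i)}$$

These are the quantities that `squared_error_0` and `squared_error_1` return.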

Numerical derivative (central-difference form, as defined in the Deep Learning from Scratch book), for a small $\delta$:

$\frac{df}{du}(a) \approx \frac{f(a + \delta) - f(a - \delta)}{2 \times \delta}$
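The central-difference formula translates almost directly into Python. The sketch below uses it to check an analytic gradient on a tiny made-up dataset; `numerical_derivative`, `cost_in_theta1`, and the toy arrays are illustrative names and values, not part of the notebook.

```python
import numpy as np

def numerical_derivative(f, a, delta=1e-5):
    """Central-difference estimate of df/du at the point a."""
    return (f(a + delta) - f(a - delta)) / (2 * delta)

# Toy data (made up for illustration): points lying exactly on y = 2x
x_toy = np.array([1.0, 2.0, 3.0])
y_toy = np.array([2.0, 4.0, 6.0])

def cost_in_theta1(theta1):
    # J(0, theta1) on the toy data, with theta0 held fixed at 0
    return np.sum((theta1 * x_toy - y_toy) ** 2) / (2 * len(x_toy))

# Numerical slope of the cost at theta1 = 1
numeric = numerical_derivative(cost_in_theta1, 1.0)

# Analytic slope at the same point: the mean of residual * x
analytic = np.mean((1.0 * x_toy - y_toy) * x_toy)

print(numeric, analytic)
```

When the two values agree closely, the hand-derived gradient is very likely correct. This kind of check is useful before running gradient descent on real data.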

In [69]:
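def squared_error_0(size: int,
                    theta0: float,
                    theta1: float,
                    x: np.ndarray,
                    y: np.ndarray):
    # despite the name, this returns the partial derivative of J with respect to theta0:
    # the mean residual, (1/size) * sum(h(x_i) - y_i)
    residual_sum = 0
    for i in range(size):
        residual_sum += (theta0 + theta1 * x[i] - y[i])
        
    return residual_sum / size

def squared_error_1(size: int,
                    theta0: float,
                    theta1: float,
                    x: np.ndarray,
                    y: np.ndarray):
    # despite the name, this returns the partial derivative of J with respect to theta1:
    # the mean of residual * x, (1/size) * sum((h(x_i) - y_i) * x_i)
    residual_sum = 0
    for i in range(size):
        residual_sum += (theta0 + theta1 * x[i] - y[i]) * x[i]
        
    return residual_sum / size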

In [70]:
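def gradient_descent(theta0: float,
                     theta1: float,
                     episode: int,
                     alpha: float = 0.2):
    # reads x_train and y_train from the notebook's global scope
    m = len(x_train)
    
    for i in range(episode):
        # squared_error_0/1 already average over m, so the extra 1/m factor here
        # makes the effective learning rate alpha / m rather than alpha
        temp0 = theta0 - alpha * (1/m) * squared_error_0(m, theta0, theta1, x_train, y_train)
        temp1 = theta1 - alpha * (1/m) * squared_error_1(m, theta0, theta1, x_train, y_train)
        # simultaneous update: both temporaries were computed from the old theta values
        theta0 = temp0
        theta1 = temp1
        
    return theta0, theta1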
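For comparison, here is a minimal vectorized sketch of the same update rule that takes the data as arguments instead of reading globals, and divides by $m$ only once. The name `gradient_descent_vectorized`, the default `steps` and `alpha`, and the toy data are assumptions chosen for illustration. A step size that converges on this toy data is not guaranteed to work on unscaled inputs such as heights in inches.

```python
import numpy as np

def gradient_descent_vectorized(x, y, theta0=0.0, theta1=0.0, steps=5000, alpha=0.05):
    """Batch gradient descent for h(x) = theta0 + theta1 * x (illustrative sketch)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()

    for _ in range(steps):
        residuals = theta0 + theta1 * x - y  # h(x_i) - y_i for every example
        grad0 = residuals.mean()             # dJ/dtheta0
        grad1 = (residuals * x).mean()       # dJ/dtheta1
        # simultaneous update: both gradients were computed from the old parameters
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

    return theta0, theta1

# Toy data (made up for illustration): points lying exactly on y = 1 + 2x
x_toy = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_toy = 1.0 + 2.0 * x_toy

fit0, fit1 = gradient_descent_vectorized(x_toy, y_toy)
print(fit0, fit1)  # approaches intercept 1 and slope 2
```

Working on whole arrays removes the Python loop over examples, which matters on a training set of this notebook's size.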

In [71]:
import pandas as pd

In [72]:
imported_data = pd.read_csv('SOCR-HeightWeight.csv')
imported_data

| | Index | Height(Inches) | Weight(Pounds) |
|---|---|---|---|
| 0 | 1 | 65.78331 | 112.9925 |
| 1 | 2 | 71.51521 | 136.4873 |
| 2 | 3 | 69.39874 | 153.0269 |
| 3 | 4 | 68.21660 | 142.3354 |
| 4 | 5 | 67.78781 | 144.2971 |
| ... | ... | ... | ... |
| 24995 | 24996 | 69.50215 | 118.0312 |
| 24996 | 24997 | 64.54826 | 120.1932 |
| 24997 | 24998 | 64.69855 | 118.2655 |
| 24998 | 24999 | 67.52918 | 132.2682 |


In [73]:
dependent_var = ['Weight(Pounds)']
independent_var = ['Height(Inches)']

In [74]:
x_train, y_train, x_test, y_test = split_data(imported_data[independent_var], imported_data[dependent_var], 80)

In [75]:
print(len(x_train))
print(len(x_test))

20000
5000


In [76]:
theta0, theta1 = gradient_descent(0, 0, 300)
print(theta0)
print(theta1)

[0.02729169]
[1.8699685]


In [77]:
hyp = create_model(theta0, theta1, x_test[1000])
print(f'input height in inches: {x_test[1000]}, \npredicted weight in pounds : {hyp}, \nactual weight in pounds: {y_test[1000]}')

input height in inches: [68.84104], 
predicted weight in pounds : [128.75786815], 
actual weight in pounds: [122.8631]


In [78]:
def test_model(size: int,
               theta0: float,
               theta1: float,
               x: np.ndarray,
               y: np.ndarray):
    # evaluates the cost J (half the mean squared error) on the given data
    squared_error = 0
    for i in range(size):
        squared_error += pow((create_model(theta0, theta1, x[i]) - y[i]), 2)
        
    return float(squared_error/(2 * size))

In [79]:
err = test_model(len(x_test), theta0, theta1, x_test, y_test)
print(f'over {len(x_test)} data points, the cost J (half the mean squared error) is {err}')

over 5000 data points, the cost J (half the mean squared error) is 53.176873930569506
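One way to judge whether the number of iterations was enough is to compare the parameters from gradient descent with an ordinary least-squares fit. The sketch below does this with `np.polyfit` on a tiny made-up dataset; the toy arrays and variable names are assumptions, and no results for the height and weight data are claimed here.

```python
import numpy as np

# Toy data (made up for illustration): points lying exactly on y = 1 + 2x
x_toy = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_toy = 1.0 + 2.0 * x_toy

# For degree 1, np.polyfit returns the coefficients highest power first:
# [slope, intercept]
slope, intercept = np.polyfit(x_toy, y_toy, 1)

# Cost J (half the mean squared error) of the least-squares line on the same data
residuals = intercept + slope * x_toy - y_toy
least_squares_cost = float(np.sum(residuals ** 2) / (2 * len(x_toy)))

print(intercept, slope, least_squares_cost)
```

On the notebook's data, the same call would take the flattened training arrays, for example `np.polyfit(x_train.ravel(), y_train.ravel(), 1)`. If gradient descent has converged, its `theta0` and `theta1` should be close to the intercept and slope from this fit.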
