<b>Linear Regression (one-variable):</b>

  Y = wX + b

  Y --> Dependent Variable<br>
  X --> Independent Variable<br>
  w --> weight<br>
  b --> bias<br><br>


<b>Gradient Descent:</b>

  Gradient descent searches for the values of $w$ and $b$ that minimize the cost function.<br><br>
  The equation for cost with one variable is:<br>
    $$J(w,b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \tag{1}$$ 
  
  where 
    $$f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{2}$$

    
  - $f_{w,b}(x^{(i)})$ is our prediction for example $i$ using parameters $w,b$.  
  - $(f_{w,b}(x^{(i)}) -y^{(i)})^2$ is the squared difference between the target value and the prediction.   
  - These differences are summed over all the $m$ examples and divided by `2m` to produce the cost, $J(w,b)$.  
  - `m` can also be used instead of `2m`. Either choice works because scaling the cost by a constant does not move the location of its minimum; the factor of 2 is kept only because it cancels when the squared term is differentiated.
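
  As a rough illustration of equations (1) and (2), the cost can be computed with NumPy as sketched below. The function name `compute_cost` and the arrays `x_demo`, `y_demo` are made up for this example; this is not necessarily how the `Linear_Regression` class in `model_functions` implements it.

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Illustrative cost J(w, b) from equation (1), using the 1/(2m) factor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.shape[0]
    predictions = w * x + b                  # equation (2)
    return np.sum((predictions - y) ** 2) / (2 * m)

# Toy data (assumed for illustration): y = 2x exactly
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([2.0, 4.0, 6.0])

print(compute_cost(x_demo, y_demo, w=2.0, b=0.0))  # perfect fit, so the cost is zero
print(compute_cost(x_demo, y_demo, w=1.0, b=0.0))  # worse fit, so the cost is larger
```

  With `w=1`, the errors are -1, -2 and -3, so the cost is (1 + 4 + 9) / (2 * 3).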

  
  *Gradient descent* is described as:

  $$\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline
  \;  w &= w -  \alpha \frac{\partial J(w,b)}{\partial w} \tag{3}  \; \newline 
  b &= b -  \alpha \frac{\partial J(w,b)}{\partial b}  \newline \rbrace
  \end{align*}$$
  where the parameters $w$ and $b$ are updated simultaneously.  
  The gradient is defined as:
  $$
  \begin{align}
  \frac{\partial J(w,b)}{\partial w}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})x^{(i)} \tag{4}\\
    \frac{\partial J(w,b)}{\partial b}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)}) \tag{5}\\
  \end{align}
  $$

  Here *simultaneously* means that you calculate the partial derivatives for all the parameters before updating any of the parameters.<br><br>
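
  Equations (3) to (5) can be sketched in code as follows. The names `compute_gradient` and `gradient_descent` and the toy data are assumptions for this example and may differ from the methods of the `Linear_Regression` class used later. Both partial derivatives are computed from the current `w` and `b` before either parameter is changed, which is what makes the update simultaneous.

```python
import numpy as np

def compute_gradient(x, y, w, b):
    """Illustrative partial derivatives of J(w, b), equations (4) and (5)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.shape[0]
    error = (w * x + b) - y
    dj_dw = np.sum(error * x) / m            # equation (4)
    dj_db = np.sum(error) / m                # equation (5)
    return dj_dw, dj_db

def gradient_descent(x, y, w, b, alpha, iterations):
    """Run a fixed number of update steps of equation (3)."""
    for _ in range(iterations):
        # Both gradients use the old w and b ...
        dj_dw, dj_db = compute_gradient(x, y, w, b)
        # ... and only then are the parameters updated.
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b

# Toy data (assumed for illustration): y = 2x + 1 exactly
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0

w_fit, b_fit = gradient_descent(x_demo, y_demo, w=0.0, b=0.0,
                                alpha=0.05, iterations=5000)
print(w_fit, b_fit)  # should end up close to 2 and 1
```

  A fixed iteration count stands in for "repeat until convergence" here; a stopping test on the change in cost would be another option.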

<b> Learning Rate:</b><br><br>
  <b>$\alpha$</b> controls how large a step $w$ and $b$ take on each update. A value that is too large can cause gradient descent to overshoot and never converge, while a value that is too small makes convergence slow. Start with a small value, on the order of 0.001, and adjust it based on how well the cost decreases during training.
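
  The effect of the learning rate can be seen by recording the cost at every step for two different values of $\alpha$. This is a small sketch on made-up data; the helper `cost_history` and the specific values 0.05 and 0.5 are chosen only for this toy problem and say nothing about good values for the salary data.

```python
import numpy as np

def cost_history(x, y, alpha, iterations, w=0.0, b=0.0):
    """Return the cost J(w, b) recorded before each gradient descent step."""
    m = x.shape[0]
    history = []
    for _ in range(iterations):
        error = (w * x + b) - y
        history.append(np.sum(error ** 2) / (2 * m))
        dj_dw = np.sum(error * x) / m
        dj_db = np.sum(error) / m
        # Simultaneous update using the gradients computed above
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return history

# Toy data (assumed for illustration): y = 2x + 1 exactly
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0

small = cost_history(x_demo, y_demo, alpha=0.05, iterations=20)
large = cost_history(x_demo, y_demo, alpha=0.5, iterations=20)

print("alpha=0.05:", small[0], "->", small[-1])  # cost shrinks
print("alpha=0.5: ", large[0], "->", large[-1])  # cost grows: the steps overshoot
```

  Watching the cost over iterations like this is one simple way to decide whether $\alpha$ needs to be lowered.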


In [1]:
import numpy as np
import math, copy

<b>Linear Regression</b>

In [2]:
from model_functions import Linear_Regression

<b>Plotting</b>

In [3]:
import matplotlib.pyplot as plt

<b>Training the Model</b>

In [4]:
import pandas as pd



In [5]:
# Loading Data
salary_data = pd.read_csv('./sample data/salary_data.csv')
salary_data.head()
X_train = salary_data.YearsExperience
Y_train = salary_data.Salary
X_train.shape

(30,)

In [9]:
alpha = 0.01
iterations = 1000
w = 1
b = 0

new_model = Linear_Regression(alpha, iterations)
new_model.fit(X_train, Y_train, w, b)
new_model.compute_cost(w, b)
new_model.compute_gradient(w, b)
new_model.gradient_descent(w, b)

(np.float64(9456.050491915646), np.float64(25751.17363383391))