# Deep Learning Series

## Linear Regression using Gradient Descent


### Gradient Descent Overview

<img src="../Assets/gd.gif">

For this lesson, our data set is a `collection of test scores and the number of hours studied`.

X values are the number of hours studied.
Y values are the test scores.

In [25]:
import numpy as np

points = np.genfromtxt("../Datasets/udacity_student_data.csv", delimiter=",")


In [53]:
# Hyperparameters

# Choosing the learning rate is a balance between a value that is too high and one that is too low.
# Think of it like a bell curve: the best value lies somewhere in the middle.
# If the value is too low, our model will take too long to converge.
# If the value is too high, the updates overshoot and the model may never converge.
# We have to guess and check for a good value.
learning_rate = 0.0001

# y = mx + b
# m => Slope
# b => y-intercept
initial_b = 0
initial_m = 0

# Number of gradient descent steps to run. It depends on how large the data set is:
# as a rule of thumb, the bigger the data set, the higher the number of iterations.
num_iterations = 1000
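
To make the learning-rate trade-off concrete, here is a small sketch on synthetic data. The data, the helper `run_gd`, and the three rates tried are all assumptions made for illustration and are not part of the lesson's data set. It runs the same update rule with a very small, a moderate, and an overly large learning rate and prints the error after 50 steps.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2x + 1 plus small noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=100)


def run_gd(x, y, learning_rate, num_iterations):
    """Hypothetical helper: gradient descent on y = m*x + b, returns the final mean squared error."""
    b, m = 0.0, 0.0
    n = float(len(x))
    for _ in range(num_iterations):
        residual = y - (m * x + b)
        b_gradient = -(2.0 / n) * residual.sum()
        m_gradient = -(2.0 / n) * (x * residual).sum()
        b -= learning_rate * b_gradient
        m -= learning_rate * m_gradient
    return float(np.mean((y - (m * x + b)) ** 2))


# Error of the starting line (b = 0, m = 0), for reference.
initial_error = float(np.mean(y ** 2))

errors = {lr: run_gd(x, y, lr, 50) for lr in (1e-6, 1e-2, 1e-1)}
print(f"initial error = {initial_error:.4g}")
for lr, err in errors.items():
    print(f"learning_rate={lr:g}: error after 50 steps = {err:.4g}")
```

On this synthetic data the smallest rate should barely reduce the error, the middle rate should reduce it substantially, and the largest rate should make the error grow instead of shrink. The thresholds depend on the scale of the x values, so they will differ for other data sets.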

#### Sum of Squared Errors
<img src="../Assets/sose.JPG"/>

#### Formula

<img src="../Assets/gd_formula.png"/>

#### Gradient Descent Partial Derivative Formula

<img src="../Assets/gd_partialderivative.png" />
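
For reference, the quantities computed by the code in this lesson can be written out as follows. This is a transcription of what the code does, assuming the images above show the same formulas; $N$ is the number of points, and the symbol $\alpha$ for the learning rate is introduced here only for illustration.

$$
E(m, b) = \frac{1}{N} \sum_{i=1}^{N} \bigl(y_i - (m x_i + b)\bigr)^2
$$

$$
\frac{\partial E}{\partial m} = -\frac{2}{N} \sum_{i=1}^{N} x_i \bigl(y_i - (m x_i + b)\bigr),
\qquad
\frac{\partial E}{\partial b} = -\frac{2}{N} \sum_{i=1}^{N} \bigl(y_i - (m x_i + b)\bigr)
$$

$$
m \leftarrow m - \alpha \, \frac{\partial E}{\partial m},
\qquad
b \leftarrow b - \alpha \, \frac{\partial E}{\partial b}
$$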

In [27]:
# At every time-step our goal is to improve the model's predictions (accuracy).
# For each time-step (iteration) we compute the error and try to reduce it in the next time-step.
# We measure the vertical distance from each data point to the line, square each distance, sum them
# all together, and divide by the total number of points. This is the error value, and our goal is to minimize it.
# "SUM OF SQUARED ERRORS" (divided by the number of points, i.e. the mean squared error)
def compute_error_for_given_points(b, m, points):
    total_error = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        total_error += (y - (m * x + b)) **2
    return total_error / float(len(points))
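
The loop above can also be written with NumPy array operations. This is a minimal sketch, assuming `points` is a two-column array with x in column 0 and y in column 1; the name `compute_error_vectorized` and the toy data are hypothetical and introduced only for illustration.

```python
import numpy as np


def compute_error_vectorized(b, m, points):
    """Mean squared error of the line y = m*x + b over an (N, 2) array of (x, y) rows."""
    x = points[:, 0]
    y = points[:, 1]
    return float(np.mean((y - (m * x + b)) ** 2))


# Tiny made-up data set (an assumption for illustration): three points on y = 2x + 1.
toy_points = np.array([[1.0, 3.0], [2.0, 5.0], [3.0, 7.0]])

# The line that passes through every point has zero error.
perfect_fit_error = compute_error_vectorized(1.0, 2.0, toy_points)

# The starting line b = 0, m = 0 has error (3^2 + 5^2 + 7^2) / 3.
zero_line_error = compute_error_vectorized(0.0, 0.0, toy_points)

print(perfect_fit_error)
print(zero_line_error)
```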


In [37]:
def step_gradient(b_current, m_current, points, learning_rate):
    # Gradient descent logic
    # Gradient => slope => it gives us the direction in which to move b and m towards their optimal values.
    # To calculate the gradient we take the partial derivatives of the error with respect to b and m.
    b_gradient = 0
    m_gradient = 0
    N = float(len(points))
    
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        # As per the partial derivative formula
        b_gradient += (y - ((m_current * x) + b_current))
        m_gradient += (x * (y - ((m_current * x) + b_current)))
    b_gradient = -(2/N) * b_gradient
    m_gradient = -(2/N) * m_gradient
    new_b = b_current - (learning_rate * b_gradient)
    new_m = m_current - (learning_rate * m_gradient)
    return [new_b, new_m]
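
The same update can be sketched without the explicit Python loop by letting NumPy sum over all points at once. The function name `step_gradient_vectorized` and the toy data are hypothetical; the arithmetic is meant to mirror `step_gradient` above, under the assumption that `points` is an (N, 2) array.

```python
import numpy as np


def step_gradient_vectorized(b_current, m_current, points, learning_rate):
    """One gradient descent step for y = m*x + b on an (N, 2) array of (x, y) rows."""
    x = points[:, 0]
    y = points[:, 1]
    n = float(len(points))

    # Difference between the actual y and the line's prediction for every point.
    residual = y - (m_current * x + b_current)

    # Partial derivatives of the mean squared error with respect to b and m.
    b_gradient = -(2.0 / n) * residual.sum()
    m_gradient = -(2.0 / n) * (x * residual).sum()

    # Move against the gradient, scaled by the learning rate.
    new_b = b_current - learning_rate * b_gradient
    new_m = m_current - learning_rate * m_gradient
    return new_b, new_m


# Tiny made-up data set (an assumption for illustration): three points on y = 2x + 1.
toy_points = np.array([[1.0, 3.0], [2.0, 5.0], [3.0, 7.0]])

# One step from b = 0, m = 0 moves both parameters towards the true line.
new_b, new_m = step_gradient_vectorized(0.0, 0.0, toy_points, 0.01)
print(new_b, new_m)

# Starting at the exact solution, the gradient is zero and nothing changes.
same_b, same_m = step_gradient_vectorized(1.0, 2.0, toy_points, 0.01)
print(same_b, same_m)
```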
        

In [29]:
def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    
    for i in range(num_iterations):
        b, m = step_gradient(b, m, np.array(points), learning_rate)
    return [b, m]
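
One way to check that the runner is actually reducing the error is to record the error after every step. The sketch below is a self-contained variant for illustration: the names `mean_squared_error`, `gradient_step`, and `gradient_descent_with_history`, the toy data, and the learning rate of 0.01 are assumptions, not part of the lesson's code.

```python
import numpy as np


def mean_squared_error(b, m, points):
    """Mean squared error of y = m*x + b over an (N, 2) array of (x, y) rows."""
    x, y = points[:, 0], points[:, 1]
    return float(np.mean((y - (m * x + b)) ** 2))


def gradient_step(b, m, points, learning_rate):
    """One gradient descent update of b and m."""
    x, y = points[:, 0], points[:, 1]
    n = float(len(points))
    residual = y - (m * x + b)
    b_gradient = -(2.0 / n) * residual.sum()
    m_gradient = -(2.0 / n) * (x * residual).sum()
    return b - learning_rate * b_gradient, m - learning_rate * m_gradient


def gradient_descent_with_history(points, starting_b, starting_m, learning_rate, num_iterations):
    """Run gradient descent and keep the error before the first step and after each step."""
    b, m = starting_b, starting_m
    history = [mean_squared_error(b, m, points)]
    for _ in range(num_iterations):
        b, m = gradient_step(b, m, points, learning_rate)
        history.append(mean_squared_error(b, m, points))
    return b, m, history


# Made-up noise-free data (an assumption for illustration): y = 3x + 2 for x = 1..10.
x_values = np.arange(1.0, 11.0)
toy_points = np.column_stack([x_values, 3.0 * x_values + 2.0])

b, m, history = gradient_descent_with_history(toy_points, 0.0, 0.0, 0.01, 100)
print(f"error before training: {history[0]:.4g}")
print(f"error after training:  {history[-1]:.4g}")
```

If the recorded error grows from step to step instead of shrinking, that usually points to a learning rate that is too high for the scale of the data.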

In [54]:
[b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
print(b)
print(m)
# 0.0889365199374
# 1.47774408519

0.0889365199374
1.47774408519
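
To judge whether a given number of iterations was enough, one option is to compare the gradient descent result with the closed-form least-squares fit from `np.polyfit`. The sketch below does this on synthetic data only; the data, the helper `fit_gd`, and the second set of hyperparameters are assumptions for illustration and say nothing definite about the lesson's data set.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 1.5x + 4 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = 1.5 * x + 4.0 + rng.normal(0.0, 1.0, size=200)

# Closed-form least-squares line; for degree 1, polyfit returns [slope, intercept].
slope_ls, intercept_ls = np.polyfit(x, y, 1)


def fit_gd(x, y, learning_rate, num_iterations):
    """Hypothetical helper: gradient descent on y = m*x + b starting from b = 0, m = 0."""
    b, m = 0.0, 0.0
    n = float(len(x))
    for _ in range(num_iterations):
        residual = y - (m * x + b)
        b -= learning_rate * (-(2.0 / n) * residual.sum())
        m -= learning_rate * (-(2.0 / n) * (x * residual).sum())
    return b, m


# Same hyperparameters as in this lesson.
b_short, m_short = fit_gd(x, y, 0.0001, 1000)

# A larger learning rate and more iterations (values chosen for this synthetic data).
b_long, m_long = fit_gd(x, y, 0.01, 20000)

print(f"least squares:      b={intercept_ls:.4f}, m={slope_ls:.4f}")
print(f"1000 steps, 1e-4:   b={b_short:.4f}, m={m_short:.4f}")
print(f"20000 steps, 1e-2:  b={b_long:.4f}, m={m_long:.4f}")
```

On this synthetic data the intercept moves much more slowly than the slope, so the short run should still sit far from the least-squares intercept while the long run should match it closely. A similar comparison on the real data would show whether `b` has converged.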
