# Foundations of AI & ML
## Session 05
### Experiment 1 - Part 3
## "Sequential" Gradient Descent

**Objectives:** In this experiment we use single-sample (sequential) gradient descent and observe how training changes when the parameters are updated after every individual data point instead of once per batch.

**Expected Time:** This experiment should take around 15 minutes.

In [None]:
import pandas as pd
import numpy as np
import scipy.stats as stat
%matplotlib notebook
import matplotlib.pyplot as plt
from sklearn.utils import shuffle
import time

### Read the data:

In [None]:
data = pd.read_csv("../Datasets/regr01.txt", sep=" ", header=None, names=['l', 't'])
#print(data.head())
#print(data.tail()) 
#data = shuffle(data)
#data

In [None]:
l = data['l'].values
t = data['t'].values
tsq = t * t

In [None]:
l.shape

### Single sample - Sequential

Now we will calculate the essential parts of the Gradient Descent method,
updating the parameters after each individual data point.

$y = mx + c$

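$E = (y_i - (mx_i + c))^2$

$\frac{1}{2}\frac{\partial E}{\partial m} = -(y_i - (mx_i + c)) \, x_i$

$\frac{1}{2}\frac{\partial E}{\partial c} = -(y_i - (mx_i + c))$

The factor of $\frac{1}{2}$ only rescales the learning rate $\eta$, so the code below uses these expressions directly as the update terms.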
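As a sanity check on the derivatives above, the following minimal sketch compares the exact gradient of the single-sample squared error with a central finite-difference estimate. The function names and the sample values are hypothetical and chosen only for illustration. The exact gradient is twice the update term used in this notebook, which affects only the effective step size.

```python
def squared_error(m, c, x_i, y_i):
    # E = (y_i - (m * x_i + c))^2 for a single sample
    return (y_i - (m * x_i + c)) ** 2

def analytic_gradient(m, c, x_i, y_i):
    # Exact partial derivatives of E with respect to m and c
    residual = y_i - (m * x_i + c)
    return -2.0 * residual * x_i, -2.0 * residual

def numeric_gradient(m, c, x_i, y_i, h=1e-6):
    # Central finite differences in m and in c
    dm = (squared_error(m + h, c, x_i, y_i) - squared_error(m - h, c, x_i, y_i)) / (2 * h)
    dc = (squared_error(m, c + h, x_i, y_i) - squared_error(m, c - h, x_i, y_i)) / (2 * h)
    return dm, dc

# Hypothetical parameter values and sample, for illustration only
m0, c0, x0, y0 = 0.5, 0.1, 2.0, 3.0
grad_analytic = analytic_gradient(m0, c0, x0, y0)
grad_numeric = numeric_gradient(m0, c0, x0, y0)
print("analytic:", grad_analytic)
print("numeric: ", grad_numeric)
```

If the two printed pairs agree to several decimal places, the analytic expressions are consistent with the definition of $E$.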

In [None]:
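def train(x, y, m, c, eta):
    # One gradient step on a single sample (x, y)
    ycalc = m * x + c
    error = (y - ycalc) ** 2  # squared error measured before the update
    delta_m = -(y - ycalc) * x
    delta_c = -(y - ycalc)
    m = m - delta_m * eta
    c = c - delta_c * eta
    return m, c, error

def train_per_sample(x, y, m, c, eta):
    # One pass over the data, updating m and c after every sample
    for x_sample, y_sample in zip(x, y):
        m, c, e = train(x_sample, y_sample, m, c, eta)
        #print(m,c,e)
    return m, c, e  # e is the squared error of the last sample only

def train_sequential(x, y, m, c, eta, iterations=1000):
    # Repeat the full pass over the data `iterations` times
    for iteration in range(iterations):
        m, c, err = train_per_sample(x, y, m, c, eta)
    return m, c, err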
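To see the sequential update in isolation, here is a minimal self-contained sketch that applies the same per-sample rule to a small synthetic dataset. The data (a noise-free line $y = 2x + 1$), the learning rate, and the names `x_toy`, `y_toy`, `m_fit`, `c_fit` are assumptions made for this illustration and are not part of the experiment's dataset.

```python
import numpy as np

def train(x, y, m, c, eta):
    # Single-sample update, same rule as in the cell above
    ycalc = m * x + c
    error = (y - ycalc) ** 2
    m = m + eta * (y - ycalc) * x
    c = c + eta * (y - ycalc)
    return m, c, error

def train_sequential(x, y, m, c, eta, iterations=1000):
    # Each iteration is one full pass with an update after every sample
    for _ in range(iterations):
        for x_sample, y_sample in zip(x, y):
            m, c, error = train(x_sample, y_sample, m, c, eta)
    return m, c, error

# Hypothetical noise-free data on the line y = 2x + 1
x_toy = np.linspace(0.0, 1.0, 20)
y_toy = 2.0 * x_toy + 1.0

m_fit, c_fit, last_error = train_sequential(x_toy, y_toy, 0.0, 0.0, eta=0.1, iterations=500)
print("m = {0:.6} c = {1:.6} last-sample error = {2:.6}".format(m_fit, c_fit, last_error))
```

Because the toy data lie exactly on a line, the fitted slope and intercept should approach 2 and 1. On noisy data the parameters keep fluctuating slightly from sample to sample.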

Let us visualize the training in this case:

### $\eta$ = 0.001

In [None]:
# Initializing m and c to 0
m, c = 0, 0

In [None]:
# Fixing learning rate
lr = 0.001

In [None]:
# Training for 2000 iterations in total (10 rounds of 200), plotting after every 200 iterations:
fig = plt.figure(figsize=(5, 5))
ax = fig.add_subplot(111)
plt.ion()
fig.show()
fig.canvas.draw()

for num in range(10):
    m, c, error = train_sequential(l, tsq, m, c, lr, iterations=200)
    print("m = {0:.6} c = {1:.6} Error = {2:.6}".format(m, c, error))
    y = m * l + c
    ax.clear()
    ax.plot(l, tsq, '.k')
    ax.plot(l, y)
    fig.canvas.draw()
    time.sleep(1)

**Exercise: Experiment with more iterations**

## Plotting error vs iterations

In [None]:
ms, cs,errs = [], [], []
m, c = 0, 0
lr = 0.001
for times in range(100):
    m, c, error = train_sequential(l, tsq, m, c, lr, iterations=100) # Record m, c and the error after every 100 iterations
    ms.append(m)
    cs.append(c)
    errs.append(error)
epochs = range(100, 10001, 100)  # iteration counts at which each error was recorded
plt.figure(figsize=(8, 5))
plt.plot(epochs, errs)
plt.xlabel("Iterations")
plt.ylabel("Error")
plt.title("Sequential Gradient Descent")
plt.show()
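Note that the error recorded above is the squared error of the last sample seen in the last pass, so the curve can be noisy. One possible alternative, sketched below under stated assumptions, is to evaluate the mean squared error over all samples after each pass. The synthetic data, the seed, the learning rate, and the helper names `sequential_epoch` and `mean_squared_error` are hypothetical and stand in for the experiment's dataset.

```python
import numpy as np

def sequential_epoch(x, y, m, c, eta):
    # One pass over the data, updating m and c after every sample
    for x_sample, y_sample in zip(x, y):
        residual = y_sample - (m * x_sample + c)
        m = m + eta * residual * x_sample
        c = c + eta * residual
    return m, c

def mean_squared_error(x, y, m, c):
    # Average squared error over the whole dataset for fixed m and c
    return float(np.mean((y - (m * x + c)) ** 2))

# Hypothetical noisy data around y = 4x + 0.5, with a fixed seed for repeatability
rng = np.random.default_rng(0)
x_demo = np.linspace(0.0, 1.0, 50)
y_demo = 4.0 * x_demo + 0.5 + rng.normal(scale=0.1, size=x_demo.size)

m_demo, c_demo = 0.0, 0.0
mse_history = []
for epoch in range(200):
    m_demo, c_demo = sequential_epoch(x_demo, y_demo, m_demo, c_demo, eta=0.05)
    mse_history.append(mean_squared_error(x_demo, y_demo, m_demo, c_demo))

print("MSE after first pass:", mse_history[0])
print("MSE after last pass: ", mse_history[-1])
```

Plotting `mse_history` against the pass number gives a curve that reflects the fit on the whole dataset rather than on a single point.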

**Exercise: Is this better than vanilla gradient descent?**

Hint: Check the error value at saturation, and the number of iterations it takes to reach saturation.

In [None]:
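# Error value at saturation: 0.007
def find_itr(epochs, errs, threshold=0.007):
    # Return the first recorded iteration count whose error is at or below the threshold
    for epoch, err in zip(epochs, errs):
        if err <= threshold:
            return epoch
    return None  # the threshold was never reached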

In [None]:
find_itr(epochs,errs)
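For the comparison suggested in the exercise, a batch (vanilla) gradient descent baseline could look like the sketch below. This is an assumption about what such a baseline might be, not the implementation used earlier in the course: it averages the gradient over all samples and makes one update per pass. The function name `train_batch` and the toy data are hypothetical.

```python
import numpy as np

def train_batch(x, y, m, c, eta, iterations=1000):
    # Batch gradient descent: one update per pass, using the mean gradient
    for _ in range(iterations):
        residual = y - (m * x + c)
        m = m + eta * np.mean(residual * x)
        c = c + eta * np.mean(residual)
    # Mean squared error over the whole dataset after the final update
    final_error = float(np.mean((y - (m * x + c)) ** 2))
    return m, c, final_error

# Hypothetical noise-free data on the line y = 2x + 1
x_toy = np.linspace(0.0, 1.0, 20)
y_toy = 2.0 * x_toy + 1.0

m_batch, c_batch, err_batch = train_batch(x_toy, y_toy, 0.0, 0.0, eta=0.1, iterations=5000)
print("m = {0:.6} c = {1:.6} Error = {2:.6}".format(m_batch, c_batch, err_batch))
```

With the same learning rate, the batch version makes one update per pass while the sequential version makes one update per sample, so the two can need very different numbers of passes to reach the same error.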