## Generating the dataset

In [1]:
from random import Random
from matplotlib.pyplot import figure
%matplotlib inline

SEED = 5
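random_gen = Random(x = SEED)  # seeded generator so the dataset is reproducible

def generate_pts(N = 1000):
    # Return N x-coordinates and N y-coordinates, each drawn uniformly between 0 and 1
    return(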
        [random_gen.uniform(a = 0, b = 1) for _ in range(N)],
        [random_gen.uniform(a = 0, b = 1) for _ in range(N)]
    )
data_x, data_y = generate_pts()

## Defining the loss function

In [2]:
from math import sqrt

def loss(x_p, y_p):
    # Mean Euclidean distance from the point (x_p, y_p) to every data point
    return (1/len(data_x))*sum(
        [sqrt((x_i - x_p)**2+(y_i - y_p)**2) for x_i, y_i in zip(data_x, data_y)]
    )
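
Since the loss is a mean of Euclidean distances, its partial derivatives can also be written in closed form: each data point contributes a unit vector pointing from that data point toward (x_p, y_p). The sketch below is only an illustration of that idea, not part of the original notebook. The names `mean_distance`, `mean_distance_gradient` and `points` are hypothetical, and the data is stored as a list of (x, y) tuples, so the sample differs from `data_x` and `data_y` above.

```python
from math import hypot
from random import Random


def mean_distance(points, x_p, y_p):
    # Same quantity as the notebook's loss, written for a list of (x, y) tuples
    return sum(hypot(x - x_p, y - y_p) for x, y in points) / len(points)


def mean_distance_gradient(points, x_p, y_p):
    # Each point contributes the unit vector from that point toward (x_p, y_p)
    gx = gy = 0.0
    for x, y in points:
        d = hypot(x - x_p, y - y_p)
        if d == 0.0:
            continue  # the distance is not differentiable at a data point; skip it
        gx += (x_p - x) / d
        gy += (y_p - y) / d
    return gx / len(points), gy / len(points)


# Hypothetical sample: 1000 points drawn uniformly between 0 and 1
rng = Random(5)
points = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(1000)]

gx, gy = mean_distance_gradient(points, 5.0, 5.0)
print(f"d(loss)/dx = {gx:.4f}, d(loss)/dy = {gy:.4f}")
```

Because the gradient is an average of unit vectors, its length can never exceed 1, however far (x_p, y_p) is from the data.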

## Gradient descent function

In [3]:
# Runs gradient descent on the loss, updating x_p and y_p in every epoch.
# Arguments: the initial values x_p_init and y_p_init, the step H used in the
# finite-difference estimate of the derivative, the learning rate (step size) DELTA
# used to update x_p and y_p, and EPOCHS (the number of iterations).
# Returns the final x_p and y_p, plus two lists that keep track of the x_p and y_p
# values at the start of each epoch.

def optimize_return_parameters_list(x_p_init, y_p_init, H, DELTA, EPOCHS):
    x_p, y_p = x_p_init, y_p_init
    x_p_list = []
    y_p_list = []
    for _ in range(EPOCHS):
        x_p_list.append(x_p)
        y_p_list.append(y_p)
        current_loss = loss(x_p, y_p)  # evaluate once and reuse for both partial derivatives
        dloss_dx = (loss(x_p + H, y_p) - current_loss) / H
        dloss_dy = (loss(x_p, y_p + H) - current_loss) / H
        x_p -= dloss_dx * DELTA
        y_p -= dloss_dy * DELTA
    return x_p, y_p, x_p_list, y_p_list
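
Written out as formulas, the loss and the update applied in each epoch look roughly as follows. The symbols $L$ (the loss) and $N$ (the number of data points) are shorthand introduced here for illustration; `H` and `DELTA` are the function's arguments.

```latex
L(x_p, y_p) = \frac{1}{N} \sum_{i=1}^{N} \sqrt{(x_i - x_p)^2 + (y_i - y_p)^2}

\frac{\partial L}{\partial x_p} \approx \frac{L(x_p + H,\, y_p) - L(x_p,\, y_p)}{H},
\qquad
\frac{\partial L}{\partial y_p} \approx \frac{L(x_p,\, y_p + H) - L(x_p,\, y_p)}{H}

x_p \leftarrow x_p - \mathrm{DELTA} \cdot \frac{\partial L}{\partial x_p},
\qquad
y_p \leftarrow y_p - \mathrm{DELTA} \cdot \frac{\partial L}{\partial y_p}
```

Both partial derivatives are estimated at the current point before either coordinate is updated, which matches the order of operations in the loop above.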

## Dr. Meena's original trial

In [4]:
x_p_final, y_p_final, x_p_list, y_p_list = optimize_return_parameters_list(5, 5, 0.001, 0.01, 3000)

print(f"The final xp value: {x_p_final} and the final yp value: {y_p_final}")

fig = figure(figsize=(20,10))

xp_ax = fig.add_subplot(1, 2, 1)
yp_ax = fig.add_subplot(1, 2, 2)

xp_ax.plot(x_p_list)
yp_ax.plot(y_p_list)

xp_ax.set_title('Xp values with iterations')
yp_ax.set_title('Yp values with iterations')

# The values are fractional, so set_ticks is used to get finer gradations on the y axis
xp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
xp_ax.set_ylim(bottom = 0, top = 6)

yp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
yp_ax.set_ylim(bottom = 0, top = 6)

fig.show()

## Trial 1
**Negative initial xp and yp**

In [5]:
x_p_final, y_p_final, x_p_list, y_p_list = optimize_return_parameters_list(-5, -5, 0.001, 0.01, 3000)

print(f"The final xp value: {x_p_final} and the final yp value: {y_p_final}")

fig = figure(figsize=(20,10))

xp_ax = fig.add_subplot(1, 2, 1)
yp_ax = fig.add_subplot(1, 2, 2)

xp_ax.plot(x_p_list)
yp_ax.plot(y_p_list)
xp_ax.set_title('Xp values with iterations')
yp_ax.set_title('Yp values with iterations')

# xp_ax.yaxis.set_ticks([0.2*i for i in range(-25, 6)])
# xp_ax.set_ylim(bottom = -5, top = 1)

# yp_ax.yaxis.set_ticks([0.2*i for i in range(-25, 6)])
# yp_ax.set_ylim(bottom = -5, top = 1)

fig.show()

### Comment
Because the initial values were negative, unlike in the previous example, the xp and yp values increased instead of decreasing. They still converged to the optimum value in the end, perhaps taking a little longer than in the original trial because the starting point was farther from the optimum.

## Trial 2
**Initial xp and yp close to 0.5**

In [6]:
x_p_final, y_p_final, x_p_list, y_p_list = optimize_return_parameters_list(0.7, 0.7, 0.001, 0.01, 3000)

print(f"The final xp value: {x_p_final} and the final yp value: {y_p_final}")

fig = figure(figsize=(20,10))
xp_ax = fig.add_subplot(1, 2, 1)
yp_ax = fig.add_subplot(1, 2, 2)

xp_ax.plot(x_p_list)
yp_ax.plot(y_p_list)
xp_ax.set_title('Xp values with iterations')
yp_ax.set_title('Yp values with iterations')

xp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
xp_ax.xaxis.set_ticks([200*i for i in range(16)])
xp_ax.set_ylim(bottom = 0, top = 6)

yp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
yp_ax.set_ylim(bottom = 0, top = 6)

fig.show()

### Comment
Setting the initial xp and yp close to the optimum solution made convergence much faster than in the original trial. The values had almost settled after 200 epochs.
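
Instead of reading the settling point off the plot, one could estimate it from the recorded values. The helper below is a minimal sketch under stated assumptions: `epochs_to_settle` and the tolerance `tol` are hypothetical names, "settled" is taken to mean "stays within `tol` of the final value", and the short `trajectory` list is made-up data standing in for `x_p_list`.

```python
def epochs_to_settle(values, tol=0.01):
    """Return the first index from which every later value stays within tol of the last value.

    Assumes values is a non-empty list such as x_p_list.
    """
    final = values[-1]
    settled_at = len(values) - 1
    # Walk backwards until a value falls outside the tolerance band
    for i in range(len(values) - 1, -1, -1):
        if abs(values[i] - final) > tol:
            break
        settled_at = i
    return settled_at


# Made-up trajectory for illustration only
trajectory = [0.7, 0.6, 0.55, 0.52, 0.505, 0.501, 0.5, 0.5]
print(epochs_to_settle(trajectory, tol=0.01))
```

Applied to the `x_p_list` of two trials, the same call would allow a numerical comparison of how quickly each one settles.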

## Trial 3
**Trying a bigger H**

In [7]:
x_p_final, y_p_final, x_p_list, y_p_list = optimize_return_parameters_list(5, 5, 0.1, 0.01, 3000)

print(f"The final xp value: {x_p_final} and the final yp value: {y_p_final}")

fig = figure(figsize=(20,10))
xp_ax = fig.add_subplot(1, 2, 1)
yp_ax = fig.add_subplot(1, 2, 2)

xp_ax.plot(x_p_list)
yp_ax.plot(y_p_list)

xp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
xp_ax.set_ylim(bottom = 0, top = 6)

yp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
yp_ax.set_ylim(bottom = 0, top = 6)

xp_ax.set_title('Xp values with iterations')
yp_ax.set_title('Yp values with iterations')

fig.show()

### Comment
The definition of the derivative takes the limit as h approaches zero, and the H used here is much larger than in the original trial. As a result, the finite-difference estimate of the derivative is inaccurate, and the final values printed above are far from the optimum solution. **The error is large**.
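
The effect of H on a finite-difference estimate can be seen on a function whose derivative is known exactly. The sketch below is an illustration only: `cube` is a stand-in function, not the notebook's loss, and `central_difference` is an alternative estimator that the notebook does not use. For a smooth function, the forward-difference error shrinks roughly in proportion to H, while the central-difference error shrinks roughly in proportion to H squared.

```python
def forward_difference(f, x, h):
    # One-sided estimate, the same form used in the gradient descent loop
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h):
    # Symmetric estimate; not used in the notebook, shown for comparison
    return (f(x + h) - f(x - h)) / (2 * h)


def cube(x):
    # Stand-in function with a known derivative: d/dx x**3 = 3 * x**2
    return x ** 3


exact = 3.0  # derivative of x**3 at x = 1
for h in (0.1, 0.001):
    fwd_err = abs(forward_difference(cube, 1.0, h) - exact)
    ctr_err = abs(central_difference(cube, 1.0, h) - exact)
    print(f"H={h}: forward error={fwd_err:.6f}, central error={ctr_err:.6f}")
```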

## Trial 4
**Trying a bigger DELTA**

In [8]:
x_p_final, y_p_final, x_p_list, y_p_list = optimize_return_parameters_list(5, 5, 0.001, 1.5, 3000)

print(f"The final xp value: {x_p_final} and the final yp value: {y_p_final}")

fig = figure(figsize=(20,10))
xp_ax = fig.add_subplot(1, 2, 1)
yp_ax = fig.add_subplot(1, 2, 2)

xp_ax.scatter(range(len(x_p_list)), x_p_list, s=2)
yp_ax.scatter(range(len(y_p_list)), y_p_list, s=2)
xp_ax.set_title('Xp values with iterations')
yp_ax.set_title('Yp values with iterations')

xp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
xp_ax.set_ylim(bottom = 0, top = 6)

yp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
yp_ax.set_ylim(bottom = 0, top = 6)

fig.show()

### Comment
With a higher learning rate, xp and yp approached the optimal solution faster but never settled on it. Because the step is so much bigger, the values keep fluctuating around the optimal solution without reaching it.
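
One simple way to put a number on this fluctuation is to measure how far the recorded values still move during the last epochs. The helper below is a sketch under stated assumptions: `tail_spread` and its `last` parameter are hypothetical names, and the two lists are made-up trajectories used only to show the contrast, not results from this notebook.

```python
def tail_spread(values, last=100):
    # Difference between the largest and smallest value over the final `last` entries
    tail = values[-last:]
    return max(tail) - min(tail)


# Made-up trajectories for illustration only
converged = [0.5 + 0.5 ** i for i in range(200)]                  # decays toward 0.5
bouncing = [0.5 + (0.3 if i % 2 else -0.3) for i in range(200)]   # alternates around 0.5

print(f"converged spread: {tail_spread(converged):.4f}")
print(f"bouncing spread:  {tail_spread(bouncing):.4f}")
```

A spread close to zero suggests the values have settled, while a spread that stays large indicates they are still bouncing around the optimum.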

## Trial 5
**Trying a much smaller DELTA**

In [9]:
x_p_final, y_p_final, x_p_list, y_p_list = optimize_return_parameters_list(5, 5, 0.001, 0.0001, 3000)

print(f"The final xp value: {x_p_final} and the final yp value: {y_p_final}")

fig = figure(figsize=(20,10))
xp_ax = fig.add_subplot(1, 2, 1)
yp_ax = fig.add_subplot(1, 2, 2)

xp_ax.plot(x_p_list)
yp_ax.plot(y_p_list)

xp_ax.set_title('Xp values with iterations')
yp_ax.set_title('Yp values with iterations')

xp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
xp_ax.set_ylim(bottom = 0, top = 6)

yp_ax.yaxis.set_ticks([0.2*i for i in range(60)])
yp_ax.set_ylim(bottom = 0, top = 6)

fig.show()

### Comment
With a very small DELTA, each update to xp and yp is so small that a very large number of EPOCHS is needed to reach the optimum values of xp and yp. We might get a more accurate estimate once we come near the optimum value, but it would take too many computations just to get there.
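
A rough lower bound on the number of epochs makes this concrete. The loss is a mean of distances, so moving the point by H changes the loss by at most H (triangle inequality). Each finite-difference derivative therefore has magnitude at most 1, and each coordinate moves by at most DELTA per epoch. The sketch below uses that bound; `min_epochs` is a hypothetical helper, and the target of 0.5 is an assumption about where the optimum roughly lies.

```python
from math import ceil


def min_epochs(start, target, delta):
    # Lower bound on epochs for one coordinate to travel from start to target,
    # assuming it moves by at most delta per epoch
    return ceil(abs(start - target) / delta)


# Assumed optimum near 0.5, starting from 5 as in the trials above
print(min_epochs(5, 0.5, 0.01))     # DELTA of the original trial
print(min_epochs(5, 0.5, 0.0001))   # DELTA of this trial
```

By the same bound, 3000 epochs at DELTA = 0.0001 can move each coordinate by at most 0.3, far short of the distance from 5 to the region of the optimum.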