## Gradient Descent Algorithm
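Gradient descent minimizes a loss, here the squared error $L = (y-(wx^2+b))^2$ between the target $y$ and the model's prediction $wx^2+b$. It does so by taking the partial derivative of $L$ with respect to each parameter ($w$ and $b$ in this example) and repeatedly moving each parameter a small step in the opposite direction.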

・Gradient \
$\nabla L = \left(\frac{\partial L}{\partial w}, \frac{\partial L}{\partial b}\right)$

・Partial Derivatives \
$\frac{\partial}{\partial w}(y-(wx^2+b))^2 = -2x^2(y-wx^2-b)$ \
$\frac{\partial}{\partial b}(y-(wx^2+b))^2 = -2(y-wx^2-b)$

・Update parameters \
$w \leftarrow w - \eta \frac{\partial L}{\partial w}$ \
$b \leftarrow b - \eta \frac{\partial L}{\partial b}$
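One way to sanity-check the partial derivatives above is to compare them with a central finite-difference estimate of the squared-error loss. The sketch below is only illustrative: the helper names (`loss`, `analytic_grad`, `numeric_grad`), the step size `h`, and the sample values are assumptions, not part of the original notebook.

```python
def loss(w, b, x, y):
    """Squared error between the target y and the prediction w*x**2 + b."""
    return (y - (w * x**2 + b)) ** 2

def analytic_grad(w, b, x, y):
    """Partial derivatives of the loss, as given by the formulas above."""
    residual = y - (w * x**2 + b)
    return -2 * x**2 * residual, -2 * residual

def numeric_grad(w, b, x, y, h=1e-6):
    """Central finite-difference estimate of the same two partials."""
    dw = (loss(w + h, b, x, y) - loss(w - h, b, x, y)) / (2 * h)
    db = (loss(w, b + h, x, y) - loss(w, b - h, x, y)) / (2 * h)
    return dw, db

# Arbitrary sample point (hypothetical values chosen for illustration)
w0, b0, x0, y0 = 0.5, 0.25, 2.0, 0.0
grad_analytic = analytic_grad(w0, b0, x0, y0)
grad_numeric = numeric_grad(w0, b0, x0, y0)
print(grad_analytic)
print(grad_numeric)
```

If the two printed pairs agree to several decimal places, the hand-derived formulas are very likely correct.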

In [6]:
import random

# Model: predicts y as w*x^2 + b (the loss is the squared error of this prediction)
def nn(w,x,b):
    return w*(x**2) + b

w = random.random()
b = random.random()
lr = 0.01

x = 2
y = 0

def update_params():
    global w, b
    y_pred = nn(w,x,b)
    print(y_pred)
    derivative_wrt_w = -2*x**2*(y - y_pred) # partial derivative
    derivative_wrt_b = -2*(y - y_pred) # partial derivative
    w = w - lr*derivative_wrt_w # Update w
    b = b - lr*derivative_wrt_b # Update b

for i in range(100):
    update_params()

4.626872156973847
3.053735623602739
2.0154655115778075
1.330207237641353
0.8779367768432929
0.5794382727165733
0.3824292599929383
0.25240331159533924
0.16658618565292388
0.10994688253092977
0.07256494247041367
0.04789286203047294
0.03160928894011217
0.020862130700474046
0.013769006262312922
0.009087544133126513
0.0059977791278635895
0.003958534224389965
0.0026126325880974077
0.0017243375081443801
0.001138062755375202
0.0007511214185477177
0.0004957401362415403
0.0003271884899194166
0.00021594440334682385
0.00014252330620889708
9.40653820978854e-05
6.208315218458882e-05
4.0974880441857486e-05
2.7043421091610398e-05
1.7848657920427335e-05
1.1780114227533112e-05
7.774875390165192e-06
5.131417757597845e-06
3.3867357199790504e-06
2.2352455751883937e-06
1.4752620796087967e-06
9.736729725817739e-07
6.42624161995009e-07
4.2413194700774426e-07
2.799270850672997e-07
1.8475187613109512e-07
1.2193623821765698e-07
8.047791721921271e-08
5.311542528030344e-08
3.505618073607053e-08
2.313707925249986e-
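The printed predictions shrink by a constant ratio of about 0.66 per step. Substituting the update rule into the prediction shows why: for a single training point, the residual $y - (wx^2+b)$ is multiplied by $1 - 2\eta(x^4+1)$ at every step, which is $1 - 0.02 \cdot 17 = 0.66$ for $x = 2$ and $\eta = 0.01$. The sketch below checks this numerically; the function `gd_step`, the `_demo` variable names, and the starting values are assumptions made for illustration.

```python
def gd_step(w, b, x, y, lr):
    """One gradient descent step on the squared error (y - (w*x**2 + b))**2."""
    residual = y - (w * x**2 + b)
    w_new = w - lr * (-2 * x**2 * residual)
    b_new = b - lr * (-2 * residual)
    return w_new, b_new

x_demo, y_demo, lr_demo = 2, 0, 0.01
w_demo, b_demo = 0.9, 0.3  # arbitrary starting point (hypothetical values)

# Predicted per-step contraction factor of the residual
factor = 1 - 2 * lr_demo * (x_demo**4 + 1)

residuals = []
for _ in range(5):
    residuals.append(y_demo - (w_demo * x_demo**2 + b_demo))
    w_demo, b_demo = gd_step(w_demo, b_demo, x_demo, y_demo, lr_demo)

# Ratio between consecutive residuals should match the predicted factor
ratios = [residuals[i + 1] / residuals[i] for i in range(len(residuals) - 1)]
print(factor)
print(ratios)
```

The same factor explains the sensitivity to the learning rate: the residual only shrinks when $|1 - 2\eta(x^4+1)| < 1$, which for $x = 2$ means $\eta < 1/17 \approx 0.059$.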

In [8]:
epsilon = 1e-8 # Tolerance (well above floating-point rounding error)
y_hat = nn(w,x,b) # Prediction of y
abs(y - y_hat) < epsilon # Check that the prediction is within epsilon of the target, in either direction

True
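A convergence check should treat overshoot and undershoot the same way, so it is safer to compare the absolute difference against the threshold. A minimal sketch using the standard library's `math.isclose` follows; the helper name `has_converged` and the tolerance `1e-8` are assumptions. Because the target here is 0, a relative tolerance is meaningless, so only `abs_tol` is used.

```python
import math

def has_converged(y_true, y_pred, abs_tol=1e-8):
    """True when y_pred is within abs_tol of y_true, on either side."""
    return math.isclose(y_true, y_pred, rel_tol=0.0, abs_tol=abs_tol)

print(has_converged(0.0, 5e-9))   # tiny positive error
print(has_converged(0.0, -5e-9))  # tiny negative error
print(has_converged(0.0, 1.0))    # large error; a one-sided (y - y_hat) < epsilon test would accept this
```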