## Let's Optimize the function
### $y = x^{4} - 2x^{2}$
### $\frac{\partial{y}}{\partial{x}} = 4x^{3} - 4x$
### $\frac{\partial^{2}{y}}{\partial{x}^{2}} = 12x^{2} - 4$
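
As a quick sanity check on the two formulas above, the sketch below compares them with central finite differences at $x = 2$. It uses plain Python only. The helper names and the step sizes `h` are illustrative choices, not part of DeZero.

```python
def f(x):
    return x**4 - 2 * x**2

def dfdx(x):
    # Analytic first derivative: 4x^3 - 4x
    return 4 * x**3 - 4 * x

def d2fdx2(x):
    # Analytic second derivative: 12x^2 - 4
    return 12 * x**2 - 4

def central_diff(func, x, h=1e-5):
    # First derivative approximated by a symmetric difference
    return (func(x + h) - func(x - h)) / (2 * h)

def central_diff2(func, x, h=1e-4):
    # Second derivative approximated by a symmetric second difference
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2

x0 = 2.0
print(dfdx(x0), central_diff(f, x0))
print(d2fdx2(x0), central_diff2(f, x0))
```

Each printed pair should agree to several decimal places, which is the same 24 and 44 that backprop is expected to return.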

In [4]:
import numpy as np

from dezero import Variable
from dezero import Function

def f(x):
    y = x**4 - 2*x**2
    return y

x = Variable(np.array(2.0))
y = f(x)
y.backward(create_graph=True)
print(x.grad)

Variable(24.0)


In [2]:
gx = x.grad
gx.backward()
print(x.grad)

Variable(68.0)


### $\frac{\partial{y}}{\partial{x}}|_{x=2} = 24$
### $\frac{\partial^{2}{y}}{\partial{x}^{2}}|_{x=2} = 44$

The result above is wrong: it should be **44**, not **68**.

This happened because we did not clear the first-order gradient, so the second backward pass added its result to the value already stored in `x.grad`.
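
The number 68 is consistent with gradient accumulation: it is the first derivative (24) plus the second derivative (44) at $x = 2$. The plain-Python sketch below reproduces that arithmetic from the analytic formulas. It does not use DeZero, and the variable names are only illustrative.

```python
def first_derivative(x):
    return 4 * x**3 - 4 * x

def second_derivative(x):
    return 12 * x**2 - 4

x0 = 2.0
stored_grad = first_derivative(x0)    # left in x.grad by the first backward pass
new_grad = second_derivative(x0)      # produced by gx.backward()
accumulated = stored_grad + new_grad  # what x.grad holds if it is not cleared first

print(stored_grad, new_grad, accumulated)  # 24.0 44.0 68.0
```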

In [5]:
gx = x.grad

# We need to clear the 1st order gradient!
x.cleargrad()

gx.backward()
print(x.grad)

Variable(44.0)


## Newton Method manually!
### $x\leftarrow x-\frac{f'(x)}{f''(x)}$

Compare this with gradient descent, $x\leftarrow x-\alpha f'(x)$. There the step size $\alpha$ is normally chosen by hand; here it is set to $\alpha = \frac{1}{f''(x)}$.
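
To make the difference concrete, here is a minimal plain-Python sketch that runs both update rules from $x = 2$ and counts the steps until $|f'(x)|$ drops below a tolerance. It uses the analytic derivatives instead of DeZero. The fixed step size `0.01`, the tolerance, and the helper `count_steps` are arbitrary choices made for illustration.

```python
def grad(x):
    # f'(x) = 4x^3 - 4x
    return 4 * x**3 - 4 * x

def hess(x):
    # f''(x) = 12x^2 - 4
    return 12 * x**2 - 4

def count_steps(step_size, x=2.0, tol=1e-8, max_iters=10_000):
    """Iterate x <- x - step_size(x) * f'(x) until |f'(x)| < tol."""
    for n in range(max_iters):
        g = grad(x)
        if abs(g) < tol:
            return n, x
        x -= step_size(x) * g
    return max_iters, x

# Gradient descent: alpha is a hand-picked constant
gd_steps, gd_x = count_steps(lambda x: 0.01)

# Newton's method: alpha = 1 / f''(x)
newton_steps, newton_x = count_steps(lambda x: 1.0 / hess(x))

print("gradient descent:", gd_steps, gd_x)
print("newton:", newton_steps, newton_x)
```

Both runs should end near $x = 1$, with Newton's method needing far fewer steps for this function and starting point.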

In [8]:
def f(x):
    y = x**4 - 2*x**2
    return y

def gx2(x):
    return 12*x**2 - 4

x = Variable(np.array(2.0))
iters = 10

xs = []
ys = []


for i in range(iters):
    print(i, x)
    
    y = f(x)
    xs.append(x.data)
    ys.append(y.data)    
    
    x.cleargrad()
    y.backward()
    
    # Newton update: x <- x - f'(x) / f''(x), with f'' from the hand-written gx2
    x.data = (x.data - x.grad / gx2(x.data)).data

0 Variable(2.0)
1 Variable(1.4545454545454546)
2 Variable(1.1510467893775467)
3 Variable(1.0253259289766978)
4 Variable(1.0009084519430513)
5 Variable(1.0000012353089454)
6 Variable(1.000000000002289)
7 Variable(1.0)
8 Variable(1.0)
9 Variable(1.0)


## Newton method automatically!
### $x\leftarrow x-\frac{f'(x)}{f''(x)}$

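The update rule is the same as above. This time the second derivative comes from a second backward pass instead of a hand-written formula.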

In [10]:
def f(x):
    y = x**4 - 2*x**2
    return y

x = Variable(np.array(2.0))
iters = 10

xs = []
ys = []


for i in range(iters):
    print(i, x)
    
    y = f(x)
    xs.append(x.data.copy())  # copy, because x.data is updated in place below
    ys.append(y.data)    
    
    # 1st-order derivative (keep the graph so it can be differentiated again)
    x.cleargrad()
    y.backward(create_graph=True)
    
    # 2nd-order derivative
    gx = x.grad
    x.cleargrad()
    gx.backward()
    gx2 = x.grad
    
    x.data -= gx.data / gx2.data

0 Variable(2.0)
1 Variable(1.4545454545454546)
2 Variable(1.1510467893775467)
3 Variable(1.0253259289766978)
4 Variable(1.0009084519430513)
5 Variable(1.0000012353089454)
6 Variable(1.000000000002289)
7 Variable(1.0)
8 Variable(1.0)
9 Variable(1.0)
