In [None]:
from IPython.core.display import HTML
with open("../style.css", "r") as file:
    css = file.read()
HTML(css)

# Automatic Differentiation with `autograd`

Technically, `autograd` is a layer that wraps and extends `numpy`.  Hence it is most often imported as follows:

In [None]:
import autograd
import autograd.numpy as np

The function `S` implements the [sigmoid function](https://en.wikipedia.org/wiki/Sigmoid_function), which is defined as
$$ S(x) = \frac{1}{1 + \mathrm{e}^{-x}}. $$

In [None]:
def S(x):
    return 1.0 / (1.0 + np.exp(-x))

The function `Q(x)` computes the square of `x`, i.e. we have 
$$ Q(x) = x^2. $$
Of course, the derivative of $x^2$ is just $2\cdot x$, i.e. we have
$$ \frac{\mathrm{d} Q}{\mathrm{d} x} = 2 \cdot x. $$

In [None]:
def Q(x):
    return np.multiply(x, x)

In [None]:
Q_grad = autograd.grad(Q)

In [None]:
Q_grad(1.0)
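
The value returned by `Q_grad` can be cross-checked without `autograd` by a symmetric difference quotient.  The following is a minimal sketch; the names `central_difference` and `square` as well as the step size `h` are assumptions made for this illustration only.

```python
def central_difference(f, x, h=1.0e-6):
    """Approximate f'(x) by the symmetric difference quotient (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def square(x):
    # Same function as Q, written without numpy.
    return x * x

for x in [-2.0, 0.0, 1.0, 3.5]:
    # Print the point, the numerical approximation, and the exact derivative 2 * x.
    print(x, central_difference(square, x), 2.0 * x)
```

In contrast to this approximation, automatic differentiation does not depend on a step size.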

The function `S_prime` computes the [derivative](https://en.wikipedia.org/wiki/Derivative) of the sigmoid function.  We implement it using *automatic differentiation*.  This is the closest thing to magic I have seen yet.

In [None]:
S_prime = autograd.grad(S)

In the lecture we have seen that the following identity holds for the derivative of the sigmoid function:
$$ S'(x) = S(x) \cdot \bigl(1 - S(x)\bigr) $$
Let's test this identity.

In [None]:
for x in np.arange(-2.0, 2.0, 0.1):
    print(S_prime(x) - S(x) * (1.0 - S(x)))

The identity seems to hold up to rounding errors.
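
The identity can also be verified by hand.  Differentiating $S(x) = \bigl(1 + \mathrm{e}^{-x}\bigr)^{-1}$ with the chain rule gives
$$ S'(x) = \frac{\mathrm{e}^{-x}}{\bigl(1 + \mathrm{e}^{-x}\bigr)^2} = \frac{1}{1 + \mathrm{e}^{-x}} \cdot \frac{\mathrm{e}^{-x}}{1 + \mathrm{e}^{-x}}. $$
The first factor is $S(x)$.  Since
$$ 1 - S(x) = \frac{1 + \mathrm{e}^{-x} - 1}{1 + \mathrm{e}^{-x}} = \frac{\mathrm{e}^{-x}}{1 + \mathrm{e}^{-x}}, $$
the second factor is $1 - S(x)$.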

The cool thing about `autograd` is that it can take the derivative of an ordinary Python function, including one that contains a loop.
The function `mySqrt(x)` computes the square root of `x` using [Newton's method](https://en.wikipedia.org/wiki/Newton%27s_method).

In [None]:
def mySqrt(x):
    # Newton's method for the equation root * root = x, starting from the guess 1.0.
    root = 1.0
    eps  = 2.0e-15
    while abs(x - root * root) > eps:
        root = 0.5 * (root + x / root)
    return root

In [None]:
[mySqrt(n) for n in range(10)]

In [None]:
mySqrtGrad = autograd.grad(mySqrt)
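
One way to see why differentiating through the `while` loop is possible at all: if every arithmetic operation also propagates a derivative, the loop runs as usual and the derivative comes along.  The sketch below illustrates this idea with forward-mode dual numbers.  It is not how `autograd.grad` works internally, since `autograd.grad` uses reverse mode; the class `Dual` and the function `newton_sqrt` are hypothetical names introduced only for this illustration.

```python
class Dual:
    """Hypothetical helper: a value together with its derivative with respect to the input."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(other):
        # Plain numbers are treated as constants, i.e. their derivative is 0.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        # Product rule.
        other = Dual.lift(other)
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

    def __truediv__(self, other):
        # Quotient rule.
        other = Dual.lift(other)
        der = (self.der * other.val - self.val * other.der) / (other.val * other.val)
        return Dual(self.val / other.val, der)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self

    def __abs__(self):
        return Dual(abs(self.val), self.der if self.val >= 0 else -self.der)

    def __gt__(self, other):
        # Comparisons only look at the value, so control flow is unchanged.
        return self.val > Dual.lift(other).val


def newton_sqrt(x):
    # Same loop as mySqrt; it works for floats and for Dual objects alike.
    root = 1.0
    eps = 2.0e-15
    while abs(x - root * root) > eps:
        root = 0.5 * (root + x / root)
    return root


result = newton_sqrt(Dual(2.0, 1.0))   # derivative of the input w.r.t. itself is 1
print(result.val, result.der)          # square root of 2 and its derivative
```

The loop condition only inspects values, so the dual numbers follow exactly the same iterations as plain floats would.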

In [None]:
import random as rnd
rnd.seed(42)

In [None]:
rnd.random()

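Since
$$ \frac{\mathrm{d}\; }{\mathrm{d} x}\sqrt{x} = \frac{1}{2} \cdot \frac{1}{\sqrt{x}}, $$
the derivative computed by `autograd` should agree with $\frac{1}{2 \cdot \sqrt{x}}$ up to rounding errors.  The following loop prints every case where the two values differ by more than $10^{-15}$.

In [None]:
for k in range(1000):
    x = rnd.random()
    error = mySqrtGrad(x) - 0.5 / mySqrt(x)
    # Compare the absolute value so that negative deviations are reported as well.
    if abs(error) > 1.0e-15:
        print(error, x, mySqrt(x), mySqrtGrad(x))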

Unfortunately, `autograd` has its limitations, as shown by the next cell.

In [None]:
mySqrtGrad(1.0)

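We can work around this by rewriting the function `mySqrt`.  The problem with the old implementation is that for $x = 1.0$ the loop body is never executed, so the function returns the constant initial guess `1.0`, which does not depend on `x`.  The version below therefore starts from the initial guess `0.5 * x`.  Note that this only moves the problem: for $x = 4.0$ the new initial guess is already exact, the loop is skipped again, and the result is the derivative of `0.5 * x`, i.e. $0.5$ instead of $0.25$.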

In [None]:
def mySqrt(x): 
    root = 0.5 * x
    eps  = 2.0e-15
    while abs(x - root * root) > eps:
        root = 0.5 * (root + x / root)    
    return root

In [None]:
mySqrtGrad = autograd.grad(mySqrt)

In [None]:
mySqrtGrad(1.0)

## Implementing Newton's Method with `autograd`

[Newton's method](https://en.wikipedia.org/wiki/Newton%27s_method) for solving an equation of the form
$$  f(x) = 0 $$
defines a sequence $(x_n)_{n\in\mathbb{N}}$ inductively:
* $x_0 = 1.0$
* $x_{n+1} = x_n - \frac{\displaystyle f(x_n)}{\displaystyle f'(x_n)}$  
 
Then, if the function $f$ is convex, twice differentiable, and has a zero, the limit 
$$ \bar{x} = \lim\limits_{n\rightarrow\infty} x_n $$
satisfies $f(\bar{x}) = 0$.
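
As a worked example, take $f(x) = x^2 - a$ for some $a > 0$, whose positive zero is $\sqrt{a}$.  Since $f'(x) = 2 \cdot x$, the Newton step becomes
$$ x_{n+1} = x_n - \frac{x_n^2 - a}{2 \cdot x_n} = \frac{1}{2} \cdot \Bigl(x_n + \frac{a}{x_n}\Bigr), $$
which is exactly the update `root = 0.5 * (root + x / root)` used in `mySqrt` above.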

The function `newton` takes a function `f`, computes its derivative `fs` with `autograd.grad`, and uses Newton's method 
to compute a value `x` such that $|f(x)| \leq 10^{-14}$.

In [None]:
def newton(f):
    fs = autograd.grad(f)
    x = 1.0 
    eps = 1.0e-14
    while abs(f(x)) > eps:
        x = x - f(x) / fs(x)
        print(x)
    return x

We proceed to solve the equation
$$ \cos(x) - x = 0. $$
To this end we define the function $f(x) = \cos(x) - x$.

In [None]:
def f(x): 
    return np.cos(x) - x

In [None]:
x = newton(f)

In [None]:
np.cos(x)
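
The result can be cross-checked without `autograd` by supplying the derivative by hand.  The following is a minimal sketch under that assumption; `newton_explicit`, `g`, `g_prime`, and the iteration limit `max_steps` are hypothetical names and parameters that do not appear in the cells above.

```python
import math

def newton_explicit(f, f_prime, x=1.0, eps=1.0e-14, max_steps=50):
    """Newton's method with a hand-written derivative (hypothetical helper)."""
    for _ in range(max_steps):
        if abs(f(x)) <= eps:
            break
        x = x - f(x) / f_prime(x)
    return x

def g(x):
    return math.cos(x) - x

def g_prime(x):
    # Derivative of cos(x) - x, worked out by hand.
    return -math.sin(x) - 1.0

x_bar = newton_explicit(g, g_prime)
print(x_bar, math.cos(x_bar))   # both values should agree
```

The iteration limit is only a safeguard against a non-terminating loop; it is not part of Newton's method itself.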