# Automatic differentiation

Many engineering applications require the derivative of a function. A simple example is Newton's method, a fast iterative algorithm for finding a root of a nonlinear equation $f(x) = 0$:

$$
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.
$$
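
As a minimal sketch of the iteration above, assuming the derivative is written by hand and passed in as a separate function (the names `newton`, `df`, `tolerance` and `max_iterations` are illustrative and not taken from any library):

```python
def newton(f, df, x0, tolerance=1e-12, max_iterations=50):
    """Iterate x_{n+1} = x_n - f(x_n) / f'(x_n) until the step is small."""
    x = x0
    for _ in range(max_iterations):
        step = f(x) / df(x)
        x = x - step
        if abs(step) < tolerance:
            return x
    raise RuntimeError("Newton's method did not converge")


# Example: the positive root of x^2 - 2, that is, sqrt(2).
# The derivative 2x has to be supplied explicitly here.
sqrt2 = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(sqrt2)
```

The need to supply `df` by hand is exactly the inconvenience that the techniques below try to remove.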

There are, of course, more complicated applications that need derivatives in addition to function values. For instance, training a neural network means minimizing a cost function by gradient descent, which requires the derivatives of the cost function with respect to the parameters (weights and biases).


### How to calculate the derivative of a function?

* symbolic differentiation: either the derivative of $f$ is known in closed form in advance, or a computer algebra system calculates it
* numerical differentiation: approximate the value of $f'(x)$ with finite differences:
$$
f'(x)\approx \frac{f(x+\epsilon) - f(x-\epsilon)}{2\epsilon}
$$
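* automatic differentiation: evaluate the function with arithmetic that propagates the derivative alongside the function value, so the derivative is obtained automatically, without deriving a formula by hand and without the truncation error of a finite step.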
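
To make the numerical option concrete, here is a small sketch of the central difference formula above. The function name `central_difference` and the step size `eps=1e-6` are assumptions chosen for illustration. The result is only an approximation: a step that is too large increases the truncation error, while a step that is too small amplifies floating-point rounding error.

```python
import math


def central_difference(f, x, eps=1e-6):
    """Approximate f'(x) by (f(x + eps) - f(x - eps)) / (2 * eps)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Compare against the known derivative of sin, which is cos.
approx = central_difference(math.sin, 1.0)
exact = math.cos(1.0)
print(approx)
print(exact)
print(abs(approx - exact))
```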



### Dual numbers

A dual number can be represented as a symbolic expression $a + b\cdot\epsilon$, where $\epsilon^2 = 0$. More precisely, a dual number is a pair of real numbers $(a, b)$, such that addition, multiplication, negation and division are defined as follows (subtraction is addition of the negated number, and division requires $c \neq 0$):

* $(a + b\epsilon) + (c + d\epsilon) = (a + c) + (b + d)\epsilon$
* $(a + b\epsilon) \cdot (c + d\epsilon) = ac + (ad + bc)\epsilon$
* $-(a + b\epsilon) = -a + (-b)\epsilon$
* $(a + b\epsilon) / (c + d\epsilon) = \frac{a}{c} + \frac{bc - ad}{c^2}\epsilon$
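
The four rules above translate almost directly into operator overloads. The following is a minimal sketch, not the implementation used by any particular library; the class name `Dual`, its fields `a` and `b`, and the helper `_lift` are hypothetical names introduced for illustration.

```python
class Dual:
    """A dual number a + b*eps with eps**2 == 0 (illustrative helper class)."""

    def __init__(self, a, b=0.0):
        self.a = a  # real part
        self.b = b  # coefficient of eps

    @staticmethod
    def _lift(other):
        # Treat a plain number c as the dual number c + 0*eps.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.a, -self.b)

    def __sub__(self, other):
        return self + (-Dual._lift(other))

    def __rsub__(self, other):
        return Dual._lift(other) + (-self)

    def __mul__(self, other):
        other = Dual._lift(other)
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual._lift(other)
        return Dual(self.a / other.a,
                    (self.b * other.a - self.a * other.b) / other.a ** 2)

    def __rtruediv__(self, other):
        return Dual._lift(other) / self

    def __repr__(self):
        return "Dual({!r}, {!r})".format(self.a, self.b)


# Evaluate (x^2 + 2x) / (x + 1) at x = 3 with eps-coefficient 1.
x = Dual(3.0, 1.0)
y = (x * x + 2 * x) / (x + 1)
print(y)  # Dual(3.75, 1.0625)
```

Here `y.a` is the function value $15/4$ and `y.b` is $17/16$, which agrees with the derivative obtained from the quotient rule at $x = 3$.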

Let us assume that $p$ is a polynomial of degree $n$:
$$
p(x) = a_0 + a_1x + a_2x^2 + \ldots + a_nx^n.
$$
Let $x = a + b\epsilon$ be a dual number, then
$$
p(x) = p(a + b\epsilon) = p(a) + b\cdot p'(a)\cdot\epsilon.
$$
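
This identity can be checked term by term. Because $\epsilon^2 = 0$, every term of the binomial expansion beyond the first two vanishes:
$$
(a + b\epsilon)^k = a^k + k a^{k-1} b\epsilon + \binom{k}{2} a^{k-2} b^2 \epsilon^2 + \ldots = a^k + k a^{k-1} b\epsilon.
$$
Summing over the terms of the polynomial gives
$$
p(a + b\epsilon) = \sum_{k=0}^{n} a_k (a + b\epsilon)^k = \sum_{k=0}^{n} a_k a^k + b\epsilon \sum_{k=1}^{n} k a_k a^{k-1} = p(a) + b\cdot p'(a)\cdot\epsilon.
$$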
More generally, any analytic function $f$ extends to dual numbers by mapping $(a, b)$ to $(f(a), bf'(a))$. By extending the usual function definitions to dual numbers in this way, the derivative is calculated along with the function value: seeding the input with $b = 1$ yields $f'(a)$ as the $\epsilon$-coefficient of the result.

As an example, the trigonometric sine function should be overridden with the following new definition:
$$
\sin x = \sin(a, b) := (\sin a, b\cdot \cos a).
$$
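
A minimal sketch of such an override, assuming a dual number is stored as a plain pair with fields `a` and `b` (the `Dual` tuple below is a hypothetical stand-in and not the type of any particular library):

```python
import math
from collections import namedtuple

# Hypothetical minimal representation of the pair (a, b) = a + b*eps.
Dual = namedtuple("Dual", ["a", "b"])


def sin(x):
    """Sine that accepts ordinary floats as well as dual numbers."""
    if isinstance(x, Dual):
        # (a, b) -> (sin a, b * cos a)
        return Dual(math.sin(x.a), x.b * math.cos(x.a))
    return math.sin(x)


# Seeding b = 1 makes the eps-coefficient of the result equal to the derivative.
value, derivative = sin(Dual(0.5, 1.0))
print(value)
print(derivative)

# Composition applies the chain rule without extra code:
# d/dx sin(sin(x)) = cos(sin(x)) * cos(x)
nested = sin(sin(Dual(0.5, 1.0)))
print(nested.b)
```

Because the override returns a dual number again, overridden functions can be nested and the chain rule is applied step by step.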

In [None]:
# for python 2.x
from __future__ import division

In [None]:
# source: https://github.com/jeppe742/AutoDiff
from autodiff import AdFloat, cos

In [None]:
MAX_ITERATION = 20

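def evaluate(d, func):
    # Wrap the float in an AdFloat so that func yields value and derivative together.
    dual = AdFloat(d)
    return func(dual)


def find_root(func, initial, tolerance=1e-10):
    """Newton's method; the derivative comes from automatic differentiation."""
    x_approx = initial
    nr_iterations = 0
    while nr_iterations < MAX_ITERATION:
        evaluated = evaluate(x_approx, func)
        # evaluated.x plays the role of f(x_n), evaluated.dx that of f'(x_n)
        x_approx_next = x_approx - evaluated.x / evaluated.dx
        # Relative step size; adding the tolerance keeps the denominator positive.
        if abs(x_approx - x_approx_next) / (abs(x_approx) + tolerance) < tolerance:
            return x_approx_next
        x_approx = x_approx_next
        nr_iterations += 1
    print("did not converge")
    return None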

In [None]:
f = lambda x: x ** 3 + x - 1000
initial = 8

root = find_root(f, initial)
print(root)

In [None]:
f = lambda x: cos(x) - x
initial = 0

root = find_root(f, initial)
print(root)