Skip to content

Latest commit

 

History

History
91 lines (55 loc) · 4.21 KB

dualnumbers.rst

File metadata and controls

91 lines (55 loc) · 4.21 KB

Dual Numbers

A dual number is a number in the form a + b \epsilon, where a, b\in\mathbb{R} and \epsilon is the dual unit (or infinitesimal), which has the property \epsilon^2=0. In this tutorial, we will review the theory of dual numbers, define their algebra and discuss why dual numbers are useful for automatic differentiation.

Arithmetic operations

Let us consider two dual numbers F=f+f'\epsilon and G=g+g'\epsilon (the choice for this notation will become clear in a moment). First, addition and subtraction on dual numbers can be written as

(f+f'\epsilon) + (g+g'\epsilon) = (f+g)+(f'+g')\epsilon
(f+f'\epsilon) - (g+g'\epsilon) = (f+g)+(f'-g')\epsilon

For multiplication and division, recalling \epsilon^2=0 we have

(f+f'\epsilon)(g+g'\epsilon) = fg+f'g\epsilon+fg'\epsilon+\underbrace{f'g'\epsilon^2}_{=0}=fg+(f'g+fg')\epsilon
\frac{f+f'\epsilon}{g+g'\epsilon}=\frac{(f+f'\epsilon)(g-g'\epsilon)}{(g+g'\epsilon)(g-g'\epsilon)}=\frac{fg+(f'g-fg')\epsilon}{g^2}=\frac{f}{g}+\frac{f'g-fg'}{g^2}\epsilon

If you observe the expressions for the dual part, you may notice they are the formulae for the derivative of addition, subtraction, multiplication and division of functions.

Finally, using the rule for multiplication and the induction principle, we can show that

(f+f'\epsilon)^n=f^n+nf^{n-1}f'\epsilon

again, we can see how the dual part corresponds to the derivation rule for the power nf^{n-1}, multiplied by the derivative of the inner function f', as expected by the chain rule.

Polynomials

Let us start by considering a polynomial P(x)=\sum_{i=0}^nc_ix^i. Let us see what happens when this polynomial is evaluated at the dual number a+\epsilon.

\begin{align}
P(a+\epsilon)&=\sum_{i=0}^nc_i(a+\epsilon)^i=c_0+\sum_{i=1}^nc_i(a^i+ia^{i-1}\epsilon)\\&=c_0+\sum_{i=1}^nc_ia^i+\epsilon\sum_{i=1}^nic_ia^{i-1}\\&=P(a)+P'(a)\epsilon
\end{align}

Hence when computing P(a+\epsilon), the primal part will be the value of the polynomial at a and the dual part the value of the derivative.

The same calculations can be repeated for the more generic case P(f+f'\epsilon) leading to

P(f+f'\epsilon)=P(f)+P'(f)f'\epsilon

hence the primal part is the composition of functions and the dual part is the derivative of the composition, computed as expected by the chain rule.

Elementary Functions

So far we have seen how evaluating a polynomial over a dual number can be used to compute the derivative of the polynomial. What about an arbitrary elementary function f? If it is infinitely many times differentiable over its domain, then it has a Taylor series around a

f(a+h) = \sum_{i=0}^\infty\frac{f^{(i)}(x)h^i}{i!},

where f^{(i)} is the ith derivative of f. Now replacing h by the dual unit \epsilon and recalling that \epsilon^2=0 (and hence also all the higher powers) we get

f(a+\epsilon) = \sum_{i=0}^\infty\frac{f^{(i)}(x)\epsilon^i}{i!}=f(a)+f'(a)\epsilon

that is, the dual part is the derivative of the function. Considering the more generic case we get

f(g+g'\epsilon) = \sum_{i=0}^\infty\frac{f^{(i)}(g)g'^i\epsilon^i}{i!}=f(g)+f'(g)g'\epsilon

and hence the primal part is the composition of the functions and the dual part the derivative of the composition by the chain rule.

We have those shown that the algebra of dual numbers is the algebra of derivatives. Given a function f, its derivative at a can be extracted from the dual part of f(a+\epsilon).

This can be easily implementing in a programming language via function (and operator) overloading. First, define a type dual and overload arithmetic operations as described here and elementary functions using the derivation rules. If the function is written as a composition of the available overloaded elementary functions, the derivative can be computed automatically using dual numbers. This is exactly what is done in ForwardModeAD.