# Exploring algorithmic differentiation

In this notebook you will explore for yourself how Julia's composability lets you very quickly supercharge existing code with new features, just by changing the way the compiler sees the already existing code. Custom types and Julia's multiple dispatch are crucial to making this work easily, as we will see.

## Babylonian square root

In this work we will consider the *Babylonian Square Root* algorithm, which is a simple iterative algorithm the Babylonians invented for computing the square root of a number $x$:

   * Initialise $t \leftarrow (1 + x) / 2$
   * Repeat $t \leftarrow (t + x / t) / 2$ a further $N - 1$ times (the initialisation counts as the first step).
   * $t$ converges to $\sqrt{x}$.

In Julia code this can be implemented as follows:

In [None]:
function babylonian_sqrt(x; N=10)
    t = (1 + x) / 2
    for i = 2:N
        t = (t + x / t) / 2
    end
    t
end
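
To watch the convergence step by step, one can record every intermediate value of `t`. The helper `babylonian_trace` below is a hypothetical variant introduced only for illustration (it is not part of the notebook's code); it runs the same iteration as `babylonian_sqrt` but returns all iterates instead of just the last one.

```julia
# Hypothetical helper: same iteration as `babylonian_sqrt`,
# but every intermediate value of `t` is kept in a vector.
function babylonian_trace(x; N=10)
    t = (1 + x) / 2
    iterates = [t]              # first entry is the initial guess
    for i = 2:N
        t = (t + x / t) / 2     # one Babylonian update
        push!(iterates, t)
    end
    iterates
end

babylonian_trace(2.0; N=5)      # five Float64 approximations of sqrt(2)
babylonian_trace(2 // 1; N=3)   # exact fractions: 3//2, 17//12, 577//408
```

With a `Rational` input the very same code returns exact fractions, a first glimpse of how changing the input type changes what the function computes.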

**Exercise:**

1. Confirm that `babylonian_sqrt` converges to $\sqrt{x}$ by comparing it with Julia's standard `sqrt` function. As a reference, compute `sqrt(big"2.0")`, then compute `babylonian_sqrt` for `N=1` to `N=10`.
    * What is the error of `babylonian_sqrt` against the reference in each case?
    * Plot the absolute error (note the `abs` function) on a semilog scale versus `N` (use the `Plots` package and pass the kwarg `yaxis=:log` to the `plot` function)
    * Does increasing `N` reduce the error?
    * How can you get a more accurate answer without changing the implementation of `babylonian_sqrt`?
    * Why is `sqrt(big"2.0")` a good reference in the first place?

In [None]:
# Your solution ...

2. Do the same thing as in 1., but using different data types for the input number: try `Float16`, `Float32`, `Float64`, and `BigFloat`, and again vary `N`. Plot the error against the reference for all data types in one plot (use the `plot!` function to add another plot to an existing canvas).

In [None]:
# Your solution ...

## Algorithmic differentiation

We already saw [previously](13_Composability_Code_Reuse.ipynb) that we can obtain new features in Julia by changing the input type. A powerful data type invented by Clifford in 1873 is the [*dual number*](https://en.wikipedia.org/wiki/Dual_number). Based on these dual numbers one can (for example) perform what is now known as forward-mode automatic differentiation (AD).
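
One way to see why dual numbers carry derivatives (a short sketch in standard dual-number notation; the symbol $\varepsilon$ is introduced here and is not used elsewhere in this notebook): a dual number is written $a + \varepsilon\, a'$ together with the rule $\varepsilon^2 = 0$. Multiplying two of them gives

$$
(a + \varepsilon\, a')(b + \varepsilon\, b') = ab + \varepsilon\,(a b' + a' b),
$$

so the $\varepsilon$-part follows the product rule. More generally, for a polynomial (or any function given by a Taylor series) $f$,

$$
f(x + \varepsilon) = f(x) + \varepsilon\, f'(x),
$$

because all higher-order terms contain $\varepsilon^2 = 0$. Evaluating $f$ on the dual number $x + \varepsilon$ therefore yields the value and the derivative at the same time.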

In practice Julia already has the [ForwardDiff](https://github.com/JuliaDiff/ForwardDiff.jl) package to bring this feature to the ecosystem, but to understand a bit better how this works we will roll our own simple (and incomplete) Julia implementation:

In [None]:
struct Dual <: Number
    x::Float64   # Value
    δx::Float64  # Derivative
end

# Implementation of basic derivative rules:
Base.:+(a::Dual, b::Dual) = Dual(a.x + b.x, a.δx + b.δx)  # (f+g)'(x) = f'(x) + g'(x)
Base.:-(a::Dual, b::Dual) = Dual(a.x - b.x, a.δx - b.δx)  # (f-g)'(x) = f'(x) - g'(x)
Base.:*(a::Dual, b::Dual) = Dual(a.x * b.x, a.x * b.δx + a.δx * b.x)  # (f*g)'(x) = f(x)*g'(x) + f'(x)*g(x)
Base.:/(a::Dual, b::Dual) = Dual(a.x / b.x, (b.x * a.δx - a.x * b.δx) / b.x^2)  # (f/g)'(x) = (g(x)*f'(x) - f(x)*g'(x)) / g(x)^2

# Handling type conversion
Base.convert(::Type{Dual}, x::Real) = Dual(x, zero(x))
Base.promote_rule(::Type{Dual}, ::Type{<:Number}) = Dual

With these few lines of code, derivatives of Julia functions (that only rely on `+`, `-`, `*` and `/`) can now be obtained. For convenience we introduce the following function for this purpose:

In [None]:
derivative(f::Function, x::Number) = f(  Dual(x, one(x))  ).δx

In [None]:
derivative(x -> x, 2.0)
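
As a slightly less trivial check, here is a sketch that differentiates a small polynomial and also shows how the conversion and promotion rules let plain numbers mix with `Dual` values. The definitions of `Dual` and `derivative` are repeated in shortened form only so that the snippet runs on its own; `cubic` is a hypothetical example function, not part of the notebook.

```julia
# Shortened copy of the definitions above (only + and * are needed here).
struct Dual <: Number
    x::Float64   # Value
    δx::Float64  # Derivative
end
Base.:+(a::Dual, b::Dual) = Dual(a.x + b.x, a.δx + b.δx)
Base.:*(a::Dual, b::Dual) = Dual(a.x * b.x, a.x * b.δx + a.δx * b.x)
Base.convert(::Type{Dual}, x::Real) = Dual(x, zero(x))
Base.promote_rule(::Type{Dual}, ::Type{<:Number}) = Dual
derivative(f::Function, x::Number) = f(Dual(x, one(x))).δx

# Hypothetical example function: f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2.
cubic(x) = x * x * x + 2 * x

# The literal `2` is promoted to Dual(2.0, 0.0), a constant with zero derivative.
derivative(cubic, 2.0)   # analytical value: 3 * 2^2 + 2 = 14
```

The promotion step is what allows `cubic` to be written without any knowledge of `Dual`: the integer constant is converted behind the scenes, and only the `Dual`-`Dual` methods are ever called.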

**Exercise:**

3. Compare `derivative(babylonian_sqrt, some_number)` against the correct analytical derivative of $\sqrt{x}$ at this point. Does it give the correct derivative?

4. Try `derivative` on other functions or algorithms. For example, code up a recursive exponentiation function like
    ```julia
    pow(x, n) = n <= 0 ? one(x) : x * pow(x, n - 1)
    ```

In [None]:
# Your solution ...

## Symbolic manipulations

Now imagine you are tasked to quickly confirm that the Babylonian approximation for `N=4` is equivalent to the analytical form

$$
\text{babylonian_sqrt}(x; N=4) \approx
\frac{\frac{1}{32768} + \frac{15}{4096} x + \frac{455}{8192} x^{2} + \frac{1001}{4096} x^{3} + \frac{6435}{16384} x^{4} + \frac{1001}{4096} x^{5} + \frac{455}{8192} x^{6} + \frac{15}{4096} x^{7} + \frac{1}{32768} x^{8}}{\left( \frac{1}{2} + \frac{1}{2} x \right) \left( \frac{1}{8} + \frac{3}{4} x + \frac{1}{8} x^{2} \right) \left( \frac{1}{128} + \frac{7}{32} x + \frac{35}{64} x^{2} + \frac{7}{32} x^{3} + \frac{1}{128} x^{4} \right)}
$$

**Exercise:**

5. Is this formula correct?
    - Hint: Use the `Symbolics` package, in particular `@variables x` and `simplify`.
    - Note: Make sure you use `N=4`, since problems might start to occur for larger `N`.

In [None]:
# Your solution here ...