# Welcome to the autodiff30 tutorials

This tutorial shows you how to install and use autodiff30. It is a fairly long notebook, so the table of contents in the left-hand menu may help you jump between sections.

## Installation

We recommend working in a virtual environment to avoid any potential dependency conflicts. For example, in Anaconda Prompt you can run:
```
conda create -n autodiff30_testing -y python=3.10.4 numpy jupyterlab
conda activate autodiff30_testing
```
Once in the environment, install the package from PyPI:
```
python -m pip install autodiff30
```
Then, to run this tutorial, copy this Jupyter notebook into the folder in which you would like to run it, navigate there in Anaconda Prompt and run ```jupyter lab```. This example uses conda, but the same can be achieved with any other Python distribution and a tool like pyenv for environment management.

## Basic package usage

Before using the package, import it into your script. We recommend the alias ```ad```:

In [45]:
import autodiff30 as ad

The key feature the package provides is a decorator called ```adfunction```. It wraps a user-defined function and allows the gradient of that function to be computed by calling the ```grad``` method of the resulting object (which is called an ```adstruc```, though you do not need to know this to use the package). Below is an example for an extremely simple function, ```square```, which just squares its input. The derivative of this is simply double the input.

Arithmetic within user-defined functions can be written as normal, but please see later sections for how more complex functions need to be written (e.g. using exponentials or trigonometry).

In [46]:
# Wrap your function with the decorator
@ad.adfunction
def square(x):
    """
    x is a numeric type, int or float
    square returns the square of the numeric input
    """
    return x ** 2

We can see the type of square has now changed:

In [47]:
type(square)

autodiff30.ad.adstruc

We can still call the function as we would previously:

In [48]:
numeric_input = 3
square(numeric_input)

9

But we now have access to a new method, grad, which uses automatic differentiation to find the gradient of *square* at *numeric_input*: 

In [49]:
square.grad(numeric_input)

6

It is as simple as that! Next we will show some examples with more complex functions.
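
To give a feel for what a decorator with a ```grad``` method might be doing, here is a minimal sketch of forward-mode automatic differentiation with dual numbers. The names ```Dual``` and ```toy_adfunction``` are hypothetical and chosen for illustration only; this is not autodiff30's implementation, and it supports just addition, multiplication and constant powers.

```python
class Dual:
    """A value paired with its derivative (hypothetical helper, not part of autodiff30)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __pow__(self, n):
        # Power rule for a constant exponent: (u ** n)' = n * u ** (n - 1) * u'
        return Dual(self.value ** n, n * self.value ** (n - 1) * self.deriv)


class toy_adfunction:
    """Hypothetical stand-in for a decorator that adds a grad method."""

    def __init__(self, func):
        self.func = func

    def __call__(self, x):
        # Plain calls behave exactly like the wrapped function
        return self.func(x)

    def grad(self, x):
        # Seed the input derivative with 1 and read the derivative off the output
        return self.func(Dual(x, 1)).deriv


@toy_adfunction
def square(x):
    return x ** 2


print(square(3))       # 9
print(square.grad(3))  # 6
```

The wrapped function is written as ordinary arithmetic; the derivative comes from running the same code on a number type that carries its derivative along.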

## Advanced package usage

### Complex operators

First, we will still consider scalar functions, but with more complex operators. As noted previously, arithmetic, including any of the operators listed below, can be written as in vanilla Python. Supported operations include multiplication, addition, subtraction, division, powers and comparison.

autodiff30 supports a number of higher-level functions. **These must come from the ad package; numpy equivalents will not work**. For the full list of available functions please see functions.py in the source code, but in brief the available functions are:
 - Trigonometric (sin, cos, tan, arcsin, arccos, arctan)
 - Exponential, logarithmic and logistic
 - Hyperbolic (sinh, cosh, tanh)

##### A hard-to-differentiate arithmetic function

In [50]:
@ad.adfunction
def hard_to_diff(x):
    """
    x is a numeric type, int or float
    hard_to_diff returns a float
    """
    return x*((3*(x**2)) - (4*x) + (5/x))**3

In [51]:
evaluated_at = 1
print("Function value: {}".format(hard_to_diff(evaluated_at)))
print("Gradient value: {}".format(hard_to_diff.grad(evaluated_at)))

Function value: 64.0
Gradient value: -80.0


This result can be confirmed at https://www.wolframalpha.com/input?i=differentiate+x%283x%5E2+-+4x+%2B+5%2Fx%29%5E3+at+x%3D1
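
The gradient can also be sanity-checked locally without the package. The sketch below uses a central finite difference on a plain Python copy of the function; the helper name ```central_difference``` and the step size ```h``` are illustrative choices, not part of autodiff30. Unlike automatic differentiation, this is only an approximation.

```python
def hard_to_diff(x):
    # Plain Python copy of the function above, without the decorator
    return x * ((3 * (x ** 2)) - (4 * x) + (5 / x)) ** 3


def central_difference(f, x, h=1e-6):
    # Approximate f'(x) from two nearby function values
    return (f(x + h) - f(x - h)) / (2 * h)


approx = central_difference(hard_to_diff, 1.0)
print(approx)  # should be close to the -80.0 reported above
```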

##### A simple trigonometric function
Note the use of ```ad.sin``` and ```ad.cos``` in the example below.

In [52]:
@ad.adfunction
def trig(x):
    """
    x is a numeric type, int or float
    trig returns a float
    """
    return (ad.sin(x)**2) + (ad.cos(x)) 

In [53]:
from numpy import pi
evaluated_at = pi/2
print("Function value: {}".format(trig(evaluated_at)))
print("Gradient value: {}".format(trig.grad(evaluated_at)))

Function value: 1.0
Gradient value: -0.9999999999999999


This result can be confirmed at https://www.wolframalpha.com/input?i=+differentiate+sin%28x%29%5E2+%2B+cos%28x%29+at+x%3Dpi%2F2
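
The same value follows by hand from the chain rule, writing $f$ for the function ```trig```:

$$
f'(x) = 2\sin(x)\cos(x) - \sin(x),
\qquad
f'\!\left(\tfrac{\pi}{2}\right) = 2 \cdot 1 \cdot 0 - 1 = -1
$$

The printed gradient differs from $-1$ only in the last digit, presumably because ```pi/2``` and its cosine are not represented exactly in floating point.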

##### A function with multiple inputs
autodiff30 handles functions with multiple inputs and multiple outputs using Python built-in lists. For example, the function below takes three inputs and returns a scalar.

In [54]:
@ad.adfunction
def R3_to_R(x):
    """
    x is a list of numeric types, int or float
    R3_to_R returns an int or float
    """
    return (x[0]*x[2]) + (x[1]**2) + (x[2]**3)

In [55]:
evaluated_at = [3,2,4]
print("Function value: {}".format(R3_to_R(evaluated_at)))
print("Gradient value: {}".format(R3_to_R.grad(evaluated_at)))

Function value: 80
Gradient value: [4, 4, 51]


Notice here how the output is now a list, as the gradient of the function is taken with respect to the three different input directions.
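
To see where each entry of that list comes from, the sketch below approximates the partial derivatives one input at a time with central finite differences on a plain Python copy of the function. The helper ```numerical_gradient``` is hypothetical and is not part of autodiff30.

```python
def R3_to_R(x):
    # Plain Python copy of the function above, without the decorator
    return (x[0] * x[2]) + (x[1] ** 2) + (x[2] ** 3)


def numerical_gradient(f, x, h=1e-6):
    grad = []
    for i in range(len(x)):
        # Perturb only the i-th input, leaving the others fixed
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


approx_grad = numerical_gradient(R3_to_R, [3, 2, 4])
print(approx_grad)  # should be close to [4, 4, 51]
```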

##### A function with multiple inputs and outputs
The function below has two inputs and two outputs.

In [56]:
@ad.adfunction
def R2_to_R2(x):
    """
    x is a list of numeric types, int or float
    R2_to_R2 returns a list of floats
    """
    return [(x[0]**2), (x[0]**2)*(x[1]**2)]

In [57]:
evaluated_at = [2,3]
print("Function value: {}".format(R2_to_R2(evaluated_at)))
print("Gradient value: {}".format(R2_to_R2.grad(evaluated_at)))

Function value: [4, 36]
Gradient value: [[4, 0], [36, 24]]


Now we see that the output is a 2x2 matrix, in list form. This is the Jacobian matrix, with each element representing the derivative of one of the outputs with respect to one of the inputs. Specifically, if we let $\mathbf{u}$ = R2_to_R2(x), then the Jacobian here is:

$$
\nabla \mathbf{u} =
\begin{bmatrix}
  \frac{\partial u_0}{\partial x_0} & 
    \frac{\partial u_0}{\partial x_1} \\[1ex] % <-- 1ex more space between rows of matrix
  \frac{\partial u_1}{\partial x_0} & 
    \frac{\partial u_1}{\partial x_1} \\[1ex]
\end{bmatrix}
$$
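
For this particular function, with $u_0 = x_0^2$ and $u_1 = x_0^2 x_1^2$, working out the partial derivatives by hand and evaluating at $x = (2, 3)$ reproduces the printed result:

$$
\nabla \mathbf{u} =
\begin{bmatrix}
  2x_0 & 0 \\[1ex]
  2x_0 x_1^2 & 2x_0^2 x_1
\end{bmatrix}
\quad\Rightarrow\quad
\nabla \mathbf{u}\,\Big|_{x=(2,3)} =
\begin{bmatrix}
  4 & 0 \\[1ex]
  36 & 24
\end{bmatrix}
$$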

### An applied example

Here we will imagine we want to find the roots of a polynomial $f(x) = 6x^5-5x^4-4x^3+3x^2$ using [Newton's method](https://en.wikipedia.org/wiki/Newton%27s_method). This example is taken from [this website](https://danielhomola.com/learning/newtons-method-with-10-lines-of-python/). The polynomial factors as $x^2(x-1)(6x^2+x-3)$, so the roots are $x = 0$, $x = 1$, $x \approx -0.79$ and $x \approx 0.63$.

In [58]:
# Define the polynomial
@ad.adfunction
def f(x):
    return (6*x**5)-(5*x**4)-(4*x**3)+(3*x**2)
          
# Define a function that tells us how far from a root (f(x)=0) we are
def dx(f, x):
    return abs(0-f(x))
                   
def newtons_method(f, x0, e):
    # Get initial distance from root
    delta = dx(f, x0)
    while delta >= e:  # While not close enough to a root
        # Update the guess for x, using the grad method of f
        x0 = x0 - f(x0)/f.grad(x0)
        # Get new distance from root
        delta = dx(f, x0)
    print('Root is at: ', x0)
    print('f(x) at root is: ', f(x0))
                   
# Try Newton's method for a few different starting guesses
x0s = [0, .5, 1]
for x0 in x0s:
    newtons_method(f, x0, 1e-5)

Root is at:  0
f(x) at root is:  0
Root is at:  0.6286680781673307
f(x) at root is:  -1.3785387997788945e-06
Root is at:  1
f(x) at root is:  0
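
The loop above runs until the tolerance is met, so it would not terminate for a starting guess that fails to converge, and it would divide by zero wherever the gradient vanishes. Below is a minimal sketch of one way to guard against both. It is written in plain Python with the derivative typed out by hand so that it runs without the package; the name ```newton_with_guards``` and the ```max_iter``` parameter are illustrative assumptions.

```python
def f(x):
    return (6 * x**5) - (5 * x**4) - (4 * x**3) + (3 * x**2)


def f_prime(x):
    # Derivative of f written out by hand
    return (30 * x**4) - (20 * x**3) - (12 * x**2) + (6 * x)


def newton_with_guards(f, f_prime, x0, tol=1e-5, max_iter=50):
    x = x0
    for _ in range(max_iter):
        # Stop as soon as f(x) is close enough to zero
        if abs(f(x)) < tol:
            return x
        slope = f_prime(x)
        if slope == 0:
            raise ZeroDivisionError(f"zero derivative at x = {x}")
        # Standard Newton update
        x = x - f(x) / slope
    raise RuntimeError(f"no convergence within {max_iter} iterations")


root = newton_with_guards(f, f_prime, 0.5)
print(root)  # should be close to the 0.6287 found above
```

With the decorated function, ```f.grad``` could presumably be passed in place of ```f_prime```.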


### Optimization

autodiff30 includes functions that use the underlying automatic differentiation capabilities to do optimization. Please see the background section of the documentation for a more complete introduction to these functions; only their application is considered here. We will use the optimization functions to find the minimum of a function $\mathbb{R}^2 \to \mathbb{R}$. A function of the same form can be visualized [here](https://www.wolframalpha.com/input?i=plot+x**2+%2B+y**2+-+10sin%28x%29*cos%28y%29+from+-10+to+10); note that the linked plot scales the sine-cosine term by 10, whereas the function defined below does not. As before, we need to wrap the function to minimize with the ```adfunction``` decorator and use the ```ad.sin``` and ```ad.cos``` functions.

In [59]:
@ad.adfunction
def f(x):
    """
    x is a list of numeric types, int or float
    f returns a float
    """
    return x[0]**2 + x[1]**2 - ad.sin(x[0])*ad.cos(x[1])

# Starting point
x0 = [1., 1.]

# Run optimization
res1 = ad.GD(f, x0)
res2 = ad.Adam(f, x0)

# The functions return the coordinates of the minimum
print(f" Solution for minimizing f with x0 = {x0} and algorithm GD is {res1}\n")
print(f" Solution for minimizing f with x0 = {x0} and algorithm Adam is {res2}\n")

 Solution for minimizing f with x0 = [1.0, 1.0] and algorithm GD is [4.50183627e-01 3.74356753e-08]

 Solution for minimizing f with x0 = [1.0, 1.0] and algorithm Adam is [ 4.50183526e-01 -6.36105660e-08]



Note that these optimizers have many input parameters (e.g. the learning rate and maximum number of iterations) that may need tuning to get certain functions to converge. See the docstrings for more details.
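
To illustrate the role of the learning rate and the iteration limit, here is a minimal plain-Python sketch of gradient descent on the same function, with the gradient written out by hand. It is not the implementation behind ```ad.GD```, and the parameter names ```learning_rate```, ```max_iter``` and ```tol``` are assumptions made for this example only.

```python
import math


def f_grad(x):
    # Partial derivatives of x0**2 + x1**2 - sin(x0)*cos(x1), written out by hand
    return [2 * x[0] - math.cos(x[0]) * math.cos(x[1]),
            2 * x[1] + math.sin(x[0]) * math.sin(x[1])]


def gradient_descent(grad, x0, learning_rate=0.1, max_iter=1000, tol=1e-10):
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        # Stop once every component of the gradient is negligible
        if max(abs(gi) for gi in g) < tol:
            break
        # Step against the gradient, scaled by the learning rate
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x


result = gradient_descent(f_grad, [1.0, 1.0])
print(result)  # should be close to [0.4502, 0.0]
```

A learning rate that is too large can make the iterates diverge, while one that is too small may exhaust ```max_iter``` before reaching the minimum.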

There are further examples in the docs/examples folder. Happy differentiating!