# autodiffpy Documentation

## Introduction

Being able to calculate derivatives is crucial for optimization, probabilistic inference, simulations and modeling in the physical sciences, and much more. However, functions used for these sorts of purposes in the real world are often very complex, and it can be quite challenging to calculate the derivatives of these functions in practice. Our automatic differentiation (AD) software computes the derivatives of any function, with respect to any and all of the function's variables, to machine-precision-level accuracy, by breaking the function down into its elementary operations and using the chain rule (see **Background** for more details).

In addition to calculating derivatives, our AD software can also perform backpropagation.  Backpropagation is the process of altering a function's parameters until the outputs of the function behave as expected.  For example, given a function that has a set of fixed inputs, a set of weights, and a set of desired outputs, our software can be used to tweak the weights until the output of the function matches the desired output.

Our AD software has many applications, including sensitivity analysis, numerical computation, and machine learning.

## Background

Automatic differentiation is possible because any function, no matter how complicated, can be represented as a combination of **elementary operations**, such as addition, multiplication, exponentiation, and trigonometry. In other words, $f(x)$ can be represented as $g_{n}(g_{n-1}(g_{n-2}(\dots g_1(x))))$, where $g_i$ is the $i^{\text{th}}$ elementary operation applied along the way.

Automatic differentiation uses the **chain rule** to calculate a function's derivative. Recall that, using the chain rule, the derivative of function $h\left(u\left(t\right)\right)$ is $\dfrac{\partial h}{\partial t} = \dfrac{\partial h}{\partial u}\dfrac{\partial u}{\partial t}.$

For example, let's say that we want to compute $f^{\prime}\left(\dfrac{\pi}{16}\right)$ of a complicated function $f(x)$, where $f'(x)$ denotes the derivative of $f(x)$:
$$f\left(x\right) = x - \exp\left(-2\sin^{2}\left(4x\right)\right).$$

The evaluation trace below shows how the function $f(x)$ is broken down into combinations of elementary operations. The first column indexes each elementary operation, with the first row representing the value of $x$ itself.  The second column shows the form of each elementary operation, while the third column shows the form of the derivative of each elementary operation.  The fourth column lists the numerical value of each elementary operation and its derivative, respectively.

| Trace    | Elementary Operation | Derivative | $\left(f\left(a\right), f^{\prime}\left(a\right)\right)$ |
| :------: | :----------------------:               | :------------------------------: | :------------------------------: |
| $x_{1}$  | $\dfrac{\pi}{16}$                      | $1$                | $\left(\dfrac{\pi}{16}, 1\right)$ |
| $x_{2}$  | $4x_{1}$                               | $4\dot{x}_{1}$                 | $\left(\dfrac{\pi}{4}, 4\right)$ |
| $x_{3}$  | $\sin\left(x_{2}\right)$               | $\cos\left(x_{2}\right)\dot{x}_{2}$            | $\left(\dfrac{\sqrt{2}}{2}, 2\sqrt{2}\right)$ |
| $x_{4}$  | $x_{3}^{2}$                            | $2x_{3}\dot{x}_{3}$                   | $\left(\dfrac{1}{2}, 4\right)$ |
| $x_{5}$  | $-2x_{4}$                              | $-2\dot{x}_{4}$ | $\left(-1, -8\right)$ |
| $x_{6}$  | $\exp\left(x_{5}\right)$               | $\exp\left(x_{5}\right)\dot{x}_{5}$ | $\left(\dfrac{1}{e}, - \dfrac{8}{e}\right)$ |
| $x_{7}$  | $-x_{6}$                               | $-\dot{x}_{6}$                  | $\left(-\dfrac{1}{e}, \dfrac{8}{e}\right)$ |
| $x_{8}$  | $x_{1} + x_{7}$                        | $\dot{x}_{1} + \dot{x}_{7}$ | $\left(\dfrac{\pi}{16} - \dfrac{1}{e}, 1 + \dfrac{8}{e}\right)$ |

Therefore, $f^{\prime}\left(\dfrac{\pi}{16}\right) = 1 + \dfrac{8}{e} \approx 3.9430355293715385.$
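The evaluation trace can be reproduced by hand in a few lines of plain Python. The sketch below does not use the package; the function name `trace_f` is made up for illustration. Each step stores a (value, derivative) pair that corresponds to one row of the table.

```python
import math

def trace_f(x):
    """Evaluate f(x) = x - exp(-2*sin(4x)**2) and f'(x) one elementary step at a time."""
    x1 = (x, 1.0)                                     # seed: dx/dx = 1
    x2 = (4 * x1[0], 4 * x1[1])                       # 4 * x1
    x3 = (math.sin(x2[0]), math.cos(x2[0]) * x2[1])   # sin(x2)
    x4 = (x3[0] ** 2, 2 * x3[0] * x3[1])              # x3 ** 2
    x5 = (-2 * x4[0], -2 * x4[1])                     # -2 * x4
    x6 = (math.exp(x5[0]), math.exp(x5[0]) * x5[1])   # exp(x5)
    x7 = (-x6[0], -x6[1])                             # -x6
    x8 = (x1[0] + x7[0], x1[1] + x7[1])               # x1 + x7
    return x8

value, derivative = trace_f(math.pi / 16)
print(value, derivative)
```

The printed derivative should agree with $1 + \dfrac{8}{e}$ up to floating-point rounding.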

The **computational graph** drawn below visualizes the evaluation trace. Each node with an incoming arrow represents an elementary operation, which is applied to the node at the tail-end of that same arrow.

![](fig/graph1.png)

Our automatic differentiation package uses this approach to calculate the derivatives of a given function.

## How to Use the Package

### How to Install

We have distributed our package on PyPI.  Therefore, users only have to enter `pip install autodiffpy` into a terminal to download our package through PyPI.

Another way for users to download our package is from our GitHub page: https://github.com/rajayuco/cs207-FinalProject.  To do so, users can (1) click the previous link, click the green "Clone or Download" button on the website, and then download the zip file manually, OR (2) download the package folder by typing the following into a terminal: `git clone https://github.com/rajayuco/cs207-FinalProject.git`.

To install external dependencies, users can navigate into the package directory from within a terminal and type `pip install -r requirements.txt`.


### Demo

Please see the **Implementation** section below for examples of using the package.

## Software Organization

* Our package is structured as follows:
    * `autodiffpy/`
        * `autodiffpy/`
            * `__init__.py`
            * `autodiff.py`
            * `autodiff_math.py`
        * `tests/`
            * `__init__.py`
            * `autodiff_test.py`
            * `autodiff_math_test.py`
        * `docs/`
            * `autodiffpy_doc.ipynb`
        * `README.md`
        * `setup.py`
        * `requirements.txt`
        * `LICENSE`


* Our autodiffpy module is organized as follows:
    * autodiff class (autodiff.py)
        * Overloads elementary operations (such as `__add__`, `__pow__`, `__mul__`, `__truediv__`, etc.), allowing the calculation of derivatives using automatic differentiation
        * Includes the reflected forms of those operations (`__radd__`, `__rpow__`, `__rmul__`, `__rtruediv__`, etc.), so that expressions also work when the autodiff instance is the right-hand operand (e.g., `2 / x`)
        * jacobian method
            * Returns an n-dimensional (nd) array representation of the given autodiff instance's derivatives
        * backprop method
            * Returns the relative error in the inputs of the autodiff instance
    * autodiff_math module (autodiff_math.py)
        * Contains functions for performing mathematical operations, such as `exp()`, `sinh()`, `arccos()`, `tan()`, and `logistic()`, on autodiff instances

### Test suite

Our test suite runs automatically through Travis CI, with test coverage reported through Coveralls.  We use both unit tests and doctests, which together break our modules' methods into pieces, subject each piece to a series of tests, and compare each result against the output we have declared to be correct.  These tests let us detect, in a compartmentalized manner, when new changes to our code introduce problems.

In particular, TravisCI works in parallel with our developed software, as it takes the code we have written and committed to GitHub and runs the test suite that we have defined. So instead of running all tests by hand before deployment, we have been able to focus almost entirely on implementation, meaning we have spent more time building our package and less time checking for structural integrity.
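As a rough illustration of the two testing styles, the sketch below pairs a doctest with a unit test for a toy function. The names `square_with_derivative` and `SquareTest` are hypothetical and are not taken from our test suite.

```python
import doctest
import unittest

def square_with_derivative(x):
    """Return (x**2, 2*x): a value together with its derivative.

    >>> square_with_derivative(3.0)
    (9.0, 6.0)
    """
    return (x ** 2, 2 * x)

class SquareTest(unittest.TestCase):
    def test_value_and_derivative(self):
        value, derivative = square_with_derivative(-1.5)
        self.assertAlmostEqual(value, 2.25)
        self.assertAlmostEqual(derivative, -3.0)

# Run the doctest embedded in the docstring (prints nothing when it passes).
doctest.run_docstring_examples(square_with_derivative, globals())

# Run the unit test without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SquareTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The doctest keeps a usage example next to the code it documents, while the unit test lives separately and can cover cases that would clutter a docstring.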


## Implementation

### The autodiff Module

#### The autodiff class

The module **autodiff** contains three components meant for the user: the class *autodiff*, the method *jacobian*, and the method *backprop*.

The *autodiff* class allows users to generate variables, such as `x` and `y`, and then use those variables to form a function. The class then performs automatic differentiation on that function.  In the process, the class calculates (1) the numerical value of that function and (2) the numerical values of that function's derivatives with respect to all variables encountered within the function.


To carry out this process, the user must first initialize each variable of the desired function separately, as a different instance of the *autodiff* class.

Each variable (i.e., each *autodiff* instance) takes the following inputs:

* `name` [string, required]: The name that the user would like to use for this variable (e.g., "x" or "y").
* `val` [number/nd-array of numbers, required]: The numerical value/nd-array of values that the user would like to assign to this variable.
* `der` [number/nd-array of numbers, optional; default=1]: The value/nd-array of this variable’s derivative(s) with respect to itself.

***Note that our package assumes implicitly that the user will give each variable a name unique from all other variables, and that all variables will be given values for `val` (and `der`, if specified) that are of the same dimension.***

The user can then perform mathematical operations on these variables in the form of a function.  Doing so will return a new instance of the *autodiff* class, which will have the following output attributes relevant to the user:

* `val` [number/nd-array of numbers]: This returns the computed numerical value of the function.
* `der` [dictionary]: This returns a dictionary that contains the values of the function’s derivatives, calculated with respect to every single variable encountered in the function.


From this returned instance of the *autodiff* class, the user therefore has numerical values/nd-arrays of values for both the function and its derivatives, for all variables encountered within the function.


Under the hood, the *autodiff* class defines special ("dunder") methods that the user does not need to call directly.  These methods overload the elementary operations (`__add__`, `__sub__`, `__mul__`, `__neg__`, etc.) and their reflected counterparts (`__radd__`, `__rsub__`, `__rmul__`, etc.).  Each overloaded method computes the derivatives of its elementary operation with respect to each unique variable name stored as a key in the operands' `der` dictionaries.  It then returns a new instance of the *autodiff* class, whose attributes hold the updated function value/nd-array of values and derivative values/nd-arrays of values.
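To make the operator-overloading idea concrete, here is a minimal, scalar-only sketch. The class `Var` and its helper `_lift` are hypothetical stand-ins written for this illustration; they are not the package's actual implementation, which also handles nd-arrays and many more operations.

```python
class Var:
    """Hypothetical stand-in for an autodiff-style variable (scalars only)."""

    def __init__(self, name, val, der=None):
        self.val = val
        # Derivatives keyed by variable name; a fresh variable has d(self)/d(self) = 1.
        self.der = {name: 1.0} if der is None else der

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants that carry no derivatives.
        return other if isinstance(other, Var) else Var(None, other, der={})

    def __add__(self, other):
        other = Var._lift(other)
        names = set(self.der) | set(other.der)
        # Sum rule, applied per variable name.
        der = {n: self.der.get(n, 0.0) + other.der.get(n, 0.0) for n in names}
        return Var(None, self.val + other.val, der)

    __radd__ = __add__  # addition is commutative, so the reflected form is identical

    def __mul__(self, other):
        other = Var._lift(other)
        names = set(self.der) | set(other.der)
        # Product rule, applied per variable name.
        der = {n: self.der.get(n, 0.0) * other.val + self.val * other.der.get(n, 0.0)
               for n in names}
        return Var(None, self.val * other.val, der)

    __rmul__ = __mul__  # multiplication is commutative as well

x = Var("x", 3.0)
y = Var("y", -4.5)
f = 2 * x * y + x   # 2 * x falls back to Var.__rmul__ because int cannot multiply a Var
print(f.val, f.der)
```

Non-commutative operations such as subtraction and division would need separate reflected methods, since `2 - x` and `x - 2` differ.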


The example below demonstrates how users can interact with the *autodiff* class in our software:

```python
>>> # Import the autodiff class
>>> from autodiffpy import autodiff as AD
>>> # Create variable instances of the class
>>> x = AD.autodiff(name="x", val=[3, 1])
>>> y = AD.autodiff(name="y", val=[-4.5, 7])
>>> # Define the equation to evaluate
>>> f = x**2 + y - x/y
>>>
>>> # Output the results
>>> print(f.val) # Numerical value of equation
[ 5.16666667  7.85714286]
>>> print(f.der["x"]) # Numerical values of equation’s derivative with respect to "x"
[ 6.22222222  1.85714286]
>>> print(f.der["y"]) # Numerical values of equation’s derivative with respect to "y"
[ 1.14814815  1.02040816]
```

#### The jacobian method

The *jacobian* method within the *autodiff* class allows users to format the derivatives of an instance of the *autodiff* class in `numpy` nd-array form.  Users can specify which variables to include, and in what order, using the input argument `order`.  If `order` is not specified, then the ordering of the returned nd-array will be arbitrary.  *jacobian* returns a dictionary, where the `numpy` nd-array is stored under the key "jacobian", and the ordering is stored under the key "order".

The example below, which continues from the previous example, demonstrates the *jacobian* method:

```python
>>> # Print the previously-calculated derivatives in numpy array form (real output won’t have rounded values)
>>> f.jacobian() # Dictionary containing formatted nd-array form and ordering
{'jacobian': array([[ 6.22222222,  1.85714286],
       [ 1.14814815,  1.02040816]]), 'order': ['x', 'y']}
>>>
>>> # Access the formatted nd-array
>>> f.jacobian()["jacobian"] 
array([[ 6.22222222,  1.85714286],
       [ 1.14814815,  1.02040816]])
>>>
>>> # Access just the derivatives for "x"
>>> f.jacobian(order="x")["jacobian"] 
array([[ 6.22222222,  1.85714286]])
>>>
>>> # Format the derivatives in the order of "y", "x"
>>> f.jacobian(order=["y","x"])["jacobian"] 
array([[ 1.14814815,  1.02040816],
       [ 6.22222222,  1.85714286]])
```
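Conceptually, building this output amounts to stacking the entries of the `der` dictionary in a chosen order. The sketch below illustrates that idea with plain Python lists; `stack_derivatives` is a hypothetical helper, not the package's code. Falling back to sorted names when `order` is omitted is an assumption made here for reproducibility, and the derivative values are made up.

```python
def stack_derivatives(der, order=None):
    """Hypothetical helper: arrange a dict of derivatives into rows, one per variable."""
    if order is None:
        order = sorted(der)      # assumption: fall back to alphabetical order
    elif isinstance(order, str):
        order = [order]          # allow a single name, e.g. order="x"
    rows = [list(der[name]) for name in order]
    return {"jacobian": rows, "order": list(order)}

der = {"x": [6.0, 2.0], "y": [1.0, 0.5]}   # made-up derivative values
print(stack_derivatives(der, order=["y", "x"]))
print(stack_derivatives(der, order="x"))
```

Returning the ordering alongside the rows lets the caller tell which row belongs to which variable, even when no explicit order was requested.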

#### The backprop method

The backpropagation method allows users to calculate the change in variables necessary to bring a function's *actual* outputs closer to outputs *desired* for that function.  Essentially, this method passes errors backward through a given function, and then it determines a gradient that, if applied to the function's variables, will in turn apply a correction to the function relative to the desired outputs.
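The idea can be sketched without the package for a one-line function `w * x + b` with a fixed input `x`, adjustable variables `w` and `b`, and a desired output `target`. Everything in this sketch (the function names, the squared-error measure, and the step size) is an assumption chosen for illustration; it does not show the signature or behavior of our *backprop* method.

```python
def forward(w, b, x):
    """The function whose output we want to steer toward a target."""
    return w * x + b

def backward(w, b, x, target):
    """Gradient of E = 0.5 * (output - target)**2 with respect to w and b."""
    output = forward(w, b, x)
    d_output = output - target        # dE/d(output): the error passed backward
    return d_output * x, d_output     # chain rule gives dE/dw and dE/db

w, b = 0.5, 0.0
x, target = 2.0, 3.0
learning_rate = 0.1                   # assumed step size

errors = []
for _ in range(20):
    grad_w, grad_b = backward(w, b, x, target)
    w -= learning_rate * grad_w       # move each variable against its gradient
    b -= learning_rate * grad_b
    errors.append(abs(forward(w, b, x) - target))

print(errors[0], errors[-1])
```

With these particular numbers the error halves on every step, so the final entry of `errors` is far smaller than the first.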






### The autodiff_math Module

Elementary mathematical functions beyond simple arithmetic are provided by the *autodiff_math* module. Users must import this module to apply these functions to autodiff instances, because `numpy` and other standard Python math libraries are not equipped to handle instances of our *autodiff* class.

The operations in *autodiff_math* include trigonometric functions, logarithmic functions, exponential functions, and the logistic function. Each operation within this module returns a new autodiff instance with the updated value(s) stored in `val` and an updated dictionary of derivatives stored in `der`.

The example below demonstrates how to use our *autodiff_math* module:

```python
>>> # Import autodiff and autodiff_math
>>> from autodiffpy import autodiff as AD
>>> from autodiffpy import autodiff_math as adm
>>>
>>> # Create autodiff instances
>>> x = AD.autodiff('x', 100) # Create an autodiff instance
>>> y = AD.autodiff('y', 1.5) # Create another autodiff instance
>>>
>>> # Perform external mathematical operations
>>> f1 = adm.log(x, base=10)
>>> f2 = adm.sinh(f1*y)
>>>
>>> # Print the results
>>> print(f2.val)
[ 10.01787493]
>>> print(f2.der)
{'x': array([ 0.06558495]), 'y': array([ 20.13532399])}
```
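The numbers above can be checked with a small standalone sketch of how such functions might propagate derivatives. Here each quantity is a plain `(val, der)` tuple, and the helpers `log`, `sinh`, and `mul` are hypothetical re-implementations for scalars, not the module's own code.

```python
import math

def log(node, base=math.e):
    """node is a (val, der) pair; der maps variable names to derivatives."""
    val, der = node
    scale = 1.0 / (val * math.log(base))   # d/du of log_base(u)
    return (math.log(val, base), {n: scale * d for n, d in der.items()})

def sinh(node):
    val, der = node
    # Chain rule: d/du sinh(u) = cosh(u), multiplied by each inner derivative.
    return (math.sinh(val), {n: math.cosh(val) * d for n, d in der.items()})

def mul(a, b):
    names = set(a[1]) | set(b[1])
    # Product rule, applied per variable name.
    der = {n: a[1].get(n, 0.0) * b[0] + a[0] * b[1].get(n, 0.0) for n in names}
    return (a[0] * b[0], der)

x = (100.0, {"x": 1.0})
y = (1.5, {"y": 1.0})
f2 = sinh(mul(log(x, base=10), y))
print(f2)
```

The printed value and derivatives should agree with the package output shown above to the displayed precision.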

### External Dependencies

Our package requires `numpy` (version 1.15.1), which we use to organize the output of our *jacobian* method and for performing inner math functions (such as $e^x$) within our *autodiff_math* module. 


## Work for the Future

Our implementation of backpropagation (our *backprop* method in the *autodiff* class) calculates the changes in variables necessary to bring a function's actual outputs closer to given desired outputs.  Our implementation calculates these changes efficiently and to machine precision.

However, we note that for certain optimization and machine learning applications, such as neural networks, linear algebra through matrix operations (e.g., the dot product) is extremely useful.

In its current form, our backpropagation method, as well as our autodiffpy package as a whole, cannot *directly* perform matrix operations.  There are ways to work around this restriction; a matrix product, for example, is merely a combination of linear equations, and our package is fully equipped to handle linear equations.  For now, we leave native support for matrix operations to future work.
