## Automatic Differentiation Documentation

### Introduction

---

Automatic differentiation (AD) encompasses a suite of tools used to compute the derivatives of functions, evaluate the functions at specified values, and evaluate the functions' derivatives at specified values. When analytically deriving the derivative of a complicated function is not feasible (within the user's limitations), AD returns, in theory, the exact derivative rather than an approximation. In practice, rounding errors may compound because AD performs a long series of elementary floating-point operations.




### Background

---


Automatic differentiation works by taking a possibly complex function and breaking it down into a sequence of elementary functions 
(e.g. addition, multiplication, cosine), where the outputs of earlier elementary functions are fed in as the inputs of the next elementary 
function. The sequence starts by assigning values to the variables of the function, 
then works from the inside of the function outward, applying one elementary function at a time until the whole function has been built. 
This sequence can be expressed as a graph structure. 

![Graph Image](https://blog.paperspace.com/content/images/2019/03/computation_graph_forward.png) 

Once the sequence of elementary functions is known, automatic differentiation passes both the value of each elementary function and the value 
of its derivative down the sequence, so that the whole function and its derivative are evaluated at the chosen point. To evaluate an elementary function node, 
take the values output by the nodes that feed into it and apply the elementary function to them. Passing the derivative along is more subtle. 
Each elementary function depends on some variables, but those variables are themselves outputs of previous nodes that depend on other variables. To pass the derivative along 
the sequence we therefore use the chain rule $$\frac{\partial f}{\partial x}=\frac{\partial f}{\partial y}\frac{\partial y}{\partial x}$$. In other words, we 
take the derivative of the current node and multiply it by the evaluated derivative(s) of the previous node(s). We still need a derivative value for the initial node, 
and this is assigned through a seed vector of the user's choosing. Intuitively, the seed vector turns the derivative into a directional derivative:
the seed vector is the direction onto which the derivative (or the Jacobian, in the case of multiple functions) is projected.
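
The propagation rule described above can be sketched with a small value/derivative pair. The `Dual` class and the `sin` helper below are hypothetical names used only for illustration; they are not the package's `FD` implementation, and only addition, multiplication, and sine are covered.

```python
import math


class Dual:
    """Hypothetical stand-in for a forward-mode AD object (not the package's FD class)."""

    def __init__(self, val, der=1.0):
        self.val = val  # value of the node
        self.der = der  # derivative of the node; for an input this is the seed

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def sin(u):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(u.val), math.cos(u.val) * u.der)


# Evaluate f(x) = x * sin(x) + 1 and f'(x) = sin(x) + x * cos(x) at x = 2 with seed 1
x = Dual(2.0, 1.0)
result = x * sin(x) + 1
print(result.val, result.der)
```

Each operation returns a new pair, so the value and the derivative travel through the sequence together, and the seed passed to the input decides which derivative comes out at the end.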

For example, a user may want to evaluate the derivative $f'$ of a complicated function $f$ at a given point. Let us define the function $f$:
$$
f(x) = \sin^3(x^2 + \cos(\sqrt{x}))
$$
The derivative $f'$, given below, is messy and tedious to derive.
$$
f'(x) = 3 \left(2x - \frac{\sin(\sqrt{x})}{2\sqrt{x}}\right) \cos(x^2 + \cos(\sqrt{x})) \sin^2(x^2 + \cos(\sqrt{x})) 
$$
However, the user does not have to analytically derive the derivative of the given function when using automatic differentiation. 
Provided the user supplies a function of interest and point(s) of interest, the derivative of the given function will be evaluated
at the given point(s) of interest. 
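
As a sanity check on the closed-form derivative above, it can be compared against a central finite difference. This is a numerical approximation, not automatic differentiation, and the evaluation point `x0` and step size `h` are arbitrary choices made for this sketch.

```python
import math


def f(x):
    # f(x) = sin^3(x^2 + cos(sqrt(x)))
    return math.sin(x**2 + math.cos(math.sqrt(x))) ** 3


def f_prime(x):
    # Closed-form derivative from the text above
    u = x**2 + math.cos(math.sqrt(x))
    du = 2 * x - math.sin(math.sqrt(x)) / (2 * math.sqrt(x))
    return 3 * du * math.cos(u) * math.sin(u) ** 2


x0 = 1.5  # arbitrary evaluation point; must be positive because of sqrt(x)
h = 1e-6  # step size for the central difference
numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)
analytic = f_prime(x0)
print(analytic, numeric)
```

The two numbers should agree to several decimal places. The finite difference is only approximate, which is one reason to prefer AD when exact derivatives are needed.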

### Software Organization

---

#### Directory Structure


    Auto_diff/

        __init__.py  
        fd/
            __init__.py
            FD.py  # create FD objects
                     ...
        rd/
            __init__.py
            RD.py  # create RD objects
                     ...
        utils/ 
            __init__.py 
            jacobian.py # helps create jacobian matrix
                           ...
        tests/
            __init__.py 
            test_basic_v2.py  # test basic operations for forward mode 
            test_rd.py  # test basic operations for reverse mode
            test_jacobian.py  # test jacobian helper 
                   ...
                   
#### Modules

Our package `Auto_diff` contains three core modules and three test modules.
*  `FD`: a module that contains the following class:
    * The `FD` class used for instantiating an `FD` object, which performs the forward mode of automatic differentiation and produces the numerical output.
*  `RD`: a module that contains the following class:
    * The `RD` class used for instantiating an `RD` object, which performs the reverse mode of automatic differentiation and produces the numerical output.
*   `jacobian`: a module that contains the following function:
    * The function `Jacobian`, used for handling functions of multiple inputs. This function takes as an argument a list defining the values for each input of the given function and returns a 2D numpy array of `FD` objects.
*  `test_basic_v2`: a module that tests all the elementary functions (addition, multiplication, power, etc.) and functions like `get_value` and `get_derivative` for the forward mode. 
*  `test_rd`: a module that tests all the elementary functions (addition, multiplication, power, etc.) and functions like `get_value` and `get_gradient` for the reverse mode. 
*  `test_jacobian`: a module that tests the function `Jacobian` on a single function and on multiple functions.


#### Testing

Module tests can be found in the files `test_basic_v2.py`, `test_rd.py` and `test_jacobian.py`. 
* **test_basic_v2:** Each elementary function for the forward mode is tested individually inside `test_basic_v2.py`.
* **test_rd:** Each elementary function for the reverse mode is tested individually inside `test_rd.py`.
* **test_jacobian:** The function `Jacobian` is tested on a single function and on multiple functions to check that it works in both scenarios.

Our test suite lives in the `tests` subdirectory and runs automatically with pytest on TravisCI, with coverage tracked by CodeCov. 

### Installation

#### Installing Python
You will need a recent version of Python that is compatible with your system. Downloads can be found [here](https://www.python.org/downloads/).
Installing Python version $\geq$ 3.4 will also install pip, the package manager for Python.

#### Option 1 - Installing from GitHub
##### Installing Git
Git is version control software that is used to pull all relevant package data from the GitHub repository. This step is not
necessary, but it greatly simplifies the process of downloading all relevant data. The steps to install Git on your machine
can be found [here](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).

The automatic differentiation package can be installed by cloning the GitHub repository using the following command.

    git clone https://github.com/AutoDiff-Dream-Team/cs107-FinalProject.git


Before installing the dependencies, change into the directory that contains the `requirements.txt` file. The dependencies needed to use this package can then be installed by running the following command.

    pip install -r requirements.txt


#### Option 2 - Installing using pip
The automatic differentiation package can be installed by using the following command.

    pip install iamautodiff

The above command will install our package and the necessary dependencies. 

The package `Auto_diff` contains 3 modules `FD`, `RD`, and `jacobian`. These can be imported all at once as follows.

    from Auto_diff import *

These can also be imported individually as follows.

    from Auto_diff import FD
    from Auto_diff import RD
    from Auto_diff import jacobian


### Implementation
---
Classes: 
There are two main classes that users will interact with in order to perform AD. The `FD` class performs the forward mode of automatic differentiation. The basic workflow is as follows: 
the user instantiates the `FD` class by setting the value of the given variable and uses the newly created `FD` object as the input to a user-defined function. 
The resulting `FD` object stores the value of the function and the value of the derivative in the attributes `val` and `der`. The `RD` class performs
the reverse mode of automatic differentiation. The user instantiates the `RD` class in the same way as the `FD` class, except that only the value is supplied.

<u>Inputs, attributes, and methods for an `FD` object:</u>

Inputs:
* **val**
    * Type: Numeric (default is 1)
    * The value of the function evaluated at the specified user input for the given variable
* **der** 
    * Type: Numeric (default is 1)
    * The value of the derivative of the function evaluated at the specified user input
    * This input is likely to be changed from 1 when computing partial derivatives for functions of multiple variables

Attributes:
* **val**
    * Type: Numeric (default is initially 1)
    * The value of the function evaluated at the specified user input
* **der** 
    * Type: Numeric (default is initially 1)
    * The value of the derivative of the function evaluated at the specified user input

Methods:
* Most methods of the `FD` class will not be commonly accessed by the user.
* Basic operations including addition, subtraction, multiplication, division, power, exponential, negation, and the trigonometric functions sine, cosine, and tangent
  are overloaded in the definition of the class.
* Methods that will be explicitly called from the class include:
    * `logarithm`: The first argument is an `FD` object and the second argument is the base.
    * `logistic`: The only argument is an `FD` object.
    * `get_derivatives`: Given a list of `FD` objects, returns the derivative stored in each object, i.e. the derivative with respect to the seeded variable, evaluated at the value of that variable.
    * `get_values`: Given a list of `FD` objects, returns the value stored in each object.
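
For reference, the textbook derivative rules for the two explicitly called functions are given below. This assumes that `logistic` refers to the standard logistic function $\sigma$; the formulas are the standard calculus results, not copied from the package source. Here $u$ is the inner expression, $u'$ its derivative, and $b$ the base.

$$
\frac{d}{dx}\log_b(u) = \frac{u'}{u \ln b}, \qquad \sigma(u) = \frac{1}{1+e^{-u}}, \qquad \frac{d}{dx}\sigma(u) = \sigma(u)\,(1-\sigma(u))\,u'
$$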

<u>Inputs, attributes, and methods for an `RD` object:</u>

Inputs:
* **val**
    * Type: Numeric (default is 1)
    * The value of the function evaluated at the specified user input for the given variable

Attributes:
* **val**
    * Type: Numeric (default is initially 1)
    * The value of the function evaluated at the specified user input
* **grad**
    * Type: Numeric (default is 1)
    * The partial derivative of the given function with respect to the given variable
* **children**
    * Type: List of tuples
    * The first element of each tuple is the derivative of the elementary function. The second element of each tuple is an `RD` object which is a child of the given object. Traversal over these children is used for the reverse mode.

Methods:
* Most methods from the `RD` class will not be commonly accessed by the user.
* Basic operations including addition, subtraction, multiplication, division, power, exponential, negation, and the trigonometric functions sine, cosine, and tangent
  are overloaded in the definition of the class.
* Methods that will be explicitly called from the class include:
    * `logarithm`: The first argument is an `RD` object and the second argument is the base.
    * `logistic`: The only argument is an `RD` object.
    * `get_gradient`: Returns the derivative with respect to the given variable evaluated at the value of the given variable.
    * `get_value`: Returns the value of the given variable.

Additionally, the `Jacobian` function can be used in order to streamline the process of computing the Jacobian matrix. Given the input 
values for the variables of a given function passed as a list when using the `Jacobian` function, a 2D array of `FD` objects is returned.
This allows users to compute the Jacobian for a given function without needing to manually compute each element of the Jacobian. 

<u>Input and output for the `Jacobian` function:</u>

Inputs:
* **arr**
    * Type: Numeric List
    * The value of each input variable for the given function or functions

Output:
* Type: 2D numpy array
* Each element in the 2D numpy array is an `FD` object
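
One plausible way to picture the returned array is as a grid that pairs every input value with every unit seed direction. The helper `seed_grid` below is hypothetical and uses plain tuples in place of `FD` objects; the actual layout inside `Jacobian` is an assumption here.

```python
def seed_grid(values):
    """Hypothetical helper: pair each input value with every unit seed direction.

    Entry [i][j] is (value of variable i, seed), where the seed is 1 only when
    i == j. Column j then corresponds to one forward pass that differentiates
    with respect to variable j.
    """
    n = len(values)
    return [[(values[i], 1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]


grid = seed_grid([2, 7, 10])
for row in grid:
    print(row)
```

Under this assumption, row `i` stands for variable `i` across all seed directions, so evaluating a function on the rows yields one partial derivative per column.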

Necessary Dependencies:

NumPy is the only dependency required in order to use our package. 
It is used to handle elementary functions such as $\sin(x)$ and $\exp(x)$.


### How to Use
---

### Instantiating and using an FD object (Forward Mode of Automatic Differentiation)

Import the FD module and numpy using the following commands. 

    
    from Auto_diff import FD
    import numpy as np

Instantiate an FD object. You can change the first argument, but leave the second argument as 1 for this example.
    
    # Instantiates the FD object
    val = 3
    x = FD(val,1)

Define a function that takes one argument as input. This will be your FD object.

    
    # User defined function (you can change the body of this function)
    def f(x):
        output = np.sin(x**2 + 3)
        return output

Once the function is defined, you can call the function by passing the FD object as the argument.

    
    # The returned FD object (output) holds the function value in val and the derivative in der
    output = f(x)

    # If you don't want to define a function, you can compute them on the fly
    output = np.sin(x**2 + 3)

    print(f"The value of the function evaluated at {val} is {output.val}.")
    print(f"The value of the derivative of the function evaluated at {val} is {output.der}.")
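
To check the printed results by hand, the expected value and derivative follow from the chain rule, $f'(x) = 2x\cos(x^2+3)$. The sketch below uses only the standard `math` module and does not depend on the package; the variable names `expected_val` and `expected_der` are introduced here for illustration.

```python
import math

val = 3

# f(x) = sin(x^2 + 3), so by the chain rule f'(x) = 2x * cos(x^2 + 3)
expected_val = math.sin(val**2 + 3)
expected_der = 2 * val * math.cos(val**2 + 3)

print(f"Expected value: {expected_val:.6f}")       # approximately -0.536573
print(f"Expected derivative: {expected_der:.6f}")  # approximately 5.063124
```

The `val` and `der` attributes of the `FD` output should match these numbers up to rounding error.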

### Using multiple FD objects (functions of multiple inputs)

You can define functions of multiple variables where each variable is an FD object. An example is given below. When using multiple variables,
the second arguments in the instantiations of the FD objects define the seed vector and therefore dictate the value of the computed 
derivative. For example, let us define a function $f=x^2+y$ and set the values of $x$ and $y$ to 2 and 3 respectively.
We can instantiate the FD objects and define our function as follows.

    # Instantiates the FD objects
    x = FD(2,1) 
    y = FD(3,1) 

    # Define the function
    def f(x,y):
        return x**2 + y



We will first manually derive the value of the function and the values of the partial derivatives of the function.

We can evaluate the function at $x=2$ and $y=3$. 
$$
f(2,3) = 2^2 + 3 = 7
$$
We can also determine $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$.
$$ 
\frac{\partial f}{\partial x} = 2x
$$
$$ 
\frac{\partial f}{\partial y} = 1  
$$
Using our values for $x$ and $y$, $\frac{\partial f}{\partial x}=4$ and $\frac{\partial f}{\partial y}=1$.
We can compute $\frac{\partial f}{\partial x}$ by setting the second argument for `x` to 1 and the second
argument for `y` to 0. Similarly, we can compute $\frac{\partial f}{\partial y}$ by setting the second argument 
for `x` to 0 and the second argument for `y` to 1. This is done as follows.

    # Evaluate the function and the partial derivative with respect to x for x=2 and y=3
    x = FD(2,1) 
    y = FD(3,0) 
    output1 = f(x,y)

    # Evaluate the function and the partial derivative with respect to y for x=2 and y=3
    x = FD(2,0) 
    y = FD(3,1) 
    output2 = f(x,y)

    # The value of the function should be the same for both evaluations
    function_value1 = output1.val
    function_value2 = output2.val
    assert function_value1 == function_value2

    der_x = output1.der
    der_y = output2.der

    print(f"The value of the function f at x=2 and y=3 is {function_value1}.")
    print(f"The partial derivative with respect to x of f at x=2 and y=3 is {der_x}.")
    print(f"The partial derivative with respect to y of f at x=2 and y=3 is {der_y}.")
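
The seeds do not have to be unit vectors. In forward mode, a general seed vector yields the directional derivative, which is the seed-weighted sum of the partial derivatives. The sketch below illustrates this with a hypothetical `Dual` pair class standing in for `FD` (it is not the package's class and supports only addition and multiplication).

```python
class Dual:
    """Hypothetical forward-mode value/derivative pair (not the package's FD class)."""

    def __init__(self, val, der):
        self.val = val
        self.der = der

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)


def f(x, y):
    # Same function as above, with x**2 written as x * x
    return x * x + y


def directional_derivative(seed_x, seed_y):
    # One forward pass with seed vector (seed_x, seed_y) at the point (2, 3)
    return f(Dual(2.0, seed_x), Dual(3.0, seed_y)).der


print(directional_derivative(1.0, 0.0))  # partial derivative with respect to x
print(directional_derivative(0.0, 1.0))  # partial derivative with respect to y
print(directional_derivative(1.0, 1.0))  # sum of both partial derivatives
```

With the seed (1, 1) the single pass returns $4 + 1 = 5$, which is why recovering the individual partial derivatives takes one pass per unit seed.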

### Computing the Jacobian

We also give the user the option of directly computing the Jacobian matrix of a function or set of functions of one or more variables. The user
imports the `Jacobian` function and calls it with just the values of the variables, and it returns a numpy ndarray 
of FD objects ready to be used in a single function or in a list of functions. For $n$ input values, this is an $n \times n$ matrix of FD objects, each seeded 
correctly for the Jacobian. You can extract only the derivatives by using the `get_derivatives` method.
    
    from Auto_diff import Jacobian, FD
    import numpy as np

    # Pass in a list with one value per input variable
    x = Jacobian([2, 7, 10])

    # Each expression is evaluated for every seed vector, giving a 3x3 collection of FD objects
    # (note that ** is Python's power operator; ^ would be bitwise XOR)
    jacobian_results = [np.exp(x[0]/x[1] + x[2]), np.tan(np.sin(3**x[0])*x[2]), x[0]+x[1]+x[2]]

    # Getting only the derivatives of the 3x3 matrix, i.e., the Jacobian matrix
    jacobian_matrix = FD.get_derivatives(jacobian_results) 

    print(jacobian_matrix)

### Reverse Mode



We also implemented the reverse mode of automatic differentiation. This implementation is meant to overcome one of the main pitfalls of 
the forward mode, namely the cost of computing multiple partial derivatives, as is required when computing the Jacobian. For example, to 
compute $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ for some function $f$, we would need to perform the forward
mode twice: once with the seed for the `FD` object associated with $x$ set to 1 and the seed for $y$ set to 0, and a second
time with the seed for $x$ set to 0 and the seed for $y$ set to 1. 

We can think of the reverse mode as applying the chain rule in the opposite order, from the output back to the inputs. Our implementation
performs a forward pass through the mathematical expression, where each elementary operation is represented as a node, storing the 'children'
of each node. After the forward pass, we do a backward pass through all the elementary operations, carrying the gradient through
them until reaching the input variable whose gradient we want. Once the forward pass is done, we can take the gradient with respect to any number of 
input variables without having to instantiate new objects. Note that, unlike in forward differentiation, once a reverse
differentiation object has been used for a particular function, it cannot be reused for another function. More precisely, if `n` is the number of input
variables and `m` is the number of functions, forward differentiation requires creating the objects and carrying out the forward pass `n` times, 
while reverse differentiation requires carrying out the process `m` times.
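
The forward pass that records children and the backward pass that accumulates gradients can be sketched as follows. The `Node` class and `sin` helper are hypothetical and heavily simplified; they mirror the `children` list of (derivative, child) tuples described for `RD`, but they are not the package's implementation.

```python
import math


class Node:
    """Hypothetical reverse-mode node (a sketch, not the package's RD class)."""

    def __init__(self, val):
        self.val = val
        self.children = []  # list of (local derivative, child node) tuples
        self.grad = None    # cached d(output)/d(self), filled in lazily

    def __add__(self, other):
        out = Node(self.val + other.val)
        self.children.append((1.0, out))   # d(out)/d(self) = 1
        other.children.append((1.0, out))  # d(out)/d(other) = 1
        return out

    def __mul__(self, other):
        out = Node(self.val * other.val)
        self.children.append((other.val, out))  # d(out)/d(self) = other
        other.children.append((self.val, out))  # d(out)/d(other) = self
        return out

    def get_gradient(self):
        # Reverse pass: sum of local derivative * child's gradient over all children
        if self.grad is None:
            if not self.children:
                self.grad = 1.0  # a node with no children is treated as the output
            else:
                self.grad = sum(w * child.get_gradient() for w, child in self.children)
        return self.grad


def sin(u):
    out = Node(math.sin(u.val))
    u.children.append((math.cos(u.val), out))  # d(sin u)/du = cos(u)
    return out


# One forward pass builds the graph for f(x, y) = x * y + sin(x)
x, y = Node(2.0), Node(3.0)
f = x * y + sin(x)

# Both partial derivatives come from the same recorded graph
print(x.get_gradient())  # df/dx = y + cos(x)
print(y.get_gradient())  # df/dy = x
```

Because every operation appends to the `children` list of its inputs, evaluating a second function with the same input nodes would mix the two graphs. In a design like this one, that is why an object cannot be reused across functions.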



### Instantiating and using an RD object (Reverse Mode of Automatic Differentiation)

Performing the reverse mode of automatic differentiation is similar to performing the forward mode previously outlined. Below is a quick
tutorial on how to perform the reverse mode.

    from Auto_diff import RD
    import numpy as np

    # Instantiates the RD objects
    val_x = 3
    val_y = 5
    val_z = 10
    x = RD(val_x)
    y = RD(val_y)
    z = RD(val_z)

    # User defined function (you can change the body of this function)
    def f(x,y,z):
        output = np.sin(x**2 + 3) + np.tan(y/z)
        return output

    # Running this performs the forward pass; the gradients of x, y, and z are computed later by get_gradient()
    output = f(x,y,z)

    # If you don't want to define a function, you can instead compute the expression on the fly.
    # Use one approach or the other: an RD object cannot be reused across evaluations.
    # output = np.sin(x**2 + 3) + np.tan(y/z)

    # You can pull the values of x, y, and z, the value of the function, and the derivatives of the function with respect to x, y, and z
    val_x = x.get_value()
    val_y = y.get_value()
    val_z = z.get_value()
    val_f = output.val
    grad_x = x.get_gradient()
    grad_y = y.get_gradient()
    grad_z = z.get_gradient()

    print(f"The values of x, y, and z are {val_x}, {val_y}, and {val_z} respectively.")
    print(f"The value of the function evaluated at {val_x}, {val_y}, and {val_z} is {val_f}.")
    print(f"The value of the derivative of the function with respect to x is {grad_x}.")
    print(f"The value of the derivative of the function with respect to y is {grad_y}.")
    print(f"The value of the derivative of the function with respect to z is {grad_z}.")

As can be seen, the `get_gradient()` method can be called on each `RD` object in order to get the derivative with respect to the given 
variable. Only one forward pass per function is performed. One reverse pass is performed for each variable (this is done when using the
`get_gradient()` method).
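
To check the gradients returned in the tutorial by hand, the partial derivatives of $f(x,y,z) = \sin(x^2+3) + \tan(y/z)$ follow from the chain rule. The numerical values assume the intended inputs $x=3$, $y=5$, $z=10$ and are rounded to four decimal places.

$$
\frac{\partial f}{\partial x} = 2x\cos(x^2+3) \approx 5.0631, \qquad
\frac{\partial f}{\partial y} = \frac{1}{z}\sec^2\left(\frac{y}{z}\right) \approx 0.1298, \qquad
\frac{\partial f}{\partial z} = -\frac{y}{z^2}\sec^2\left(\frac{y}{z}\right) \approx -0.0649
$$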

### Compatible Functions
---

Both our FD and RD classes are compatible with the following functions:
- np.sin, np.arcsin, np.sinh
- np.cos, np.arccos, np.cosh
- np.tan, np.arctan, np.tanh
- np.exp
- np.sqrt
- FD.logarithm, RD.logarithm (both functions take an input `base`, e.g. `FD.logarithm(my_FD, np.e)` gives the logarithm with base $e$)
- FD.logistic, RD.logistic (logistic function)

### Broader Impact and Inclusivity Statement
---

In the closing years of the 2010s, the technology industry came under fire for turning a blind eye to the unintended consequences created by new-age technology. The classic example is that of Facebook and the divisiveness the United States experienced in the 2016 and 2020 elections. This statement aims to reflect upon how our automatic differentiation package could have unintended, exclusionary consequences.

Our development team operated under the assumption that users of our package have a basic familiarity with object-oriented programming, calculus, and mathematical terminology in English. While we built the package to have a smooth user experience, the software functions inside a Python environment and rests on the basic assumption that users have fundamental programming and mathematical capabilities and some basic English. While this was by design, the package does, as a result, exclude anyone without these fundamental abilities, which describes a large portion of the United States / Cambridge / Harvard community. If a student or professional aimed to solve a complex differentiation problem but did not have basic capabilities in Python, they would be excluded from our package.

A potential solution that would make our package more inclusive is the development of a web interface, similar to the one demo'd in class, through which any user hoping to compute a complex derivative to machine precision could simply enter their input functions and values and get a quick and easy result. This could be a future extension of our package.

We had some additional considerations on how this package could be more inclusive. To start, we could expand it beyond Python and make it accessible in different languages. This package is meant for Python developers, a subset of the coding population which may be relatively young and early-career. For developers later in their career who have not necessarily learned Python (or earlier in their career, for that matter), we would need a package that speaks their native coding language. In this way, we could expand the reach of our software. 

### Future
---

There are many more extensions that would be great additions to our package in the future: 

1. adding functions such as factorial, cube root, gamma function, permutation, combination, etc.

2. implementation for complex numbers

3. implementation for higher order derivatives or mixed derivatives
    * This is useful for modeling many physical phenomena

4. additional features for root finding
    * This is useful for optimization problems that arise in machine learning, biomedical research, finance, etc. Optimization problems
      are found in almost every field. Root finding is an obvious complement to automatic differentiation.

5. implementation for non-differentiable functions






