# ```superautodiff``` Documentation
#### CS207 Fall '19 Final Project
#### Group 1: _Team Gillet_
#### Lucie Gillet, Jussi Sakari Jukarainen, Jovin Leong, Huahua Zheng


---

# Introduction

<br>

Derivatives play a critical role in the natural and applied sciences, with optimization being one of the core applications involving derivatives. Traditionally, derivatives have been approached either symbolically or through numerical analysis (_e.g._ finite differences). Although numerical approaches to solving derivatives are simple to compute, they are prone to stability issues and round-off errors. Meanwhile, although symbolic derivatives enable the evaluation of derivatives to machine precision, the process is limited by its computational intensity. Recently, the size and complexity of the functions involving derivatives have grown; these demands necessitate an alternative to symbolic and numerical methods that is able to compute derivatives with higher accuracy at a lower cost. Automatic Differentiation (AD) addresses these issues by executing a sequence of elementary arithmetic operations to compute accurate derivatives. 

<br>

Our team aims to develop a Python package, ```superautodiff```, that implements both forward-mode and reverse-mode AD (the latter approach being our project extension). This document reviews some of the mathematical foundations behind our approach and provides relevant information on documentation and usage of ```superautodiff```. Finally, the documentation discusses some of the underlying implementation details along with plans for how the project might develop in the future.

---

# Background
<br>

## Mathematical Foundations

AD relies heavily on the chain rule and several other key mathematical concepts in order to compute derivatives. We now consider some background mathematical foundations that form the theoretical basis of our approach to AD.

<br>

**Differential calculus**

Differential calculus is concerned with the evaluation and study of gradients and/or rates of change. Formally, we define the derivative of a function $f$ evaluated at $a$ as:

$$f'(a)=\lim _{h\to 0}{\frac {f(a+h)-f(a)}{h}}$$
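
To make the stability concern with numerical differentiation concrete, the sketch below approximates the limit above with a finite step $h$. The helper name `forward_difference` and the choice of test function ($\sin$ at $a = 1$) are our own illustrative assumptions and are not part of ```superautodiff```.

```python
import math

def forward_difference(f, a, h):
    """Approximate f'(a) with the difference quotient (f(a + h) - f(a)) / h."""
    return (f(a + h) - f(a)) / h

# The exact derivative of sin at a = 1 is cos(1).
exact = math.cos(1.0)

# Absolute error of the approximation for a large, a moderate, and a tiny step.
errors = {
    h: abs(forward_difference(math.sin, 1.0, h) - exact)
    for h in (1e-1, 1e-5, 1e-15)
}

for h, err in errors.items():
    print("h = {:g}: absolute error = {:.2e}".format(h, err))
```

Shrinking $h$ first reduces the truncation error, but for very small $h$ the subtraction $f(a+h)-f(a)$ loses most of its significant digits and the error grows again. This is the round-off issue that AD avoids.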


**Elementary functions and their derivatives**

  Here are some examples of elementary functions used by AD and their corresponding derivatives:

  <br>

  **<center> Table 1. Elementary functions and their derivatives  </center>**
  <br>

| $$f(x)$$     | $$f'(x)$$    | 
| ------------- |:-------------:| 
| $$c$$        |         $0$   | 
| $$x$$        |         $1$   | 
| $$x^n$$      | $$nx^{n-1}$$ | 
| $$\frac{1}{x}$$ | $$-\frac{1}{x^2}$$ |
| $$e^x$$      | $$e^x$$ | 
| $$\log_a x$$      | $$\frac{1}{x \ln a}$$ | 
| $$\ln x$$      | $$\frac{1}{x}$$ | 
| $$\sin(x)$$      | $$\cos(x)$$ | 
| $$\cos(x)$$      | $$-\sin(x)$$ | 
| $$\tan(x)$$      | $$\frac{1}{\cos^2x}$$ |


<br>

**Chain rule for composite functions**

  The chain rule is a formula for computing the derivative of a composite function. For instance, if a variable $z$ depends on $y$, which itself depends on $x$, the chain rule expresses the derivative of $z$ with respect to $x$ as:

<br>

$${\frac  {dz}{dx}}={\frac  {dz}{dy}}\cdot {\frac  {dy}{dx}}$$

<br>

**<center> The chain rule </center>**
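
As a small worked example of the chain rule, the sketch below takes $y = x^2$ and $z = \sin(y)$ (both chosen by us purely for illustration), multiplies the two local derivatives, and compares the product against a central-difference estimate.

```python
import math

x = 1.5

# Inner function y(x) = x^2 and its derivative dy/dx.
y = x ** 2
dy_dx = 2 * x

# Outer function z(y) = sin(y) and its derivative dz/dy.
z = math.sin(y)
dz_dy = math.cos(y)

# Chain rule: dz/dx = dz/dy * dy/dx.
dz_dx = dz_dy * dy_dx

# Independent numerical check using a central difference.
h = 1e-6
numeric = (math.sin((x + h) ** 2) - math.sin((x - h) ** 2)) / (2 * h)

print("chain rule: {:.8f}, central difference: {:.8f}".format(dz_dx, numeric))
```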

<br>

**Forward and reverse mode**

  For functions where we have intermediate components in our derivatives, we can keep track of the derivatives of each component using either of the following two modes: the forward mode and the reverse mode.
  -        The forward mode starts with the input and computes the derivative with respect to the input using the chain rule at each subcomponent. The process involves storing the intermediate values of the derivatives of variables with respect to the input in order to evaluate the overall derivative: <br> <br> 
  $$\frac{dw_i}{dx} = \frac{dw_i}{dw_{i-1}}\frac{dw_{i-1}}{dx}$$<br>
   **<center> Forward mode </center>**  
   
<br>
  
  -        The reverse mode, meanwhile, involves both a forward pass that evaluates the values of the functions along with a backward pass that stores the derivatives of the output with respect to the different variables: <br> <br> $$\frac{dy}{dw_i} = \frac{dy}{dw_{i+1}}\frac{dw_{i+1}}{dw_i}$$ <br>    **<center> Reverse mode </center>**
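
The two recurrences above can be sketched on a simple chain $w_0 = x$, $w_1 = w_0^2$, $w_2 = \sin(w_1)$, $w_3 = \exp(w_2)$. The chain and the variable names are our own illustrative choices; the point is only that both modes multiply the same local derivatives, in opposite orders.

```python
import math

x = 0.5

# Each step is a pair (function, local derivative dw_i/dw_{i-1}).
steps = [
    (lambda w: w ** 2, lambda w: 2 * w),
    (math.sin, math.cos),
    (math.exp, math.exp),
]

# Forward pass: record the intermediate values w_0, ..., w_n.
values = [x]
for fn, _ in steps:
    values.append(fn(values[-1]))

# Forward mode: carry dw_i/dx along from the input towards the output.
forward = 1.0
for (_, dfn), w_prev in zip(steps, values):
    forward = dfn(w_prev) * forward

# Reverse mode: carry dy/dw_i along from the output back to the input.
reverse = 1.0
for (_, dfn), w_prev in zip(reversed(steps), reversed(values[:-1])):
    reverse = reverse * dfn(w_prev)

print("forward mode: {:.10f}, reverse mode: {:.10f}".format(forward, reverse))
```

For a single input and a single output the two modes cost the same; the difference matters when the number of inputs and outputs differ.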

<br>

**Computational graph representation**

  The elementary operations performed during forward accumulation can be represented visually through a computational graph. For instance, the computational graph of the function $f(x)=x-\exp\{-2\sin^2(4x)\}$ [1] is illustrated in Figure 1; Figure 2 presents a more complex computational graph.

<br>

  The graph breaks down the given function into a sequence of elementary operations that are visually charted out through the computational graph. The graph operates similarly to a flowchart and illustrates how each elementary operation modifies our initial parameter inputs in order to recover the function.
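
The decomposition into elementary operations can also be written out as an evaluation trace. The sketch below does this by hand for $f(x)=x-\exp\{-2\sin^2(4x)\}$; the node names `v1` to `v6` and the function name `trace` are our own labels and need not match the nodes drawn in the figure.

```python
import math

def trace(x):
    """Return the forward-mode evaluation trace of f(x) = x - exp(-2 sin^2(4x)).

    Each row is (description, value, derivative with respect to x).
    """
    rows = []
    v1, d1 = 4 * x, 4.0
    rows.append(("v1 = 4x", v1, d1))
    v2, d2 = math.sin(v1), math.cos(v1) * d1
    rows.append(("v2 = sin(v1)", v2, d2))
    v3, d3 = v2 ** 2, 2 * v2 * d2
    rows.append(("v3 = v2^2", v3, d3))
    v4, d4 = -2 * v3, -2 * d3
    rows.append(("v4 = -2 v3", v4, d4))
    v5, d5 = math.exp(v4), math.exp(v4) * d4
    rows.append(("v5 = exp(v4)", v5, d5))
    v6, d6 = x - v5, 1.0 - d5
    rows.append(("v6 = x - v5", v6, d6))
    return rows

for description, value, derivative in trace(0.5):
    print("{:<14} value = {: .6f}  derivative = {: .6f}".format(description, value, derivative))
```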


<br>

<img src="fig/graph_1.png" style="height:300px;">

**<center> Figure 1. A computational graph for $f(x)=x-\exp\{-2\sin^2(4x)\}$ [1]</center>**

<br>

<img src="fig/graph_2.png" style="height:450px;">

**<center> Figure 2. A more complex computational graph</center>**

<br>

[1] D. Sondak, lecture 10, CS207 Fall '19

<br>

## What our package is doing
Essentially, our package uses the aforementioned mathematical concepts to implement AD through the forward mode. A primary function in our package, ```autodiff()```, takes in mathematical functions and the corresponding points at which to evaluate them and obtains an evaluation trace (similar to that of the graph structure above). Subsequently, this trace is used to differentiate said mathematical function, using the chain rule to evaluate the derivatives, the derivative values, and the current values at each component of the trace.

Under the hood, we might think of the function's calculations as equivalent to populating the table illustrated in Table 2. This is basically the core of forward-mode AD; the functionality and operation of our package are discussed in greater detail in the subsequent section.

<br>

**<center>Table 2. An evaluation table for forward-mode AD on a simple neural network</center>**

| Trace | Elementary Function | Current Value | Elementary Function Derivative | $\nabla_{x}$ Value  | $\nabla_{y}$ Value  | 
| :---: | :-----------------: | :-----------: | :----------------------------: | :-----------------: | :-----------------: | 
| $x_{1}$ | $x$ | $x$ | $\dot{ x}_{1}$ | $1$ | $0$ |
| $x_{2}$ | $y$ | $y$ | $\dot{x}_{2}$ | $0$ | $1$ |
| $x_{3}$ | $w_{21}x_1$ | $w_{21}x$ | $w_{21}\dot{x}_{1}$ | $w_{21}$ | $0$ |
| $x_{4}$ | $w_{12}x_2$ | $w_{12}y$ | $w_{12}\dot{x}_{2}$ | $0$ | $w_{12}$ |
| $x_{5}$ | $w_{11}x_1$ | $w_{11}x$ | $w_{11}\dot{x}_{1}$ | $w_{11}$ | $0$ |
| $x_{6}$ | $w_{22}x_2$ | $w_{22}y$ | $w_{22}\dot{x}_{2}$ | $0$ | $w_{22}$ |
| $x_{7}$ | $x_4 + x_5$ | $w_{11}x + w_{12}y$ | $$\dot{x}_{4} + \dot{x}_{5}$$ | $w_{11}$ | $w_{12}$ |
| $x_{8}$ | $x_3 + x_6$ | $w_{21}x + w_{22}y$ | $$\dot{x}_{3} + \dot{x}_{6}$$ | $w_{21}$ | $w_{22}$ |
| $x_{9}$ | $z(x_7)$ | $z(w_{11}x + w_{12}y)$ | $$\dot{x}_{7}z'(x_7)$$ | $w_{11}z'(w_{11}x + w_{12}y)$ | $w_{12}z'(w_{11}x + w_{12}y)$ |
| $x_{10}$ | $z(x_8)$ | $z(w_{21}x + w_{22}y)$ | $$\dot{x}_{8}z'(x_8)$$ | $w_{21}z'(w_{21}x + w_{22}y)$ | $w_{22}z'(w_{21}x + w_{22}y)$ |
| $x_{11}$ | $w_{out,1}x_9$ | $$w_{out,1}z(w_{11}x + w_{12}y) $$  | $$w_{out,1}\dot{x}_9$$ | $w_{out,1}w_{11}z'(w_{11}x + w_{12}y)$ | $w_{out,1}w_{12}z'(w_{11}x + w_{12}y)$ |
| $x_{12}$ | $w_{out,2}x_{10}$ | $$w_{out,2}z(w_{21}x + w_{22}y) $$ | $$w_{out,2}\dot{x}_{10}$$ | $w_{out,2}w_{21}z'(w_{21}x + w_{22}y)$ | $w_{out,2}w_{22}z'(w_{21}x + w_{22}y)$ |
| $x_{13}$ | $x_{11} + x_{12}$ | $$w_{out,1}z(w_{11}x + w_{12}y) + w_{out,2}z(w_{21}x + w_{22}y) $$ | $$\dot{x}_{11} + \dot{x}_{12}$$ | $$w_{out,1}w_{11}z'(w_{11}x + w_{12}y) + w_{out,2}w_{21}z'(w_{21}x + w_{22}y)$$ | $$w_{out,1}w_{12}z'(w_{11}x + w_{12}y) + w_{out,2}w_{22}z'(w_{21}x + w_{22}y)$$ |
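
To show how the rows of Table 2 translate into computation, the sketch below fills in the table numerically. The weights, the inputs, and the choice of $\tanh$ as the activation $z$ are hypothetical values picked for illustration; each trace entry is stored as a tuple `(value, d/dx, d/dy)`.

```python
import math

def z(t):
    """Activation function; tanh is an assumption made for this example."""
    return math.tanh(t)

def z_prime(t):
    return 1.0 - math.tanh(t) ** 2

# Hypothetical weights and inputs.
w11, w12, w21, w22 = 0.5, -1.0, 2.0, 0.25
w_out1, w_out2 = 1.5, -0.5
x, y = 0.3, 0.7

def scale(c, v):
    """Multiply a trace entry by a constant weight."""
    return (c * v[0], c * v[1], c * v[2])

def add(a, b):
    """Add two trace entries component-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def activate(v):
    """Apply z and propagate both partial derivatives via the chain rule."""
    return (z(v[0]), z_prime(v[0]) * v[1], z_prime(v[0]) * v[2])

x1 = (x, 1.0, 0.0)                      # seed: dx/dx = 1, dx/dy = 0
x2 = (y, 0.0, 1.0)                      # seed: dy/dx = 0, dy/dy = 1
x3, x4 = scale(w21, x1), scale(w12, x2)
x5, x6 = scale(w11, x1), scale(w22, x2)
x7, x8 = add(x4, x5), add(x3, x6)
x9, x10 = activate(x7), activate(x8)
x11, x12 = scale(w_out1, x9), scale(w_out2, x10)
x13 = add(x11, x12)

print("output = {:.6f}, d/dx = {:.6f}, d/dy = {:.6f}".format(*x13))
```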





---

# How to use ```superautodiff```
## User interaction with the package
### Installation
Our package is distributed through PyPI (detailed in the Package Distribution section). Users first install the package by running the following command (this works only if the user has our ```requirements.txt``` file in their working directory):

```pip install superautodiff -r requirements.txt```

Users who already have the required Python dependencies and do not wish to reinstall them should run the following command instead:

```pip install superautodiff```

An alternative command that similarly installs our package from TestPyPI is as follows:

```pip install -i https://test.pypi.org/simple/ superautodiff==1.0.5```

Users will then need to import ```superautodiff``` and its modules in order to access the package functionalities. Most importantly, users will have to import ```autodiff``` to instantiate AD objects. Subsequently, users will simply have to instantiate their functions and points within the objects in order to perform AD. For the use of the other modules, users will need to import them from our package.

Alternatively if users experience some issues with the ```pip``` installation or would prefer to manually install the package, users can download or clone the repository onto their local machine. Subsequently, users need only ensure that the modules are in their working directory and should be able to import the various modules into their Python environment. 

The approach of downloading from our package repository is less convenient and is not recommended for basic users; however, the approach should be considered an important alternative for developers and those experiencing issues with ```pip install```.

### Virtual Environments
Although this is not required, users can choose to create a virtual environment in which the user can install ```superautodiff``` and interact with the package's functionalities within the virtual environment. We will walk through how virtual environments can be created using ```conda``` and ```virtualenv```.

##### On ```virtualenv```
Users should run the following command to set up ```virtualenv```:

```sudo easy_install virtualenv```

Subsequently, users can create a virtual environment ```env``` through the following command:

```virtualenv env```

Next, users will need to activate the virtual environment with the following command:

```source env/bin/activate```

Now that the user has created and activated a virtual environment ```env```, the user should be able to either manually install ```superautodiff``` through a download or through a ```pip install```.

Finally, once the user is done working in the virtual environment, the user can deactivate the virtual environment by running the following command in the terminal in the package directory:

```deactivate ```

##### On ```conda```
To set up a ```conda``` environment ```env_name``` with Python at version 3.7, users should run the following command:

```conda create --name env_name python=3.7```

Next, users will need to activate the virtual environment with the following command:

```conda activate env_name``` 

Alternatively, if the shell is not configured to use ```conda activate```, users can either run
```conda init``` before activating the virtual environment or run the following command:

```source activate env_name```

As before, now that the user has created and activated a virtual environment ```env_name```, the user should be able to either manually install ```superautodiff``` through a download or through a ```pip install```.

Finally, once the user is done working in the virtual environment, the user can deactivate the virtual environment by running the following command in the terminal in the package directory:

```conda deactivate ```

### Importing
After installing the package, users need to subsequently import the various modules into their Python environment. For simplicity's sake, users can just import ```superautodiff``` using the following import command to retrieve all the modules in the package:

```python
import superautodiff
```

Alternatively, it is recommended that users run the following import alias command for concision and consistency: 
```python
import superautodiff as sad
```

## Instantiating AD objects

```superautodiff``` is a Python package and its core module is ```autodiff```. Within ```autodiff``` is the ```AutoDiff``` class, whose objects accept a variable name and an input value $x \in \mathbb{R}$ (stored as the ```val``` attribute) and initialize the derivative with respect to that variable (```der``` attribute) to $1$. The ```AutoDiff``` object then supports basic arithmetic operations (_e.g._ addition, multiplication) with integers, floats, and other ```AutoDiff``` objects. These operations are implemented through dunder methods, with reflected variants (_e.g._ ```__radd__```) where appropriate so that the operand order does not matter. With ```AutoDiff``` objects, the user can evaluate the derivatives of a vector of functions at a specified vector of points.
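
The idea of an object that carries a value together with its derivatives can be sketched in a few lines. The class ```MiniAD``` below is a hypothetical stand-in written for this illustration; it is not the package's ```AutoDiff``` implementation and supports only addition and multiplication.

```python
from collections import Counter

class MiniAD:
    """Hypothetical AD object: a value plus partial derivatives keyed by variable name."""

    def __init__(self, name, val, der=None):
        self.val = float(val)
        # A fresh variable has derivative 1 with respect to itself.
        self.der = Counter({name: 1.0}) if der is None else Counter(der)

    def __add__(self, other):
        if isinstance(other, MiniAD):
            keys = set(self.der) | set(other.der)
            der = {k: self.der[k] + other.der[k] for k in keys}
            return MiniAD(None, self.val + other.val, der)
        # Adding a constant leaves the derivatives unchanged.
        return MiniAD(None, self.val + other, self.der)

    __radd__ = __add__  # addition is commutative

    def __mul__(self, other):
        if isinstance(other, MiniAD):
            keys = set(self.der) | set(other.der)
            # Product rule, applied per variable.
            der = {k: self.der[k] * other.val + self.val * other.der[k] for k in keys}
            return MiniAD(None, self.val * other.val, der)
        return MiniAD(None, self.val * other, {k: d * other for k, d in self.der.items()})

    __rmul__ = __mul__  # multiplication is commutative

x = MiniAD("x", 3.0)
f = x * x + 2 * x + 3
print("value: {}, derivative: {}".format(f.val, f.der["x"]))
```

Because Python falls back to ```__radd__``` and ```__rmul__``` when the left operand is a plain number, expressions such as ```2 * x``` work without extra effort from the user.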
<br><br>



## Usage

We illustrate several use cases of our package's core functionality and show how it can be used to evaluate derivatives for functions about a given point. 

Summarily, the approach that is illustrated below involves importing the module and instantiating an ```AutoDiff``` object. Subsequently, users should use mathematical operations as they see fit in order to map the ```AutoDiff``` object to their target mathematical function. The usage of vectorized ```AutoDiffVector``` objects is similar.

### Scalar case

```python
# Import the package
import superautodiff as sad
import math

# Initialize variable inputs and instantiate AutoDiff object
f1 = sad.AutoDiff("x", 3.0)

# Examine initial values
print("Value of f1: {};\nValue of first derivative of f1: {}".format(f1.val, f1.der['x']))
```

```
Value of f1: 3.0;
Value of first derivative of f1: 1.0
```


```python
# For target function f_a(x) = x**2 + 2x + 3
# We expect the value to be 18 and the value of the derivative to be 8
f1_a = f1 ** 2 + 2 * f1 + 3
print("Value of f1_a: {};\nValue of first derivative of f1_a: {}".format(f1_a.val, f1_a.der['x']))
```

```
Value of f1_a: 18.0;
Value of first derivative of f1_a: 8.0
```


```python
# For target function f_b(x) = cos(πx) + 5x + 4
# We expect the value to be 18 and the value of the derivative to be 5 (approximately)
f1_b = sad.cos(f1 * math.pi) + 5 * f1 + 4
print("Value of f1_b: {};\nValue of first derivative of f1_b: {}".format(f1_b.val, f1_b.der['x']))
```

```
Value of f1_b: 18.0;
Value of first derivative of f1_b: 4.999999999999999
```


```python
# For target function f_c(x) = exp(3x) + 2ln(x) - 12x
# We expect the value to be 8069.2 and the value of the derivative to be 24297.9 (approximately)
f1_c = sad.exp(f1 * 3) + 2 * sad.log(f1) - 12 * f1
print("Value of f1_c: {};\nValue of first derivative of f1_c: {}".format(f1_c.val, f1_c.der['x']))
```

```
Value of f1_c: 8069.28115215272;
Value of first derivative of f1_c: 24297.918449392822
```


### Vector case

We have two approaches to generating our ```AutoDiffVector``` objects that we will illustrate:

```python
# First approach using AutoDiff objects
# Initialize variable inputs and instantiate AutoDiff objects
fv_a = sad.AutoDiff("a", 3.5)
fv_b = sad.AutoDiff("b", -5)
fv_c = sad.AutoDiff("c", 7)

# Create an array of AutoDiff objects
f_vect = [fv_a, fv_b, fv_c]

# Use the array to instantiate AutoDiffVector Objects
fv_1 = sad.AutoDiffVector(f_vect)

# Examine initial values
print("AutoDiffVector object: ", fv_1)
print("Value of fv_a: {};\nValue of first derivative of fv_a: {}".format(fv_1.objects['a'].val, fv_1.objects['a'].der))
print("Value of fv_b: {};\nValue of first derivative of fv_b: {}".format(fv_1.objects['b'].val, fv_1.objects['b'].der))
print("Value of fv_c: {};\nValue of first derivative of fv_c: {}".format(fv_1.objects['c'].val, fv_1.objects['c'].der))
```

```
AutoDiffVector object:  <superautodiff.autodiff.AutoDiffVector object at 0x0000026390D74EB8>
Value of fv_a: 3.5;
Value of first derivative of fv_a: Counter({'a': 1.0})
Value of fv_b: -5.0;
Value of first derivative of fv_b: Counter({'b': 1.0})
Value of fv_c: 7.0;
Value of first derivative of fv_c: Counter({'c': 1.0})
```


```python
# Second approach using arrays of variable names and values
# Initialize variable and value arrays
variables = ['d', 'e', 'f']
values = [4, -1, 1.1]

# Use vectorize to generate ADV objects
fv_2 = sad.vectorize(variables, values)

# Examine initial values
print("AutoDiffVector object: ", fv_2)
print("Value of fv_d: {};\nValue of first derivative of fv_d: {}".format(fv_2.objects['d'].val, fv_2.objects['d'].der))
print("Value of fv_e: {};\nValue of first derivative of fv_e: {}".format(fv_2.objects['e'].val, fv_2.objects['e'].der))
print("Value of fv_f: {};\nValue of first derivative of fv_f: {}".format(fv_2.objects['f'].val, fv_2.objects['f'].der))
```

```
AutoDiffVector object:  <superautodiff.autodiff.AutoDiffVector object at 0x0000026390D77198>
Value of fv_d: 4.0;
Value of first derivative of fv_d: Counter({'d': 1.0})
Value of fv_e: -1.0;
Value of first derivative of fv_e: Counter({'e': 1.0})
Value of fv_f: 1.1;
Value of first derivative of fv_f: Counter({'f': 1.0})
```


```python
# We can perform mathematical operations on our AutoDiffVector object
fv_3 = fv_1 + 3

fv_4 = sad.tan(fv_1) * 4

# Addition
print("\nfv_3 Case:")
print("Value of fv_a: {};\nValue of first derivative of fv_a: {}".format(fv_3.objects['a'].val, fv_3.objects['a'].der))
print("Value of fv_b: {};\nValue of first derivative of fv_b: {}".format(fv_3.objects['b'].val, fv_3.objects['b'].der))
print("Value of fv_c: {};\nValue of first derivative of fv_c: {}".format(fv_3.objects['c'].val, fv_3.objects['c'].der))

# Trigonometric operation and multiplication
print("fv_4 Case:")
print("Value of fv_a: {};\nValue of first derivative of fv_a: {}".format(fv_4.objects['a'].val, fv_4.objects['a'].der))
print("Value of fv_b: {};\nValue of first derivative of fv_b: {}".format(fv_4.objects['b'].val, fv_4.objects['b'].der))
print("Value of fv_c: {};\nValue of first derivative of fv_c: {}".format(fv_4.objects['c'].val, fv_4.objects['c'].der))
```

```

fv_3 Case:
Value of fv_a: 6.5;
Value of first derivative of fv_a: Counter({'a': 1.0})
Value of fv_b: -2.0;
Value of first derivative of fv_b: Counter({'b': 1.0})
Value of fv_c: 10.0;
Value of first derivative of fv_c: Counter({'c': 1.0})
fv_4 Case:
Value of fv_a: 1.4983425606343788;
Value of first derivative of fv_a: Counter({'a': 4.561257607252096})
Value of fv_b: 13.522060024986343;
Value of first derivative of fv_b: Counter({'b': 49.71152682983342})
Value of fv_c: 3.485791930897275;
Value of first derivative of fv_c: Counter({'c': 7.037686346377139})
```


### Jacobian matrix generation
Additionally, our package supports the generation of Jacobian matrices via our ```jacobian``` function. The function takes an array of variable names to differentiate with respect to and an array of functions (defined through ```AutoDiff``` objects). It returns a ```NumPy``` array corresponding to the Jacobian matrix, with one row per function and one column per variable.

```python
# Initialize autodiff objects
g = sad.AutoDiff('g', 3.4)
h = sad.AutoDiff('h', -4)
i = sad.AutoDiff('i', 7)

# Create functions
f1 = sad.tan(g) + 4
f2 = h**2 + 2*h - 12
f3 = sad.log(i) - sad.exp(i * 2)

# List of functions and variables
# ('j' appears in none of the functions, so its column is all zeros)
variables = ['g', 'h', 'i', 'j']
functions = [f1, f2, f3]

# Obtain Jacobian
sad.jacobian(variables, functions)
```

```
array([[ 1.06986342e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00, -6.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00, -2.40520843e+06,
         0.00000000e+00]])
```
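
To illustrate how such a matrix can be assembled from per-function derivative mappings, here is a minimal sketch in plain Python. The helper ```jacobian_rows``` is hypothetical and returns nested lists rather than a ```NumPy``` array; the derivative values are rounded, illustrative numbers.

```python
from collections import Counter

def jacobian_rows(variables, derivatives):
    """Build a Jacobian as a list of rows.

    `derivatives` holds one mapping per function, from variable name to the
    partial derivative with respect to that variable. Variables that a function
    does not depend on contribute a zero entry.
    """
    return [[float(der.get(var, 0.0)) for var in variables] for der in derivatives]

# Illustrative derivative mappings for three functions of g, h, and i.
derivatives = [
    Counter({"g": 1.07}),
    Counter({"h": -6.0}),
    Counter({"i": -2405208.43}),
]

J = jacobian_rows(["g", "h", "i", "j"], derivatives)
for row in J:
    print(row)
```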

---


# Software Organization

## Directory structure



        cs207-FinalProject/
                    .coverage
                    .coverage.xml
                    .travis.yml
                    LICENSE
                    README.md
                    requirements.txt
                    setup.py
                    setup.cfg
                    build/
                    docs/
                          milestone_1.ipynb
                          milestone_2.ipynb
                          documentation.ipynb
                          documentation.md
                          fig/
                              graph_1.png
                              graph_2.png     
                    dist/
                    htmlcov/
                    superautodiff/
                          .coverage
                          __init__.py
                          autodiff.py
                          autodiffreverse.py
                          functions.py
                    superautodiff.egg-info/      
                    test-reports/
                    tests/
                          __init__.py
                          .coverage
                          .coverage.XML
                          tests_autodiff.py
                          tests_autodiffreverse.py       
                          tests_autodiffvector.py         
                
<br>

## Modules

```superautodiff``` contains the following modules, corresponding to our package's main competencies. The modules are summarily described here and explained in detail in the subsequent sections.
- ```autodiff.py```: This module contains the core functionality of our package: a forward mode AD library that is able to work with a vector of input variables for a vector of functions.
- ```autodiffreverse.py```: This module contains our reverse mode AD implementation.
- ```functions.py```: This module contains the bulk of the mathematical operations used by our package along with our Jacobian function.

<br>

## Docs

The docs folder contains our documentation as a Jupyter notebook and as a Markdown file. We also have the notebooks for our first and second milestones.

<br>

## Testing
Testing is largely relevant to developers looking to edit and/or build upon our package; general users need not read this section. Our test suites - ``` test_autodiff.py ```, ``` test_autodiffreverse.py ```, and ``` test_autodiffvector.py ``` - are stored in our ```tests/``` folder and each script governs the testing of different aspects of our package. Our testing will be largely monitored through both Travis CI and CodeCov. Our GitHub repository will be fully integrated with Travis CI and CodeCov with relevant badges on our ```README.md``` to reflect the build status on Travis CI and the code coverage status on CodeCov. 

```superautodiff``` also supports ```pytest```. To run our tests, users will need to have ```pytest``` installed on their environment and navigate to the repository. Subsequently, users should run the following code:

```python -m pytest```

or:

``` pytest ```

This will run all our tests and provide summary statistics on the outcome of said tests.

<br>

## Package Distribution
Our package is distributed using PyPI. We use _setuptools_ and _wheel_ to generate our distribution archives and we use _twine_ to upload our package to PyPI.

The reason for this choice of tools is that they are simple, easy-to-use, and reliable. Our package does not have many complicated dependencies; we, therefore, want to employ simple packaging and distribution tools to ensure that our package is easily distributed to users with minimal hassle.

As mentioned above, users will simply have to call ```pip install superautodiff``` in order to install our package. The installation instructions and troubleshooting will be available on our GitHub repository.

The dist, superautodiff.egg-info, and build folders are used for our package distribution.

---

# Implementation 

Thus far, ```superautodiff``` has a working forward mode implementation and we have partly implemented multivariable automatic differentiation.

## Data structures
In our present implementation, the primary data structures used are Counters that we use to store our variable names, values, and derivative values in our ```AutoDiff``` class objects. The reason for this design choice is that we want to prevent cases where we have repeated variables when we implement multivariable automatic differentiation in subsequent milestones. We use ```Counter``` objects because they enable us to easily store our data in key-value pairs which makes it easier to evaluate the derivatives with respect to a particular variable.
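
The behaviour of ```Counter``` that matters for derivative bookkeeping can be seen in a few lines. The sketch below uses only the standard library and made-up derivative values; it does not reproduce the package's internal code.

```python
from collections import Counter

# Partial derivatives of two hypothetical functions, keyed by variable name.
df = Counter({"x": 2.0})
dg = Counter({"x": -3.0, "y": 1.5})

# Key-wise addition that keeps zero and negative values.
total = Counter(df)
total.update(dg)

# Looking up a variable that a function does not depend on returns 0
# instead of raising a KeyError.
missing = total["z"]

# Caution: the + operator on Counters discards entries that are not positive,
# so it is unsuitable for derivatives, which are often negative.
lossy = df + dg

print("update:", dict(total))
print("+ operator:", dict(lossy))
```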

Our ```AutoDiffVector``` objects use the same underlying data structures but include an additional dictionary with the variable name as the key and ```AutoDiff``` objects as the values corresponding to each key. This helps to ensure that the ```AutoDiff``` objects are properly stored within the vectors and can be reliably called in order to return the specific ```AutoDiff``` objects.

Our ```jacobian``` function uses ```NumPy``` arrays to generate Jacobian matrices largely because the arrays enable us to easily visualize and print out matrices; further, this enables our Jacobian matrix outputs to be usable in other matrix operations.

## Dependencies
Our package relies on the following packages:
- ```NumPy``` (external): We use this to specify relevant mathematical operations within our package.

- ```collections``` (Python standard library): We use this to store our data in ```Counter``` objects.

- ```math``` (Python standard library): We use this for additional mathematical operations.

## Dunder methods
The following dunder methods have been overloaded in our implementation in order for our ```AutoDiff``` objects to be easily used in mathematical operations and the construction of mathematical functions:
- ```__add__```: Modified to update the counter objects accordingly when addition is performed; modified to return ```AutoDiff``` objects.

- ```__radd__```: Modified to update the counter objects accordingly when addition is performed; modified to return ```AutoDiff``` objects.

- ```__sub__```: Modified to update the counter objects accordingly when subtraction is performed; modified to return ```AutoDiff``` objects.

- ```__rsub__```: Modified to update the counter objects accordingly when subtraction is performed; modified to return ```AutoDiff``` objects.

- ```__mul__```: Modified to update the counter objects accordingly when multiplication is performed; modified to return ```AutoDiff``` objects.

- ```__rmul__```: Modified to update the counter objects accordingly when multiplication is performed; modified to return ```AutoDiff``` objects.

- ```__neg__```: Modified such that all counter elements are made negative; modified to return ```AutoDiff``` objects.

- ```__truediv__```: Modified such that all counter elements are divided accordingly; modified to return ```AutoDiff``` objects.

- ```__rtruediv__```: Modified such that all counter elements are divided accordingly; modified to return ```AutoDiff``` objects.

- ```__pow__```: Modified such that the counter elements are appropriately exponentiated; modified to return ```AutoDiff``` objects.
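
Subtraction and division are not commutative, which is why their reflected methods need separate derivative rules. The sketch below shows this with a hypothetical single-variable class ```Dual``` (a value and one derivative); it is a simplified illustration and not the ```AutoDiff``` class itself.

```python
class Dual:
    """Hypothetical single-variable AD object holding a value and one derivative."""

    def __init__(self, val, der=1.0):
        self.val, self.der = val, der

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, other):
        if isinstance(other, Dual):
            return Dual(self.val - other.val, self.der - other.der)
        return Dual(self.val - other, self.der)

    def __rsub__(self, other):
        # Called for `other - self` when `other` is a plain number.
        return Dual(other - self.val, -self.der)

    def __truediv__(self, other):
        if isinstance(other, Dual):
            # Quotient rule.
            der = (self.der * other.val - self.val * other.der) / other.val ** 2
            return Dual(self.val / other.val, der)
        return Dual(self.val / other, self.der / other)

    def __rtruediv__(self, other):
        # Called for `other / self`: d/dx (c / u) = -c u' / u^2.
        return Dual(other / self.val, -other * self.der / self.val ** 2)

    def __pow__(self, n):
        # Constant exponents only in this sketch.
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.der)

x = Dual(2.0)
a = 10 - x             # uses __rsub__
b = 1 / x              # uses __rtruediv__
c = x ** 3 / (5 - x)   # combines __pow__, __rsub__, and __truediv__
print(a.val, a.der, b.val, b.der, c.val, c.der)
```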

## Mathematical operations
Our package implements the following mathematical operations using ```NumPy``` and ```math``` such that they can be used on ```AutoDiff``` objects with ease. All of the following functions can take in scalar values, vectors (Python lists), and ```AutoDiff``` objects. This is useful to users that seek to perform mathematical calculations and/or build up complicated mathematical functions using ```AutoDiff``` objects for derivative evaluation.

### Trigonometric functions
- ```sin(x)```

- ```cos(x)```

- ```tan(x)```

- ```sec(x)```

- ```csc(x)```

- ```cot(x)```

- ```arcsin(x)```

- ```arccos(x)```

- ```arctan(x)```

- ```arcsec(x)```

- ```arccsc(x)```

- ```arccot(x)```

- ```sinh(x)```

- ```cosh(x)```

- ```tanh(x)```


### Logarithms and exponentials

- ```log(x)``` of user specified base

- ```exp(x)```
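
A function that accepts scalars, lists, and AD objects alike can be sketched with simple type dispatch. Everything below is an illustrative assumption: the ```Var``` container, the function bodies, and in particular the ```base``` keyword of ```log```, whose actual name and default in the package are not stated in this document.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Var:
    """Hypothetical AD-like container: a value and derivatives keyed by variable name."""
    val: float
    der: dict = field(default_factory=dict)

def sin(x):
    if isinstance(x, Var):
        # Chain rule: d sin(u) = cos(u) du.
        return Var(math.sin(x.val), {k: math.cos(x.val) * d for k, d in x.der.items()})
    if isinstance(x, (list, tuple)):
        return [sin(item) for item in x]
    return math.sin(x)

def log(x, base=math.e):
    if isinstance(x, Var):
        # Chain rule: d log_a(u) = du / (u ln a).
        scale = x.val * math.log(base)
        return Var(math.log(x.val, base), {k: d / scale for k, d in x.der.items()})
    if isinstance(x, (list, tuple)):
        return [log(item, base) for item in x]
    return math.log(x, base)

v = Var(10.0, {"x": 1.0})
print(log(v).der["x"])        # derivative of ln(x) at x = 10
print(log(8, base=2))         # scalar input
print(sin([0.0, v])[1].val)   # list input mixing a scalar and a Var
```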

## Implementation illustrations

### Initialization and instantiation of objects
Unlike the earlier section where we illustrate the usage of our package, here, we focus on the underlying methods used; the content is somewhat repetitive, but is retained for completeness. 

Once our module is imported, we can create ```AutoDiff``` objects that store the variable name and the value at which to evaluate the variable. The object is mutable and can undergo mathematical operations in order to create complex mathematical functions; the object stores variable names, the values of the variables (given the value at which to evaluate them), and the values of the first derivatives of the variables (given the value at which to evaluate them).


Our package defines a class ```AutoDiff``` that takes a variable name and a value as input. An ```AutoDiff``` object has two important attributes: 
- ```val``` - a scalar that contains the value of the function 
- ```der``` - a dictionary-like object (a ```Counter``` in most cases) that stores the derivatives keyed by variable name. For example:

 ```{"a":1, "b":1}```

```python
# Import module
import superautodiff as sad

# Initialize variable inputs and instantiate AutoDiff object
value_to_evaluate = 5.0 
variable_name = "x_1"
f1 = sad.AutoDiff(variable_name, value_to_evaluate)

# Illustrate how values and derivative values are stored
print("Value of f1: {};\nValue of first derivative of f1: {}".format(f1.val, f1.der))
```

```
Value of f1: 5.0;
Value of first derivative of f1: Counter({'x_1': 1.0})
```


### Basic operations using dunder methods
The overloaded dunder methods enable the use of basic mathematical operations with ```AutoDiff``` objects. We do not check for the accuracy of our calculations here since that is already covered above in our Usage section; instead, we merely illustrate how the functions are used and the outputs they return in order to showcase our implementation.

```python
# Addition example
f1_a = f1 + f1

# Subtraction example
f1_b = 3*f1 - f1

# Multiplication example
f1_c = f1 * 3

# Exponent example
f1_d = f1 ** 2

# Division example
f1_e = f1/4

print("Value of f1_a: {};\nValue of first derivative of f1_a: {}\n".format(f1_a.val, f1_a.der))
print("Value of f1_b: {};\nValue of first derivative of f1_b: {}\n".format(f1_b.val, f1_b.der))
print("Value of f1_c: {};\nValue of first derivative of f1_c: {}\n".format(f1_c.val, f1_c.der))
print("Value of f1_d: {};\nValue of first derivative of f1_d: {}\n".format(f1_d.val, f1_d.der))
print("Value of f1_e: {};\nValue of first derivative of f1_e: {}\n".format(f1_e.val, f1_e.der))
```

```
Value of f1_a: 10.0;
Value of first derivative of f1_a: Counter({'x_1': 2.0})

Value of f1_b: 10.0;
Value of first derivative of f1_b: Counter({'x_1': 2.0})

Value of f1_c: 15.0;
Value of first derivative of f1_c: Counter({'x_1': 3.0})

Value of f1_d: 25.0;
Value of first derivative of f1_d: {'x_1': 10.0}

Value of f1_e: 1.25;
Value of first derivative of f1_e: {'x_1': 0.25}
```



### Trigonometric and logarithmic operations
Similarly, our ```AutoDiff``` objects can be passed through our trigonometric and logarithmic functions. As before, we do not check the accuracy of the values as this has already been done above.

```python
# Sine example
f1_f = sad.sin(f1)

# Cosine example
f1_g = sad.cos(f1*2)

# Tangent example
f1_h = sad.tan(f1/2)

# Exp example
f1_i = sad.exp(f1*3)

# Natural logarithm example
f1_j = sad.log(f1+5)

print("Value of f1_f: {};\nValue of first derivative of f1_f: {}\n".format(f1_f.val, f1_f.der))
print("Value of f1_g: {};\nValue of first derivative of f1_g: {}\n".format(f1_g.val, f1_g.der))
print("Value of f1_h: {};\nValue of first derivative of f1_h: {}\n".format(f1_h.val, f1_h.der))
print("Value of f1_i: {};\nValue of first derivative of f1_i: {}\n".format(f1_i.val, f1_i.der))
print("Value of f1_j: {};\nValue of first derivative of f1_j: {}\n".format(f1_j.val, f1_j.der))
```

```
Value of f1_f: -0.9589242746631385;
Value of first derivative of f1_f: Counter({'x_1': 0.28366218546322625})

Value of f1_g: -0.8390715290764524;
Value of first derivative of f1_g: Counter({'x_1': 1.0880422217787395})

Value of f1_h: -0.7470222972386602;
Value of first derivative of f1_h: Counter({'x_1': 0.7790211562858627})

Value of f1_i: 3269017.3724721107;
Value of first derivative of f1_i: Counter({'x_1': 9807052.117416332})

Value of f1_j: 2.302585092994046;
Value of first derivative of f1_j: Counter({'x_1': 0.1})
```



### Vector operations
Vector operations are identical to those in the ```AutoDiff``` case. The ```AutoDiffVector``` objects essentially operate as dictionaries containing a set of ```AutoDiff``` objects that can undergo mathematical operations by relying on the ```AutoDiff``` methods that underlie the ```AutoDiffVector``` class attributes and methods.

---

# Package Extension: Reverse Mode

As an extension to our package, we have implemented reverse mode AD after having received approval for our extension. This section of our documentation details our reverse mode implementation and how it can be used.

## Reverse mode
The reverse mode is a method of performing AD that rests on an important mathematical property: any differentiable algorithm can be translated into a sequence of assignments of basic mathematical operations.

This property motivates the first part of the reverse mode, namely the forward pass, which rebuilds the function we would like to evaluate from its variable inputs one elementary operation at a time. During this pass, we store the value of the partial derivatives of each of the elementary functions.

Subsequently, we compute all the derivatives in reverse order using the partial derivatives we have already obtained from our forward pass. This, then, enables us to evaluate the derivative of a function through the reverse mode.
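
As a worked illustration of the two passes, consider $f(x_1, x_2, x_3) = x_1 - 3x_2 + x_3 x_2$ at $(x_1, x_2, x_3) = (4, 7, 3)$. The intermediate names $y_1, \dots, y_4$ and the bar notation for accumulated derivatives are conventions assumed here for the example. The forward pass evaluates each elementary operation and records its partial derivatives:

$$
\begin{aligned}
y_1 &= 3x_2 = 21, & \frac{\partial y_1}{\partial x_2} &= 3 \\
y_2 &= x_1 - y_1 = -17, & \frac{\partial y_2}{\partial x_1} &= 1, \quad \frac{\partial y_2}{\partial y_1} = -1 \\
y_3 &= x_3 x_2 = 21, & \frac{\partial y_3}{\partial x_3} &= x_2 = 7, \quad \frac{\partial y_3}{\partial x_2} = x_3 = 3 \\
y_4 &= y_2 + y_3 = 4, & \frac{\partial y_4}{\partial y_2} &= 1, \quad \frac{\partial y_4}{\partial y_3} = 1
\end{aligned}
$$

The reverse pass then starts from $\bar{y}_4 = \partial f / \partial y_4 = 1$ and applies the chain rule from the output back to the inputs:

$$
\begin{aligned}
\bar{y}_3 &= \bar{y}_4 \cdot 1 = 1, \qquad \bar{y}_2 = \bar{y}_4 \cdot 1 = 1, \qquad \bar{y}_1 = \bar{y}_2 \cdot (-1) = -1 \\
\bar{x}_1 &= \bar{y}_2 \cdot 1 = 1 \\
\bar{x}_2 &= \bar{y}_1 \cdot 3 + \bar{y}_3 \cdot 3 = -3 + 3 = 0 \\
\bar{x}_3 &= \bar{y}_3 \cdot 7 = 7
\end{aligned}
$$

so that $\nabla f(4, 7, 3) = (1, 0, 7)$. Note that $x_2$ receives a contribution from every node that uses it, which is why the contributions are summed.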



## ```AutoDiffReverse```
We have created a new class of ```AutoDiff``` objects called ```AutoDiffReverse``` that operate similarly to regular ```AutoDiff``` objects except that these ```AutoDiffReverse``` objects rely on the reverse mode to evaluate derivatives rather than the forward mode.

Much like the ```AutoDiff``` objects, our ```AutoDiffReverse``` objects have the following three attributes:
- ```var```: name of variable
- ```val```: value of the variable
- ```der```: value of the derivative of the variable

In the case of ```AutoDiffReverse```, the first input is the value at which to evaluate the derivative whilst the second input is the variable name (as opposed to how ```AutoDiff``` objects have it the other way around). This is because variable names are optional here since ```AutoDiffReverse``` objects are used in intermediary steps in our evaluative table.

## Dependencies
Our reverse mode implementation relies on ```NumPy``` for its mathematical functions used in ```AutoDiffReverse```'s mathematical operations.

Additionally, ```AutoDiffReverse``` relies on ```Pandas```, as the evaluation table that is generated for the forward pass is created using a ```Pandas``` dataframe.

## Data structures
As mentioned earlier, a key data structure used in our implementation is the ```Pandas``` dataframe that stores the evaluation table. We use ```Pandas``` because it allows for the simple and quick generation of tables with which we can present our evaluation table cleanly. This improves user interpretation and makes it easy for users to extract specific values from our evaluation table.


## Mathematical operations
Not unlike our ```AutoDiff``` implementation, ```AutoDiffReverse``` includes the following mathematical operations using ```NumPy``` and ```math```. All of the following functions can take in scalar values, vectors (Python lists), and ```AutoDiffReverse``` objects. 

### Trigonometric functions
- ```sin(x)```

- ```cos(x)```

- ```tan(x)```

- ```sec(x)```

- ```csc(x)```

- ```cot(x)```

- ```arcsin(x)```

- ```arccos(x)```

- ```arctan(x)```

- ```arcsec(x)```

- ```arccsc(x)```

- ```arccot(x)```

- ```sinh(x)```

- ```cosh(x)```

- ```tanh(x)```


### Logarithms and exponentials

- ```log(x)``` of user specified base

- ```exp(x)```


## Usage

We will now illustrate a typical use case of our reverse mode implementation in order to evaluate the derivatives of a given function at a given point and to generate the corresponding evaluation table.

In [12]:
from superautodiff import AutoDiffReverse
from superautodiff.autodiffreverse import *

# Instantiate AutoDiffReverse objects
x1 = sad.AutoDiffReverse(4,"x1")
x2 = sad.AutoDiffReverse(7,"x2")
x3 = sad.AutoDiffReverse(3,"x3")

In [13]:
# We create a function using the AutoDiffReverse objects
f = x1 - 3*x2 + x3*x2

# Already we can obtain the value of f
print("Value of f: ", f.val)

Value of f:  4.0


In [14]:
# Assign and display the evaluation table for function f and examine
forward_table = f.pass_table()
display(forward_table)

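| | Node | d1 | d1value | d2 | d2value |
|---|---|---|---|---|---|
| 0 | x1 | x1 | 1 | - | - |
| 0 | x2 | x2 | 1 | - | - |
| 0 | x3 | x3 | 1 | - | - |
| 0 | y1 | x2 | 3 | - | - |
| 0 | y2 | x1 | 1 | y1 | -1 |
| 0 | y3 | x3 | 7 | x2 | 3 |
| 0 | y4 | y2 | 1 | y3 | 1 |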


In [15]:
# Input the forward table and array of variable names into reversepass to obtain derivatives
der = reversepass(forward_table,['x1', 'x2', 'x3'])
print("Value of derivatives: ",der)

# Clear the table
f.clear_table()

Value of derivatives:  {'x1': 1, 'x2': 0.0, 'x3': 7.0}


## Evaluation table
As illustrated above, the evaluation table is returned by ```pass_table()``` (and stored in ```forward_table``` in our example); it can be accessed at any juncture in the reverse mode automatic differentiation process.

The evaluation table consists of five columns. The first column (Node) names each input or intermediate node. The second (d1) and fourth (d2) columns name the nodes from which that node is computed. The third (d1value) and fifth (d2value) columns contain the derivative of the node with respect to d1 and d2 respectively. A dash marks a node that has no second operand.

The table, therefore, records the derivatives of the composite and intermediary variables that are used in the reverse mode AD process.
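
To make the role of each column concrete, the following is a minimal sketch of how a reverse pass could consume rows like those in the table from the Usage section. It is not the package's ```reversepass``` implementation: the function name ```reverse_sweep```, the tuple layout, and the use of ```None``` in place of the dash are assumptions made for illustration, and plain Python tuples stand in for the ```Pandas``` dataframe.

```python
from collections import defaultdict

# Rows copied from the evaluation table in the Usage section:
# (Node, d1, d1value, d2, d2value). None stands in for the "-" entries.
rows = [
    ("x1", "x1", 1, None, None),
    ("x2", "x2", 1, None, None),
    ("x3", "x3", 1, None, None),
    ("y1", "x2", 3, None, None),
    ("y2", "x1", 1, "y1", -1),
    ("y3", "x3", 7, "x2", 3),
    ("y4", "y2", 1, "y3", 1),
]


def reverse_sweep(rows, inputs):
    """Accumulate d(output)/d(node) by walking the table from last row to first.

    Assumes the rows are listed in evaluation order and that the last row
    is the output node (hypothetical helper, for illustration only).
    """
    adjoint = defaultdict(float)
    adjoint[rows[-1][0]] = 1.0  # seed: derivative of the output w.r.t. itself

    for node, d1, d1value, d2, d2value in reversed(rows):
        if node == d1:
            # Input rows refer to themselves; there is nothing to propagate.
            continue
        # Chain rule: pass this node's adjoint down to each operand.
        adjoint[d1] += adjoint[node] * d1value
        if d2 is not None:
            adjoint[d2] += adjoint[node] * d2value

    return {name: adjoint[name] for name in inputs}


gradient = reverse_sweep(rows, ["x1", "x2", "x3"])
print(gradient)
```

Because ```x2``` appears as an operand of both ```y1``` and ```y3```, its adjoint is the sum of two contributions, which is why the sketch accumulates with ```+=``` rather than assigning.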

---

# Future Work and Possible Extensions

Although ```superautodiff``` is to be considered as a finished product that users should be able to satisfactorily install and use, our team has several ideas for possible extensions and developments that we hope to implement in the future.

### Further vectorization
Presently, although our ```AutoDiff``` objects are vectorizable as ```AutoDiffVector``` objects, the functionality of these ```AutoDiffVector``` objects is still limited. Our ```AutoDiffVector``` objects are able to work with scalar numerics and single ```AutoDiff``` objects but not yet with vectors of these objects.

We hope to implement further vectorization in order to allow for more flexible and complex vector operations for our ```AutoDiffVector``` objects. This implementation will involve working with ```NumPy``` arrays and potentially ```Numba``` if it is the case that our vector functions are found to be slow.

We foresee that this will involve an overhaul of the ```AutoDiffVector```  class with more overrides so as to ensure that the vectors are fully functional and that we have some means of tracking the variables being passed in (which will be valuable as the vectorized operations might get quite messy when large numbers of variables are involved). 

Additionally, we expect that we will have to write several functions that help to simplify matrix operations; some functions are detailed as follows:
- ```sad.dot(v_1, v_2)```: Takes in two ```AutoDiffVector``` objects and returns the dot product of the two.
- ```sad.cross(v_1, v_2)```: Takes in two ```AutoDiffVector``` objects and returns the cross product of the two.
- ```sad.determinant(v_1)```: Takes in a square ```AutoDiffVector``` object and returns the matrix determinant.
- ```sad.eye(n)```: Takes in a scalar integer ```n``` and generates an $n \times n$ identity matrix that can operate with ```AutoDiffVector``` objects
- ```sad.trace(v_1)```: Takes in a square ```AutoDiffVector``` object and returns the trace of the matrix.
- ```sad.reshape(v_1, dim)```: This will either be implemented as a function or an attribute; if it were implemented as a function, the function would take in a matrix and a tuple containing the matrix dimensions and reshape the given matrix to fit the specified dimensions similarly to how ```NumPy```'s reshape operates.

This implementation will require quite a bit of work but will be very useful as it will enable users to perform a much broader set of operations with greater convenience. Additionally, if we were to revamp our vectorization implementation, we could also get our vectors to perform these operations far more quickly. The speed of our vectorized objects is a shortcoming of our existing implementation, which still relies heavily on inefficient loops.

### Support for higher-order derivatives 
Another possible extension we hope to implement in the future is to build up our package's functionality in order to support operations involving higher-order derivatives.

In theory, this approach would not be very difficult because both the forward mode and the reverse mode that we have implemented already provide the core machinery required to evaluate higher-order derivatives as well as cross-partial derivatives. We essentially need to extend our existing capabilities to differentiate variables multiple times and support cross-differentiation. The difficulty, we foresee, will most likely come from overhauling our base classes to support this while keeping the additional functionality user-friendly and intuitive; we do not want our package to become cluttered, messy, and hard to use.

Provisionally, we think that this will involve taking in the order of derivatives required as an input and developing a method of categorizing our derivatives and tracking all the derivatives and their orders. We will also need to refine our Counter output and variable naming convention to ensure that the outputs remain clear and informative for users.
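
As a rough indication of what such an extension might look like, the sketch below propagates a value together with its first and second derivatives for a single variable. The class name ```SecondOrder``` and its attributes are hypothetical and are not part of ```superautodiff```; only addition and multiplication are covered, and cross-partial derivatives are not handled.

```python
class SecondOrder:
    """Carries f, f' and f'' with respect to one variable (illustrative only)."""

    def __init__(self, val, d1=0.0, d2=0.0):
        self.val = val  # function value
        self.d1 = d1    # first derivative
        self.d2 = d2    # second derivative

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants whose derivatives are zero.
        return other if isinstance(other, SecondOrder) else SecondOrder(other)

    def __add__(self, other):
        o = self._lift(other)
        return SecondOrder(self.val + o.val, self.d1 + o.d1, self.d2 + o.d2)

    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        return SecondOrder(
            self.val * o.val,
            # Product rule: (uv)' = u'v + uv'
            self.d1 * o.val + self.val * o.d1,
            # Applied twice: (uv)'' = u''v + 2u'v' + uv''
            self.d2 * o.val + 2 * self.d1 * o.d1 + self.val * o.d2,
        )

    __rmul__ = __mul__


# f(x) = x^3 + 3x at x = 2; seed d1 = 1 because dx/dx = 1.
x = SecondOrder(2.0, d1=1.0)
f = x * x * x + 3 * x
print(f.val, f.d1, f.d2)
```

A multivariate version would need to track one second-derivative entry per pair of variables, which is where the bookkeeping concerns described above would arise.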

We think that expanding our functionality for higher-order derivatives is an important future feature, as it is useful for scientific computing. We can subsequently extend this slightly further and include functionality to generate Hessian and bordered Hessian matrices, which are extremely useful for optimization problems.

### Visual interface

Presently, our reverse mode implementation implicitly constructs an evaluation table of values and a computational graph as part of its back-end operations that users do not really see (even though they can examine a simple evaluation table that we draw with ```Pandas```). 

Our team hopes to develop an extension that enables our package to output clearer evaluation tables that follow design principles for informative and interpretable presentation. The hope is that users will not only be able to use our package to evaluate derivatives, but also learn about the underlying processes (such as each step of the forward pass). Users should also be able to easily observe the composite derivative values in case they wish to obtain said values.

Additionally, we hope to use tools such as ```Graphviz``` to generate the computational graph from our existing implementation and/or use the d3 JavaScript library in order to create scalable vector graphics that can be used to visualize the computational graph. This can be particularly useful for users who are keen on learning and understanding the automatic differentiation process.
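
One possible starting point, sketched below, is to translate the rows of an evaluation table into Graphviz DOT source, which is plain text and needs no extra Python dependency to produce. The function name ```table_to_dot``` and the tuple layout are assumptions made for illustration; the rows are those of the table in the Usage section, with ```None``` in place of the dash.

```python
# Rows copied from the evaluation table in the Usage section:
# (Node, d1, d1value, d2, d2value). None stands in for the "-" entries.
rows = [
    ("x1", "x1", 1, None, None),
    ("x2", "x2", 1, None, None),
    ("x3", "x3", 1, None, None),
    ("y1", "x2", 3, None, None),
    ("y2", "x1", 1, "y1", -1),
    ("y3", "x3", 7, "x2", 3),
    ("y4", "y2", 1, "y3", 1),
]


def table_to_dot(rows):
    """Return DOT source with one edge per (operand -> node) dependency.

    Each edge is labelled with the partial derivative of the node with
    respect to that operand (hypothetical helper, for illustration only).
    """
    lines = ["digraph computational_graph {", "    rankdir=LR;"]
    for node, d1, d1value, d2, d2value in rows:
        for operand, partial in ((d1, d1value), (d2, d2value)):
            # Skip missing operands and the self-referencing input rows.
            if operand is None or operand == node:
                continue
            lines.append(f'    "{operand}" -> "{node}" [label="{partial}"];')
    lines.append("}")
    return "\n".join(lines)


dot_source = table_to_dot(rows)
print(dot_source)
```

The resulting text could be saved to a file and rendered with the Graphviz command-line tool, for example ```dot -Tsvg graph.dot -o graph.svg```.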

---

# References
- Sondak, David. Lecture 10: Automatic Differentiation: The Forward Mode. Cambridge, MA; CS207 Fall '19
- Hoffmann, Philipp H.W. A Hitchhiker’s Guide to Automatic Differentiation. Numerical Algorithms, 72, 24 October 2015, 775-811, Springer Link, DOI 10.1007/s11075-015-0067-6.
- Domke, Justin. A simple explanation of reverse-mode automatic differentiation. 24 March 2009. https://justindomke.wordpress.com/2009/03/24/a-simple-explanation-of-reverse-mode-automatic-differentiation/

We would also like to thank our head instructor, Professor David Sondak, and our Teaching Fellow, Bhaven Patel, for their input, contribution, and support.