# Documentation
@merlionctc

**Table of Contents**
- [1. Introduction](#1.-Introduction)
  - [1.1 Derivatives](#1.1-Derivatives)
  - [1.2 Auto Differentiation](#1.2-Auto-Differentiation)
- [2. Background](#2.-Background)
  - [2.1 Chain Rule](#2.1-Chain-Rule) 
  - [2.2 Elementary functions](#2.2-Elementary-functions) 
  - [2.3 Forward mode](#2.3-Forward-mode) 
  - [2.4 Dual Number](#2.4-Dual-Number) 
  - [2.5 Reverse Mode](#2.5-Reverse-Mode) 
- [3. Usage](#3.-Usage)
  - [3.1 Installation](#3.1-Installation)
  - [3.2 How to use](#3.2-How-to-Use)
- [4. Software organization](#4.-Software-Organization)
  - [4.1 Directory Structure](#4.1-Directory-Structure)
  - [4.2 Modules to Include](#4.2-Modules-to-Include)
  - [4.3 Test Suite Design](#4.3-Test-Suite-Design)
  - [4.4 Package Distribution](#4.4-Package-Distribution) 
- [5. Implementation](#5.-Implementation)
  - [5.1 Forward Mode Implementation](#5.1-Forward-Mode-Implementation)
  - [5.2 Symbolic Reverse Mode](#5.2-Symbolic-Reverse-Mode)
  - [5.3 Handling Elementary Functions](#5.3-Handling-Elementary-Functions)
- [6. Extensions](#6.-Extensions)
  - [6.1 Future Features from Milestone2 with updates](#6.1-Future-Features-from-Milestone2-with-updates)
  - [6.2 Reverse Mode](#6.2-Revere-Mode)
  - [6.3 Symbolic Expression (Reverse Diff and Higher Order Derivative)](#6.3-Symbolic-Expression-(Reverse-Diff-and-Higher-Order-Derivative))
- [7. Broader Impact and Inclusivity Statement](#7.-Broader-Impact-and-Inclusivity-Statement)
- [8. Futures](#8.-Futures)
- [9. Reference](#9.-Reference)





## 1. Introduction

`AutoDiff` is a Python package for automatic differentiation: given a function supplied by the user, it evaluates the function's derivatives automatically. The package includes modules for forward-mode and reverse-mode differentiation.

### 1.1 Derivatives

Formally, in the single-variable case, the derivative of a function $f$ at a point $a$, if it exists, is defined as

$$ f'(a) = \lim_{h\to0} \frac{f(a+h) - f(a)}{h} $$

Geometrically, this is the slope of the tangent line to the graph of the function at that point. The tangent line is the best linear approximation of the function near that input value.

Derivatives generalize to functions of several real variables and are useful for finding the maxima and minima of functions. They have a variety of applications in statistics and machine learning. The process of finding a derivative is called *differentiation*.

There are three ways to carry out differentiation on a computer:
- Numerical differentiation
- Symbolic differentiation
- Automatic differentiation


### 1.2 Auto Differentiation
Automatic differentiation (AD), also called algorithmic differentiation, is a set of techniques for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs<sup>1</sup>. It differs from numerical differentiation, which approximates derivatives by finite differences of the original function evaluated at sample points<sup>2</sup>. It also differs from symbolic differentiation, which manipulates expressions automatically to obtain a derivative expression<sup>3</sup>.
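To make the contrast with numerical differentiation concrete, here is a minimal sketch (not part of the `AutoDiff` package) that approximates the derivative of $\sin(x)$ at $x = 1$ with a forward finite difference. The helper name `forward_difference` and the chosen step sizes are assumptions made only for this illustration.

```python
import math

def forward_difference(f, a, h):
    """Approximate f'(a) with the finite difference (f(a + h) - f(a)) / h."""
    return (f(a + h) - f(a)) / h

a = 1.0
exact = math.cos(a)  # the exact derivative of sin(x) is cos(x)

errors = {}
for h in (1e-1, 1e-4, 1e-8, 1e-13):
    approx = forward_difference(math.sin, a, h)
    errors[h] = abs(approx - exact)
    print(f"h = {h:.0e}  approx = {approx:.12f}  error = {errors[h]:.2e}")
```

The error shrinks as `h` decreases from `1e-1`, but for very small `h` floating-point round-off usually starts to dominate, so the result depends on a well-chosen step size. AD avoids this trade-off because it never forms a difference quotient.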

The essence of AD is that all numerical computations are ultimately compositions of a finite set of elementary operations whose derivatives are known<sup>4</sup>. AD breaks a function down into a sequence of elementary arithmetic operations (addition, subtraction, multiplication, and division) and elementary functions (exponential, logarithmic, and trigonometric). By applying the chain rule repeatedly to these operations, derivatives of arbitrary order can be computed automatically and to machine precision.

This technique is well established and has applications in areas such as fluid dynamics, astronomy, and engineering design optimization.

To sum up, there are two major advantages of using AD:
- Computes derivatives to machine precision.
- Does not rely on extensive mathematical derivations or expression trees, so it is easily applicable to a wide class of functions.

## 2. Background

Automatic differentiation relies on several mathematical foundations, some of which are illustrated in this section. These concepts should make the design of the software easier to understand.

### 2.1 Chain Rule

The chain rule is the most important concept in AD. It lets us handle complex functions with several layers and arguments. By applying the chain rule, we can break a complicated function into basic parts made up of elementary functions whose derivative expressions are known.

Suppose we have a function $h\left(u\left(t\right)\right)$ and want the derivative of $h$ with respect to $t$. By the chain rule, the derivative is $$\dfrac{\partial h}{\partial t} = \dfrac{\partial h}{\partial u}\dfrac{\partial u}{\partial t}.$$

More generally, a function $h$ may have several arguments, or a vector argument, so that $h = h(x(t))$ where $x \in R^n$ and $t \in R^m$. Then $h$ depends on $n$ inner functions $x_i(t)$, each of which has $m$ variables. The derivative of $h$ is now

 $$\nabla_{t}h = \sum_{i=1}^{n}{\frac{\partial h}{\partial x_{i}}\nabla x_{i}\left(t\right)}.$$
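The multivariate formula can be checked numerically. The sketch below uses functions chosen purely for illustration (they are assumptions, not part of the package): $h(x) = x_1 x_2$ with $x_1(t) = t_1 + t_2$ and $x_2(t) = t_1 t_2$, so $n = m = 2$.

```python
import numpy as np

def x_of_t(t):
    """Inner map x: R^2 -> R^2 (chosen only for illustration)."""
    t1, t2 = t
    return np.array([t1 + t2, t1 * t2])

t = np.array([2.0, 3.0])
x = x_of_t(t)

# Partial derivatives of h(x) = x_1 * x_2 with respect to x_1 and x_2.
dh_dx = np.array([x[1], x[0]])

# Row i holds the gradient of x_i(t) with respect to t.
grad_x = np.array([[1.0, 1.0],
                   [t[1], t[0]]])

# Chain rule: sum over i of (dh/dx_i) * grad x_i(t).
grad_h = dh_dx @ grad_x

# Direct differentiation of h(t) = (t1 + t2) * t1 * t2 for comparison.
direct = np.array([2 * t[0] * t[1] + t[1] ** 2,
                   t[0] ** 2 + 2 * t[0] * t[1]])

print(grad_h)   # [21. 16.]
print(direct)   # [21. 16.]
```

Both routes give the same gradient, which is the property AD relies on when it combines the derivatives of elementary pieces.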

### 2.2 Elementary functions

Any complex function is built from several elementary functions. As discussed above, we use the chain rule to break it down and then compute the derivatives of the elementary functions. 

In mathematics, an elementary function is a function of a single variable composed of particular simple functions. Elementary functions are typically defined as a sum, product and/or composition of many polynomials, rational functions, trigonometric and exponential functions, and their inverse functions.<sup>5</sup>

Because the derivatives of elementary functions have known closed-form expressions, they can be used directly in the computational graph described below. 


### 2.3 Forward mode

#### 2.3.1 Evaluation trace

Taking $f = (x+y)*z$ as an example, we first show the evaluation trace and then the corresponding computational graph.

Let's evaluate $f$ at the point (1,1,1). In the evaluation trace table, we record each step of the trace, its elementary operation, and the corresponding numeric value at the point.

| Trace | Elementary Operation | Numeric Value |
| ----- | -------------------- | ------------- |
| $x$   | 1                    | 1             |
| $y$   | 1                    | 1             |
| $z$   | 1                    | 1             |
| $v_1$ | $x+y$                | 2             |
| $f$   | $v_1*z$              | 2             |

#### 2.3.2 Computational graph

The evaluation trace above can be visualized with the computational graph below. Each node represents a step of the trace and each edge represents an elementary operation.

<img src="forward_mode.png">

*Note: If you cannot see the image, please right click and open image in new tab*

#### 2.3.3 Explanation
The evaluation trace above is exactly the path we follow in forward mode. In addition, we carry the derivatives along with the values. Here we take the derivative of $f$ with respect to $x$.

| Trace | Elementary Operation | Numeric Value | Deri. on x                    | Deri. Value on x |
| ----- | -------------------- | ------------- | ----------------------------- | ---------------- |
| $x$   | 1                    | 1             | 1                             | 1                |
| $y$   | 1                    | 1             | 0                             | 0                |
| $z$   | 1                    | 1             | 0                             | 0                |
| $v_1$ | $x+y$                | 2             | $\dot{x}+\dot{y}$             | 1                |
| $f$   | $v_1*z$              | 2             | $\dot{v_1}*z + v_1*\dot{z}$   | 1                |
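The table can be reproduced in a few lines of plain Python. This is only a sketch of the idea, with each trace entry written as a `(value, derivative with respect to x)` tuple; it does not use the package's classes.

```python
# Each entry of the trace is a (value, derivative w.r.t. x) pair.
x = (1.0, 1.0)   # seed: dx/dx = 1
y = (1.0, 0.0)   # dy/dx = 0
z = (1.0, 0.0)   # dz/dx = 0

# v1 = x + y, differentiated with the sum rule.
v1 = (x[0] + y[0], x[1] + y[1])

# f = v1 * z, differentiated with the product rule.
f = (v1[0] * z[0], v1[1] * z[0] + v1[0] * z[1])

print(v1)  # (2.0, 1.0)
print(f)   # (2.0, 1.0)
```

The value and the derivative are computed in the same pass, one elementary operation at a time, which is the defining feature of forward mode.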

### 2.4 Dual Number

A dual number has a real part and a dual part. If $z = a + b\epsilon$, then $a$ is the real part and $b$ is the dual part.

We define $\epsilon$ such that $\epsilon ^2 = 0$ but $\epsilon \neq 0$.

Dual numbers are useful for calculating derivatives of a function. For example, say we have

$$ x = a + b\epsilon, \quad y = x^2$$

Then we can derive

$$y = (a + b\epsilon)^2 = a^2 + 2ab\epsilon + b^2\epsilon^2 = a^2 + 2ab\epsilon$$

The real part of $y$ is the value $a^2$, and the dual part is $2ab$, which for $b = 1$ is the derivative of $x^2$ at $a$. This is what we implement in our forward-mode automatic differentiation code.
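The arithmetic above can be sketched with a small class. `DualNumber` below is a hypothetical, stripped-down illustration written for this section; it is not the package's `Dual` class and supports only addition and multiplication.

```python
class DualNumber:
    """Hypothetical minimal dual number a + b*eps with eps**2 == 0."""

    def __init__(self, real, dual=0.0):
        self.real = real
        self.dual = dual

    def __add__(self, other):
        other = other if isinstance(other, DualNumber) else DualNumber(other)
        return DualNumber(self.real + other.real, self.dual + other.dual)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, DualNumber) else DualNumber(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps**2 == 0
        return DualNumber(self.real * other.real,
                          self.real * other.dual + self.dual * other.real)

    __rmul__ = __mul__


x = DualNumber(3.0, 1.0)   # a = 3, b = 1
y = x * x                  # y = x^2
print(y.real, y.dual)      # 9.0 6.0
```

With $a = 3$ and $b = 1$, the real part is $a^2 = 9$ and the dual part is $2ab = 6$, matching the derivation.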
 
 
### 2.5 Reverse Mode

Reverse mode is another way to carry out automatic differentiation. Its mathematical background is given below.

#### 2.5.1 Explanation and Background

In forward mode, the chain rule is applied implicitly: we follow the evaluation trace from the inputs to the output and combine the derivatives of the elementary functions along the way. In reverse mode, we apply the chain rule explicitly in the opposite direction, from the output back to the inputs. 

Reverse mode still requires a forward evaluation trace, which provides the values and the partial derivatives of the elementary functions. We then use the chain rule to work backwards to the final derivative.

The steps for reverse mode, based on the forward evaluation trace of $f = v_1*z$ with $v_1 = x+y$ at the point (1,1,1), are as follows.

- STEP1: Start with $v_1$

   $$\overline{v_1} = \dfrac{\partial f}{\partial v_1} = z = 1.$$

- STEP2: Use the chain rule to calculate $\overline{x}$

   $$\overline{x} = \dfrac{\partial f}{\partial v_1}\dfrac{\partial v_1}{\partial x}  = 1 \cdot 1 = 1.$$

- STEP3: Get the derivative with respect to $x$

   $$\overline{x} = \dfrac{\partial f}{\partial x}  = 1.$$
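The same steps can be written out by hand in plain Python. This is a sketch of the reverse pass for this one example only, with adjoint variable names such as `x_bar` chosen for illustration; it is not the package's reverse-mode implementation.

```python
# Forward pass: evaluate and store the intermediate values.
x, y, z = 1.0, 1.0, 1.0
v1 = x + y
f = v1 * z

# Reverse pass: propagate adjoints from the output back to the inputs.
f_bar = 1.0                 # df/df
v1_bar = f_bar * z          # df/dv1 = z
z_bar = f_bar * v1          # df/dz = v1
x_bar = v1_bar * 1.0        # dv1/dx = 1
y_bar = v1_bar * 1.0        # dv1/dy = 1

print(x_bar, y_bar, z_bar)  # 1.0 1.0 2.0
```

One reverse pass yields the derivatives with respect to all three inputs, whereas forward mode needs one pass per input direction.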

**Computational graph**

In reverse mode, we apply the chain rule in reverse order to get the derivatives. The computational graph for reverse mode is as follows.

<img src="reverse_mode.png">

*Note: If you cannot see the image, please right click and open image in new tab*




## 3. Usage


### 3.1 Installation

There are two ways to install the `AutoDiff` package. 

Method 1 installs the package directly into your local environment through pip.

Method 2 creates a virtual environment, installs the package through pip, and also clones the GitHub repository to obtain `demo.py`.

The package is distributed through TestPyPI at https://test.pypi.org/project/autodiff-merlionctc/


* **Method 1: install the package using pip** 

    To install AutoDiff directly using pip, type the following in the terminal:

    ```bash
    pip install -i https://test.pypi.org/simple/ autodiff-merlionctc
    ```

    This will install all required modules as dependencies.

    Next, you can start using the AutoDiff package by following the `demo.py` examples in the GitHub repository or the **3.2 How to Use** section:
    
    ```python
    >>> from autodiff.model import *
    >>> from autodiff.dual import *
    >>> from autodiff.elementary import *
    >>> from autodiff.symbolic import *
    # (begin automatic differentiation)
    >>> quit()
    ```


* **Method 2: install using virtual environment**
    
    To get started, clone our project and test it using `demo.py`.

    First, download the package from GitHub to your folder:

    ```bash
    mkdir test_merlionctc
    cd test_merlionctc
    git clone https://github.com/merlionctc/cs107-FinalProject.git
    cd cs107-FinalProject
    ```

    Create a virtual environment and activate it:

    ```bash
    # If you do not have virtualenv, install it
    pip install virtualenv
    # Create virtual environment
    virtualenv ac207
    # Activate your virtual environment
    source ac207/bin/activate
    ```

    To make sure your environment has all required packages set up, run:

    ```bash
    pip install -r requirements.txt
    ```

    To install our package inside the virtual environment, run:
    
    ```bash
    pip install -i https://test.pypi.org/simple/ autodiff-merlionctc
    ```

    To run the demo we provide:

    ```bash
    python3 AutoDiff/demo.py
    ```
    
    If you want to quit the virtual environment:

    ```bash
    deactivate
    ```
    

### 3.2 How to Use

Here is an example that serves as a quick-start tutorial for forward mode.

**Disclaimer: For usage of reverse mode and symbolic differentiation, please refer to Section 6 Extensions.**

After installing the AutoDiff package (see Section 3.1), import the modules:

```python
>>> from autodiff.dual import *
>>> from autodiff.elementary import *
>>> from autodiff.model import *
>>> from autodiff.symbolic import *
>>> import numpy as np
```

First Step: Instantiate the variables
* val: the value of the variable that you start with
* der: the value of the derivative of the variable that you start with, usually 1
* loc: the location/index of this variable when there are multiple input variables for the target function(s). For example, if you initialize x1 first, its loc is 0; if you then initialize y1, its loc is 1
* length: the total number of input variables for the target function(s). For example, if you want to initialize x1, y1 and z1, the length is 3 for each variable

```python
>>> x1 = Dual(val = 1, der=1, loc = 0, length = 3)
>>> y1 = Dual(val = 2, der=1, loc = 1, length = 3)
>>> z1 = Dual(val = 5, der=1, loc = 2, length = 3)
```

Second Step: Define the function in terms of the variables above
```python
>>> f1 = 3 * x1 + 4 * y1 * 2 - cos(z1)
```

Third Step: Instantiate the `autodiff.Forward` class 
```python
>>> fwd_test = Forward(f1)
```

Fourth Step: Optionally call the instance method `get_value()` to get the value of the function
```python
>>> print(fwd_test.get_value())
18.716337814536775
```

Fifth Step: Optionally call the instance method `get_der()` to get the derivatives of the function

Note: This method returns a derivative vector w.r.t. ALL variables. 

Note 2: If the user enters a scalar function, `get_der` returns its Jacobian
```python
>>> print(fwd_test.get_der())
[ 3.          8.         -0.95892427]
```

Sixth Step: Optionally call the instance method `get_der(var)` to get the derivatives of the function

Note: This method returns a derivative vector w.r.t. the specific variables you pass in

```python
>>> print(fwd_test.get_der(x1))
[3.0]
```

Seventh Step: You can also input multiple functions of multiple variables and call `get_der()` and `get_jacobian()`. 

```python
y1 = Dual(val = np.pi, der=1, loc = 1, length = 3)
f2 = (tanh(cos(sin(y1))**z1) + logistic(z1**z1, 2, 3, 4))**(1/x1)
f3 = exp(arccos(tan(sin(y1))) + logb(z1**(1/2), 1/5)*sinh(x1))

# User should use list to combine multiple functions together
fwd_test_multiple = Forward([f1, f2, f3])

# User could choose single/several variables to get derivatives
print(fwd_test_multiple.get_der(x1, y1))

# User could get the jacobian matrix of multiple functions
# Note: the order displayed in the Jacobian Matrix is matched with the order of input functions(as row) and the input variables(as column)
print(fwd_test_multiple.get_jacobian())
```

To quit the interpreter:
```python
>>> quit()
```

Note: Our example generalizes to vector input. If x and y are vectors, we can decompose them into \[x0 x1] and \[y0 y1].
To evaluate f = ax + by, we can equivalently use a list of functions \[f1 f2], where f1 = a\*x0+b\*y0 and f2 = a\*x1+b\*y1.
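The decomposition can be illustrated with plain NumPy, independent of the package. The values of `a`, `b`, `x`, and `y` below are arbitrary choices made for this sketch.

```python
import numpy as np

a, b = 2.0, 3.0
x = np.array([1.0, 4.0])   # [x0, x1]
y = np.array([5.0, 6.0])   # [y0, y1]

# Vector form of f = a*x + b*y.
f_vec = a * x + b * y

# Decomposed form: one scalar function per component.
f1 = a * x[0] + b * y[0]
f2 = a * x[1] + b * y[1]
print(f_vec, [f1, f2])

# Jacobian of [f1, f2] with respect to the scalar inputs (x0, x1, y0, y1).
jacobian = np.hstack([a * np.eye(2), b * np.eye(2)])
print(jacobian)
```

Each row of the Jacobian corresponds to one component function and each column to one scalar input, which is the layout a list of scalar functions would produce.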



## 4. Software Organization

This section describes how the software package is organized, starting with the directory structure. 

### 4.1 Directory Structure

```
AutoDiff
│   README.md
│   .travis.yml  
│   .coverage
│   requirements.txt
│
└───src
│   │
│   └─── autodiff
│   │   │   
│   │   └─── symbolic
│   │   │   │  __init__.py (wrapper function to use Expression class)
│   │   │   │  expression.py (symbolic differentiation)
│   │   │   
│   │   │   __init__.py
│   │   │   model.py (AutoDiff main class with Forward mode)
│   │   │   elementary.py (Elementary Function module)
│   │   │   dual.py (Dual Number Class for Forward Class)
│   │
│   └─── deprecated (some deprecated implementation of reverse)
│
└───tests
│   │   test_autodiff.py
│   │   test_dual_class.py
│   │   test_elementary.py
│   │   test_symbolic.py
│   │   ...
└───Docs
│   │   README.md
│   │   milestone1.ipynb
│   │   milestone2_progress.ipynb
│   │   milestone2.ipynb
│   │   documentation.ipynb
│   │   ...
│     
│   
└───demo.py (A demo to usage of package)

```

### 4.2 Modules to Include

 External Python libraries/modules:

 *math*: mathematical and algebraic operations (Python standard library)

 *numpy*: supports computations for large, multi-dimensional arrays and matrices. 

 Self-designed libraries/modules under the `AutoDiff` package:

 *elementary*: contains overwritten elementary functions customized for the different variable classes defined for the different types of differentiation.

 *model*: an interface-like main class with Forward/Reverse mode automatic differentiation methods. The methods are also usable for symbolic differentiation.

 *dual*: contains the class methods for the dual number class, which is the basic class structure in forward mode.

 *expression*: contains the class methods and child classes of the expression class, which is the basic class structure of symbolic differentiation.


### 4.3 Test Suite Design

 We used pytest for our test suite. We tested every instance method of all of our implemented classes.
 We also tested special and corner cases such as simplification and real-number inputs. 
 The whole test suite is included in the *tests* sub-directory. 
 Both TravisCI and CodeCov are used for continuous integration and to check code coverage.

### 4.4 Package Distribution

  * Package Distribution

    We distribute our software using the [Test Python Package Index (TestPyPI)](https://test.pypi.org/).
    The project structure follows [PyScaffold](https://pypi.org/project/PyScaffold/).
    
    The package could be found here: https://test.pypi.org/project/autodiff-merlionctc/

  * Software package

    We packaged our module using the standard packaging tool ([setuptools](https://packaging.python.org/key_projects/#setuptools)), following the Packaging Python Projects [Tutorial](https://packaging.python.org/tutorials/packaging-projects/).
    As mentioned in 3.1 **_Installation_**, the user can use `pip install` to install the package together with its dependencies.

  





## 5. Implementation

### 5.1 Forward Mode Implementation

Forward-mode automatic differentiation works for real-valued functions built from the supported operations and elementary functions.

#### 5.1.1 Core Data Structures

* The user starts by initializing variables using the `Dual` class.

* NumPy array:
  The `Dual` class and its operations are built on NumPy arrays.
  The elementary functions also operate on NumPy arrays and real numbers.
  We store each variable's value and derivative values in NumPy arrays, so that an elementary operation can be applied to an entire array instead of just a scalar value.
  
* The point of differentiation is supplied when a `Dual` number is instantiated: the variable's value is the point at which the derivative is evaluated.
  The user also has to supply the total number of variables and the position of the current variable.

* For forward mode, the function is built directly from operations on the `Dual` objects initialized above.
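The role of the position and the total number of variables can be sketched with NumPy alone. The helper `seed` below is hypothetical and written only for this illustration; it mirrors the `der`, `loc`, and `length` arguments from Section 3.2 but is not the package's code. The function is the `f1` from the quick-start example.

```python
import numpy as np

def seed(der, loc, length):
    """Gradient vector of an input variable: `der` at index `loc`, zeros elsewhere."""
    grad = np.zeros(length)
    grad[loc] = der
    return grad

# (value, gradient) pairs for three inputs.
x_val, x_grad = 1.0, seed(1.0, 0, 3)
y_val, y_grad = 2.0, seed(1.0, 1, 3)
z_val, z_grad = 5.0, seed(1.0, 2, 3)

# f = 3*x + 4*y*2 - cos(z), differentiated term by term on whole arrays.
f_val = 3 * x_val + 4 * y_val * 2 - np.cos(z_val)
f_grad = 3 * x_grad + 8 * y_grad + np.sin(z_val) * z_grad

print(f_val)
print(f_grad)
```

Because every variable carries a gradient array of the same length, one pass through the function produces the derivative with respect to all inputs at once.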
  

#### 5.1.2 Implemented Classes

* `AutoDiff`: The base class for declaring a function that we wish to differentiate automatically.

* `Forward(AutoDiff)`: Our forward automatic differentiation class. This class implements differentiation and calculates derivatives and Jacobians through forward mode.
 
* `Dual`: the dual number class. This class takes in any real-number variable and returns it as a dual number;
  all subsequent operations in AutoDiff are done on the Dual class.
  
  The class also contains operations (addition, subtraction, multiplication, division, power, etc.) between dual numbers and real numbers.
 

#### 5.1.3 Class Details

`AutoDiff`:

* Description: This class declares a function to be differentiated. It provides a super method of differentiable function initialization for our Forward, Reverse, and Symbolic differentiation classes. 

* Attributes: `self.f`: the function to be differentiated. It should be a list; otherwise we convert the user input into a list. 
We pass this function into a child class of `AutoDiff`, for example `Forward`.

* Methods: 
  
  *  \__init__: initialization

`Forward(AutoDiff)`:

* Description: This class initializes a forward AD object and contains methods for getting values, derivatives, and Jacobians

* Attributes: `self.f` is the function that we passed in; it is a function of variables, ready to be differentiated.

* Methods: 
  
  * \__init__: initialization, inherited from the super class
  * get_value: calculate the value of f through forward mode at the specified point
  * get_der: calculate the derivative of f through forward mode with respect to all variables or specific directions
  * get_jacobian: calculate the Jacobian matrix of the list of functions f through forward mode with respect to all variables; the user can specify the direction

`Dual`:

* Description: The Dual class supports custom operations for automatic differentiation (AD) in forward mode. It contains overwritten dunder methods for basic operations
(addition, multiplication, exponentiation and so on)

* Attributes:

  `self.val`: The evaluation values of the functions or the initialized values for the variable(dual object)

  `self.der`: The derivatives of the functions or the initialized derivative of the variable. When there are multiple variables to be input in a function, the derivatives will be an
  array to store the derivatives of different variables separately.

  `self.loc`: The location/index of this variable when there are multiple input variables for the target function(s).

  `self.length`: The length/number of the total variables that will be input when there are multiple input variables for the target function(s).

* Methods:
 
  * \__init__: initialize a Dual object, with its values, derivatives, location and length.
  * \__repr__: Prints self in the form of Dual(value = [val], derivative = [der])
  * \__pos__: Returns the positive of self
  * \__neg__: Returns the negative of self
  * \__add__: Returns the addition of self and other, other can be Dual object, float, or int
  * \__radd__: Returns the addition of other and self, other can be Dual object, float, or int
  * \__sub__: Returns the subtraction of self and other, other can be Dual object, float, or int
  * \__rsub__: Returns the subtraction of other and self, other can be Dual object, float, or int
  * \__mul__: Returns the multiplication of self and other, other can be Dual object, float, or int
  * \__rmul__: Returns the multiplication of other and self, other can be Dual object, float, or int
  * \__truediv__: Returns the division of self and other, other can be Dual object, float, or int
  * \__rtruediv__: Returns the division of other and self, other can be float, or int
  * \__pow__: Returns the power of self raised by other, other can be Dual object, float, or int
  * \__rpow__: Returns the power of other raised by self, other can be float, or int 
  * \__eq__: Returns a boolean indicating whether two objects have equal value 
  * \__ne__: Returns a boolean indicating whether two objects DO NOT have equal value
  * \__lt__: Returns a boolean indicating whether the former object is less than the latter
  * \__le__: Returns a boolean indicating whether the former object is less than or equal to the latter
  * \__gt__: Returns a boolean indicating whether the former object is greater than the latter
  * \__ge__: Returns a boolean indicating whether the former object is greater than or equal to the latter
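The reflected methods (`__rsub__`, `__rtruediv__`, `__rpow__`) are the ones Python calls when a plain number appears on the left of a dual number. The sketch below shows the derivative rule each one would need; the class `D` is a hypothetical scalar stand-in written for this illustration and is not the package's `Dual` class.

```python
import math

class D:
    """Hypothetical scalar dual number used only for this illustration."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __rsub__(self, other):
        # other - self, with `other` a plain number: d(c - u) = -u'
        return D(other - self.val, -self.der)

    def __rtruediv__(self, other):
        # other / self: d(c / u) = -c * u' / u**2
        return D(other / self.val, -other * self.der / self.val ** 2)

    def __rpow__(self, other):
        # other ** self: d(c ** u) = ln(c) * c**u * u'  (requires c > 0)
        val = other ** self.val
        return D(val, math.log(other) * val * self.der)


x = D(2.0, 1.0)
r1 = 10 - x      # value 8.0, derivative -1.0
r2 = 1 / x       # value 0.5, derivative -0.25
r3 = 2 ** x      # value 4.0, derivative 4 * ln(2)
print(r1.val, r1.der, r2.val, r2.der, r3.val, r3.der)
```

Subtraction, division, and exponentiation are not commutative, so their reflected versions need their own derivative rules, unlike `__radd__` and `__rmul__`, which can reuse the non-reflected ones.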


#### 5.1.4 External Dependencies

We have the following external libraries/Modules to include:

`NumPy`: This provides an API for a large collection of high-level mathematical operations. In addition, it provides support for array operations.

`pytest`: a more Pythonic alternative for writing tests that makes small tests easy to write. We use it for our test suite.

We also use TravisCI and CodeCov to track build status and code coverage.



### 5.2 Symbolic Reverse Mode

(For more detail, please consult _6.3 Symbolic Expression_)

The `Expression` class is an abstract representation of a mathematical function. For instance, one kind of expression represents functions of the form ``a+b``.
In addition to the arithmetic expressions, there are two further kinds of expression, namely `Symbol` and `Constant`. 

Each expression knows how to evaluate itself given a value for each `Symbol`.

Each expression also knows how to differentiate itself w.r.t. any given `Symbol`; this differentiation returns another expression.

Because the derivative is itself an expression, it can be differentiated again, so higher-order derivatives come naturally.

To get a value, derivative, or Jacobian, we just need to evaluate the expression, derivative expression, or Jacobian expression at the given values.

Conceptually, by maintaining a syntax tree, this implementation also meets our definition of `reverse mode differentiation`, in the sense that each expression asks its dependencies for their derivatives and combines the results only when `diff()` is called.
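The idea of expressions that evaluate and differentiate themselves can be sketched as follows. The class names `Expr`, `Const`, `Sym`, `Add`, and `Mul` and the helper `wrap` are hypothetical and deliberately different from the package's classes; this is a minimal illustration under those assumptions, covering only addition and multiplication.

```python
class Expr:
    """Hypothetical base class: operators build a syntax tree."""
    def __add__(self, other):
        return Add(self, wrap(other))
    def __mul__(self, other):
        return Mul(self, wrap(other))

class Const(Expr):
    def __init__(self, value):
        self.value = value
    def evaluate(self, values):
        return self.value
    def diff(self, var):
        return Const(0.0)

class Sym(Expr):
    def __init__(self, name):
        self.name = name
    def evaluate(self, values):
        return values[self]          # values maps symbols to numbers
    def diff(self, var):
        return Const(1.0 if var is self else 0.0)

class Add(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right
    def evaluate(self, values):
        return self.left.evaluate(values) + self.right.evaluate(values)
    def diff(self, var):
        return Add(self.left.diff(var), self.right.diff(var))

class Mul(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right
    def evaluate(self, values):
        return self.left.evaluate(values) * self.right.evaluate(values)
    def diff(self, var):
        # Product rule, returned as a new (unsimplified) expression tree.
        return Add(Mul(self.left.diff(var), self.right),
                   Mul(self.left, self.right.diff(var)))

def wrap(value):
    """Turn plain numbers into Const nodes."""
    return value if isinstance(value, Expr) else Const(float(value))

x = Sym("x")
f = x * x * x + x * 2     # f = x^3 + 2x
df = f.diff(x)            # first derivative, still an expression
d2f = df.diff(x)          # second derivative
point = {x: 2.0}
print(f.evaluate(point), df.evaluate(point), d2f.evaluate(point))  # 12.0 14.0 12.0
```

Since `diff` returns another expression, calling it twice gives the second derivative without any extra machinery, which is the property described above.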


#### 5.2.1 Core Data Structures


* `Expression` class:
  Our core data structure is the `Expression` class; its operations are self-contained.

* Dictionary: the values at which the expression is evaluated are entered as a dictionary.
  
* NumPy array: We store each variable's value and derivative values in NumPy arrays, so that an elementary operation can be applied to an entire array instead of just a scalar value.

The point of differentiation is supplied as a dictionary that maps each `Symbol` to its value. For symbolic differentiation, the function is built directly from operations on the `Expression` objects initialized above.
  

#### 5.2.2 Implemented Classes

* `Expression`: The parent class for declaring an Expression object (function) that we wish to differentiate.

* `Constant(Expression)`: a child class of Expression. A constant evaluates to itself, and its derivative is zero.
Printing the class instance results in a string, which can be concatenated into the Expression.

* `Symbol(Expression)`: a child class of Expression that initializes the symbol of a variable (x, y, z, etc.), which can be concatenated into the Expression.
  
  Note that the parent class also contains operations (addition, subtraction, multiplication, division, power, etc.) between expressions and real numbers.
 
* Operation-specific classes: each basic arithmetic operation or elementary function is a subclass of `Expression`. These include:

  `SumExpression`, `ProductExpression`, `DivisionExpression`,

  `LnExpression`, `PowerExpression`,

  `SinExpression`, `CosExpression`, `TanExpression`,

  `SinhExpression`, `CoshExpression`, `TanhExpression`,

  `ArcsinExpression`, `ArccosExpression`, `ArctanExpression`

  In 5.2.3 we give the details of one operation-specific class as an example.


#### 5.2.3 Class Details

`Expression`:

* Description: The parent class for declaring an Expression object (function) that we wish to differentiate

* Attributes: `self`, the Expression. `values`, a dictionary whose keys are variable symbols (x, y, z, etc.) and whose values are the floats at which each variable is evaluated

* Methods: 
  
  *  evaluate: Evaluate the value of this Expression with the given values of variables
  *  _symdiff: Display the symbolic representation of the derivative of this Expression
  *  \__call__: Special method that lets an Expression instance be called like a function; it uses the `evaluate` method and returns the derivative value of the instance
  *  \__add__: Addition on Expression
  *  \__radd__: Addition (commutative) on Expression
  *  \__sub__: Subtraction on Expression
  *  \__rsub__: Subtraction on Expression
  *  \__mul__: Multiplication on Expression
  *  \__rmul__: Multiplication (commutative) on Expression
  *  \__truediv__: Division on Expression
  *  \__rtruediv__: Division on Expression
  *  \__pow__: Exponentiation on Expression
  *  \__rpow__: Exponentiation on Expression
  *  \__neg__: Negation on Expression
  
  Notice that we DO NOT have to overwrite the comparison methods, because the operations in the `Expression` class are self-contained: we do not need to compare values until
  the very last moment, when we call the `evaluate` method. If we want to compare at some point, we can call the `evaluate` method and compare the resulting values using 
  Python's built-in comparison operators.

An example of an arithmetic Expression class:
`SumExpression(Expression)`:

* Description: A child class of `Expression` that represents addition, that is, functions of the form ``a+b``. 

* Attributes: `self.operand`: the expression(s) being added, set in `__init__`.

* Methods: 
  
  * \__init__: initialize `self.operand`
  *  evaluate: Evaluate the value of this Expression with the given values of variables
  *  _symdiff: display the symbolic representation of the derivative of this Expression
  * \__str__: print out the addition expression

#### 5.2.4 External Dependencies

We have the following external libraries/Modules to include:

`NumPy`: This provides an API for a large collection of high-level mathematical operations. In addition, it provides support for array operations.

`math`: This provides access to additional mathematical functions.

`__future__.annotations`: makes it easier to annotate inputs and outputs with a class type

`pytest`: a more Pythonic alternative for writing tests that makes small tests easy to write. We use it for our test suite.

We also use TravisCI and CodeCov to track build status and code coverage.


### 5.3 Handling Elementary Functions

The elementary module is used by both the forward implementation and the symbolic (reverse) implementation, so we describe it in its own section. 
  
 For each elementary function, we write our own implementation so that the function can be applied to a Dual number, an Expression object, or a real number. 
 For example, the sine of a real number `x` can be computed with `NumPy` (np.sin(x)). When we compute the sine of one of our self-defined classes (Dual or Expression), 
 we keep track of both the value (the same value that np.sin(x) gives) and the derivative part. 

 We have implemented the following elementary functions and we provide a demo function for reference.
 

The functions we have implemented so far are listed below, where `*` represents the input:

  * Exponentiation

    `exp(*)`: extends the exponential function to a self-defined object

    Note on `__pow__(*)`: the dunder method on the Dual class and the PowerExpression class for Expression overwrite the power operator, so any base can be raised to any power

  * Square root

    `sqrt(*)`: extends the square root function to a self-defined object

  * Trig functions

    `sin(*)`: extends the sine function to a self-defined object

    `cos(*)`: extends the cosine function to a self-defined object

    `tan(*)`: extends the tangent function to a self-defined object

  * Inverse trig functions

    `arcsin(*)`: extends the inverse sine function to a self-defined object

    `arccos(*)`: extends the inverse cosine function to a self-defined object

    `arctan(*)`: extends the inverse tangent function to a self-defined object

  * Hyperbolic functions

    `sinh(*)`: extends the hyperbolic sine function to a self-defined object

    `cosh(*)`: extends the hyperbolic cosine function to a self-defined object

    `tanh(*)`: extends the hyperbolic tangent function to a self-defined object

  * Logistic function

    `logistic(*)`: extends the logistic function to a self-defined object; the default is a standard sigmoid

  * Logarithms

    `log(*)`: extends the natural log to a self-defined object

    `logb(*, base)`: extends the log function with any base to a self-defined object



Here is one example:

```python
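def sin(var):
    """Calculate the sine of the input.

    Parameters
    ----------
    var : Dual, Node, Expression, or real number

    Returns
    -------
    The sine value sin(var): a Dual for a Dual input, an Expression for an
    Expression input, and a real number otherwise.

    Examples
    --------
    >>> sin(Dual(np.pi, 1))
    Dual(value=1.2246467991473532e-16, derivative=-1.0)
    """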
    if isinstance(var, Dual):
        der = np.cos(var.val) * var.der
        val = np.sin(var.val)
        return Dual(val, der)

    elif isinstance(var, Expression):
        return SinExpression(var)

    else:
        return np.sin(var)
```
If we call the above function, it will give the following output. 

```python
#...import necessary dependencies
x = Dual(np.pi, 1)
z = sin(x)
print(z)
```
We will get:
```python
Dual(value=1.2246467991473532e-16, derivative=-1.0)
```

Notice that `z` is a Dual object. What if the input is an Expression object?

```python
x = symbols('x')
z = sin(x ** 2)
values = {x: 3}
print(z.evaluate(values))
```
We will get:
```python
0.4121184852

```


This function also applies to real numbers:

```python
#...import necessary dependencies
x_real = np.pi
z_real = sin(x_real)
print(z_real)


```
We will get:
```python
1.2246467991473532e-16
```
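
The same dispatch pattern carries over to the other elementary functions. Below is a minimal, self-contained sketch for `exp`. The `Dual` class here is a simplified stand-in written only for this example (an assumption, not the package's actual class), and the Expression branch is left out.

```python
import math


class Dual:
    """Simplified stand-in for the package's Dual class (hypothetical)."""

    def __init__(self, val, der):
        self.val = val
        self.der = der

    def __repr__(self):
        return f"Dual(value={self.val}, derivative={self.der})"


def exp(var):
    """Exponential that accepts either a Dual or a real number."""
    if isinstance(var, Dual):
        val = math.exp(var.val)
        # Chain rule: d/dx exp(u) = exp(u) * u'
        return Dual(val, val * var.der)
    return math.exp(var)


print(exp(Dual(0.0, 1.0)))  # Dual(value=1.0, derivative=1.0)
print(exp(0.0))             # 1.0
```

The value and the derivative are computed together in one call, which is what lets forward mode carry derivatives along the evaluation trace.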




## 6. Extensions: Symbolic Reverse Mode 
### 6.1 Future Features from Milestone2 with updates

**Feedback:**
"It looks like you have thought through some of the implementation details of reverse mode. One point of caution: the reverse mode does not have a mathematical correspondence with the dual numbers. The procedure that 
you describe actually overloads a function value, interprets this overloaded value as a dual number, and then proceeds with creation of the graph required for reverse mode. Again, this is not a dual number implementation of the reverse mode."

**Response:**
Thanks! Based on your feedback, we made three main adjustments to our library and to the previous future-features section.

  1. Instead of using the Dual class to implement reverse mode, we created a new Expression class that handles the symbolic representation of functions and their differentiation.

  2. With the Expression class, we implemented symbolic expressions for a function and its derivatives, including higher-order derivatives.
  
  3. We have not implemented a user interface or visualization yet; we will add them in the future.


* **Jacobian of multiple functions and multiple variables**

  Previously, our package could only handle a single function of multiple variables. It now also handles multiple functions of multiple variables.
  To do so, we allow the value and derivatives to be arrays and combine the derivatives of the different functions and variables.

*  **Reverse Mode of Auto Differentiation class (discussed in detail in 6.2)**

  Our package can calculate derivatives through the forward mode of automatic differentiation. However, in many machine learning tasks it is important to have reverse mode as well, for example to customize gradient descent.
  For the final deliverable, we created a new Expression class that stores the expression of the variables/sub-functions; for each operation/elementary function, the value and gradient are stored.
  We can then calculate the derivatives of the function by applying the chain rule from the outer expression to the inner ones.

*  **Expression and Visualization**

  We build the symbolic representation of a function and of its derivatives, store it in a tree (or another suitable data structure), and recombine the symbolic pieces after differentiation.
  This may come with higher computational complexity.
  In addition, if possible, we plan to visualize the computational graph from Python via a visualization package, e.g. d3.
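
To illustrate the first item (a Jacobian for multiple functions of multiple variables), here is a minimal sketch of forward mode with array-valued derivatives. The `Dual` class and the `_lift` helper are simplified stand-ins written for this example and are assumptions, not the package's actual code. Each input is seeded with a unit vector, and stacking the derivative parts of the outputs gives the Jacobian.

```python
import numpy as np


class Dual:
    """Hypothetical minimal dual number whose derivative part is a vector."""

    def __init__(self, val, der):
        self.val = val
        self.der = np.asarray(der, dtype=float)

    def __add__(self, other):
        other = _lift(other, self.der.shape)
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        other = _lift(other, self.der.shape)
        # Product rule applied to every partial derivative at once
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    # Addition and multiplication are commutative, so reuse them
    __radd__ = __add__
    __rmul__ = __mul__


def _lift(x, shape):
    """Wrap a real number as a Dual with a zero derivative vector."""
    return x if isinstance(x, Dual) else Dual(x, np.zeros(shape))


# Seed each input with its own unit vector: x is variable 0, y is variable 1
x = Dual(2.0, [1.0, 0.0])
y = Dual(3.0, [0.0, 1.0])

functions = [x * y, 3 * x + y]

# Rows follow the functions, columns follow the variables
jacobian = np.vstack([f.der for f in functions])
print(jacobian)
```

At the point (2, 3) the first row is (y, x) = (3, 2) and the second row is (3, 1), which matches the partial derivatives computed by hand.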



### 6.2 Background and Information: Reverse Mode

Recall from the background section that reverse mode is another way of performing automatic differentiation. The mathematical background is given below.

#### 6.2.1 Explanation and Background

Forward mode applies the chain rule implicitly: we follow the evaluation trace from the inputs to the output and carry the derivative of each elementary function along with its value. Reverse mode applies the chain rule in the opposite direction, from the output back to the inputs.

Reverse mode still requires a forward evaluation trace, which provides the values and the derivatives of the elementary functions. We then use the chain rule to work backwards to the final derivative.

The steps we use to implement reverse mode, based on the forward evaluation trace, are as follows.

- STEP1: Start with $v_1$

   $$\overline{v_1} = \dfrac{\partial f}{\partial v_1} = 1.$$

- STEP2: Use the chain rule to calculate $\overline{x}$

   $$\overline{x} = \dfrac{\partial f}{\partial v_1}\dfrac{\partial v_1}{\partial x}  = 1.$$

- STEP3: Get the derivative on x

   $$\overline{x} = \dfrac{\partial f}{\partial x}  = 1.$$
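
The same three steps can be traced by hand on a slightly larger function. The sketch below is an illustration only and does not use the package: it picks $f(x, y) = xy + \sin(x)$ as an assumed example, runs the forward pass to record the intermediate values, and then accumulates the adjoints (the barred quantities) from the output back to the inputs.

```python
import math

# Forward pass: evaluate f(x, y) = x * y + sin(x) and keep the trace
x, y = 2.0, 3.0
v1 = x * y
v2 = math.sin(x)
f = v1 + v2

# Reverse pass: start at the output and apply the chain rule backwards
f_bar = 1.0               # df/df
v1_bar = f_bar * 1.0      # df/dv1, since f = v1 + v2
v2_bar = f_bar * 1.0      # df/dv2

# x feeds into both v1 and v2, so its adjoint sums both paths
x_bar = v1_bar * y + v2_bar * math.cos(x)
y_bar = v1_bar * x

print(x_bar, y_bar)       # df/dx = y + cos(x), df/dy = x
```

One reverse sweep yields the derivative with respect to every input, which is why reverse mode suits functions with many inputs and few outputs.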

**Computational graph**

In reverse mode, we apply the chain rule backwards through the graph to get the derivatives. The computational graph for reverse mode is as follows.

<img src="reverse_mode.png">

*Note: If you cannot see the image, please right click and open image in new tab*

#### 6.2.2 Implementation (Detail in 6.3)
To implement reverse mode, we create an Expression class to store the traces in the computational graph and modify the elementary functions so that they accept Expression objects.

Conceptually, by maintaining a syntax tree, this implementation also meets the definition of `reverse mode differentiation` in the sense that each expression asks its dependencies for their results and combines them only when the differentiation method is called.


### 6.3 Symbolic Expression (Reverse Diff and Higher Order Derivative) 
As an extension, we built an Expression class to implement the symbolic representation of our functions.

#### 6.3.1 Implementation
The Expression class is an abstract representation of a mathematical function. For instance, some expressions represent functions of the form ``a+b``.
In addition to the arithmetic expressions, there are two further expressions, namely `Symbol` and `Constant`.

Each expression knows how to evaluate itself against a given set of values, one for each `Symbol`.

Each expression also knows how to differentiate itself w.r.t. any given `Symbol`; this differentiation returns an expression.

Because differentiating an expression returns another expression, each expression naturally supports higher-order differentiation.

To get a value, derivative, or Jacobian, we just need to evaluate the expression, the derivative expression, or the Jacobian expressions at the given values.

Conceptually, by maintaining a syntax tree, this implementation also meets our definition of `reverse mode differentiation` in the sense that each expression asks its dependencies for their results and combines them only when `diff()` is called.
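
The design described above can be sketched in a few dozen lines. The classes below are a simplified illustration written for this document, not the package's actual implementation: only `Symbol`, `Constant`, addition, and multiplication are covered, the helper `_wrap` is hypothetical, and `diff` is written as a method here for brevity.

```python
class Expression:
    """Hypothetical base class: nodes can evaluate and differentiate themselves."""

    def __add__(self, other):
        return Add(self, _wrap(other))

    def __mul__(self, other):
        return Mul(self, _wrap(other))

    # Both operations are commutative, so the reflected versions can reuse them
    __radd__ = __add__
    __rmul__ = __mul__


class Constant(Expression):
    def __init__(self, value):
        self.value = value

    def evaluate(self, values):
        return self.value

    def diff(self, symbol):
        return Constant(0)


class Symbol(Expression):
    def __init__(self, name):
        self.name = name

    def evaluate(self, values):
        return values[self]

    def diff(self, symbol):
        return Constant(1 if symbol is self else 0)


class Add(Expression):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, values):
        return self.left.evaluate(values) + self.right.evaluate(values)

    def diff(self, symbol):
        return Add(self.left.diff(symbol), self.right.diff(symbol))


class Mul(Expression):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, values):
        return self.left.evaluate(values) * self.right.evaluate(values)

    def diff(self, symbol):
        # Product rule, returned as a new expression tree
        return Add(Mul(self.left.diff(symbol), self.right),
                   Mul(self.left, self.right.diff(symbol)))


def _wrap(x):
    """Turn a plain number into a Constant node."""
    return x if isinstance(x, Expression) else Constant(x)


x, y = Symbol("x"), Symbol("y")
f = x * x * y + 3 * x
values = {x: 2, y: 5}

print(f.evaluate(values))                  # 26
print(f.diff(x).evaluate(values))          # 2*x*y + 3 = 23
print(f.diff(x).diff(x).evaluate(values))  # 2*y = 10
print(f.diff(x).diff(y).evaluate(values))  # 2*x = 4
```

Because `diff` returns another expression tree rather than a number, higher-order and mixed derivatives come from calling it repeatedly, with no extra machinery.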


#### 6.3.2 Functionality
* **Evaluate**: An Expression can be reused and evaluated at a given dictionary of values.

* **Differentiation**: An Expression can be differentiated w.r.t. a given Symbol; this returns an Expression.

* **Evaluate a Differentiated Expression**: We can evaluate the differentiated expression at given values; this returns the derivative w.r.t. the Symbol specified in `diff()`.

* **Higher Order Differentiation**: This module also naturally supports higher-order differentiation, in the sense that the user can repeatedly call `diff()` on an expression w.r.t. different variables/symbols and evaluate the result at a given value dictionary.

* **Get Jacobian Matrix**: This module can also output the Jacobian matrix of a list of expressions for a list of Symbols.

#### 6.3.3 Usage

First Step: The user instantiates the variables.
You can initialize multiple variables together with the wrapper method,
or initialize individual symbols with the Symbol class.

```Python
x, y, z = symbols('x y z')
x1 = Symbol('x1')
```

Second Step: The user defines a function based on the variables above
```Python
f2 = (tanh(cos(sin(y))**z) + logistic(z**z, 2, 3, 4))**(1/x)
```

Third Step: The user provides the values of the variables
```Python
values = {x: 2, y: np.pi, z: 4}
```

Fourth Step: The user can call the instance method `evaluate()` to get the value of the function
```Python
print(f2.evaluate(values))
```

Fifth Step: The user can call `diff()` to get a first-order or higher-order derivative of the function

*get the first order derivative of f2 with respect to z*
```Python
print(diff(f2, z).evaluate(values))
```

*get second order derivative of f2 with respect to z*
```Python
print(diff(f2, z, z).evaluate(values))
```

*get the mixed partial derivative of f2 with respect to x and y*
```Python
print(diff(f2, x, y).evaluate(values))
```

*get third order derivative of f2 with respect to x*
```Python
print(diff(f2, x, x, x).evaluate(values))
```

Sixth Step: The user can get the Jacobian/derivatives of multiple functions with multiple variables
```Python
f1 = 3 * x + 4 * y * 2 - z
f3 = exp(arccos(tan(sin(y))) + logb(z**(1/2), 1/5)*sinh(x))
```

The user can get the Jacobian matrix with `get_jacobian_value()`

*Note: the layout of the Jacobian matrix follows the order of the input functions (as rows) and the input variables (as columns)*
```Python
print(get_jacobian_value([f1, f2, f3], [x, y, z], values))
```
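
As a hand check of the row/column convention, the first row can be worked out directly, because $f_1 = 3x + 4y \cdot 2 - z = 3x + 8y - z$ is linear and its partial derivatives do not depend on `values`. Assuming the ordering described in the note above, the matrix has the form

$$J = \begin{bmatrix} \dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} & \dfrac{\partial f_1}{\partial z} \\ \dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} & \dfrac{\partial f_2}{\partial z} \\ \dfrac{\partial f_3}{\partial x} & \dfrac{\partial f_3}{\partial y} & \dfrac{\partial f_3}{\partial z} \end{bmatrix}, \qquad \left(\dfrac{\partial f_1}{\partial x},\ \dfrac{\partial f_1}{\partial y},\ \dfrac{\partial f_1}{\partial z}\right) = (3,\ 8,\ -1).$$

The remaining two rows depend on the point of evaluation and are best left to the package.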

Seventh Step: The user can get the expression of the function
```Python
print(f1)
```

The user can also get the expression of (higher order) derivatives
```Python
print(diff(f2, x))
print(diff(f2, x, y))
```


## 7. Broader Impact and Inclusivity Statement 

### Broader Impact

As an open source library, AutoDiff should take its broader impact on the community seriously, especially in terms of diversity, ethics, and social impact.

Above all, we, the developers of AutoDiff, spare no effort to encourage people from diverse groups to use and contribute to the library freely, especially women, people with disabilities, and working parents. We hope our library can motivate and encourage contributors from minority groups, both here and in other open source libraries, and so contribute to diversity across the whole open source community.

Additionally, an open source automatic differentiation library may be misused in some scenarios and raise ethical issues. First, some students or researchers may over-rely on this library instead of calculating derivatives by hand. Although AutoDiff can be efficient and powerful for solving complicated gradient problems, it is still vital for students to learn how to derive derivatives by hand, and for mathematical/physical researchers to be able to discover drawbacks and make breakthroughs. AutoDiff should therefore be treated as a tool, not as an authority to depend on entirely. Second, some people may use AutoDiff for business purposes or offer it for sale, which is entirely against our purpose and privacy rules. Our library is designed for academic purposes: increasing the efficiency of machine learning tasks and other gradient-related problems.



### Software Inclusivity
Software development, like many fields of science, has prospered because of the contributions of people from a variety of backgrounds. This package welcomes and encourages participation and usage from a global community. 
As the Python Software Foundation's Diversity Statement puts it, *the Python community is based on mutual respect, tolerance, and encouragement, and we are working to help each other live up to these principles.* We, as the developers of this AutoDiff package, also want our user group to be more diverse: whoever you are, and whatever your background, we welcome you to use our package.
This software package is built upon the diversity perspective of the broader Python community. We strongly believe that embracing a diverse community of users brings new blood and new perspectives, making our user group stronger and more vibrant. A diverse user group where all users treat each other with respect has more potential contributors and more sources of fresh ideas.
We also welcome users from all language backgrounds. Mathematics has no boundaries.

In principle, there should be no barrier whatsoever for other developers to contribute to our code base. 
In practice, these barriers do exist and could be rather subtle. 

Our software project is mainly published on GitHub, and we welcome contributions through issues and pull requests from a broad audience. 
We also provide our email (the address can be found in **README.md**) for closer contact if any user has ideas for contributing but finds it difficult to overcome barriers. Through closer communication, we can reach out to our users and accommodate their needs where that would help.

Pull requests will be reviewed and approved jointly by our group members. Emails will also be reviewed and answered jointly.

For underrepresented groups, we welcome contributions from their perspective and are glad to receive their comments and feedback. We will also remove sensitive content and words from our code base. For example, we take care in code and variable naming to avoid words like `blacklist` or `whitelist`, `master` or `slave`, and use more generic terms instead.
We will also rename our main branch to `main`.

For working parents, we will accommodate their schedules if they would like closer contact to discuss our package implementation. 
For people from different countries, non-native English speakers, and rural communities, we are glad to receive feedback and contributions through all channels. We can be reached via email, phone, letter, or any other plausible means.
We will also use a translator to help with language barriers if needed.

For people with disabilities, we will be happy to provide accommodations, including but not limited to sign language explanation and hearing assistance. We are happy to go into more detail if you feel it is needed.

We want to make our best effort to create an inclusive learning and communication environment for all users and members of the science community, one that supports a diversity of thoughts, perspectives, and experiences, and honors everyone's identity.
To help achieve it:

* If anything in this software, such as a name or a written record, disturbs you, please let us know.

* If you find anything improper in our future maintenance and development process, please let us know.

* As we welcome everyone to contribute, we shall all strive to honor each other's diversity.



## 8. Futures

For future improvements, below are some points we have in mind:
* **Expression class simplification**: We maintain a syntax tree for the Expression class, but the printed expression is not yet polished or reader-friendly, especially for a complex function or for a higher-order derivative obtained by differentiating several times.
    * Simplify Addition: When multiple expressions are added together, especially constants mixed with other expressions, we should simplify by adding the constants together.
    * Simplify Multiplication: Similar to addition, collapse multi-layer multiplication.
    * Constant: Constant arithmetic
    * Symbols: Repeated operations on the same Symbol could be combined, e.g. `x+x+x+x` could be simplified to `4*x`
    * Zero * Symbol: should return the Constant zero itself
    * One * Symbol: should return the Symbol itself
    * Symbol ** 1: should return the Symbol itself
    * Many other arithmetic simplifications for better readability of Expressions...
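
A few of these rules can be sketched as local rewrites on a tree. The example below is only an illustration under simplifying assumptions: it uses nested tuples `(op, left, right)` with numbers as constants and strings as symbols, rather than the package's Expression classes, and the function name `simplify` is hypothetical. It covers constant folding and the zero/one identities, but not combining `x+x` into `2*x`.

```python
def simplify(expr):
    """Apply a few local rewrite rules to a tuple-based expression tree.

    Leaves are numbers (constants) or strings (symbols). Inner nodes are
    tuples (op, left, right) with op in {"add", "mul", "pow"}.
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    # Simplify the children first so the rules see already-reduced operands
    left, right = simplify(left), simplify(right)
    both_const = (isinstance(left, (int, float))
                  and isinstance(right, (int, float)))

    if op == "add":
        if both_const:
            return left + right      # constant arithmetic
        if left == 0:
            return right
        if right == 0:
            return left
    elif op == "mul":
        if both_const:
            return left * right      # constant arithmetic
        if left == 0 or right == 0:
            return 0                 # Zero * Symbol
        if left == 1:
            return right             # One * Symbol
        if right == 1:
            return left
    elif op == "pow":
        if right == 1:
            return left              # Symbol ** 1
    return (op, left, right)


# Unsimplified derivative of x*x + 3*x, as a product rule would produce it
raw = ("add",
       ("add", ("mul", 1, "x"), ("mul", "x", 1)),
       ("add", ("mul", 1, 3), ("mul", "x", 0)))
print(simplify(raw))  # ('add', ('add', 'x', 'x'), 3)
```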


* **Implement User Interface with elegant input (or Web Development)**

  Currently, the forward mode of our package requires the user to initialize the variables with the Dual class, specifying the location and length of the input variables. 
  The user can then use these Dual objects to create the target functions. There are three main future improvements that would make the package friendlier to users.

  1. When the user enters a function expression, our package will automatically parse the function, break it down into elementary functions, and store them in a tree structure.

    When users want to repeatedly use functions whose derivatives our package calculates, they currently have to redefine the functions each time.
    Once we can parse function expressions and use a tree structure to store the operations and input variables at each step, we can free users from repeatedly initializing functions and variables.

  2. When the user directly enters the input values for the target functions, our package will automatically initialize the Dual objects used in those functions.
  
    When users initialize the variables, they first have to think about how many variables the target functions will use. But sometimes users may want to reuse the variables later in different functions.
    Therefore, in the future we will use a list to store the values that the user intends to use in the function, instead of the initialized variables. Our package will automatically initialize the variables for the user
    with these values.
  3. We could also develop a web-based application so that users can enter input through a GUI.
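
For the parsing idea in item 1, Python's standard `ast` module already turns an expression string into a tree. The sketch below is one possible starting point and an assumption on our part, not a committed design: the helper name `describe` is hypothetical, and it only lists the operations and variable names found in the string without building Dual or Expression objects.

```python
import ast


def describe(source):
    """Parse a function string and list its operations and variable names."""
    tree = ast.parse(source, mode="eval")
    operations, names, called = [], set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.BinOp):
            # Binary operators such as Add, Mult, Pow
            operations.append(type(node.op).__name__)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Elementary function calls such as sin(...)
            operations.append(node.func.id)
            called.add(node.func.id)
        elif isinstance(node, ast.Name):
            names.add(node.id)
    # Function names also show up as Name nodes, so remove them
    return operations, sorted(names - called)


operations, variables = describe("sin(x) * y + x ** 2")
print(operations)
print(variables)  # ['x', 'y']
```

From here, each node could be mapped to the matching elementary function, so the user would only need to supply the string and the input values.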

## 9. Reference

[[1]](https://www.jmlr.org/papers/volume18/17-468/17-468.pdf): Baydin, Atilim Gunes; Pearlmutter, Barak; Radul, Alexey Andreyevich; Siskind, Jeffrey (2018). "Automatic differentiation in machine learning: a survey". Journal of Machine Learning Research. 18: 1–43.

[[2]](https://fac.ksu.edu.sa/sites/default/files/numerical_analysis_9th.pdf): Richard L. Burden and J. Douglas Faires. *Numerical Analysis*. Brooks/Cole, 2001.

[[3]](https://www.springer.com/gp/book/9783540654667): Johannes Grabmeier and Erich Kaltofen. *Computer Algebra Handbook: Foundations, Applications, Systems*. Springer, 2003.

[[4]](https://www.jstor.org/stable/24103956): Arun Verma. "An introduction to automatic differentiation". Current Science, 78(7):804–7, 2000.

[[5]](https://www.worldcat.org/oclc/31441929):  Spivak, Michael. (1994). *Calculus* (3rd ed.). Houston, Tex.: Publish or Perish. p. 359.


