# Table of Contents
* [Introduction](#Introduction)
* [Background](#Background)
* [How to use the package](#How-to-use-the-package)
    * [Installing autodiff](#Installing-autodiff)
    * [Basic Demo](#Basic-Demo)
* [Software Organization](#Software-Organization)
    * [Directory Structure](#Directory-Structure)
    * [Modules](#Modules)
    * [Test Automation](#Test-Automation)
    * [Distribution](#Distribution)

* [Implementation Details](#Implementation-Details)
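    * [About how the derivative is evaluated](#About-how-the-derivative-is-evaluated)
    * [autodiff.forward.Expression](#autodiff.forward.Expression)
    * [autodiff.forward.Variable](#autodiff.forward.Variable)
    * [autodiff.forward.Constant](#autodiff.forward.Constant)
    * [autodiff.forward.VectorFunction](#autodiff.forward.VectorFunction)
    * [autodiff.forward.Exp](#autodiff.forward.Exp)
    * [autodiff.forward.exp](#autodiff.forward.exp)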

----
## Introduction

Automatic differentiation (AD) is a family of techniques for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. Applications of AD include Newton’s method for solving nonlinear equations, real-parameter optimization, probabilistic inference, and backpropagation in neural networks. AD has become especially popular with the rapid development of machine learning and deep learning techniques. Our AD software package enables users to calculate derivatives using the forward and reverse modes.

----
## Background

**Mathematical Background**

Automatic differentiation decomposes a complex function into a sequence of operations on elementary functions, evaluates the derivative at each intermediate stage, and repeatedly applies the chain rule to obtain the derivative of the outermost function.
We explain the related mathematical concepts below.

**Elementary Functions**

The elementary functions are the class of functions consisting of the polynomials, the exponential functions, the logarithmic functions, the trigonometric functions, the inverse trigonometric functions, and the functions obtained from these by finitely many applications of the four arithmetic operations and of superposition (i.e. composition).

**Chain Rule**
+ Used to compute the derivative of a composite function
+ Core of automatic differentiation
$$(f \circ g)(x) = f(g(x))$$
$$\dfrac{d}{dx}[f(g(x))] = f'(g(x))g'(x)$$
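
As a quick numerical illustration of the chain rule, the sketch below differentiates $\sin(x^2)$ by hand and compares the result with a central finite difference. The choice of functions, the helper names, and the step size `h` are illustrative assumptions and are not part of `autodiff`.

```python
import math


def g(x):
    """Inner function g(x) = x^2."""
    return x ** 2


def g_prime(x):
    """Derivative of the inner function: g'(x) = 2x."""
    return 2 * x


def f(u):
    """Outer function f(u) = sin(u)."""
    return math.sin(u)


def f_prime(u):
    """Derivative of the outer function: f'(u) = cos(u)."""
    return math.cos(u)


def chain_rule_derivative(x):
    """d/dx f(g(x)) = f'(g(x)) * g'(x)."""
    return f_prime(g(x)) * g_prime(x)


def central_difference(func, x, h=1e-6):
    """Numerical approximation of func'(x), used only as a cross-check."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 1.5
analytic = chain_rule_derivative(x0)
numeric = central_difference(lambda x: f(g(x)), x0)
print(analytic, numeric)  # the two values should agree to several digits
```

The finite difference is only an approximation; automatic differentiation applies the same chain rule step but evaluates it exactly (up to floating-point rounding).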

**Dual Numbers**
+ Used to compute derivatives of elementary functions in automatic differentiation
+ Replace $x$ and $y$ with $x+x'\epsilon$ and $y+y'\epsilon$, where $x'$ and $y'$ are real numbers and $\epsilon$ is an abstract number with the property $\epsilon^2=0$
+ Carry out the operations as usual; the dual part (the coefficient of $\epsilon$) of the result gives the derivative
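
To make the dual-number idea concrete, here is a minimal sketch of a dual-number type that supports only addition, multiplication, and `exp`. The names `Dual`, `_lift`, and `dual_exp` are hypothetical and chosen for this illustration; this is not how `autodiff` is implemented.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number real + dual * eps, where eps**2 == 0 (illustrative only)."""
    real: float
    dual: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.real + other.real, self.dual + other.dual)

    __radd__ = __add__  # addition is commutative

    def __mul__(self, other):
        other = _lift(other)
        # (a + a'eps)(b + b'eps) = ab + (a b' + a' b) eps, since eps**2 = 0
        return Dual(self.real * other.real,
                    self.real * other.dual + self.dual * other.real)

    __rmul__ = __mul__  # multiplication is commutative


def _lift(value):
    """Treat a plain number as a dual number with zero dual part."""
    return value if isinstance(value, Dual) else Dual(float(value))


def dual_exp(z):
    """exp(a + a'eps) = exp(a) + a' exp(a) eps."""
    e = math.exp(z.real)
    return Dual(e, e * z.dual)


# Seed the dual part with 1 to differentiate with respect to x.
x = Dual(0.5, 1.0)
result = 2 * x + dual_exp(x)
print(result.real)  # value of 2x + exp(x) at x = 0.5
print(result.dual)  # derivative 2 + exp(x) at x = 0.5
```

Seeding the dual part with 1 corresponds to $x' = 1$, so the dual part of the result is the derivative with respect to $x$.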

**Topological Graph**
+ Each node represents a variable
+ Arrows indicate the topological order (the order of operations) and the operations themselves
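
As an illustration, the graph of $f(x) = 2x + \exp(x)$ can be written down as a dependency dictionary and ordered with the standard-library `graphlib` module (Python 3.9 or later). The node names `v1` and `v2` are hypothetical labels for the intermediate variables $2x$ and $\exp(x)$.

```python
from graphlib import TopologicalSorter

# Each key is a node; its value is the set of nodes it depends on.
dependencies = {
    "x": set(),          # independent variable
    "v1": {"x"},         # v1 = 2 * x
    "v2": {"x"},         # v2 = exp(x)
    "f": {"v1", "v2"},   # f = v1 + v2
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # "x" comes first and "f" last; v1 and v2 may appear in either order
```

Any order produced this way is a valid order of operations for evaluating the function.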


**Forward Mode Autodifferentiation**

Visit each node in topological order and store the value of each variable in its node. Let $x$ denote the innermost (independent) variable. For a variable $u_i=g_i(v)$ whose input derivative $\dfrac{dv}{dx}$ is already known, calculate $\dfrac{du_i}{dx}= \dfrac{du_i}{dv}\dfrac{dv}{dx}$.
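
The forward sweep can be sketched by hand for $f(x) = 2x + \exp(x)$. Each node below carries a `(value, derivative)` pair; the function name `forward_mode` and the node names `v1` and `v2` are assumptions made for this illustration rather than part of the package.

```python
import math


def forward_mode(x_value):
    """Evaluate f(x) = 2x + exp(x) and df/dx in a single forward sweep."""
    # Each node stores (value, derivative with respect to x).
    x = (x_value, 1.0)                       # seed: dx/dx = 1

    # v1 = 2 * x, so dv1/dx = 2 * dx/dx
    v1 = (2 * x[0], 2 * x[1])

    # v2 = exp(x), so dv2/dx = exp(x) * dx/dx
    exp_x = math.exp(x[0])
    v2 = (exp_x, exp_x * x[1])

    # f = v1 + v2, so df/dx = dv1/dx + dv2/dx
    f = (v1[0] + v2[0], v1[1] + v2[1])
    return f


value, derivative = forward_mode(0.5)
print(value, derivative)
```

Value and derivative travel together through the graph, so one pass gives the derivative with respect to one input variable.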


**Reverse Mode Autodifferentiation**

Reverse mode consists of a forward computation followed by a backward computation.

**Forward Computation**

Follow the topological order and store the value of each variable in its node.

**Backward Computation**

Let $y$ denote the final output variable and let $u_j$, $v_j$ denote the intermediate variables.
1. Initialize all partial derivatives $\dfrac{dy}{du_j}$ to 0 and set $\dfrac{dy}{dy} = 1$.
2. Visit each node in reverse topological order. For a variable $u_i=g_i(v_1,...,v_n)$ whose derivative $\dfrac{dy}{du_i}$ is already known, increment each $\dfrac{dy}{dv_j}$ by $\dfrac{dy}{du_i}\dfrac{du_i}{dv_j}$.
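
The two sweeps can be sketched by hand for the example function $y = x_1 x_2 + \sin(x_1)$. The function, the name `reverse_mode`, and the `adjoint` dictionary are assumptions chosen for this illustration; the package's planned `backward` module may be organized quite differently.

```python
import math


def reverse_mode(x1, x2):
    """Gradient of y = x1 * x2 + sin(x1) via one forward and one backward sweep."""
    # Forward computation: evaluate and store every intermediate value.
    v1 = x1 * x2
    v2 = math.sin(x1)
    y = v1 + v2

    # Backward computation, step 1: all dy/dnode start at 0, and dy/dy = 1.
    adjoint = {"x1": 0.0, "x2": 0.0, "v1": 0.0, "v2": 0.0, "y": 1.0}

    # Step 2: visit nodes in reverse topological order and accumulate.
    # y = v1 + v2, so dy/dv1 and dy/dv2 each receive dy/dy * 1
    adjoint["v1"] += adjoint["y"] * 1.0
    adjoint["v2"] += adjoint["y"] * 1.0

    # v2 = sin(x1), so dy/dx1 receives dy/dv2 * cos(x1)
    adjoint["x1"] += adjoint["v2"] * math.cos(x1)

    # v1 = x1 * x2, so dy/dx1 receives dy/dv1 * x2 and dy/dx2 receives dy/dv1 * x1
    adjoint["x1"] += adjoint["v1"] * x2
    adjoint["x2"] += adjoint["v1"] * x1

    return y, adjoint["x1"], adjoint["x2"]


y, dy_dx1, dy_dx2 = reverse_mode(2.0, 3.0)
print(y, dy_dx1, dy_dx2)
```

Note that `x1` receives two increments because it feeds two nodes. A single backward sweep yields the derivatives with respect to all inputs at once.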


----
## How to use the package

### Installing `autodiff`

Here is how to install `autodiff` from the command line. We assume that the user has already installed `pip` and `virtualenv`:
1. Clone the project repo with `git clone git@github.com:D-Y-F-S/cs207-FinalProject.git`
2. `cd` into the local repo and create a virtual environment with `virtualenv env`
3. Activate the virtual environment with `source env/bin/activate` (use `deactivate` to deactivate the virtual environment later)
4. Install the dependencies with `pip install -r requirements.txt`
5. Install `autodiff` with `pip install -e .`

### Basic Demo

### Univariate Functions

We start with a `Variable`, which represents an independent variable. Let's call it `x`. 

```python
import autodiff.forward as fwd

x = fwd.Variable()
```

The core class in `autodiff` is `Expression`. An `Expression` can be built from `Variable`s and other `Expression`s, and all functions are represented as `Expression`s. Every `Expression`, including `Variable`, which is the most elementary `Expression`, implements the `evaluation_at` method, which returns the value of the `Expression`, and the `derivative_at` method, which returns its derivative.

Here we create an `Expression` that represents $f(x) = 2x + \exp(x)$. There is no need to call the `Expression` constructor explicitly, because the `*` operator on `x` is overloaded and will return an `Expression`. The `exp` function also returns an `Expression` representing $\exp(x)$.

```python
f = 2*x + fwd.exp(x)
```

We can then evaluate $f(x)$'s value and derivative at $x = 0.5$ by calling `evaluation_at` and `derivative_at` on `f`. 

```python
f.evaluation_at({x: 0.5})
```

```
2.648721270700128
```

```python
f.derivative_at(x, {x: 0.5})
```

```
3.648721270700128
```

### Multivariate Functions

Similar operations can be carried out on multivariate functions.

Here we first create another `Variable` called `y`. Then we create an `Expression` that represents $g(x, y) = \exp(x+y)$.

```python
y = fwd.Variable()

g = fwd.exp(x+y)
```

We can then evaluate $g(x, y)$'s value at $x = 0.5, y = -0.5$ by calling `evaluation_at` on `g`. We can also evaluate $\dfrac{\partial g}{\partial x}$ and $\dfrac{\partial g}{\partial y}$ by calling `derivative_at` on `g`.

```python
g.evaluation_at({x: 0.5, y: -0.5})
```

```
1.0
```

```python
g.derivative_at(x, {x: 0.5, y: -0.5})
```

```
1.0
```

```python
g.derivative_at(y, {x: 0.5, y: -0.5})
```

```
1.0
```

### Vector Functions

Similar operations can be carried out on vector functions.

Here we create a `VectorFunction` that represents $h(\begin{bmatrix}x\\y\end{bmatrix}) = \begin{bmatrix}f(x)\\g(x, y)\end{bmatrix}$.

```python
# not implemented yet
# h = fwd.VectorFunction(fun_dict={f: (x,), g: (x, y)}, var_list=[x, y])
```

We can then evaluate $h$'s value and derivative (returned as the vector of partial derivatives $\begin{bmatrix}\dfrac{\partial f}{\partial x}\\\dfrac{\partial g}{\partial x}\end{bmatrix}$ or $\begin{bmatrix}\dfrac{\partial f}{\partial y}\\\dfrac{\partial g}{\partial y}\end{bmatrix}$) at $x = 0.5, y = -0.5$. The `jacobian_at` function returns the Jacobian ($\begin{bmatrix}\dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\ \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \end{bmatrix}$) of `h` at the given point.

```python
# not implemented yet
# h.evaluation_at({x: 0.5, y: -0.5})
# h.derivative_at(x, {x: 0.5, y: -0.5})
# h.derivative_at(y, {x: 0.5, y: -0.5})
# h.jacobian_at({x: 0.5, y: -0.5})
```
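
As a hand-derived check of what `jacobian_at` would be expected to return (a worked derivation, not output from the package, since `VectorFunction` is not implemented yet), take $f(x) = 2x + \exp(x)$ and $g(x, y) = \exp(x+y)$ as coded in the demo above:

$$J_h(x, y) = \begin{bmatrix} 2 + \exp(x) & 0 \\ \exp(x+y) & \exp(x+y) \end{bmatrix}, \qquad J_h(0.5, -0.5) = \begin{bmatrix} 2 + \exp(0.5) & 0 \\ 1 & 1 \end{bmatrix} \approx \begin{bmatrix} 3.6487 & 0 \\ 1 & 1 \end{bmatrix}$$

The entries agree with the scalar results shown earlier: $\dfrac{\partial f}{\partial x} \approx 3.6487$ from `f.derivative_at`, and both partial derivatives of $g$ equal to $1.0$ from `g.derivative_at`.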

----
## Software Organization

### Directory Structure

The structure of `autodiff`'s project directory is as follows. 
```
autodiff/

    __init__.py
    forward.py
    
tests/

    test_forward.py
    
docs/

    milestone1.ipynb
    milestone2.ipynb
    
.gitignore
.travis.yml
LICENSE.txt
README.md
requirements.txt
setup.cfg
setup.py
```

The source code lies in the directory `autodiff`, in which `__init__.py` makes `autodiff` a package. Currently all the source code is in the file `forward.py`. In the future, we may break it into multiple files for better organization.

The test suite lies in the directory `tests`. Currently all the test code is in the file `test_forward.py`. In the future, we may break it into multiple files for better organization.

The documentation lies in the directory `docs`. `milestone1.ipynb` is the earlier version of the document, submitted for milestone 1. `milestone2.ipynb`, which is this file, is the document submitted for milestone 2.

The other files in the root directory are: `.gitignore`, which specifies the files that should not be tracked by git; `.travis.yml`, which is the configuration file for TravisCI; `LICENSE.txt`, which is the license for this package; `README.md`, which is the README file for this package; `requirements.txt`, which specifies the dependencies of this package; `setup.cfg`, which is the configuration file for installing this package; and `setup.py`, which is the script for installing this package.

### Modules

There is currently one module in `autodiff`: `forward`, which implements forward mode automatic differentiation. The other modules we plan to add are `backward`, which will implement reverse (backward) mode automatic differentiation, and `usecase`, which will contain example client code built on top of `forward` and `backward`.

### Test Automation

`TravisCI` and `Coveralls` are used for test automation. The test suite for each module is included in the `tests` directory.

### Distribution

Currently `autodiff` has to be manually installed. It will be distributed with `PyPI` in the future.

----
## Implementation Details

### About how the derivative is evaluated

The central data structures in `autodiff` are `Expression` and `ElementFunction` (the common interface shared by `Add`, `Mul`, `Pow`, `Exp`, `Sin`, and so on; we may add an explicit abstract base class in the future). An `Expression` is composed of one `ElementFunction` plus one or two sub-`Expression`s. When evaluating the derivative, the `Expression` passes its sub-`Expression`(s) to the `derivative_at` method of its `ElementFunction`. That method then calls the `evaluation_at` and `derivative_at` methods of the sub-`Expression`(s) and uses the returned values to calculate the derivative. This is a mutually recursive process whose base cases lie in the `Variable` and `Constant` classes: their `evaluation_at` and `derivative_at` methods return a concrete number rather than calling another function.

The `evaluation_at` method works similarly.
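
The mutual recursion described above can be sketched as follows. This is a simplified, hypothetical reconstruction based only on the interfaces documented in this file, not the actual source of `autodiff.forward`: it supports only `+` and `exp`, differentiates only with respect to `Variable` objects, and the private attribute names are assumptions.

```python
import math


class Expression:
    def __init__(self, ele_func, sub_expr1, sub_expr2=None):
        self._ele_func = ele_func
        self._sub_expr1 = sub_expr1
        self._sub_expr2 = sub_expr2

    def _sub_exprs(self):
        # One sub-expression for unary functions, two for binary ones.
        if self._sub_expr2 is None:
            return (self._sub_expr1,)
        return (self._sub_expr1, self._sub_expr2)

    def evaluation_at(self, val_dict):
        # Delegate to the element function, passing the sub-expressions.
        return self._ele_func.evaluation_at(*self._sub_exprs(), val_dict)

    def derivative_at(self, var, val_dict):
        return self._ele_func.derivative_at(*self._sub_exprs(), var, val_dict)

    def __add__(self, other):
        return Expression(Add, self, other)


class Variable(Expression):
    """Base case of the recursion: returns numbers directly."""

    def __init__(self):
        pass  # a variable has no element function or sub-expressions

    def evaluation_at(self, val_dict):
        return val_dict[self]

    def derivative_at(self, var, val_dict):
        return 1.0 if var is self else 0.0


class Add:
    @staticmethod
    def evaluation_at(sub_expr1, sub_expr2, val_dict):
        return sub_expr1.evaluation_at(val_dict) + sub_expr2.evaluation_at(val_dict)

    @staticmethod
    def derivative_at(sub_expr1, sub_expr2, var, val_dict):
        return (sub_expr1.derivative_at(var, val_dict)
                + sub_expr2.derivative_at(var, val_dict))


class Exp:
    @staticmethod
    def evaluation_at(sub_expr1, val_dict):
        return math.exp(sub_expr1.evaluation_at(val_dict))

    @staticmethod
    def derivative_at(sub_expr1, var, val_dict):
        # Chain rule: d/dvar exp(u) = exp(u) * du/dvar
        return (math.exp(sub_expr1.evaluation_at(val_dict))
                * sub_expr1.derivative_at(var, val_dict))


def exp(expr):
    return Expression(Exp, expr)


x = Variable()
y = Variable()
g = exp(x + y)
point = {x: 0.5, y: -0.5}
print(g.evaluation_at(point), g.derivative_at(x, point), g.derivative_at(y, point))
```

Each call to `derivative_at` on a composite `Expression` descends through the element function into the sub-expressions until it reaches a `Variable`, which ends the recursion.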

### autodiff.forward.Expression

`Expression` represents an expression. It implements the `evaluation_at` and `derivative_at` methods. The first returns the value of this expression at a given point. The second returns the derivative of this expression (with respect to a given variable / expression) at a given point. There is no need to construct an `Expression` explicitly when using `autodiff`.

There are two subclasses of `Expression`: `Variable`, which represents an independent 'base' variable, and `Constant`, which represents a constant. Both implement the same interface but differ in their implementations of the `evaluation_at` and `derivative_at` methods.

The interface of `Expression` is as follows:

```python
class Expression:

    def __init__(self, ele_func, sub_expr1, sub_expr2=None):
        """
        The constructor of Expression. There is no need to call this
        constructor explicitly.
        ------------------------
        Input:
            ele_func:  ElementFunction*, the element function involved
                       in this expression
            sub_expr1: Expression, the first sub-expression involved
                       in this expression
            sub_expr2: Expression, the second sub-expression involved
                       in this expression
        * ElementFunction is the common interface shared by Add, Mul,
          Sub, Pow, Exp, Sin... We may want to explicitly add an
          abstract base class in the future
        """

    def evaluation_at(self, val_dict):
        """
        The value of this expression at the point specified by val_dict.
        ------------------------
        Input:
            val_dict: dict, a dictionary representing a point (keys are
                      Variables and values are numeric)
        Output:
            numeric, the value of the expression at the point specified
            by val_dict
        """

    def derivative_at(self, var, val_dict):
        """
        The derivative with respect to certain variable / expression at
        the point specified by val_dict.
        ------------------------
        Input:
            var:      Expression, the variable / expression to calculate
                      derivative with respect to
            val_dict: dict, a dictionary representing a point (keys are
                      Variables and values are numeric)
        Output:
            numeric, the derivative with respect to certain variable /
            expression at the point specified by val_dict
        """
```

### autodiff.forward.Variable

`Variable` is a special kind of `Expression`. Its main purpose is to make client code more readable. Client code using `autodiff` typically starts by creating `Variable` objects.

The interface of `Variable` is as follows:

```python
class Variable(Expression):

    def __init__(self):
        """
        The constructor of Variable.
        """

    def evaluation_at(self, val_dict):
        """
        The value of this variable at the point specified by val_dict.
        ------------------------
        Input:
            val_dict: dict, a dictionary representing a point (keys are
                      Variables and values are numeric)
        Output:
            numeric, the value of the expression at the point specified
            by val_dict
        """

    def derivative_at(self, var, val_dict):
        """
        The derivative with respect to certain variable / expression at
        the point specified by val_dict. The output of this function will
        always be either 0.0 (derivative w.r.t. some other variable) or
        1.0 (derivative w.r.t. itself).
        ------------------------
        Input:
            var:      Expression, the variable / expression to calculate
                      derivative with respect to
            val_dict: dict, a dictionary representing a point (keys are
                      Variables and values are numeric)
        Output:
            numeric, the derivative with respect to certain variable /
            expression at the point specified by val_dict.
        """
```

### autodiff.forward.Constant

`Constant` is a special kind of `Expression` representing a constant. There is no need to construct a `Constant` explicitly when using `autodiff`.

The interface of `Constant` is as follows:

```python
class Constant(Expression):

    def __init__(self, val):
        """
        The constructor of Constant. There is no need to call this
        constructor explicitly.
        ------------------------
        Input:
            val: numeric, the value of this constant
        """

    def evaluation_at(self, val_dict):
        """
        The value of this constant at the point. The output of this
        function will always be the val passed to the constructor.
        ------------------------
        Input:
            val_dict: dict, a dictionary representing a point (keys are
                      Variables and values are numeric)
        Output:
            numeric, the value of the expression at the point specified
            by val_dict
        """

    def derivative_at(self, var, val_dict):
        """
        The derivative with respect to certain variable / expression at
        the point specified by val_dict. The output of this function will
        always be 0.0.
        ------------------------
        Input:
            var:      Expression, the variable / expression to calculate
                      derivative with respect to
            val_dict: dict, a dictionary representing a point (keys are
                      Variables and values are numeric)
        Output:
            numeric, the derivative with respect to certain variable /
            expression at the point specified by val_dict.
        """
```

### autodiff.forward.VectorFunction

`VectorFunction` represents a vector function. This class is not implemented yet.

The interface of `VectorFunction` will be as follows:

```python
class VectorFunction:

    def __init__(self, fun_dict, var_list):
        ...

    def evaluation_at(self, val_dict):
        ...

    def derivative_at(self, var, val_dict):
        ...

    def jacobian_at(self, val_dict):
        ...
```

### autodiff.forward.Exp

`Exp` represents the elementary function $x \mapsto \exp(x)$. It implements two methods: `evaluation_at` and `derivative_at`. The first returns the value of an exponential expression. The second returns the derivative of an exponential expression. Its methods are called by `Expression` and need not be called by client code.

The interface of `Exp` is as follows:

```python
class Exp:

    @staticmethod
    def evaluation_at(sub_expr1, val_dict):
        """
        The value of this exponential expression at the point.
        ------------------------
        Input:
            sub_expr1: Expression, the expression x in x -> exp(x)
            val_dict:  dict, a dictionary representing a point (keys are
                       Variables and values are numeric)
        Output:
            numeric, the value of the expression at the point specified
            by val_dict
        """

    @staticmethod
    def derivative_at(sub_expr1, var, val_dict):
        """
        The derivative with respect to certain variable / expression at
        the point specified by val_dict.
        ------------------------
        Input:
            sub_expr1: Expression, the expression x in x -> exp(x)
            var:       Expression, the variable / expression to calculate
                       derivative with respect to
            val_dict:  dict, a dictionary representing a point (keys are
                       Variables and values are numeric)
        Output:
            numeric, the derivative with respect to certain variable /
            expression at the point specified by val_dict.
        """
```

### autodiff.forward.exp

`exp` is a free function that acts as the constructor for an expression that has `Exp` as its `ele_func`.

The interface of `exp` is as follows:

```python
def exp(expr):
    """
    Return the exponential of an expression.
    ------------------------
    Input:
        expr: Expression, the expression x in x -> exp(x)
    Output:
        Expression, the exponential of expr
    """
```