# Group 1 Milestone1 doc


## 1. Introduction

Automatic differentiation (AD) is a family of techniques for computing the derivative of a function automatically. It has a broad range of applications across many disciplines, such as engineering, statistics, computer science, and computational biology. AD plays a vital role for both students and researchers: they need tools that compute derivatives efficiently, given the amount of computational power required, and that spare them the difficulty of converting a symbolic mathematical expression into a program data structure. Here, we propose a novel Python library, `undefined`, that addresses this problem by applying AD to user defined numerical functions.

One potential application is computing the gradient of a loss function, so that parameters can be updated in the direction of the negative gradient when training gradient-based machine learning models. Compared with other solutions, `undefined` is easier to use, focused on a single task, and more general.


## 2. Background

As we learned in calculus classes, the traditional way to calculate derivatives is to work by hand, applying rules such as the power rule, the product rule, and the chain rule.

Below is an example in which we calculate a derivative using the chain rule.

### 2.1 Chain Rule Formula

The general formula of the chain rule is as follows:

(1) Suppose we have a function $h(u(t))$:

$$\frac{dh}{dt} = \frac{\partial h}{\partial u}\frac{du}{dt}$$

When we have multiple coordinates, the chain rule formula will change:

(2) Suppose our formula is $h(u(t), v(t))$

$$\frac{dh}{dt} = \frac{\partial h}{\partial u}\frac{du}{dt} + \frac{\partial h}{\partial v}\frac{dv}{dt}$$
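
As a quick sanity check of formula (2), the sketch below evaluates the chain rule for one concrete choice of functions and compares the result with a central finite difference. The functions $u(t) = t^2$, $v(t) = \sin(t)$ and $h(u, v) = uv$ are illustrative assumptions chosen for this example; they are not part of `undefined`.

```python
import math

# Illustrative choices (assumptions for this sketch):
# u(t) = t**2, v(t) = sin(t), h(u, v) = u * v
def u(t):
    return t ** 2

def v(t):
    return math.sin(t)

def h(u_val, v_val):
    return u_val * v_val

def dh_dt_chain_rule(t):
    """dh/dt = (dh/du)(du/dt) + (dh/dv)(dv/dt)."""
    dh_du = v(t)           # partial of u*v with respect to u
    dh_dv = u(t)           # partial of u*v with respect to v
    du_dt = 2 * t
    dv_dt = math.cos(t)
    return dh_du * du_dt + dh_dv * dv_dt

def dh_dt_finite_difference(t, eps=1e-6):
    """Central difference, used only as an independent check."""
    return (h(u(t + eps), v(t + eps)) - h(u(t - eps), v(t - eps))) / (2 * eps)

t0 = 1.5
exact = dh_dt_chain_rule(t0)
approx = dh_dt_finite_difference(t0)
print(exact, approx)  # the two values should agree to several digits
```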



### 2.2 Elementary Function

(1) Unary elementary function examples: sin(x), cos(x)

(2) Binary elementary function examples: x + y, x * y

### 2.3 Computational Graph

A computational graph is a directed graph where the nodes correspond to elementary functions or variables.

Computational graph node for a binary elementary function:

![bg_computational_graph](resources/computational_graph_1.png)

The computational graph grows once the computations become more complex:

![bg_computational_graph](resources/computational_graph_2.png)
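
A computational graph can be held in a very small data structure. The sketch below is one hypothetical representation (a dictionary of nodes, not the graph structure that `undefined` will ship) for the illustrative function $f(x, y) = \sin(x) + xy$, evaluated node by node. The node names `v1`, `v2` and the `OPS` table are assumptions made for this example.

```python
import math

# Each entry maps a node name to (operation, list of parent node names).
# Illustrative function (an assumption): f(x, y) = sin(x) + x * y
graph = {
    "x":  ("input", []),
    "y":  ("input", []),
    "v1": ("sin", ["x"]),
    "v2": ("mul", ["x", "y"]),
    "f":  ("add", ["v1", "v2"]),
}

# Elementary functions available to the graph.
OPS = {
    "sin": lambda a: math.sin(a),
    "mul": lambda a, b: a * b,
    "add": lambda a, b: a + b,
}

def evaluate(graph, inputs):
    """Evaluate nodes in insertion order, which is a topological order here."""
    values = dict(inputs)
    for name, (op, parents) in graph.items():
        if op != "input":
            values[name] = OPS[op](*(values[p] for p in parents))
    return values

trace_values = evaluate(graph, {"x": math.pi / 2, "y": 2.0})
print(trace_values)  # intermediate values v1, v2 and the output f
```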

### 2.4 Automatic Differentiation

Suppose we want the gradient of the function defined as follows:


${f(x, y) = \cos(5x + 7y)e^{-x}}$


We calculate the partial derivative with respect to $x$ first, ${\frac{\partial f}{\partial x}}$, by applying the product rule together with the chain rule:

${\frac{\partial f}{\partial x} = \cos(5x + 7y)(-e^{-x}) - 5 \sin(5x + 7y)e^{-x}}$

To simplify: 

${ \frac{\partial f}{\partial x} = -e^{-x}(\cos(5x+7y) + 5\sin(5x+7y)) }$


To calculate ${\frac{\partial f}{\partial y}}$, we only need the chain rule:

${ \frac{\partial f}{\partial y} = -7\sin(5x + 7y)e^{-x} }$

Differentiating this function by hand is simple, but AD becomes handy when we have to compute derivatives of complicated functions.

AD has many advantages over the other ways of calculating derivatives automatically (numerical differentiation and symbolic differentiation). One of the biggest is that AD computes derivatives to machine precision, and it does so more efficiently than the other two methods.
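
To make the precision claim concrete, the sketch below differentiates $f(x, y) = \cos(5x + 7y)e^{-x}$ with a minimal forward-mode dual number and compares the result with both the closed form derived above and a forward finite difference. The `Dual` class, the helper functions, and the evaluation point are assumptions made for illustration; this is not the `undefined` implementation.

```python
import math

class Dual:
    """Minimal dual number: a value plus one directional derivative."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.der)

def cos(a):
    return Dual(math.cos(a.val), -math.sin(a.val) * a.der)

def exp(a):
    return Dual(math.exp(a.val), math.exp(a.val) * a.der)

def f(x, y):
    return cos(5 * x + 7 * y) * exp(-x)

def f_float(x, y):
    return math.cos(5 * x + 7 * y) * math.exp(-x)

x0, y0 = 0.5, 0.25  # arbitrary evaluation point

# Seed x with derivative 1 and y with 0 to obtain df/dx.
ad_dfdx = f(Dual(x0, 1.0), Dual(y0, 0.0)).der

# Closed form from the hand derivation.
s = 5 * x0 + 7 * y0
exact_dfdx = -math.exp(-x0) * (math.cos(s) + 5 * math.sin(s))

# Forward finite difference for comparison.
step = 1e-6
fd_dfdx = (f_float(x0 + step, y0) - f_float(x0, y0)) / step

print("AD error:", abs(ad_dfdx - exact_dfdx))
print("FD error:", abs(fd_dfdx - exact_dfdx))
```

In this sketch the AD result matches the closed form up to floating-point rounding, while the finite difference carries a truncation error that depends on the step size.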

## 3. How to use `undefined`

***Tentative***

`undefined` can be installed easily by running the following command:

` python -m pip install cs107-undefined `

Users import the package in their Python script as follows:

`import undefined as ud`

Once the package is imported successfully, users can calculate the derivative of a given function with the following commands:

In [None]:
# Do not run
import undefined as ud

# user defined function
func = lambda x: x**2 + 5*x - 6

# trace the function at x = 3 to get its derivative
results = ud.trace(func, x = 3)

print(results)

>>> taking derivative...
>>> 11

More specifically, users can choose to perform automatic differentiation in either forward mode or reverse mode.

**(1) Forward mode**

For the basic functionality, we will develop a function called `trace`, which takes a user defined function and returns its derivatives. Note: the default is **forward** mode.

Here is a demo for $\mathbb{R}$ -> $\mathbb{R}$:

In [None]:
# Do not run
# R -> R implementation
# import modules
import numpy as np
import undefined as ud

# user defined function
f = lambda x: x - np.exp(-2.0 * np.sin(4.0 * x) * np.sin(4.0 * x))

# call the trace function in undefined, and provide input x = 2
ud.trace(f, x = 2)

# the function will return the 1st derivative when x=2.
>>> taking derivative...
>>> 0.674811

The `trace` function can also handle multidimensional calculations. For a function $\mathbb{R}^m$ -> $\mathbb{R}$ (here $m = 2$), we pass in the values for ${x}_1$ and ${x}_2$.


In [None]:
# Do not run
# user defined function of two variables
f = lambda x, y: x*y + np.exp(x*y)

# call the trace function in undefined, and provide input x1 = 1 and x2 = 2
ud.trace(f, [1, 2])

# the function will return the partial derivatives when x1 = 1 and x2 = 2.
>>> taking derivative...
>>> [16.7781, 8.3891]

Our function will also handle the other multidimensional cases, $\mathbb{R}$ -> $\mathbb{R}^n$ and $\mathbb{R}^m$ -> $\mathbb{R}^n$. The difference lies in the number of input values and output components.
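
For the $\mathbb{R}^m$ -> $\mathbb{R}^n$ case, the result is a Jacobian matrix. The sketch below shows one common way to build it in forward mode, with one pass per input variable. The `Dual` class, the `jacobian` helper, and the test function `F` are hypothetical and only illustrate the idea; the final `trace` interface may differ.

```python
class Dual:
    """Minimal dual number supporting + and * only (enough for this sketch)."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def jacobian(func, point):
    """One forward pass per input: seed input j with derivative 1, the rest with 0."""
    columns = []
    for j in range(len(point)):
        args = [Dual(p, 1.0 if i == j else 0.0) for i, p in enumerate(point)]
        outputs = func(*args)
        columns.append([out.der for out in outputs])
    # columns[j][i] holds d f_i / d x_j; transpose to one row per output.
    return [list(row) for row in zip(*columns)]

# Illustrative function R^2 -> R^2 (an assumption): F(x, y) = (x*y, x + 3*y)
F = lambda x, y: [x * y, x + 3 * y]
J = jacobian(F, [1.0, 2.0])
print(J)  # rows are outputs, columns are inputs
```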

**(2) Reverse mode**

The `trace` function will also be able to calculate derivatives in reverse mode when the `mode` parameter is specified. The example below is a demo:


In [None]:

# user defined function
f = lambda x: x - np.exp(-2.0 * np.sin(4.0 * x) * np.sin(4.0 * x))

# call the trace function in undefined, and provide input x = 2
ud.trace(f, x = 2, mode = 'reverse')

# the function will return the 1st derivative when x=2.
>>> taking derivative...
>>> 0.674811
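
To show what reverse mode does internally, the sketch below records a small graph during the forward pass and then propagates adjoints backwards. It uses the two-variable function $xy + e^{xy}$ from the forward-mode demo, so the gradient can be compared with the values shown there. The `Node` class and the `backward` helper are assumptions made for illustration, not the planned `undefined` internals.

```python
import math

class Node:
    """A value plus (parent, local_gradient) pairs recorded in the forward pass."""
    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.val + other.val, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Node(self.val * other.val,
                    ((self, other.val), (other, self.val)))

def exp(a):
    e = math.exp(a.val)
    return Node(e, ((a, e),))

def backward(output):
    """Propagate adjoints from the output to the inputs in reverse topological order."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)  # appended after all of its parents

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local

x, y = Node(1.0), Node(2.0)
f = x * y + exp(x * y)
backward(f)
print(round(x.grad, 4), round(y.grad, 4))
```

A single backward pass yields the partial derivatives with respect to every input, which is why reverse mode is attractive when there are many inputs and few outputs.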

## 4. Software Organization

***Tentative***

The directory structure will look like the following:

```
./undefined
├── ./undefined/README.md
├── ./undefined/Codecov.yml
├── ./undefined/.travis.yml
├── ./undefined/src
│   └── ./undefined/src/undefined
│       ├── ./undefined/src/undefined/__main__.py
│       ├── ./undefined/src/undefined/__init__.py
│       ├── ./undefined/src/undefined/API.py
│       ├── ./undefined/src/undefined/UDFunction.py
│       ├── ./undefined/src/undefined/Parser.py
│       ├── ./undefined/src/undefined/Calculator.py
│       └── ./undefined/src/undefined/GraphGenerator.py
├── ./undefined/test
│   ├── ./undefined/test/test.py
│   └── ./undefined/test/other_test_scripts
└── ./undefined/docs
    └── ./undefined/docs/milestone1
```

<p style="color: red">We plan to use the <b>numpy</b> and <b>math</b> modules from Python, and to use <b>pandas</b> and <b>matplotlib</b> to store information.</p>

<p style="color: red">We will follow the PEP 257 guidelines for documenting the functionality of our code. We have decided not to use an existing Python framework; instead, we will build and release our own package conforming to PEP 517/518.</p>


## 5. Implementation

### 5.1 Core Data Structure

**UDFunction** is the core data structure in our library; it carries the derivative of a user defined function $f$.

A UDFunction contains the value and derivative of the user defined function $f$, and supports basic calculations such as addition and multiplication.

We will use `numpy.ndarray` as our core data structure to store values. For the derivatives, we will also use `numpy.ndarray`, together with a `dict` that stores intermediate derivative values for calculation and visualization purposes. We will use a self defined graph structure to store and visualize the computational graphs.



### 5.2 Classes

**API.py:**

This class contains the methods that users can call, such as `ud.trace()`.

| Method                                 | Description                                                                                                  |
|----------------------------------------|--------------------------------------------------------------------------------------------------------------|
| trace(lambda_function, mode='forward') | Given a user defined function, calculate the derivative using automatic differentiation. Both vector functions and scalar functions are supported. The default mode is forward. |

**UDFunction:**

This class wraps the core data structure in our library. Objects instantiated from this class are the most basic computing units in our library.

- Name Attributes:

| Name Attribute | Description                                         |
|----------------|-----------------------------------------------------|
| values         | values of an elementary function                    |
| derivatives    | derivatives of an elementary function               |
| shape          | a tuple that declares the shape of values attribute |

- Methods (overload operators):

| Method                | Description                        |
|-----------------------|------------------------------------|
| getters and setters   | getters and setters for attributes |
| \_\_add__(self, other)  | add values from object: other      |
| \_\_radd__(self, other) | add values from object: other      |
| \_\_mul__(self, other)  | multiply values from object: other |
| \_\_rmul__(self, other) | multiply values from object: other |
| constructor           |                                    |
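
The sketch below shows how the overloaded operators in the table could propagate derivatives. It is a simplified, hypothetical version of `UDFunction`: it uses plain floats instead of `numpy.ndarray`, omits the `shape` attribute and the getters and setters, and keeps derivatives in a dictionary keyed by variable name (the key names are an assumption).

```python
class UDFunction:
    """Hypothetical sketch: a value plus partial derivatives keyed by variable name."""

    def __init__(self, values, derivatives=None):
        self.values = values
        self.derivatives = dict(derivatives or {})

    @staticmethod
    def _wrap(other):
        # Plain numbers are constants, so they carry no derivatives.
        return other if isinstance(other, UDFunction) else UDFunction(other)

    def __add__(self, other):
        other = self._wrap(other)
        keys = self.derivatives.keys() | other.derivatives.keys()
        ders = {k: self.derivatives.get(k, 0.0) + other.derivatives.get(k, 0.0)
                for k in keys}
        return UDFunction(self.values + other.values, ders)

    __radd__ = __add__  # addition is commutative

    def __mul__(self, other):
        other = self._wrap(other)
        keys = self.derivatives.keys() | other.derivatives.keys()
        # product rule applied per variable
        ders = {k: self.derivatives.get(k, 0.0) * other.values
                   + self.values * other.derivatives.get(k, 0.0)
                for k in keys}
        return UDFunction(self.values * other.values, ders)

    __rmul__ = __mul__  # multiplication is commutative

x = UDFunction(3.0, {"x": 1.0})
y = UDFunction(2.0, {"y": 1.0})
g = 2 * x * y + x + 1   # dg/dx = 2y + 1, dg/dy = 2x
print(g.values, g.derivatives)
```

Defining `__radd__` and `__rmul__` is what lets expressions such as `2 * x` work when the left operand is a plain Python number.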

**Parser:**

This class is used to evaluate the user defined function `f` and parse it into a UDFunction class which is defined and implemented by our library.

- Methods:

| Method                    | Description                                                                                                    |
|---------------------------|----------------------------------------------------------------------------------------------------------------|
| evaluate(lambda_function) | break down the user defined function into a list of elementary functions, then construct a UDFunction object.  |
| parse(UDFunction)         | parse a UDFunction object into a descriptive string                                                            |

**Calculator:**

This class contains utility functions that apply elementary functions such as `sin`, `sqrt`, `log`, and `exp` to a UDFunction. These cannot be implemented by overloading operators in UDFunction.


| Method                | Description                                             |
|-----------------------|---------------------------------------------------------|
| cos(UDFunction)       | calculate cos value of a UDFunction                     |
| sin(UDFunction)       | calculate sin value of a UDFunction                     |
| tan(UDFunction)       | is calculated using sin(UDFunction) and cos(UDFunction) |
| sqrt(UDFunction)      | square root performed on UDFunction                     |
| exp(UDFunction)       | exponential performed on UDFunction                     |
| log(UDFunction, base) | logarithms of base: base                                |
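
Each of these functions applies the chain rule: it computes the function value and multiplies the local derivative by the derivative already carried by its argument. The sketch below illustrates this with a small stand-in class named `Dual` (an assumption used in place of `UDFunction`); the signatures mirror the table but are not final.

```python
import math
from dataclasses import dataclass

@dataclass
class Dual:
    """Stand-in for UDFunction in this sketch: a value and a single derivative."""
    val: float
    der: float = 0.0

def sin(a):
    return Dual(math.sin(a.val), math.cos(a.val) * a.der)

def cos(a):
    return Dual(math.cos(a.val), -math.sin(a.val) * a.der)

def exp(a):
    return Dual(math.exp(a.val), math.exp(a.val) * a.der)

def sqrt(a):
    root = math.sqrt(a.val)
    return Dual(root, a.der / (2 * root))

def log(a, base=math.e):
    # d/dx log_b(u) = u' / (u * ln b)
    return Dual(math.log(a.val, base), a.der / (a.val * math.log(base)))

x = Dual(2.0, 1.0)          # seed dx/dx = 1
r = log(sqrt(x), base=10)   # log10(sqrt(x)); derivative is 1 / (2 x ln 10)
print(r.val, r.der)
```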

**GraphGenerator:**

This class contains utility functions that generate the computational graph for a given UDFunction built from a user defined function.

| Method                     | Description                                                             |
|----------------------------|-------------------------------------------------------------------------|
| generate(UDFunction, mode) | generate the computational graph for either forward mode or reverse mode |

**Utils:**

This class contains all other utils that are needed in the library, such as timing methods, etc.

| Method           | Description                             |
|------------------|-----------------------------------------|
| time(operation)  | calculate the time of an operation      |
| log(information) | library log of some informative strings |
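
A timing helper along these lines could be as small as the sketch below. The name `timed` and its return shape are assumptions; only `time.perf_counter` from the standard library is used.

```python
import time

def timed(operation, *args, **kwargs):
    """Run operation(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = operation(*args, **kwargs)
    return result, time.perf_counter() - start

# Example: time a built-in call.
result, elapsed = timed(sum, range(1000))
print(result, elapsed)
```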

### 5.3 External Dependencies

We plan to keep the code that computes derivatives in one Python file and all tests in a separate file. Both `TravisCI` and `CodeCov` will be used to monitor the test suite, and the package will be uploaded to `PyPI` following the instructions given in class.

<p style="color:red">We will use the <b>NetworkX</b> package for constructing the visualization for the computational graph.</p>



## 6. Licensing

We will use the `MIT` license for open source software development so that other people who are interested in our software are able to contribute.

- Rationale for our choice: We want it to be simple and permissive.
- Under the `MIT` license, anyone can contribute to this project by adding functionality, debugging, or customizing it to meet their needs.


<style>
H5{color:DarkOrange !important;}
</style>
#### Milestone 1 feedback

1.75/2 Software Organization
Will you use a framework? If so, which one and why? If not, why not?
Please move discussion on what external dependencies you will rely on to the implementation section and explain why you would like to include them. 

3.75/4 Implementation
Classes and methods are very well thought-through. 
What core data structures will you use? How will you store the derivatives? will you use the dictionary, list or some other data structures? 

##### We thank the TF for the helpful feedback. We addressed the points raised in the comments with text in each respective section.