## Group 1 Milestone1 doc


### Introduction

Automatic differentiation (AD) is a family of techniques for computing the derivative of a given function automatically. It has a broad range of applications across many disciplines, such as engineering, statistics, computer science, and computational biology. Students and researchers alike need tools that compute derivatives efficiently, given the computational cost involved. Here, we propose a new Python library, `Undefined`, that implements AD for user-defined numerical functions.

One potential application is computing the gradient of a loss function, so that parameters can be updated in the direction of the negative gradient when training gradient-based machine learning models.
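As a minimal sketch of this use case, the gradient descent loop below minimizes a one-parameter quadratic loss. The loss, the learning rate, and the hand-written gradient `grad_loss` are illustrative assumptions; an AD library would supply the gradient in place of `grad_loss`.

```python
def loss(w):
    """Toy quadratic loss with its minimum at w = 3 (illustrative only)."""
    return (w - 3.0) ** 2


def grad_loss(w):
    """Hand-written derivative of loss; AD would compute this automatically."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, lr=0.1, steps=100):
    """Repeatedly step in the direction of the negative gradient."""
    w = w0
    for _ in range(steps):
        w -= lr * grad_loss(w)
    return w


w_min = gradient_descent(w0=0.0)
print(w_min)  # very close to 3.0, the minimizer of the toy loss
```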


### Background

As we learned in calculus classes, the traditional way to calculate derivatives is to work by hand and apply differentiation rules, including the power rule, product rule, and chain rule.

Here is an example in which we need the chain rule to calculate a derivative. For a composite function $f(g(x))$, the chain rule states:

${\frac{d}{dx} f(g(x)) = f'(g(x))\,g'(x)}$

Suppose we want the gradient of the function defined as follows:


${f(x, y) = \cos(5x + 7y)e^{-x}}$


We calculate the partial derivative with respect to $x$ first, ${\frac{\partial f}{\partial x}}$. Applying the product rule, with the chain rule for the cosine factor, gives:

${\frac{\partial f}{\partial x} = \cos(5x + 7y)(-e^{-x}) - 5 \sin(5x + 7y)e^{-x}}$

To simplify: 

${ \frac{\partial f}{\partial x} = -e^{-x}(\cos(5x+7y) + 5\sin(5x+7y)) }$


To calculate ${\frac{\partial f}{\partial y}}$, we only need the chain rule, because $e^{-x}$ does not depend on $y$:

${ \frac{\partial f}{\partial y} = -7\sin(5x + 7y)e^{-x} }$
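The two partial derivatives above can be sanity-checked numerically. The sketch below compares the closed-form expressions with central finite differences; the evaluation point and the step size `h` are arbitrary choices made for illustration.

```python
import math


def f(x, y):
    return math.cos(5 * x + 7 * y) * math.exp(-x)


def df_dx(x, y):
    # Closed form derived above with the product and chain rules.
    return -math.exp(-x) * (math.cos(5 * x + 7 * y) + 5 * math.sin(5 * x + 7 * y))


def df_dy(x, y):
    # Closed form derived above with the chain rule.
    return -7 * math.sin(5 * x + 7 * y) * math.exp(-x)


def central_diff(g, t, h=1e-6):
    """Approximate g'(t) with a central finite difference."""
    return (g(t + h) - g(t - h)) / (2 * h)


x0, y0 = 0.5, 0.25  # arbitrary evaluation point
approx_dx = central_diff(lambda x: f(x, y0), x0)
approx_dy = central_diff(lambda y: f(x0, y), y0)

# Both differences should be tiny if the hand derivation is right.
print(abs(approx_dx - df_dx(x0, y0)))
print(abs(approx_dy - df_dy(x0, y0)))
```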


We can construct a computational graph for this function, as shown in the following figure:


![bg_computational_graph](bg_computational_graph.jpeg)


Differentiating this function by hand is simple, but AD becomes handy when we have to compute derivatives of complicated functions.


AD has several advantages over the other two ways of computing derivatives automatically, numerical differentiation and symbolic differentiation. One of the biggest is that AD computes derivatives to machine precision, and it typically does so more efficiently than the other two methods.
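To illustrate the precision point, the sketch below applies a forward finite difference to `sin` at `x = 1` for several step sizes. The test function, the point, and the step sizes are arbitrary choices for illustration. The error shrinks as `h` decreases until floating-point round-off takes over, whereas AD applies the exact derivative rule (here `cos`) and avoids this trade-off.

```python
import math

x = 1.0
exact = math.cos(x)  # true derivative of sin at x

errors = {}
for h in (1e-1, 1e-4, 1e-8, 1e-12, 1e-15):
    # Forward difference: truncation error grows with h,
    # round-off error grows as h shrinks.
    approx = (math.sin(x + h) - math.sin(x)) / h
    errors[h] = abs(approx - exact)
    print(f"h = {h:.0e}  error = {errors[h]:.3e}")
```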

**TODO: Big O notation**

### How to use `Undefined`

***Tentative***

`Undefined` can be installed easily by running the following command:

` python -m pip install cs107-Undefined `

Users should import the package in their Python script as follows:

`import Undefined as ud`

Once the package is imported successfully, users can calculate the derivative of a given function with the following commands:

```python
# Do not run: the API shown here is tentative
import Undefined as ud

func = lambda x: x**2 + 5*x - 6

# instantiate AD object
results = ud.trace(func, x = 3)

print(results)

# Expected output:
# taking derivative...
# 11
```

### Software Organization

***Tentative***

The directory structure will look like the following:

```
./undefined
├── ./undefined/README.md
├── ./undefined/Codecov.yml
├── ./undefined/.travis.yml
├── ./undefined/src
│   └── ./undefined/src/undefined
│       ├── ./undefined/src/undefined/__main__.py
│       └── ./undefined/src/undefined/__init__.py
├── ./undefined/test
│   └── ./undefined/test/test.py
└── ./undefined/docs
    └── ./undefined/docs/milestone1
```

([link to the online file tree generator](https://tree.nathanfriend.io/?s=(%27options!(%27fancy!true~fullPath!true~trailingSlash!false~rootDot!false)~3(%273%272*README.md4cov.yml*.travis.yml4*025Test*0test5doc*0Group1-milestone1%27)~version!%271%27)*%5Cn00%20%202Undefined3source!4*Code5.py*%0154320*))

We plan to use the `numpy` and `math` modules for numerical operations, and `pandas` to store information.

We plan to put the code for computing derivatives in one Python file and all the tests in another. Both `TravisCI` and `CodeCov` will be used to monitor the test suite, and the package will be uploaded to `PyPI` following the instructions given in class.


### Implementation


The Python libraries mentioned above will help us redefine the elementary math operations so that they also propagate derivatives.

**Forward mode**

For the basic functionality, we will develop a function called `trace`, which takes a user-defined function and returns its derivatives. Note: the default is **forward** mode.

Here is a demo for $\mathbb{R}$ -> $\mathbb{R}$:


```python
# R -> R implementation
# import modules
import numpy as np
import Undefined as nd

# user defined function
f = lambda x: x - np.exp(-2.0 * np.sin(4.0 * x) * np.sin(4.0 * x))

# call the trace function in undefined, and provide input x = 2
nd.trace(f, x = 2)

# the function will return the 1st derivative when x=2.
# Expected output:
# taking derivative...
# 0.674811
```

The `trace` function can also handle multidimensional inputs. For a function $\mathbb{R}^m$ -> $\mathbb{R}$ with $m = 2$, we pass in the values for ${x}_1$ and ${x}_2$.


```python
# user defined function of two variables
f = lambda x, y: x*y + np.exp(x*y)

# call the trace function in undefined, and provide input x1 = 1 and x2 = 2
nd.trace(f, [1, 2])

# the function will return the 1st partial derivatives when x1 = 1 and x2 = 2.
# Expected output:
# taking derivative...
# [16.7781, 8.3891]
```

Our function will also handle other multidimensional cases, including $\mathbb{R}$ -> $\mathbb{R}^n$ and $\mathbb{R}^m$ -> $\mathbb{R}^n$. The difference will be the number of input values and outputs.

**Reverse mode**

The `trace` function will also be able to calculate derivatives in reverse mode when the `mode` parameter is specified. The example below is a demo:


```python
# user defined function
f = lambda x: x - np.exp(-2.0 * np.sin(4.0 * x) * np.sin(4.0 * x))

# call the trace function in undefined, and provide input x = 2
nd.trace(f, x = 2, mode = 'reverse')

# the function will return the 1st derivative when x=2.
# Expected output:
# taking derivative...
# 0.674811
```

The result should be the same as in the forward mode calculation.
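For intuition about why both modes agree, here is a minimal reverse-mode sketch. It is an illustration under stated assumptions rather than the planned implementation: the `Node` class and the `sin`, `exp`, and `backward` helpers are hypothetical names, and only subtraction and multiplication are supported. Each node records its parents together with the local partial derivatives, and `backward` accumulates adjoints from the output back to the input.

```python
import math


class Node:
    """Graph node holding a value, an adjoint, and (parent, local_grad) pairs."""

    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents
        self.grad = 0.0  # adjoint, filled in by backward()

    def __sub__(self, other):
        return Node(self.val - other.val, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        if not isinstance(other, Node):
            # Multiplication by a plain constant.
            return Node(self.val * other, ((self, other),))
        return Node(self.val * other.val, ((self, other.val), (other, self.val)))

    __rmul__ = __mul__


def sin(x):
    return Node(math.sin(x.val), ((x, math.cos(x.val)),))


def exp(x):
    return Node(math.exp(x.val), ((x, math.exp(x.val)),))


def backward(output):
    """Propagate adjoints from the output in reverse topological order."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


# Same function as the demo: f(x) = x - exp(-2 sin(4x) sin(4x)), at x = 2.
x = Node(2.0)
s = sin(4.0 * x)
y = x - exp(-2.0 * s * s)
backward(y)
print(x.grad)  # approximately 0.674811
```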

We plan to overload the arithmetic operators, including `__add__`, `__sub__`, `__mul__`, and `__truediv__`.

For the trigonometric (`sin`, `cos`, `tan`), exponential (`exp`), and logarithmic (`log`) functions, we plan to provide our own versions, because they behave differently when applied to dual numbers.
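The overloading plan above can be sketched with a minimal dual-number class. This is an illustrative sketch, not the package's implementation: the names `Dual`, `sin`, `exp`, and `derivative` are hypothetical, and only a few operators and elementary functions are covered.

```python
import math


class Dual:
    """Dual number val + der*eps with eps**2 == 0 (hypothetical helper class)."""

    def __init__(self, val, der=0.0):
        self.val = val  # function value
        self.der = der  # derivative carried alongside the value

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants with zero derivative.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual._lift(other)
        # Quotient rule.
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)

    def __rtruediv__(self, other):
        return Dual._lift(other) / self


def sin(x):
    # Chain rule: d/dx sin(u) = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def exp(x):
    # Chain rule: d/dx exp(u) = exp(u) * u'
    return Dual(math.exp(x.val), math.exp(x.val) * x.der)


def derivative(func, x):
    """Seed the input with der = 1 and read off df/dx from the result."""
    return func(Dual(x, 1.0)).der


f = lambda x: x - exp(-2.0 * sin(4.0 * x) * sin(4.0 * x))
print(derivative(f, 2.0))  # approximately 0.674811
```

Because every overloaded operation applies its differentiation rule exactly, the derivative comes out of the same pass that evaluates the function.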


Lastly, users can also use a provided function (`undefined_plot`) to visualize the original function and its derivative results.

#### Details on implementation

### Licensing

We will use the `MIT` license so that other people who are interested in our software are free to use it and contribute to it.

- Rationale for our choice: we want the license to be simple and permissive.
- Under the `MIT` license, anyone can contribute to this project by adding functionality, fixing bugs, or customizing it to meet their needs.
