# CS207 Milestone1

## Introduction


We are building a software library that lets users apply Automatic Differentiation (AD) to compute derivatives to machine precision, which is particularly useful in scientific computing. Automatic Differentiation is a technique by which a computer breaks the evaluation of a function into elementary steps and computes the derivative alongside the value.

Every program, no matter how complex, ultimately executes a sequence of elementary arithmetic operations and elementary functions. The derivatives of these pieces are known, so they can be combined via the chain rule to differentiate the whole computation. Our software library will take such composite functions and produce accurate derivatives via Automatic Differentiation.

## Background
#### What is AD?
As mentioned in the introduction, computers evaluate elementary arithmetic operations and functions extremely well. When a derivative is approximated numerically with a finite difference, however, the result depends on the step size used in place of the limit: a step that is too large introduces truncation error, and a step that is too small amplifies round-off error. In either case, the gap between the approximation and the true value can be significant and hard to predict. In most computing scenarios, and especially scientific ones, accuracy is non-negotiable. AD addresses this with the chain rule: the derivative of each sub-expression is computed exactly and combined recursively to obtain the final derivative, so no step size is involved.
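
The step-size problem can be seen in a few lines of Python. The sketch below is illustrative only and is not part of the package: it approximates the derivative of `sin` at `x = 1` with a forward difference (the helper name `forward_difference` is our own) and compares the result with the exact value `cos(1)`.

```python
import math

def forward_difference(f, x, h):
    """Approximate f'(x) with the one-sided difference quotient."""
    return (f(x + h) - f(x)) / h

exact = math.cos(1.0)  # d/dx sin(x) evaluated at x = 1

# A large step suffers from truncation error; a tiny step suffers from round-off.
errors = {}
for h in (1e-1, 1e-8, 1e-15):
    approx = forward_difference(math.sin, 1.0, h)
    errors[h] = abs(approx - exact)
    print(f"h = {h:.0e}   error = {errors[h]:.2e}")
```

Of the three steps, the moderate one gives the smallest error. AD sidesteps the choice of `h` because it never forms a difference quotient.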


#### How does AD work?
An example of how AD works is provided below using the chain rule:

$$
\begin{aligned}
y = f(g(h(x))) = f(g(h(w_{0}))) = f(g(w_{1})) = f(w_{2}) = w_{3}
\end{aligned}
$$

$$
\begin{aligned}
w_{0} &= x\\
w_{1} &= h(w_{0})\\
w_{2} &= g(w_{1})\\
w_{3} &= f(w_{2}) = y
\end{aligned}
$$

$$
\frac{\partial{y}}{\partial{x}} = \frac{\partial{y}}{\partial{w_{2}}} \cdot \frac{\partial{w_{2}}}{\partial{w_{1}}} \cdot \frac{\partial{w_{1}}}{ \partial{x}}
$$

Forward mode traverses the chain rule from the inside to the outside, while reverse mode traverses it from the outside to the inside. In the case above:

Forward mode calculates: $\frac{\partial{w_{i}}}{\partial{x}} = \frac{\partial{w_{i}}}{\partial{w_{i-1}}}\cdot \frac{\partial{w_{i-1}}}{\partial{x}}$ with $w_{3} = y$

Reverse mode calculates: $\frac{\partial{y}}{\partial{w_{i}}} = \frac{\partial{y}}{\partial{w_{i+1}}} \cdot \frac{\partial{w_{i+1}}}{\partial{w_{i}}}$ with $w_{0} = x$
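
To make the two recursions concrete, here is a small hand-unrolled sketch. The choices $h(x) = x^2$, $g = \sin$, $f = \exp$ and the evaluation point $x = 2$ are our own assumptions for illustration; they do not come from the package.

```python
import math

x = 2.0

# Evaluate the chain w0 -> w1 -> w2 -> w3.
w0 = x
w1 = w0 ** 2          # h(w0)
w2 = math.sin(w1)     # g(w1)
w3 = math.exp(w2)     # f(w2) = y

# Local derivatives of each elementary step.
dw1_dw0 = 2 * w0
dw2_dw1 = math.cos(w1)
dw3_dw2 = math.exp(w2)

# Forward mode: propagate dw_i/dx from the inside out, seeded with dw0/dx = 1.
dw0_dx = 1.0
dw1_dx = dw1_dw0 * dw0_dx
dw2_dx = dw2_dw1 * dw1_dx
dw3_dx = dw3_dw2 * dw2_dx

# Reverse mode: propagate dy/dw_i from the outside in, seeded with dy/dw3 = 1.
dy_dw3 = 1.0
dy_dw2 = dy_dw3 * dw3_dw2
dy_dw1 = dy_dw2 * dw2_dw1
dy_dw0 = dy_dw1 * dw1_dw0

print(dw3_dx, dy_dw0)  # both are dy/dx
```

Both sweeps multiply the same three local derivatives and differ only in the order of accumulation, so they agree on $\partial y / \partial x$.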


## How to Use *AutoDiff*

There are two ways to install our package.

#### Method 1: User installation via ```pip```
Users can install our package with `pip` using the following commands:

Create a virtual environment and call it `env`.
```bash
virtualenv env
```

Activate the virtual environment and install the package.
```bash
source env/bin/activate
pip install AutoDiff-StanAndyJohn
```

Open a Python interpreter in the virtual environment and import the module.
```python
>>> import AutoDiff.AutoDiff as ad
```

#### Method 2: Installation via GitHub (for developers and users)
Users can install our package from GitHub using the following commands:

```bash
git clone https://github.com/StanAndyJohn/cs207-FinalProject.git
cd cs207-FinalProject
```
Create a virtual environment and call it `env`.
```bash
virtualenv env
```

Activate the virtual environment and install the dependencies.
```bash
source env/bin/activate
pip install -r requirements.txt
```

Open a Python interpreter in the virtual environment and import the module.

```python
>>> import AutoDiff.AutoDiff as ad
```

#### Introduction to basic usage of the package

After successful installation, the user will first import our package.
```python
>>> import AutoDiff.AutoDiff as ad
```
The package supports the following use cases:

##### Scalar functions of scalar values
Goal: gradient of the expression $f(x) = \alpha x + 6$, here with $\alpha = 7$.

Input: a variable `x` and then the symbolic expression for `f`.
```python
>>> x = ad.Variable(7, name='x')
>>> f = 7 * x + 6
```
Special functions such as `sin`, `cos`, and `exp` can be used in the expression as well:
```python
>>> f = 7 * ad.sin(x) + 6
```
Goal: evaluate the value and gradient of `f` with respect to `x`.
```python
>>> print(f.val, f.der)
```
`f.val` returns the value of `f`.

`f.der` returns the gradient of `f` with respect to `x`.
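
For readers who want to see what such a pair of attributes could hold, below is a minimal dual-number sketch. The class `Dual` and the free function `sin` are hypothetical stand-ins written for this example; they are not the package's `ad.Variable` or `ad.sin`, whose internals may differ.

```python
import math

class Dual:
    """Hypothetical stand-in for ad.Variable: a value plus one derivative."""

    def __init__(self, val, der=1.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        # Plain numbers are treated as constants with derivative 0.
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

def sin(u):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(u.val), math.cos(u.val) * u.der)

x = Dual(7.0)
f = 7 * x + 6
g = 7 * sin(x) + 6
print(f.val, f.der)  # 55.0 7.0
print(g.val, g.der)
```

In this sketch each arithmetic operation updates the value and the derivative together, which is the essence of forward mode.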

Goal: second derivative of `f` with respect to `x`.
```python
>>> print(f.der2)
```
`f.der2` returns the second derivative of `f` with respect to `x`.

##### Scalar functions of vectors - Type 1
Goal: gradient of the expression $f(x_1,x_2) = x_1 x_2 + x_1$. 
Input: two variables `x1` and `x2` and the symbolic expression for `f`.
```python
>>> x1 = ad.Variable(2, name='x1')
>>> x2 = ad.Variable(3, name='x2')
>>> f = x1 * x2 + x1
```
Goal: values and gradients of f with respect to x1 and x2
```python
>>> print(f.val, f.der)
```
`f.val` returns dictionaries of the values of `f`.

`f.der` returns dictionaries of the gradients of `f` with respect to `x1` and `x2`.

Goal: second derivatives of f with respect to x1 and x2
```python
>>> print(f.der2)
```

`f.der2` will then contain the second derivatives of `f` with respect to `x1` and `x2`, i.e., $\frac{\partial^2 f}{\partial x_1^2}$, $\frac{\partial^2 f}{\partial x_2^2}$, $\frac{\partial^2 f}{\partial x_1 \partial x_2}$ and $\frac{\partial^2 f}{\partial x_2 \partial x_1}$, as a dictionary with keys `'x1x1'`, `'x2x2'`, `'x1x2'` and `'x2x1'` respectively.
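
As a sanity check, the derivatives for this example can be worked out by hand. The snippet below is plain Python, not package output; it uses the `'x1x1'`-style keys described above, while the keys `'x1'` and `'x2'` for the first derivatives are our assumption about the layout.

```python
# Hand-derived values for f(x1, x2) = x1 * x2 + x1 at x1 = 2, x2 = 3.
x1, x2 = 2.0, 3.0

val = x1 * x2 + x1            # 8.0

der = {
    'x1': x2 + 1.0,           # df/dx1 = x2 + 1 = 4.0
    'x2': x1,                 # df/dx2 = x1     = 2.0
}

der2 = {
    'x1x1': 0.0,              # f is linear in x1
    'x2x2': 0.0,              # f is linear in x2
    'x1x2': 1.0,              # mixed partials of the x1 * x2 term
    'x2x1': 1.0,
}

print(val, der, der2)
```

The two mixed entries are equal because the mixed partial derivatives of a smooth function do not depend on the order of differentiation.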

##### Scalar functions of vectors - Type 2
Goal: gradient of the expression $f(x_1, x_2) = (x_1 - x_2)^2$ where $x_1$ and $x_2$ are vectors themselves. 

Input: two variables `x1` and `x2` and the symbolic expression for `f`.
```python
>>> x1 = ad.Variable([2, 3, 4], name='x1')
>>> x2 = ad.Variable([3, 2, 1], name='x2')
>>> f = (x1 - x2)**2
```
Goal: values, gradients, and second derivatives of `f` with respect to $x_1$ and $x_2$
```python
>>> print(f.val, f.der, f.der2)
```
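
The expected numbers for this example can be reproduced with plain lists, assuming the subtraction and the square are applied element by element. That elementwise reading is our assumption; the package may organize vector results differently.

```python
x1 = [2.0, 3.0, 4.0]
x2 = [3.0, 2.0, 1.0]

# f_k = (x1_k - x2_k)^2 for each component k
val = [(a - b) ** 2 for a, b in zip(x1, x2)]      # [1.0, 1.0, 9.0]

# df_k/dx1_k = 2 (x1_k - x2_k), df_k/dx2_k = -2 (x1_k - x2_k)
der_x1 = [2 * (a - b) for a, b in zip(x1, x2)]    # [-2.0, 2.0, 6.0]
der_x2 = [-2 * (a - b) for a, b in zip(x1, x2)]   # [2.0, -2.0, -6.0]

print(val, der_x1, der_x2)
```

The two gradients are negatives of each other, as expected for a function of the difference $x_1 - x_2$.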

##### Vector functions of vectors
Goal: gradients of the system of functions 
$$f_1 = x_1 x_2 + x_1$$
$$f_2 = \frac{x_1}{x_2}$$

i.e.
$$\mathbf{f}(x_1,x_2)=(f_1(x_1,x_2),f_2(x_1,x_2))$$
Input: two variables `x1` and `x2` and the symbolic expression for `f`.
```python
>>> x1 = ad.Variable(3, name='x1')
>>> x2 = ad.Variable(2, name='x2')
>>> f1 = x1 * x2 + x1
>>> f2 = x1 / x2
```
Goal: values and gradients of `f1` and `f2` with respect to `x1` and `x2`
```python
>>> print(f1.val, f2.val, f1.der, f2.der)
```
The rows of the Jacobian $\mathbf{J}(\mathbf{f})$ are the gradients of $f_1$ and $f_2$, i.e., `f1.der` and `f2.der`.
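
For reference, the Jacobian of this system can be written out analytically and checked against central differences. The helpers `f`, `jacobian`, and `central_difference_column` below are our own names for this sketch and are not part of the package.

```python
def f(x1, x2):
    """The system f = (f1, f2) from the example above."""
    return (x1 * x2 + x1, x1 / x2)

def jacobian(x1, x2):
    """Analytic Jacobian: row i holds the gradient of f_i."""
    return [[x2 + 1.0, x1],
            [1.0 / x2, -x1 / x2 ** 2]]

def central_difference_column(i, point, h=1e-6):
    """Numerically differentiate every f_k with respect to input i."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(*plus), f(*minus))]

J = jacobian(3.0, 2.0)
numeric_columns = [central_difference_column(i, (3.0, 2.0)) for i in range(2)]

print(J)  # [[3.0, 3.0], [0.5, -0.75]]
print(numeric_columns)
```

The finite-difference check is only approximate; an AD implementation should match the analytic entries up to floating-point rounding.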

Goal: second derivatives (Hessian matrix)
```python
>>> print(f1.der2, f2.der2)
```

## Software Organization 
###### Discuss how you plan on organizing your software package.

- What will the directory structure look like?
```bash
├── AutoDiff
│   ├── __init__.py
│   ├── AutoDiff.py
│   └── file2.py
├── demos
│   ├── demo1.py
│   └── demo2.py
├── tests
│   ├── testforBasicFeature.py
│   └── testforAdditionalFeatures.py
├── docs
│   ├── milestone1.ipynb
│   └── milestone2.ipynb
├── .codecov.yml
├── .travis.yml
├── LICENSE.md
├── README.md
└── requirements.txt
```
- What modules do you plan on including? What is their basic functionality?

   - \_\_init__.py: initializes the package
   - AutoDiff.py: implements the basic data structures and algorithms for the forward mode of automatic differentiation, including elementary functions/methods and operator overloading methods


- Where will your test suite live? Will you use TravisCI? CodeCov?

   Our test suite will live in the tests folder. We plan to have two test files: one for the basic AD features and one for additional features. TravisCI and CodeCov will be used in our project.
    
    
- How will you distribute your package (e.g. PyPI)?

   We will distribute our package on PyPI. More information regarding how to use our package is discussed in the How To Use section. 
   
   
- How will you package your software? Will you use a framework? If so, which one and why? If not, why not?
   
   We will closely follow the instructions at https://packaging.python.org/tutorials/packaging-projects/ to package our software. For now, we have decided not to use a framework to package our software because it will consist only of a few Python modules and other files that do not depend on other frameworks. Python's standard packaging tools should be sufficient for our software.
   
   
- Other considerations?

  If time allows, we may build a user-friendly UI for our software, using a Python web framework such as Django or Flask.



## Implementation
###### Discuss how you plan on implementing the forward mode of automatic differentiation.

- What are the core data structures?
   - dictionary: used to keep track of the partial derivatives
   
   
- What classes will you implement? What method and name attributes will your classes have?

| Classes | Description | Attributes | Methods |
| :- | :- | :- | :- |
| AutoDiff | an auto-differentiation class with overloaded operators | der: dictionary of derivatives | Elementary Functions/Methods: sin, sinh, arcsin, cos, cosh, arccos, tan, tanh, arctan, exp, log <br>Operator Overloading Methods: \_\_add__, \_\_radd__, \_\_sub__, \_\_rsub__, \_\_mul__, \_\_rmul__, \_\_pow__, \_\_rpow__, \_\_truediv__, \_\_rtruediv__, \_\_pos__, \_\_neg__ |
| AutoDiffTest | a class with the test methods for the AutoDiff class | | Comprehensive test methods for each method in the AutoDiff class |

- What external dependencies will you rely on?
   
   - numpy: ~1.17.x
   - scipy: ~1.3.x
   

- How will you deal with elementary functions like sin, sqrt, log, and exp (and all the others)?
  
  We will implement these elementary functions in our AutoDiff class in AutoDiff.py, covering the elementary functions listed in the class description above.
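
The plan above (a dictionary of partial derivatives, overloaded operators, and elementary functions) could be sketched roughly as follows. This is only an illustration under our own assumptions: the class name `AutoDiffSketch`, its constructor signature, and the free functions `exp` and `log` are hypothetical and are not the final design.

```python
import math

class AutoDiffSketch:
    """Illustrative only: a value plus a dict of partial derivatives by name."""

    def __init__(self, val, der=None, name=None):
        self.val = val
        # An independent variable seeds its own partial derivative with 1.
        self.der = der if der is not None else {name: 1.0}

    @staticmethod
    def _lift(other):
        # Constants carry an empty dictionary: all partials are zero.
        if isinstance(other, AutoDiffSketch):
            return other
        return AutoDiffSketch(other, der={})

    def __add__(self, other):
        other = self._lift(other)
        keys = set(self.der) | set(other.der)
        der = {k: self.der.get(k, 0.0) + other.der.get(k, 0.0) for k in keys}
        return AutoDiffSketch(self.val + other.val, der)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        keys = set(self.der) | set(other.der)
        # Product rule applied to each partial derivative.
        der = {k: self.der.get(k, 0.0) * other.val
                  + self.val * other.der.get(k, 0.0) for k in keys}
        return AutoDiffSketch(self.val * other.val, der)

    __rmul__ = __mul__

def exp(u):
    u = AutoDiffSketch._lift(u)
    v = math.exp(u.val)
    # Chain rule: (exp u)' = exp(u) * u'
    return AutoDiffSketch(v, {k: v * d for k, d in u.der.items()})

def log(u):
    u = AutoDiffSketch._lift(u)
    # Chain rule: (log u)' = u' / u
    return AutoDiffSketch(math.log(u.val), {k: d / u.val for k, d in u.der.items()})

x1 = AutoDiffSketch(2.0, name='x1')
x2 = AutoDiffSketch(3.0, name='x2')
f = x1 * x2 + x1
g = log(exp(x1 * x2))
print(f.val, f.der)
print(g.val, g.der)
```

Keying the dictionary by variable name lets one expression carry partial derivatives for any number of inputs, at the cost of merging key sets in every binary operation.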
