# Introduction

This software addresses the problem of accurate differentiation.
Accurate differentiation is important in many fields, such as machine learning, numerical methods, and optimization.
Knowing the derivative of a non-linear function accurately lets programmers and mathematicians take derivatives quickly
when their research or project does not depend on the individual steps of differentiating a function,
but only on the correct value of the derivative of a given equation.

Unlike finite-difference numerical differentiation, which is an approximation, automatic differentiation (here implemented with dual numbers)
computes derivatives that are exact to within machine precision by combining the known derivatives of elementary primitives. This allows
high-performance, highly accurate computation of derivatives, which is useful in many fields.
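
As a rough illustration of the approximation error mentioned above, the sketch below applies a one-sided finite difference to `math.log` at `x = 5` for several step sizes and compares the result with the exact derivative `1/x`. The helper name `forward_difference` and the chosen step sizes are assumptions made for this example and are not part of the package.

```python
import math


def forward_difference(f, x, h):
    """Approximate f'(x) with a one-sided finite difference of step h."""
    return (f(x + h) - f(x)) / h


x = 5.0
exact = 1.0 / x  # d/dx log(x) = 1/x, the value AD aims to reproduce

# Large steps suffer from truncation error, tiny steps from round-off error.
errors = {
    h: abs(forward_difference(math.log, x, h) - exact)
    for h in (1e-1, 1e-6, 1e-13)
}

for h, err in errors.items():
    print(f"h = {h:g}: absolute error = {err:.2e}")
```

No single step size removes both sources of error, which is the motivation for computing the derivative with exact rules instead.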

This software package does exactly that for functions of any number of variables, including complicated derivatives that would otherwise be
extremely challenging to evaluate by hand. This should help reduce errors compared with numerical methods.




# Background

The main mathematical idea behind automatic differentiation is to break the process of differentiation down
into specific, repeatable steps. We do so by decomposing each equation into elementary operations and functions
such as addition, subtraction, multiplication, division, powers, exponentials, logarithms, cos, sin, tan, etc.
To do this, automatic differentiation uses the chain rule to break derivatives of composite functions down into easily solvable components.
The benefit of this approach is that the derivative is evaluated as accurately as possible, up to machine precision, unlike numerical differentiation.
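
The decomposition described above can be sketched by hand for a small function. The example below assumes the hypothetical function `f(x) = sin(x^2) + log(x)` and records each elementary step together with its derivative, in the way an evaluation trace would. The variable names `v0` to `v4` are illustrative only.

```python
import math

x = 2.0

# Each row is (step, value, derivative with respect to x).
trace = []

# v0 = x, seeded with dx/dx = 1
v0, dv0 = x, 1.0
trace.append(("v0 = x", v0, dv0))

# v1 = v0^2, power rule: dv1 = 2 * v0 * dv0
v1, dv1 = v0 ** 2, 2.0 * v0 * dv0
trace.append(("v1 = v0^2", v1, dv1))

# v2 = sin(v1), chain rule: dv2 = cos(v1) * dv1
v2, dv2 = math.sin(v1), math.cos(v1) * dv1
trace.append(("v2 = sin(v1)", v2, dv2))

# v3 = log(v0), chain rule: dv3 = dv0 / v0
v3, dv3 = math.log(v0), dv0 / v0
trace.append(("v3 = log(v0)", v3, dv3))

# v4 = v2 + v3, sum rule: dv4 = dv2 + dv3
v4, dv4 = v2 + v3, dv2 + dv3
trace.append(("f = v2 + v3", v4, dv4))

for step, value, deriv in trace:
    print(f"{step:<14} value = {value: .6f}   derivative = {deriv: .6f}")
```

The last row holds `f(2)` and `f'(2)`, and the derivative agrees with the closed form `2x cos(x^2) + 1/x`.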

Automatic differentiation can be carried out in two modes: forward accumulation and reverse accumulation.
The workings of each mode are described in more detail below.

## Forward Accumulation 
In this mode we break down the equation by following the chain rule from the inside out, as we would when doing it by hand. Each forward pass computes a Jacobian-vector product, where the vector is the "seed" that selects the direction of differentiation, so this approach is well suited to computing accurate Jacobians.
Because AD inherently keeps track of every operation in an evaluation trace, the same machinery can also be extended to higher-order derivative matrices such as Hessians.
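
A minimal sketch of forward accumulation with a seed vector is shown below. It assumes dual numbers are represented as plain `(value, tangent)` tuples and uses the hypothetical function `f(x, y) = x * y + x`; the helper names `d_add`, `d_mul` and `directional_derivative` are made up for this example and are not the package API.

```python
def d_add(a, b):
    """Sum rule on (value, tangent) pairs."""
    return (a[0] + b[0], a[1] + b[1])


def d_mul(a, b):
    """Product rule on (value, tangent) pairs."""
    return (a[0] * b[0], a[1] * b[0] + a[0] * b[1])


def f(x, y):
    """Hypothetical example function f(x, y) = x * y + x."""
    return d_add(d_mul(x, y), x)


def directional_derivative(point, seed):
    """One forward pass: returns (f(point), gradient of f at point dotted with seed)."""
    x = (point[0], seed[0])
    y = (point[1], seed[1])
    return f(x, y)


point = (3.0, 4.0)

# Unit seeds pick out one partial derivative per pass.
_, df_dx = directional_derivative(point, (1.0, 0.0))  # df/dx = y + 1 = 5.0
_, df_dy = directional_derivative(point, (0.0, 1.0))  # df/dy = x = 3.0

# An arbitrary seed p gives the derivative along p in a single pass.
value, along_p = directional_derivative(point, (2.0, -1.0))  # 2 * 5.0 - 1 * 3.0 = 7.0

print(df_dx, df_dy, value, along_p)
```

One pass is needed per seed, so the cost of a full gradient in forward mode grows with the number of inputs.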

## Reverse Accumulation
In this mode, the dependent variable is fixed and the derivative is computed backward recursively. In other words, this mode traverses the chain rule backward, from the outside toward the inside.
Backpropagation of errors in multilayer perceptrons is a special case of reverse mode, so reverse mode is a very efficient way of computing these backpropagated errors
and ultimately makes it possible to optimize the weights of a neural network.
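
The backward traversal can be sketched as follows, assuming a small computational graph in which every node stores its parents and the local partial derivative with respect to each of them. The `Var` class, the `sin` helper and the `backward` function are hypothetical names for this illustration and do not describe the package's `reverse.py`.

```python
import math


class Var:
    """Node in a computational graph (illustrative only)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent_node, local_partial) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(v):
    return Var(math.sin(v.value), ((v, math.cos(v.value)),))


def backward(output):
    """Propagate d(output)/d(node) from the output back to the inputs."""
    order, seen = [], set()

    def visit(node):
        # Depth-first search that lists every node after all of its parents.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed order guarantees a node's gradient is complete before it is used.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


x, y = Var(2.0), Var(3.0)
f = x * y + sin(x)  # hypothetical example: f(x, y) = x * y + sin(x)
backward(f)

print(x.grad, y.grad)  # df/dx = y + cos(x), df/dy = x
```

A single backward pass fills in the gradient with respect to every input, which is why this mode suits functions with many inputs and one output, such as a neural network loss.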

### Example Evaluation Trace for a Simple Neural Network
![](https://raw.githubusercontent.com/matheuscfernandes/cs107_matheus_fernandes/master/homework/HW4/HW4-final/P2_graph.png?token=ACDGXVNZ5KUZRP2NTJI5UQC7WWAL6)


# How to use _AAD_ ("Awesome Automatic Differentiation")

## Initialization

* `git clone` this package repository into your project directory.
* Import our package using `import AAD`
* Consult the documentation for quick examples to get started.

## Code example
```python
import math

import AAD

# Assumes the AAD class is defined in the AAD module (AAD.py).
my_AD = AAD.AAD(math.log)
my_AD.derive()
val, derivative = my_AD.evaluate(5)  # evaluate at x = 5 (type Dual)
print("val = ", val, ", derivative = ", derivative)
```
Answer:
```
<AAD: automatically differentiated in 10 iterations>
val = 1.60943791, derivative = 0.200
```

# Organization

## Directory structure and modules
* We will have a main importable class that documents each function and how to use it.

```
README.md
docs/                    Contains the documentation for the project.
   README.md
   milestone1.ipynb
   milestone2.ipynb
   milestone2_progress.ipynb
   ...
code/                    Source files
   types.py                 Shared types
   shared.py                Shared utility functions
   AAD.py                   Main constructor
   forward.py               Forward mode code
   reverse.py               Reverse mode code
tests/                   Contains the test suite for the project
```

## Modules

* The modules that we will use are `numpy`, `math`, `SymPy`, and `SciPy`.
   * `numpy` will be used to evaluate and analyze arrays.
   * `math` will be used for its simple mathematical functions.
   * `SymPy` will potentially be used to take symbolic derivatives and will be useful in our test suite. Additionally,
   if a function is not among our elementary functions, we can use this module to help evaluate it.
   * `SciPy` will be useful for testing how our automatic differentiator compares to numerical derivatives (speed test).
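
As a sketch of how a symbolic reference derivative could be produced for the test suite, the example below uses SymPy to differentiate `log(x)` and evaluate the result at `x = 5`. It assumes SymPy is installed; the variable names are illustrative and the actual tests may be organized differently.

```python
import sympy

x = sympy.symbols("x")
expr = sympy.log(x)

# Symbolic derivative, used here as the reference answer.
dexpr = sympy.diff(expr, x)  # 1/x

# Turn the symbolic derivative into a plain numeric function.
reference = sympy.lambdify(x, dexpr, "math")

expected = reference(5.0)
print(dexpr, expected)
```

A test could then compare `expected` against the derivative reported by the automatic differentiator at the same point.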

## Test suite
* Our test suite will live inside the `tests` folder within our main repository.
* We will use `TravisCI` to test our suite.

## Distribution and packaging
* We will distribute this package using `PyPI` and through the GitHub repository.
* We will package the project with `package` and will not use a framework, because this project is simple enough
that we can manage without one.


# Implementation

## Data structures
The core data structures of this software consist of a `Dual` number class holding dual numbers and auxiliary classes to hold elementary functions required to perform algebra with `Dual` numbers.

## Classes and method signatures
### The `Dual` number class
Includes representation and string functions, i.e. `__repr__` and `__str__`.

Allows retrieval of the function value (i.e. `val()`) and derivative value (i.e. `deriv()`).
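
A minimal sketch of what such a class could look like is given below. Only `__repr__`, `__str__`, `val()` and `deriv()` come from the description above; the constructor signature, the private attribute names and the string format are assumptions made for illustration.

```python
class Dual:
    """Sketch of a dual number a + b*eps, where eps**2 == 0."""

    def __init__(self, val, deriv=0.0):
        # Private attribute names are an assumption of this sketch.
        self._val = val
        self._deriv = deriv

    def val(self):
        """Return the function value (real part)."""
        return self._val

    def deriv(self):
        """Return the derivative value (dual part)."""
        return self._deriv

    def __repr__(self):
        return f"Dual({self._val!r}, {self._deriv!r})"

    def __str__(self):
        return f"{self._val} + {self._deriv} eps"


d = Dual(1.5, 0.25)
print(repr(d))  # Dual(1.5, 0.25)
print(d)        # 1.5 + 0.25 eps
```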

### The `AAD` class
Includes APIs used for performing Automatic Differentiation, such as:
* `derivative(f, x)`
* `gradient(f, x)`
* `jacobian(f, x)`
* `hessian(f, x)`

### Attributes

Our classes will have the following attributes:
* `__init__(f)` -- initializes the object and all of its variables
* `orderofoperations()` -- finds the order of operations
* `derive()` -- finds the derivative of each function
* `evaluate(X)` -- evaluates the function and its derivatives at a given point

### External dependencies

For matrix support, this software package requires `numpy`, `math`, `SymPy`, and `SciPy`.

### Vector Valued Functions

For these functions, we will evaluate each component of the vector-valued function independently before moving on to the next component.
We will store each evaluation in a numpy array, akin to a Jacobian, to output the correct derivative.
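
The component-by-component approach could look roughly like the sketch below. It assumes a stripped-down `Dual` class supporting only `+` and `*`, and a hypothetical helper `jacobian_by_components`; neither is the package's actual implementation.

```python
import numpy as np


class Dual:
    """Stripped-down dual number supporting only + and * (illustrative)."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)


def jacobian_by_components(components, point):
    """Evaluate each component in turn, with one forward pass per input."""
    n_out, n_in = len(components), len(point)
    J = np.zeros((n_out, n_in))
    for i, f_i in enumerate(components):
        for j in range(n_in):
            # Seed input j with derivative 1 and every other input with 0.
            args = [Dual(v, 1.0 if k == j else 0.0)
                    for k, v in enumerate(point)]
            J[i, j] = f_i(*args).der
    return J


# Hypothetical example: F(x, y) = [x * y, x + y, x * x]
components = [
    lambda x, y: x * y,
    lambda x, y: x + y,
    lambda x, y: x * x,
]
J = jacobian_by_components(components, [2.0, 3.0])
print(J)  # rows: [3, 2], [1, 1], [4, 0]
```

Row `i` of the array holds the partial derivatives of component `i`, so the result has one row per output and one column per input.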

### Elementary Operators

We will overload the addition, subtraction, multiplication, division, and power operators to work within our software using
the baseline dunder methods and their reverse versions (e.g. `__mul__` and `__rmul__`), as well as the negation operator (`__neg__`).
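
One way the overloading could be arranged is sketched below. The attribute names `val` and `der` and the helper `_lift` are assumptions of this example, and the power operator is limited to constant real exponents, so `__rpow__` and `Dual` exponents are left out.

```python
class Dual:
    """Minimal dual number with overloaded arithmetic (illustrative only)."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(other):
        """Treat plain numbers as constants with zero derivative."""
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__  # addition is commutative

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, other):
        return self + (-Dual._lift(other))

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__  # multiplication is commutative

    def __truediv__(self, other):
        other = Dual._lift(other)
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)

    def __rtruediv__(self, other):
        return Dual._lift(other) / self

    def __pow__(self, n):
        # Constant real exponent only.
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.der)


x = Dual(2.0, 1.0)          # seed dx/dx = 1
y = 3 * x ** 2 - 1 / x + 4  # f(x) = 3x^2 - 1/x + 4
print(y.val, y.der)         # 15.5 12.25
```

The reverse methods are what let expressions such as `3 * x` and `1 / x` work when the left operand is a plain number.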

### Elementary functions
Elementary functions are implemented using built-in Python `math` and the `numpy` package and include ready-made implementations for `Dual` numbers in this package.

To deal with elementary functions such as `sin`, `log` and `exp`, we will define their derivatives manually and add them to a database to query.
* The derivative of `sin(x)` = `x' * cos(x)`
* The derivative of `log(x)` = `x' * 1/x`
* The derivative of `exp(x)` = `x' * exp(x)`

Additionally, we will do the same for the other trigonometric functions.
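
The rules listed above could be stored and queried roughly as in the sketch below, which assumes the "database" is a plain dictionary. The `namedtuple` stand-in for `Dual`, the table name `ELEMENTARY` and the helper `apply_elementary` are hypothetical.

```python
import math
from collections import namedtuple

# Stand-in for the package's Dual class; the field names are assumptions.
Dual = namedtuple("Dual", ["val", "der"])

# Lookup table: name -> (function, derivative of the function).
ELEMENTARY = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda v: -math.sin(v)),
    "exp": (math.exp, math.exp),
    "log": (math.log, lambda v: 1.0 / v),
}


def apply_elementary(name, x):
    """Apply a registered function to a Dual using the chain rule."""
    func, dfunc = ELEMENTARY[name]
    return Dual(func(x.val), x.der * dfunc(x.val))


x = Dual(5.0, 1.0)  # seed x' = 1
result = apply_elementary("log", x)
print(result.val, result.der)  # log(5) and 1/5

# Composition works because each call propagates x' through the chain rule.
nested = apply_elementary("sin", apply_elementary("exp", Dual(0.0, 1.0)))
print(nested.der)  # cos(exp(0)) * exp(0) = cos(1)
```

Adding another function then only requires a new entry in the table with the function and its derivative.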

# Feedback
## Milestone 1
### 2/2 Introduction:
Would have been nice to see more about why do we care about derivatives anyways and why is AAD a solution compared to other approaches? 

**Response:** *We have added a paragraph about finite-difference derivatives and AAD in the introduction. Additionally, we noted why automatic differentiation
is preferred to numerical.*

### 1/2 Background
Good start to the background.  The flow could have been enhanced by presenting the evaluation trace and a computational graph.

I would like to see more discussion on automatic differentiation. How do forward mode and reverse mode work?

Going forward, I would also like to see a discussion on what forward mode actually computes (Jacobian-vector product), the "seed" vector, and the efficiency of forward mode.

**Response:** *We have added an example of the forward mode evaluation trace, the computational graph, and explanations for the modes*

### 3/3 How to use
Good job!

**Response:** *We left this section as is because we got a perfect score and were only given positive feedback*

### 3/3 Software Organization
It would be nice to include the directory structure tree.

**Response:** *A proposed directory structure has been included.*

### 4.5/5 Implementation
1. How will you handle vector valued functions?
2. Your implementation for elementary functions like `sin` and `log` is unclear.
3. Will you implement operator overloading methods?

**Response:** *We explained how we would implement elementary functions manually and created sections for both elementary operators and vector-valued functions.*

13.5/15