# CS207 Project Group 9
# Milestone 1

*****

# I. Introduction

The software implements **automatic differentiation (AD)**, a technique for computationally evaluating the derivative of a specified function. AD is distinct from both symbolic and numerical differentiation, and holds important advantages over each. Symbolic differentiation manipulates the function's expression analytically and yields an exact derivative, but the resulting expressions can grow so quickly that it becomes infeasible for very large functions. Numerical differentiation, which uses finite-difference approximations, is computationally cheap but only approximate: it is subject to both rounding error and discretisation error. Both of these ‘traditional’ methods also become inefficient when calculating higher derivatives, or partial derivatives with respect to many inputs (an important component of gradient-based optimisation methods).
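
As a small illustration of the two error sources in numerical differentiation, the sketch below applies a forward finite difference to `math.sin` with three step sizes. The helper name `forward_difference` is hypothetical and is not part of this package.

```python
import math

def forward_difference(f, x, h):
    """Approximate f'(x) with the forward finite-difference quotient."""
    return (f(x + h) - f(x)) / h

x = 1.0
exact = math.cos(x)  # the derivative of sin is cos

# Absolute error for a large, a moderate and a very small step size.
errors = {h: abs(forward_difference(math.sin, x, h) - exact)
          for h in (1e-1, 1e-8, 1e-15)}

for h, err in errors.items():
    print(f"h = {h:.0e}   error = {err:.2e}")
```

The large step suffers from discretisation error and the very small step from rounding error, so the moderate step gives the smallest error of the three.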


Automatic differentiation addresses these problems: it evaluates derivatives to machine precision while remaining computationally efficient. As a result, it has important applications. In its ‘reverse mode’ (discussed below), it is the basis of back-propagation, a fundamental process in training neural networks, and it is used by open-source machine learning libraries such as TensorFlow. AD is also known as algorithmic differentiation: because a program that computes a numerical function can be expressed as a sequence of arithmetic operations and elementary functions, the derivative of such a program can be found by applying the chain rule to that sequence.

# II. Background

Automatic differentiation is essentially the repeated application of the chain rule. As mentioned above, any function can be considered a sequence of basic arithmetic operations and elementary functions (addition, subtraction, multiplication, division, trigonometric functions, exponentials, logarithms, etc.), and so any function can be interpreted in the following way (albeit often less simply):
$$y = f(g(h(x)))$$

This can be rewritten, with $x_0 = x$, as:
$$y = f(g(h(x_0))) = f(g(x_1)) = f(x_2) = x_3$$

Often, this decomposition is represented as a directed acyclic computational graph that traces the route from the input $x_0$ to $y$, as in the example below:

$$x_0 \xrightarrow{h} x_1 \xrightarrow{g} x_2 \xrightarrow{f} x_3 = y$$



In forward mode, automatic differentiation decomposes the function into this structure and works through each component in turn, applying the chain rule ‘inside out’. That is to say, $dx_0/dx$ is found first, followed by $dx_1/dx$ and so on, until $dy/dx$ itself is found. This requires initial values to be set for $x_0$ and its derivative $x_0'$.
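
The forward sweep can be traced by hand for a concrete chain. The sketch below assumes the illustrative choices $h(x) = x^2$, $g = \sin$ and $f = \exp$ (these are not taken from the package) and carries each value together with its derivative.

```python
import math

# Hypothetical component functions, each paired with its derivative.
h, dh = (lambda x: x ** 2), (lambda x: 2 * x)
g, dg = math.sin, math.cos
f, df = math.exp, math.exp

# Seed values: the input and its derivative with respect to itself.
x0, dx0 = 1.5, 1.0

# Each step applies the chain rule to the derivative found in the previous step.
x1, dx1 = h(x0), dh(x0) * dx0
x2, dx2 = g(x1), dg(x1) * dx1
x3, dx3 = f(x2), df(x2) * dx2

print("y =", x3, " dy/dx =", dx3)
```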


Reverse mode works in the opposite direction. Rather than starting from the innermost component and propagating derivatives outwards to each parent expression, it first evaluates the function values in a forward pass, and then starts from the output and works back towards the inputs: the derivative of $y$ with respect to each intermediate variable is found from the derivatives of $y$ with respect to the expressions that depend on it, recursively, until the gradient with respect to the input is reached.
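
For comparison, here is a hand-written reverse sweep over a chain, again assuming the illustrative choices $h(x) = x^2$, $g = \sin$ and $f = \exp$. The names `bar_x0` to `bar_x3` are hypothetical and stand for $dy/dx_i$.

```python
import math

# Hypothetical component functions, each paired with its derivative.
h, dh = (lambda x: x ** 2), (lambda x: 2 * x)
g, dg = math.sin, math.cos
f, df = math.exp, math.exp

# Forward pass: function values only.
x0 = 1.5
x1 = h(x0)
x2 = g(x1)
x3 = f(x2)

# Reverse pass: start from dy/dy = 1 and work back towards the input.
bar_x3 = 1.0
bar_x2 = bar_x3 * df(x2)   # dy/dx2
bar_x1 = bar_x2 * dg(x1)   # dy/dx1
bar_x0 = bar_x1 * dh(x0)   # dy/dx0, the final gradient

print("dy/dx =", bar_x0)
```

Both sweeps multiply the same local derivatives; only the order of multiplication differs.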


One way of achieving forward mode AD is to use dual numbers. These are an extension of the real numbers, somewhat analogous to complex numbers, in which every number carries an additional dual component with coefficient $\epsilon$, where $\epsilon^2 = 0$ and $\epsilon \neq 0$. Given any polynomial function (or, in fact, any analytic real function via its Taylor series), replacing $x$ with $x + x'\epsilon$ gives $f(x + x'\epsilon) = f(x) + f'(x)x'\epsilon$. With $x' = 1$, the dual component is exactly $f'(x)$. This provides a routine for automatically computing the derivative of $f(x)$, and so is used in forward AD.
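
A minimal sketch of dual-number arithmetic follows. The `Dual` class and the `_lift` helper are hypothetical illustrations and are not the package's `fAD` implementation; only addition, subtraction and multiplication are covered.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A number real + dual * eps, where eps ** 2 == 0."""
    real: float
    dual: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.real + other.real, self.dual + other.dual)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.real - other.real, self.dual - other.dual)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps ** 2 == 0
        return Dual(self.real * other.real,
                    self.real * other.dual + self.dual * other.real)

    __rmul__ = __mul__

def _lift(value):
    """Treat a plain number as a dual number with zero dual part."""
    return value if isinstance(value, Dual) else Dual(float(value))

x = Dual(3.0, 1.0)            # dual part 1.0 seeds dx/dx = 1
y = 3 * x * x - 4 * x
print(y.real, y.dual)         # 15.0 14.0
```

The real part carries $f(3) = 15$ and the dual part carries $f'(3) = 14$ for $f(x) = 3x^2 - 4x$.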

Sources: https://en.wikipedia.org/wiki/Automatic_differentiation,
	   http://www.columbia.edu/~ahd2125/post/2015/12/5/


# III. How to Use

## _a) Installation_ 

Currently, the best way to install the package is to download or clone the package's GitHub repo (https://github.com/CS207-Project-Group-9/cs207-FinalProject). Then, ensure that NumPy, the package's only third-party requirement, is installed (`numbers` is part of the Python standard library). The user can then create a new driver Python script in the project directory.

The package will soon be available on PyPI.

## _b) Usage_

### i) Importing

The package can be imported simply through:
```python
from AutoDiff import AutoDiff
```

```python
# The following lines are only needed when running from the docs folder:
import os
path = os.getcwd().replace('docs','')
os.chdir(path)

# Normal import:
from AutoDiff import AutoDiff
```

### ii) Forward Mode

#### Univariate

For a simple, univariate example, let's find the value of the derivative of $y=3x^2-4x$ at $x=3$.

We create an instance of the `fAD` object as the basic building block for the equation; in other words, a single value of the independent variable. This object can then be used with binary and unary mathematical operators to construct the full function being evaluated. For each operation, a new function value (`val`) and derivative value (`der`) are calculated, so that once all operations are complete, the resulting object's `der` attribute is the function's derivative at the point specified when the `fAD` object was created.

**N.B.**: It is important to note that AutoDiff assumes that the object being initialised is elementary and as such will have a derivative value of 1. Users can pass different der values as an argument for the function if required.

```python
a = 3.0
x = AutoDiff.fAD(a)
y = 3*x**2 - 4*x

print(y.der)

# alternative way to find values:
print(y.get_val(), y.get_jac())
```

```
[[14.]]
15.0 14.0
```


More complex cases can be handled, for example:

$$y = \frac{e^{2x}|1-\log{x}|}{\sin{x} - \cos{x}}$$

```python
x = AutoDiff.fAD(a)

y = (AutoDiff.exp(2*x)*abs(1-AutoDiff.log(x)))/(AutoDiff.sin(x) - AutoDiff.cos(x))

y.get_jac()
```

```
215.62712855401998
```

#### Multivariate

AutoDiff.fAD is designed to handle multivariate calculus as well. For this, the additional functions **create_f** and **stack_f** must be used.

`create_f` is used to set up a system of fAD objects, returned as a list, for constructing a multivariable function. `stack_f` takes multiple independent fAD objects and combines them into a single new fAD object, so that the values of all objects can be returned at once and the objects can be altered simultaneously.

**AutoDiff.create_f - for creating multiple variables for a single function**

Multiple variables can be passed to create_f **as a list** and it will return an object for each variable with the necessary set of partial derivative values. When finding the derivative of a function with more than one variable, these variables should _always_ be created together using create_f, as this ensures each variable has the correct vector of partial derivative values. Conversely, variables from unrelated functions should not be created together with create_f. Passing both $a$ and $b$ into create_f gives each of them a vector of partial derivatives with respect to both $a$ and $b$; if this is not desired then $a$ and $b$ should be created separately. Note that, just like the single-variable case, create_f will assume that each object being created is elementary and will assign a derivative value of 1 to each unless otherwise instructed by the user.

**AutoDiff.stack_f - for creating a vector of independent functions**

One can pass multiple fAD objects into stack_f **as a list** and it will return a single fAD object with each of the objects' values and derivatives stored as a vector. This allows one to return the derivatives of all objects at once, or to manipulate every function in the same way at once. Calling the `val` attribute of the stacked fAD object will return the value of each function at the specified point. Calling the `der` attribute will return the Jacobian.

**N.B.** Creating a stack_f object for vector value functions allows each function within the vector to be altered uniformly and simultaneously (see below) - but only operations with scalars (i.e. not other fAD objects) are currently supported.

For example, let's deal with the following example of a vector-valued function:

$\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} = \begin{bmatrix} x^2+2y^2 \\ |x-\cos{y}| \\ x-\frac{y}{x-y} \end{bmatrix}$

at the points:

$x = 4$, $y = 6$

Once we've found the values and derivatives for this vector valued function $F(x,y)$, let's find the values and derivatives of another vector valued function, $G$:

$G(x,y) = 2F(x,y)+5$

```python
x, y = AutoDiff.create_f([4,6])

f1 = x**2 + 2*y**2
f2 = abs(x - AutoDiff.cos(y))
f3 = x - y/(x-y)

F = AutoDiff.stack_f([f1,f2,f3])

print('F function values: ', F.get_val())
print('F derivative values:\n', F.get_jac())

G = F*2+5
print("---------------------------")
print('G Function values: ', G.get_val())
print('G Derivative values:\n', G.get_jac())
```

```
F function values:  [88.          3.03982971  7.        ]
F derivative values:
 [[ 8.        24.       ]
 [ 1.        -0.2794155]
 [ 2.5       -1.       ]]
---------------------------
G Function values:  [181.          11.07965943  19.        ]
G Derivative values:
 [[16.       48.      ]
 [ 2.       -0.558831]
 [ 5.       -2.      ]]
```
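
The Jacobian entries above can be cross-checked against hand-derived partial derivatives. The sketch below uses only the standard library and does not call the package; the names `jac_F` and `jac_G` are hypothetical.

```python
import math

x, y = 4.0, 6.0

# The derivative of |u| is sign(u) * u' wherever u != 0.
sign = math.copysign(1.0, x - math.cos(y))

# Rows are f1, f2, f3; columns are the partial derivatives in x and y.
jac_F = [
    [2 * x, 4 * y],                             # f1 = x^2 + 2y^2
    [sign * 1.0, sign * math.sin(y)],           # f2 = |x - cos y|
    [1 + y / (x - y) ** 2, -x / (x - y) ** 2],  # f3 = x - y/(x - y)
]

# G = 2F + 5, so every derivative is doubled and the constant drops out.
jac_G = [[2 * entry for entry in row] for row in jac_F]

for row in jac_F:
    print(row)
```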


### iii) Reverse Mode

AutoDiff also implements reverse mode automatic differentiation through the rAD class. The usage is slightly different. Variables are initialised in a similar way, but multiple variables within one equation do not need to be defined together using create_r (although this function exists should one want to use it). Once the function has been defined, the user must explicitly state which variable is the 'outer' variable of the function by calling its outer() method. Then, to find the gradient of the outer variable with respect to, for example, the variable x, one must call **x**.grad().

This is because, unlike forward mode, which determines values and derivatives in the same pass whilst implicitly traversing the function's computational graph, reverse mode first traverses the computational graph in one direction to determine function values, storing the connections between nodes as it does so, and must then traverse the graph in the reverse direction to find the derivatives. This is done by recursing through the graph from x (the inner object) to y (the outer object) and accumulating the derivative as the recursion unwinds.
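
The idea of storing connections and then recursing from the inner to the outer object can be sketched as follows. The `Node` class is a hypothetical, stripped-down illustration that supports only addition and multiplication; it is not the package's `rAD` implementation, although its method names mirror the ones described here.

```python
class Node:
    """Hypothetical reverse-mode node that records the nodes built from it."""

    def __init__(self, val):
        self.val = float(val)
        self.children = []   # (weight, child) pairs, weight = d(child)/d(self)
        self.der = None      # d(outer)/d(self), computed on demand

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        out = Node(self.val + other.val)
        self.children.append((1.0, out))
        other.children.append((1.0, out))
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        out = Node(self.val * other.val)
        self.children.append((other.val, out))
        other.children.append((self.val, out))
        return out

    __radd__ = __add__
    __rmul__ = __mul__

    def outer(self):
        """Mark this node as the output: d(self)/d(self) = 1."""
        self.der = 1.0

    def grad(self):
        """Recurse towards the outer node, summing weight * child gradient."""
        if self.der is None:
            self.der = sum(w * child.grad() for w, child in self.children)
        return self.der

x = Node(3.0)
y = 3 * x * x + (-4) * x
y.outer()
print(x.grad())   # 14.0
```

Because each node caches `der` after the first call, the stored values have to be cleared before the same variables can be reused with a different outer function.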

Let's again find the value of the derivative of $y=3x^2-4x$ at $x=3$.

```python
a = 3

x = AutoDiff.rAD(a)

y = 3*x**2-4*x
y.outer()

x.grad()

x.get_grad()
```

```
14.0
```

More complex cases can be handled, for example:

$$y = \frac{e^{2x}|1-\log{x}|}{\sin{x} - \cos{x}}$$

```python
x = AutoDiff.rAD(a)

y = AutoDiff.exp(2*x)*abs(1-AutoDiff.log(x))/(AutoDiff.sin(x) - AutoDiff.cos(x))

y.outer()
x.grad()

x.get_grad()
```

```
215.62712855402003
```

#### Multivariate

Multivariate cases can also be handled. Again, let's find the derivative of:

$\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} = \begin{bmatrix} x^2+2y^2 \\ |x-\cos{y}| \\ x-\frac{y}{x-y} \end{bmatrix}$

at the points:

$x = 4$, $y = 6$

This time round, things are a bit more complicated, as gradients must be reset between the functions; the variables can only be defined in the context of one outer function at a time.

```python
x = AutoDiff.rAD(4)
y = AutoDiff.rAD(6)

f1 = x**2 + 2*y**2
f1.outer()
x.grad()
y.grad()
print(x.get_grad(),y.get_grad())
# reset stored derivatives before reusing x and y in a new function
AutoDiff.reset_der((x,y))

f2 = abs(x - AutoDiff.cos(y))
f2.outer()
x.grad()
y.grad()
print(x.get_grad(),y.get_grad())
AutoDiff.reset_der((x,y))

f3 = x - y/(x-y)
f3.outer()
x.grad()
y.grad()
print(x.get_grad(),y.get_grad())
```

```
8.0 24.0
1.0 -0.27941549819892586
2.5 -1.0
```


# IV. Software Organisation

## _a) Directory Structure_

```
cs207-FinalProject/
                   AutoDiff/                   
                            __init__.py         
                            __pycache__/
                            AutoDiff.py
                            test_AutoDiff.py
                   docs/
                        milestone1.ipynb
                        milestone2.ipynb
                        final.ipynb
                   README.md
                   requirement.txt
                   setup.cfg
```                 

## _b) Modules_

All code is contained within the **AutoDiff** package. Within it are two Python modules:

1. AutoDiff.py: This contains all functional code for automatic differentiation. Within this file there are two classes, each with two complementary functions for equation construction, along with numerous mathematical functions:
    * fAD - Class object that is the basic building block for forward-mode differentiation. Stores function and derivative values and has overloaded mathematical methods to allow construction of mathematical functions.
    * create_f - Function that is used to create multiple fAD objects that are within the same function and thus contain partial derivatives with respect to each other.
    * stack_f - Stacks multiple fAD objects, given as a list, into a single fAD object that represents a vector-valued function.
    * rAD - Class object that is the basic building block for reverse-mode differentiation. Stores the function value and a list of children; when the grad() method is called, the derivative is computed recursively from these children in reverse mode. Also contains a method outer(), which is required to define the final parent node of the function's computational graph.
    * create_r - Analogous to its forward-mode counterpart, though not strictly necessary for multivariate functions in the way that create_f is for forward mode.
    * stack_r - Analogous to its forward-mode counterpart; handles multiple functions of rAD objects. Unlike forward mode, the functions (as Python functions or lambda functions) must be passed to stack_r alongside the rAD objects. Returns the Jacobian matrix of derivatives.
    * mathematical functions - Designed to handle forward and reverse mode objects as well as standard numerics: includes sin, cos, tan, arcsin, arccos, arctan, sinh, cosh, tanh, exp, log, sqrt, and logistic.
    * mul_by_row - Allows multiplication of a forward-mode object with 2-dimensional derivatives; used for generalised overloading of the multiplicative magic methods.
    * reset_der - Resets the derivative values of reverse mode objects so that they can be reused in new functions.
2. test_AD.py: This is the test suite for AutoDiff.py. It tests that the classes and functions above behave correctly. Tests are run using pytest, with the assistance of the numpy.testing functions `assert_array_equal` and `assert_array_almost_equal` for dealing with cases where values are subject to a degree of rounding error. The tests are linked with Travis CI, which provides continuous integration testing, and Coveralls, which provides a code coverage service to ensure that all of the code is being tested.



## _c) Implementation_

### Class: fAD

This class encapsulates the fundamental machinery of forward mode automatic differentiation, and is capable of dealing with both single and multi-variable cases.

#### Dependencies

- numpy (imported as np): used for numerous mathematical operations (e.g. trigonometric, logarithmic)
- numbers: used to ensure user passes numerical values.

#### Attributes (Data Structures)
- val: array of floats (of size 1 or more)
    - Numeric values, indicating the value of each entry in the current fAD instance. For cases with only one function, val will have size 1. For vector-valued functions, val can be longer.
    
- der: 2D array of floats
    - Values representing the derivative value(s) in the current fAD instance. The returned 2D array can be thought of as the Jacobian of all functions and variables for the instance. Suppose there are m elementary variables constructing n functions, all stored within the fAD instance; 'der' would then be an n\*m array, with the (i,j) entry representing the derivative of the i-th function with respect to the j-th elementary variable.

#### Methods 
(The following demonstrations are for the case when there is only one value in self.val. When the fAD object is in higher dimension, storing the values in an array allows us to simply apply the computation to each entry.)

0. `__init__`:
    - arguments: 
        - a list/array of fAD instances or numerics
    - sets self.val as a list of the 'val' attributes of the input fAD instances
    - combines the 'der' attributes of the input fAD instances into a 2D array and saves it as self.der


1. `__add__` & `__radd__`: 
    - arguments:
        - self
        - other: a float, int, or fAD
    - returns: 
        - if other is a fAD -> a new fAD instance with new.val = self.val + other.val, new.der = self.der + other.der
        - if other is a numeric value -> a new fAD instance with new.val = self.val + other, new.der = self.der


2. `__sub__`
    - arguments:
		- self
		- other: a float, int, or fAD
	- returns: 
		- if other is a fAD -> a new fAD instance with new.val = self.val - other.val, new.der = self.der - other.der
		- if other is a numeric value -> a new fAD instance with new.val = self.val - other, new.der = self.der
        
        
3. `__rsub__`
    - arguments:
		- self
		- other: a float, int, or fAD
	- returns: 
		- if other is a fAD -> a new fAD instance with new.val = other.val - self.val, new.der = other.der - self.der
		- if other is a numeric value -> a new fAD instance with new.val = other - self.val, new.der = -self.der


4. `__mul__` & `__rmul__`
	- arguments:
		- self
		- other: a float, int, or fAD
	- returns: 
		- if other is a fAD -> a new fAD instance with new.val = self.val \* other.val, new.der = self.val \* other.der + self.der \* other.val
		- if other is a numeric value -> a new fAD instance with new.val = self.val \* other, new.der = self.der \* other


5. `__truediv__`
	- arguments:
		- self
		- other: a float, int, or fAD
	- returns: 
		- if other is a fAD -> a new fAD instance with new.val = self.val / other.val, new.der = self.der\/other.val-self.val\*other.der\/(other.val\*\*2)
		- if other is a numeric value -> a new fAD instance with new.val = self.val / other, new.der = self.der / other
	- raises:
		- ZeroDivisionError when other.val = 0 or other = 0


6. `__rtruediv__`
	- arguments:
		- self
		- other: a float, int, or fAD
	- returns: 
		- if other is a fAD -> a new fAD instance with new.val = other.val / self.val, new.der = other.der / self.val - other.val\*self.der / (self.val\*\*2)
		- if other is a numeric value -> a new fAD instance with new.val = other / self.val, new.der = -other \* self.der / (self.val\*\*2)
	- raises:
		- ZeroDivisionError when self.val = 0


7. `__pow__`
	- arguments:
		- self
		- exp: a float, int, or fAD
	- returns:
		- if exp is a fAD -> a new fAD instance with new.val = self.val \*\* exp.val, new.der = (self.val\*\*exp.val) * (self.der\*exp.val /self.val + exp.der\*np.log(self.val))
		- if exp is a numeric value -> a new fAD instance with new.val = self.val \*\* exp, new.der = exp\*(self.val\*\*(exp-1))\*self.der


8. `__rpow__`
	- arguments:
		- self
		- base: a float, int, or fAD
	- returns:
		- if base is a fAD -> a new fAD instance with new.val = base.val\*\*self.val, new.der = (base.val\*\*self.val) * (base.der\*self.val /base.val + self.der\*np.log(base.val))
		- if base is a numeric value -> a new fAD instance with new.val = base\*\*self.val, new.der = np.log(base)\*(base\*\*self.val)\*self.der


9. `__neg__`
	- arguments:
		- self
	- returns:
		- a new fAD instance with new.val = -self.val, new.der = -self.der
        

10. `__abs__`
	- arguments:
		- self
	- returns:
		- a new fAD instance with new.val = abs(self.val), new.der = (self.val / abs(self.val)) \* self.der



11. `__eq__`
    - arguments:
        - self
        - other: a fAD instance
    - returns:
        - 'True' if self.val==other.val and self.der==other.der, 'False' otherwise
        
12. `__ne__`
    - arguments:
        - self
        - other: a fAD instance
    - returns:
        - 'False' if self.val==other.val and self.der==other.der, 'True' otherwise
        
13. `__str__`
    - arguments:
        - self
    - returns:
        - a string describing the value and derivatives of the current instance
        
        
14. `__len__`
    - arguments:
        - self
    - returns:
        - len(self.val); number of function values stored within the fAD object.
        
        
15. `__repr__`
    - arguments:
        - self
    - returns:
        - string describing fAD(self.val,self.der)
        
        
16. `get_val()`
    - arguments:
        - self
    - returns:
        - self.val formatted correctly


17. `get_jac()`
    - arguments:
        - self
    - returns:
        - self.der formatted correctly
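
The quotient and power rules listed above can be sanity-checked on scalars. The sketch below restates the `__truediv__` and `__pow__` formulas as plain functions; `div_rule` and `pow_rule` are hypothetical names and are not part of the package.

```python
import math

def div_rule(u, du, v, dv):
    """Value and derivative of u / v, following the __truediv__ formula."""
    return u / v, du / v - u * dv / v ** 2

def pow_rule(u, du, v, dv):
    """Value and derivative of u ** v for u > 0, following the __pow__ formula."""
    val = u ** v
    return val, val * (du * v / u + dv * math.log(u))

# y = x ** x at x = 2: base and exponent both depend on x, so du = dv = 1.
val, der = pow_rule(2.0, 1.0, 2.0, 1.0)
print(val, der)

# y = x / x at x = 2: the value is 1 and the derivative vanishes.
q_val, q_der = div_rule(2.0, 1.0, 2.0, 1.0)
print(q_val, q_der)
```

The power-rule result agrees with the closed form $x^x(1 + \ln x)$; note that the formula assumes a positive base because of the logarithm.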


### Class: rAD

This class encapsulates the fundamental machinery of reverse mode automatic differentiation, and is capable of dealing with both single and multi-variable cases.

#### Dependencies

- numpy (imported as np): used for numerous mathematical operations (e.g. trigonometric, logarithmic)
- numbers: used to ensure user passes numerical values.

#### Attributes (Data Structures)
- val: array of floats (of size 1 or more)
    - Numeric values, indicating the value of each entry in the current rAD instance. For cases with only one function, val will have size 1. For vector-valued functions, val can be longer.
    
#### Methods 
(The following demonstrations are for the case when there is only one value in self.val. When the rAD object is in higher dimension, storing the values in an array allows us to simply apply the computation to each entry.)

0. `__init__`:
    - arguments: 
        - a list/array of AD instances
    - sets self.val as a list of 'val' attributes of the input AD instances
    - creates empty list for children and sets derivative value to None


1. `grad`:
    - arguments:
        - self
    - returns:
        - gradient of the outer object with respect to this object; calling this before accessing variable.der / variable.get_grad() updates the stored derivative from None to the gradient of the outer variable with respect to this object.

2. `__add__` & `__radd__`: 
    - arguments:
        - self
        - other: a float, int, or rAD object
    - returns: 
        - if other is a rAD -> a new rAD instance with new.val = self.val + other.val
        - if other is a numeric value -> a new rAD instance with new.val = self.val + other
        - appends to the children of self and other (if other is rAD) a tuple of (weight, new rAD object), where the weight is the local derivative of the sum, 1.


3. `__sub__`
    - arguments:
		- self
		- other: a float, int, or rAD object
	- returns: 
		- if other is a rAD -> a new rAD instance with new.val = self.val - other.val
		- if other is a numeric value -> a new rAD instance with new.val = self.val - other
		- appends to the children of self and other (if other is rAD) a tuple of (weight, new rAD object), with weights 1 and -1 respectively.
        
        
4. `__rsub__`
    - arguments:
		- self
		- other: a float, int, or rAD object
	- returns: 
		- if other is a rAD -> a new rAD instance with new.val = other.val - self.val
		- if other is a numeric value -> a new rAD instance with new.val = other - self.val
		- appends to the children of self and other (if other is rAD) a tuple of (weight, new rAD object), with weights -1 and 1 respectively.



5. `__mul__` & `__rmul__`
	- arguments:
		- self
		- other: a float, int, or rAD
	- returns: 
		- if other is a rAD -> a new rAD instance with new.val = self.val \* other.val; appends (other.val, new value) to self's children and (self.val, new value) to other's children
		- if other is a numeric value -> a new rAD instance with new.val = self.val \* other; appends (other, new value) to self's children


6. `__truediv__`
	- arguments:
		- self
		- other: a float, int, or rAD
	- returns: 
		- if other is a rAD -> a new rAD instance with new.val = self.val / other.val; appends (1/other.val, new value) to self's children and (-self.val / other.val\*\*2, new value) to other's children
		- if other is a numeric value -> a new rAD instance with new.val = self.val / other; appends (1/other, new value) to self's children
	- raises:
		- ZeroDivisionError when other.val = 0 or other = 0


7. `__rtruediv__`
	- arguments:
		- self
		- other: a float, int, or rAD
	- returns: 
		- if other is a rAD -> a new rAD instance with new.val = other.val / self.val; appends (-other.val / self.val\*\*2, new value) to self's children and (1/self.val, new value) to other's children
		- if other is a numeric value -> a new rAD instance with new.val = other / self.val; appends (-other/self.val\*\*2, new value) to self's children
	- raises:
		- ZeroDivisionError when self.val = 0


8. `__pow__`
	- arguments:
		- self
		- exp: a float, int, or rAD
	- returns:
		- if exp is a rAD -> a new rAD instance with new.val = self.val \*\* exp.val; appends (self.val\*\*(exp.val-1)\*exp.val, new value) to self's children and (self.val\*\*exp.val\*np.log(self.val), new value) to exp's children
		- if exp is a numeric value -> a new rAD instance with new.val = self.val \*\* exp; appends (self.val\*\*(exp-1)\*exp, new value) to self's children


9. `__rpow__`
	- arguments:
		- self
		- base: a float, int, or rAD
	- returns:
		- if base is a rAD -> a new rAD instance with new.val = base.val\*\*self.val; appends (base.val\*\*self.val\*np.log(base.val), new value) to self's children and (base.val\*\*(self.val-1)\*self.val, new value) to base's children
		- if base is a numeric value -> a new rAD instance with new.val = base\*\*self.val; appends (base\*\*self.val\*np.log(base), new value) to self's children


10. `__neg__`
	- arguments:
		- self
	- returns:
		- a new rAD instance with new.val = -self.val, and (-1, new value) is appended to self's children

11. `__abs__`
	- arguments:
		- self
	- returns:
		- a new rAD instance with new.val = abs(self.val), and (self.val / abs(self.val), new value) is appended to self's children



12. `__eq__`
    - arguments:
        - self
        - other: a rAD instance
    - returns:
        - 'True' if self.val==other.val and self.der==other.der, 'False' otherwise
        
13. `__ne__`
    - arguments:
        - self
        - other: a rAD instance
    - returns:
        - 'False' if self.val==other.val and self.der==other.der, 'True' otherwise
        
14. `__str__`
    - arguments:
        - self
    - returns:
        - a string describing the value and derivatives of the current instance
        
15. `get_val()`
    - arguments:
        - self
    - returns:
        - self.val formatted correctly


16. `get_grad()`
    - arguments:
        - self
    - returns:
        - gradient of the outer object with respect to self
        
17. `outer`
    - arguments:
        - self
    - returns:
        - nothing
        - sets this object to be the outer variable for a function by setting self.der = 1 (derivative of self with respect to self = 1).
       

### Functions: create, stack, sin, cos, log, exp.

#### Dependencies

- numpy (imported as np): used for numerous mathematical operations (e.g. trigonometric)
- numbers: used to ensure user passes numerical values.
- math: used for logarithms with variable bases.

The following functions exist outside the fAD and rAD classes:

1. `create_f`
    - allows the user to quickly create multiple fAD instances
    - arguments:
        - val: a list of values 
        - der: optional, assigned derivative values
    - returns:
        - a list of fAD objects

2. `create_r`
    - allows the user to quickly create multiple rAD instances
    - arguments:
        - val: a list of values 
    - returns:
        - a list of rAD objects

3. `stack_f`
    - allows users to stack multiple fAD instances into one high-dimensional fAD instance
    - arguments:
        - vals: a list of multiple fAD instances
    - returns:
        - one fAD object, with *val* = an array of the *val*s of the fAD instances in the argument, and *der* = an array of the *der*s of the fAD instances in the argument

4. `stack_r`
    - allows users to stack multiple functions of rAD instances
    - arguments:
        - vals: a list of multiple rAD instances
        - functions: list of multiple functions to be combined
    - returns:
        - Jacobian of the functions

5. `sin`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - sin(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

6. `cos`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - cos(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

7. `tan`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - tan(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

8. `arcsin`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - arcsin(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

9. `arccos`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - arccos(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

10. `arctan`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - arctan(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

11. `sinh`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - sinh(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

12. `cosh`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - cosh(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

13. `tanh`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - tanh(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

14. `exp`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - exp(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

15. `log`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - log(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

16. `sqrt`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - sqrt(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

17. `logistic`
    - arguments:
        - x (fAD, rAD or numeric)
    - returns:
        - logistic(x) as the appropriate object (fAD, rAD or numeric), with the correct derivative/children.

18. `mul_by_row`
    - arguments:
        - val: array of values
        - der: array of derivatives
    - returns:
        - the row-wise product, used for forward-mode objects to facilitate calculations with 2-dimensional derivatives.

19. `reset_der`
    - arguments:
        - rADs: single instance or array of rAD objects
    - returns:
        - nothing
        - resets children and derivative values for all rAD objects given.

## _d) Future_

The main improvement that could be made is to the user-friendliness and intuitiveness of the reverse mode. Because the values and computational graph are created in one direction and the derivatives are determined by traversing in the other, the usage is slightly counter-intuitive: the derivative of y with respect to x is an attribute of x, not y. Ideally, the derivative of y with respect to x could be obtained by calling something like y.grad(x), but to do this the objects x and y must be connected in some way, and it is not clear how this would be achieved without decreasing user-friendliness further.
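
One possible direction, offered only as a rough sketch, is for each node to remember its inputs rather than its outputs, so that the output object can walk back to any variable. The `Var` class below is hypothetical, supports only addition and multiplication, and uses a simple uncached recursion; a practical version would visit each node once.

```python
class Var:
    """Hypothetical node that stores (local derivative, input) pairs."""

    def __init__(self, val, parents=()):
        self.val = float(val)
        self.parents = parents

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.val + other.val, ((1.0, self), (1.0, other)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.val * other.val,
                   ((other.val, self), (self.val, other)))

    __radd__ = __add__
    __rmul__ = __mul__

    def grad(self, wrt):
        """Derivative of this node with respect to the node `wrt`."""
        if self is wrt:
            return 1.0
        # Chain rule: sum the contributions along every path back to `wrt`.
        return sum(w * parent.grad(wrt) for w, parent in self.parents)

x = Var(3.0)
y = 3 * x * x + (-4) * x
z = x * x
print(y.grad(x), z.grad(x))   # 14.0 6.0
```

Nothing is stored on `x` in this sketch, so the same variable can appear in several functions without a reset step.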