# Minimum Package Requirement

Our package is hosted on PyPI. Users can install our Automatic Differentiation package with the following command:

`pip install autodiffing`

To install the required dependencies, users need to run the following command:

`pip install -r requirements.txt`

Within our requirements.txt, we have the following packages:

`bleach==3.1.0`\
`certifi==2019.9.11`\
`cycler==0.10.0`\
`docutils==0.15.2`\
`kiwisolver==1.1.0`\
`matplotlib==3.1.1`\
`nltk==3.4.5`\
`numpy==1.17.4`\
`Pygments==2.4.2`\
`pyparsing==2.4.5`\
`python-dateutil==2.8.1`\
`readme-renderer==24.0`\
`requests-toolbelt==0.9.1`\
`scipy==1.3.2`\
`six==1.13.0`\
`twine==2.0.0`\
`webencodings==0.5.1`

Most of the packages above come with the installation of `python` version 3.7. 

Our team has selected only three additional packages for our users to install, namely `matplotlib`, `scipy` and `numpy`.

`numpy` is essential for our Automatic Differentiation package as we require it for the calculation of our elementary functions, and for dealing with arrays and matrices when there are vector functions and vector inputs.
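
To make the role of `numpy` concrete, here is a minimal sketch of how an elementary function and its derivative can be evaluated on an array of inputs at once. The helper `sin_with_derivative` is hypothetical and is not part of the package's API; it only illustrates the chain rule applied element-wise.

```python
import numpy as np


def sin_with_derivative(x, dx):
    """Hypothetical helper: value and derivative of sin at x.

    `dx` is the derivative of the input itself, so the chain rule
    gives d/dt sin(x(t)) = cos(x) * dx, applied element-wise.
    """
    x = np.asarray(x, dtype=float)
    dx = np.asarray(dx, dtype=float)
    return np.sin(x), np.cos(x) * dx


# Evaluate at two points in a single call
values, derivatives = sin_with_derivative([0.0, np.pi / 2], [1.0, 1.0])
```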

`scipy` is a good package to have for its optimization and linear algebra abilities. In particular, under our future features for the Automatic Differentiation package, we hope to be able to deal with optimization problems. 

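`matplotlib` is needed for any potential visualization of our outputs. `cycler`, `kiwisolver`, `pyparsing` and `python-dateutil` are additional packages that `matplotlib` requires, while `bleach`, `docutils` and `Pygments` are pulled in by `readme-renderer`, which `twine` uses when publishing to PyPI.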



# Future Features

Future features for our Automatic Differentiation package include handling vector inputs and vector functions, implementing reverse mode for automatic differentiation, and visualizing outputs. The software changes required for each feature and its primary challenges are elaborated below.


### Vector Inputs 

To handle vector inputs and vector functions, we build on our current package, which handles only scalar functions of scalar inputs. Specifically, we first create scalar functions that accept vector inputs before considering vector functions. A new class `scalar_func` inherits from the `DualNumber` class. In this class, we define methods that mirror those of `DualNumber` and determine the values of the scalar function and its derivatives by looping over the array of vector inputs.

In [None]:
import numpy as np

class scalar_func(DualNumber):
    # To deal with scalar functions with vector inputs
    def __init__(self, vector_inputs, seed_vector):
        # check dimension
        assert len(seed_vector) == len(vector_inputs)
        self._inputs = vector_inputs
        self._seed = np.asarray(seed_vector)  # kept separate so der() can be called repeatedly
        self._val = None
        self._der = None

    def val(self):
        # collect the value of each input into one array
        self._val = np.array([i.val() for i in self._inputs])
        return self._val

    def Jacobian(self):
        # Calculate Jacobian vector given the vector_inputs
        return np.array([i.der() for i in self._inputs])

    def der(self):
        # determine result of derivatives
        self._der = np.dot(self.Jacobian(), self._seed)
        return self._der
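
As a rough illustration of the idea described in this section, the sketch below evaluates a scalar function of two inputs and reads off one directional derivative per seed vector. The `Dual` class is a hypothetical stand-in for the package's `DualNumber` (whose exact interface is not shown here) and supports only addition and multiplication; `directional_derivative` is likewise an assumed helper name.

```python
class Dual:
    """Hypothetical minimal dual number: a value plus a derivative part."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def directional_derivative(f, point, seed):
    """Evaluate f at `point` and its derivative along `seed` in one forward pass."""
    assert len(point) == len(seed)
    inputs = [Dual(x, s) for x, s in zip(point, seed)]
    out = f(*inputs)
    return out.val, out.der


def f(x, y):
    return x * y + 3 * x  # f(x, y) = xy + 3x


# Seeding with unit vectors recovers the individual partial derivatives
value, df_dx = directional_derivative(f, [2.0, 5.0], [1.0, 0.0])
_, df_dy = directional_derivative(f, [2.0, 5.0], [0.0, 1.0])
```

Each call costs one forward pass, so recovering the full gradient this way takes as many passes as there are inputs.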

### Vector Functions

Vector functions build on the work done for scalar functions of vector inputs. A new class `vector_func` inherits from the `scalar_func` class. In this class, we define methods that mirror those of `scalar_func`; the general approach is to loop over the list of scalar functions and apply the equivalent `scalar_func` method to each one to determine the outputs of the vector function. For instance, the `Jacobian` method builds the Jacobian matrix by looping over the list of scalar functions and calling the `Jacobian` method on each of them. The derivatives are then determined using the dot product between the Jacobian matrix and the seed vector.

In [None]:
import numpy as np

class vector_func(scalar_func):
    # To deal with vector functions with scalar or vector inputs, inherits from the scalar_func class

    def __init__(self, scalar_functions, seed_vector):
        self._functions = scalar_functions
        self._seed = np.asarray(seed_vector)
        self._val = None
        self._der = None

    def val(self):
        # stack the value of each scalar function into one array
        self._val = np.array([func.val() for func in self._functions])
        return self._val

    def Jacobian(self):
        # Calculate Jacobian matrix given the scalar_functions (one row per function)
        return np.array([func.Jacobian() for func in self._functions])

    def der(self):
        # determine result of derivatives
        jacobian_matrix = self.Jacobian()
        # check dimension: the seed needs one entry per input (Jacobian column)
        assert jacobian_matrix.shape[-1] == len(self._seed)
        self._der = np.dot(jacobian_matrix, self._seed)
        return self._der
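
The Jacobian-times-seed computation can also be sketched without the class hierarchy. The example below is an assumption-laden illustration, not the package's implementation: `GradDual` is a hypothetical dual number whose derivative part is a whole gradient, so a single forward pass per function yields one Jacobian row, and `jacobian` and `jvp` are assumed helper names.

```python
class GradDual:
    """Hypothetical dual number whose derivative part is a full gradient (list)."""

    def __init__(self, val, grad):
        self.val = val
        self.grad = list(grad)

    def __add__(self, other):
        return GradDual(self.val + other.val,
                        [a + b for a, b in zip(self.grad, other.grad)])

    def __mul__(self, other):
        # product rule applied to every gradient component
        return GradDual(self.val * other.val,
                        [a * other.val + self.val * b
                         for a, b in zip(self.grad, other.grad)])


def jacobian(functions, point):
    """Return (values, Jacobian) of a list of scalar functions at `point`."""
    n = len(point)
    # input i is seeded with the i-th unit vector
    inputs = [GradDual(x, [1.0 if i == j else 0.0 for j in range(n)])
              for i, x in enumerate(point)]
    outputs = [f(*inputs) for f in functions]
    return [o.val for o in outputs], [o.grad for o in outputs]


def jvp(jac, seed):
    """Jacobian-vector product: one dot product per Jacobian row."""
    return [sum(j * s for j, s in zip(row, seed)) for row in jac]


# Three functions of two inputs: the Jacobian is 3 x 2
functions = [lambda x, y: x * y, lambda x, y: x + y, lambda x, y: x * x]
values, jac = jacobian(functions, [2.0, 3.0])
directional = jvp(jac, [1.0, -1.0])
```

Note that the seed vector has one entry per input (two here), not one per function, which is why the Jacobian does not need to be square.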

The primary challenge in implementing both vector inputs and vector functions is designing the code so that users can interact with our package in the most straightforward and easily understood manner. For example, the team is considering collapsing `scalar_func` and `vector_func` into a single `func` class so that users only have to call one class when defining functions. In addition, we have to handle the case where users create a function using `DualNumber` directly (i.e. without using our function classes). In that case, we could make `DualNumber` a private class and let users create variables through `func` directly (i.e. treat a variable as a scalar function of a scalar input).

### Reverse Mode

The reverse mode is fundamentally different from the forward mode in its approach to automatic differentiation. In particular, the reverse mode consists of a forward pass and a reverse pass, with no chain rule applied in the forward pass (only partial derivatives are stored). The result of the reverse mode is determined only after the reverse pass is done, and the derivative of each variable or parent node depends on the derivatives of its child nodes. This has three important implications for the design of our package for reverse mode.

Firstly, the reverse mode cannot be interpreted in the context of dual numbers like the forward mode, so we need a different class to implement it. Since the result cannot be calculated until the reverse pass is done, and each variable in the reverse mode depends on the values of its children, we need to instantiate a reverse mode object/variable with an empty list that temporarily holds the partial derivative values of its children during the forward pass. Note that we need a list here because a parent node can have more than one child node.

In [None]:
class ReverseVar():
    def __init__(self, value):
        self._value = value # value of variable at which the derivative is determined
        self._children = [] # empty list to contain partial derivatives of children during forward pass
        self._der = None # value not determined until reverse pass is done
        
    def val(self):
        return self._value
        

Secondly, as the forward pass only computes partial derivatives and does not apply the chain rule, we need to redefine the operator overloading for our reverse mode objects/variables. As an example, the overloading of the multiplication operator is shown below. Note that overloading the operators is in essence equivalent to carrying out the forward pass, and the partial derivatives are stored as temporary items within `self._children` for evaluation later during the reverse pass.

In [None]:
class ReverseVar():
    def __init__(self, value):
        self._value = value # value of variable at which the derivative is determined
        self._children = [] # empty list to contain partial derivatives of children during forward pass
        self._der = None # value not determined until reverse pass is done
        
    def val(self):
        return self._value    
        
    def __mul__(self, other):
        z = ReverseVar(self._value * other._value)
        self._children.append((other._value, z)) 
        other._children.append((self._value, z)) 
        return z
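
For reference, the weights stored by `__mul__` are the local partial derivatives of the product, and the reverse pass combines them through the chain rule. The notation below ($\bar{x}$ for the derivative of the final output $f$ with respect to $x$, and $z_j$ for the children of $x$) is introduced here for illustration only and does not appear elsewhere in the package.

$$
z = x \cdot y \quad\Longrightarrow\quad \frac{\partial z}{\partial x} = y, \qquad \frac{\partial z}{\partial y} = x
$$

$$
\bar{x} = \frac{\partial f}{\partial x} = \sum_{j} \frac{\partial z_j}{\partial x}\,\bar{z}_j, \qquad \bar{f} = 1
$$

The first line explains why `self` stores `other._value` (and vice versa) as its weight; the second is the sum that the reverse pass evaluates over all children, starting from the seed $\bar{f} = 1$ at the output.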

Lastly, we define a method `der` to carry out the reverse pass recursively in order to calculate the value of the derivatives.

In [None]:
class ReverseVar():
    def __init__(self, value):
        self._value = value # value of variable at which the derivative is determined
        self._children = [] # empty list to contain partial derivatives of children during forward pass
        self._der = None # value not determined until reverse pass is done
    
    def val(self):
        return self._value
        
    def der(self):
        # recurse only if the derivative is not yet calculated
        if self._der is None:
            # calculate derivative using chain rule: sum over all children
            # (the output variable must first be seeded with _der = 1.0)
            self._der = sum(weight * var.der()
                            for weight, var in self._children)
        return self._der

    def __mul__(self, other):
        z = ReverseVar(self._value * other._value)
        self._children.append((other._value, z)) 
        other._children.append((self._value, z)) 
        return z
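
To show the forward and reverse passes working together, here is a self-contained sketch of the class with a small usage example. The `__add__` overload is an assumption added for illustration (this section only defines `__mul__`), and seeding the output by setting `_der` directly is one possible convention rather than a settled part of the design.

```python
class ReverseVar:
    def __init__(self, value):
        self._value = value
        self._children = []  # (local partial derivative, child node) pairs
        self._der = None     # filled in during the reverse pass

    def val(self):
        return self._value

    def der(self):
        # recurse only if the derivative is not yet calculated
        if self._der is None:
            self._der = sum(weight * child.der()
                            for weight, child in self._children)
        return self._der

    def __add__(self, other):
        # hypothetical overload: d(x + y)/dx = d(x + y)/dy = 1
        z = ReverseVar(self._value + other._value)
        self._children.append((1.0, z))
        other._children.append((1.0, z))
        return z

    def __mul__(self, other):
        z = ReverseVar(self._value * other._value)
        self._children.append((other._value, z))
        other._children.append((self._value, z))
        return z


x = ReverseVar(2.0)
y = ReverseVar(3.0)
z = x * y + x    # forward pass: builds the graph and stores partials
z._der = 1.0     # seed the output: dz/dz = 1
dz_dx = x.der()  # reverse pass: y + 1
dz_dy = y.der()  # reverse pass: x
```

A single reverse pass from `z` gives the derivatives with respect to every input, which is the main advantage over forward mode when a function has many inputs and few outputs.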

The primary challenge for reverse mode is to keep its classes, methods and attributes separate from those of forward mode, even though we want the two modes to share certain similarities. For example, we want users to call the same methods in either mode and obtain the same results. In addition, after writing the reverse mode for scalar functions of scalar inputs, we ought to extend it to handle vector inputs and vector functions, just as for the forward mode.

### Visualization

When users interact with our package, it would be useful for them to have a way of visualizing the ongoing calculations or final results. We define a new method called `post_process`, which will be available in those classes of our package where visual outputs of key calculations or important results are possible. For instance, `post_process` within the reverse mode class might produce tables of the forward pass and reverse pass. The method `post_process` primarily uses the `matplotlib` library and takes `directory_out` as an argument, which lets users indicate the directory in which they wish to save the visualization outputs.

In [None]:
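import os
import matplotlib.pyplot as plt

def post_process(directory_out):
    # Visualization of relevant calculations and results goes here
    fig, ax = plt.subplots()  # plt.subplots() returns both the figure and its axes
    fig.savefig(os.path.join(directory_out, 'figure_title.pdf'))
    plt.show()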