In [1]:
from Dotua.autodiff import AutoDiff as ad
from Dotua.rautodiff import rAutoDiff

# Reverse Mode Implementation: High Level

- Efficiency
- Ease of Use
- Expressibility

# Efficiency

- Gradient computational overhead scales linearly in the number of user-defined functions
- This is favorable for applications with many inputs and few functions, since forward-mode overhead scales linearly in the number of input variables
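
To make the scaling argument concrete, here is a minimal sketch of forward mode built on dual numbers. The `Dual` class and the `forward_gradient` helper are hypothetical illustrations and are not part of Dotua; the only point is that forward mode needs one evaluation of the function per input variable.

```python
class Dual:
    """Hypothetical dual number: a value plus one directional derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Product rule
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)


def forward_gradient(f, point):
    """Evaluate f once per input, seeding one input's derivative at a time."""
    grad = []
    passes = 0
    for i in range(len(point)):
        args = [Dual(v, 1.0 if j == i else 0.0) for j, v in enumerate(point)]
        grad.append(f(*args).der)
        passes += 1
    return grad, passes


def f(x, y, z):
    return x * y + y * z


grad, passes = forward_gradient(f, [2.0, 3.0, 4.0])
print(grad, passes)
```

Three inputs cost three passes here. A reverse-mode tool instead needs one backward sweep per function, regardless of how many inputs that function has.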

# Ease of Use

- Comparable to NumPy usage syntax
- Limited initialization

In [18]:
import numpy as np

np.sin(5)

from Dotua.roperator import rOperator as rop
from Dotua.rautodiff import rAutoDiff

rop.sin(5)

rad = rAutoDiff()
x = rad.create_rscalar(5)
rop.sin(x)


<Dotua.nodes.rscalar.rScalar at 0x104b50780>

# Expressibility

### Given that **Dotua** aims to be a partial NumPy replacement, it is essential that users can be as mathematically expressive with **Dotua** as they can be with NumPy.

__Operator Support__:
- Basic functions: addition, subtraction, multiplication, division, exponentiation, and negation
- Trigonometric functions: sine, cosine, and tangent
- Inverse trigonometric functions: arcsine, arccosine, and arctangent
- Hyperbolic functions and their inverses: sine, cosine, and tangent
- Logarithms of any base
- Natural exponentials

__Functional Support__:
- Roots of arbitrary degree
- Logistic functions

In [22]:
from Dotua.roperator import rOperator as rop

# Complex function demo
rad_initializer = rAutoDiff()

vals = [0.5, 11, 13, 17, 23, 0.4]
a, b, c, d, f, h = rad_initializer.create_rscalar(vals)

# func = a + b

func = rop.arccosh(b) / (rop.log(c) ** (rop.arcsin(b ** -2))) + \
       rop.cos((rop.tan(h + a) ** rop.log(d * h, base=2)) / (rop.sin(f) * rop.cos(f)))

display(func.eval())
display(rad_initializer.partial(func, a))
display(rad_initializer.partial(func, b))
display(rad_initializer.partial(func, c))
display(rad_initializer.partial(func, d))
display(rad_initializer.partial(func, f))
display(rad_initializer.partial(func, h))


2.578175798120918

1.639349254446802

-3.0650168320059885

-0.0007596792228180783

0.07206032210419326

3.519659904572385

4.701912943875016

# Reverse Mode Implementation: Low Level

- Constructing the computational graph
- Calculating gradients
- Handling multiple functions

## Constructing the Computational Graph

- The *parents* instance variable of each **rScalar** object stores that node's edges in the computational graph, along with the local derivatives.
- Operator overloading and Dotua's elementary functions populate *parents* as an expression is evaluated.

## Constructing the Computational Graph: rScalar

In [None]:
def __init__(self, val):
    self.val = val
    self.parents = []
    self.grad_val = None

- The *parents* member variable is a list of tuples of the form *(parent, value)*
- *parent* is an **rScalar** created by operator overloading or elementary functions in **rOperator**
- *value* is d(parent) / d(self), the derivative of the parent with respect to this node
- This storage method delays derivative computation until the user specifies the function to be differentiated
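
The storage scheme described above can be sketched by building the graph for f = x * y by hand. The `Node` class below is a hypothetical stand-in that keeps only the three fields shown in `__init__`; it is not Dotua's **rScalar**.

```python
class Node:
    """Hypothetical stand-in for rScalar, reduced to the fields shown above."""

    def __init__(self, val):
        self.val = val
        self.parents = []   # list of (parent, d(parent)/d(self)) tuples
        self.grad_val = None


x = Node(3.0)
y = Node(4.0)

# Build f = x * y by hand, recording the local derivatives on the inputs.
f = Node(x.val * y.val)
x.parents.append((f, y.val))   # df/dx = y
y.parents.append((f, x.val))   # df/dy = x

for name, node in (("x", x), ("y", y)):
    for parent, local in node.parents:
        print(f"{name}: parent value {parent.val}, local derivative {local}")
```

No gradient has been computed at this point. Each input only knows its local derivative with respect to the node directly above it, and `grad_val` stays `None` until a function is chosen for differentiation.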

## Constructing the Computational Graph: rScalar

In [17]:
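def __mul__(self, other):
    # The product becomes a new node that is a parent of both operands
    new_parent = rScalar(self.val)
    try:
        # other is an rScalar: d(xy)/dx = y and d(xy)/dy = x
        new_parent.val *= other.val
        self.parents.append((new_parent, other.val))
        other.parents.append((new_parent, self.val))
    except AttributeError:
        # other is a plain number: only self needs an edge to the product
        new_parent.val *= other
        self.parents.append((new_parent, other))
    return new_parent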

- **rScalar** overloaded operators create intermediary nodes in the computational graph
- **rScalar** overloaded operators assign new nodes as the parents of the **rScalar** variables that define them
- **rScalar** overloaded operators can also handle non-rScalar operands (Python numeric types)

## Constructing the Computational Graph: rOperator

In [None]:
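@staticmethod
def sin(x):
    try:
        # x is an rScalar: create the result node and record d(sin x)/dx = cos x
        new_parent = rScalar(np.sin(x.val))
        x.parents.append((new_parent, np.cos(x.val)))
        return new_parent

    except AttributeError:
        # x is a plain number: defer to NumPy
        return np.sin(x)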

- **rOperator** functions also create intermediary nodes in the computational graph
- **rOperator** functions also assign new nodes as the parents of the **rScalar** variables that define them
- Dotua's **rOperator** functions can also handle non-rScalar objects (Python numeric types) 
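
Other elementary functions can be written in the same style. As an illustration, a cosine following this pattern might look like the sketch below. It is not copied from Dotua's source: the stripped-down `rScalar` is a stand-in so the example runs on its own, and the standard `math` module replaces NumPy to avoid a dependency.

```python
import math


class rScalar:
    """Stripped-down stand-in for Dotua's rScalar (illustration only)."""

    def __init__(self, val):
        self.val = val
        self.parents = []
        self.grad_val = None


def cos(x):
    try:
        new_parent = rScalar(math.cos(x.val))
        # d(cos x)/dx = -sin x
        x.parents.append((new_parent, -math.sin(x.val)))
        return new_parent
    except AttributeError:
        # Plain numbers have no .val, so fall back to the numeric function
        return math.cos(x)


x = rScalar(0.0)
node = cos(x)      # rScalar input: returns a new graph node
plain = cos(0.0)   # numeric input: returns a plain float
```

The `try`/`except AttributeError` structure is what lets a single function accept both graph nodes and ordinary numbers.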

## Calculating Gradients

- The **rAutoDiff** class provides a user interface for calculating derivatives of functions of **rScalar** and **rVector** variables
- The **rScalar** and **rVector** gradient methods use the computational graph representations stored in **rScalar** and **rVector** objects to determine derivatives

## Calculating Gradients: rScalar

In [None]:
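def gradient(self):
    # Compute once, then reuse the cached value
    if self.grad_val is None:
        self.grad_val = 0
        # Chain rule: sum over parents of d(func)/d(parent) * d(parent)/d(self)
        for parent, val in self.parents:
            self.grad_val += parent.gradient() * val
    return self.grad_val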

- **rScalar** variables recursively use the computational graph defined in *self.parents*
- The derivative computation bubbles up to the level of user defined functions 
- Derivatives are propagated back down to the user defined input variables
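
The recursion can be traced end to end on a small function. The sketch below uses a minimal stand-in for **rScalar** (multiplication of two nodes only) and a free-standing `sin`, both simplified from the code on these slides, and differentiates f(x, y) = sin(x * y). The output node is seeded by hand with `grad_val = 1`, which plays the role of df/df.

```python
import math


class rScalar:
    """Minimal stand-in for Dotua's rScalar (illustration only)."""

    def __init__(self, val):
        self.val = val
        self.parents = []
        self.grad_val = None

    def __mul__(self, other):
        # Assumes both operands are rScalar instances, for brevity.
        new_parent = rScalar(self.val * other.val)
        self.parents.append((new_parent, other.val))
        other.parents.append((new_parent, self.val))
        return new_parent

    def gradient(self):
        if self.grad_val is None:
            self.grad_val = 0
            for parent, val in self.parents:
                self.grad_val += parent.gradient() * val
        return self.grad_val


def sin(x):
    new_parent = rScalar(math.sin(x.val))
    x.parents.append((new_parent, math.cos(x.val)))
    return new_parent


x = rScalar(2.0)
y = rScalar(3.0)
f = sin(x * y)          # f(x, y) = sin(x * y)

f.grad_val = 1          # seed: df/df = 1
df_dx = x.gradient()    # analytically y * cos(x * y)
df_dy = y.gradient()    # analytically x * cos(x * y)
```

The call `x.gradient()` climbs from `x` to the product node and then to `f`, where the seed stops the recursion. The second call, `y.gradient()`, reuses the value already cached on the product node.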

## Multiple Functions

- Initialize and manage **rScalar** objects through a centralized object: **rAutoDiff**
- **rAutoDiff** objects provide the user interface for calculating derivatives
- **rAutoDiff** objects "know" when to reset the gradient values cached in **rScalar** instances

## Multiple Functions

In [None]:
def __init__(self):
    self._func = None
    self._universe = []

def _reset_universe(self, var):
    var.grad_val = None
    for parent, _ in var.parents:
        self._reset_universe(parent)

- **rAutoDiff** objects keep track of user defined **rScalar** objects
- **rAutoDiff** objects "remember" the last function the user differentiated
- When necessary, **rAutoDiff** objects can reset the gradient values cached in all **rScalar** variables

In [None]:
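def partial(self, func, var):
    # Cached gradients are only valid for the last function differentiated
    if (self._func != func):
        for item in self._universe:
            self._reset_universe(item)
        # Seed the backward pass: d(func)/d(func) = 1
        func.grad_val = 1
        self._func = func
    return var.gradient()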
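
The need for the reset shows up as soon as two functions share an input. The following is a simplified sketch rather than Dotua's actual classes: `Manager` and its `create` method are hypothetical names standing in for **rAutoDiff**, and the `rScalar` stand-in supports only multiplication of two nodes.

```python
class rScalar:
    """Minimal stand-in for Dotua's rScalar (illustration only)."""

    def __init__(self, val):
        self.val = val
        self.parents = []
        self.grad_val = None

    def __mul__(self, other):
        new_parent = rScalar(self.val * other.val)
        self.parents.append((new_parent, other.val))
        other.parents.append((new_parent, self.val))
        return new_parent

    def gradient(self):
        if self.grad_val is None:
            self.grad_val = 0
            for parent, val in self.parents:
                self.grad_val += parent.gradient() * val
        return self.grad_val


class Manager:
    """Hypothetical, simplified stand-in for rAutoDiff."""

    def __init__(self):
        self._func = None
        self._universe = []

    def create(self, val):
        # Register every input so its cached gradient can be cleared later
        var = rScalar(val)
        self._universe.append(var)
        return var

    def _reset_universe(self, var):
        var.grad_val = None
        for parent, _ in var.parents:
            self._reset_universe(parent)

    def partial(self, func, var):
        if self._func != func:
            for item in self._universe:
                self._reset_universe(item)
            func.grad_val = 1
            self._func = func
        return var.gradient()


m = Manager()
x = m.create(2.0)
y = m.create(5.0)
f = x * y   # df/dx = y = 5
g = x * x   # dg/dx = 2x = 4

df_dx = m.partial(f, x)
dg_dx = m.partial(g, x)
```

Both `f` and `g` hang off the same `x`. While `f` is being differentiated, `g` is an unseeded output with no parents, so it contributes a gradient of 0. Switching to `g` triggers the reset; without it, `x` would return the value it cached for `f`.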
