# <img style="float: left; padding-right: 10px; width: 45px" src="https://raw.githubusercontent.com/Harvard-IACS/2018-CS109A/master/content/styles/iacs.png"> CS207 Systems Development for Computational Science
## Final Project


**Harvard University**<br/>
**Fall 2019**<br/>
**Instructor**: David Sondak <br/>
**Team #14**: Fantastic Four<br/>
**Students**: Daniel Cox, Anna Davydova, Stephen Moon, Valentina Toll Villagra   <br/>
**Git Repository**: https://github.com/IACS-CS-207-FantasticFour/cs207-FinalProject


<hr style="height:2pt">

*“Nothing takes place in the world whose meaning is not that of some maximum or minimum.”*
― Leonhard Euler

# INTRODUCTION:
<br/>
This project aims to deliver DeltaPi, an Automatic Differentiation software package that calculates derivatives efficiently and accurately, making it useful across a wide spectrum of applications. In this package, we implement the forward mode of Automatic Differentiation. We will also provide an extension package, HedgeDeltaPi, that calculates delta for European call and put options, allowing for efficient delta hedging. We are especially excited about the applications of this software on Wall Street, where Automatic Differentiation has been adopted fairly recently and has already shown significant improvements in efficiency and/or accuracy over the alternative methods (Hedin, 2019). <br/>
<br/>
Derivatives are powerful and ubiquitous. Their use spans from gradients and Hessians in machine learning applications to hedge sensitivities in financial markets. However, of the four existing methods for derivative calculation (manual, numerical, symbolic, and automatic), only Automatic Differentiation (AD) combines interpretability, efficiency and accuracy. Manual differentiation is inefficient and susceptible to error (Baydin et al., 2018). Numerical differentiation, while easy and fairly quick to implement, is inaccurate because it is prone to rounding and truncation errors (Jerrell, 1997). It also does not scale well, which makes it a poor choice for machine learning models. Symbolic differentiation, while accurate, can become extremely complex because of "expression swell" (Corliss, 1988). AD overcomes these issues by applying the chain rule step by step, and it computes derivatives accurately and with asymptotic efficiency (Baydin et al., 2018). <br/>
<br/>
Our software DeltaPi will implement AD methods, allowing the end user to benefit from its accuracy and efficiency. Our goal is to produce a package that can handle a wide variety of uses beyond simple scalar functions, together with an extension package that computes deltas for financial options.

# BACKGROUND:
<br/>
Generally speaking, any function in a computer program can be broken down into elementary operations (unary or binary) such as addition, subtraction, log, sin, sqrt, etc. (Heath, 2018). Since the partial derivatives of these elementary operations are easy to calculate, the derivative of the entire function can be calculated by applying the chain rule.<br/>

Let us walk through this process in more detail. The chain rule is the key pillar of the AD process. Recall that for a function $h(u(t))$:

$$\dfrac{\partial h}{\partial t} = \dfrac{\partial h}{\partial u}\dfrac{\partial u}{\partial t}$$ 

We can extend the chain rule to a function of two functions, $h(u(t),v(t))$ (Sondak, 2019):

\begin{align}
  \displaystyle 
  \frac{\partial h}{\partial t} = \frac{\partial h}{\partial u}\frac{\partial u}{\partial t} + \frac{\partial h}{\partial v}\frac{\partial v}{\partial t}.
\end{align}

From the perspective of a gradient, this result generalizes to the following rule:

\begin{align}
  \nabla_{x}h = \sum_{i=1}^{n}{\frac{\partial h}{\partial y_{i}}\nabla y_{i}\left(x\right)}.
\end{align}

Thus, in a nutshell, the AD process consists of breaking the function down into its elementary components and differentiating them in the order in which they are evaluated, combining the results with the chain rule. The output of this process is a dual number containing the value of the function along with its derivative. We can visualize this process with a computational graph and a table that contains the trace of the calculations.

For example, for a simple function (*adapted from class exercises*): $$f\left(x,y\right) = (\sin(x)-\cos(y))^2.$$ 

The forward mode AD graph looks as follows:

<img src="simple_example.png" width="500" height="240" align="center"/>

The corresponding computational table for the values $x=\frac{\pi}{2}$ and $y=\frac{\pi}{3}$ looks as follows:

| Trace    | Elementary Operation | Current Function Value | Derivative | Partial Derivative w.r.t. x | Partial Derivative w.r.t. y | Value and Derivative Pair |
| :------: | :------------------: | :--------------------: | :--------: | :-------------------------: | :-------------------------: | :-----------------------: |
| $x_{1}$  | $x_{1}$              | $\frac{\pi}{2}$        | $\dot{x_1}$ | $1$ | $0$ | $\left(\frac{\pi}{2}, 1\right)$ |
| $x_{2}$  | $x_{2}$              | $\frac{\pi}{3}$        | $\dot{x_2}$ | $0$ | $1$ | $\left(\frac{\pi}{3}, 1\right)$ |
| $x_{3}$  | $\sin(x_1)$          | $1$                    | $\cos(x_1)\dot{x_1}$ | $0$ | $0$ | $\left(1, 0\right)$ |
| $x_{4}$  | $\cos(x_2)$          | $\frac{1}{2}$          | $-\sin(x_2)\dot{x_2}$ | $0$ | $-\frac{\sqrt{3}}{2}$ | $\left(\frac{1}{2}, -\frac{\sqrt{3}}{2}\right)$ |
| $x_{5}$  | $x_3-x_4$            | $\frac{1}{2}$          | $\dot{x_3}-\dot{x_4}$ | $0$ | $\frac{\sqrt{3}}{2}$ | $\left(\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$ |
| $x_{6}$  | $x_5^2$              | $\frac{1}{4}$          | $2x_5\dot{x_5}$ | $0$ | $\frac{\sqrt{3}}{2}$ | $\left(\frac{1}{4}, \frac{\sqrt{3}}{2}\right)$ |







This example shows how simple differentiation becomes once the function is reduced to elementary operations linked by the chain rule, and how interpretable the resulting trace is, which could be useful to our end users.
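
The trace in the table can be reproduced with a few lines of plain Python. The sketch below is only an illustration and is not part of the DeltaPi package: the function name `forward_trace` and its arguments are hypothetical, and each intermediate is stored as a value together with its derivative, mirroring the rows $x_3$ to $x_6$.

```python
import math

# Inputs from the table: x = pi/2, y = pi/3.
x_val, y_val = math.pi / 2, math.pi / 3


def forward_trace(x1, x2, dx1, dx2):
    """Evaluate f = (sin(x1) - cos(x2))**2 and one partial derivative.

    dx1 and dx2 are the seed derivatives of the two inputs
    (hypothetical helper, written only to mirror the table rows).
    """
    x3, dx3 = math.sin(x1), math.cos(x1) * dx1    # x3 = sin(x1)
    x4, dx4 = math.cos(x2), -math.sin(x2) * dx2   # x4 = cos(x2)
    x5, dx5 = x3 - x4, dx3 - dx4                  # x5 = x3 - x4
    x6, dx6 = x5 ** 2, 2 * x5 * dx5               # x6 = x5^2
    return x6, dx6


value, df_dx = forward_trace(x_val, y_val, 1.0, 0.0)  # seed x
_, df_dy = forward_trace(x_val, y_val, 0.0, 1.0)      # seed y
print(value, df_dx, df_dy)
```

Up to floating-point rounding, the printed numbers agree with the last row of the table: a value of $\frac{1}{4}$, a partial derivative of $0$ with respect to x, and $\frac{\sqrt{3}}{2}$ with respect to y.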

The AD process generalizes nicely to calculating a function's Jacobian, where the function<br/>
$f: \mathbb{R}^n \rightarrow \mathbb{R}^m $ has n independent (input) variables $x_i$ and m dependent (output) variables $y_j$ (Baydin et al., 2018). Here, for each forward pass only one of the variables $\dot{x_i}$ is initialized at $\dot{x_i}=1$, while the rest are set to 0. Every time we evaluate the derivative at a specified value of x such as $x=a$, we are computing:<br/>
$$\left.\dot{y}_j=\frac{\partial{y_j}}{\partial{x_i}}\right\vert_{x=a}, \quad j=1, \ldots, m$$

This gives us one column of the $m \times n$ Jacobian matrix:<br/>
    $$\left. J_f=
    \begin{bmatrix}
    \frac{\partial{y_1}}{\partial{x_1}} & \cdots & \frac{\partial{y_1}}{\partial{x_n}} \\
    \vdots & \ddots & \vdots \\
    \frac{\partial{y_m}}{\partial{x_1}} & \cdots & \frac{\partial{y_m}}{\partial{x_n}}
    \end{bmatrix}\right\vert_{x=a}$$

When we finish the evaluations for each of $x_1, \ldots, x_n$, we end up with the full Jacobian matrix. What is particularly useful about AD is that it provides an efficient, matrix-free way of calculating Jacobian-vector products (Baydin et al., 2018). Consider the following Jacobian-vector product:

$$ J_f r=
    \begin{bmatrix}
    \frac{\partial{y_1}}{\partial{x_1}} & \cdots & \frac{\partial{y_1}}{\partial{x_n}} \\
    \vdots & \ddots & \vdots \\
    \frac{\partial{y_m}}{\partial{x_1}} & \cdots & \frac{\partial{y_m}}{\partial{x_n}}
    \end{bmatrix}
    \begin{bmatrix}
    r_1\\
    \vdots\\
    r_n
    \end{bmatrix}$$

All we have to do here is initialize $\dot{x}=r$, and we can compute the entire Jacobian-vector product in one forward pass, without ever forming the Jacobian itself. In general, AD forward mode is very efficient for $f: \mathbb{R}\rightarrow \mathbb{R}^m $, where all of the derivatives are computed in one forward pass. <br/>
<br/>
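
The seeding idea can be sketched in a few lines of Python. This is an illustration under stated assumptions and not DeltaPi code: dual numbers are represented as plain `(value, derivative)` tuples, the helpers `d_add`, `d_mul`, `d_sin`, `d_const`, and `jvp` are hypothetical names, and the example map $f(x_1, x_2) = (x_1 x_2,\ \sin(x_1) + 3x_2)$ is chosen only for demonstration.

```python
import math

# A dual number is represented here as a plain (value, derivative) tuple.


def d_const(c):
    return (c, 0.0)                                   # constants carry no derivative


def d_add(u, v):
    return (u[0] + v[0], u[1] + v[1])                 # sum rule


def d_mul(u, v):
    return (u[0] * v[0], u[1] * v[0] + u[0] * v[1])   # product rule


def d_sin(u):
    return (math.sin(u[0]), math.cos(u[0]) * u[1])    # chain rule


def f(x1, x2):
    """Example map from R^2 to R^2: (x1 * x2, sin(x1) + 3 * x2)."""
    return [d_mul(x1, x2), d_add(d_sin(x1), d_mul(d_const(3.0), x2))]


def jvp(func, point, r):
    """Return the Jacobian-vector product J_f(point) r from one forward pass."""
    inputs = [(p, ri) for p, ri in zip(point, r)]     # seed x-dot with r
    return [out[1] for out in func(*inputs)]


a = [2.0, 5.0]
r = [1.0, -1.0]
print(jvp(f, a, r))
```

Seeding with a unit vector such as `r = [1.0, 0.0]` returns one column of the Jacobian, which is the special case described earlier in this section.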
***Please Note:*** AD forward mode is not always efficient! In the case where we have $f: \mathbb{R}^n \rightarrow \mathbb{R}$, AD forward mode performs n evaluations, outputting one column (here a single entry) at a time of the $1 \times n$ Jacobian matrix. Using reverse mode here would be significantly more efficient, as we would need just one pass. More generally, for $f: \mathbb{R}^n\rightarrow \mathbb{R}^m $ with $n \gg m$, reverse mode is more efficient. For a function $f: \mathbb{R}^n\rightarrow \mathbb{R}^m $, where $ops(f)$ is the number of operations contained within the function, the time taken by AD forward mode can be approximated as $n \cdot c \cdot ops(f)$, while for AD reverse mode the time is $m \cdot c \cdot ops(f)$, where c is some constant with $c < 6$, usually in the $[2, 3]$ range (Griewank and Walther, 2008). For more information on AD reverse mode please see [Automatic Differentiation in Machine Learning: A Survey by Baydin et al](https://arxiv.org/pdf/1502.05767.pdf).

# HOW TO USE OUR PACKAGE:

## HOW TO INSTALL
### 1. Check Conda is Installed and Updated  
* In the command line of a terminal window, check you have conda installed by entering:
        conda -V
* If conda is installed you will see:
        conda -V
        conda x.x.x
* To check conda is updated, type:
        conda update conda
        
### 2. Create a Virtual Environment  
* In the command line, enter:
        conda create --name env_name python
  where `env_name` is your desired name for the environment
<p>&nbsp;</p>
* To specify the python version, use:
        conda create --name env_name python=x.x
        
### 3. Activate the Virtual Environment  
* To activate or switch into your new virtual environment, type:
        source activate env_name
        
### 4. Download the Package
* Navigate to the folder where you want to install the package
* To retrieve the folder with the AutoDiff class and tests, clone the GitHub repository using either of these URLs:
        git clone https://github.com/IACS-CS-207-FantasticFour/cs207-FinalProject.git
        
        git clone git@github.com:IACS-CS-207-FantasticFour/cs207-FinalProject.git
        
       
* Navigate to the folder where the files are stored by entering:
        cd cs207-FinalProject/code

### 5. Activate and Use Package 
* Follow the instructions in the following section, 'HOW TO USE', for step-by-step details and examples.
        
### 6. Deactivating and Removing the Virtual Environment  
* Once you have finished using the AutoDiff package, you can deactivate the virtual environment by entering:
        source deactivate
* To remove the virtual environment completely, type:
        conda remove --name env_name --all

## HOW TO USE
### To use the AutoDiff Class
* Let `x,y,z,...` be the input variables
* Let `f(x,y,z,...)` be the output of function f at (x,y,z,...)
* Let `AutoDiff` be the class 

#### 0. Import AutoDiff class (for using in script)
        from AutoDiff import AutoDiff as AutoDiff

#### 1. Create an instance of the AutoDiff class  
        x = AutoDiff(val, derv)     
where:  
* `val` = the value of x at the desired point  
* `derv` = the initial value of the derivative of x  
    * 1 for calculating the partial derivative df/dx
    * 0 for calculating the partial derivative df/dy,df/dz,...

#### 2. Repeat the process of creating an instance of the AutoDiff class for each input variable  
        y = AutoDiff(val, derv)
        z = AutoDiff(val, derv)
        ...
where:
* `val` = the value of the instance variable (y,z,...) at the desired point
* `derv` = the initial value of the derivative of the instance variable (y,z,...)
    * 1 for calculating the partial derivative of the instance variable (df/dy for y, df/dz for z,...)
    * 0 for calculating the partial derivative of another variable that is not the instance one (df/dx,df/dz for y, df/dx,df/dy for z,...)

#### 3. Enter the function of interest  
        f = function of x,y,z,...
where `function` can contain any of the operations listed in Table 2 in the Implementation section

#### 4. Print f.val to get the calculated value of f at the specified point (x,y,z,...)  
        print(f.val)

#### 5. Print f.derv to get the calculated value of df/dx or df/dy or df/dz or ... at the specified point (x,y,z,...)  
        print(f.derv)
        
### Examples

* For values:
        x = 3
        y = 4
        f = x + 2 * y

#### Goal: calculate the partial derivative df/dx at (x,y)=(3,4)

0. Import AutoDiff class
        from AutoDiff import AutoDiff as AutoDiff

1. Create an instance of the AutoDiff class
        x = AutoDiff(3, 1)
        
2. Repeat the process of creating an instance of the AutoDiff class for y
        y = AutoDiff(4, 0)

3. Enter the function of interest
        f = x + 2 * y

4. Print f.val to get the calculated value of f at (3,4)
        print(f.val)

5. Print f.derv to get the calculated value of df/dx at (3,4)
        print(f.derv)


#### Goal: calculate the partial derivative df/dy at the same (x,y)=(3,4)

0. Import AutoDiff class
        from AutoDiff import AutoDiff as AutoDiff

1. Create an instance of the AutoDiff class
        x = AutoDiff(3, 0)

2. Repeat the process of creating an instance of the AutoDiff class for y
        y = AutoDiff(4, 1)

3. Enter the function of interest
        f = x + 2 * y

4. Print f.val to get the calculated value of f at (3,4); this is the same value as in the df/dx case
        print(f.val)

5. Print f.derv to get the calculated value of df/dy at (3,4)
        print(f.derv)
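
To make the expected results of the two runs concrete without installing the package, here is a minimal sketch. The class `MiniAutoDiff` is a hypothetical stand-in written for this illustration; it only mimics the `val` and `derv` attributes and the two operators this example needs, and it is not the package's `AutoDiff` implementation.

```python
class MiniAutoDiff:
    """Hypothetical stand-in exposing the same val/derv attributes as AutoDiff."""

    def __init__(self, val, derv):
        self.val = val
        self.derv = derv

    def __add__(self, other):
        if isinstance(other, MiniAutoDiff):
            return MiniAutoDiff(self.val + other.val, self.derv + other.derv)
        return MiniAutoDiff(self.val + other, self.derv)

    __radd__ = __add__

    def __mul__(self, other):
        if isinstance(other, MiniAutoDiff):
            return MiniAutoDiff(self.val * other.val,
                                self.derv * other.val + other.derv * self.val)
        return MiniAutoDiff(self.val * other, self.derv * other)

    __rmul__ = __mul__


def f(x, y):
    return x + 2 * y


# Seed x (derv = 1) to obtain df/dx at (3, 4).
fx = f(MiniAutoDiff(3, 1), MiniAutoDiff(4, 0))
# Seed y (derv = 1) to obtain df/dy at (3, 4).
fy = f(MiniAutoDiff(3, 0), MiniAutoDiff(4, 1))

print(fx.val, fx.derv)  # 11 1
print(fy.val, fy.derv)  # 11 2
```

Both runs give the same function value, 11, and only the seeded variable's partial derivative is returned: df/dx = 1 in the first run and df/dy = 2 in the second.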

### To use the all_derivatives() function

* Let `x,y,z,...` be the input variables
* Let `func(x,y,z,...)` be the output of function f at (x,y,z,...)
* Let `all_derivatives` be the function that uses the `AutoDiff` class

#### 0. Import all_derivatives function
        from All_derivatives import all_derivatives as all_derivatives

#### 1. Create a vector containing the input variables
        in_vars = ['x','y','z',...]
where:
* `'x','y','z',...` are variable names, given as strings

#### 2. Create a vector containing the variable values
        in_vals = [val_x,val_y,val_z,...]
where:
* `val_x` = the value of x at which to evaluate the function
* `val_y` = the value of y at which to evaluate the function
* `val_z` = the value of z at which to evaluate the function

#### 3. Enter the function to be evaluated
        def func(vars):
            return function of vars[0],vars[1],vars[2],...
where `vars` is the list of input variables in the same order as `in_vars`, and `function` can contain any of the operations listed in Table 2 in the Implementation section

#### 4. Run all_derivatives()
        out_val, out_dervs = all_derivatives(func, in_vars, in_vals)

#### 5. Print out_val to get the calculated value of func at (in_vals)
        print(out_val)

#### 6. Print out_dervs to get the vector of calculated partial derivative values, one for each variable in in_vars
        print(out_dervs)

### Example

* For values:
        x = 3
        y = 4
        z = 5
        func = x + 2 * y - z

#### Goal: calculate all the partial derivatives at once
0. Import all_derivatives function
        from All_derivatives import all_derivatives as all_derivatives
        
1. Create a vector containing the input variables
        in_vars = ['x','y','z']

2. Create a vector containing the variable values
        in_vals = [3,4,5]

3. Enter the function to be evaluated
        def func(vars):
            return vars[0] + 2 * vars[1] - vars[2]

4. Run all_derivatives()
        out_val, out_dervs = all_derivatives(func, in_vars, in_vals)

5. Print out_val to get the calculated value of func at (in_vals)
        print(out_val)

6. Print out_dervs to get the vector of calculated partial derivative values, one for each variable in in_vars
        print(out_dervs)
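
As a hand check on this example, the expected outputs can be worked out directly, since $f(x,y,z) = x + 2y - z$ is linear. The only assumption made here is that the partial derivatives are reported in the order of `in_vars`:

\begin{align}
  f(3,4,5) = 3 + 2\cdot 4 - 5 = 6, \qquad
  \nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = (1,\ 2,\ -1).
\end{align}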
        
### To use multi_func_all_derivatives() function
* Let `x,y,z,...` be the input variables
* Let `func(x,y,z,...)` be the output of function f at (x,y,z,...)
* Let `multi_func_all_derivatives` be the function that uses the `AutoDiff` class

#### 0. Import multi_func_all_derivatives function
        from All_derivatives import multi_func_all_derivatives as multi_func_all_derivatives

#### 1. Create a vector containing the input variables
        in_vars = ['x','y','z',...]
where:
* `'x','y','z',...` are variable names, given as strings

#### 2. Create a vector containing the variable values
        in_vals = [val_x,val_y,val_z,...]
where:
* `val_x` = the value of x at which to evaluate the function
* `val_y` = the value of y at which to evaluate the function
* `val_z` = the value of z at which to evaluate the function

#### 3. Create a vector containing the functions to be evaluated
        funcs = [func_1,func_2,func_3,...]
where:
* `func_n` = a Python function that takes the list of input variables and returns a function of x,y,z,...

where `func_n` can contain any of the operations listed in Table 2 in the Implementation section

#### 4. Run multi_func_all_derivatives()
        out_vals, out_dervs_matrix = multi_func_all_derivatives(funcs, in_vars, in_vals)

#### 5. Print out_vals to get the vector of calculated values of all funcs, each at (in_vals)
        print(out_vals)

#### 6. Print out_dervs_matrix to get the Jacobian matrix of partial derivatives evaluated at (in_vals)
        print(out_dervs_matrix)
        
### Example
* For values:
        x = 3
        y = 4
        z = 5
        func_1 = x + 2 * y - z
        func_2 = x - exp(y) + z

#### Goal: calculate all the partial derivatives of more than one function at once
0. Import numpy and the multi_func_all_derivatives function
        import numpy as np
        from All_derivatives import multi_func_all_derivatives as multi_func_all_derivatives
        
1. Create a vector containing the input variables
        in_vars = ['x','y','z']

2. Create a vector containing the variable values
        in_vals = [3,4,5]

3. Create a vector containing the functions to be evaluated
        def func_1(vars):
            return vars[0] + 2 * vars[1] - vars[2]
        def func_2(vars):
            return vars[0] - np.exp(vars[1]) + vars[2]
        funcs = [func_1,func_2]

4. Run multi_func_all_derivatives()
        out_vals, out_dervs_matrix = multi_func_all_derivatives(funcs, in_vars, in_vals)

5. Print out_vals to get the vector of calculated values of all funcs, each at (in_vals)
        print(out_vals)

6. Print out_dervs_matrix to get the Jacobian matrix of partial derivatives evaluated at (in_vals)
        print(out_dervs_matrix)
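
As a hand check on this example, the expected function values and the matrix of first partial derivatives (the Jacobian) can be worked out directly. The layout assumed here is one row per function and one column per input variable, in the order of `in_vars`:

\begin{align}
  \begin{pmatrix} f_1(3,4,5) \\ f_2(3,4,5) \end{pmatrix} =
  \begin{pmatrix} 6 \\ 8 - e^{4} \end{pmatrix}, \qquad
  J = \begin{bmatrix} 1 & 2 & -1 \\ 1 & -e^{4} & 1 \end{bmatrix}, \qquad
  e^{4} \approx 54.598.
\end{align}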

# SOFTWARE ORGANIZATION:

At the moment, our directory structure is set up in such a way that our forward mode module is both easy to access and easy to test; the current structure is a temporary placeholder for a more comprehensive one to be implemented after we pin down some final specifics regarding our extension. We are still discussing whether or not to include the reverse mode of auto-differentiation in our package as the option-pricing extension would be viable without it; it would just be much slower than ideal if implemented with just the forward mode.
    
Either way, as it stands, the entirety of our forward mode implementation is contained in a single module, AutoDiff.py. Given functions of multiple variables and specific values, our module can evaluate both the functions and their partial derivatives at the given values. As of now, this single module is contained in a directory called “code.” Our test module - “test_AutoDiff.py” - also lives in this directory. The tests included in our test module are automatically run by TravisCI upon each commit to the remote repository. We haven’t experienced any issues with this testing method yet, so it seems that this is how we will test our code for the entirety of the project.
    
As for our directory structure moving forward, it still depends on whether we will be able to complete a fully functional module for the reverse mode of auto-differentiation. If we are, then the directory structure will be similar to the one outlined in our 1st milestone (see the figure at the end of this section).
    
If we are not able to implement the reverse mode, then we will have to edit this structure a little. Most likely, we will just have the option pricing module and the associated tests in our extension's sub-package without the reverse mode implementation. 
    
We have not yet put our package on PyPI, since we still have a lot of structural elements to figure out. As of now, a user who wants to use the package has to download it independently, write the appropriate print statements at the bottom of AutoDiff.py (for the desired values), and run the module from the command line.

<img src="Directory_Structure.png" width="200" height="100" align="left"/>

# IMPLEMENTATION:

Our DeltaPi software provides the user with a package for evaluating functions of multiple variables at a specified point while at the same time evaluating the partial derivatives of the function at that same point. A class is provided to perform simple calculations, and two functions are provided to handle vector inputs and outputs. DeltaPi performs automatic differentiation via the algorithm known as forward mode.

Our strategy has been to create input-variable objects in Python that, once defined, may be used in Python calculations as if they were standard Python variables of type float. However, when calculations are done with these AutoDiff objects, a partial derivative with respect to one of the input variables is calculated and carried along. The object representing the result then has an attribute that holds the value of the function at the specified point (val) and an attribute that holds a partial derivative with respect to one of the input variables (derv). The user specifies which partial derivative is calculated by initializing the derv attribute of one input variable to 1 and the derv attributes of the other input variables to 0. The calculation can then be repeated to get partial derivatives with respect to other input variables by changing which input variable's derv attribute is initialized to 1. A function *all_derivatives( )* is supplied to automate this process.

*The AutoDiff( ) class*

Most of the functionality of our DeltaPi package is implemented in a single class called AutoDiff( ). Code for this class can be found in the module 'AutoDiff.py' in the *code* directory. It has two attributes, self.val and self.derv, that respectively hold the value of the function being evaluated and the partial derivative of the function being evaluated up to some point in the calculation.

AutoDiff( ) has many methods. Each is short and designed to replace a standard mathematical operation. For example, the method \_\_add\_\_ overloads the standard addition operator (+) and thus specifies how objects of type AutoDiff are added to scalar values or to each other. Similar methods overload the other standard mathematical operators (-, \*, /, \*\*). Methods are also included to replace many numpy elementary functions so that these functions may be used in any calculations involving AutoDiff objects. Two functions, LogN and Logist, are defined separately outside the AutoDiff class: because these functions are not implemented in numpy, they had to be overloaded with a different approach. A table of the functions and operations overloaded in the AutoDiff module is shown below:

| Arithmetic     | Power    | Log     | Unary   | Trig    | Hyperbolic | Special  | Comparison |  
| :------------  | :------- | :------ | :------ | :------ | :--------- | :------- | :--------: |
| Addition       | $x^{a}$  | log     | neg     | sin     | sinh       | logistic |    >       |
| Subtraction    | $a^{x}$  | log2    | pos     | cos     | cosh       |          |    <       |
| Multiplication | $y^{x}$  | log10   |         | tan     | tanh       |          |    ==      |
| Division       | $e^{x}$  | logN    |         | arcsin  | arcsinh    |          |    <=      |
|                |          |         |         | arccos  | arccosh    |          |    >=      |
|                |          |         |         | arctan  | arctanh    |          |    !=      |

<center>Table 2: AutoDiff class supported functions</center>

Central to all AutoDiff( ) methods is the principle that each method returns a new AutoDiff object that represents the next stage in the function's evaluation. For example, if x is an AutoDiff object and it is multiplied by the scalar 2 (x\*2), then x's \_\_mul\_\_ method is called, and it returns a new object that represents x\*2. The AutoDiff.\_\_mul\_\_ method performs this calculation as follows. It first assumes that 2 is another object of type AutoDiff and tries to multiply the two AutoDiff objects together. When it finds that 2 is not an AutoDiff object, it falls through to its except block, where it makes two calculations, 2 \* x.val and 2 \* x.derv, and stores these values in the returned object's val and derv attributes. If the 2 had been to the left of x (2\*x), then the \_\_rmul\_\_ method would have been called, and it would have executed the same operation.  

More subtle is the case where two AutoDiff objects, say x and y, are multiplied together (x\*y). Here x's \_\_mul\_\_ method is called. It finds that both objects are instances of AutoDiff, and it 1) multiplies x.val and y.val together and stores the result as the returned object's val attribute, and 2) calculates (x.derv\*y.val + y.derv\*x.val), which is the product rule, and stores this value as the returned object's derv attribute. Notice that in the second calculation the derivatives from both objects x and y are carried along as a sum. Recall, however, that when x and y are distinct input variables, the derv value of at least one of them will have been initialized to 0, so at least one term in the sum will evaluate to 0. When x and y are intermediate results that both depend on the seeded variable, both terms may be nonzero, and the sum is exactly what the product rule requires. In this way, the algorithm automatically carries along the proper derivative for subsequent calculations when two AutoDiff objects are combined. This same approach has been used generally to overload the division and power operators as well.
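
The multiplication logic described in the two paragraphs above can be sketched as follows. This is a hedged illustration and not the code in AutoDiff.py: the class name `MiniAutoDiff` is hypothetical, and the choice of `AttributeError` as the exception that triggers the scalar branch is an assumption about how such a try/except pattern could be written.

```python
class MiniAutoDiff:
    """Hypothetical sketch of the try/except multiplication pattern."""

    def __init__(self, val, derv):
        self.val = val
        self.derv = derv

    def __mul__(self, other):
        try:
            # Assume other is an AutoDiff-like object: apply the product rule.
            return MiniAutoDiff(self.val * other.val,
                                self.derv * other.val + other.derv * self.val)
        except AttributeError:
            # other is a plain number: scale both the value and the derivative.
            return MiniAutoDiff(self.val * other, self.derv * other)

    def __rmul__(self, other):
        # Called for scalar * object; multiplication commutes.
        return self.__mul__(other)


x = MiniAutoDiff(3.0, 1.0)   # seeded input variable
y = MiniAutoDiff(4.0, 0.0)   # unseeded input variable

p = x * y          # d(xy)/dx = y
q = 2 * x          # handled by __rmul__
s = (x * y) * x    # both operands depend on x, so both terms are nonzero

print(p.val, p.derv)  # 12.0 4.0
print(q.val, q.derv)  # 6.0 2.0
print(s.val, s.derv)  # 36.0 24.0
```

The last line is the case where neither term of the sum vanishes: for $x^2 y$ the derivative with respect to x is $2xy = 24$ at (3, 4).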

An example of an elementary mathematical function implemented in AutoDiff( ) is the exponential function. It works as follows. For a given AutoDiff object, say x, when exp(x) is executed, x's exp() method is called. It uses numpy's exp() function to evaluate np.exp(x.val) and stores this value in the returned object's val attribute. It then evaluates the derivative of exp at x.val, which in this case is just exp(x.val), multiplies it by x.derv as the chain rule requires, and stores the result in the returned object's derv attribute. This approach has been used generally for all the elementary functions.
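
A minimal sketch of an elementary function following this pattern is shown below. It is an illustration only: `MiniAutoDiff` is a hypothetical stand-in class, and `math.exp` is swapped in for numpy's `exp()` so that the example needs nothing beyond the standard library.

```python
import math


class MiniAutoDiff:
    """Hypothetical stand-in with only the pieces needed for exp."""

    def __init__(self, val, derv):
        self.val = val
        self.derv = derv

    def exp(self):
        value = math.exp(self.val)
        # d/dt exp(u(t)) = exp(u) * du/dt, so the incoming derv is carried along.
        return MiniAutoDiff(value, value * self.derv)


x = MiniAutoDiff(2.0, 1.0)   # seeded input: d/dx exp(x) = exp(x)
u = MiniAutoDiff(2.0, 3.0)   # stands for an intermediate result with du/dx = 3

print(x.exp().val, x.exp().derv)
print(u.exp().val, u.exp().derv)
```

For the seeded input the derivative equals the value, while for the intermediate result it is three times the value, which is why the multiplication by the incoming `derv` matters.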

AutoDiff objects are designed to work as replacements for Python float variables, so that once the AutoDiff class is defined, these objects can be used to perform forward automatic differentiation at any specified point on any multi-input, single-output function built from the supported operations. 

*The all_derivatives() function*

This function allows the user to calculate all partial derivatives of a given function at once, without having to initialize the derv attributes of the input variables several times manually. It operates by evaluating the desired function once for each input variable, and during each run it changes which variable's derv attribute is set to 1 while the others' derv attributes are set to 0. It takes as input the function to be evaluated, a list of independent variables, and the values of these variables at the desired point. It returns the value of the input function at the desired point and an array of partial derivatives, one for each independent variable. Since the code for this function is not long, it is listed below. It takes advantage of the AutoDiff( ) class.

    import numpy as np
    from AutoDiff import AutoDiff

    def all_derivatives(func, in_vars, point):
        '''
        Function that calculates the value of the input function at the input point and all its partial derivatives

        :params func: (function) function to evaluate whose input is a list of variables
                in_vars: (list)  list of variables to use when calculating the values of the input function
                point: (list)    list of values of each input variable at the input point

        :returns: out_val (scalar) the value of the input function at the input point
                  out_dervs (numpy array , 1 dimensional) an array of the 1st derivative values of the
                  function with respect to each input variable at the input point
                                                          

        :dependencies: requires class AutoDiff from module AutoDiff, package numpy imported as np

        Example:

        f(x,y) = x^2 + y^2

        def func(vars):
            return vars[0]**2 + vars[1]**2

        a, b = all_derivatives(func, ['x','y'], [3,2])
        print(a, b)
        13.0, [6.0, 4.0]

        '''
        auto_diff_vars = []   # list to hold AutoDiff objects, one for each var in in_vars
        out_dervs =[]         # list to hold derivative values, one for each var in in_vars

        # Make AutoDiff objects
        for var, pt in zip(in_vars, point):
            var = AutoDiff(pt, 0)
            auto_diff_vars.append(var)

        # Calculate the value of func at the input point
        f = func(auto_diff_vars)
        out_val = f.val

        # Calculate the partial derivatives of func at the input point
        # Go through AutoDiff objects in auto_diff_vars
        # and during each pass change the derv value of 1 variable to 1, while others are held at 0
        # and calculate the partial derivative
        for i in range(len(auto_diff_vars)):
            auto_diff_vars[i].derv = 1
            f = func(auto_diff_vars)
            out_dervs.append(f.derv)
            auto_diff_vars[i].derv = 0

        return out_val, np.array(out_dervs)


*The multi_func_all_derivatives() function*

This function handles vector inputs and outputs. It allows the user to specify a vector of functions, a vector of input variables, and a vector of values defining a specific point. It then returns a vector of output values (one for each function evaluated at the specified point) and a Jacobian matrix containing the partial derivatives of all functions evaluated at the specified point. Consider the equation below as an example:

\begin{equation}
\begin{pmatrix} f(x,y) \\ g(x,y)\end{pmatrix} =
\begin{pmatrix} 2xy \\ x+y^2 \end{pmatrix} \quad \text{at} \quad \begin{pmatrix} x=a \\ y=b \end{pmatrix}
\end{equation}


Here the multi_func_all_derivatives() function would return:  
1) out_vals, a vector of the values of each function after evaluation $(f(a,b), g(a,b))$  
2) out_dervs_matrix, the Jacobian matrix containing the partial derivatives of each function evaluated at the point (a, b)
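
Worked out by hand for this example, and assuming one row per function and one column per input variable, the two returned quantities would be:

\begin{align}
  \begin{pmatrix} f(a,b) \\ g(a,b) \end{pmatrix} =
  \begin{pmatrix} 2ab \\ a+b^2 \end{pmatrix}, \qquad
  J = \left.\begin{bmatrix}
  \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\
  \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y}
  \end{bmatrix}\right\vert_{(a,b)} =
  \begin{bmatrix} 2b & 2a \\ 1 & 2b \end{bmatrix}.
\end{align}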

Since the code for this function is not long it is listed below. It takes advantage of the
AutoDiff( ) class and the all_derivatives( ) function:


    def multi_func_all_derivatives(functions, in_vars, point):
        '''
        Function that calculates the value of all input functions at the input point and all their
        partial derivatives producing a numpy array of output values and a numpy 2D array (matrix)
        of partial derivatives, the Jacobian

        :params functions: (list) list of functions to evaluate, each taking a list of variables as input
                in_vars: (list)   list of variables to use when calculating the values of the input functions
                point: (list)     list of values of each input variable at the input point

        :returns: out_vals (numpy array 1 dimensional) the values of the input functions at the input point
                  out_dervs_matrix (numpy array 2 dimensional) a matrix of the 1st derivative values of each
                  input variable at the input point, the Jacobian.

        :dependencies: requires class AutoDiff from module AutoDiff, function all_derivatives from this module
                                (All_derivatives), and the package numpy imported as np

        Example:

        f(x,y) = x^2 + y^2

        g(x,y) = x^2*y^2


        def func1(vars):
            return vars[0]**2 + vars[1]**2

        def func2(vars):
            return vars[0]**2*vars[1]**2

        a, b = multi_func_all_derivatives([func1,func2] , ['x','y'], [3,2])
        print(a, b)
        [13. 36.] [[ 6.  4.], [24. 36.]]

        '''
        out_vals = []
        out_dervs_matrix =[]

        # for each function in functions list
        # run all_derivatives and store output
        for func in functions:
            val, dervs = all_derivatives(func, in_vars, point)
            out_vals.append(val)
            out_dervs_matrix.append(dervs)

        return np.array(out_vals), np.array(out_dervs_matrix)
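
To see the seeding idea in isolation, here is a minimal, self-contained sketch. The `Dual` class is a hypothetical stand-in for AutoDiff that supports only `+`, `*` and constant powers, and `jacobian` is an illustrative name that is not part of the package. It returns plain Python lists rather than numpy arrays to stay dependency-free.

```python
class Dual:
    """Hypothetical stand-in for AutoDiff: a value plus one derivative."""

    def __init__(self, val, derv=0.0):
        self.val = float(val)
        self.derv = float(derv)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.derv + other.derv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule
        return Dual(self.val * other.val,
                    self.derv * other.val + self.val * other.derv)

    __rmul__ = __mul__

    def __pow__(self, n):
        # Power rule for a constant exponent n
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.derv)


def jacobian(functions, point):
    """Evaluate each function and its partials by seeding one input at a time."""
    values, rows = [], []
    for func in functions:
        values.append(func([Dual(p) for p in point]).val)
        row = []
        for i in range(len(point)):
            # Seed: derivative 1 for variable i, 0 for all the others
            args = [Dual(p, 1.0 if j == i else 0.0) for j, p in enumerate(point)]
            row.append(func(args).derv)
        rows.append(row)
    return values, rows


def func1(v):
    return v[0] ** 2 + v[1] ** 2


def func2(v):
    return v[0] ** 2 * v[1] ** 2


vals, jac = jacobian([func1, func2], [3, 2])
print(vals)  # [13.0, 36.0]
print(jac)   # [[6.0, 4.0], [24.0, 36.0]]
```

The values agree with the docstring example above: each row of the matrix holds the partial derivatives of one function, obtained with one forward pass per input variable.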



*Testing*

We have implemented a series of tests for the AutoDiff( ) class and the all_derivatives and multi_func_all_derivatives functions. There are 67 tests in all. They are designed to be run with *pytest* and to provide full code coverage. They can be found in the modules 'test_AutoDiff.py' and 'test_all_derivatives.py'. Travis CI and codecov indicate that our code passes all tests and that our code coverage is very high.

# EXTENSION: OPTIONS HEDGING

In the financial markets, large institutions (e.g. banks, asset managers) hold complex portfolios with a variety of financial instruments. While calculating the value of such portfolios can be done fairly easily with simulations and aggregation of market pricing, the sensitivities of these portfolios to changes in price, volatility and interest rates in the market can be a lot more difficult to calculate and come with an accuracy vs. efficiency trade off. 

Given the time and scope of this project, we chose to focus on a subset of financial instruments and potential end users. Specifically, we aim to extend our DeltaPi package into a second package, HedgeDeltaPi, that would be useful to equity options traders as they estimate the sensitivities of their options portfolios and purchase underlying assets (delta hedge) to minimize their volatility. Accurate and efficient estimates of these sensitivities would allow options traders to buy the right amounts of options to hedge their stock positions against big swings in asset prices (more on this later).

Here we will focus on vanilla, European, non-dividend-paying stock options, which can only be exercised on their specified expiration date, as opposed to American options, which can be exercised at any time up until the expiration date.

### What are Financial Options?

A call option is a right, but not an obligation, to buy an underlying asset (in our case stock) at a given price (strike price) on a given date (expiration date). Note from the plot below that the option has no intrinsic value below the strike price of USD100. However, once the stock reaches that value and climbs higher, the intrinsic value of the option is the difference between the current price and USD100 (in the >100 region). That makes sense because we can buy the stock at USD100 and sell it at the current price, which is now >100, netting a profit equal to the intrinsic value of the option. When the stock price crosses the strike value like this, the option goes from being out-of-the-money to being in-the-money.

$$\text{Value of European Call Option}$$

![](http://www.quantopia.net/wp-content/uploads/2013/03/EuroCallRates.png)<br/>
$$\text{source: www.quantopia.net/wp-content/uploads/2013/03/EuroCallRates.png}$$

The plot above illustrates the value progression of a European call option. Essentially, the call value (blue curve) is made up of two components: 1) intrinsic value and 2) time value. The green line is the intrinsic value, which is 0 while the option is out of the money and climbs with the stock price once it crosses the strike value at USD100. The orange curve is the time value of the option, which takes into account how far we are from the expiration date (since we cannot exercise the option until then), how close we are to the strike price, the discount rate (the risk-free rate, usually that of a 10-year Treasury) and the overall volatility of the share price. The Black-Scholes equation for pricing options takes all of these variables into account (more on this later).

A put option is a right, but not an obligation, to sell an underlying asset (in our case stock) at a given price (strike) on a given date (expiration). The plot below shows that the put option has no intrinsic value until the stock price drops below the strike. Then the intrinsic value of the option is the profit one would net by buying the stock below USD100 and selling it at the strike price of USD100. As with the call option, the total value of the put option is a combination of intrinsic value and time value, which takes into account how far the stock is from the strike, the time to expiration and the stock volatility, discounted at the risk-free rate.

$$\text{Value of European Put Option}$$

![](http://www.quantopia.net/wp-content/uploads/2013/03/EuroPutRates.png)<br/>
$$\text{source: www.quantopia.net/wp-content/uploads/2013/03/EuroPutRates.png}$$
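
The intrinsic values described above (the kinked lines in the two plots) can be written in a few lines of code. This is a minimal sketch; the function names are illustrative and not part of the package, and the USD100 strike is the one used in the text.

```python
def call_intrinsic_value(stock_price, strike):
    """Intrinsic value of a call: profit from buying at the strike, floored at 0."""
    return max(stock_price - strike, 0.0)


def put_intrinsic_value(stock_price, strike):
    """Intrinsic value of a put: profit from selling at the strike, floored at 0."""
    return max(strike - stock_price, 0.0)


strike = 100.0
for price in (90.0, 100.0, 110.0):
    print(price,
          call_intrinsic_value(price, strike),
          put_intrinsic_value(price, strike))
```

Below the strike only the put has intrinsic value, above it only the call does, and at the strike both are zero. The time value is what the pricing formulas that follow add on top.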

The classical approach to valuing options is the Black-Scholes equation, which describes the option price over time as follows:

$$\frac{\partial V}{\partial t}+\frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}+r S\frac{\partial V}{\partial S} -r V = 0$$

where $V$ is the value of the option, $t$ is time, $\sigma$ is volatility of the underlying asset, $S$ is the price of the underlying asset and $r$ is the risk free rate (Black and Scholes, 1973).

Further, the Black-Scholes-Merton formula for the price of a European call option is as follows:

$$C(S_0,t)=S_0N(d_1)-Ke^{-r(T-t)}N(d_2)$$

and similarly the put option value is as follows:


$$P(S_0,t)=Ke^{-r(T-t)}N(-d_2)-S_0N(-d_1)$$

where $S_0$ is the stock price, $C(S_0,t)$ is the price of the call option and $P(S_0,t)$ is the price of the put option as functions of stock price and time, $K$ is the strike price, $(T-t)$ is the time to maturity in years, and $N(d_1)$ and $N(d_2)$ are values of the cumulative distribution function of the standard normal distribution (Merton, 1973). The expressions below write the stock price at time $t$ as $S_t$, the strike price as $X$ and the volatility of the stock as $\sigma_s$:

$$d_1=\frac{ln(\frac{S_t}{X})+(r+\frac{\sigma^2_s}{2})(T-t)}{\sigma_s\sqrt{(T-t)}}$$

$$d_2=\frac{ln(\frac{S_t}{X})+(r-\frac{\sigma^2_s}{2})(T-t)}{\sigma_s\sqrt{(T-t)}}=d_1-\sigma_s\sqrt{(T-t)}$$

$$N(d_1)=\int_{-\infty} ^{d_1}f(u)du=\int_{-\infty}^{d_1}\frac{1}{\sqrt{2\pi}}e^{\frac{-u^2}{2}}du$$

$$\dot{N(d_1)}=\frac{\partial N(d_1)}{\partial d_1}=\frac{1}{\sqrt{2\pi}}e^{-\frac{d_1^2}{2}}$$
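
The pricing formulas above translate almost line by line into code. The sketch below is a minimal illustration, not the package's implementation: the function names and the inputs are assumptions, and $N(\cdot)$ is evaluated with the error function from Python's standard library.

```python
import math


def norm_cdf(x):
    """Standard normal CDF N(x), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_d2(S, X, r, sigma, tau):
    """Return (d1, d2) for stock price S, strike X, risk-free rate r,
    volatility sigma and time to maturity tau = T - t in years."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return d1, d2


def call_price(S, X, r, sigma, tau):
    d1, d2 = d1_d2(S, X, r, sigma, tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)


def put_price(S, X, r, sigma, tau):
    d1, d2 = d1_d2(S, X, r, sigma, tau)
    return X * math.exp(-r * tau) * norm_cdf(-d2) - S * norm_cdf(-d1)


# Illustrative inputs (assumed, not market data)
S, X, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
call = call_price(S, X, r, sigma, tau)
put = put_price(S, X, r, sigma, tau)
print(call, put)

# Put-call parity: C - P should equal S - X * exp(-r * tau)
print(call - put, S - X * math.exp(-r * tau))
```

The last line is a useful consistency check: the two printed numbers should agree up to floating-point error, since the call and put formulas satisfy put-call parity.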

We can solve for both call and put deltas and obtain closed-form solutions:

$$\Delta_{call}=\frac{\partial C_t}{\partial S_t}=N(d_1)+S_t\frac{\partial N(d_1)}{\partial S_t}-Xe^{-r(T-t)}\frac{\partial N(d_2)}{\partial S_t}= N(d_1)$$

$$\Delta_{put}=\frac{\partial P_t}{\partial S_t}=Xe^{-r(T-t)}\frac{\partial N(-d_2)}{\partial S_t}-N(-d_1)-S_t\frac {\partial N(-d_1)}{\partial S_t}=N(d_1)-1$$
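
The final equality in each line relies on a cancellation that is easy to miss. A short sketch of the reasoning, in the notation above: because $d_2=d_1-\sigma_s\sqrt{(T-t)}$, both $d_1$ and $d_2$ have the same derivative with respect to $S_t$, and the two density terms are related by

$$\dot{N(d_2)}=\dot{N(d_1)}\,e^{d_1\sigma_s\sqrt{(T-t)}-\frac{\sigma_s^2}{2}(T-t)}=\dot{N(d_1)}\,\frac{S_t}{X}\,e^{r(T-t)}$$

so that

$$S_t\frac{\partial N(d_1)}{\partial S_t}-Xe^{-r(T-t)}\frac{\partial N(d_2)}{\partial S_t}=\left(S_t\dot{N(d_1)}-Xe^{-r(T-t)}\dot{N(d_2)}\right)\frac{\partial d_1}{\partial S_t}=0$$

The same cancellation, together with $N(-d_1)=1-N(d_1)$, gives the put delta.
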

Note that delta is positive for call options and negative for put options, which makes sense. As the price of the stock goes up, so does the value of the call option (i.e. the right to buy at a potentially lower price). As the price of the stock goes up, the value of the put option goes down (i.e. the right to sell at a potentially lower price is not valuable). The plot below illustrates this by plotting call/put deltas vs. the 'moneyness' of the option (OTM = Out of the Money, ATM = At the Money and ITM = In the Money).

![](https://www.optiontradingtips.com/images/delta-vs-moneyness.png)<br/>
$$\text{source: https://www.optiontradingtips.com/images/delta-vs-moneyness.png}$$

We can also obtain a closed-form solution for the first-order partial derivative w.r.t. volatility, vega (note that it is the same for both put and call options):

$$v=\frac{\partial{C_t}}{\partial\sigma}=\frac{\partial{P_t}}{\partial\sigma}=S_t\sqrt{T-t}\,\dot{N(d_1)}$$
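
The closed-form delta and vega can be sanity-checked numerically. The sketch below compares them with central finite differences of the Black-Scholes call price, with vega computed from the standard normal density at $d_1$. The function names and the inputs are illustrative assumptions, not part of the package.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d1(S, X, r, sigma, tau):
    return (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def call_price(S, X, r, sigma, tau):
    d_1 = d1(S, X, r, sigma, tau)
    d_2 = d_1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d_1) - X * math.exp(-r * tau) * norm_cdf(d_2)


def call_delta(S, X, r, sigma, tau):
    return norm_cdf(d1(S, X, r, sigma, tau))


def vega(S, X, r, sigma, tau):
    return S * math.sqrt(tau) * norm_pdf(d1(S, X, r, sigma, tau))


def central_difference(f, x, h=1e-4):
    """Second-order finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Illustrative inputs (assumed, not market data)
S, X, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0

fd_delta = central_difference(lambda s: call_price(s, X, r, sigma, tau), S)
fd_vega = central_difference(lambda v: call_price(S, X, r, v, tau), sigma)

print(call_delta(S, X, r, sigma, tau), fd_delta)
print(vega(S, X, r, sigma, tau), fd_vega)
```

Each printed pair should agree to several decimal places. Finite differences are used here only as an independent check; they carry the truncation and rounding errors that motivate automatic differentiation in the first place.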

However, it is important to keep in mind that the volatility $\sigma$ of the option is not static over time (i.e. the volatility surface is not flat). Thus, rather than focusing on the simple delta above, our package takes this into account and calculates the minimum variance delta, which minimizes the volatility of a delta-neutral portfolio (i.e. it is more accurate than the static delta). For local volatility models, and for options that are close to being in the money, several papers (Derman et al., 1995; Coleman et al., 2001) have shown that:

$$\Delta_{MV}=\Delta_{BS}+v_{BS}\frac{\partial{\sigma_{imp}}}{\partial{K}}$$

where $\Delta_{BS}$ and $v_{BS}$ are the Black-Scholes delta and vega respectively, and $\sigma_{imp}$ is the implied volatility. Two approximations of implied volatility have proven to be the most accurate (Isengildina-Massa et al., 2007). We consider both in this package.

1) Implied volatility equation proposed by Bharadia et al. (1996):

$$\sigma_{imp}\approx\sqrt{\frac{2\pi}{(T-t)}}\frac{C-(S-K)/2}{S-(S-K)/2}$$

2) Implied volatility equation from Corrado et al. (1996):

$$\sigma_{imp}\approx\sqrt{\frac{2\pi}{(T-t)}} \frac{1}{S+K}(C-\frac{S-K}{2}+\sqrt{(C-\frac{S-K}{2})^2-\frac{(S-K)^2}{\pi}})$$
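
Both approximations are simple enough to try directly. The sketch below implements them exactly as written above; the function names are illustrative and not part of the package. One assumption is labeled in the code: in the second formula the expression under the square root becomes negative when the option is far from the money (since $1/4 < 1/\pi$), so the sketch clamps it to zero rather than failing.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bharadia_implied_vol(C, S, K, tau):
    """Approximation 1: C is the call price, tau = T - t in years."""
    half_gap = (S - K) / 2.0
    return math.sqrt(2.0 * math.pi / tau) * (C - half_gap) / (S - half_gap)


def corrado_implied_vol(C, S, K, tau):
    """Approximation 2, with the radicand clamped at zero."""
    a = C - (S - K) / 2.0
    radicand = a ** 2 - (S - K) ** 2 / math.pi
    # Assumption: clamp a negative radicand to zero instead of raising an error
    root = math.sqrt(max(radicand, 0.0))
    return math.sqrt(2.0 * math.pi / tau) * (a + root) / (S + K)


# At-the-money call priced with Black-Scholes at sigma = 0.2 and r = 0
# (illustrative inputs, not market data)
S, K, tau, true_sigma = 100.0, 100.0, 1.0, 0.2
d1 = 0.5 * true_sigma * math.sqrt(tau)   # ln(S/K) = 0 and r = 0
d2 = -d1
C = S * norm_cdf(d1) - K * norm_cdf(d2)

print(bharadia_implied_vol(C, S, K, tau))
print(corrado_implied_vol(C, S, K, tau))
```

For this at-the-money case both approximations should land very close to the volatility of 0.2 used to generate the price, which is consistent with the claim that they work best near the money.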

HedgeDeltaPi will leverage the DeltaPi package to calculate the first derivative of $\sigma_{imp}$, and will calculate $\Delta_{BS}$ and $v_{BS}$ using market data entered by the trader, checking it against the currently available live market data. The package will output the following:<br/>
1) Three delta values: <br/>
&nbsp;&nbsp;    a. classic Black Scholes <br/>
&nbsp;&nbsp;    b.  Black Scholes + Bharadia et al. <br/>
&nbsp;&nbsp;    c.  Black Scholes + Corrado et al.<br/>
2) Depending on the trader's position and type of option, the action the trader needs to take in order to delta hedge their position. <br/> 
3) Two visualizations of the volatility surfaces for the non-static approaches:<br/>
&nbsp;&nbsp; a.  Black Scholes + Bharadia et al. <br/>
&nbsp;&nbsp; b.  Black Scholes + Corrado et al.<br/>

The goal of this package is to provide traders with more accurate delta estimates for near-the-money options and to make the process of delta hedging easier.

### What is Delta Hedging?

In the simple case of vanilla, non-dividend-paying, European call/put options above, delta hedging refers to buying or selling specific amounts of the underlying stock to neutralize the impact of stock price moves on the options portfolio.

Consider the following example. Suppose we have a stock XYZ that currently has a price of USD100, and the bank sold call options (a while ago, before the stock price ran up) for 1000 shares with a strike of USD10. Thus, the buyer of the option has the right, but not the obligation, to buy 1000 shares of XYZ at just USD10 on the expiration date (right now the option is deep in the money). If the delta of this call option is 0.5, then for every USD1 increase in the stock price, the value of the call option goes up by USD0.50 (to the holder). To execute a delta hedge, the bank should buy 1000*.5=500 shares of XYZ. Now if the stock price goes down by USD1, the bank will lose USD500 on the stock position but will gain 0.5x1000=USD500 on its short call option position. Thus, the impact is neutral.
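
The arithmetic in this example can be sketched in a few lines. The function names and the sign convention (positive units for long options, negative for short) are assumptions made for illustration and are not taken from the package.

```python
def delta_hedge_order(option_units, delta):
    """Return (action, shares) that offsets the position's delta exposure.

    option_units: number of underlying shares covered by the options,
                  positive if long the options, negative if short.
    """
    exposure = option_units * delta   # share-equivalent exposure of the options
    shares = -exposure                # stock position that cancels it
    if shares > 0:
        action = "buy"
    elif shares < 0:
        action = "sell short"
    else:
        action = "hold"
    return action, abs(shares)


def hedged_pnl(option_units, delta, shares_held, price_move):
    """First-order profit and loss of options plus stock for a small price move."""
    return option_units * delta * price_move + shares_held * price_move


# The bank is short calls on 1000 shares with delta 0.5
action, shares = delta_hedge_order(-1000, 0.5)
print(action, shares)  # buy 500.0

# Stock falls by USD1: the stock loses USD500, the short calls gain USD500
print(hedged_pnl(-1000, 0.5, 500, -1.0))  # 0.0
```

The zero result only holds to first order and for the delta used at the time of the hedge, which is why the hedge needs to be rebalanced as delta changes.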

Keep in mind that delta is not static, as the relationship between option price and stock price is usually nonlinear. This is why accurate and frequent estimates of delta are so important (Lee, 2008).

In summary, HedgeDeltaPi will apply automatic differentiation from DeltaPi to compute deltas using the 3 approaches described above: 1) pure Black-Scholes, 2) Black-Scholes + the simpler implied volatility approximation, 3) Black-Scholes + the more complex implied volatility approximation. The package will then produce a buy/sell recommendation and the quantity of shares of the underlying asset required to delta hedge the trader's position. The package will also produce volatility surface visualizations using the latter 2 approaches (the classic Black-Scholes volatility surface is flat).

**Note:** The package pulls live options data from Yahoo Finance to confirm the current market value of call and put options, as well as realistic expiration dates and strike values. Thus, **the package assumes that the traders are analyzing real options that would practically trade in the market** (i.e. not wildly mis-priced or unrealistically structured options that wouldn't occur in real life). If the user would like to run the package on real options for practice, some valid inputs are listed after the example below.

### How to Use the Delta Hedge Pi Extension


#### 0. Initialize extension in the installed directory
        pythonw DeltaHedgePi.py

#### 1. Enter your option type or exit by selecting 1, 2 or 3 (number)
        Please Select Type of Option to Evaluate
        1) Exit
        2) Puts
        3) Calls

#### 2. Enter the ticker for the stock you want to delta hedge (string)
        Please Enter Ticker

#### 3. Enter the strike price (number)
        Please Enter Strike

#### 4. Enter the expiration year (number)
        Please Enter Expiration Year

#### 5. Enter the expiration month (number)
        Please Enter Expiration Month
        
#### 6. Enter the expiration day (number)
        Please Enter Expiration Day
        
#### 7. Confirm you want to delta hedge by entering 'y'
        Would You Like to Delta Hedge Your Position: y/n?
        
#### 8. Enter the number of units you are long/short (positive number for long, negative number for short)
        How many units? Enter + values for long and - values for short
        
### Example

Scenario:
* Company: Apple
* Ticker: AAPL
* Expiration: 1/24/2020
* Option type: Puts
* Strike: 300
* Ownership: 10,000 units
                    
        [INPUT]     pythonw DeltaHedgePi.py
        [OUTPUT]    Please Select Type of Option to Evaluate
                    1) Exit
                    2) Puts
                    3) Calls
        [INPUT]     2
        [OUTPUT]    Please Enter Ticker
        [INPUT]     aapl
        [OUTPUT]    Please Enter Strike
        [INPUT]     300
        [OUTPUT]    Please Enter Expiration Year
        [INPUT]     2020
        [OUTPUT]    Please Enter Expiration Month
        [INPUT]     1
        [OUTPUT]    Please Enter Expiration Day
        [INPUT]     24
        [OUTPUT]    Black Scholes Delta:  0.20314624495527905
                    negative number has been input to the sqaure root function
                    Bharadia delta:  0.20519111437373902
                    Corrado delta:  0.20416867966450905
                    Would You Like to Delta Hedge Your Position: y/n?
        [INPUT]     y
        [OUTPUT]    How many units? Enter + values for long and - values for short
        [INPUT]     10000
        [OUTPUT]    According to Black Scholes you need to  sell short  2031  shares of  aapl
                    According to Bharadia apporach you need to  sell short  2051  shares of  aapl
                    Accoding to Corrado approach you need to  sell short  2041  shares of  aapl



Possible inputs for practice (run while the market is open, weekdays 9:30-4):

        #inputs:
        #demo tickers:
        #ticker='GE'
        #exp_year=2020
        #exp_month=1
        #exp_day=17
        #exp_date=str(str(exp_month)+'/'+str(exp_day)+'/'+str(exp_year))
        #op_type='puts'
        #strike=8

        #ticker='aapl'
        #exp_year=2020
        #exp_month=1
        #exp_day=24
        #exp_date=str(str(exp_month)+'/'+str(exp_day)+'/'+str(exp_year))
        #op_type='puts'
        #strike=300

        #ticker='nflx'
        #exp_year=2021
        #exp_month=1
        #exp_day=15
        #exp_date=str(str(exp_month)+'/'+str(exp_day)+'/'+str(exp_year))
        #op_type='calls'
        #strike=350

Since we are pulling live data from Yahoo Finance, the package depends on the Yahoo Finance API working properly. **If Yahoo Finance is not working, the package will not work.** For practical applications, our package should therefore be linked to a more reliable platform such as Bloomberg, which is what most traders use. Since we cannot afford Bloomberg, we built the package on the free Yahoo Finance service, which is less reliable.

# **REFERENCES:**<br/>
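Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. Automatic Differentiation in Machine Learning: a Survey. Journal of Machine Learning Research, 18(153):1–43, 2018. <br/> 
Fischer Black and Myron Scholes. The Pricing of Options and Corporate Liabilities. Journal of Political Economy, 81(3):637–654, 1973. Retrieved November 1, 2019, from http://www.jstor.org/stable/1831029<br/>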
George F. Corliss. Application of differentiation arithmetic, volume 19 of Perspectives in Computing, pages 127–48. Academic Press, Boston, 1988.<br/>
Andreas Griewank and Andrea Walther. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Humboldt-Universität zu Berlin, Berlin, Germany. Second Edition, 2008.<br/>
Michael Heath. Scientific Computing: An Introductory Survey. Society for Industrial and Applied Mathematics. 8(6):367, 2018.<br/> 
Cristian Homescu. Adjoints and automatic (algorithmic) differentiation in computational finance. arXiv:1107.1831v1 10 Jul 2011.<br/>
Olga Isengildina-Massa, Charles E. Curtis, Jr., William C. Bridges, Jr. and Minhuan Nian. Accuracy of Implied Volatility Approximations Using “Nearest-to-the-Money” Option Premiums. 2007. Retrieved on November 20, 2019, from https://ageconsearch.umn.edu/record/34927/files/sp07is02.pdf.<br/>
Max E. Jerrell. Automatic differentiation and interval arithmetic for estimation of disequilibrium models. Computational Economics, 10(3):295–316, 1997.<br/>
Cheng-Few Lee. Handbook on Quantitative Finance and Risk Management. Rutgers University. 2008. <br/>
Robert C. Merton. Theory of Rational Option Pricing. The Bell Journal of Economics and Management Science, 4(1):141–183, 1973. Retrieved on November 1, 2019, from http://www.jstor.org/stable/3003143 <br/>
David Sondak. Lectures 10 - 11. Harvard University. CS207 Fall 2019. <br/> 
Eric Sunnegardh and Ludvig Lamm. Efficient Sensitivity Analysis Using Algorithmic Differentiation in Financial Applications. 2015. 