# Symbolic Partial Derivative Routine

## Authors: Zach Etienne & Tyler Knowles

## This module contains a routine for computing partial derivatives of a mathematical expression that is written as several subexpressions.

**Notebook Status:** <font color='green'><b> Validated </b></font>

**Validation Notes:** This tutorial notebook has been confirmed to be self-consistent with its corresponding NRPy+ module, as documented [below](#code_validation). Additionally, this notebook has been validated by checking that results are consistent with exact derivative expressions used in the SEOBNRv3_opt approximant of [LALSuite](https://git.ligo.org/lscsoft/lalsuite).

### NRPy+ Source Code for this module: [SEOBNR_Derivative_Routine.py](../edit/SEOBNR/SEOBNR_Derivative_Routine.py)

## Introduction
$$\label{intro}$$

This notebook documents the symbolic partial derivative routine used to generate analytic derivatives of the [SEOBNRv3](https://git.ligo.org/lscsoft/lalsuite) Hamiltonian (documented [here](../Tutorial-SEOBNR_v3_Hamiltonian.ipynb)) and described in [this article](https://arxiv.org/abs/1803.06346).  In general, this notebook takes as input a file of inter-dependent mathematical expressions (in SymPy syntax), a list of the constants appearing in those expressions, and a list of the variables with respect to which each expression is differentiated.  The output is a text file containing the original expressions followed by the expressions for each partial derivative.  These expressions are intended as input to common subexpression elimination (CSE), which yields efficient partial derivative code.
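
As a rough illustration of the CSE step mentioned above, the following sketch applies SymPy's `cse` function to two small expressions that share a subexpression. The expressions are illustrative assumptions and are not taken from the Hamiltonian.

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)

# Two illustrative expressions that share the subexpression x**2 + y**2
exprs = [sp.sqrt(x**2 + y**2) + x, x / sp.sqrt(x**2 + y**2)]

# sp.cse returns a list of (temporary, subexpression) pairs
#   and a list of expressions rewritten in terms of the temporaries
replacements, reduced = sp.cse(exprs)

# Substituting the temporaries back, last-defined first, recovers the originals
recovered = [r.subs(list(reversed(replacements))) for r in reduced]
```

Each shared subexpression is then computed once and reused, which is what makes the generated derivative code cheaper to evaluate.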

<a id='toc'></a>

# Table of Contents
$$\label{toc}$$

This notebook is organized as follows:

1. [Step 1:](#initializenrpy) Initialize core Python/NRPy+ modules
1. [Step 2:](#read_expressions) Read in Hamiltonian expressions from `Sympy_Hreal_on_Bottom.txt`
1. [Step 3:](#list_constants) Specify constants and variables in Hamiltonian expression
1. [Step 4:](#list_free_symbols) Extract free symbols
1. [Step 5:](#convert_to_func) Convert variables to function notation; e.g., `var` goes to `var(xx)`
1. [Step 6:](#differentiate) Differentiate with respect to `xx`
1. [Step 7:](#remove_zeros) Remove derivatives (of constants) that evaluate to zero, simplifying derivative expressions
1. [Step 8:](#partial_derivative) Simplify derivatives with respect to a specific variable
1. [Step 9:](#store_results) Store partial derivatives in SymPy syntax to `partial_derivatives.txt-VALIDATION`
1. [Step 10:](#code_validation) Validate against LALSuite and trusted `SEOBNR_Derivative_Routine` NRPy+ module
1. [Step 11:](#latex_pdf_output) Output this notebook to $\LaTeX$-formatted PDF file

<a id='initializenrpy'></a>

# Step 1: Initialize core Python/NRPy+ modules \[Back to [top](#toc)\]
$$\label{initializenrpy}$$

Let's start by importing all the needed modules from Python/NRPy+ and setting the name of the directory that contains the input file:

In [1]:
# Step 1.a: import all needed modules from Python/NRPy+:
import sympy as sp                # SymPy: The Python computer algebra package upon which NRPy+ depends
import sys, os     # Standard Python modules for multiplatform OS-level functions
sys.path.append('../')

from outputC import superfast_uniq, lhrh      # Remove duplicate entries from a Python array; store left- and right-
                                              #   hand sides of mathematical expressions

# As of April 2021, "sp.sympify("Q+1")" fails because Q is a reserved keyword.
#   This is the workaround, courtesy Ken Sible.
custom_global_dict = {}
exec('from sympy import *', custom_global_dict)
del custom_global_dict['Q']

# Step 1.b: Check for a sufficiently new version of SymPy (for validation)
# Ignore the rc's and b's for release candidates & betas.
sympy_version = sp.__version__.replace('rc', '...').replace('b', '...')
sympy_version_decimal = float(int(sympy_version.split(".")[0]) + int(sympy_version.split(".")[1])/10.0)
if sympy_version_decimal > 1.2:
    custom_parse_expr = lambda expr: sp.parse_expr(expr, global_dict=custom_global_dict)
else:
    custom_parse_expr = lambda expr: sp.sympify(expr)

if sympy_version_decimal < 1.2:
    print('Error: NRPy+ does not support SymPy < 1.2')
    sys.exit(1)

# Step 1.c: Name of the directory containing the input file
inputdir = "Hamiltonian"

<a id='read_expressions'></a>

# Step 2: Read in Hamiltonian expressions from `Sympy_Hreal_on_Bottom.txt` \[Back to [top](#toc)\]
$$\label{read_expressions}$$

We read the expressions to be differentiated from a file, one line at a time, and split each line at "=".  Doing so allows us to manipulate the left- and right-hand sides of the expressions separately.  We store them in the array `lr`, whose entries are `lhrh` objects with left-hand side `lhs` and right-hand side `rhs`.  Note that `Lambda` is the name of a SymPy class (and `lambda` is a reserved keyword in Python), so the variable $\Lambda$ in the Hamiltonian is renamed `Lamb`.
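
The splitting logic can be sketched in isolation as follows. This is a minimal sketch: `LhRh` is a hypothetical stand-in for the `lhrh` container imported from `outputC`, the sample lines are made up for illustration, and (unlike the notebook cell) the trailing newline is stripped from each right-hand side.

```python
from collections import namedtuple

# Hypothetical stand-in for the lhrh container imported from outputC
LhRh = namedtuple("LhRh", ["lhs", "rhs"])

# Illustrative input lines (not taken from the Hamiltonian file)
sample_lines = [
    "# comment lines and blank lines are skipped\n",
    "\n",
    "u = 1/r\n",
    "Lambda = sp.sqrt(u*u + a*a)\n",
]

pairs = []
for line in sample_lines:
    # Ignore lines with 2 or fewer characters and those starting with #
    if len(line) > 2 and not line.startswith("#"):
        # Split at the first equals sign only
        lhs, rhs = line.split("=", 1)
        pairs.append(LhRh(lhs=lhs.replace(" ", "").replace("Lambda", "Lamb"),
                          rhs=rhs.strip().replace(" ", "").replace("sp.", "").replace("Lambda", "Lamb")))
```

After this loop, `pairs` holds one entry per expression, with `Lambda` renamed to `Lamb` and the `sp.` prefixes removed.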

In [2]:
# Step 2.a: Read in expressions as a list of lines
with open(os.path.join(inputdir,'Sympy_Hreal_on_Bottom.txt'), 'r') as file:
    expressions_as_lines = file.readlines()

#print(expressions_as_lines)
# Step 2.b: Create and populate the "lr" array, which separates each line into left- and right-hand sides
#   Each entry is a string of the form lhrh(lhs='',rhs='')
lr = []

for i in range(len(expressions_as_lines)):
    # Ignore lines with 2 or fewer characters and those starting with #
    if len(expressions_as_lines[i]) > 2 and expressions_as_lines[i][0] != "#":
        # Split each line by its equals sign
        split_line = expressions_as_lines[i].split("=")
        #print(split_line)
        # Append the line to "lr", removing spaces, "sp." prefixes, and replacing Lambda->Lamb
        #   (Lambda would otherwise be parsed as SymPy's Lambda class):
        lr.append(lhrh(lhs=split_line[0].replace(" ","").replace("Lambda","Lamb"),
                       rhs=split_line[1].replace(" ","").replace("sp.","").replace("Lambda","Lamb")))

# Step 2.c: Separate and sympify right- and left-hand sides into separate arrays
lhss = []
rhss = []
for i in range(len(lr)):
    #print(lr[i].rhs)
    lhss.append(custom_parse_expr(lr[i].lhs))
    rhss.append(custom_parse_expr(lr[i].rhs))

<a id='list_constants'></a>

# Step 3: Specify constants and variables in Hamiltonian expression \[Back to [top](#toc)\]
$$\label{list_constants}$$

We declare the constants as SymPy symbols; these are never treated as functions of the differentiation variable, so their derivatives are zero.  We then declare as SymPy symbols the dynamic variables with respect to which we want to take derivatives.

In [3]:
# Step 3.a: Create `input_constants` array and populate with SymPy symbols
m1,m2,tortoise,eta,KK,k0,k1,EMgamma,d1v2,dheffSSv2 = sp.symbols('m1 m2 tortoise eta KK k0 k1 EMgamma d1v2 dheffSSv2',
                                                                real=True)
input_constants = [m1,m2,tortoise,eta,KK,k0,k1,EMgamma,d1v2,dheffSSv2]

# Step 3.b: Create `dynamic_variables` array and populate with SymPy symbols
x,y,z,p1,p2,p3,S1x,S1y,S1z,S2x,S2y,S2z = sp.symbols('x y z p1 p2 p3 S1x S1y S1z S2x S2y S2z', real=True)
dynamic_variables = [x,y,z,p1,p2,p3,S1x,S1y,S1z,S2x,S2y,S2z]

<a id='list_free_symbols'></a>

# Step 4: Extract free symbols \[Back to [top](#toc)\]
$$\label{list_free_symbols}$$

By "free symbols" we mean the variables in the right-hand sides.  We first create a list of all such terms (using SymPy's built-in `free_symbols` attribute), including duplicates, and then strip the duplicates.  We then remove input constants from the symbol list.
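
The three operations described above can be sketched on a toy example. The symbols and right-hand sides here are illustrative assumptions, and `dict.fromkeys` is used as a plain-Python stand-in for the `superfast_uniq` helper from `outputC`.

```python
import sympy as sp

a, b, m1 = sp.symbols("a b m1", real=True)
rhs_list = [a + m1, a*b]   # illustrative right-hand sides
constants = [m1]           # illustrative input constant

# Collect free symbols from every right-hand side, duplicates included
symbols_with_dups = [s for rhs in rhs_list for s in rhs.free_symbols]

# dict.fromkeys removes duplicates while preserving first-seen order
unique_symbols = list(dict.fromkeys(symbols_with_dups))

# Drop the input constants, comparing by name
constant_names = {str(c) for c in constants}
non_constant = [s for s in unique_symbols if str(s) not in constant_names]
```

Here `a` appears in both right-hand sides but survives only once, and `m1` is dropped because it is a constant.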

In [4]:
# Step 4.a: Prepare array of "free symbols" in the right-hand side expressions
full_symbol_list_with_dups = []
for i in range(len(lr)):
    for variable in rhss[i].free_symbols:
        full_symbol_list_with_dups.append(variable)

# Step 4.b: Remove duplicate free symbols
full_symbol_list = superfast_uniq(full_symbol_list_with_dups)

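# Step 4.c: Remove input constants from symbol list
#   (build a new list rather than removing entries from a list while iterating over it)
input_constant_names = [str(inputconst) for inputconst in input_constants]
full_symbol_list = [symbol for symbol in full_symbol_list if str(symbol) not in input_constant_names]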

<a id='convert_to_func'></a>

# Step 5: Convert variables to function notation; e.g., `var` goes to `var(xx)` \[Back to [top](#toc)\]
$$\label{convert_to_func}$$

In order to compute the partial derivative of each right-hand side, we mark each variable (left-hand side) and each free symbol (in right-hand sides) as a function with argument $\texttt{xx}$.
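
As a small sketch of this conversion, the snippet below replaces each symbol in an illustrative expression by an undefined function of `xx` with the same name. The expression `u**2/r` is an assumption chosen only for demonstration.

```python
import sympy as sp

xx = sp.Symbol("xx", real=True)
r, u = sp.symbols("r u", real=True)
rhs = u**2 / r   # illustrative right-hand side

# Replace each symbol by a function of xx carrying the same name
for symb in (r, u):
    rhs = rhs.subs(symb, sp.Function(str(symb), real=True)(xx))
```

After the loop the only free symbol left in `rhs` is `xx`, so differentiating with respect to `xx` applies the chain rule to every variable at once.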

In [5]:
# Step 5.a: Convert each left-hand side to function notation
#   while separating and simplifying left- and right-hand sides
xx = sp.Symbol('xx',real=True)
func = []
for i in range(len(lr)):
    func.append(sp.sympify(sp.Function(lr[i].lhs,real=True)(xx)))

# Step 5.b: Mark each free variable as a function with argument xx
#   (named symb_func so that the list "func" from Step 5.a is not overwritten)
full_function_list = []
for symb in full_symbol_list:
    symb_func = sp.sympify(sp.Function(str(symb),real=True)(xx))
    full_function_list.append(symb_func)
    for i in range(len(rhss)):
        for var in rhss[i].free_symbols:
            if str(var) == str(symb):
                rhss[i] = rhss[i].subs(var,symb_func)

<a id='differentiate'></a>

# Step 6: Differentiate with respect to `xx` \[Back to [top](#toc)\]
$$\label{differentiate}$$

Now we differentiate the right-hand expressions with respect to `xx` using the SymPy $\texttt{diff}$ command.  Afterwards, we strip $\texttt{(xx)}$ and "Derivative" (which is output by $\texttt{diff}$) from the result and use the suffix "prm" to denote the derivative with respect to $\texttt{xx}$; for example, $\texttt{Derivative(u(xx), xx)}$ becomes $\texttt{(uprm)}$.
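
The string manipulation can be traced on a toy expression. This is a sketch under stated assumptions: the product `u**2 * r` is illustrative, and plain `sp.sympify` is used in place of the notebook's `custom_parse_expr`.

```python
import sympy as sp

xx = sp.Symbol("xx", real=True)
r = sp.Function("r", real=True)(xx)
u = sp.Function("u", real=True)(xx)

# Generic derivative with respect to xx (product and chain rules)
deriv = sp.diff(u**2 * r, xx)

# str(deriv) contains terms such as Derivative(u(xx), xx);
#   the three replacements turn each of them into (uprm)
raw = str(deriv)
renamed = raw.replace("(xx)", "").replace(", xx", "prm").replace("Derivative", "")

# Parse the cleaned string back into a SymPy expression of plain symbols
parsed = sp.sympify(renamed)
```

The parsed result is an ordinary polynomial in `r`, `u`, `rprm`, and `uprm`, with no remaining reference to `xx`.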

In [6]:
# Step 6: Use SymPy's diff function to differentiate right-hand sides with respect to xx
#   and append "prm" notation to left-hand sides
lhss_deriv = []
rhss_deriv = []
for i in range(len(rhss)):
    lhss_deriv.append(custom_parse_expr(str(lhss[i])+"prm"))
    newrhs = custom_parse_expr(str(sp.diff(rhss[i],xx)).replace("(xx)","").replace(", xx","prm").replace("Derivative",""))
    rhss_deriv.append(newrhs)

<a id='remove_zeros'></a>

# Step 7: Remove derivatives (of constants) that evaluate to zero, simplifying derivative expressions \[Back to [top](#toc)\]
$$\label{remove_zeros}$$

We declare a function to simplify the derivative expressions.  In particular, we want to remove terms equal to zero.
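
The idea behind this simplification can be sketched on three made-up derivative expressions: a zero right-hand side is substituted into later expressions, which may in turn become zero, and all zero entries are then dropped. The symbols below are illustrative assumptions.

```python
import sympy as sp

aprm, bprm, cprm, x = sp.symbols("aprm bprm cprm x")
lhs = [aprm, bprm, cprm]
rhs = [sp.Integer(0), 2*aprm, x*bprm + 1]   # illustrative derivative expressions

# Propagate zeros forward: if an expression is zero, substitute 0 for its
#   left-hand side in every later expression
for i in range(len(rhs)):
    if rhs[i] == 0:
        for j in range(i + 1, len(rhs)):
            rhs[j] = rhs[j].subs(lhs[i], 0)

# Keep only the expressions that are still nonzero
kept = [(l, r) for l, r in zip(lhs, rhs) if r != 0]
```

In this example `aprm` is zero, which makes `bprm` zero as well, so only `cprm = 1` remains.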

In [None]:
# Step 7.a: Define derivative simplification function
def simplify_deriv(lhss_deriv,rhss_deriv):
    # Copy expressions into another array
    lhss_deriv_simp = []
    rhss_deriv_simp = []
    for i in range(len(rhss_deriv)):
        lhss_deriv_simp.append(lhss_deriv[i])
        rhss_deriv_simp.append(rhss_deriv[i])
    # If a right-hand side is 0, substitute value 0 for the corresponding left-hand side in later terms
    for i in range(len(rhss_deriv_simp)):
        if rhss_deriv_simp[i] == 0:
            for j in range(i+1,len(rhss_deriv_simp)):
                for var in rhss_deriv_simp[j].free_symbols:
                    if str(var) == str(lhss_deriv_simp[i]):
                        rhss_deriv_simp[j] = rhss_deriv_simp[j].subs(var,0)
    zero_elements_to_remove = []
    # Create array of indices for expressions that are zero
    for i in range(len(rhss_deriv_simp)):
        if rhss_deriv_simp[i] == sp.sympify(0):
            zero_elements_to_remove.append(i)

    # When removing terms that are zero, we need to take into account their new index (after each removal)
    count = 0
    for i in range(len(zero_elements_to_remove)):
        del lhss_deriv_simp[zero_elements_to_remove[i]+count]
        del rhss_deriv_simp[zero_elements_to_remove[i]+count]
        count -= 1
    return lhss_deriv_simp,rhss_deriv_simp

# Step 7.b: Call the simplification function and then copy results
lhss_deriv_simp,rhss_deriv_simp = simplify_deriv(lhss_deriv,rhss_deriv)
lhss_deriv = lhss_deriv_simp
rhss_deriv = rhss_deriv_simp

<a id='partial_derivative'></a>

# Step 8: Simplify derivatives with respect to a specific variable \[Back to [top](#toc)\]
$$\label{partial_derivative}$$

In [Step 6](#differentiate) we took a generic derivative of each expression, assuming all variables were functions of `xx`.  We now define a function that will select a specific dynamic variable (element of `dynamic_variables`) and set the derivative of the variable to 1 and all others to 0.
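
To make the selection concrete, the sketch below starts from an illustrative generic derivative and picks out the partial derivative with respect to each variable in turn. The helper `select` is hypothetical and only mirrors the substitution step of the routine.

```python
import sympy as sp

r, u, rprm, uprm = sp.symbols("r u rprm uprm")

# Illustrative generic derivative of u**2 * r with respect to xx
generic = u**2*rprm + 2*r*u*uprm

def select(expr, prm_names, index):
    """Set the derivative named prm_names[index] to 1 and all others to 0."""
    for k, name in enumerate(prm_names):
        expr = expr.subs(sp.Symbol(name), 1 if k == index else 0)
    return expr

d_dr = select(generic, ["rprm", "uprm"], 0)   # partial derivative with respect to r
d_du = select(generic, ["rprm", "uprm"], 1)   # partial derivative with respect to u
```

Setting `rprm` to 1 and `uprm` to 0 leaves `u**2`, which is the partial derivative of `u**2 * r` with respect to `r`; the roles swap for `u`.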

In [None]:
# Step 8.a: Define onevar derivative function
def deriv_onevar(lhss_deriv,rhss_deriv,variable_list,index):
    # Denote each variable with prm
    variableprm_list = []
    for variable in variable_list:
        variableprm_list.append(str(variable)+"prm")

    # Copy expressions into another array
    lhss_deriv_new = []
    rhss_deriv_new = []
    for i in range(len(rhss_deriv)):
        lhss_deriv_new.append(lhss_deriv[i])
        rhss_deriv_new.append(rhss_deriv[i])
    # For each free symbol's derivative, replace it with:
    #   1, if we are differentiating with respect to the variable, or
    #   0, if we are not differentiating with respect to that variable
    for i in range(len(rhss_deriv_new)):
        for var in variableprm_list:
            if variableprm_list.index(str(var))==index:
                rhss_deriv_new[i] = rhss_deriv_new[i].subs(var,1)
            else:
                rhss_deriv_new[i] = rhss_deriv_new[i].subs(var,0)
    # Simplify derivative expressions again
    lhss_deriv_simp,rhss_deriv_simp = simplify_deriv(lhss_deriv_new,rhss_deriv_new)
    return lhss_deriv_simp,rhss_deriv_simp

# Step 8.b: Call the derivative function and populate dictionaries with the result
lhss_derivative = {}
rhss_derivative = {}
for index in range(len(dynamic_variables)):
    lhss_temp,rhss_temp = deriv_onevar(lhss_deriv,rhss_deriv,dynamic_variables,index)
    lhss_derivative[dynamic_variables[index]] = lhss_temp
    rhss_derivative[dynamic_variables[index]] = rhss_temp

<a id='store_results'></a>

# Step 9: Store partial derivatives in SymPy syntax to `partial_derivatives.txt-VALIDATION` \[Back to [top](#toc)\]
$$\label{store_results}$$

We write the resulting derivatives in SymPy syntax.  The original expressions and all partial derivatives are written to a single file, in a format similar to that of the input expressions.
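
The conversion to SymPy syntax amounts to prefixing function names with `sp.` and tagging each `prm` suffix with the differentiation variable. The sketch below shows one way to do this; the helper `to_sp_syntax` and the sample string are hypothetical, and the word-boundary regular expression for `pi` is a variation on the plain string replacement used in the notebook cell, chosen so that names that merely contain "pi" are left alone.

```python
import re

def to_sp_syntax(rhs, var):
    """Prefix SymPy function names with 'sp.' and tag 'prm' with the variable name."""
    for name in ("sqrt(", "log(", "sign(", "Abs(", "Rational("):
        rhs = rhs.replace(name, "sp." + name)
    # Only replace pi when it stands alone as a name
    rhs = re.sub(r"\bpi\b", "sp.pi", rhs)
    return rhs.replace("prm", "prm_" + var)

# Illustrative right-hand side (not taken from the Hamiltonian)
example = "rprm/sqrt(r) + pi*spinprm"
converted = to_sp_syntax(example, "x")
```

With this input, `converted` reads `rprm_x/sp.sqrt(r) + sp.pi*spinprm_x`; the "pi" inside `spinprm` is untouched.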

In [None]:
# Step 9: Output original expression and each partial derivative expression in SymPy syntax
with open(os.path.join(inputdir,'partial_derivatives.txt-VALIDATION'), 'w') as output:
    for i in range(len(lr)):
        right_side = lr[i].rhs
        right_side_in_sp = right_side.replace("sqrt(","sp.sqrt(").replace("log(","sp.log(").replace("pi",
                                                "sp.pi").replace("sign(","sp.sign(").replace("Abs(",
                                                "sp.Abs(").replace("Rational(","sp.Rational(")
        output.write(str(lr[i].lhs)+" = "+right_side_in_sp)
    for var in dynamic_variables:
        for i in range(len(lhss_derivative[var])):
            right_side = str(rhss_derivative[var][i])
            right_side_in_sp = right_side.replace("sqrt(","sp.sqrt(").replace("log(","sp.log(").replace("pi",
                                                "sp.pi").replace("sign(","sp.sign(").replace("Abs(",
                                                "sp.Abs(").replace("Rational(","sp.Rational(").replace("prm",
                                                "prm_"+str(var))
            output.write(str(lhss_derivative[var][i]).replace("prm","prm_"+str(var))+" = "+right_side_in_sp+"\n")

<a id='code_validation'></a>

# Step 10: Validate against LALSuite and trusted `SEOBNR_Derivative_Routine` NRPy+ module \[Back to [top](#toc)\]
$$\label{code_validation}$$

We validate the output of this notebook against known LALSuite values of the Hamiltonian partial derivatives and the output of the `SEOBNR_Derivative_Routine` NRPy+ module.  We note that due to cancellations in the derivative terms, various versions of SymPy may result in relative errors that differ by as much as an order of magnitude.  Furthermore, even changing the set of input parameters can affect the relative error by as many as two orders of magnitude.  Therefore we look for agreement with LALSuite to at least 10 significant digits.

When comparing the notebook output to that of the NRPy+ module, we compare term-by-term using SymPy to check that each right-hand side is equivalent.
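
Both kinds of check can be sketched briefly. The expressions and numbers below are illustrative assumptions, and `relative_error` is a hypothetical helper that mirrors the relative-error formula used in the validation cells.

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)

# Symbolic check: two differently written expressions are equivalent
#   if their difference simplifies to zero
expr_new = (x + y)**2
expr_trusted = x**2 + 2*x*y + y**2
difference = sp.simplify(expr_new - expr_trusted)

# Numerical check: relative error of a computed value against a reference value
def relative_error(value, reference):
    return abs((value - reference) / reference)

err = relative_error(1.0000000001, 1.0)
```

A `difference` of zero means the two expressions agree symbolically, while a small `err` means two numerical values agree to many significant digits.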

In [None]:
# Numerically evaluate right-hand sides using input values
def evaluate_expression(left_sides,right_sides,input_values):
    new_right_sides = []
    for i in range(len(right_sides)):
        term = custom_parse_expr(str(right_sides[i]).replace("(xx)",""))
        # Only look for the free variables in each expression to reduce computation time
        free_vars = term.free_symbols
        for variable in free_vars:
            term = term.subs(variable, input_values[str(variable)])
        # Evaluate each term to reduce computation time
        new_right_sides.append(sp.sympify(term.evalf()))
        # Store each subexpression in values numerically
        input_values[str(left_sides[i])] = new_right_sides[i]
    # Return the input values dictionary with all numerical right-hand sides added
    return input_values




In [None]:
# Hard validation against LALSuite integration output
import numpy as np

Derivativedir = 'Derivatives'
var_index = ['x', 'y', 'z', 'p1', 'p2', 'p3', 'S1x', 'S1y', 'S1z', 'S2x', 'S2y', 'S2z']

for index in range(1):
    validationfile = 'outputv4Pindex'+str(index)+'.txt'
    validationfile_pert = 'outputv4Pindex'+str(index)+'pert.txt'
    with open(os.path.join(Derivativedir,validationfile)) as file1, open(os.path.join(Derivativedir,validationfile_pert)) as file2:
        file1_lines = file1.readlines()
        file2_lines = file2.readlines()
        for i in range(len(file2_lines)//3):
            valuepairs = file1_lines[3*i].strip('{}\n').split(',')
            valuepairspert = file2_lines[3*i].strip('{}\n').split(',')
            derivpairs = file1_lines[3*i+1].strip('{}\n').split(',')
            derivpairspert = file2_lines[3*i+1].strip('{}\n').split(',')
            values = dict()
            derivs = dict()
            valuespert = dict()
            derivspert = dict()
            for j in range(len(valuepairs)):
                pair = valuepairs[j].split(': ')
                pairpert = valuepairspert[j].split(': ')
                values[pair[0].strip(" ''")] = float(pair[1])
                valuespert[pairpert[0].strip(" ''")] = float(pairpert[1])
            for j in range(len(derivpairs)):
                pair = derivpairs[j].split(': ')
                pairpert = derivpairspert[j].split(': ')
                derivs[pair[0].strip(" ''")] = float(pair[1])
                derivspert[pairpert[0].strip(" ''")] = float(pairpert[1]) 
            values = evaluate_expression(lhss,rhss,values)
            eta = values['eta']
            #Hreal = values['Hreal']
            #LAL_Hreal = derivs['Hreal']
            #LAL_Hreal_pert = derivspert['Hreal']
            #Erel = abs((Hreal-LAL_Hreal)/LAL_Hreal)
            #Erelpert = abs((LAL_Hreal-LAL_Hreal_pert)/LAL_Hreal)
            #if Erel > Erelpert:
            #    print("Line %d: Erel in Hamiltonian is %.2e, allowed is %.2e"%(i,Erel,Erelpert))
            for k in range(len(var_index)):
                Hrealprm = evaluate_expression(lhss_derivative[dynamic_variables[k]],rhss_derivative[dynamic_variables[k]],values)['Hrealprm']/eta
                LAL_Hrealprm = derivs['dHreal_d'+var_index[k]]
                LAL_Hreal_pertprm = derivspert['dHreal_d'+var_index[k]]
                Erel = abs((Hrealprm-LAL_Hrealprm)/LAL_Hrealprm)
                Erelpert = abs((LAL_Hrealprm-LAL_Hreal_pertprm)/LAL_Hrealprm)
                if Erel > 10*Erelpert:
                    print("Line %d: Erel in %s derivative is %.2e, allowed is %.2e"%(i,var_index[k],Erel,10*Erelpert))
            

In [None]:
print("Printing difference between notebook output and trusted NRPy+ module output...")
# Open the files to compare
file = 'partial_derivatives.txt'
outfile = 'partial_derivatives.txt-VALIDATION'
outputdir = 'Derivatives'
print("Checking file " + outfile)
with open(os.path.join(outputdir,file), "r") as file1, open(os.path.join(outputdir,outfile), "r") as file2:
    # Read the lines of each file
    file1_lines = file1.readlines()
    file2_lines = file2.readlines()
    # Compare right-hand sides of the expressions by computing the difference between them
    num_diffs = 0
    for i in range(len(file1_lines)):
        expr_new = custom_parse_expr(file1_lines[i].split("=")[1].replace("sp.",""))
        expr_validated = custom_parse_expr(file2_lines[i].split("=")[1].replace("sp.",""))
        difference = sp.simplify(expr_new - expr_validated)
        if difference != 0:
            num_diffs += 1
            print(difference)
    if num_diffs == 0:
        print("No difference. TEST PASSED!")
    else:
        print("ERROR: Disagreement found with the trusted file. See differences above.")
        sys.exit(1)

<a id='latex_pdf_output'></a>

# Step 11: Output this notebook to $\LaTeX$-formatted PDF file \[Back to [top](#toc)\]
$$\label{latex_pdf_output}$$

The following code cell converts this Jupyter notebook into a proper, clickable $\LaTeX$-formatted PDF file. After the cell is successfully run, the generated PDF may be found in the root NRPy+ tutorial directory, with filename
[Tutorial-SEOBNR_Derivative_Routine.pdf](Tutorial-SEOBNR_Derivative_Routine.pdf) (Note that clicking on this link may not work; you may need to open the PDF file through another means.)

In [None]:
import cmdline_helper as cmd      # NRPy+: Multi-platform Python command-line interface

cmd.output_Jupyter_notebook_to_LaTeXed_PDF("Tutorial-SEOBNR_Derivative_Routine")