# The Harder Way: C Code Generation, Custom Printers, and CSE [1 hour]

C is one of the most widely used low-level programming languages. Compiled C code can be optimized for execution speed on many different computers. Python itself is written in C, as are many of the vectorized operations in NumPy and the numerical algorithms in SciPy. It is often necessary to translate a complex mathematical expression into C for optimal execution speed and memory management. In this notebook you will learn how to automatically translate a complex SymPy expression into C, compile the code, and run the program.

We will continue examining the complex chemical kinetic reaction ordinary differential equation introduced in the previous lesson.

## Learning Objectives

After this lesson you will be able to:

- use a code printer class to convert a SymPy expression to compilable C code
- use an array compatible assignment to print valid C array code
- subclass the printer class and modify it to provide custom behavior
- use common subexpression elimination to simplify the generated code and speed up its execution

In [None]:
import sympy as sm

Enable mathematical printing in the Jupyter notebook.

In [None]:
sm.init_printing()

# Ordinary Differential Equations

The previously generated ordinary differential equations that describe chemical kinetic reactions are loaded below. These expressions describe the right hand side of this mathematical equation:

$$\frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}(t))$$

where the state vector $\mathbf{y}(t)$ is made up of 14 states, i.e. $\mathbf{y}(t) \in \mathbb{R}^{14}$.

Below the variable `rhs_of_odes` represents $\mathbf{f}(\mathbf{y}(t))$ and `states` represents $\mathbf{y}(t)$.

From now on we will simply write $\mathbf{y}$ instead of $\mathbf{y}(t)$ and treat the dependence on $t$ as implicit.

In [None]:
from scipy2017codegen.chem import load_large_ode

In [None]:
rhs_of_odes, states = load_large_ode()

## Exercise

Display the expressions (`rhs_of_odes` and `states`), inspect them, and find out their types and dimensions. What are some of the characteristics of the equations?

In [None]:
states

In [None]:
rhs_of_odes

In [None]:
type(rhs_of_odes)

In [None]:
rhs_of_odes.shape

In [None]:
type(states)

In [None]:
states.shape

The equations are nonlinear equations of the states. There are 14 equations and 14 states. The coefficients in the equations are various floating point numbers.

# Compute the Jacobian

As has been shown in the previous lesson the Jacobian of the right hand side of the differential equations is often very useful for computations, such as integration and optimization. With:

$$\frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y})$$

the Jacobian is defined as:

$$\mathbf{J}(\mathbf{y}) = \frac{\partial\mathbf{f}(\mathbf{y})}{\partial\mathbf{y}}$$

SymPy can easily compute the Jacobian of matrix objects with the `Matrix.jacobian()` method.
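
As a small illustration of the definition above, the sketch below applies `Matrix.jacobian()` to a hypothetical two-state system. The names `toy_states`, `toy_rhs`, and `toy_jac` are made up for this example and are not part of the chemical kinetics problem.

```python
import sympy as sm

# Hypothetical two-state system, used only for illustration.
y0, y1 = sm.symbols('y0 y1')
toy_states = sm.Matrix([y0, y1])
toy_rhs = sm.Matrix([-2*y0*y1, y0**2 - y1])

# Entry (i, j) of the Jacobian is d(toy_rhs[i]) / d(toy_states[j]).
toy_jac = toy_rhs.jacobian(toy_states)
print(toy_jac)
```

A system with $n$ equations and $n$ states gives an $n \times n$ Jacobian, so the result here is a 2 x 2 matrix.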

## Exercise

Look up the Jacobian in the SymPy documentation then compute the Jacobian and store the result in the variable `jac_of_odes`. Inspect the resulting Jacobian for dimensionality, type, and the symbolic form.

In [None]:
jac_of_odes = rhs_of_odes.jacobian(states)

In [None]:
type(jac_of_odes)

In [None]:
jac_of_odes.shape

In [None]:
jac_of_odes

# C Code Printing

The two expressions are large and will likely have to be executed many thousands of times to compute the desired numerical values, so we want them to execute as fast as possible. We can use SymPy to print these expressions as C code.

We will design a double precision C function that evaluates both $\mathbf{f}(\mathbf{y})$ and $\mathbf{J}(\mathbf{y})$ simultaneously given the values of the states $\mathbf{y}$. Below is a basic template for a C program that includes such a function. Our job is to populate the function with the SymPy expressions represented as C code.

```C
#include <math.h>
#include <stdio.h>

void evaluate_odes(const double state_vals[14], double rhs_result[14], double jac_result[196])
{
      // We need to fill in the code here using SymPy.
}

int main() {

    // initialize the state vector with some values
    double state_vals[14] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14};
    // create "empty" 1D arrays to hold the results of the computation
    double rhs_result[14];
    double jac_result[196];
    
    // call the function
    evaluate_odes(state_vals, rhs_result, jac_result);
    
    // print the computed values to the terminal
    int i;

    printf("The right hand side of the equations evaluates to:\n");
    for (i=0; i < 14; i++) {
        printf("%lf\n", rhs_result[i]);
    }

    printf("\nThe Jacobian evaluates to:\n");
    for (i=0; i < 196; i++) {
        printf("%lf\n", jac_result[i]);
    }

    return 0;
}

```

Instead of using the `ccode` convenience function, let's use the underlying code printer class to do the printing. This will allow us to modify the class for custom printing further down.

In [None]:
# SymPy 1.7 renamed the sympy.printing.ccode module to sympy.printing.c.
from sympy.printing.c import C99CodePrinter

All printing classes have to be instantiated and then the `.doprint()` method can be used to print SymPy expressions. Let's try to print the right hand side of the differential equations.

In [None]:
printer = C99CodePrinter()

In [None]:
print(printer.doprint(rhs_of_odes))

In this case the C code printer does not do what we desire. It does not support printing a SymPy Matrix (see the first line of the output). In C, one possible representation of a matrix is an array type. The array type in C stores contiguous values, e.g. doubles, in a chunk of memory. You can declare an array of doubles in C like:

```C
double my_array[10];
```

The word `double` is the data type of the individual values in the array which must all be the same. The word `my_array` is the variable name we choose to name the array and the `[10]` is the syntax to declare that this array will have 10 values.

The array is "empty" when first declared and can be filled with values like so:

```C
my_array[0] = 5;
my_array[1] = 6.78;
my_array[2] = my_array[0] * 12;
```

or all at once with an initializer list, which C only allows as part of the declaration:

```C
double my_array[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
```

It is possible to declare multidimensional arrays in C that could map more directly to the indices of our two dimensional matrix, but in this case we will map our two dimensional matrix to a one dimensional array using C contiguous row ordering.
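
To make the row ordering concrete, here is a minimal sketch of the index arithmetic. The helper name `flat_index` is hypothetical and only serves this illustration.

```python
def flat_index(row, col, n_cols):
    """Return the position of entry (row, col) in a row-major 1D array."""
    return row * n_cols + col

# For a 14 x 14 matrix the first row occupies positions 0 through 13,
# the second row starts at position 14, and the last entry is at 195.
print(flat_index(0, 0, 14))    # 0
print(flat_index(1, 0, 14))    # 14
print(flat_index(13, 13, 14))  # 195
```

This is why the 14 x 14 Jacobian fits in the `jac_result[196]` array of the template.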

The code printers are capable of dealing with this need through the `assign_to` keyword argument in the `.doprint()` method but we must define a SymPy object that is appropriate to be assigned to. In our case, since we want to assign a Matrix we need to use an appropriately sized Matrix symbol.

In [None]:
rhs_result = sm.MatrixSymbol('rhs_result', 14, 1)

In [None]:
print(rhs_result)

In [None]:
print(printer.doprint(rhs_of_odes, assign_to=rhs_result))

Notice that we have proper array value assignment and valid lines of C code that can be used in our function.
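
The same mechanism can be tried on a matrix small enough to read at a glance. The sketch below assumes SymPy 1.7 or later (where the C printers live in `sympy.printing.c`); `toy_matrix`, `toy_result`, and `toy_code` are hypothetical names for this illustration.

```python
import sympy as sm
from sympy.printing.c import C99CodePrinter  # assumes SymPy >= 1.7

a, b = sm.symbols('a b')
toy_matrix = sm.Matrix([[a, b], [a*b, a + b]])

# The MatrixSymbol only supplies a name and a shape for the C array.
toy_result = sm.MatrixSymbol('toy_result', 2, 2)

toy_code = C99CodePrinter().doprint(toy_matrix, assign_to=toy_result)
print(toy_code)
```

The printer emits one assignment per matrix entry, and the 2 x 2 matrix is addressed with the flat indices 0 through 3.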

## Exercise

Print out valid C code from the Jacobian.

In [None]:
jac_result = sm.MatrixSymbol('jac_result', 14, 14)

In [None]:
print(jac_result)

In [None]:
print(printer.doprint(jac_of_odes, assign_to=jac_result))

# Changing the Behavior of the Printer

The SymPy code printers are relatively easy to extend. They are designed such that if you want to change how a particular SymPy object prints, for example a `Symbol`, then you only need to modify the `_print_Symbol` method.

In [None]:
class MyCodePrinter(C99CodePrinter):
    def _print_Symbol(self, expr):
        return "No matter what symbol you pass in I will always print:\n\nNi!"

In [None]:
my_printer = MyCodePrinter()

In [None]:
print(my_printer.doprint(sm.Symbol('theta')))

## Exercise

It turns out that in C calling `pow()` for low exponents executes slower than simply expanding the multiplication. For example, `pow(x, 2)` should be printed as `x*x`. Modify the `C99CodePrinter._print_Pow` method to expand the multiplication if the exponent is a positive integer less than or equal to 4. You may want to have a look at the source code with `printer._print_Pow??`.

In [None]:
#printer._print_Pow??

In [None]:
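from sympy.printing.precedence import precedence

class MyCodePrinter(C99CodePrinter):
    def _print_Pow(self, expr):
        # Only expand small positive integer powers; everything else uses pow().
        if expr.exp.is_Integer and 0 < expr.exp <= 4:
            # Parenthesize the base so that e.g. (x + 1)**2 prints correctly.
            base = self.parenthesize(expr.base, precedence(expr))
            return '*'.join([base] * int(expr.exp))
        else:
            return super()._print_Pow(expr)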

In [None]:
my_printer = MyCodePrinter()

In [None]:
x = sm.Symbol('x')
my_printer.doprint(x)

In [None]:
my_printer.doprint(x**2)

In [None]:
my_printer.doprint(x**4)

In [None]:
my_printer.doprint(x**5)

## Exercise

One issue with our current code printer is that the expressions use the symbols `y0, y1, ..., y13` instead of accessing the values directly from the C array with `state_vals[0], state_vals[1], ..., state_vals[13]`. We could go back and replace our SymPy symbols with array elements, but another way is to override the `_print_Symbol()` method to print these symbols as we desire. Modify the code printer so that it prints the proper array access in the expressions.

In [None]:
state_vals = sm.MatrixSymbol('state_vals', 14, 1)

In [None]:
print(state_vals[0])

In [None]:
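class MyCodePrinter(C99CodePrinter):
    def _print_Symbol(self, expr):
        if expr in states:
            # Print a state symbol as the matching element of the C array.
            idx = list(states).index(expr)
            return self._print(state_vals[idx])
        # Any other symbol falls back to the default behavior.
        return super()._print_Symbol(expr)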

In [None]:
my_printer = MyCodePrinter()
print(my_printer.doprint(rhs_of_odes, assign_to=rhs_result))

Or as mentioned you can replace the symbols up front. Notice that the C printer assumes that a 2D matrix will get mapped to a 1D C array.

In [None]:
state_array_map = dict(zip(states, state_vals))
print(state_array_map)

In [None]:
print(printer.doprint(rhs_of_odes.xreplace(state_array_map), assign_to=rhs_result))

# Common Subexpression Elimination

If you look carefully at the expressions in the two matrices you'll see repeated expressions. These are not ideal because the computer has to repeat the exact same calculation multiple times, which can be a major cost for large expressions. Compilers such as gcc can often eliminate common subexpressions on their own when optimization flags are enabled, but for complex expressions the algorithms in some compilers do not do a thorough job, or compilation can take an extremely long time. SymPy has tools to perform common subexpression elimination (CSE) that are both thorough and reasonably efficient. In particular, if gcc is run with the lowest optimization setting, `-O0`, CSE can give large speedups.

For example if you have two expressions:

```python
a = x*y + 5
b = x*y + 6
```

you can convert this to these three expressions:

```python
z = x*y
a = z + 5
b = z + 6
```

and `x*y` only has to be computed once.

The `cse()` function in SymPy returns the subexpression, `z = x*y`, and the simplified expressions: `a = z + 5`, `b = z + 6`.
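
The toy example above can be reproduced directly. In this sketch the names `toy_sub_exprs` and `toy_simplified` are hypothetical, and note that SymPy names the generated symbol `x0` by default rather than `z`.

```python
import sympy as sm

x, y = sm.symbols('x y')

# cse() takes one or more expressions and returns a pair:
# the (symbol, sub-expression) replacements and the rewritten expressions.
toy_sub_exprs, toy_simplified = sm.cse([x*y + 5, x*y + 6])

print(toy_sub_exprs)   # [(x0, x*y)]
print(toy_simplified)  # [x0 + 5, x0 + 6]
```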

Here is how it works:

In [None]:
sub_exprs, simplified_rhs = sm.cse(rhs_of_odes)

In [None]:
for var, expr in sub_exprs:
    sm.pprint(sm.Eq(var, expr))

`cse()` can return a number of simplified expressions and to do this it returns a list. In our case we have 1 simplified expression that can be accessed as the first item of the list.

In [None]:
simplified_rhs[0]

You can find common subexpressions among multiple objects also:

In [None]:
sub_exprs, simplified_exprs = sm.cse((rhs_of_odes, jac_of_odes))

In [None]:
for var, expr in sub_exprs:
    sm.pprint(sm.Eq(var, expr))

In [None]:
simplified_exprs[0]

In [None]:
simplified_exprs[1]
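
One way to gain confidence in the result is to substitute the sub-expressions back in and compare with the originals. The sketch below does this for a hypothetical two-variable system; `undo_cse` and the `toy_*` names are made up for this illustration and are not SymPy functions.

```python
import sympy as sm

def undo_cse(sub_exprs, simplified_exprs):
    """Substitute the sub-expressions back in, last one first."""
    restored = list(simplified_exprs)
    # Later sub-expressions may depend on earlier ones, so work backwards.
    for var, sub_expr in reversed(sub_exprs):
        restored = [e.subs(var, sub_expr) for e in restored]
    return restored

x, y = sm.symbols('x y')
toy_rhs = sm.Matrix([sm.sin(x*y) + x*y, sm.cos(x*y)**2])
toy_jac = toy_rhs.jacobian(sm.Matrix([x, y]))

toy_subs, toy_simplified = sm.cse((toy_rhs, toy_jac))
restored_rhs, restored_jac = undo_cse(toy_subs, toy_simplified)

# Every entry of the differences should simplify to zero.
rhs_ok = all(sm.simplify(e) == 0 for e in (restored_rhs - toy_rhs))
jac_ok = all(sm.simplify(e) == 0 for e in (restored_jac - toy_jac))
print(rhs_ok, jac_ok)
```

For the full 14-state system, symbolic simplification may be slow, so comparing numerical values at a few state vectors is a reasonable alternative.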

## Exercise

Use common subexpression elimination to print out C code for your two arrays such that:

```C
double x0 = some_sub_expression;
...
double xN = the_last_sub_expression;

rhs_result[0] = expressions_containing_the_subexpressions;
...
rhs_result[13] = ...;

jac_result[0] = ...;
...
jac_result[195] = ...;
```

This code can be copied and pasted into the provided template above to make a C program.

You can add in cse fairly easily for printing a single matrix:

In [None]:
# Assignment is defined in sympy.codegen.ast in current SymPy releases.
from sympy.codegen.ast import Assignment

class CMatrixPrinter(C99CodePrinter):
    def _print_ImmutableDenseMatrix(self, expr):
        sub_exprs, simplified = sm.cse(expr)
        lines = []
        for var, sub_expr in sub_exprs:
            lines.append('double ' + self._print(Assignment(var, sub_expr)))
        M = sm.MatrixSymbol('M', *expr.shape)
        # Assign the simplified matrix so the sub-expressions above are actually used.
        return '\n'.join(lines) + '\n' + self._print(Assignment(M, simplified[0]))

In [None]:
p = CMatrixPrinter()
print(p.doprint(jac_of_odes))

In [None]:
class CMatrixPrinter(C99CodePrinter):
    
    def _print_list(self, list_of_exprs):
        # NOTE : The MutableDenseMatrix is turned into an ImmutableMatrix inside here.
        if all(isinstance(x, sm.ImmutableMatrix) for x in list_of_exprs):
            sub_exprs, simplified_exprs = sm.cse(list_of_exprs)
            lines = []
            for var, sub_expr in sub_exprs:
                ass = Assignment(var, sub_expr.xreplace(state_array_map))
                lines.append('double ' + self._print(ass))
            for mat in simplified_exprs:
                lines.append(self._print(mat.xreplace(state_array_map)))
            return '\n'.join(lines)
        else:
            return super()._print_list(list_of_exprs)
            
    def _print_ImmutableDenseMatrix(self, expr):
        if expr.shape[1] > 1:
            M = sm.MatrixSymbol('jac_result', *expr.shape)
        else:
            M = sm.MatrixSymbol('rhs_result', *expr.shape)
        return self._print(Assignment(M, expr))

In [None]:
p = CMatrixPrinter()
print(p.doprint([rhs_of_odes, jac_of_odes]))

# Bonus: Compile and Run the C Program

In [None]:
c_template = """\
#include <math.h>
#include <stdio.h>

void evaluate_odes(const double state_vals[14], double rhs_result[14], double jac_result[196])
{{
    // We need to fill in the code here using SymPy.
{code}
}}

int main() {{

    // initialize the state vector with some values
    double state_vals[14] = {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}};
    // create "empty" 1D arrays to hold the results of the computation
    double rhs_result[14];
    double jac_result[196];

    // call the function
    evaluate_odes(state_vals, rhs_result, jac_result);

    // print the computed values to the terminal
    int i;
    printf("The right hand side of the equations evaluates to:\\n");
    for (i=0; i < 14; i++) {{
        printf("%lf\\n", rhs_result[i]);
    }}
    printf("\\nThe Jacobian evaluates to:\\n");
    for (i=0; i < 196; i++) {{
        printf("%lf\\n", jac_result[i]);
    }}

    return 0;
}}\
"""

In [None]:
c_program = c_template.format(code=p.doprint([rhs_of_odes, jac_of_odes]))
print(c_program)

In [None]:
with open('run.c', 'w') as f:
    f.write(c_program)

To compile the code there are several options. The first is gcc, the C compiler of the GNU Compiler Collection. If you are on Linux, Mac, or Windows (with MinGW installed), you can use the Jupyter notebook `!` prefix to send a command to the terminal. For example:

```ipython
!gcc run.c -lm -o run
```

This will compile `run.c`, link against the C math library with `-lm`, and write the output, `-o`, to a file `run` (Mac/Linux) or `run.exe` (Windows).

On Mac and Linux the program can be executed with:

```ipython
!./run
```

and on Windows:

```ipython
!run.exe
```

Other options are the clang compiler or the Microsoft Visual C++ compiler `cl` on Windows, which does not need a separate math library flag:

```ipython
!clang run.c -lm -o run
!cl run.c
```
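
If you prefer to stay in Python rather than use `!` shell commands, the compile and run steps can be sketched with the standard library `subprocess` module. The helper `compile_and_run` is a hypothetical name for this illustration. It assumes a gcc-compatible compiler and simply returns `None` if that compiler is not found on the `PATH`.

```python
import os
import shutil
import subprocess
import tempfile

def compile_and_run(c_source, compiler='gcc'):
    """Compile C source text and return the program's standard output.

    Returns None if the compiler is not found on the PATH.
    """
    if shutil.which(compiler) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp_dir:
        src_path = os.path.join(tmp_dir, 'run.c')
        exe_name = 'run.exe' if os.name == 'nt' else 'run'
        exe_path = os.path.join(tmp_dir, exe_name)
        with open(src_path, 'w') as f:
            f.write(c_source)
        # Same flags as the gcc command above: link libm and name the output.
        subprocess.run([compiler, src_path, '-lm', '-o', exe_path], check=True)
        result = subprocess.run([exe_path], check=True,
                                capture_output=True, text=True)
    return result.stdout

# A tiny stand-in program; the generated c_program string could be passed instead.
toy_source = '#include <stdio.h>\nint main() { printf("%d\\n", 6 * 7); return 0; }\n'
output = compile_and_run(toy_source)
print(output)
```

Because `check=True` is set, a compilation failure raises `subprocess.CalledProcessError` instead of silently running a stale executable.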