# The Harder Way: Code generation and compilation [1 hour]

One of the most common low-level programming languages is C. Compiled C code can be optimized for execution speed on many different computers. Python itself is written in C, as are many of the vectorized operations in NumPy and the numerical algorithms in SciPy. It is often necessary to translate a complex mathematical expression into C for optimal execution speed and memory management. In this notebook you will learn how to automatically translate a complex SymPy expression into C, compile the code, and run the program.

We will continue examining the complex chemical kinetic reaction ordinary differential equation introduced in the previous lesson.

## Learning Objectives

After this lesson you will be able to:

- import and use a code printer class to convert a SymPy expression to compilable C code
- use an array-compatible assignment to print valid C array code
- subclass the printer class and modify it to provide custom behavior
- utilize common subexpression elimination to simplify the code and speed up its execution
- compile a C program and execute it (may not get to this)

In [None]:
import sympy as sp

Enable mathematical printing in the Jupyter notebook.

In [None]:
sp.init_printing()

# Ordinary Differential Equations

The right hand side of the ordinary differential equations that describe a chemical kinetic reaction is loaded below. These expressions describe this mathematical equation:

$$\frac{dy}{dt} = f(y)$$

where the state vector $y$ is made up of 14 states, i.e. $y \in \mathbb{R}^{14}$.

Below the variable `rhs_of_odes` represents $f(y)$ and `states` represents $y$.

In [None]:
from scipy2017codegen.chem import load_large_ode

In [None]:
rhs_of_odes, states = load_large_ode()

## Exercise

Display the expressions, inspect them, and find out their types and dimensions. What are some of the characteristics of the equations?

In [None]:
rhs_of_odes

In [None]:
states

In [None]:
type(rhs_of_odes)

In [None]:
rhs_of_odes.shape

In [None]:
type(states)

In [None]:
states.shape

The equations are nonlinear equations of the states. There are 14 equations and 14 states. The coefficients in the equations are various floating point numbers.

# Compute the Jacobian

As has been shown in the previous lesson the Jacobian of the right hand side of the differential equations is often very useful for computations, such as integration and optimization. With:

$$\frac{dy}{dt} = f(y)$$

the Jacobian is defined as:

$$J(y) = \frac{df(y)}{dy}$$

SymPy can easily compute the Jacobian of matrix objects with the `.jacobian` method.
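As a small illustration separate from the chemical kinetics model, the sketch below applies `.jacobian` to a made-up two-state system. The symbols `y0`, `y1` and the matrix `f` are assumptions chosen only for this example.

```python
import sympy as sp

# Hypothetical two-state system, not the 14-state model loaded above.
y0, y1 = sp.symbols('y0 y1')
y = sp.Matrix([y0, y1])
f = sp.Matrix([y0**2 * y1, sp.sin(y0) + 3 * y1])

# Row i of the Jacobian holds the partial derivatives of f[i]
# with respect to each state in y.
J = f.jacobian(y)
print(J.shape)
print(J)
```

A vector of $n$ expressions differentiated with respect to $n$ states gives an $n \times n$ matrix, which is why the Jacobian of the full model has 196 entries.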

## Exercise

Compute the Jacobian and store the result in the variable `jac_of_odes`. Inspect the resulting Jacobian for dimensionality, type, and the symbolic form.

In [None]:
jac_of_odes = rhs_of_odes.jacobian(states)

In [None]:
type(jac_of_odes)

In [None]:
jac_of_odes.shape

In [None]:
jac_of_odes

# C Code Printing

The expressions are large and will have to be executed many thousands of times to compute the desired numerical values, so we want them to execute as fast as possible. Now that we have some mathematical expressions available it is time to print these as C code.

We will design a single C function that evaluates both $f(y)$ and $J(y)$ simultaneously given the values of the states $y$. Below is a basic template for a C program that includes such a function. Our job is to populate the function with the SymPy expressions represented as C code.

```C
#include <math.h>
#include <stdio.h>

void evaluate_odes(const float state_vals[14], float rhs_result[14], float jac_result[196])
{
    // We need to fill in the code here in this function using SymPy.
}

int main() {

    float state_vals[14] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14};
    float rhs_result[14];
    float jac_result[196];
    
    evaluate_odes(state_vals, rhs_result, jac_result);
    
    int i;

    printf("The right hand side of the equations evaluates to:\n");
    for (i=0; i < (sizeof (rhs_result) / sizeof (rhs_result[0])); i++) {
        printf("%lf\n", rhs_result[i]);
    }

    printf("The Jacobian evaluates to:\n");
    for (i=0; i < (sizeof (jac_result) / sizeof (jac_result[0])); i++) {
        printf("%lf\n", jac_result[i]);
    }

    return 0;
}
```

Instead of using the `ccode` convenience function let's use the underlying code printer class to do the printing. Further down we will modify the class for custom printing.

In [None]:
from sympy.printing.ccode import CCodePrinter

All printing classes have to be instantiated and then the `.doprint()` method can be used to print SymPy expressions. Let's try to print the right hand side of the differential equations.

In [None]:
printer = CCodePrinter()

In [None]:
print(printer.doprint(rhs_of_odes))

In this case the C code printer does not do what we desire. It does not support printing a SymPy Matrix. In C, a matrix can be represented by an array type. The array type in C stores contiguous values, e.g. floats, in a chunk of memory. You can declare an array of floats in C like:

```C
float my_array[10];
```

The word `float` is the data type of the individual values in the array, which must all be the same. The word `my_array` is the variable name we choose for the array and the `[10]` is the syntax to declare that this array will have 10 values.

The array is "empty" when first declared and can be filled with values like so:

```C
my_array[0] = 5;
my_array[1] = 6.78;
my_array[2] = my_array[0] * 12;
```

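It is possible to declare multidimensional arrays in C that could map more directly to the indices of our two dimensional matrix, but in this case we will map our two dimensional matrix to a one dimensional array using C contiguous row ordering. This means that the rows of the matrix are stored one after another, so the element in row $i$ and column $j$ of a matrix with $n$ columns is found at index $i n + j$ of the array, counting from zero.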
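The row ordering can be sketched in plain Python. The 2 x 3 matrix and the helper `flat_index` are hypothetical and exist only to show where each element lands in the flat array.

```python
# Flatten a small 2 x 3 matrix into a one-dimensional list, row by row.
matrix = [[1, 2, 3],
          [4, 5, 6]]
n_rows, n_cols = 2, 3

flat = [matrix[i][j] for i in range(n_rows) for j in range(n_cols)]


def flat_index(i, j, n_cols):
    """Return the position of element (i, j) in the row-ordered flat array."""
    return i * n_cols + j


print(flat)                             # [1, 2, 3, 4, 5, 6]
print(flat[flat_index(1, 0, n_cols)])   # 4
print(flat_index(13, 13, 14))           # 195
```

The last line shows why a 14 x 14 Jacobian needs a flat array of 196 values: its final element sits at index 195.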

The code printers are capable of dealing with this need through the `assign_to` keyword argument in the `.doprint()` method but we must define a SymPy object that is appropriate to be assigned to. In our case, since we want to assign a Matrix we need to use an appropriately sized Matrix symbol.

In [None]:
rhs_vals = sp.MatrixSymbol('rhs_vals', 14, 1)

In [None]:
print(printer.doprint(rhs_of_odes, assign_to=rhs_vals))

Notice that we have proper array value assignment and valid lines of C code that can be used in our function.
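To see the array assignment on something small enough to read at a glance, here is a minimal sketch that uses the `ccode` convenience function with a made-up 2 x 2 matrix. The names `small` and `out` are assumptions for this illustration only.

```python
import sympy as sp

x, y = sp.symbols('x y')

# Hypothetical 2 x 2 matrix of expressions.
small = sp.Matrix([[x, y],
                   [x*y, x + y]])

# The MatrixSymbol provides the array name and the shape to assign into.
out = sp.MatrixSymbol('out', 2, 2)

code = sp.ccode(small, assign_to=out)
print(code)
```

Each matrix entry becomes one assignment to a flat index of `out`, with the first row filling the lowest indices and the second row following it.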

## Exercise

Print out valid C code from the Jacobian.

In [None]:
jac_vals = sp.MatrixSymbol('jac_vals', 14, 14)

In [None]:
print(printer.doprint(jac_of_odes, assign_to=jac_vals))

# Changing the Behavior of the Printer

The SymPy code printers are relatively easy to extend. They are designed such that if you want to change how a particular SymPy object prints, for example a `Symbol`, then you only need to modify the `_print_Symbol` method.

In [None]:
class MyCodePrinter(CCodePrinter):
    def _print_Symbol(self, expr):
        return "No matter what symbol you pass in I will always print:\n\nNi!"

In [None]:
my_printer = MyCodePrinter()

In [None]:
print(my_printer.doprint(sp.Symbol('theta')))

## Exercise

In C, calling `pow()` for low integer exponents often executes slower than simply expanding the multiplication. For example `pow(x, 2)` should be printed as `x*x`. Modify the `CCodePrinter._print_Pow` method to expand the multiplication if the exponent is a positive integer less than or equal to 4. You may want to have a look at the source code with `printer._print_Pow??`.

In [None]:
printer._print_Pow??

In [None]:
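from sympy.printing.precedence import precedence

class MyCodePrinter(CCodePrinter):
    def _print_Pow(self, expr):
        # Only expand small positive integer exponents; everything else
        # (fractional, negative, symbolic) uses the default printing.
        if expr.exp.is_Integer and 0 < expr.exp <= 4:
            # Parenthesize the base so that (x + y)**2 prints as (x + y)*(x + y).
            base = self.parenthesize(expr.base, precedence(expr))
            return '*'.join([base] * int(expr.exp))
        else:
            return super()._print_Pow(expr)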

In [None]:
my_printer = MyCodePrinter()

In [None]:
x = sp.Symbol('x')
my_printer.doprint(x)

In [None]:
my_printer.doprint(x**2)

In [None]:
my_printer.doprint(x**4)

In [None]:
my_printer.doprint(x**5)

## Exercise

One issue with our current code printer is that the expressions use the symbols $y0, y1, \ldots, y13$ instead of accessing the values directly from the array with `state_vals[0], state_vals[1], ..., state_vals[13]`. We could go back and rename our SymPy symbols to use brackets, but another way would be to override the `_print_Symbol()` method to print these symbols as we desire. Modify the code printer so that it prints with the proper array access in the expression.

In [None]:
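class MyCodePrinter(CCodePrinter):
    def _print_Symbol(self, expr):
        if expr in states:
            # Strip the leading 'y' and use the remaining digits as the index.
            return 'state_vals[{}]'.format(expr.name[1:])
        else:
            # Any symbol that is not a state prints the default way.
            return super()._print_Symbol(expr)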

In [None]:
my_printer = MyCodePrinter()
print(my_printer.doprint(rhs_of_odes, assign_to=rhs_vals))
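The following self-contained sketch checks the same idea on a toy expression that mixes state symbols with one other symbol. It assumes a SymPy release that provides `C99CodePrinter` (used here in place of `CCodePrinter`), and `toy_states`, `k` and `ToyStatePrinter` are hypothetical names for this illustration.

```python
import sympy as sp

try:
    from sympy.printing.c import C99CodePrinter  # newer SymPy releases
except ImportError:
    from sympy.printing.ccode import C99CodePrinter  # older SymPy releases

# Hypothetical stand-ins for the real state symbols, plus one parameter.
toy_states = sp.symbols('y0:3')
k = sp.Symbol('k')


class ToyStatePrinter(C99CodePrinter):
    def _print_Symbol(self, expr):
        if expr in toy_states:
            # Strip the leading 'y' and use the remaining digits as the index.
            return 'state_vals[{}]'.format(expr.name[1:])
        # Anything that is not a state prints the default way.
        return super()._print_Symbol(expr)


code = ToyStatePrinter().doprint(k * toy_states[0] + toy_states[2])
print(code)
```

The state symbols appear as array accesses while `k` keeps its own name, which shows why the override needs to fall back to the parent method for symbols it does not recognize.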

# Common Subexpression Elimination

If you look carefully at the expressions in the two matrices you'll see repeated expressions. These are not ideal in the sense that the computer has to repeat the exact same calculation multiple times. For large expressions this can be a major issue. Compilers, such as gcc, can often eliminate common subexpressions on their own, but for complex expressions the algorithms in most compilers do not do a thorough job or compilation can take an extremely long time. SymPy has tools to perform common subexpression elimination which is both thorough and reasonably efficient.

For example if you have two expressions:

```python
a = x*y + 5
b = x*y + 6
```

you can convert this to:

```python
z = x*y
a = z + 5
b = z + 6
```

and `z` only has to be computed once.

The `cse()` function in SymPy returns the subexpression, `z = x*y`, and the simplified expressions, `a = z + 5` and `b = z + 6`.
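A minimal sketch of that two-expression example, assuming plain symbols `x` and `y`. Note that `cse()` chooses its own names for the new symbols, so the replacement is not literally called `z`.

```python
import sympy as sp

x, y = sp.symbols('x y')
a = x*y + 5
b = x*y + 6

sub_exprs, simplified = sp.cse([a, b])

# sub_exprs is a list of (new symbol, subexpression) pairs.
print(sub_exprs)
# simplified holds the original expressions rewritten with the new symbols.
print(simplified)

# Substituting the subexpression back recovers the original expressions.
new_symbol, replaced = sub_exprs[0]
print(simplified[0].subs(new_symbol, replaced) == a)
```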

Here is how it works on the right hand side of the differential equations:

In [None]:
sub_exprs, simplified_rhs = sp.cse(rhs_of_odes)

In [None]:
sub_exprs

In [None]:
simplified_rhs[0]

You can find common subexpressions among multiple expressions also:

In [None]:
sub_exprs, simplified_exprs = sp.cse((rhs_of_odes, jac_of_odes))

In [None]:
sub_exprs

In [None]:
simplified_exprs[0]

In [None]:
simplified_exprs[1]

## Exercise

Use common subexpression elimination to print out C code for your two arrays such that:

```C
float x0 = some_sub_expression;
...
float xN = the_last_sub_expression;

rhs_result[0] = expressions_containing_the_subexpressions;
...
rhs_result[13] = ...;

jac_result[0] = ...;
...
jac_result[195] = ...;
```

You can make it work fairly easily by overriding the matrix printing methods of the printer class:

In [None]:
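# Assignment lives in sympy.codegen.ast in SymPy 1.1 and later.
from sympy.codegen.ast import Assignment

class CMatrixPrinter(CCodePrinter):
    def _print_ImmutableMatrix(self, expr):
        sub_exprs, simplified = sp.cse(expr)
        lines = []
        # Declare each common subexpression once as a local float.
        for var, sub_expr in sub_exprs:
            lines.append('float ' + self._print(Assignment(var, sub_expr)))
        M = sp.MatrixSymbol('M', *expr.shape)
        # Assign the simplified matrix so the subexpressions are actually used.
        lines.append(self._print(Assignment(M, simplified[0])))
        return '\n'.join(lines)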

In [None]:
p = CMatrixPrinter()
print(p.doprint(jac_of_odes))

In [None]:
class CMatrixPrinter(CCodePrinter):
    
    def _print_list(self, list_of_exprs):
        # NOTE : The MutableDenseMatrix is turned into an ImmutableMatrix inside here.
        if all(isinstance(x, sp.ImmutableMatrix) for x in list_of_exprs):
            sub_exprs, simplified_exprs = sp.cse(list_of_exprs)
            lines = []
            for var, sub_expr in sub_exprs:
                ass = Assignment(var, sub_expr)
                lines.append('float ' + self._print(ass))
            for mat in simplified_exprs:
                lines.append(self._print(mat))
            return '\n'.join(lines)
        else:
            return super()._print_list(list_of_exprs)

# TODO : This doesn't work as expected.
#    def _print_Symbol(self, expr):
#         if expr in states:
#            return self._print(sp.Symbol('state_vals[{}]'.format(expr.name[1:])))
            
    def _print_ImmutableMatrix(self, expr):
        if expr.shape[1] > 1:
            M = sp.MatrixSymbol('jac_result', *expr.shape)
        else:
            M = sp.MatrixSymbol('rhs_result', *expr.shape)
        return self._print(Assignment(M, expr))

In [None]:
p = CMatrixPrinter()
print(p.doprint([rhs_of_odes, jac_of_odes]))
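As an alternative to subclassing the printer, the same output layout can be sketched with `cse()` and the `ccode` convenience function alone. The helper `print_c_with_cse` and the two-state system below are hypothetical and only illustrate the pattern; this is not the tutorial's reference solution.

```python
import sympy as sp


def print_c_with_cse(named_matrices, dtype='float'):
    """Return C statements for several matrices that share subexpressions.

    named_matrices is a list of (array_name, sympy Matrix) pairs.
    """
    names = [name for name, _ in named_matrices]
    matrices = [mat for _, mat in named_matrices]
    sub_exprs, simplified = sp.cse(matrices)

    lines = []
    # Declare each common subexpression once as a local variable.
    for var, sub_expr in sub_exprs:
        lines.append('{} {}'.format(dtype, sp.ccode(sub_expr, assign_to=var)))
    # Assign every simplified matrix to a flat C array of the given name.
    for name, mat in zip(names, simplified):
        target = sp.MatrixSymbol(name, *mat.shape)
        lines.append(sp.ccode(mat, assign_to=target))
    return '\n'.join(lines)


# Hypothetical two-state system with a repeated exponential term.
y0, y1 = sp.symbols('y0 y1')
f = sp.Matrix([sp.exp(y0*y1) + y0, sp.exp(y0*y1) * y1])
jac = f.jacobian(sp.Matrix([y0, y1]))

code = print_c_with_cse([('rhs_result', f), ('jac_result', jac)])
print(code)
```

Passing both matrices to a single `cse()` call lets the right hand side and the Jacobian share subexpressions, so the exponential is evaluated only once in the generated code.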