# Cython: C made painless

* Python: fast development, slow execution

* C/C++/Fortran: slow development, fast execution

## Why is Python execution slow?

It's all too dynamic.

* The runtime interprets bytecode

* Everything is an object (boxing/unboxing)

* Function calls are expensive

* Global interpreter lock (GIL)
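The first two points can be observed directly with the standard library. A minimal sketch, assuming CPython; the function `add` is purely illustrative:

```python
import dis
import sys

def add(a, b):
    return a + b

# Names of the bytecode instructions the interpreter steps through for `add`
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# A Python float is a full heap object (type pointer, reference count, value),
# so it takes more memory than the bare 8-byte C double it holds
float_size = sys.getsizeof(1.0)
print(float_size)
```

The exact instruction names and object size depend on the interpreter version and platform.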

Python has a well-defined C API, which can be used to move computationally expensive parts into compiled code: *C extensions*.

Examples: NumPy, SciPy, scikit-learn, lxml, Sage, ZeroMQ, ...

### Human user $\Longleftrightarrow$ Python runtime $\Longleftrightarrow$ C extensions.

The idea is to keep the user interface, database, web, visualization, etc. in Python.

Writing C extensions by hand can be daunting: all the pleasures of manual memory management, *plus* reference counting, parsing Python arguments, etc.

https://docs.python.org/3.5/c-api/

See, e.g., Paul Ross's guide: http://pythonextensionpatterns.readthedocs.io/en/latest/index.html

## Enter Cython

Cython (http://cython.org) is a static compiler from a superset of Python to C (or C++).

### Human user $\Longleftrightarrow$ Python runtime $\Longleftrightarrow$ Cython $\Longleftrightarrow$ a C extension.


If you already have C/C++/Fortran code, expose it to Python by *wrapping* it in Cython.

Otherwise, 

1. build a prototype in pure Python,
2. profile to identify hotspots,
3. move hotspots to Cython,
4. Profit!
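Step 2 can be done with the standard-library `cProfile` module. A minimal sketch; `hotspot` and `program` are hypothetical stand-ins for real application code:

```python
import cProfile
import io
import pstats

def hotspot(n):
    """Deliberately slow pure-Python loop (illustrative only)."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total

def program():
    """Stand-in for the prototype being profiled."""
    return hotspot(200_000)

# Collect profiling data while the prototype runs
profiler = cProfile.Profile()
profiler.enable()
result = program()
profiler.disable()

# Print the five most expensive entries, sorted by cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

Functions near the top of the report are the candidates for moving to Cython in step 3.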

Perks:

* First-class NumPy support
* Can use the C++ standard library
* Parallelism: can release the GIL

## A worked example of Cythonizing a computation

Shamelessly stolen from Pauli Virtanen, *Cython tutorial*,
https://python.g-node.org/python-summerschool-2011/_media/materials/cython/cython-slides.pdf

Consider a planet orbiting a star.

We need to solve the equation of motion, a second-order ODE written here as a system of two first-order equations:


$$
\begin{align}
\frac{d\mathbf{x}}{dt} &= \mathbf{v} \;,\\
\frac{d\mathbf{v}}{dt} &= \frac{\mathbf{F}(\mathbf{x})}{m} \;.
\end{align}
$$

Note that the time stepping cannot be vectorized: each step depends on the result of the previous one. Hence NumPy is of no help here.

For the sake of the example, we use only the Euler method.
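Written out, one explicit Euler step of size $\Delta t$ (called `dt` in the code) advances the state from step $n$ to step $n+1$. The subscript $n$ is notation introduced here for illustration. The code uses the force law $\mathbf{F}(\mathbf{x}) = -\mathbf{x}/|\mathbf{x}|^3$, i.e. units in which the product of the gravitational constant and the two masses is one:

$$
\begin{align}
\mathbf{x}_{n+1} &= \mathbf{x}_n + \mathbf{v}_n \,\Delta t \;,\\
\mathbf{v}_{n+1} &= \mathbf{v}_n + \frac{\mathbf{F}(\mathbf{x}_n)}{m} \,\Delta t \;.
\end{align}
$$

Both updates use only quantities from step $n$, and step $n+1$ cannot start before step $n$ has finished.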

In [1]:
import numpy as np

In [9]:
from math import sqrt

class Planet(object):
    """A class to store a planet's position and velocity."""
    def __init__(self):
        self.x = 1.0
        self.y = 0
        self.z = 0
        self.vx = 0
        self.vy = 0
        self.vz = 1.0
        
        self.m = 1.0
        

def single_step(planet, dt):
    """Make a single step in time, t -> t+dt."""
    
    # Gravitational force pulls towards origin
    r = sqrt(planet.x**2 + planet.y**2 + planet.z**2)
    r3 = r**3
    
    Fx = -planet.x / r3
    Fy = -planet.y / r3
    Fz = -planet.z / r3
    
    # update position
    planet.x += planet.vx * dt
    planet.y += planet.vy * dt
    planet.z += planet.vz * dt
    
    # update velocity
    m = planet.m
    planet.vx += Fx * dt / m
    planet.vy += Fy * dt / m
    planet.vz += Fz * dt / m


def propagate(planet, time_span, num_steps):
    """Make a number of time steps."""
    dt = time_span / num_steps
    
    for _ in range(num_steps):
        single_step(planet, dt)

In [10]:
planet = Planet()
%timeit propagate(planet, 1, 1000)

1.72 ms ± 19.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
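A quick sanity check of the integrator, as a sketch. The initial conditions (unit radius, unit speed perpendicular to the radius, unit force constant) describe a circular orbit, so the exact radius stays at 1 and the Euler result should remain close to it over a short time span. The classes and functions are restated in condensed form so the block runs on its own, and a fresh planet is used because `%timeit` has already advanced `planet` many times:

```python
from math import sqrt

class Planet:
    """Same state as the Planet class above, restated so this block runs alone."""
    def __init__(self):
        self.x, self.y, self.z = 1.0, 0.0, 0.0
        self.vx, self.vy, self.vz = 0.0, 0.0, 1.0
        self.m = 1.0

def single_step(planet, dt):
    """One explicit Euler step, with the same update order as above."""
    r3 = sqrt(planet.x**2 + planet.y**2 + planet.z**2) ** 3
    # Force is evaluated at the old position, before any update
    Fx, Fy, Fz = -planet.x / r3, -planet.y / r3, -planet.z / r3
    planet.x += planet.vx * dt
    planet.y += planet.vy * dt
    planet.z += planet.vz * dt
    planet.vx += Fx * dt / planet.m
    planet.vy += Fy * dt / planet.m
    planet.vz += Fz * dt / planet.m

def propagate(planet, time_span, num_steps):
    dt = time_span / num_steps
    for _ in range(num_steps):
        single_step(planet, dt)

fresh = Planet()
propagate(fresh, 1, 1000)

# Distance from the star after one time unit; exact answer is 1
radius = sqrt(fresh.x**2 + fresh.y**2 + fresh.z**2)
print(radius)
```

The Euler method does not conserve energy, so the radius drifts slowly away from 1 as the time span grows.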


## Compile the program with Cython

(Almost) every Python program is a valid Cython program.

In [1]:
%load_ext cython

In [4]:
%%cython -a
# -a is for "annotate"

from math import sqrt

class Planet(object):
    def __init__(self):
        self.x = 1.0
        self.y = 0
        self.z = 0
        self.vx = 0
        self.vy = 0
        self.vz = 1.0
        
        self.m = 1.0
        

def single_step(planet, dt):
    """Make a single step in time, t -> t+dt."""
    
    # Gravitational force pulls towards origin
    r = sqrt(planet.x**2 + planet.y**2 + planet.z**2)
    r3 = r**3
    
    Fx = -planet.x / r3
    Fy = -planet.y / r3
    Fz = -planet.z / r3
    
    # update position
    planet.x += planet.vx * dt
    planet.y += planet.vy * dt
    planet.z += planet.vz * dt
    
    # update velocity
    m = planet.m
    planet.vx += Fx * dt / m
    planet.vy += Fy * dt / m
    planet.vz += Fz * dt / m


def propagate(planet,
              time_span,
              Py_ssize_t num_steps):
    """Make a number of time steps."""
    dt = time_span / num_steps
    
    cdef Py_ssize_t j
    
    for j in range(num_steps):
        single_step(planet, dt)

In [11]:
planet = Planet()
%timeit propagate(planet, 1, 100)

177 µs ± 13.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
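The two timings above were taken with different step counts (1000 steps for pure Python, 100 for the Cython build), so they are best compared per step. A small calculation using only the reported mean values:

```python
# Mean timings reported by %timeit above, converted to seconds
pure_python_total = 1.72e-3   # 1000 steps
cython_total = 177e-6         # 100 steps

pure_python_per_step = pure_python_total / 1000
cython_per_step = cython_total / 100

# Values below 1 mean the Cython build took longer per step
ratio = pure_python_per_step / cython_per_step
print(pure_python_per_step, cython_per_step, ratio)
```

Per step the two runs take about the same time. This suggests that compiling the untyped code gains little by itself here, since attribute access and function calls still go through Python objects.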
