# Cython 101



## How Cython Works

Python is interpreted: you run source files directly, with no separate build step.
Cython code, by contrast, must be compiled in a two-step process.

1. A `.pyx` file is compiled by Cython to a `.c` file.
2. The `.c` file is compiled by a C compiler to a `.so` file (`.pyd` on Windows), which can be imported from Python as if it were a regular Python module.
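
The file extension of the compiled module depends on the platform and interpreter. As a small sketch using only the standard library (no Cython required), the snippet below lists the suffixes the running interpreter accepts for compiled extension modules. The helper `is_extension_module` is a hypothetical name introduced here for illustration.

```python
import importlib.machinery

# Suffixes this interpreter accepts for compiled extension modules,
# for example names ending in ".so" on Linux or ".pyd" on Windows.
suffixes = importlib.machinery.EXTENSION_SUFFIXES
print(suffixes)


def is_extension_module(filename):
    """Return True if filename ends with one of the extension suffixes."""
    return any(filename.endswith(suffix) for suffix in suffixes)


# A compiled Cython module matches; the .pyx source does not.
print(is_extension_module("my_cython_file" + suffixes[0]))
print(is_extension_module("my_cython_file.pyx"))
```

The exact suffixes printed will differ between operating systems and Python versions.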

> One of the benefits of Python is that you don't need to compile code.
> Cython requires a compilation step.


## Installation

To use Cython you need to install Cython itself (`pip install cython`)
and a C compiler (such as gcc).

* __Linux__ The GNU C Compiler (gcc) is usually present, or easily available through the package system. On Ubuntu or Debian, for instance, the command `sudo apt-get install build-essential` will fetch everything you need.
* __Mac OS X__ To retrieve gcc, one option is to install Apple’s XCode, which can be retrieved from the Mac OS X’s install DVDs or from https://developer.apple.com/.
* __Windows__ A popular option is to use the open source MinGW (a Windows distribution of gcc). See the appendix for instructions for setting up MinGW manually. Enthought Canopy and Python(x,y) bundle MinGW, but some of the configuration steps in the appendix might still be necessary. Another option is to use Microsoft’s Visual C. One must then use the same version which the installed Python was compiled with.
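
To check whether a machine looks ready before trying to compile anything, a minimal sketch like the following may help. It only uses the standard library. The function name `check_cython_toolchain` and the list of compiler executable names are assumptions made for this example, and finding a compiler on the `PATH` does not guarantee it is the one the build will actually use.

```python
import importlib.util
import shutil


def check_cython_toolchain(compilers=("gcc", "clang", "cl")):
    """Report whether Cython is importable and which compilers are on PATH.

    The default compiler names are an assumption for illustration;
    adjust them for your platform.
    """
    cython_installed = importlib.util.find_spec("Cython") is not None
    found = [name for name in compilers if shutil.which(name)]
    return {"cython": cython_installed, "compilers": found}


print(check_cython_toolchain())
```

If `"cython"` is `False` or `"compilers"` is empty, revisit the installation steps above.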

> The Docker container running the notebook already has Cython and gcc installed, so you don't need to worry about it.


## Running Cython in a Python package

The compiled code can be imported directly into Python as if it were Python code.
Write the Cython code in the file `my_package/sub_package/my_cython_file.pyx`

and import it with the following line in another Python or Cython file in the package.

```python
from my_package.sub_package.my_cython_file import my_cython_function
```

There are three ways to build Cython code in a package.

* Write a distutils/setuptools setup.py. This is the normal and recommended way.
* Use `pyximport`, importing Cython .pyx files as if they were .py files (using distutils to compile and build in the background). This method is easier than writing a setup.py, but is not very flexible, so you’ll need to write a setup.py if, for example, you need certain compilation options.
* Run the cython command-line utility manually to produce the .c file from the .pyx file, then manually compile the .c file into a shared object library or DLL suitable for import from Python. (These manual steps are mostly for debugging and experimentation.)

We won't be running our code in a package, so if you want to learn more, read the [docs here](https://cython.readthedocs.io/en/latest/src/quickstart/build.html).

## Cython compilation at runtime

Cython has a tool called `pyximport` that compiles Cython code when it's imported.
It isn't recommended for production or distribution, but it removes the manual compile step from the development loop.

```python
>>> import pyximport; pyximport.install()
>>> from my_package.sub_package.my_cython_file import my_cython_function
```

## Running Cython in a notebook

Jupyter lets us run Cython code inline.
It compiles the code before running a cell.
Running a Cython cell is noticeably slower than running a Python cell, but it is still the easiest way to iterate on Cython code.

To enable Cython compilation, load the Cython Jupyter extension by running the following magic command in a cell.
```
%load_ext Cython
```


In [None]:
%load_ext Cython

To tell Jupyter that a cell contains Cython code to compile, start the cell with the magic command `%%cython`.

In [None]:
%%cython
cdef int add(int x, int y):
    return x + y
print(add(1, 2))

Cython lets us see how much of our code compiles to fast C and how much still relies on slow Python calls by passing the `--annotate` flag.

In [None]:
%%cython --annotate
cdef int add(int x, int y):
    return x + y
my_sum = add(1,2)  # the arguments must be integers because the parameters are declared as int
print(my_sum)

## Digging into Cython syntax

### Why is some of the code yellow and some white?

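The `--annotate` flag shows which code compiles to pure `C` and which code makes Python calls. White lines are pure C; yellow lines interact with Python, and the darker the yellow, the more Python interaction the line involves.

Let's deconstruct our code line by line.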

```cdef int add(int x, int y):```

We are defining a `C` function.
`cdef` tells the compiler to make a `C` function without a Python wrapper.
Notice that the parameter and return types are written in `C` style, not in the Python type annotation style, which would look like this:

```def add(x: int, y: int) -> int:```

We will dive deeper into Cython function and variable definitions later.

The next line, `return x + y`, doesn't run any Python code because the Cython compiler knows that `x` and `y` are both C ints and that the return type is also a C int.

The third line, `my_sum = add(1,2)`, cannot be compiled completely to C because `my_sum` is a Python variable, not a C variable. The Cython compiler converts the returned C int to a Python int.

The line `print(my_sum)` is pure Python.

# Cython Syntax

## Variable and type definitions

Cython files can run ordinary, dynamically typed Python code, but to compile to fast C code we need to declare the types of our variables.

The `cdef` statement is used to declare C variables, similar to C code.

In [None]:
%%cython
cdef int x = 1
cdef float f
cdef float g[4]
cdef int *h

We can create the same variables by grouping them in a `cdef` block.

In [None]:
%%cython
cdef:
    int x = 1
    float f
    float g[4]
    int *h

We can also use `cdef` to create structs, unions, and enum types.

In [None]:
%%cython
cdef struct Grail:
    int age
    float volume

cdef union Food:
    char *spam
    float *eggs

cdef enum CheeseType:
    cheddar,
    edam,
    camembert


We use the same `cdef` keyword to declare a function

In [None]:
%%cython
cdef unsigned long foo(unsigned long bar):
    return bar * 3

We can also declare classes with `cdef`, making them extension types. They are faster than Python classes because they store their attributes in a struct internally instead of a dict.

In [None]:
%%cython
cdef class foo:
    cdef int x, y
    
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    cdef int get_y(self):
        return self.y
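
As a loose pure-Python analogy (not what Cython itself generates), a class that defines `__slots__` also stores its attributes in fixed slots rather than in a per-instance dict. The class names `DictPoint` and `SlotPoint` are hypothetical and exist only for this sketch.

```python
class DictPoint:
    """Regular Python class: attributes live in a per-instance __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    """Class with fixed attribute slots and no per-instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


d = DictPoint(1, 2)
s = SlotPoint(1, 2)
print(hasattr(d, "__dict__"))  # True
print(hasattr(s, "__dict__"))  # False

# With a fixed layout, attributes that were not declared cannot be added.
try:
    s.z = 3
except AttributeError as error:
    print("AttributeError:", error)
```

The fixed layout is the part of the analogy that carries over: the attributes of an extension type are declared up front, as `x` and `y` are in the `cdef class` above.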

## Digging into functions

In Cython, there are two types of functions, C functions and Python functions.

Python functions are defined using the `def` statement. Just like in Python, they take Python objects as parameters and return Python objects.

C functions are defined with the new `cdef` statement. They can take Python objects or C values as parameters and can return Python objects or C values.

When a parameter of a Python function is declared with a C data type, the function still receives a Python object, which Cython converts to a C value.
The following two functions are equivalent.

In [None]:
%%cython
def foo(bar):
    cdef int b = bar
    return b * 2
    
def foo(int bar):
    return bar * 2


> Note: Currently it is only possible to auto convert numeric types, string types and structs.
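
One practical consequence of converting arguments to C values is that a C `int` has a fixed width, unlike a Python `int`, which can grow arbitrarily large. The sketch below uses the standard-library `ctypes` module (not Cython) to compute the range of a C `int` on the current platform. The helper `fits_c_int` and the constant names are hypothetical, introduced only for illustration.

```python
import ctypes

# Width of a C int on this platform, in bits (commonly 32).
C_INT_BITS = 8 * ctypes.sizeof(ctypes.c_int)
C_INT_MIN = -(2 ** (C_INT_BITS - 1))
C_INT_MAX = 2 ** (C_INT_BITS - 1) - 1


def fits_c_int(value):
    """Return True if a Python int fits in a signed C int on this platform."""
    return C_INT_MIN <= value <= C_INT_MAX


print(C_INT_BITS, C_INT_MIN, C_INT_MAX)
print(fits_c_int(2))         # True
print(fits_c_int(2 ** 100))  # False
```

Passing a value outside this range to a parameter declared as `int` typically makes Cython raise an `OverflowError` during the conversion.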

### def, cdef, cpdef?

`cdef` functions run very fast but can only be called from Cython code.
`def` functions are slower because they take in Python objects, but can be called from Cython __and__ Python code.

If we want the performance of a `cdef` function and the accessibility of a `def` function, we declare the function as `cpdef`.
`cpdef` actually creates two functions: a `cdef` function and a `def` function. (Calling a `cpdef` function from Cython code has a slight overhead.)

Now let's time Fibonacci implementations in Python, Cython `def`, Cython `cpdef`, and Cython `cdef`.

In [None]:
def fib_python(n):
    if n < 2:
        return n
    return fib_python(n-2) + fib_python(n-1)

In [None]:
%%cython
def fib_def(int n):
    if n < 2:
        return n
    return fib_def(n-2) + fib_def(n-1)

cpdef int fib_cpdef(int n):
    if n < 2:
        return n
    return fib_cpdef(n-2) + fib_cpdef(n-1)

cdef int fib_cdef(int n):
    if n < 2:
        return n
    return fib_cdef(n-2) + fib_cdef(n-1)

# Wrap the cdef function in a def function, because anything declared with cdef can't be called from outside Cython or C code.
def fib_cdef_wrapper(int n):
    return fib_cdef(n)


In [None]:
cycles = 30
print('Python:')
%timeit fib_python(cycles)
print('Cython def:')
%timeit fib_def(cycles)
print('Cython cpdef:')
%timeit fib_cpdef(cycles)
print('Cython cdef:')
%timeit fib_cdef_wrapper(cycles)
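
The `%timeit` magic only works inside Jupyter or IPython. Outside a notebook, a rough equivalent can be sketched with the standard-library `timeit` module. The helper `best_time` is a hypothetical name, and the smaller input of 20 is chosen here only to keep the run short.

```python
import timeit


def fib_python(n):
    if n < 2:
        return n
    return fib_python(n - 2) + fib_python(n - 1)


def best_time(func, n, repeat=3, number=1):
    """Return the best time in seconds per call over `repeat` measurements."""
    timer = timeit.Timer(lambda: func(n))
    return min(timer.repeat(repeat=repeat, number=number)) / number


print(f"fib_python(20): {best_time(fib_python, 20):.6f} s")
```

Once the Cython functions are compiled and importable, `fib_def`, `fib_cpdef`, and `fib_cdef_wrapper` could be passed to `best_time` in the same way.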