# Sage Special Topics: Cython

## Overview

[Cython](https://cython.org/) is a compiler  
for both the Python programming language and  
the extended Cython programming language.  
This combination lets you enjoy  
the readability of Python and  
the efficiency of C.

This means you may:  
- use **static C data types** to speed up the program,  
- **wrap C functions** in a Python interface, and  
- use yet more features that I don't totally understand.

The Sage development team played an important role  
in building Cython; see [the history of Cython](https://en.wikipedia.org/wiki/Cython#History).  
However, Cython is not limited to the Sage environment,  
and this tutorial also applies to general Python users.

## Installation

If you are using SageNB  
or Jupyter under the Sage kernel,  
you may ignore this section.

For general Python users,  
you may use  
```Python
pip install Cython --user
```
to **install** Cython on your machine.  
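
To confirm that the installation is visible to the interpreter you are actually running, a small check along the following lines may help. This is only a sketch: `cython_available` is a hypothetical helper name, and it relies solely on the standard library.

```python
import importlib.util


def cython_available():
    """Return True if the Cython package can be found by this interpreter."""
    return importlib.util.find_spec("Cython") is not None


# Prints True or False depending on your environment.
print("Cython installed:", cython_available())
```

If this prints `False` right after a successful `pip install`, the `pip` command probably belongs to a different Python installation than the one running the check.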

To use it in a Jupyter notebook under the Python kernel,  
you have to run the following cell  
to **activate** the Cython extension.  
Then refer to the [A quick example](#A-quick-example) section.

In [1]:
%load_ext Cython

To use Cython in the console or in an IDE,  
please refer to the [Workflow](#Workflow) section.

## Workflow

As mentioned before,  
Cython is a **compiler** rather than an interpreter,  
so each piece of code needs to be compiled before execution.

Here is the basic workflow: (See [Basic Tutorial](https://cython.readthedocs.io/en/latest/src/tutorial/cython_tutorial.html) of the official Cython documentation for more details.)  
- Prepare a `cython_code.pyx` file that contains your Cython code.
- Prepare a `setup.py` file that contains the setup information.
- Run `python setup.py build_ext --inplace` in the terminal  
to generate `cython_code.so` (Unix) or `cython_code.pyd` (Windows).
- In another Python file or in a Jupyter notebook, `import cython_code`.
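
In practice the generated file name often carries an extra platform tag in front of `.so` or `.pyd`. As a rough sketch, the standard library can list the endings that your interpreter accepts for compiled modules; the module name `cython_code` is the one from the steps above.

```python
import importlib.machinery

# File-name endings that this interpreter accepts for compiled extension modules.
suffixes = importlib.machinery.EXTENSION_SUFFIXES

# Any of these file names would be importable as `cython_code`.
for suffix in suffixes:
    print("cython_code" + suffix)
```

The list differs between operating systems and Python versions, so the output is not shown here.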

Sample `cython_code.pyx`:

```Python
def print_interest():
    print("I love Math!")
```

Sample `setup.py`:

```Python
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("cython_code.pyx")
)
```

Sample usage:

```Python
import cython_code

# The function lives in the module's namespace
cython_code.print_interest()
```

Jupyter wraps this process  
in the `%%cython` magic function  
(`%cython` in SageNB).  

Let's look at an example below.

## A quick example

Here is a piece of Python code  
that calculates all primes less than or equal to `n`.

In [2]:
def a_prime(p):
    """Tell whether p is a prime or not"""
    for i in range(2,p):
        if p % i == 0:
            return False
    else:
        return True

def primes_below(n):
    """Return all primes below or equal to n"""
    primes = [p for p in range(2,n+1) if a_prime(p)]
    return primes

In [3]:
%%timeit
primes = primes_below(10000)

488 ms ± 2.86 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
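
`%%timeit` is a notebook magic. Outside Jupyter, a comparable measurement can be sketched with the standard `timeit` module. The version below repeats the two functions so that it runs on its own, and it uses a smaller `n` than the cell above (an arbitrary choice, only to keep the run short), so its numbers are not comparable with the timing shown.

```python
import timeit


def a_prime(p):
    """Tell whether p is a prime or not"""
    for i in range(2, p):
        if p % i == 0:
            return False
    return True


def primes_below(n):
    """Return all primes below or equal to n"""
    return [p for p in range(2, n + 1) if a_prime(p)]


# Run the call once per measurement and repeat the measurement three times.
runs = timeit.repeat(lambda: primes_below(2000), number=1, repeat=3)
print(f"best of {len(runs)}: {min(runs) * 1000:.2f} ms")
```

Reporting the best run rather than the mean is a common convention for `timeit`, since slower runs usually reflect interference from other processes.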


If you are using Jupyter,  
simply **put `%%cython` on the first line** of the target cell.  
The code in the cell will then be compiled  
and imported into the notebook automatically.  
(This is the same process as in the workflow.)

If you use Jupyter under the Python kernel,  
remember to activate the extension first.
```Python
%load_ext Cython
```

If you use SageNB, 
then add `%cython` instead.

In [4]:
%%cython

def a_prime_pinc(p):
    """Tell whether p is a prime or not"""
    for i in range(2,p):
        if p % i == 0:
            return False
    else:
        return True

def primes_below_pinc(n):
    """Return all primes below or equal to n"""
    primes = [p for p in range(2,n+1) if a_prime_pinc(p)]
    return primes

In [5]:
%%timeit
primes = primes_below_pinc(10000)

292 ms ± 998 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


Simply compiling the code through Cython  
already speeds it up,  
without any further changes.

If we appropriately change some  
dynamic data types in Python  
to **static data types** in C,  
then this will enhance the performance further.

In [6]:
%%cython

cdef bint a_prime_c(int p):
    """Tell whether p is a prime or not"""
    cdef int i
    for i in range(2,p):
        if p % i == 0:
            return False
    else:
        return True

cpdef primes_below_c(int n):
    """Return all primes below or equal to n"""
    primes = [p for p in range(2,n+1) if a_prime_c(p)]
    return primes

In [7]:
%%timeit
primes = primes_below_c(10000)

21.9 ms ± 97.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
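
To put the three timings side by side, the speedup factors can be computed directly from the mean values reported above. The snippet is plain arithmetic on those numbers; the dictionary keys are labels chosen here for illustration.

```python
# Mean timings reported by the %%timeit cells above, in milliseconds.
timings_ms = {
    "pure Python": 488.0,
    "compiled, no types": 292.0,
    "compiled, static types": 21.9,
}

baseline = timings_ms["pure Python"]
speedups = {name: baseline / t for name, t in timings_ms.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

On the machine used for this notebook, compilation alone gives roughly a 1.7x speedup, while adding static types gives roughly 22x.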


## Static data types

One may use `cdef` to **declare a variable with a static data type**.

Possible static data types include 
- `int` (optionally with `unsigned` and/or `long`), `float`, `double`, 
- `char`,
- arrays and pointers,
- and yet more [here](https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html).

In [8]:
%%cython

cdef int a = 1
print(a)

1
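
One thing to keep in mind is that a C `int` has a fixed width, unlike Python's arbitrary-precision integers. The sketch below is plain Python rather than Cython: it uses `ctypes.c_int` only as a stand-in for fixed-width C storage (an assumption made for illustration) to show what happens to a value that does not fit.

```python
import ctypes

# Width of a C int on this platform, in bits (commonly 32).
bits = 8 * ctypes.sizeof(ctypes.c_int)

# One more than the largest value a signed C int can hold.
big = 2 ** (bits - 1)

# ctypes does no overflow checking, so the value wraps around.
wrapped = ctypes.c_int(big).value

print(big, "is stored as", wrapped)
```

So when a variable may grow large, a wider type such as `long long`, or an untyped Python object, may be the safer choice.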


One may also use `cdef` to define a function.

In [9]:
%%cython

cdef int plus_one(int x):
    return x+1

print(plus_one(1))

2


But then this variable or function exists only at the C level, so it is not accessible from other cells.

In [None]:
a ### results in an error

In [None]:
plus_one(1) ### results in an error

For functions,  
it is possible to use `cpdef` instead  
to **wrap them for the Python interface**.  

In [10]:
%%cython

cpdef int plus_two(int x):
    return x+2

print(plus_two(1))

3


In [11]:
plus_two(1)

3

Recall that you can still use `def` in Cython,  
but it does not support a static return type for the function.  
(Static typing for the arguments is okay.)

In [12]:
%%cython

def plus_three(int x):
    return x+3

print(plus_three(1))

4


In [13]:
plus_three(1)

4

There are many combinations of  
`def`, `cdef`, `cpdef`, and with/without static typing.  

Here is an article "[How Fast are def cdef cpdef?](https://notes-on-cython.readthedocs.io/en/latest/fibo_speed.html)"  
that reports experiments on the speed of these combinations.

## Where to add types?

Use `%%cython -a` (or `--annotate`)  
to see which lines are more Python-like.  

Lines highlighted in darker yellow involve more interaction with Python and deserve more attention.  
(It seems this feature is not available in SageMath.)

In [14]:
%%cython -a

def a_prime_pinc(p):
    """Tell whether p is a prime or not"""
    for i in range(2,p):
        if p % i == 0:
            return False
    else:
        return True

def primes_below_pinc(n):
    """Return all primes below or equal to n"""
    primes = [p for p in range(2,n+1) if a_prime_pinc(p)]
    return primes

In [15]:
%%cython -a

cdef bint a_prime_c(int p):
    """Tell whether p is a prime or not"""
    cdef int i
    for i in range(2,p):
        if p % i == 0:
            return False
    else:
        return True

cpdef primes_below_c(int n):
    """Return all primes below or equal to n"""
    primes = [p for p in range(2,n+1) if a_prime_c(p)]
    return primes

##### Exercise
Optimize the following function with Cython.

In [16]:
### Python code: edit the Cython cell instead of this one
def root_two_py(a, b, r):
    """Start with two endpoints a and b  
    and apply the bisection method for r rounds.  
    Return the approximation of root 2."""
    
    if a**2 > 2 or b**2 <2:
        raise ValueError("It should be a^2<=2 and b^2>=2.")
    for i in range(r):
        c = (a+b) / 2
        if c**2 > 2:
            a,b = a,c
        else:
            a,b = c,b
    return a

In [17]:
%%timeit 
root_two_py(1,2,100)

25.2 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [18]:
%%cython

### add some static types to accelerate the program
cpdef root_two_cy(a, b, r):
    """Start with two endpoints a and b  
    and apply the bisection method for r rounds.  
    Return the approximation of root 2."""
    
    if a**2 > 2 or b**2 <2:
        raise ValueError("It should be a^2<=2 and b^2>=2.")
    for i in range(r):
        c = (a+b) / 2
        if c**2 > 2:
            a,b = a,c
        else:
            a,b = c,b
    return a

In [19]:
%%timeit 
root_two_cy(1,2,100)

16.6 µs ± 31.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
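
When adding types, it is worth confirming that the optimized function still returns the right value. The following is a minimal sketch of such a check in plain Python. `check_root_two` is a hypothetical helper introduced here, and the tolerance and number of rounds are arbitrary choices. It is applied to the pure Python reference so that it runs on its own; in the notebook you would pass `root_two_cy` instead.

```python
import math


def root_two_py(a, b, r):
    """Bisection reference, copied from the exercise above."""
    if a**2 > 2 or b**2 < 2:
        raise ValueError("It should be a^2<=2 and b^2>=2.")
    for i in range(r):
        c = (a + b) / 2
        if c**2 > 2:
            a, b = a, c
        else:
            a, b = c, b
    return a


def check_root_two(func, rounds=60, tol=1e-12):
    """Return True if func(1.0, 2.0, rounds) is within tol of sqrt(2).

    Hypothetical helper for this tutorial; float endpoints are used
    so that typed and untyped versions see the same kind of input.
    """
    return abs(func(1.0, 2.0, rounds) - math.sqrt(2)) < tol


print(check_root_two(root_two_py))
```

Passing floats rather than integers matters here, because a version typed with `double` arguments will compute in floating point regardless of what is passed in.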
