## Let's speak ~~from my heart~~ about Cython

#### *Sorry, this article will be done later*

And since we are developers, we will use code right here.

I don't want to do boring things later, so I set everything up now: NumPy, the Cython compiler extension for IPython, and a small package (`autotime`) that times every cell.

In [1]:
%load_ext autotime

In [2]:
import numpy as np

time: 71.1 ms


In [3]:
%load_ext cython
%load_ext autotime

The autotime extension is already loaded. To reload it, use:
  %reload_ext autotime
time: 324 ms


In [4]:
sections = 1000000

time: 372 µs


I would like to keep this topic as short and informative as possible. So the format I chose is to show the 5 most obvious reasons for using Cython in your projects right now.

## 5 reasons to use Cython instead of Python

### 1: you can keep your code as it is

We define two Python functions: some f(x) and an integrator. Each reason will modify this code. Every next reason changes your code more, but it also has a bigger effect.

In [5]:
def f(x):
    return x**2-x

def integrate_f(a, b, N):
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f(a + dx * i)
    return s * dx
    

time: 1.2 ms


Important note: defining a Python function (at least in a Jupyter notebook) is faster than defining a Cython function, because a Cython cell has to be compiled (or loaded from the cache) first. But that is only the definition!

In [6]:
integrate_f(0, 10, sections)

283.3328833334909

time: 346 ms


That's the time the Python function needs to calculate the integral of f(x) on [0, 10] with 1 million sections.
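As a sanity check on the printed value, the integral can also be computed by hand: the antiderivative of x**2 - x is x**3/3 - x**2/2, so the exact value on [0, 10] is 1000/3 - 50. The sketch below compares that with the same left-hand sum; `exact_integral` is a helper name made up for this example, and it uses fewer sections than the notebook so it runs quickly.

```python
def f(x):
    return x**2 - x

def integrate_f(a, b, N):
    # Left Riemann sum, same as the notebook cell above.
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + dx * i)
    return s * dx

def exact_integral(a, b):
    # Antiderivative of x**2 - x is x**3/3 - x**2/2.
    def F(x):
        return x**3 / 3 - x**2 / 2
    return F(b) - F(a)

exact = exact_integral(0, 10)          # 1000/3 - 50
approx = integrate_f(0, 10, 100_000)   # fewer sections than the notebook
print(exact, approx, exact - approx)
```

The left-hand sum slightly underestimates the exact value here, and the gap shrinks as the number of sections grows.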

In [7]:
%%cython

def f_cython(x):
    return x**2-x

def integrate_f_cython(a, b, N):
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f_cython(a + dx * i)
    return s * dx

time: 4.15 ms


In [8]:
integrate_f_cython(0, 10, sections)

283.3328833334909

time: 248 ms


#### Total:

So in my notebook I get a speedup of about 1.4x (346 ms versus 248 ms) without any changes to my code. This is probably my favorite thing about Cython. Nothing else is required of you: just replace Python with Cython. Use `%%cython` in your Jupyter notebook, or add a setup file that compiles your code into an extension module.
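For the setup-file route, a build script can look roughly like the sketch below. The file name `integrate.pyx` and the package name are assumptions (the article does not name a module); the file would hold the same code as the `%%cython` cell, without the magic line.

```python
# setup.py: minimal sketch; "integrate.pyx" is a hypothetical file name
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="integrate",
    ext_modules=cythonize(
        "integrate.pyx",
        # Compile with Python 3 semantics (true division, print function).
        compiler_directives={"language_level": "3"},
    ),
)
```

Running `python setup.py build_ext --inplace` should then leave a compiled module next to the source file, which can be imported like any other Python module.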

### 2: you can use fast type conversion

What if I want to add a requirement to my function: the interval bounds have to be integers?

In [9]:
integrate_f(0, 10.1, sections)

292.42820252133953

time: 335 ms


That is bad behaviour for my requirement. I have to change my `integrate_f` function:

In [10]:
def integrate_f(a, b, N):
    a, b = int(a), int(b)
    
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f(a + dx * i)
    return s * dx
    

time: 1.04 ms


In [11]:
integrate_f(0, 10.1, sections)

283.3328833334909

time: 357 ms


Works nicely, but how do we do this in Cython?

In [12]:
%%cython

def f_cython(x):
    return x**2-x

def integrate_f_cython(int a, int b, N):
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f_cython(a + dx * i)
    return s * dx

time: 2.15 ms


In [13]:
integrate_f_cython(0, 10.1, sections)

283.3328833334909

time: 240 ms


#### Total:

### 3: you can make your code much faster

Let's add some C stuff to Cython.

In Cython we have three types of function definition:
1. `def`
2. `cdef`
3. `cpdef`

`def` in Cython is the same as `def` in Python. Your performance improves only because your code is compiled into an object file.

`cdef` means that the function is translated into a plain C function. The Python syntax is no more than sugar over that C code. Such a function can only be called from Cython code, not from Python.

Last but not least, `cpdef` gives you both: a C function for calls from Cython, plus a Python wrapper that, like `def`, takes its arguments and returns its value as Python objects and calls the C function inside.

For better understanding, I will show you a very good feature of the Cython compiler: annotations. They let you see how the compiler transformed your code.

In [14]:
%%cython -a

def f_annotations(x):
    for i in range(10):
        pass
    return x**2-x

time: 13.5 ms


In [15]:
%%cython -a

cdef float f_annotations(float x):
    cdef int i = 0
    for i in range(10):
        pass
    return x**2-x

time: 4.86 ms


In [16]:
%%cython -a

cpdef float f_annotations(float x):
    cdef int i = 0
    for i in range(10):
        pass
    return x**2-x

time: 9.81 ms


As you can see, `cdef` generates shorter code. But in practice you will more often use `cpdef`, so that your functions can be called from other Python code.

In [17]:
%%cython

cdef float f_cython(float x):
    return x**2-x

cpdef float integrate_f_cython(int a, int b, int N):
    cdef float s = 0
    cdef float dx = (b-a)/N
    cdef int i 
    
    for i in range(N):
        s += f_cython(a + dx * i)
    return s * dx

time: 6.8 ms


In [18]:
integrate_f(0, 10.1, sections)

283.3328833334909

time: 342 ms


In [19]:
integrate_f_cython(0, 10.1, sections)

283.33111572265625

time: 7.31 ms


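#### Total:

With C types declared, the call drops from 342 ms to 7.31 ms, roughly 47 times faster than the Python version. The result differs in the fourth decimal place (283.3311 versus 283.3329), most likely because a C `float` is single precision; declaring `s` and `dx` as `double` should bring the accuracy back.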
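The gap between 283.33111572265625 and 283.3328833334909 above is what one would expect from accumulating a long sum in single precision. A rough way to check this without Cython is to round every intermediate value to a C `float` with `ctypes.c_float`. The helper names `f32`, `integrate_single` and `integrate_double` are made up for this sketch, and the rounding a real C compiler performs may differ in detail.

```python
import ctypes

def f32(x):
    """Round a Python float (C double) to the nearest C float."""
    return ctypes.c_float(x).value

def integrate_double(a, b, N):
    # Plain Python floats are C doubles.
    s = 0.0
    dx = (b - a) / N
    for i in range(N):
        x = a + dx * i
        s += x * x - x
    return s * dx

def integrate_single(a, b, N):
    # Every intermediate value is rounded to single precision,
    # approximating what `cdef float` variables would hold.
    s = f32(0.0)
    dx = f32((b - a) / N)
    for i in range(N):
        x = f32(a + f32(dx * i))
        fx = f32(f32(x * x) - x)
        s = f32(s + fx)
    return f32(s * dx)

N = 100_000  # smaller than the notebook's 1,000,000 to keep the sketch quick
result_double = integrate_double(0, 10, N)
result_single = integrate_single(0, 10, N)
print(result_double, result_single, abs(result_double - result_single))
```

The two results should agree in the leading digits and drift apart further down, which is the same pattern the notebook shows.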

### 4: you can use functions from C and C++ which have no Python bindings

For me it was the CUDA functions in OpenCV. But I really don't want to write about that, and I can't share that code.
So I'll just paste a link on the topic here.

[link](https://dmtn-013.lsst.io/)

### 5: an easy way to parallelize your code

In [20]:
%%cython

cpdef print_parallel(int N):
    cdef int i = 0
        
    for i in range(N):
        print(i, end='')

    print()

time: 5.28 ms


In [21]:
print_parallel(10)

0123456789
time: 1.98 ms


In [22]:
%%cython --compile-args=-fopenmp --link-args=-fopenmp --force

# import cython.parallel as cp
from cython.parallel import prange

cpdef print_parallel_cython(int N):
    cdef int i = 0
        
    for i in prange(N, nogil=True):
        with gil:
            print(i, end='')
    print()

time: 388 ms


In [23]:
print_parallel_cython(10)

8670912345
time: 21.1 ms
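The scrambled output `8670912345` is expected: `prange` hands the iterations to several OpenMP threads, and nothing fixes the order in which they reach `print`. The sketch below is only an analogy with plain Python threads (not OpenMP, and not how Cython implements `prange`); `work` and `completion_order` are made-up names, and the random delay is there just to make the uneven scheduling visible.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def work(i):
    # A small random delay stands in for uneven thread scheduling.
    time.sleep(random.uniform(0, 0.01))
    return i

def completion_order(n, workers=4):
    # Collect the results in the order the tasks finish,
    # not in the order they were submitted.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(work, i) for i in range(n)]
        return [fut.result() for fut in as_completed(futures)]

order = completion_order(10)
print("".join(str(i) for i in order))  # some permutation of 0123456789
```

Every index is still processed exactly once; only the order changes from run to run.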


## Conclusion


## Recommendations for reading: