# Lecture 05 
### Introduction to Cython - Part 01 
### March 1, 2021

---

Based on the material at: https://nyu-cds.github.io/python-cython/

This lecture provides a very brief introduction to Cython. See the [Cython documentation](http://cython.readthedocs.io/en/latest/) for a more detailed description of the Cython language.

### Cython

* The Python interpreter (CPython) is itself a C program. Can we leverage C further?
* Python packages can be written directly in C, but the resulting code tends to be complicated and verbose
* Cython: an easy way to incorporate compiled C/C++ code into your Python programs



- Cython is a superset of Python that **adds C data types** and translates Python code to C.

- It allows **compilation into a shared library** that can be imported into Python.

- Almost any piece of Python code is also valid Cython code (with a few limitations).

- Conversion between C types and (some) Python objects is automatic, e.g. for function parameters.






### Speed

* Performance gains depend very much on the program
* Little gain for numerical programs that spend most of their time in libraries such as NumPy, since that work already runs in compiled C
* Programs with explicit Python loops: often large improvements
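
The last two points can be illustrated without Cython. The sketch below compares an explicit Python loop with the built-in `sum`, whose loop runs inside CPython's C implementation. The function names `loop_sum` and `builtin_sum` are illustrative only, and absolute timings depend on the machine.

```python
import timeit


def loop_sum(n):
    """Sum 0..n-1 with an explicit loop; every iteration is interpreted bytecode."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Same result, but the iteration happens inside the C code of sum()."""
    return sum(range(n))


n = 100_000
# Take the best of several repetitions to reduce timing noise.
t_loop = min(timeit.repeat(lambda: loop_sum(n), number=10, repeat=3))
t_builtin = min(timeit.repeat(lambda: builtin_sum(n), number=10, repeat=3))
print(f"explicit loop: {t_loop:.4f} s, built-in sum: {t_builtin:.4f} s")
```

Code that already delegates its loops to C, as `builtin_sum` does, leaves little for Cython to improve; code shaped like `loop_sum` is where typed Cython loops tend to pay off.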

### Easy calls to C/C++ code

* Cython makes it easy to wrap existing C/C++ libraries


<h2 id="prerequisites">Installation</h2>
    <p>The examples in this lesson can be run directly using the Python interpreter, using IPython interactively, 
or using Jupyter notebooks. Anaconda users will already have Cython installed. You will also need a functioning
C compiler to be able to use Cython. See the <a href="http://cython.readthedocs.io/en/latest/src/quickstart/install.html">Cython installation guide</a> for more details.</p>

On Debian or Ubuntu, if you do not have GCC, run: ```sudo apt-get install build-essential```

To install Cython with conda, run: ```conda install cython```



#### <font color='black'>Basic C Types</font>

Sizes are typical for 64-bit Linux and macOS; the C standard only guarantees minimum ranges.

| Type | Description |
| :--- | :---: |
| char | 8-bit integer (signedness is platform-dependent) |
| short | 16-bit signed integer |
| int | 32-bit signed integer |
| long | 64-bit signed integer (32-bit on Windows) |
| float | 32-bit floating point |
| double | 64-bit floating point |
| long double | 80-bit floating point on x86 (platform-dependent) |

#### <font color='blue'>Array</font>
`type name[size]`
#### <font color='blue'>Pointer</font>
`type* name`
#### <font color='blue'>Structure</font>
`struct name { declaration }`
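
Because the sizes of C types depend on the platform and compiler, it can help to check them on your own machine. A minimal sketch using the standard-library `ctypes` module (not Cython itself, but it reports the sizes used by the C compiler that built your Python); the dictionary name `c_types` is illustrative.

```python
import ctypes

# Map C type names to their ctypes equivalents.
c_types = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
    "long double": ctypes.c_longdouble,
}

# sizeof() returns the storage size in bytes on the current platform.
sizes = {name: ctypes.sizeof(ctype) for name, ctype in c_types.items()}

for name, size in sizes.items():
    print(f"{name:12s} {size} bytes ({8 * size} bits)")
```

Note that the storage size of `long double` may be larger than its precision suggests: on x86-64 Linux it commonly occupies 16 bytes even though only 80 bits carry information.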

### Using the magic `%%cython` in jupyter

In [None]:
import numpy as np

# running sum of non-negative integers: g[i] = 0 + 1 + ... + (i - 1)

a = 0
g = np.zeros((10, ))

for i in range(10):
    g[i] = a
    a += i
    
print(g)

In [None]:
%load_ext Cython


Cython code can be compiled using the `%%cython` cell magic command:


In [None]:
%%cython
# Same running sum, now with statically typed C variables (NumPy is not needed)

cdef int a = 0
cdef int g[10]
cdef int i

for i in range(10):
    g[i] = a
    a += i
    
print(g)

In [None]:
%%cython --annotate

cdef int a = 0
cdef int g[10]
cdef int i

for i in range(10):
    g[i] = a
    a += i
    
print(g)


The `--annotate` option displays the source with a colour-coded overlay:

- Each line can be expanded to show the generated C code.

- More yellow: more calls into the Python virtual machine.

- More white: more pure C code with no Python interaction.

- Yellow lines are not necessarily slow: each call into the virtual machine has a cost, but that cost only becomes significant when the calls occur inside large loops.



In [None]:
%%cython

cdef struct Student:
    unsigned char *name
    unsigned char *lastname
    unsigned char *university_id
    int age
    float gpa
    
cdef Student student

student.name = 'John'
student.lastname = 'Smith'
student.university_id = 'js1234'
student.age = 20
student.gpa = 4.0

print("student:", student)

print("gpa:", student.gpa) 

----
## Performance Comparisons
The following pure Python example generates a list of the first `kmax` prime numbers by trial division against the primes found so far.

In [None]:
# Pure Python code
import time

def primes_with_python(kmax):
    
    kmax = min(1000, kmax)  # cap the number of primes at 1000
    primes = [None] * kmax # Initialize the list to the max number of elements
    
    result = []
    k = 0
    n = 2
    
    while k < kmax:
        
        i = 0
        while i < k and n % primes[i] != 0:
            i = i + 1
            
        if i == k:
            primes[k] = n
            k = k + 1
            result.append(n)
        
        n = n + 1
    return result

t = time.process_time()
x = primes_with_python(1000)
elapsed_time = time.process_time() - t
print(elapsed_time,'s')

%timeit x = primes_with_python(1000)

---

The same code can be compiled by Cython without any change. Because every variable is still a Python object, the speed-up is usually modest.

---

In [None]:
%load_ext Cython

In [None]:
%%cython --annotate
# Using the magic cython

import time

def primes_with_cython(kmax):
    kmax = min(1000, kmax)  # cap the number of primes at 1000
    primes = [None] * kmax # Initialize the list to the max number of elements
    
    result = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % primes[i] != 0:
            i = i + 1
        
        if i == k:
            primes[k] = n
            k = k + 1
            result.append(n)
        
        n = n + 1
    return result

t = time.process_time()
x = primes_with_cython(1000)
elapsed_time = time.process_time() - t
print(elapsed_time,'s')


In [None]:
%timeit x = primes_with_cython(1000)

---

We can declare C types for the loop variables and the array of primes so that the inner loop compiles to plain C:

In [None]:
%%cython 
#--annotate
import time

def primes_ctype(int kmax):
    
    cdef int i, k, n
    cdef int primes[1000]
    
    kmax = min(1000, kmax)  # never exceed the size of the C array `primes`
    
    result = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % primes[i] != 0:
            i = i + 1
            
        if i == k:
            primes[k] = n
            k = k + 1
            result.append(n)
        
        n = n + 1
    return result

t = time.process_time()
x = primes_ctype(1000)
elapsed_time = time.process_time() - t
print(elapsed_time,'s')

In [None]:
%timeit x = primes_ctype(1000)
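
Timings are only meaningful if all variants return the same primes. The sketch below is one way to check this in plain Python: it compares a function's output with an independent sieve of Eratosthenes. The names `first_primes_sieve`, `primes_trial_division`, and `agrees_with_reference` are hypothetical helpers introduced here, and `primes_trial_division` is just a compact stand-in for the functions defined above.

```python
def first_primes_sieve(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    if count <= 0:
        return []
    limit = 16
    while True:
        # sieve[i] == 1 means "i is still considered prime".
        sieve = bytearray([1]) * (limit + 1)
        sieve[0:2] = b"\x00\x00"
        for p in range(2, int(limit ** 0.5) + 1):
            if sieve[p]:
                # Cross out every multiple of p starting at p*p.
                sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
        primes = [i for i, flag in enumerate(sieve) if flag]
        if len(primes) >= count:
            return primes[:count]
        limit *= 2  # not enough primes yet: enlarge the sieve and retry


def primes_trial_division(kmax):
    """Compact stand-in for the trial-division functions in this lecture."""
    primes = []
    n = 2
    while len(primes) < kmax:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def agrees_with_reference(func, count):
    """True if func(count) matches the sieve-based reference."""
    return list(func(count)) == first_primes_sieve(count)


print(agrees_with_reference(primes_trial_division, 1000))
```

In the notebook, the same check could be applied to the compiled versions, for example `agrees_with_reference(primes_ctype, 1000)`.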

----
### Using Cython outside Jupyter (compiling with a `setup.py` script)

See https://cython.readthedocs.io/en/latest/src/quickstart/build.html

- Cython code is normally saved in files ending with .pyx (the x indicates it is different from standard Python code). 


- A Cython file is translated to C and compiled into an extension module by a build script.

The build script relies on **setuptools** (older code uses **distutils**, which offers the same interface but is no longer part of the standard library as of Python 3.12). The following example configures the build for a Cython file called **my_module.pyx** with the following content:

```python
def cfunc(int n):
    cdef int s = 0
    cdef int i
    for i in range(n + 1):
        s += i
    return s
```

In [None]:
!ls

In [None]:
!cat my_module.pyx

---

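The build is configured in a **setup.py** script. In our example, using `setup` from **setuptools** (a drop-in replacement for the **distutils** version), it can be:

```python
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="my_module_app",
    # cythonize() translates the .pyx file to C and describes the extension to build
    ext_modules=cythonize("my_module.pyx"),
)
```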

---

In [None]:
!cat setup.py

---

Now, run this command in your system’s command shell and you are done.



In [None]:
!python setup.py build_ext --inplace

# --inplace places the compiled extension in the source directory, next to the .pyx file,
# instead of in a separate build directory

In [None]:
!ls

---

The build creates two files:
- my_module.c
- my_module.cpython-*.so

The .so library can be treated just like any Python module and imported using the normal import statement:
```python
import my_module
```

In [None]:
import my_module

s = my_module.cfunc(100)
print("sum of the first 100 natural numbers:", s)

In [None]:
n = 2000
print("sum of the first %d natural numbers: %d" % (n, my_module.cfunc(n)))
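
The accumulator `s` in `cfunc` is declared as a C `int`, so unlike a Python integer it has a fixed range. The sketch below estimates where that matters, assuming a 32-bit `int` (typical, but not guaranteed). It emulates wraparound with `ctypes.c_int32`; signed overflow is formally undefined in C, so the wrapped value is only what is commonly observed. The helper names are illustrative.

```python
import ctypes

INT32_MAX = 2**31 - 1


def closed_form(n):
    """Exact value of 0 + 1 + ... + n using Python's unbounded integers."""
    return n * (n + 1) // 2


def as_int32(value):
    """Value after wrapping into a signed 32-bit integer (ctypes does not range-check)."""
    return ctypes.c_int32(value).value


# Largest n whose sum still fits in a signed 32-bit int.
largest_safe_n = 0
while closed_form(largest_safe_n + 1) <= INT32_MAX:
    largest_safe_n += 1

print("largest safe n:", largest_safe_n)
print("exact sum for n + 1:", closed_form(largest_safe_n + 1))
print("wrapped to 32 bits:", as_int32(closed_form(largest_safe_n + 1)))
```

The values of `n` used above (100 and 2000) are well within range. For larger inputs, declaring the accumulator as `cdef long long s` would be one way to widen it.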

## Other features

* C-only functions with `cdef`; functions callable from both C and Python with `cpdef`
* Extension types: `cdef class`
* Better parallelism, with the ability to release the GIL: https://cython.readthedocs.io/en/latest/src/userguide/parallelism.html
* Integration with NumPy (see part 2)
* And more (see the docs: https://cython.readthedocs.io/en/latest/index.html)