# Lecture 05 
### Introduction to Cython - Part 01 
### February 25, 2020   

---

Based on the material at: https://nyu-cds.github.io/python-cython/

This lecture provides a very brief introduction to Cython. See the [Cython documentation](http://cython.readthedocs.io/en/latest/) for a more detailed description of the Cython language.

What we will learn:
- What Cython is
- The different ways of using Cython
- How the computation times of Python and Cython code compare

----
- Cython is an extension of the Python language that adds C data types and translates Python code to C.

- The generated C code is compiled into a shared library that can be imported into Python.

- Almost any piece of Python code is also valid Cython code (with a few limitations).

- In Cython, function parameters and variables can be declared to have C data types, and code that manipulates Python values and C values can be freely intermixed. Cython automatically converts between C and Python data types wherever possible.

- Parameters of both Python (`def`) and C (`cdef`) functions can be declared to have C data types, using normal C declaration syntax.





---



- Speed

How much performance improves depends very much on the program. Typical numerical Python programs that already rely on compiled libraries such as NumPy tend to gain very little, because most of their time is spent in lower-level C code anyway. Loop-heavy programs written in plain Python, however, can improve by orders of magnitude once C types are declared.

- Easy calling into C code

One of Cython’s purposes is to allow easy wrapping of C libraries. When writing code in Cython you can call into C code as easily as into Python code.
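The speed point above can be illustrated even without Cython. The sketch below compares an explicit Python loop with the built-in `sum`, whose loop already runs in C. The function names `loop_sum` and `builtin_sum` are made up for this illustration, and the timings depend on the machine.

```python
import timeit

def loop_sum(n):
    """Sum 0..n-1 with an explicit loop executed by the interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total

def builtin_sum(n):
    """Same result, but the loop runs inside the C implementation of sum()."""
    return sum(range(n))

n = 100_000
t_loop = timeit.timeit(lambda: loop_sum(n), number=20)
t_builtin = timeit.timeit(lambda: builtin_sum(n), number=20)

print(f"explicit loop: {t_loop:.4f} s")
print(f"built-in sum:  {t_builtin:.4f} s")
```

Code shaped like `loop_sum` is where Cython tends to help most; code shaped like `builtin_sum` already spends its time in C.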

---

<blockquote>
<h2 id="prerequisites"><font color='blue'>__Prerequisites__</font></h2>
    <p>The examples in this lesson can be run directly using the Python interpreter, using IPython interactively, 
or using Jupyter notebooks. Anaconda users will already have Cython installed. You will also need a functioning
C compiler to be able to use Cython. See the <a href="http://cython.readthedocs.io/en/latest/src/quickstart/install.html">Cython installation guide</a> for more details.</p>
</blockquote>

---

A C compiler (for example *gcc*) is usually already installed.

To install essentials on debian or ubuntu: ```sudo apt-get install build-essential```

To install cython with conda run: ```conda install -c anaconda cython```


---

#### <font color='blue'>Basic C Types</font>
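| Type        | Typical size and description |
| :---        | :--- |
| char | 8-bit integer (signedness is implementation-defined) |
| short | 16-bit signed integer |
| int | 32-bit signed integer |
| long | 32- or 64-bit signed integer (64-bit on 64-bit Linux and macOS, 32-bit on Windows) |
| long long | 64-bit signed integer |
| float | 32-bit floating point |
| double | 64-bit floating point |
| long double | 80-bit extended precision on x86 with gcc or clang (platform dependent) |

These sizes are typical for common platforms; the C standard only guarantees minimum ranges.
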
#### <font color='blue'>Array</font>
type name[size]
#### <font color='blue'>Pointer</font>
type *name
#### <font color='blue'>Structure</font>
struct name { declaration }
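
Because the sizes of the basic C types depend on the platform, it can help to check them on the machine at hand. The sketch below uses the standard library module `ctypes` for this (it is not part of Cython, and the dictionary names are illustrative). It assumes 8-bit bytes and reports storage size, so `long double` may show 128 bits where an 80-bit value is padded.

```python
import ctypes

# Map C type names to their ctypes counterparts
c_types = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
    "long double": ctypes.c_longdouble,
}

# sizeof() returns bytes; multiply by 8 assuming 8-bit bytes
sizes_in_bits = {name: 8 * ctypes.sizeof(t) for name, t in c_types.items()}

for name, bits in sizes_in_bits.items():
    print(f"{name:12s} {bits:3d} bits of storage")
```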

----
### Using the magic %%cython in jupyter

In [1]:
import numpy as np

# sum non-negative integers 

a = 0
g = np.zeros((10, ))

for i in range(10):
    g[i] = a
    a += i
    
print(g)

[ 0.  0.  1.  3.  6. 10. 15. 21. 28. 36.]


In [3]:
%load_ext Cython

The Cython extension is already loaded. To reload it, use:
  %reload_ext Cython


---

Cython code can be compiled using the %%cython cell magic command:

---

In [4]:
%%cython
cdef int a = 0   # C integer accumulator
cdef int g[10]   # C array holding 10 integers
cdef int i       # C integer loop index

for i in range(10):
    g[i] = a
    a += i
    
print(g)

[0, 0, 1, 3, 6, 10, 15, 21, 28, 36]


In [5]:
%%cython --annotate

cdef int a = 0
cdef int g[10]
cdef int i

for i in range(10):
    g[i] = a
    a += i
    
print(g)

[0, 0, 1, 3, 6, 10, 15, 21, 28, 36]


---

- Each line can be expanded to show the generated C code.


- Yellow lines contain calls into the Python virtual machine; the darker the yellow, the more calls.


- White lines are translated to pure C code with no Python interaction.


- Each call into the virtual machine has a cost, but more yellow does not necessarily mean slower code.


- The cost of those calls is only significant when they occur inside large loops, so the goal is to make the lines inside loops as white as possible.

---

In [7]:
%%cython

cdef struct Student: #structure
    unsigned char *name
    unsigned char *lastname
    unsigned char *university_id
    int age
    float gpa
    
cdef Student student

student.name = 'John'
student.lastname = 'Smith'
student.university_id = 'js1234'
student.age = 20
student.gpa = 4.0

print("student:", student)

print("gpa:", student.gpa) 

student: {'name': b'John', 'lastname': b'Smith', 'university_id': b'js1234', 'age': 20, 'gpa': 4.0}
gpa: 4.0


----
## Performance Comparisons
The following pure Python example generates a list of the first kmax prime numbers (kmax is capped at 1000):

In [8]:
# Pure Python code
import time

def primes_with_python(kmax):
    
    kmax = min(1000, kmax) # Cap the request at 1000 primes
    primes = [None] * kmax # Initialize the list to the max number of elements
    
    result = []
    k = 0
    n = 2
    
    while k < kmax:
        
        i = 0
        while i < k and n % primes[i] != 0:
            i = i + 1
            
        if i == k:
            primes[k] = n
            k = k + 1
            result.append(n)
        
        n = n + 1
    return result

t = time.process_time()
x = primes_with_python(1000)
elapsed_time = time.process_time() - t
print(elapsed_time,'s')

0.17693799999999982 s


---

The same code can be compiled by Cython without any change; only the function name differs below.

---

In [9]:
%load_ext Cython

The Cython extension is already loaded. To reload it, use:
  %reload_ext Cython


In [10]:
%%cython --annotate
# Using the magic cython

import time

def primes_with_cython(kmax):
    kmax = min(1000, kmax) # Cap the request at 1000 primes
    primes = [None] * kmax # Initialize the list to the max number of elements
    
    result = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % primes[i] != 0:
            i = i + 1
        
        if i == k:
            primes[k] = n
            k = k + 1
            result.append(n)
        
        n = n + 1
    return result

t = time.process_time()
x = primes_with_cython(1000)
elapsed_time = time.process_time() - t
print(elapsed_time,'s')

0.0833939999999993 s


---

Declaring C types for the loop variables and the `primes` array removes most of the remaining Python overhead:

In [11]:
%%cython 
#--annotate
import time

def primes_ctype(int kmax):
    
    cdef int i, k, n
    cdef int primes[1000]
    
    kmax = min(1000, kmax) # primes has room for only 1000 entries
    
    result = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % primes[i] != 0:
            i = i + 1
            
        if i == k:
            primes[k] = n
            k = k + 1
            result.append(n)
        
        n = n + 1
    return result

t = time.process_time()
x = primes_ctype(1000)
elapsed_time = time.process_time() - t
print(elapsed_time,'s')

0.004426000000000485 s
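
A single `time.process_time()` measurement like the ones above can be noisy. One way to get steadier numbers is `timeit.repeat` from the standard library, keeping the best of several runs. In the sketch below, `first_primes` is a hypothetical stand-in written for this example; in the notebook, `primes_with_python`, `primes_with_cython`, or `primes_ctype` could be timed the same way.

```python
import timeit

def first_primes(kmax):
    """Return the first kmax primes by trial division."""
    primes = []
    n = 2
    while len(primes) < kmax:
        # n is prime if no smaller prime divides it
        if all(n % p != 0 for p in primes):
            primes.append(n)
        n += 1
    return primes

# Run the call 5 times, once per measurement, and keep the fastest run
runs = timeit.repeat(lambda: first_primes(200), number=1, repeat=5)
print(f"best of {len(runs)} runs: {min(runs):.6f} s")
```

In Jupyter, the `%timeit` magic offers the same kind of repeated measurement with less typing.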


----
### Using cython outside jupyter (Compiling with distutils)

See https://cython.readthedocs.io/en/latest/src/quickstart/build.html

- Cython code is normally saved in files ending with .pyx (the x indicates it is different from standard Python code). 


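- A Cython file can be translated to C and compiled into an extension module with a build script based on the **distutils** package.

The **distutils** package was part of the standard library when this lecture was written and was the standard way of building Python packages, including native extension modules. It was removed in Python 3.12; on recent Python versions, import `setup` from **setuptools** instead. The following example configures the build for a Cython file called **my_module.pyx** with the following content: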

```python
def cfunc(int n):
    cdef int a = 0
    cdef int i
    for i in range(n):
        a += i
    return a
```

In [12]:
!ls

2020_spring_lect05_part01.ipynb my_module.pyx
2020_spring_lect05_part02.ipynb setup.py


In [None]:
# .pyx is the file extension for Cython source code (not plain Python)
# my_module.pyx stores the function

In [13]:
!cat my_module.pyx

def cfunc(int n):
    cdef int a = 0
    cdef int i
    for i in range(n):
        a += i
    return a


In [None]:
# setup.py holds the name of the module and tells the build which .pyx file to compile

---

In order to use **distutils** we have to create a **setup.py** script. In our example it can be:

```python
from distutils.core import setup
from Cython.Build import cythonize

setup(
    name = "my_module_app",
    ext_modules = cythonize("my_module.pyx"), 
)
```

---

In [14]:
!cat setup.py

from distutils.core import setup
from Cython.Build import cythonize

setup(
    name = "My module app",
    ext_modules = cythonize('my_module.pyx'),
)


---

Now run the following command in your system’s command shell (the leading `!` runs it from inside Jupyter) and you are done.



In [15]:
!python setup.py build_ext --inplace

# here the flag "inplace" is to: 
# ignore build-lib and put compiled extensions into the source directory 
# alongside your pure Python modules

Compiling my_module.pyx because it changed.
[1/1] Cythonizing my_module.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
running build_ext
building 'my_module' extension
creating build
creating build/temp.macosx-10.9-x86_64-3.7
x86_64-apple-darwin13.4.0-clang -DNDEBUG -fwrapv -O3 -Wall -Wstrict-prototypes -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -D_FORTIFY_SOURCE=2 -mmacosx-version-min=10.9 -I/anaconda3/include/python3.7m -c my_module.c -o build/temp.macosx-10.9-x86_64-3.7/my_module.o
x86_64-apple-darwin13.4.0-clang -bundle -undefined dynamic_lookup -Wl,-pie -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-rpath,/anaconda3/lib -L/anaconda3/lib -flto -Wl,-export_dynamic -Wl,-pie -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-rpath,/anaconda3/lib -L/anaconda3/lib -Wl,-pie -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstac

In [16]:
!ls

2020_spring_lect05_part01.ipynb my_module.cpython-37m-darwin.so
2020_spring_lect05_part02.ipynb my_module.pyx
build                           setup.py
my_module.c


---

The build creates two new files:

- my_module.c (the generated C source)
- my_module.cpython-*.so (the compiled shared library)

The .so library can be treated just like any Python module and imported using the normal import statement:
```python
import my_module
```
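
If the extension has not been built yet, the import fails with `ImportError`. A common pattern, sketched here as one possible approach rather than something the lecture prescribes, is to fall back to a pure Python function with the same behaviour. The fallback mirrors the contents of my_module.pyx as printed by `cat` above.

```python
try:
    # Use the compiled extension when it has been built
    from my_module import cfunc
except ImportError:
    # Hypothetical pure Python fallback: same result, just slower
    def cfunc(n):
        total = 0
        for i in range(n):
            total += i
        return total

print(cfunc(100))
```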

In [17]:
import my_module

s = my_module.cfunc(100)
print("sum of the first 100 natural numbers:", s)

sum of the first 100 natural numbers: 4950


In [18]:
n = 2000
print("sum of the first %d natural numbers: %d" % (n, my_module.cfunc(n)))

sum of the first 2000 natural numbers: 1999000
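
The two results above can be checked against the closed form for 0 + 1 + ... + (n - 1), which is n(n - 1)/2. The helper name `expected_sum` is made up for this check.

```python
def expected_sum(n):
    """Closed form for 0 + 1 + ... + (n - 1)."""
    return n * (n - 1) // 2

for n in (100, 2000):
    # Compare the formula with a direct summation in pure Python
    assert expected_sum(n) == sum(range(n))
    print(n, expected_sum(n))
```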
