### Cython

- Cython is mainly used to build compiled extension packages for Python
- Its syntax resembles Python with C-style type declarations added
- Variables should be declared with their C types to get C-level speed; untyped variables remain ordinary Python objects
- Code is written in files with the `.pyx` extension
- It interoperates fully with C
- It can also be used quickly in notebooks through the `%%cython` magic
- On Windows, a C compiler such as MSVC must be downloaded and installed first

In [1]:
import numpy as np

In [2]:
%load_ext cython

### Syntax

#### Variables

Variables in Cython are declared with the `cdef` keyword, which also serves other purposes, such as defining C-level functions.

In [3]:
%%cython

cdef int a
cdef unsigned int b
cdef const unsigned int c
cdef const unsigned long long d
cdef long e
cdef long long f
cdef float g
cdef double h
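
The width of several of these C types depends on the platform. For example, `long` is usually 4 bytes on 64-bit Windows and 8 bytes on 64-bit Linux and macOS. As a quick check that needs no compilation, the sketch below uses the standard-library `ctypes` module to print the sizes on the current machine; the dictionary name `c_type_sizes` is only illustrative.

```python
import ctypes

# Sizes (in bytes) of the C types behind the cdef declarations above,
# as reported by ctypes for the current platform and compiler ABI.
c_type_sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "unsigned int": ctypes.sizeof(ctypes.c_uint),
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
    "unsigned long long": ctypes.sizeof(ctypes.c_ulonglong),
    "float": ctypes.sizeof(ctypes.c_float),
    "double": ctypes.sizeof(ctypes.c_double),
}

for name, size in c_type_sizes.items():
    print(f"{name:>18}: {size} bytes")
```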

#### Arrays

To work with arrays, Cython uses a special structure known as a typed memoryview, which gives efficient access to the memory buffer of the object assigned to it.
- Similar to how NumPy exposes its own buffer

In [4]:
%%cython
import numpy as np
cimport numpy as np

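# np.intc matches C `int`, so the views below work whatever NumPy's default integer size is
num1 = np.arange(100, dtype=np.intc)
num2 = np.arange(100, dtype=np.intc).reshape(10, 10)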

cdef int[:] num1_view = num1 
cdef int[:, :] num2_view = num2

print('num1 preview:', num1[0:10])
print('num1 shape:', num1.shape)
print('num1 view shape:', num1_view.shape, end='\n'*2)

num1_view[2:6] = 0

print('num1: ', num1[0:10], end='\n'*2)

# to use NumPy attributes and methods, convert the view with np.asarray()
print('Memview:', num1_view)
print('Memview as array:', np.asarray(num1_view)[0:10])

num1 preview: [0 1 2 3 4 5 6 7 8 9]
num1 shape: (100,)
num1 view shape: [100, 0, 0, 0, 0, 0, 0, 0]

num1:  [0 1 0 0 0 0 6 7 8 9]

Memview: <MemoryView of 'ndarray' object>
Memview as array: [0 1 0 0 0 0 6 7 8 9]
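
The buffer sharing seen in the output above (writing through the view changes `num1`) can also be observed without compiling anything, using Python's built-in `memoryview` over an `array.array`. This is only an analogy for how a typed memoryview aliases the underlying data, not Cython's own implementation; the names `data` and `view` are illustrative.

```python
from array import array

# array('i', ...) stores C ints and exposes the buffer protocol
data = array('i', range(10))

# The memoryview does not copy; it points at the same memory as `data`
view = memoryview(data)

# Writing through the view modifies the original array
for k in range(2, 6):
    view[k] = 0

print(data.tolist())
print(view.shape)
```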


#### Funciones

Functions can be defined using:
```python
def <func_name> # Regular Python function, callable from Python

cdef <output_type> <func_name> # C-level function, callable only from Cython / C / C++ code

cpdef <output_type> <func_name> # Generates both a C function and a Python wrapper
```

#### Example

In [5]:
%%cython

import numpy as np
cimport numpy as np

def unit_vector_cy(double[:] arr):
    
    cdef int sz = arr.shape[0]
    cdef double[:] uvec_view = np.zeros(sz, dtype=np.float64)
    cdef double norm = 0.0
    cdef int i
    
    for i in range(sz):
        norm += arr[i] ** 2
    norm = norm ** 0.5
    
    for j in range(sz):
        uvec_view[j] = arr[j] / norm
        
    return np.asarray(uvec_view)

In [6]:
vec = np.random.rand(20_000_000)

In [7]:
%%timeit
_ = unit_vector_cy(vec)

481 ms ± 2.13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
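
When checking that a compiled function such as `unit_vector_cy` returns the right values, it can help to have a slow but obviously correct reference. The sketch below is a pure-Python version of the same two-pass algorithm (sum of squares, then division by the norm). The name `unit_vector_py` is hypothetical, and it is meant for small test inputs rather than benchmarking.

```python
import math

def unit_vector_py(values):
    """Return values / ||values|| as a list (pure-Python reference)."""
    # First pass: accumulate the squared norm
    norm_sq = 0.0
    for x in values:
        norm_sq += x * x
    norm = math.sqrt(norm_sq)

    # Second pass: scale every component by the norm
    return [x / norm for x in values]

result = unit_vector_py([3.0, 4.0])
print(result)
```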


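#### What Is Happening

The `--annotate` flag renders an HTML report in which the lines that interact with the Python interpreter are highlighted.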

In [8]:
%%cython --annotate

import numpy as np
cimport numpy as np

def unit_vector_cy(double[:] arr):
    
    cdef int sz = arr.shape[0]
    cdef double[:] uvec_view = np.zeros(sz, dtype=np.float64)
    cdef double norm = 0.0
    cdef int i
    
    for i in range(sz):
        norm += arr[i] ** 2
    norm = norm ** 0.5
    
    for j in range(sz):
        uvec_view[j] = arr[j] / norm
        
    return np.asarray(uvec_view)

### Improvements

In [9]:
%%cython --annotate

import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False) 
@cython.cdivision(True)
def unit_vector_cy2(double[:] arr):
    
    cdef int sz = arr.shape[0]
    cdef double[:] uvec_view = np.zeros(sz, dtype=np.float64)
    cdef double norm = 0.0
    cdef int i
    
    for i in range(sz):
        norm += arr[i] ** 2
    norm = norm ** 0.5
    
    for j in range(sz):
        uvec_view[j] = arr[j] / norm
        
    return np.asarray(uvec_view)

In [10]:
%%timeit
_ = unit_vector_cy2(vec)

476 ms ± 4.01 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Threading

Like `Numba`, Cython can use OpenMP to run loops on multiple threads. The `/openmp` flag used below is specific to MSVC; GCC and Clang use `-fopenmp` instead.

In [11]:
%%cython --compile-args=/openmp --link-args=/openmp

import numpy as np
cimport numpy as np
cimport cython
from cython.parallel cimport prange

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(True)
def unit_vector_cyp(double[:] arr):
    
    cdef:
        int sz = arr.shape[0]
        double[:] uvec_view = np.zeros(sz, dtype=np.float64)
        double norm = 0.0
        int i
    
    for i in prange(sz, nogil=True, num_threads=16, schedule='static'): 
        norm += arr[i] ** 2
    norm = norm ** 0.5

    for j in range(sz):
        uvec_view[j] += arr[j] / norm
        
    return np.asarray(uvec_view)

In [12]:
%%timeit
_ = unit_vector_cyp(vec)

96.9 ms ± 4.94 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [13]:
unit_vector_cyp(vec)[0:10]

array([6.43039450e-05, 7.32115466e-05, 1.88822667e-04, 1.61936004e-04,
       1.52432401e-04, 2.49632170e-04, 5.57852941e-05, 9.92107586e-05,
       3.70230548e-04, 2.91428035e-04])
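
Conceptually, the `prange` loop with `norm += ...` is a reduction: each thread accumulates a private partial sum over its share of the indices, and the partial sums are combined at the end. The sketch below illustrates that idea in plain Python with `concurrent.futures`. It is not how Cython or OpenMP implement it, the name `chunked_sum_of_squares` is hypothetical, and it gives no speed-up in standard CPython because of the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked_sum_of_squares(values, num_chunks=4):
    """Sum x*x over values by combining per-chunk partial sums."""
    n = len(values)
    # Ceiling division so every element lands in exactly one chunk
    step = max(1, -(-n // num_chunks))
    chunks = [values[start:start + step] for start in range(0, n, step)]

    def partial_sum(chunk):
        # Each worker keeps its own accumulator (the "private" copy)
        total = 0.0
        for x in chunk:
            total += x * x
        return total

    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        # Combine the partial results once all workers are done
        return sum(pool.map(partial_sum, chunks))

print(chunked_sum_of_squares([1.0, 2.0, 3.0, 4.0, 5.0], num_chunks=2))
```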

### C Libraries

In [14]:
%%cython

cdef extern from "math.h":
    double sqrt(double x)

def c_sqrt(double x):
    return sqrt(x)

In [15]:
%%timeit
c_sqrt(5200)

36.5 ns ± 0.283 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [16]:
import math

In [17]:
%%timeit
math.sqrt(5200)

98.2 ns ± 0.367 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [18]:
%%timeit
np.sqrt(5200)

881 ns ± 3.33 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


### Compilar Código

- Cython code is compiled using a `setup.py` file
- This generates `.c` and `.pyd` files on Windows, and `.c` and `.so` files on Linux and macOS

```python
from setuptools import setup
from Cython.Build import cythonize
import numpy

setup(
    name='conv_cy',
    ext_modules=cythonize("conv_cy.pyx"),
    zip_safe=False,
    include_dirs=[numpy.get_include()],
)
```

---
```bash
python setup.py build_ext --inplace
```
---

Or, including libraries such as OpenMP:

```python
from setuptools import setup, Extension
from Cython.Build import cythonize
import numpy

ext_modules = [
    Extension(
        "conv_cyp",
        ["conv_cyp.pyx"],
        # '/openmp' is the MSVC flag; GCC and Clang use '-fopenmp'
        extra_compile_args=['/openmp'],
        extra_link_args=['/openmp'],
    )
]
setup(
    name='conv_cyp',
    ext_modules=cythonize(ext_modules),
    zip_safe=False,
    include_dirs=[numpy.get_include()],
)
```
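
Because the OpenMP flag differs between compilers, a `setup.py` meant to run on more than one platform could pick the flags at build time. The helper below is a minimal sketch under stated assumptions: `openmp_flags` is a hypothetical name, it assumes MSVC on Windows and a GCC-compatible compiler elsewhere, and it does not handle Apple's Clang on macOS, which typically needs a separately installed OpenMP runtime.

```python
import sys

def openmp_flags(platform=sys.platform):
    """Return (compile_args, link_args) for OpenMP on the given platform."""
    if platform.startswith("win"):
        # Assumption: the MSVC toolchain is used on Windows
        return ["/openmp"], ["/openmp"]
    # Assumption: a GCC-compatible compiler is used everywhere else
    return ["-fopenmp"], ["-fopenmp"]

compile_args, link_args = openmp_flags()
print(compile_args, link_args)
```

The two lists could then be passed as `extra_compile_args` and `extra_link_args` to `Extension`.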

___

### Overflow

In [19]:
%%cython

cpdef int sum_one(int var):
    return var + 1

In [20]:
int32 = 2 ** 32 // 2 - 1
int64 = 2 ** 64 // 2 - 1

In [21]:
sum_one(int32)

-2147483648

In [22]:
%%cython

cpdef unsigned long long sum_one2(unsigned long long var):
    return var + 1

In [23]:
print(sum_one2(int32))
print(sum_one2(int64))

2147483648
9223372036854775808
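
The wrap-around seen with `sum_one` comes from fixed-width C integers: adding 1 to the largest 32-bit signed value does not fit, and on this build the result wraps to the most negative value (formally, signed overflow is undefined behavior in C). The sketch below imitates that with the standard-library `ctypes` module, which keeps only the low bits and performs no overflow checking. The helper names `wrap_int32` and `wrap_uint64` are hypothetical.

```python
import ctypes

int32_max = 2 ** 31 - 1
int64_max = 2 ** 63 - 1

def wrap_int32(value):
    """Interpret the low 32 bits of value as a signed C integer."""
    return ctypes.c_int32(value).value

def wrap_uint64(value):
    """Interpret the low 64 bits of value as an unsigned C integer."""
    return ctypes.c_uint64(value).value

print(wrap_int32(int32_max + 1))   # wraps to the most negative 32-bit value
print(wrap_uint64(int64_max + 1))  # still fits in an unsigned 64-bit integer
print(wrap_uint64(2 ** 64))        # one past the unsigned 64-bit maximum wraps to zero
```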
