<a href="https://colab.research.google.com/github/xiaorui777/CV/blob/master/cython.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# About Cython

## 1. When do you need Cython?
- You want C-like speed from within Python
- You need to call existing C/C++ libraries

## 2. Which projects use Cython?
- SageMath
- Pandas
- SciPy
- scikit-learn
- spaCy: Industrial-strength NLP
- pyFFTW: A pythonic python wrapper around FFTW
- Kivy
- 100 Times Faster Natural Language Processing in Python
- ...

In [0]:
import sys
import cython

import numba
from numba import jit

import numpy as np

In [2]:
numba.__version__,   np.__version__,   cython.__version__,   sys.version

('0.40.1',
 '1.16.4',
 '0.29.10',
 '3.6.8 (default, Jan 14 2019, 11:02:34) \n[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]]')

In [0]:
%load_ext Cython

## 3. Getting started with Cython

### 3.1 Speeding up with static typing

In [0]:
# Pure Python code

def f(x):
    return x ** 2 - x

def integrate_f(a,b,N):
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

In [5]:
t1 = %timeit -n5 -o integrate_f(1,100,10000)

5 loops, best of 3: 2.74 ms per loop
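
As a sanity check on what is being timed, the left Riemann sum computed by integrate_f can be compared with the closed-form integral of x**2 - x. The sketch below repeats f and integrate_f from the cell above so that it runs on its own; exact_integral is a helper name introduced here for illustration and is not part of the original notebook.

```python
def f(x):
    return x ** 2 - x

def integrate_f(a, b, N):
    # Left Riemann sum with N equal-width slices, as in the cell above
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

def exact_integral(a, b):
    # Hypothetical helper: the antiderivative of x**2 - x is x**3/3 - x**2/2
    def F(x):
        return x ** 3 / 3 - x ** 2 / 2
    return F(b) - F(a)

approx = integrate_f(1, 100, 10000)
exact = exact_integral(1, 100)
rel_error = abs(approx - exact) / exact
print(f"approx={approx:.3f}  exact={exact:.3f}  rel_error={rel_error:.2e}")
```

Every accelerated version below should return the same value as this pure Python baseline (up to floating-point rounding), so the same comparison can be reused to confirm that a speedup did not change the result.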


In [6]:
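%%cython -a
# Version 1: compile with Cython (Python code unchanged)

# The -a flag after %%cython produces an annotated listing: yellow lines
# are the ones that still go through the Python C-API, which is slow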

def f_cy1(x):
    return x ** 2 - x

def integrate_f_cy1(a,b,N):
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f_cy1(a + i * dx)
    return s * dx

In [7]:
t2 = %timeit -n5 -o integrate_f_cy1(1,100,10000)

5 loops, best of 3: 1.49 ms per loop


In [0]:
%%cython
# Version 2: compile with Cython (with declared variable types)

def f_cy2(double x):
    return x ** 2 - x

def integrate_f_cy2(double a,double b,int N):
    cdef double s,dx
    cdef int i
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f_cy2(a + i * dx)
    return s * dx

In [9]:
t3 = %timeit -n5 -o integrate_f_cy2(1,100,10000)

5 loops, best of 3: 493 µs per loop


In [0]:
# Show the help text for the %%cython magic
%%cython?

In [11]:
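%%cython -a
# Version 3: compile with Cython (f_cy3 is now a C-level cdef function, so this is no longer valid Python)

# The annotated listing shows far fewer yellow lines, and the code runs faster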

cdef double f_cy3(double x):
    return x ** 2 - x

def integrate_f_cy3(double a,double b,int N):
    cdef double s,dx
    cdef int i
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f_cy3(a + i * dx)
    return s * dx

In [12]:
t4 = %timeit -n5 -o integrate_f_cy3(1,100,10000)

5 loops, best of 3: 14.5 µs per loop


In [0]:
# Speed up with numba's jit decorator (often even faster than the cdef-based Cython version)

@jit(nopython=True, nogil=True)
def f_jit(x):
    return x ** 2 - x

@jit(nopython=True, nogil=True, fastmath=True)
def integrate_f_jit(a,b,N):
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f_jit(a + i * dx)
    return s * dx

In [14]:
t5 = %timeit -n5 -o integrate_f_jit(1,100,10000)

The slowest run took 4402.48 times longer than the fastest. This could mean that an intermediate result is being cached.
5 loops, best of 3: 11.5 µs per loop
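
To put the five measurements side by side, the sketch below turns the best-of-3 times printed above into speedup factors relative to the pure Python version. The numbers are copied from the outputs in this notebook and will differ on other machines; the dictionary labels are chosen here for illustration.

```python
# Best-of-3 timings reported by %timeit in the cells above, in seconds.
# In a live session the same values are available as t1.best ... t5.best,
# because %timeit -o returns a TimeitResult object.
timings = {
    "pure Python": 2.74e-3,
    "Cython, untyped (cy1)": 1.49e-3,
    "Cython, typed arguments (cy2)": 493e-6,
    "Cython, cdef function (cy3)": 14.5e-6,
    "numba jit": 11.5e-6,
}

baseline = timings["pure Python"]
speedups = {name: baseline / t for name, t in timings.items()}

for name, factor in speedups.items():
    print(f"{name:<32} {factor:8.1f}x")
```

The largest jump comes between version 2 and version 3: once f is a cdef function, the call inside the loop no longer goes through Python's calling machinery.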


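## 4. What does Cython do?

- Cython is a compiler: it first translates Python code into optimized C code
- A C compiler such as gcc or clang then compiles that C code into a shared library

Outside Jupyter or Colab, Cython provides cythonize, which automates both steps.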
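
A minimal sketch of such a standalone build, assuming a project with a single hypothetical module named integrate.pyx: the script below only writes the two source files into a temporary directory and prints the build command; it does not run the compiler. The file names and the directory prefix are illustrative choices, while cythonize (from Cython.Build) and the build_ext --inplace command are the usual setuptools-based route.

```python
import tempfile
from pathlib import Path

# Cython source: the version 3 code from section 3.1, saved as a .pyx file
PYX_SOURCE = '''\
cdef double f(double x):
    return x ** 2 - x

def integrate_f(double a, double b, int N):
    cdef double s = 0
    cdef double dx = (b - a) / N
    cdef int i
    for i in range(N):
        s += f(a + i * dx)
    return s * dx
'''

# Build script: cythonize() translates the .pyx file to C, and setuptools
# then invokes the C compiler to produce the shared library
SETUP_SOURCE = '''\
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("integrate.pyx"))
'''

project_dir = Path(tempfile.mkdtemp(prefix="cython_demo_"))
(project_dir / "integrate.pyx").write_text(PYX_SOURCE)
(project_dir / "setup.py").write_text(SETUP_SOURCE)

# Command to run next, in a shell, on a machine with Cython and a C compiler
print(f"cd {project_dir} && python setup.py build_ext --inplace")
```

After the build succeeds, the compiled module can be imported from that directory like any other Python module, for example with from integrate import integrate_f.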

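## 5. Compiler directives

There are two ways to set them:
- Put them directly at the top of the .pyx source file
- Use decorators

For details, see the compiler directives section of the official Cython documentation.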
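
The two styles can be sketched in one small file, assuming Cython is installed. The function total is a hypothetical example, and boundscheck and wraparound are just two of the available directives. The header comment applies to the whole file when it is compiled; the decorator applies to one function only. Run as plain, uncompiled Python, the comment is ignored and the decorator leaves the function unchanged, so the file still works.

```python
# cython: boundscheck=False
# (the header comment above is style 1: a file-wide directive)
import cython


@cython.wraparound(False)  # style 2: a per-function directive via a decorator
def total(values):
    # Sum a sequence by index; indexing is where these two directives matter
    s = 0.0
    for i in range(len(values)):
        s += values[i]
    return s


print(total([1.0, 2.0, 3.5]))
```

Turning off bounds checking and negative-index wraparound removes safety checks, so these directives are best limited to code whose indices are known to be valid.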