# Cython Modules


## Cython Basics

**Turning source code into an extension module**: earlier we compiled C programs by hand, whereas Cython simplifies that process.

In [1]:
# Fibonacci sequence
def fib(n):
    a,b = 1,1
    for i in range(n):
        a,b = a+b,a
    # return only after the loop has finished
    return a

C version:
```c
int fib(int n)
{
   int tmp, i, a = 1, b = 1;
   for (i = 0; i<n; i++){
      tmp = a;a += b;b = tmp;
   }
   return a;
}
```

**Cython version: here `cdef` declares the types of the C variables.**

```Cython
def fib(int n):
    cdef int i,a,b
    a,b = 1,1
    for i in range(n):
        a,b = a+b,a
    return a
```

The advantage of [`Cython`](http://www.cython.org) is that we write `Python` syntax yet get `C/C++` efficiency, we avoid the hassle of compiling an extension module by hand, and we get native **`NumPy`** support.

It has two main uses:

* **Converting a `Python` program into a `C` program**
* Wrapping C/C++ programs

## Converting source code into an extension module

### Using the Cython command in IPython
Load the `Cython magic` command:

In [2]:
!pip install Cython



In [3]:
%load_ext Cython

In [4]:
%%cython
def cyfib(int n):
    cdef int i, a, b
    a,b = 1,1
    for i in range(n):
        a,b = a+b, a
    return a

In [5]:
cyfib(10)

144

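Note that on Windows this step commonly fails with `Unable to find vcvarsall.bat`; see this blog post for a way to fix it:
https://my.oschina.net/u/1024349/blog/120375

Once that is fixed, the earlier problem of the gcc command not being recognized goes away too. It mostly comes down to environment variables: either the environment is not configured, or Python itself cannot locate the C/C++ build environment.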
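
A quick way to check whether a C compiler is visible from the Python process is to look it up on `PATH`. The sketch below uses only the standard library; the helper name `find_compilers` and the candidate list are illustrative assumptions, not part of Cython or distutils, and finding an executable does not guarantee that the build will actually select it.

```python
import shutil

def find_compilers(candidates=("gcc", "cc", "clang", "cl")):
    """Map each candidate compiler name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in candidates}

if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name:>6}: {path or 'not found on PATH'}")
```

If every entry reports "not found", the `PATH` seen by Python is the first thing to inspect.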

In [6]:
!gcc -v

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=d:/mingw/bin/../libexec/gcc/mingw32/6.3.0/lto-wrapper.exe
Target: mingw32
Configured with: ../src/gcc-6.3.0/configure --build=x86_64-pc-linux-gnu --host=mingw32 --target=mingw32 --with-gmp=/mingw --with-mpfr --with-mpc=/mingw --with-isl=/mingw --prefix=/mingw --disable-win32-registry --with-arch=i586 --with-tune=generic --enable-languages=c,c++,objc,obj-c++,fortran,ada --with-pkgversion='MinGW.org GCC-6.3.0-1' --enable-static --enable-shared --enable-threads --with-dwarf2 --disable-sjlj-exceptions --enable-version-specific-runtime-libs --with-libiconv-prefix=/mingw --with-libintl-prefix=/mingw --enable-libstdcxx-debug --enable-libgomp --disable-libvtv --enable-nls
Thread model: win32
gcc version 6.3.0 (MinGW.org GCC-6.3.0-1) 


### Compiling Cython with distutils

`Cython` source files end in `.pyx`. They are first translated into a `.c` file by `cython`, and that file is then compiled by `gcc` into a `.so` file (`.pyd` on Windows).
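
The exact file extension of the compiled module depends on the platform and the interpreter. As a small illustration (standard library only; the variable names are arbitrary), the snippet below lists the suffixes that the running interpreter accepts for extension modules. On some older builds `EXT_SUFFIX` may be unset, so its value is printed rather than assumed.

```python
import importlib.machinery
import sysconfig

# Suffixes the import system recognizes for compiled extension modules,
# e.g. names ending in ".so" on Linux/macOS or ".pyd" on Windows.
suffixes = importlib.machinery.EXTENSION_SUFFIXES

# The suffix build tools normally give to newly built extensions.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

print("recognized suffixes:", suffixes)
print("default build suffix:", ext_suffix)
```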

`distutils` is used to compile the C file into a `python` module; see the notes: [Python extension modules](02-python-extension-modules.ipynb)

In [7]:
%%file fib.pyx
def cyfib(int n):
    cdef int i, a, b
    a,b = 1,1
    for i in range(n):
        a,b = a+b, a
    return a

Writing fib.pyx


In [8]:
%%file setup.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext = Extension("fib", sources=["fib.pyx"])
setup(ext_modules=[ext], cmdclass={'build_ext': build_ext})

Overwriting setup.py


Build it with `setup.py`:

In [9]:
!python setup.py build_ext --inplace

running build_ext
cythoning fib.pyx to fib.c
building 'fib' extension
D:\MinGW\bin\gcc.exe -mdll -O -Wall -ID:\Python\include -ID:\Python\PC -c fib.c -o build\temp.win32-2.7\Release\fib.o
writing build\temp.win32-2.7\Release\fib.def
D:\MinGW\bin\gcc.exe -shared -s build\temp.win32-2.7\Release\fib.o build\temp.win32-2.7\Release\fib.def -LD:\Python\libs -LD:\Python\PCbuild -LD:\Python\PC\VS9.0 -lpython27 -lmsvcr90 -o E:\Project_Sources\notebook\other-languages\fib.pyd


In [10]:
import fib
fib.cyfib(10)

144

In [11]:
import zipfile
f = zipfile.ZipFile('fib.zip','w',zipfile.ZIP_DEFLATED)
names = 'fib.pyd fib.pyx fib.c setup.py'.split()
for name in names:
    f.write(name)
f.close()

In [12]:
!rm -f fib.pyd

In [13]:
!rm -f fib.pyc
!rm -f fib.c

## Using pyximport

Clear the previously imported modules:

In [14]:
%reset -f

In [15]:
import pyximport
# install() detects changes to Cython source files and compiles them
# automatically on import; it is generally meant for simple files.
pyximport.install()

import fib

fib.cyfib(10)

144

In [16]:
!rm -f setup*.*
!rm -f fib.pyx
!rm -rf build

## Cython syntax

### The cdef keyword

`cdef` declares `C`-typed variables:
```cython
# declare local variables
def fib(int n):
    cdef int a,b,i
    ...

# declare a function's return type
cdef float distance(float *x, float *y, int n):
    cdef:
        int i
        float d = 0.0
    for i in range(n):
        d += (x[i] - y[i]) ** 2
    return d


# declare an extension type with C-level attributes
cdef class Particle(object):
    cdef float psn[3], vel[3]
    cdef int id
```

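### def, cdef, and cpdef functions

`Cython` has three ways of defining a function: `def`, `cdef`, and `cpdef`:

* `def` - callable from both `Python` and `Cython`
* `cdef` - faster, callable only from `Cython`, may use pointers
* `cpdef` - callable from both `Python` and `Cython`, cannot use pointers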

### cimport

In [17]:
from math import sin as pysin
from numpy import sin as npsin

In [18]:
%reload_ext Cython

**Modules can be imported from the standard `C` library**; `cimport` can only be used inside `Cython`.

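### cimport and pxd files

To reuse `Cython` code across several files, declare the functions in a `.pxd` file (the counterpart of a `.h` header) that accompanies a `.pyx` file (the counterpart of a `.c` source file), then bring them into other files with `cimport`:

* Given that `fib.pxd` and `fib.pyx` exist: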

```cython
from fib cimport fib
```
* The `C++` standard library and the `NumPy C API` can be cimported as well:

```cython
from libcpp.vector cimport vector
cimport numpy as cnp
```

## Calling other C libraries

Call `strlen` from the standard header `string.h`:

In [19]:
%%file len_extern.pyx
cdef extern from "string.h":
    int strlen(char *c)
    
def get_len(char *message):
    return strlen(message)

Writing len_extern.pyx


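**Cython does not scan the included header automatically, so every function you want to use must be declared again inside the `cdef extern` block.**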
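
The same principle, that a foreign function's signature has to be stated explicitly before it can be called safely, also shows up outside Cython. As a point of comparison only (this uses the standard-library `ctypes` module rather than Cython, and the helper `load_libc` is a hypothetical name), here is a minimal sketch that calls the C runtime's `strlen`:

```python
import ctypes
import ctypes.util
import sys

def load_libc():
    """Load the C runtime; the library lookup is platform-dependent."""
    if sys.platform.startswith("win"):
        return ctypes.CDLL("msvcrt")
    # find_library may return None; CDLL(None) then exposes the symbols
    # already loaded into the current process on POSIX systems.
    return ctypes.CDLL(ctypes.util.find_library("c"))

libc = load_libc()
# Declare the signature by hand: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def get_len(message: bytes) -> int:
    return libc.strlen(message)

print(get_len(b"122ddd"))
```

Without the `argtypes` and `restype` lines, `ctypes` falls back to default conversions, which is the runtime analogue of calling an undeclared C function.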

In [20]:
%%file setup_len_extern.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
  ext_modules=[ Extension("len_extern", ["len_extern.pyx"]) ],
  cmdclass = {'build_ext': build_ext}
)

Writing setup_len_extern.py


Build to generate the `.c` and `.pyd` files:

In [21]:
!python setup_len_extern.py build_ext --inplace

running build_ext
cythoning len_extern.pyx to len_extern.c
building 'len_extern' extension
creating build
creating build\temp.win32-2.7
creating build\temp.win32-2.7\Release
D:\MinGW\bin\gcc.exe -mdll -O -Wall -ID:\Python\include -ID:\Python\PC -c len_extern.c -o build\temp.win32-2.7\Release\len_extern.o
writing build\temp.win32-2.7\Release\len_extern.def
D:\MinGW\bin\gcc.exe -shared -s build\temp.win32-2.7\Release\len_extern.o build\temp.win32-2.7\Release\len_extern.def -LD:\Python\libs -LD:\Python\PCbuild -LD:\Python\PC\VS9.0 -lpython27 -lmsvcr90 -o E:\Project_Sources\notebook\other-languages\len_extern.pyd


In [22]:
!ls

01-introduction.ipynb
02-python-extension-modules.ipynb
03-cython.ipynb
03-particle.zip
build
cython_sum.c
cython_sum.pyd
cython_sum.pyx
example.zip
extern.zip
fib.zip
len_extern.c
len_extern.pyd
len_extern.pyx
particle.cpp
particle.pyx
particle_extern.cpp
particle_extern.h
setup_len_extern.py


In [23]:
# import the module
import len_extern

In [24]:
# strlen itself is not exposed to Python
dir(len_extern)

['__builtins__',
 '__doc__',
 '__file__',
 '__name__',
 '__package__',
 '__test__',
 'get_len']

In [25]:
!cat len_extern.pyx

cdef extern from "string.h":
    int strlen(char *c)
    
def get_len(char *message):
    return strlen(message)


In [26]:
len_extern.get_len('122ddd')

6

**Besides calling existing C functions, we can also access and modify existing C structs:**

In [27]:
%%file time_extern.pyx
cdef extern from "time.h":

    struct tm:
        int tm_mday
        int tm_mon
        int tm_year

    ctypedef long time_t
    tm* localtime(time_t *timer)
    time_t time(time_t *tloc)

def get_date():
    """Return a tuple with the current day, month and year."""
    cdef time_t t
    cdef tm* ts
    t = time(NULL)
    ts = localtime(&t)
    return ts.tm_mday, ts.tm_mon + 1, ts.tm_year + 1900

Writing time_extern.pyx


In [28]:
%%file setup_time_extern.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
  ext_modules=[ Extension("time_extern", ["time_extern.pyx"]) ],
  cmdclass = {'build_ext': build_ext}
)

Writing setup_time_extern.py


In [29]:
!python setup_time_extern.py build_ext --inplace

running build_ext
cythoning time_extern.pyx to time_extern.c
building 'time_extern' extension
D:\MinGW\bin\gcc.exe -mdll -O -Wall -ID:\Python\include -ID:\Python\PC -c time_extern.c -o build\temp.win32-2.7\Release\time_extern.o
writing build\temp.win32-2.7\Release\time_extern.def
D:\MinGW\bin\gcc.exe -shared -s build\temp.win32-2.7\Release\time_extern.o build\temp.win32-2.7\Release\time_extern.def -LD:\Python\libs -LD:\Python\PCbuild -LD:\Python\PC\VS9.0 -lpython27 -lmsvcr90 -o E:\Project_Sources\notebook\other-languages\time_extern.pyd


In [30]:
import time_extern
time_extern.get_date()

(23, 7, 2018)
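
The `+ 1` and `+ 1900` in `get_date` are needed because C's `struct tm` counts months from 0 and years from 1900. Python's own `time.localtime()` already returns normalized values, so it can serve as a cross-check. The sketch below is plain Python, and the function name `get_date_py` is a hypothetical counterpart, not part of the module built above.

```python
import time

def get_date_py():
    """Return (day, month, year) for the current local time."""
    ts = time.localtime()
    # Unlike C's struct tm, tm_mon is already 1-12 and tm_year is the full year.
    return ts.tm_mday, ts.tm_mon, ts.tm_year

print(get_date_py())
```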

In [31]:
import zipfile
import os
f = zipfile.ZipFile('extern.zip','w',zipfile.ZIP_DEFLATED)

names = ['setup_len_extern.py',
         'len_extern.pyx',
         'setup_time_extern.py',
         'time_extern.pyx']
for name in names:
    f.write(name)

f.close()

!rm -f setup*.*
!rm -f len_extern.*
!rm -f time_extern.*
!rm -rf build

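## class and cdef class

* A `class` can define attributes freely, whereas a `cdef class` can declare `cdef` attributes.

* A `class` is initialized with `__init__`; a `cdef class` initializes its C-level members in `__cinit__`, which runs before `__init__`.

* Methods of a `cdef class` may be `def`, `cdef`, or `cpdef`. Only attributes declared `public` can be accessed from Python, and new attributes cannot be added.

* `__dealloc__` acts like a destructor and is responsible for freeing allocated memory.

`Cython` attributes can be defined with the `property` keyword, followed by `__get__` and `__set__` methods for reading and writing the value: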
```cython
property name:
    def __get__(self):
        return something
    def __set__(self, value):
        set_something(value)
```
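
Recent Cython releases document the `property name:` block as a legacy form and also accept the ordinary Python `@property` decorator inside a `cdef class`. The plain-Python sketch below shows that decorator form; the class `Particle`, its `_mass` attribute, and the non-negativity check are illustrative stand-ins, not the extension type built later in this notebook.

```python
class Particle:
    """Plain-Python stand-in used only to show the @property pattern."""

    def __init__(self, mass):
        self._mass = float(mass)

    @property
    def mass(self):
        # getter: plays the role of __get__ in the legacy block
        return self._mass

    @mass.setter
    def mass(self, value):
        # setter: plays the role of __set__; note that it receives the new value
        if value < 0:
            raise ValueError("mass must be non-negative")
        self._mass = float(value)

p = Particle(1.5)
p.mass = 2.0
print(p.mass)
```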

## Using C++ classes

When wrapping a C++ class, declare it with the `cppclass` keyword, and pass the `language="c++"` option in setup when building.

Write a C++ class:

In [32]:
%%file particle_extern.h
#ifndef _PARTICLE_EXTERN_H_
#define _PARTICLE_EXTERN_H_

class Particle {

    public:

        Particle() :
            mass(0), charge(0) {}
        
        Particle(float m, float c, float *p, float *v);

        ~Particle() {}

        float getMass() {return mass; }

        void setMass(float m) { mass = m; }

        float getCharge() { return charge; }

        const float *getVel() { return vel; }
        const float *getPos() { return pos; }

        void applyImpulse(float *f, float t);

    private:
        float mass, charge;
        float pos[3], vel[3];
};

#endif

Overwriting particle_extern.h


In [33]:
%%file particle_extern.cpp
#include "particle_extern.h"

Particle::Particle(float m, float c, float *p, float *v) :
    mass(m), charge(c) 
{
    for (int i=0; i<3; ++i) {
        pos[i] = p[i]; vel[i] = v[i];
    }
}

void Particle::applyImpulse(float *f, float t)
{
    float newvi;
    for(int i=0; i<3; ++i) {
        newvi = vel[i] + t / mass * f[i];
        pos[i] = (newvi + vel[i]) * t / 2.;
        vel[i] = newvi;
    }
}
Overwriting particle_extern.cpp


In [34]:
%%file particle.pyx
import numpy as np

# First declare the class from the header file:
cdef extern from "particle_extern.h":
# The cppclass keyword is required here; for convenience,
# the Particle class is renamed to _Particle on the Cython side.
    cppclass _Particle "Particle":
        _Particle(float m, float c, float *p, float *v)
        float getMass()
        void setMass(float m)
        float getCharge()
        const float *getVel()
        const float *getPos()
        void applyImpulse(float *f, float t)


cdef class Particle:
    cdef _Particle *thisptr # ptr to C++ instance

    def __cinit__(self, m, c, float[::1] p, float[::1] v):
        if p.shape[0] != 3 or v.shape[0] != 3:
            raise ValueError("...")
        self.thisptr = new _Particle(m, c, &p[0], &v[0])

    def __dealloc__(self):
        del self.thisptr

    def apply_impulse(self, float[::1] v, float t):
        self.thisptr.applyImpulse(&v[0], t)

    def __repr__(self):
        args = ', '.join('%s=%s' % (n, getattr(self, n)) for n in ('mass', 'charge', 'pos', 'vel'))
        return 'particle.Particle(%s)' % args

    property charge:

        def __get__(self):
            return self.thisptr.getCharge()

    property mass:  # Cython-style properties.
        def __get__(self):
            return self.thisptr.getMass()

        def __set__(self, m):
            self.thisptr.setMass(m)

    property vel:

        def __get__(self):
            cdef const float *_vel = self.thisptr.getVel()
            cdef float[::1] arr = np.empty((3,), dtype=np.float32)
            for i in range(3):
                arr[i] = _vel[i]
            return np.asarray(arr)

    property pos:

        def __get__(self):
            cdef const float *_pos = self.thisptr.getPos()
            cdef float[::1] arr = np.empty((3,), dtype=np.float32)
            for i in range(3):
                arr[i] = _pos[i]
            return np.asarray(arr)

Overwriting particle.pyx


**To use this class, we declare a pointer to it and then point that pointer at a `_Particle` object.**

In [35]:
%%file setup.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext = Extension("particle", ["particle.pyx", "particle_extern.cpp"], language="c++")

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [ext],
)

Writing setup.py


For comparison, the earlier setup was written like this:
```
setup(
  ext_modules=[ Extension("len_extern", ["len_extern.pyx"]) ],
  cmdclass = {'build_ext': build_ext}
)
```

In [36]:
# !python setup.py build_ext -i

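Awkward! The build is left commented out because I don't have time to sort it out right now; I'll come back to it when I do. See the notes: [Cython应用手记](http://gashero.iteye.com/blog/649516)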

In [37]:
import zipfile

f = zipfile.ZipFile('03-particle.zip','w',zipfile.ZIP_DEFLATED)

names = ['particle.pyx',
         'particle_extern.cpp',
         'particle_extern.h',
         'setup.py']
for name in names:
    f.write(name)

f.close()

!rm -f setup*.*
!rm -f particle*.*
!rm -rf build

## Typed memoryviews

Here `double[::1]` is a typed `memoryview` declaration. It is about as efficient as a `NumPy` array, can be bound to a `C` array or a `NumPy` array, and can be sliced the way `NumPy` arrays are:

In [40]:
%%file cython_sum.pyx
def cython_sum(double[::1] a):
    cdef double s = 0.0
    cdef int i, n = a.shape[0]
    for i in range(n):
        s += a[i]
    return s

Overwriting cython_sum.pyx
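
Typed memoryviews are built on Python's buffer protocol, the same mechanism behind the built-in `memoryview` type. As a rough, pure-Python illustration (no Cython involved; `array.array` stands in for a NumPy array), the sketch below shows zero-copy slicing and the contiguity property that the `::1` in `double[::1]` requires:

```python
from array import array

data = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
view = memoryview(data)            # no copy: shares the array's buffer

print(view.format, view.itemsize)  # element type code and size in bytes
print(view.c_contiguous)           # a double[::1] argument needs this to hold

strided = view[::2]                # NumPy-like slicing, still no copy
print(strided.tolist())
print(strided.c_contiguous)        # a strided view is not contiguous

view[1] = 10.0                     # writing through the view changes the array
print(data[1])
```

By the same reasoning, passing a strided NumPy slice such as `a[::2]` to `cython_sum` should be rejected, since it does not satisfy the contiguous `double[::1]` declaration.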


In [38]:
%%file setup.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext = Extension("cython_sum", ["cython_sum.pyx"])

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [ext],
)

Writing setup.py


Because running the build inside IPython takes a really long time, I simply compiled it from cmd instead:
```
python setup.py build_ext -i
```
![](https://raw.githubusercontent.com/ds19991999/githubimg/master/picgo/20180723230712.png)

In [39]:
from cython_sum import cython_sum
from numpy import *

In [42]:
a = arange(1e6)

In [43]:
cython_sum(a)

499999500000.0

In [44]:
a.sum()

499999500000.0

Performance comparison:

In [46]:
%timeit cython_sum(a)

1000 loops, best of 3: 1.34 ms per loop


In [47]:
%timeit a.sum()

100 loops, best of 3: 1.61 ms per loop
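
Outside IPython, the standard-library `timeit` module gives comparable measurements. The sketch below is only a stand-in: it compares a hand-written Python loop against the built-in `sum` on a plain list rather than the Cython and NumPy versions above, and the names `loop_sum`, `values`, and the repeat counts are arbitrary choices.

```python
import timeit

def loop_sum(values):
    """Sum with an explicit Python-level loop, like cython_sum but untyped."""
    s = 0.0
    for v in values:
        s += v
    return s

values = [float(i) for i in range(100_000)]

# Both approaches must agree before their timings are worth comparing.
assert loop_sum(values) == sum(values)

t_loop = min(timeit.repeat(lambda: loop_sum(values), number=5, repeat=3))
t_builtin = min(timeit.repeat(lambda: sum(values), number=5, repeat=3))
print(f"loop: {t_loop / 5 * 1e3:.3f} ms per call")
print(f"sum : {t_builtin / 5 * 1e3:.3f} ms per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that `%timeit` reports.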


In [48]:
import zipfile

f = zipfile.ZipFile('03-cython-sum.zip','w',zipfile.ZIP_DEFLATED)

names = ['cython_sum.pyx',
         'setup.py']
for name in names:
    f.write(name)

f.close()

!rm -f setup*.*
!rm -f cython_sum*.*
!rm -rf build