
# Writing Python the C way

_1._ Memory access efficiency

Problems in [knapsack_0_original.py](/edit/knapsack/knapsack_0_original.py):

Line 21
```python
left = [item for item in items if item.weight <= K]
```

Line 39

```python
v, idxs = search(left[i + 1:], K - item.weight, best_v, current_v + item.value, current_line + [item.index])
```

Copying and creating lists is too expensive (convenient as it is).

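Solutions:
* Don't pass or create anything that isn't needed
* For things that must be passed, share them and restore them afterwards
* But this requires great care
* np.array and Cython's typed memoryviews (similar to Go's slices)
* C arrays, tracking the bounds yourself
* C++ `vector` in place of `list`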
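
The first three points can be sketched in plain Python. This is a minimal illustration under assumptions, not the code from knapsack_0_original.py: `Item`, `search`, and its parameters are hypothetical names. Instead of slicing `left[i + 1:]` and building `current_line + [item.index]` on every call, the recursion passes a start index and shares a single `chosen` list, which it restores with `pop()` after each recursive call.

```python
from collections import namedtuple

# Hypothetical item type; field names mirror those used in the snippets above.
Item = namedtuple("Item", ["index", "value", "weight"])

def search(items, start, capacity, current_v, chosen, best):
    """Depth-first search over items[start:] without copying any list."""
    if current_v > best["value"]:
        best["value"] = current_v
        best["idxs"] = list(chosen)  # copy only when a better solution is found
    for i in range(start, len(items)):
        item = items[i]
        if item.weight <= capacity:
            chosen.append(item.index)   # shared state: modify in place
            search(items, i + 1, capacity - item.weight,
                   current_v + item.value, chosen, best)
            chosen.pop()                # restore before trying the next item

items = [Item(0, 8, 4), Item(1, 10, 5), Item(2, 15, 8), Item(3, 4, 3)]
best = {"value": 0, "idxs": []}
search(items, 0, 11, 0, [], best)
print(best)
```

The "great care" part is visible here: forgetting the `pop()`, or storing `chosen` itself instead of `list(chosen)`, would silently corrupt the result.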

The standard example, from http://docs.cython.org/src/userguide/memoryviews.html

In [4]:
%%cython
# cython: boundscheck=False
from cython.view cimport array as cvarray
import numpy as np

# Memoryview on a NumPy array
narr = np.arange(27, dtype=np.dtype("i")).reshape((3, 3, 3))
cdef int [:, :, :] narr_view = narr

# Memoryview on a C array
cdef int carr[3][3][3]
cdef int [:, :, :] carr_view = carr

# Memoryview on a Cython array
cyarr = cvarray(shape=(3, 3, 3), itemsize=sizeof(int), format="i")
cdef int [:, :, :] cyarr_view = cyarr

# Show the sum of all the arrays before altering it
print "NumPy sum of the NumPy array before assignments:", narr.sum()

# We can copy the values from one memoryview into another using a single
# statement, by either indexing with ... or (NumPy-style) with a colon.
carr_view[...] = narr_view
cyarr_view[:] = narr_view
# NumPy-style syntax for assigning a single value to all elements.
narr_view[:, :, :] = 3

# Just to distinguish the arrays
carr_view[0, 0, 0] = 100
cyarr_view[0, 0, 0] = 1000

# Assigning into the memoryview on the NumPy array alters the latter
print "NumPy sum of NumPy array after assignments:", narr.sum()

# A function using a memoryview does not usually need the GIL
cpdef int sum3d(int[:, :, :] arr) nogil:
    cdef int total = 0
    I = arr.shape[0]
    J = arr.shape[1]
    K = arr.shape[2]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                total += arr[i, j, k]
    return total

# A function accepting a memoryview knows how to use a NumPy array,
# a C array, a Cython array...
print "Memoryview sum of NumPy array is", sum3d(narr)
print "Memoryview sum of C array is", sum3d(carr)
print "Memoryview sum of Cython array is", sum3d(cyarr)
# ... and of course, a memoryview.
print "Memoryview sum of C memoryview is", sum3d(carr_view)

NumPy sum of the NumPy array before assignments: 351
NumPy sum of NumPy array after assignments: 81
Memoryview sum of NumPy array is 81
Memoryview sum of C array is 451
Memoryview sum of Cython array is 1351
Memoryview sum of C memoryview is 451


Memoryviews are used, and thought about, much like np.array.

You can use a memoryview to wrap
* np.array
* C array `cdef int carr[3][3][3]`
* Cython array `from cython.view cimport array`
* Cython converts between them automatically
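
Typed memoryviews are built on the buffer protocol, the same mechanism behind Python's built-in `memoryview`. The following plain-Python sketch (not Cython; the variable names are illustrative) shows the property that matters here: slicing a view creates no copy, so a write through the slice is visible in the underlying buffer, whereas slicing the container itself copies.

```python
from array import array

buf = array("i", range(10))   # contiguous buffer of C ints
view = memoryview(buf)        # wraps the buffer, no copy
tail = view[5:]               # slicing the view makes no copy either

tail[0] = 100                 # write through the slice...
print(buf[5])                 # ...and the underlying buffer changes

copied = buf[5:]              # slicing the array itself does copy
copied[0] = -1
print(buf[5])                 # unaffected by the write to the copy
```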

In [15]:
%%cython
import numpy as np
from array import array
def f(long[:] a):
    print "type of a is", a
f(np.array([1,2,3]))
cdef long[:] b =np.array([1,2,3])
print "type of b is", b

type of a is <MemoryView of 'ndarray' object>
type of b is <MemoryView of 'ndarray' object>


The GIL is usually not needed

In [34]:
%%cython  --compile-args=-fopenmp  --link-args=-fopenmp
# cython: infer_types=True, boundscheck=False 
from cython.parallel import prange
from libc.math cimport sin
cpdef double[:] fx3(double[:] a):
    cdef Py_ssize_t i
    cdef int n = a.size, j
    for i in prange(n, nogil=True):
    #for i in range(n):
        for j in range(10000):
            a[i] = sin(a[i]*a[i]+a[i])
    return a

In [37]:
%%time
import numpy as np
a = np.arange(10000, dtype=float)
fx3(a)

CPU times: user 2.91 s, sys: 2 ms, total: 2.91 s
Wall time: 405 ms


### More about prange

In [44]:
%%cython  --compile-args=-fopenmp  --link-args=-fopenmp
# cython: infer_types=True, boundscheck=False 
from cython.parallel import prange
import numpy as np
cdef long fx3(long[:] a):
    cdef Py_ssize_t i
    cdef long n = a.size, j=0
    for i in prange(n, nogil=True):    
        j+=a[i]
    return j
print fx3(np.array(range(100)))

4950
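
In the cell above, the in-place `j += a[i]` inside `prange` makes Cython treat `j` as a reduction variable: each thread accumulates a private partial sum, and the partial sums are combined after the loop. The pure-Python sketch below illustrates only that idea. The `parallel_sum` helper and its chunking scheme are assumptions made for illustration, and Python threads will not give the speedup that OpenMP does.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(values, n_workers=4):
    """Sum `values` by combining per-worker partial sums (a reduction)."""
    n = len(values)
    step = max(1, (n + n_workers - 1) // n_workers)
    # Index ranges instead of slices, so no list is copied.
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]

    def partial_sum(bound):
        lo, hi = bound
        total = 0                  # private accumulator for this worker
        for i in range(lo, hi):
            total += values[i]
        return total

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, bounds))
    return sum(partials)           # combine step after the "loop"

total = parallel_sum(list(range(100)))
print(total)
```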


In [48]:
%%cython  -f --cplus --compile-args=-fopenmp  --link-args=-fopenmp
# cython: infer_types=True, boundscheck=False 
from cython.parallel import prange
from libcpp.vector cimport vector
cdef fx3():
    cdef Py_ssize_t i
    cdef vector[long] v
    for i in prange(100, nogil=True):
        with gil:
            v.push_back(i)
    return v
print fx3()

[88, 52, 13, 39, 53, 0, 40, 26, 76, 41, 14, 64, 1, 42, 54, 89, 43, 15, 77, 90, 27, 2, 91, 55, 78, 56, 16, 65, 28, 57, 44, 17, 45, 18, 66, 92, 19, 3, 67, 46, 4, 58, 47, 29, 5, 59, 79, 93, 6, 7, 8, 9, 10, 11, 12, 48, 94, 68, 20, 60, 95, 61, 49, 96, 69, 21, 70, 22, 71, 50, 97, 62, 51, 98, 63, 99, 72, 73, 74, 75, 23, 24, 25, 80, 81, 82, 30, 31, 32, 33, 83, 34, 84, 35, 36, 37, 85, 38, 86, 87]


# Writing C the Python way

* `list` can be replaced by `vector`
* `dict` can be replaced by `map` (although `map` behaves more like `defaultdict`)
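
The `defaultdict` comparison refers to C++ `std::map::operator[]`, which inserts a value-initialized entry when the key is missing instead of raising an error. A short Python sketch of the difference (variable names are illustrative):

```python
from collections import defaultdict

plain = {}
like_map = defaultdict(int)   # behaves like operator[] on std::map<K, int>

try:
    plain["missing"]          # a plain dict raises on a missing key
    raised = False
except KeyError:
    raised = True

value = like_map["missing"]   # a mere read inserts the default value 0
print(raised, value, "missing" in like_map)
```

Code that relies on `KeyError` to detect absent keys therefore needs an explicit membership check once the container becomes a `map`.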

But how can the two coexist?
* len(list)   $\neq$ vector.size
* list.append $\neq$ vector.push_back

Example: brainfuck4_merge

in bf4_merge.pxd
```cython
from vector_list cimport vector  # .pxd declarations are brought in with cimport
cdef vector[char] cells
cdef vector[char] P
```

in bf4_merge.py
```python
import sys

# Looks just like an ordinary list
cells = [0]*1000
P = [ord(x) for x in open(sys.argv[1], "r").read()]
```
in vector_list.pxd (`list.__len__()` also works in plain Python)
```cython
        void append "push_back"(T&) except +
        size_t __len__ "size"()
```
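
The intent of these renamed declarations appears to be that code restricted to `append` and `len()` reads the same whether the container is a Python `list` or a C++ `vector`. A small plain-Python sketch of that style follows; `move_right` is a hypothetical helper, not a function from bf4_merge.py.

```python
def move_right(cells, ptr):
    """Advance the data pointer, growing the tape when it runs off the end."""
    ptr += 1
    if ptr >= len(cells):   # on a vector this would map to size()
        cells.append(0)     # on a vector this would map to push_back(0)
    return ptr

cells = [0, 0]
ptr = move_right(cells, 1)
print(ptr, len(cells))
```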

What if a function that comes from `cdef` is not available in Python?

Example: brainfuck5
in brainfuck5_improved.pxd
```cython
from libc.stdio cimport putchar
```
in brainfuck5_improved.py
```python
from __future__ import print_function
globals()['putchar'] = lambda x: print(chr(x), end="")
```

Alternatively, `from xxx import putchar` also works.


_2._ Pointers

Unlike C++, Cython makes no distinction between `p->x` and `p.x`, so take extra care in Cython.