# Cython in Jupyter notebooks

To use Cython in a Jupyter notebook, the `cython` extension must be loaded first.

In [1]:
%load_ext cython

## Pure Python

To illustrate the performance difference between a pure Python function and a Cython implementation, consider a function that computes the list of the first $k_{\rm max}$ prime numbers.

In [2]:
from array import array

In [3]:
def primes(kmax, p=None):
    if p is None:
        p = array('i', [0]*kmax)
    result = []
    k, n = 0, 2
    while k < len(p):
        i = 0
        while i < k and n % p[i] != 0:
            i += 1
        if i == k:
            p[k] = n
            k += 1
            result.append(n)
        n += 1
    return result

Checking the result for the first 20 prime numbers.

In [4]:
primes(20)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

Note that trial division by every smaller prime is not the most efficient way to check whether $n$ is prime.
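
One common refinement, shown here only as a sketch and not used in the rest of this notebook, is to stop the trial division as soon as the square of the candidate divisor exceeds $n$. The function name `primes_sqrt` is hypothetical, and the version below keeps the primes in a plain list rather than in an `array`.

```python
def primes_sqrt(kmax):
    """Return the first kmax primes, stopping trial division at sqrt(n)."""
    result = []
    n = 2
    while len(result) < kmax:
        is_prime = True
        for q in result:
            if q * q > n:
                # no prime divisor up to sqrt(n) was found, so n is prime
                break
            if n % q == 0:
                is_prime = False
                break
        if is_prime:
            result.append(n)
        n += 1
    return result


first_20 = primes_sqrt(20)
print(first_20)
```

The timings in the remainder of the notebook refer to the original `primes` function, not to this variant.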

In [5]:
%timeit primes(1_000)

73.1 ms ± 8.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [6]:
p = array('i', [0]*10_000)
%timeit primes(10_000, p)

7.65 s ± 993 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## Cython

The Cython implementation differs little from the pure Python one: type declarations have been added for the function's argument and for the variables `n`, `k`, `i`, and `p`. Note that a C array declared this way needs a constant size, hence the upper limit of 10,000 on `kmax`.

In [7]:
%%cython
def c_primes(int kmax):
    cdef int n, k, i
    cdef int p[10_000]
    if kmax > 10_000:
        kmax = 10_000
    result = []
    k, n = 0, 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i += 1
        if i == k:
            p[k] = n
            k += 1
            result.append(n)
        n += 1
    return result

Checking the result for the first 20 prime numbers.

In [8]:
c_primes(20)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

In [9]:
%timeit c_primes(1_000)

1.84 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [10]:
%timeit c_primes(10_000)

195 ms ± 15 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


Based on these timings, the Cython implementation is roughly 40 times faster than the pure Python implementation.
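
The speedup factor follows directly from the mean timings reported above. The short calculation below is only a sketch: it copies the means by hand and ignores the quoted standard deviations, and the dictionary names are illustrative.

```python
# Mean timings copied from the %timeit outputs above, in seconds.
python_s = {1_000: 73.1e-3, 10_000: 7.65}
cython_s = {1_000: 1.84e-3, 10_000: 195e-3}

# Ratio of pure Python time to Cython time for each problem size.
speedup = {k: python_s[k] / cython_s[k] for k in python_s}
for k, factor in speedup.items():
    print(f"kmax = {k:>6}: {factor:.1f}x faster")
```

Both problem sizes give a factor just under 40.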

## Dynamic memory allocation

The Cython implementation can be improved by allocating the array `p` dynamically, which removes the fixed upper limit on `kmax`.

In [11]:
%%cython
from libc.stdlib cimport calloc, free

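def c_primes(int kmax):
    cdef int n, k, i
    cdef int *p
    if kmax <= 0:
        return []
    p = <int *> calloc(kmax, sizeof(int))
    if p == NULL:
        # calloc returns NULL when the allocation fails
        raise MemoryError()
    result = []
    k, n = 0, 2
    try:
        while k < kmax:
            i = 0
            while i < k and n % p[i] != 0:
                i += 1
            if i == k:
                p[k] = n
                k += 1
                result.append(n)
            n += 1
    finally:
        # release the buffer even if building the list raises
        free(p)
    return result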

Checking the result for the first 20 prime numbers.

In [12]:
c_primes(20)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

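This has no significant impact on performance: the mean timings below are somewhat higher than for the fixed-size array, but the difference is comparable to the run-to-run spread.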

In [13]:
%timeit c_primes(1_000)

2.29 ms ± 473 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [14]:
%timeit c_primes(10_000)

243 ms ± 32.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
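
To put the two sets of Cython timings side by side, the difference in means can be compared with the combined spread. This is a rough sketch rather than a rigorous statistical test: the numbers are copied by hand from the `%timeit` outputs above, and combining the standard deviations in quadrature assumes the two measurements are independent.

```python
from math import hypot

# (mean, std. dev.) in milliseconds, copied from the %timeit outputs above.
fixed = {1_000: (1.84, 0.105), 10_000: (195.0, 15.0)}
dynamic = {1_000: (2.29, 0.473), 10_000: (243.0, 32.7)}

ratio = {}
for k in fixed:
    diff = dynamic[k][0] - fixed[k][0]
    # combine the two spreads in quadrature, assuming independent runs
    spread = hypot(fixed[k][1], dynamic[k][1])
    ratio[k] = diff / spread
    print(f"kmax = {k:>6}: difference {diff:.2f} ms, combined spread {spread:.2f} ms")
```

For both problem sizes the difference is of the same order as the combined spread, so these runs do not show a clear slowdown.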
