# Accelerating Python

Main reference:
- [*Numerical Python*](https://link.springer.com/book/10.1007/978-1-4842-4246-9), Robert Johansson, Chapter 19: *Code Optimization*

In [None]:
import numpy as np

In [None]:
import matplotlib.pyplot as plt

In [None]:
import numba

[`Numba`](https://numba.pydata.org)
=================================

In [None]:
def py_sum(data):
    """Sum the elements of data (any iterable of numbers) with a plain Python loop."""
    s = 0
    for d in data:
        s += d
    return s

In [None]:
repeat = 50000
data = np.random.randn(repeat)

In [None]:
# Capture the timing result: -o makes %timeit return a TimeitResult object
# (see https://stackoverflow.com/questions/17310752/can-you-capture-the-output-of-ipythons-magic-methods-timeit)
t_py = %timeit -o py_sum(data)

In [None]:
dir(t_py)

In [None]:
t_py.average
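Outside IPython the `%timeit` magic is not available. As a rough sketch of the same measurement, the standard-library `timeit` module can collect several timings and average them; the names `py_sum_demo`, `values`, and the loop counts below are illustrative assumptions, not part of the notebook.

```python
import timeit


def py_sum_demo(values):
    """Plain Python loop, mirroring py_sum above."""
    s = 0
    for v in values:
        s += v
    return s


values = list(range(10_000))
number = 20  # calls per timing run

# Five timing runs; each entry is the total time for `number` calls
runs = timeit.repeat(lambda: py_sum_demo(values), number=number, repeat=5)

per_call = [r / number for r in runs]   # seconds per single call
average = sum(per_call) / len(per_call)  # comparable to TimeitResult.average
best = min(per_call)
print(f"average: {average:.2e} s, best: {best:.2e} s")
```

The average is the closest analogue of `t_py.average`; the best run is often the more stable number when the machine is busy.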

In [None]:
t_np = %timeit -o np.sum(data)

In [None]:
def times(a, b):
    """Print the ratio of the average runtimes of a and b.

    a, b: TimeitResult objects returned by `%timeit -o`.
    """
    t = a.average / b.average
    print("ratio of average runtime: ", round(t))

In [None]:
times(t_py, t_np)

Check that `py_sum` and `np.sum` return **equal** results *within a given tolerance*

In [None]:
tolerence = 1e-10
assert abs(py_sum(data) - np.sum(data)) < tolerence
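The two sums are compared with a tolerance rather than `==` because they add the same numbers in a different order, and floating-point addition is sensitive to order. The sketch below illustrates this on an independent sample (the names `sample`, `loop_total`, and the tolerance values are assumptions chosen for the example) and uses `math.isclose` and `np.isclose`, which combine a relative and an absolute tolerance.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
sample = rng.standard_normal(50_000)

# Left-to-right accumulation, as in py_sum
loop_total = 0.0
for value in sample:
    loop_total += value

numpy_total = float(np.sum(sample))  # NumPy may add the terms in a different order
reference = math.fsum(sample)        # accurately rounded floating-point sum

# The results may differ in the last few bits but agree within tolerance
print(loop_total - numpy_total)
assert math.isclose(loop_total, numpy_total, rel_tol=1e-9, abs_tol=1e-10)
assert np.isclose(loop_total, reference, rtol=1e-9, atol=1e-10)
```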

`numba.jit` Decorator
---------------------
See [PEP 318](https://www.python.org/dev/peps/pep-0318/) (Decorators for Functions and Methods) for details on decorator syntax.
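As a minimal sketch of what the `@` syntax does, the example below defines a hypothetical decorator, `count_calls` (an illustration only, unrelated to Numba), and applies it in two ways: with `@` at definition time, and by calling it on a function that already exists.

```python
import functools


def count_calls(func):
    """Hypothetical decorator: wrap func and count how often it is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    return a + b


def mul(a, b):
    return a * b


# Equivalent to writing @count_calls above the definition of mul
mul = count_calls(mul)

add(1, 2)
mul(3, 4)
mul(5, 6)
print(add.calls, mul.calls)
```

A decorator is just a callable that takes a function and returns a replacement, so both forms produce the same kind of wrapped function.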

In [None]:
@numba.jit
def jit_sum(data):
    s = 0
    for d in data:
        s += d
    return s

In [None]:
t_jit = %timeit -o jit_sum(data)

In [None]:
times(t_py, t_jit)

Decorate an already-defined function
------------------------------------

In [None]:
jit_sum2 = numba.jit()(py_sum)

In [None]:
t_jit2 = %timeit -o jit_sum2(data)

In [None]:
t_jit2.average

Check equality within the given tolerance

In [None]:
assert abs(py_sum(data) - jit_sum(data)) < tolerence
assert abs(py_sum(data) - jit_sum2(data)) < tolerence

Julia fractal
---------
The [*Julia set*](https://brilliant.org/wiki/fractals/) is defined here as the set of all complex numbers $z$ that remain bounded under repeated iteration of the complex quadratic polynomial
$$
z_{n+1} = z_n^2 + c
$$
for a constant complex number $c$.
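For a single starting point, the iteration can be sketched as an escape-time test: iterate until $|z| > 2$ or a maximum number of steps is reached. The function name `escape_time` and its parameters are assumptions made for this illustration; unlike the array version used in this notebook, it returns `max_iter` for points that never escape.

```python
def escape_time(z0, c=-0.05 + 0.68j, max_iter=256, radius=2.0):
    """Return the first iteration t at which |z| exceeds radius,
    or max_iter if the orbit stays bounded for all max_iter steps."""
    z = z0
    for t in range(max_iter):
        z = z * z + c
        if abs(z) > radius:
            return t
    return max_iter


# A point far from the origin escapes on the first iteration
print(escape_time(2 + 2j))

# With c = 0 the orbit of 0.5 shrinks towards 0 and never escapes
print(escape_time(0.5 + 0j, c=0j))
```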

In [None]:
def py_julia_fractal(z_re, z_im, j):
    """Store in j[m, n] the iteration at which z = z_re[m] + 1j * z_im[n] escapes (|z| > 2)."""
    c = -0.05 + 0.68j
    for m in range(len(z_re)):
        for n in range(len(z_im)):
            z = z_re[m] + 1j * z_im[n]
            for t in range(256):
                z = z ** 2 + c
                if np.abs(z) > 2.0:
                    j[m, n] = t
                    break

In [None]:
N = 1024
z_real = np.linspace(-1.5, 1.5, N)
z_imag = np.linspace(-1.5, 1.5, N)

In [None]:
j = np.zeros((N, N), np.int64)
t_py_julia = %timeit -n1 -r1 -o py_julia_fractal(z_real, z_imag, j)

In [None]:
jit_julia_fractal = numba.jit(nopython=True)(py_julia_fractal)

In [None]:
j = np.zeros((N, N), np.int64)
t_jit_julia = %timeit -o jit_julia_fractal(z_real, z_imag, j)

In [None]:
times(t_py_julia, t_jit_julia)

`numba.jit(nopython=True)` is equivalent to `numba.njit()`

In [None]:
njit_julia_fractal = numba.njit()(py_julia_fractal)

In [None]:
j = np.zeros((N, N), np.int64)
t_njit_julia = %timeit -o njit_julia_fractal(z_real, z_imag, j)

In [None]:
times(t_jit_julia, t_njit_julia)

In [None]:
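fig, ax = plt.subplots(figsize=(8, 8))
# j[m, n] has Re(z) along rows and Im(z) along columns, so transpose it and
# put the origin at the bottom to match the axis labels
ax.imshow(j.T, cmap=plt.cm.RdBu_r, extent=[-1.5, 1.5, -1.5, 1.5], origin="lower")
ax.set_xlabel(r"$\mathrm{Re}(z)$", fontsize=12)
ax.set_ylabel(r"$\mathrm{Im}(z)$", fontsize=12)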

Heaviside step function:
\begin{equation}
\Theta(x) =
\begin{cases}
0, \quad & x < 0 \\
1/2, & x = 0 \\
1, & x > 0
\end{cases}
\end{equation}

`numba.vectorize`
-----------------

In [None]:
def py_Heaviside(x: float):
    """Heaviside step function"""
    if x == 0.0:
        return 0.5
    elif x < 0.0:
        return 0.0
    else:
        return 1.0

In [None]:
x = np.linspace(-2, 2, 50001)
t_py_H = %timeit -o [py_Heaviside(xx) for xx in x]

In [None]:
np_vec_Heaviside = np.vectorize(py_Heaviside)
t_np_H = %timeit -o np_vec_Heaviside(x)

In [None]:
def np_Heaviside(x):
    """x: NumPy array"""
    return (x > 0.0) + (x == 0.0) / 2.0
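The expression in `np_Heaviside` relies on boolean arrays being promoted to 0 and 1 in arithmetic: `x > 0.0` contributes 1 for positive entries and `(x == 0.0) / 2.0` contributes 0.5 at zero. A small sketch on a three-element array (`x_demo` is an illustrative name) shows the intermediate steps and compares the result with NumPy's built-in `np.heaviside`, whose second argument is the value at zero.

```python
import numpy as np

x_demo = np.array([-1.5, 0.0, 2.0])

positive = x_demo > 0.0   # boolean mask, True only where x > 0
zero = x_demo == 0.0      # boolean mask, True only where x == 0

# Booleans act as 0/1 in arithmetic, so this builds the step function
by_hand = positive + zero / 2.0

builtin = np.heaviside(x_demo, 0.5)

print(by_hand)
print(builtin)
```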

In [None]:
t_npa_H = %timeit -o np_Heaviside(x)

In [None]:
times(t_np_H, t_npa_H)

In [None]:
@numba.vectorize([numba.float32(numba.float32),
                  numba.float64(numba.float64)])
def jit_Heaviside(x):
    if x == 0.0:
        return 0.5
    if x < 0:
        return 0.0
    else:
        return 1.0

In [None]:
t_jit_H = %timeit -o jit_Heaviside(x)

In [None]:
times(t_np_H, t_jit_H)

In [None]:
jit_Heaviside(x[24990:25010])

Fastmath
--------
Please refer to https://numba.pydata.org/numba-doc/latest/user/performance-tips.html for other performance tips.

In [None]:
njit_f_H = numba.njit(fastmath=True)(np_Heaviside)

In [None]:
t_njit_f_H = %timeit -o njit_f_H(x)

[`Cython`](https://cython.org/)
=============================

In [None]:
%load_ext Cython

In [None]:
%%cython

def cy_sum(data):
    s = 0.0
    for d in data:
        s += d
    return s

In [None]:
t_cy = %timeit -o cy_sum(data)

In [None]:
times(t_py, t_cy)

In [None]:
%%cython
cimport numpy
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def cy_sum2(numpy.ndarray[numpy.float64_t, ndim=1] data):
    cdef numpy.float64_t s = 0.0
    cdef int n, N = len(data)
    for n in range(N):
        s += data[n]
    return s

In [None]:
t_cy2 = %timeit -o cy_sum2(data)

In [None]:
times(t_py, t_cy2)

Summary
=======

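- `Numba` optimizes specific `Python` functions with almost no changes to the original `Python` code, especially code built around `for` loops and `NumPy`
- `Cython` yields only a modest speedup when the original `Python` code is left nearly unchanged; a substantial speedup requires significant changes to the original `Python` code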