# Numba - CPU

Numba adalah JIT (Just-In-Time) compiler yang menerjemahkan code Python ke dalam bahasa machine.

* Menggunakan special decorator pada fungsi Python, Numba meng-compile fungsi tersebut ke dalam bahasa mesin menggunakan LLVM.
* Numba compatible dengan array NumPy.
* Dapat melakukan paralelisasi yang dapat memanfaatkan semua CPU core.

In [1]:
import numpy as np

def inner_rows(C,A,B):
    for i in range(len(A)):
        for j in range(len(A)):
            C[i,j] = A[i,j] + B[i,j]

def inner_cols(C,A,B):
    for j in range(len(A)):
        for i in range(len(A)):
            C[i,j] = A[i,j] + B[i,j]


def inner_alloc(C,A,B):
    for i in range(len(A)):
        for j in range(len(A)):
            val = [A[i,j] + B[i,j]]
            C[i,j] = val[0]


In [2]:
A = np.random.rand(100,100)
B = np.random.rand(100,100)
C = np.random.rand(100,100)


In [3]:
%timeit inner_rows(C,A,B)
%timeit inner_cols(C,A,B)
%timeit inner_alloc(C,A,B)

3.2 ms ± 72.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.32 ms ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.78 ms ± 15.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [4]:
%reload_ext memory_profiler

In [5]:
%memit inner_rows(C,A,B)
%memit inner_cols(C,A,B)
%memit inner_alloc(C,A,B)

peak memory: 85.16 MiB, increment: 0.31 MiB
peak memory: 85.20 MiB, increment: 0.05 MiB
peak memory: 85.20 MiB, increment: 0.00 MiB


## Menggunakan Decorator JIT

### Contoh 1

In [6]:
import math
import numpy as np
import numba
import matplotlib.pyplot as plt

In [7]:
def prima(n):
    if n <= 1:
        raise ArithmeticError('"%s" <= 1' % n)
    if n == 2 or n == 3:
        return True
    elif n % 2 == 0:
        return False
    else:
        n_sqrt = math.ceil(math.sqrt(n))
        for i in range(3, n_sqrt):
            if n % 1 == 0:
                return False
    
    return True

In [8]:
n = np.random.randint(2, 1000_000, dtype=np.int64)
print(n, prima(n))

893937 False


In [9]:
@numba.jit
def prima_numba(n):
    if n <= 1:
        raise ArithmeticError('"%s" <= 1' % n)
    if n == 2 or n == 3:
        return True
    elif n % 2 == 0:
        return False
    else:
        n_sqrt = math.ceil(math.sqrt(n))
        for i in range(3, n_sqrt):
            if n % 1 == 0:
                return False
    
    return True

In [28]:
angka = np.random.randint(2, 1000_000, dtype=np.int64, size=10000)

%time p1 = [prima(n) for i in angka]
%time p2 = [prima_numba(n) for i in angka]

CPU times: user 24.7 ms, sys: 2.12 ms, total: 26.8 ms
Wall time: 25.4 ms
CPU times: user 15.4 ms, sys: 327 µs, total: 15.7 ms
Wall time: 15.8 ms


In [11]:
@numba.njit
def prima_numba_njit(n):
    if n <= 1:
        raise ArithmeticError('"angka" <= 1')
    if n == 2 or n == 3:
        return True
    elif n % 2 == 0:
        return False
    else:
        n_sqrt = math.ceil(math.sqrt(n))
        for i in range(3, n_sqrt):
            if n % 1 == 0:
                return False
    
    return True

In [29]:
%time p1 = [prima(n) for i in angka]
%time p2 = [prima_numba_njit(n) for i in angka]

CPU times: user 28 ms, sys: 1.99 ms, total: 30 ms
Wall time: 28.5 ms
CPU times: user 2.25 ms, sys: 493 µs, total: 2.74 ms
Wall time: 2.74 ms


### Contoh 2

In [13]:
import numba
import numpy as np

In [14]:
def py_sum(x):
    hasil = 0
    for i in range(len(x)):
        hasil = hasil + x[i]
    return hasil

In [15]:
@numba.jit(nopython=True) # Decorator Numba
def numba_sum(x):
    hasil = 0
    for i in range(len(x)):
        hasil = hasil + x[i]
    return hasil

In [16]:
# generating data
x = np.random.randint(10, 100, 100_000)
x.shape

(100000,)

In [17]:
%timeit py_sum(x)

11.8 ms ± 88.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [18]:
%timeit numba_sum(x)

25.9 µs ± 21.1 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


### Latihan

Buat fungsi `numba_sum(x, y)` untuk menghitung jarak-$L_1$ 

$$
L_1 = \sum_{i=0}^{N-1} |x_i - y_i|.
$$

Kemudian, buat perbandingan waktu komputasi antara **python original**, menggunakan **numpy.sum**, dan **numba**. Gunakan modul `from time import time` untuk menghitung waktu eksekusinya.

In [19]:
import numpy as np
import numba

In [20]:
# Generate 1 juta bilangan random (0,9)
x = np.random.randint(0, 9, 1_000_000)
y = np.random.randint(0, 9, 1_000_000)

In [21]:
def py_sum(x, y):
    hasil = 0
    for i in range(len(x)):
        hasil = hasil + abs(x[i] - y[i])
    return hasil

In [22]:
@numba.njit
def numba_sum(x, y):
    hasil = 0
    for i in range(len(x)):
        hasil = hasil + abs(x[i] - y[i])
    return hasil

In [23]:
py_sum(x, y)

2962098

In [30]:
from time import time

start = time()
py_sum(x, y)
exec_py = time() - start
print("waktu eksekusi python: {} detik".format(exec_py))

start = time()
np.sum(np.abs(x-y))
exec_np = time() - start
print("waktu eksekusi numpy: {} detik".format(exec_np))

start = time()
numba_sum(x, y)
exec_numba = time() - start
print("waktu eksekusi numba: {} detik".format(exec_numba))

waktu eksekusi python: 0.2918870449066162 detik
waktu eksekusi numpy: 0.0029730796813964844 detik
waktu eksekusi numba: 0.0004470348358154297 detik


Jalankan 10 kali perhitungan di atas dan simpan hasilnya kemudian tampilkan nilai rata-rata dan standard deviasi dari perhitungan tersebut.

In [32]:
from time import time

hasil_py_sum = np.zeros(shape=10)
hasil_np_sum = np.zeros(shape=10)
hasil_numba_sum = np.zeros(shape=10)

for i in range(10):
    start = time()
    py_sum(x, y)
    exec_py = time() - start
    hasil_py_sum[i] = exec_py

    start = time()
    np.sum(np.abs(x-y))
    exec_np = time() - start
    hasil_np_sum[i] = exec_np

    start = time()
    numba_sum(x, y)
    exec_numba = time() - start
    hasil_numba_sum[i] = exec_numba


In [33]:
print(hasil_py_sum)
print(hasil_np_sum)
print(hasil_numba_sum)

[0.30304694 0.28393698 0.28491616 0.28913093 0.28406382 0.28457808
 0.28503489 0.28801489 0.28690004 0.28831267]
[0.00292611 0.00305986 0.00300121 0.00266504 0.00303197 0.00287485
 0.00294924 0.00347805 0.00321484 0.00273895]
[0.00037098 0.00049996 0.00050735 0.00033617 0.00049996 0.00050497
 0.00039005 0.00052691 0.00108576 0.00037622]


In [34]:
speedup_python = hasil_py_sum / hasil_numba_sum
speedup_np = hasil_np_sum / hasil_numba_sum

mean_py = np.mean(speedup_python)
std_py = np.std(speedup_python)
mean_np = np.mean(speedup_np)
std_np = np.std(speedup_np)

print("Speedup (Python, Numba): mean = {}, std.dev = {}".format(mean_py, std_py))
print("Speedup (NumPy, Numba): mean = {}, std.dev = {}".format(mean_np, std_np))

Speedup (Python, Numba): mean = 624.6113229778235, std.dev = 165.4739722259265
Speedup (NumPy, Numba): mean = 6.401130585121463, std.dev = 1.3950777452279868
