# Benchmarking Python backends

Vanilla Python is known to be inefficient for numerical computing. In this notebook we therefore compute the `n`-th Fibonacci number, for a large value of `n`, using the following backends:

1. Vanilla Python 3.6
2. Python 3.6 with `numpy` and `numba`
3. Julia 1.4
4. C++ 17
5. Fortran 95

### 1. Vanilla Python

In [14]:
from backends.base import fib, fib_matrix
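
The source of `backends.base` is not shown in this notebook. As a rough idea of what the two functions compute, here is a minimal pure-Python sketch: an iterative version and a matrix version based on powers of `[[1, 1], [1, 0]]`. The function bodies, the helper `_mat_mul`, and the use of exponentiation by squaring are assumptions made for illustration, not necessarily what the benchmarked module does.

```python
def fib(n):
    """Return F(n) iteratively, with F(0) = 0 and F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def _mat_mul(x, y):
    """Multiply two 2x2 matrices stored as nested tuples (hypothetical helper)."""
    return (
        (x[0][0] * y[0][0] + x[0][1] * y[1][0], x[0][0] * y[0][1] + x[0][1] * y[1][1]),
        (x[1][0] * y[0][0] + x[1][1] * y[1][0], x[1][0] * y[0][1] + x[1][1] * y[1][1]),
    )


def fib_matrix(n):
    """Return F(n) as the top-right entry of [[1, 1], [1, 0]] raised to the n-th power."""
    result = ((1, 0), (0, 1))  # 2x2 identity matrix
    base = ((1, 1), (1, 0))
    while n > 0:
        if n & 1:  # multiply in the current power when this bit of n is set
            result = _mat_mul(result, base)
        base = _mat_mul(base, base)  # square the base for the next bit
        n >>= 1
    return result[0][1]
```

Both versions rely on Python's arbitrary-precision integers, so they return exact values even for `n = 103`.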

In [18]:
%timeit fib(103)

4.36 µs ± 422 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [16]:
%timeit fib_matrix(103)

86.9 µs ± 1.45 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
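
The `%timeit` magic is only available inside IPython/Jupyter. To get comparable figures from a plain script, a minimal sketch with the standard-library `timeit` module could look like the following. `bench` is a hypothetical helper, the 7 runs mirror the output above, and the use of the population standard deviation is an assumption about how the spread is summarized.

```python
import statistics
import timeit


def bench(func, *args, repeat=7, number=10_000):
    """Time func(*args) and return (mean, std. dev.) in seconds per loop."""
    timer = timeit.Timer(lambda: func(*args))
    # Each entry is the total time of `number` consecutive calls.
    totals = timer.repeat(repeat=repeat, number=number)
    per_loop = [total / number for total in totals]
    return statistics.mean(per_loop), statistics.pstdev(per_loop)


# Demo on a built-in; in this notebook one would call e.g. bench(fib, 103).
mean, std = bench(sum, range(100), repeat=7, number=1_000)
print(f"{mean * 1e6:.2f} µs ± {std * 1e6:.2f} µs per loop")
```

The lambda wrapper adds a small constant overhead to every call, so very fast functions will look slightly slower than under `%timeit`.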


### 2. Optimized Python

Besides the speed-up obtained by using `numpy`, `numba` can compile Python functions with a JIT compiler.

In [7]:
import numba
numba.__version__

'0.47.0'

In [7]:
from backends.optimized import fib, fib_matrix

In [11]:
%timeit fib(103)

230 ns ± 9.11 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [9]:
%timeit fib_matrix(103)

24.3 µs ± 660 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
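
The source of `backends.optimized` is not shown here either. A minimal sketch of a JIT-compiled iterative version, assuming Numba's `njit` decorator is applied to a plain loop, might look like this. The name `fib_jit` and the no-op fallback used when `numba` is not installed are assumptions made so that the snippet stays runnable anywhere.

```python
try:
    from numba import njit
except ImportError:
    # Fallback (assumption for portability): run as plain Python without Numba.
    def njit(func):
        return func


@njit
def fib_jit(n):
    """Return F(n) iteratively; compiled to machine code on the first call."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(fib_jit(50))
```

Note that compiled code works with fixed-width 64-bit integers, so unlike the pure-Python version the result is only exact up to F(92); a benchmark at `n = 103` therefore measures speed, not an exact value, unless the compiled function uses another representation.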


### 3. Julia

Julia is a scientific-computing language that is also compiled at run time by a JIT compiler.

In [2]:
# in my case, this cell has to be run twice
# (the first call raises an error because PyCall is not yet initialized)
from julia import Main

Main.include("./backends/julia/compute.jl")

<PyCall.jlwrap Main.Fibonacci>

In [5]:
%timeit Main.Fibonacci.fib(103)

2.06 ms ± 75.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [6]:
%timeit Main.Fibonacci.fib_matrix(103)

2.11 ms ± 147 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


### 4. C++ 17

Python has a built-in foreign function interface for C/C++ (`ctypes`, in the standard library; the third-party `cffi` package is used below), which makes it easy to speed up numerical programs. To create `compute.so`, run `g++ -std=c++17 -o compute.so -fPIC -shared compute.cpp` (more instructions in `compute.cpp`).

**Note that with either `cffi` or `ctypes` you can also call Golang, Rust, Vlang or any other modern language that is interoperable with C/C++!**

In [1]:
from cffi import FFI

ffi = FFI()

with open("./backends/cpp/compute.h") as code:
    ffi.cdef(code.read())
    
c_api = ffi.dlopen("./backends/cpp/compute.so")

# run when finished
# ffi.dlclose(c_api)
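
For comparison, the same library could be loaded with the standard-library `ctypes` module instead of `cffi`. The sketch below is only an illustration: `load_fib` is a hypothetical helper, and the C signature `unsigned long long fib(unsigned int n)` is an assumption, since `compute.h` is not shown in this notebook.

```python
import ctypes


def load_fib(path="./backends/cpp/compute.so"):
    """Load the shared library at `path` and return its `fib` function."""
    lib = ctypes.CDLL(path)  # raises OSError if the library cannot be loaded
    # Assumed C signature: unsigned long long fib(unsigned int n);
    lib.fib.argtypes = [ctypes.c_uint]
    lib.fib.restype = ctypes.c_ulonglong
    return lib.fib


# Usage, once compute.so has been built:
# c_fib = load_fib()
# c_fib(10)
```

Unlike `cffi`, which reads the declarations from the header file, `ctypes` needs the argument and return types to be declared by hand; if they do not match the real signature, the call returns garbage or crashes.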

In [2]:
%timeit c_api.fib(103)

519 ns ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [3]:
%timeit c_api.fib_matrix(103)

172 µs ± 9.73 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


### 5. Fortran 95

We will use the `f2py` module in `numpy`. Getting a `compute` Fortran module is easy: just run `f2py -c compute.f95 -m compute` from the `fortran` folder (in the console). Note that compiling on Windows may raise an error, whereas compilation on Linux is straightforward.

In [9]:
from backends.fortran import compute

In [17]:
%timeit compute.fib(103)

259 ns ± 4.21 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [18]:
%timeit compute.fib_matrix(103)

24.1 µs ± 638 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


# Conclusion