# Compilers

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lukeconibear/swd6_hpp/blob/main/docs/05_compilers.ipynb)

## [CPython](https://www.python.org/)

The main Python distribution.

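CPython itself is compiled Ahead-Of-Time (AOT), i.e., the interpreter was compiled in advance, before you run any Python code.

It ships with an assortment of statically compiled C extensions.

Your Python code is compiled to bytecode, which the interpreter then executes.

CPython is a general-purpose interpreter, allowing it to work on a variety of problems.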

It is dynamically typed, so types can change as you go e.g., `x = 5`, then later `x = 'gary'`.
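
A minimal sketch of what dynamic typing looks like in practice, reusing the `x = 5` and `x = 'gary'` example. The `first_type` and `second_type` names are only there to make the change visible.

```python
# The same name can be rebound to objects of different types.
x = 5
first_type = type(x)    # the name x currently refers to an int

x = 'gary'
second_type = type(x)   # the same name now refers to a str

print(first_type.__name__, second_type.__name__)
```

Because the type is only known at run time, the interpreter has to check it on every operation, which is part of what the compilers below try to avoid.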

## [Numba](http://numba.pydata.org/)

Uses a just-in-time (JIT) compiler on functions, i.e., it compiles a function at execution time, when the function is first called.

This converts the function to fast machine code using [LLVM](https://en.wikipedia.org/wiki/LLVM).

Numba works with the default CPython.

It uses decorators around functions.

There are two main modes:
- [`nopython`](https://numba.readthedocs.io/en/stable/glossary.html#term-nopython-mode) mode: `@jit(nopython=True)`, `@njit`, [`@vectorize`](https://numba.pydata.org/numba-doc/latest/user/vectorize.html)
    - Compiles code that runs without calling back into the CPython interpreter.
    - Highest performance.
    - Requires types that Numba can infer (mainly numbers and NumPy arrays), otherwise raises an error.
    - Recommended.
- [`object`](https://numba.readthedocs.io/en/stable/glossary.html#term-object-mode) mode: [`@jit`](https://numba.readthedocs.io/en/stable/user/jit.html) with `forceobj=True`
    - Compiles code that handles all values as Python objects and uses CPython to work on those objects.
    - In Numba versions before 0.59, a plain `@jit` first tried `nopython` mode and fell back to `object` mode if that failed. From 0.59 onwards, a plain `@jit` defaults to `nopython` mode, so `object` mode has to be requested explicitly.
    - The main improvement over CPython is in loops.

Numba is helpful when you want to speed up numerical operations in specific functions.

Here are some examples for [NumPy](https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html) and [Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html#numba-jit-compilation).

In [1]:
import numpy as np
from numba import njit, vectorize

In [2]:
nums = np.arange(1_000_000)

In [3]:
def super_function(nums):
    trace = 0.0
    for num in nums: # loop
        trace += np.cos(num) # numpy
    return nums + trace # broadcasting

In [4]:
%timeit super_function(nums)

965 ms ± 113 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [9]:
@njit # numba decorator
def super_function(nums):
    trace = 0.0
    for num in nums: # loop
        trace += np.cos(num) # numpy
    return nums + trace # broadcasting

The first call of the function includes the overhead of compiling it.

In [10]:
%%timeit -n 1 -r 1
super_function(nums)

152 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


All subsequent calls use this compiled version, and are therefore much faster.

In [11]:
%%timeit -n 1 -r 1
super_function(nums)

11.4 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


## Exercise

...

## Further information

### Other options

- [Cython](https://cython.org/)
  - *Compiles to statically typed C/C++*.
  - Use for any amount of code.
  - Use with the default CPython.
  - Helpful when you need static typing and when optimising libraries.
  - Examples [not using IPython](https://cython.readthedocs.io/en/latest/src/quickstart/build.html#building-a-cython-module-using-setuptools), [NumPy](https://cython.readthedocs.io/en/latest/src/tutorial/numpy.html), [Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html).
- [PyPy](https://www.pypy.org/)
  - *Just-In-Time (JIT) compiler (written in RPython, a restricted subset of Python).*
  - Enables optimisations at run time, especially for numerical tasks with repetition and loops.
  - Replaces CPython.
  - Faster, though with overheads for start-up and memory.
  - Helpful when you want to speed up numerical operations in all of your code.
  - May not be [compatible](http://packages.pypy.org/) with the libraries you use.

### Resources

- [Why is Python slow?](https://youtu.be/I4nkgJdVZFA), Anthony Shaw, PyCon 2020. [CPython Internals](https://realpython.com/products/cpython-internals-book/).