In [None]:
import numba
import numpy as np

# How does it work?

How is numba managing to make our code so fast? We can take a look under the hood with the `inspect_types()` method.

In [None]:
@numba.njit
def mean_numba(x):
    s = 0
    n = 0
    for i in x:
        s += i
        n += 1
    return s / n

Since numba compiles just in time (JIT) by default, nothing has been compiled until the function is first called, so at first this isn't very helpful:

In [None]:
mean_numba.inspect_types()

But if we call the function with some arguments, numba will go ahead and compile a version of the function for those argument types. After that we can see a line-by-line breakdown of what numba was able to infer about our code to make a statically typed, compiled version:

In [None]:
mean_numba(np.arange(20))

In [None]:
mean_numba.inspect_types()

# Limitations

## Numba doesn't work with everything

In [None]:
import pandas as pd

In [None]:
mean_numba(pd.Series(np.random.randint(0, 10, 100)))

Numba only works with a subset of python and numpy: both the types you pass in and the functions you call inside a compiled function must come from that supported subset. A pandas `Series` is not one of those types, so the call above fails. The numba docs list what you can use inside a function you'd like to compile.

## What it does work with

To find out what features you can use inside a jit compiled function, take a look at these pages in the documentation:

* http://numba.pydata.org/numba-doc/latest/reference/pysupported.html
* http://numba.pydata.org/numba-doc/latest/reference/numpysupported.html

They summarize the python and numpy features that are currently supported, along with any caveats for their usage. Note that these lists are constantly expanding.

## Numba has to be able to infer types from your code

For your code to be compiled, numba has to be able to tell the types of all of the values before running your code. One major consequence is that the values in a container must be homogeneously typed. Here's an example using a simple counter:

In [None]:
@numba.njit
def count(it):
    d = {}
    for k in it:
        v = d.get(k, 0)
        d[k] = v + 1
    return d

Numba can't figure out the right types here:

In [None]:
count(np.random.randint(0,10, 100))

There are a couple of ways we can get around this. One is to rewrite the function so that the types are easier to infer:

In [None]:
from numba.typed import Dict
from numba import types

In [None]:
@numba.njit
def count2(it):
    d = {}
    for k in it:
        if k in d:
            d[k] += 1
        else:
            d[k] = 1
    return d

In [None]:
count2(np.random.randint(0, 10, 100))

Another is to be more specific about the types with **typed objects**, so numba has less work to do. This can be done with numpy arrays by setting dtypes, and for python containers by using the typed variants numba provides.

In [None]:
from numba.typed import Dict

In [None]:
t = numba.typeof([1,2,3])

In [None]:
@numba.njit
def foo(x):
    return numba.typeof(x)

In [None]:
foo([1,2,3])

This fails, as we should expect, because the list mixes integers and a string:

In [None]:
count([1,1,"a",3])

# Typed objects

Sometimes numba is unable to infer the type of an object, even when you think it should. To get around this you can state the types explicitly with the typed containers numba provides:

* `typed.Dict`
* `typed.List`

The following doesn't work, even though it seems like it should, since every element of the array is a string:

In [None]:
count(np.array(list("aaabcddd")))

In [None]:
import numba

@numba.njit
def foo(x):
    d = dict()
    for i in x:
        val = d.get(i, default=0)
        d[i] = val + 1
    return d

foo([1, 2, 3]) # Throws TypingError

In [None]:
@numba.njit
def bar(x):
    d = dict()
    for i in x:
        val = d.get(i, default=0)
        print(type(val))
        d[i] = 1
    return d

_ = bar([1,2,3])

In [None]:
types.unicode_type[1]

In [None]:
@numba.njit
def collect_indices(arr):
    """
    Return a dict mapping value in arr to indices it was found at.
    
    Example
    -------
    >>> collect_indices([1, 2, 1, 2])
    {1: [0, 2], 2: [1, 3]}
    """
    d = {}
    for i, val in enumerate(arr):
        if val not in d:
            d[val] = [i]
        else:
            d[val].append(i)
    return d

In [None]:
from numba import types
from numba.typed import Dict, List

In [None]:
# Integer keys, each mapping to a typed list of integers
d = Dict.empty(key_type=types.int64, value_type=types.ListType(types.int64))

In [None]:
d[1] = List([2])

In [None]:
collect_indices([1,2,1,2])

We can specify the types explicitly using the typed containers from `numba.typed`.

# Performance