## Is this just magic?  What is Numba doing to make code run quickly?

Let's define a trivial example function.

In [1]:
from numba import jit

In [2]:
@jit
def add(a, b):
    return a + b

In [3]:
add(1, 1)

2

Numba examines the Python bytecode and translates it into an intermediate representation (IR).  To view this IR, first call `add` so that it gets compiled; the `inspect_types` method then shows the typed IR for each compiled signature.
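
To get a feel for the bytecode that Numba starts from, the standard-library `dis` module can disassemble a plain, undecorated version of the function. This is only a sketch of the input side of the pipeline: `add_plain` is an illustrative name, and the exact opcode names differ between Python versions.

```python
import dis


def add_plain(a, b):
    # Same body as `add`, but without the @jit decorator.
    return a + b


# Collect the opcode names of the compiled Python bytecode.
opnames = [ins.opname for ins in dis.get_instructions(add_plain)]
print(opnames)

# Or print the full human-readable disassembly.
dis.dis(add_plain)
```

The listing should contain a binary-operation instruction for `a + b` followed by a return instruction, which lines up with the `$0.3 = a + b` and `return $0.4` steps in the IR.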

In [4]:
add.inspect_types()

add (int64, int64)
--------------------------------------------------------------------------------
# File: <ipython-input-2-1c683d2d00ee>
# --- LINE 1 --- 
# label 0
#   del b
#   del a
#   del $0.3

@jit

# --- LINE 2 --- 

def add(a, b):

    # --- LINE 3 --- 
    #   a = arg(0, name=a)  :: int64
    #   b = arg(1, name=b)  :: int64
    #   $0.3 = a + b  :: int64
    #   $0.4 = cast(value=$0.3)  :: int64
    #   return $0.4

    return a + b




OK.  Numba has correctly inferred the types of the arguments, typing everything as `int64`, and the function runs smoothly.  

(What happens if you do `add(1., 1.)` and then `inspect_types`?)

In [5]:
add(1., 1.)

2.0

In [6]:
add.inspect_types()

add (int64, int64)
--------------------------------------------------------------------------------
# File: <ipython-input-2-1c683d2d00ee>
# --- LINE 1 --- 
# label 0
#   del b
#   del a
#   del $0.3

@jit

# --- LINE 2 --- 

def add(a, b):

    # --- LINE 3 --- 
    #   a = arg(0, name=a)  :: int64
    #   b = arg(1, name=b)  :: int64
    #   $0.3 = a + b  :: int64
    #   $0.4 = cast(value=$0.3)  :: int64
    #   return $0.4

    return a + b


add (float64, float64)
--------------------------------------------------------------------------------
# File: <ipython-input-2-1c683d2d00ee>
# --- LINE 1 --- 
# label 0
#   del b
#   del a
#   del $0.3

@jit

# --- LINE 2 --- 

def add(a, b):

    # --- LINE 3 --- 
    #   a = arg(0, name=a)  :: float64
    #   b = arg(1, name=b)  :: float64
    #   $0.3 = a + b  :: float64
    #   $0.4 = cast(value=$0.3)  :: float64
    #   return $0.4

    return a + b
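
The listing now has two entries because Numba compiles a separate specialization for each combination of argument types it sees. The following is a toy model of that idea in pure Python, not Numba's implementation: `ToyDispatcher` and `toy_add` are hypothetical names, and the "compilation" step is a stand-in that just reuses the Python function.

```python
class ToyDispatcher:
    """Toy model: keep one 'compiled' version per argument-type signature."""

    def __init__(self, py_func):
        self.py_func = py_func
        self.specializations = {}  # signature tuple -> callable

    def __call__(self, *args):
        signature = tuple(type(arg).__name__ for arg in args)
        if signature not in self.specializations:
            # A real JIT would run type inference and code generation here;
            # this sketch only records the signature and reuses the Python function.
            self.specializations[signature] = self.py_func
        return self.specializations[signature](*args)


@ToyDispatcher
def toy_add(a, b):
    return a + b


toy_add(1, 1)      # first call with (int, int): new entry
toy_add(1.0, 1.0)  # first call with (float, float): new entry
toy_add(2, 3)      # reuses the existing (int, int) entry
print(list(toy_add.specializations))
```

In Numba itself, the compiled signatures of a jitted function should be available through its `signatures` attribute, e.g. `add.signatures`.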




### What about the actual LLVM code?

You can see the LLVM code generated by Numba using the `inspect_llvm()` method.  It returns a `dict` keyed by signature, so looping over its items as follows is slightly easier to read.

In [7]:
for k, v in add.inspect_llvm().items():
    print(k, v)

((int64, int64), '; ModuleID = \'add\'\nsource_filename = "<string>"\ntarget datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"\ntarget triple = "x86_64-unknown-linux-gnu"\n\n@.const.add = internal constant [4 x i8] c"add\\00"\n@".const.Fatal error: missing _dynfunc.Closure" = internal constant [38 x i8] c"Fatal error: missing _dynfunc.Closure\\00"\n@PyExc_RuntimeError = external global i8\n@".const.missing Environment" = internal constant [20 x i8] c"missing Environment\\00"\n\n; Function Attrs: norecurse nounwind\ndefine i32 @"_ZN8__main__7add$241Exx"(i64* noalias nocapture %retptr, { i8*, i32 }** noalias nocapture readnone %excinfo, i8* noalias nocapture readnone %env, i64 %arg.a, i64 %arg.b) local_unnamed_addr #0 {\nentry:\n  %.15 = add nsw i64 %arg.b, %arg.a\n  store i64 %.15, i64* %retptr, align 8\n  ret i32 0\n}\n\ndefine i8* @"_ZN7cpython8__main__7add$241Exx"(i8* %py_closure, i8* %py_args, i8* nocapture readnone %py_kws) local_unnamed_addr {\nentry:\n  %.5 = alloca i8*, align

## But there's a caveat

Now, watch what happens when we try to do something that is natural in Python, but not particularly mathematically sound:

In [8]:
def add_strings(a, b):
    return a + b

In [9]:
add_strings_jit = jit()(add_strings)

In [10]:
add_strings_jit('a', 'b')

'ab'

It worked, but what does `inspect_types` tell us?

In [11]:
add_strings_jit.inspect_types()

add_strings (str, str)
--------------------------------------------------------------------------------
# File: <ipython-input-8-cf308a6bf2b5>
# --- LINE 1 --- 
# label 0
#   del b
#   del a
#   del $0.3

def add_strings(a, b):

    # --- LINE 2 --- 
    #   a = arg(0, name=a)  :: pyobject
    #   b = arg(1, name=b)  :: pyobject
    #   $0.3 = a + b  :: pyobject
    #   $0.4 = cast(value=$0.3)  :: pyobject
    #   return $0.4

    return a + b




## What's all this pyobject business?  

This means the function has been compiled in `object` mode.  Object mode can be faster than regular Python if Numba can do loop lifting, but the gain is limited.  
We want those `pyobject`s to be `int64` or another type that Numba can infer. Your best bet is to force `nopython` mode: Numba then raises an error instead of falling back to object mode, so that you _know_ when it can't give you a speedup.

For the full list of supported Python and NumPy features in `nopython` mode, see the Numba documentation here: http://numba.pydata.org/numba-doc/latest/reference/pysupported.html

## Figuring out what isn't working

In [12]:
%%file nopython_failure.py
from numba import jit

@jit
def add(a, b):
    for i in range(100):
        c = i
        f = i + 7
        l = c + f
        
    return a + b

add('a', 'b')

Writing nopython_failure.py


In [13]:
!numba --annotate-html fail.html nopython_failure.py

[fail.html](fail.html)

## Forcing `nopython` mode

In [14]:
add_strings_jit = jit(nopython=True)(add_strings)

In [15]:
add_strings_jit('a', 'b')

TypingError: Caused By:
Traceback (most recent call last):
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/compiler.py", line 235, in run
    stage()
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/compiler.py", line 449, in stage_nopython_frontend
    self.locals)
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/compiler.py", line 805, in type_inference_stage
    infer.propagate()
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/typeinfer.py", line 767, in propagate
    raise errors[0]
TypingError: Invalid usage of + with parameters (str, str)
Known signatures:
 * (int64, int64) -> int64
 * (int64, uint64) -> int64
 * (uint64, int64) -> int64
 * (uint64, uint64) -> uint64
 * (float32, float32) -> float32
 * (float64, float64) -> float64
 * (complex64, complex64) -> complex64
 * (complex128, complex128) -> complex128
 * (uint64,) -> uint64
 * (uint16,) -> uint64
 * (uint8,) -> uint64
 * (uint32,) -> uint64
 * (int32,) -> int64
 * (int16,) -> int64
 * (int64,) -> int64
 * (int8,) -> int64
 * (float32,) -> float32
 * (float64,) -> float64
 * (complex64,) -> complex64
 * (complex128,) -> complex128
 * parameterized
File "<ipython-input-8-cf308a6bf2b5>", line 2
[1] During: typing of intrinsic-call at <ipython-input-8-cf308a6bf2b5> (2)

Failed at nopython (nopython frontend)
Invalid usage of + with parameters (str, str)
Known signatures:
 * (int64, int64) -> int64
 * (int64, uint64) -> int64
 * (uint64, int64) -> int64
 * (uint64, uint64) -> uint64
 * (float32, float32) -> float32
 * (float64, float64) -> float64
 * (complex64, complex64) -> complex64
 * (complex128, complex128) -> complex128
 * (uint64,) -> uint64
 * (uint16,) -> uint64
 * (uint8,) -> uint64
 * (uint32,) -> uint64
 * (int32,) -> int64
 * (int16,) -> int64
 * (int64,) -> int64
 * (int8,) -> int64
 * (float32,) -> float32
 * (float64,) -> float64
 * (complex64,) -> complex64
 * (complex128,) -> complex128
 * parameterized
File "<ipython-input-8-cf308a6bf2b5>", line 2
[1] During: typing of intrinsic-call at <ipython-input-8-cf308a6bf2b5> (2)

In [16]:
from numba import njit

In [17]:
add_strings_jit = njit(add_strings)

In [18]:
add_strings_jit('a', 'b')

TypingError: Caused By:
Traceback (most recent call last):
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/compiler.py", line 235, in run
    stage()
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/compiler.py", line 449, in stage_nopython_frontend
    self.locals)
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/compiler.py", line 805, in type_inference_stage
    infer.propagate()
  File "/home/sunny/anaconda2/lib/python2.7/site-packages/numba/typeinfer.py", line 767, in propagate
    raise errors[0]
TypingError: Invalid usage of + with parameters (str, str)
Known signatures:
 * (int64, int64) -> int64
 * (int64, uint64) -> int64
 * (uint64, int64) -> int64
 * (uint64, uint64) -> uint64
 * (float32, float32) -> float32
 * (float64, float64) -> float64
 * (complex64, complex64) -> complex64
 * (complex128, complex128) -> complex128
 * (uint64,) -> uint64
 * (uint16,) -> uint64
 * (uint8,) -> uint64
 * (uint32,) -> uint64
 * (int32,) -> int64
 * (int16,) -> int64
 * (int64,) -> int64
 * (int8,) -> int64
 * (float32,) -> float32
 * (float64,) -> float64
 * (complex64,) -> complex64
 * (complex128,) -> complex128
 * parameterized
File "<ipython-input-8-cf308a6bf2b5>", line 2
[1] During: typing of intrinsic-call at <ipython-input-8-cf308a6bf2b5> (2)

Failed at nopython (nopython frontend)
Invalid usage of + with parameters (str, str)
Known signatures:
 * (int64, int64) -> int64
 * (int64, uint64) -> int64
 * (uint64, int64) -> int64
 * (uint64, uint64) -> uint64
 * (float32, float32) -> float32
 * (float64, float64) -> float64
 * (complex64, complex64) -> complex64
 * (complex128, complex128) -> complex128
 * (uint64,) -> uint64
 * (uint16,) -> uint64
 * (uint8,) -> uint64
 * (uint32,) -> uint64
 * (int32,) -> int64
 * (int16,) -> int64
 * (int64,) -> int64
 * (int8,) -> int64
 * (float32,) -> float32
 * (float64,) -> float64
 * (complex64,) -> complex64
 * (complex128,) -> complex128
 * parameterized
File "<ipython-input-8-cf308a6bf2b5>", line 2
[1] During: typing of intrinsic-call at <ipython-input-8-cf308a6bf2b5> (2)

## Other compilation flags

There are two other main compilation flags for `@jit`:

```python
cache=True
```

Use this if you don't want to get dinged by the compilation time on every run. It saves the compiled function into something like a `pyc` file in your `__pycache__` directory, so you should get nice fast performance even between sessions.

```python
nogil=True
```

This releases the GIL while the compiled function runs.  Note, however, that it doesn't do anything else, such as making your program thread-safe.  You have to manage all of those things on your own (use `concurrent.futures`).