# **Numba & Dask Workshop - Introduction**

Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code.

Dask is a library that offers practical parallelism primitives to scale your computation logic.

Numba -> http://numba.pydata.org/

Dask -> https://dask.org/

### Is Python slow? Why?

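- Global Interpreter Lock (only one thread executes Python bytecode at a time, so threads cannot run CPU-bound code in parallel)
- Little compiler optimization (because it's dynamically typed, types are known only at run time)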
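
As a rough illustration of the first point, the sketch below runs the same CPU-bound loop twice in sequence and then in two threads. The function `count_down` and the workload size `N` are made up for this example. On a standard CPython build the threaded run is usually not faster, though exact timings depend on the machine and the interpreter build.

```python
import time
from threading import Thread

def count_down(n):
    # Pure-Python, CPU-bound loop with no I/O.
    while n > 0:
        n -= 1

N = 2_000_000  # illustrative workload size

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two pieces of work in two threads.
threads = [Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.3f} s, two threads: {threaded:.3f} s")
```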

### Let's look at two functions which could be slow in Python

- The first function computes $ y = \sum_{x=0}^{n^7-1} \left( x^{2} + x^{3} \right) $

- The second function computes $ y = \sum_{x=0}^{n^7-1} \tan(x) \cdot \arctan(x) $

In [1]:
def first_slow_function(n):
    return sum((x ** 2 + x ** 3) for x in range(n ** 7))

%time first_slow_function(10)

Wall time: 6.6 s


2499999833333308333335000000

In [None]:
from math import tan, atan

def second_slow_function(n):
    return sum(tan(x) * atan(x) for x in range(n ** 7))

%time second_slow_function(10)

### What does Python's ecosystem offer?

NumPy
=====

### What exactly does NumPy offer?

- [A multidimensional, homogeneous array type, along with a type system for array elements (called "dtypes")](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html)
- [Ufuncs with Broadcasting](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
- Lots of prebuilt functions oriented towards scientific computing, including but not limited to
    - [General Math](https://docs.scipy.org/doc/numpy/reference/routines.math.html)
    - [Linear Algebra](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html)
    - [Statistics](https://docs.scipy.org/doc/numpy/reference/routines.statistics.html)
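
A minimal sketch of the first two points, using small arrays invented for illustration: `np.square` is a ufunc applied element by element, and the expression `col * 10 + a` relies on broadcasting to combine a `(3, 1)` array with a `(3,)` array.

```python
import numpy as np

a = np.arange(3)                    # shape (3,): [0, 1, 2]
col = np.arange(3).reshape(3, 1)    # shape (3, 1): a column

# A ufunc operates on every element without an explicit Python loop.
squares = np.square(a)

# Broadcasting stretches the (3, 1) column and the (3,) row to a (3, 3) result.
table = col * 10 + a

print(squares)       # [0 1 4]
print(table.shape)   # (3, 3)
print(table)
```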

![](./img/cint_vs_pyint.png)
***
![](./img/array_vs_list.png)

### Import NumPy and start creating ndarrays. It's as simple as creating a Python list

In [None]:
import numpy as np
x = np.array([1, 2, 3])
x.dtype

### But do remember you can only create homogeneous (same type) arrays

In [None]:
y = np.array([1, 2, 3], dtype=np.float64)
y.dtype
y
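
To see what "homogeneous" means in practice, the short sketch below (with made-up input lists) shows NumPy picking one common dtype when the inputs are mixed, rather than keeping each element's original type.

```python
import numpy as np

# Integers mixed with a float are all promoted to a floating-point dtype.
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers.dtype)   # float64

# A number mixed with a string becomes an array of strings.
mixed_text = np.array([1, "a"])
print(mixed_text.dtype.kind)  # U (unicode string)
```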

### Let's try rewriting the slow functions we wrote earlier using NumPy

In [None]:
def func1(x):
    return x ** 2 + x ** 3

def first_slow_function_in_numpy(n):
    vf = np.vectorize(func1)
    return np.sum(vf(np.arange(n ** 7, dtype=np.float64)))

%time first_slow_function_in_numpy(10)

### NOTE: <i>np.vectorize</i> is a function which applies a Python function to each element of a NumPy array, much like map in Python.
### It is provided for convenience, not for performance: internally it is still a Python-level loop. We use it here to avoid writing the for-loop by hand.
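
For comparison, here is a small sketch that computes the same sum in two ways: through `np.vectorize`, and by applying the arithmetic operators to the whole array at once. The array size `10 ** 5` is an arbitrary choice to keep the example quick, and the variable names are illustrative only.

```python
import numpy as np

def func1(x):
    return x ** 2 + x ** 3

x = np.arange(10 ** 5, dtype=np.float64)

# np.vectorize calls func1 once per element from Python.
looped = np.sum(np.vectorize(func1)(x))

# The same arithmetic written directly on the array runs inside NumPy's compiled ufuncs.
whole_array = np.sum(x ** 2 + x ** 3)

print(np.isclose(looped, whole_array))
```

In a notebook, wrapping each of the two computations in `%timeit` would show how their running times differ on your machine.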

In [None]:
from math import tan, atan

def func2(x):
    return tan(x) * atan(x)

def second_slow_func_in_numpy(n):
    vf = np.vectorize(func2)
    return np.sum(vf(np.arange(n ** 7)))

%time second_slow_func_in_numpy(10)

In [None]:
import numpy as np

def func3(x):
    return np.tan(x) * np.arctan(x)

def third_slow_func_in_numpy(n):
    vf = np.vectorize(func3)
    # note the exponent is 6, not 7: np.tan and np.arctan are slow when called on one scalar at a time
    return np.sum(vf(np.arange(n ** 6)))

%time third_slow_func_in_numpy(10)

### **You are restricted by the use cases for which NumPy was designed**
### **Can we do better?**

# **Exercise**

Try creating a slow CPU-bound operation in Python yourself to test. This program should ideally have:

- Loops
- Floating point arithmetic

You can try speeding up this function of yours in the next sections.
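
If you need a starting point, one possible candidate (not the only one) is a function that combines a loop with floating point arithmetic, such as the Leibniz series for pi sketched below. The name `estimate_pi` and the number of terms are assumptions made for this example.

```python
def estimate_pi(terms):
    # Leibniz series: pi / 4 = 1 - 1/3 + 1/5 - 1/7 + ...
    total = 0.0
    sign = 1.0
    for k in range(terms):
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total

print(estimate_pi(1_000_000))
```

Increase `terms` until the call takes a few seconds, so that any later speed-up is easy to measure.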