# Day 3 - Performant Python
* Python has a reputation for being slow, but that's not the whole story. Today we're going to talk about how to make Python very fast.
* This session has two parts. First, we'll learn why Python is slow unless you are careful, and how to use NumPy and Numba to make it fast.
* In the second part, we'll give you a few problems where you can experiment with these tools.
* I personally find this very exciting.

## Jitting
* Compiling functions can bring big performance gains.
* Traditional languages like C are compiled ahead of time, before the program is started.
* Jitting is the process of compiling functions just-in-time during program execution.
* Jitting can bring the benefits of compilation that languages like C enjoy without sacrificing the convenience of Python.
* We're going to use Numba, a JIT compiler for Python.

### Summing an Array

In [None]:
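import time
import random
import numpy as np
from numba import njit

@njit
def mysum(l):
    '''return the sum of all elements of l, using an explicit loop'''
    s = 0
    for e in l:
        s += e
    return s

def benchmark(n=10000, m=100):
    '''time m calls of mysum on an array of n random floats'''
    l = np.fromiter((random.random() for _ in range(n)), dtype=float)
    mysum(l)  # warm-up call: triggers JIT compilation so it is not timed
    start = time.perf_counter()  # monotonic, high-resolution clock
    for _ in range(m):
        s = mysum(l)
    elapsed = time.perf_counter() - start
    per_iteration = elapsed / m
    print(f'time per iteration: {per_iteration}s')

benchmark()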

### Computing Derivatives

In [None]:
# derivative code

## Numba Shenanigans
* Numba is a powerful tool for writing arbitrary performant code. It is what you need when NumPy doesn't cover what you want to do.
* You can write custom value containers (similar to structs in C, or objects with no methods).
* You can write custom functions that operate on these value containers.
* Numba can automagically (it's a real word, I promise) parallelize your code to exploit multi-core CPUs and even GPUs.

### Custom dtypes

In [None]:
from numba import vectorize, float64, int64

point_dtype = np.dtype([
    ('x', np.float64),
    ('y', np.float64),
])

def create_point(x, y):
    return np.array([(x, y)], dtype=point_dtype)[0]

@njit
def magnitude(p):
    return np.sqrt(np.power(p['x'], 2) + np.power(p['y'], 2))

@njit
def distance(p1, p2):
    '''return the distance between points p1 and p2'''
    return np.sqrt(np.power(p1['x'] - p2['x'], 2) + np.power(p1['y'] - p2['y'], 2))

p1 = create_point(0, 0)
p2 = create_point(-1, 1)
print(p1)
d = distance(p1, p2)
print(d)

m = magnitude(p2)
print('magnitude', m)

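# a custom ufunc, compiled for float64 inputs and run in parallel across elements;
# it has its own name so that it does not overwrite the record-based magnitude above
@vectorize([float64(float64, float64)], target='parallel')
def magnitude_xy(x, y):
    return np.sqrt(np.power(x, 2) + np.power(y, 2))

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 2.0, 5.0])
m = magnitude_xy(x, y)
print(m)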

### Custom ufuncs

### Automagic Parallelization
* Numba can automagically parallelize `for` loops.
* This lets you exploit multi-core CPUs and even GPUs.
* See the Numba documentation on dynamic universal functions: https://numba.pydata.org/numba-doc/dev/user/vectorize.html#dynamic-universal-functions