# High performance Python 🚀
### Zbyszek, Seetha, & Jakob
### ASPP LatAm 2023, CDMX, Mexico

Fork and clone the repository now, please! :)

## Outline

* Introduction
* Analyze what makes your code slow with **profiling** (Jakob)
* Speed & *convenience* with **Numba** (Seetha)
* Speed & *flexibility* with **Cython** (Zbyszek)
* Outroduction

## Introduction

* By now you are the *Master of Research*(TM).
![Master of research](figures/mor.png)
* Using your new skills you can confidently transform any idea into a great manuscript!

* It seems like the only thing that's holding you back is the **execution speed** of your scripts!
* Both Cython and Numba are tools to make your code faster -> **optimization**.

## Exercise

Who thinks that they would benefit from faster code?

Please raise your hand.

Who has spent countless hours fiddling with code to make it faster for questionable benefits?

Please raise your hand.

## The three rules of optimization
(adapted from Sebastian Witowski, EuroPython 2016)

#### 1. Don't.
- Likely you don't need it.
- Optimization comes with costs.

## Exercise

What are costs associated with optimization?

#### 2. Don't yet.
- Is your code finished?
- Did you write tests?
- Are you sure it's worth the investment?

#### 3. Profile
- Collect data - don't guess which part of your code you should optimize!

## Collect basic data
- while optimizing, it's a good idea to keep track of the total runtime of your script
- even though modern profilers introduce little overhead, timing the unprofiled script makes sure that your code changes translate into actual speedups
- the simplest way to do this is via `time` (or the equivalent on your OS), e.g., `time python myscript.py`
  - you're typically interested in "user time"
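
If you prefer to measure from inside Python, here is a minimal sketch that contrasts wall-clock time with CPU time. `workload` is a hypothetical stand-in for the real work of your script; `time.perf_counter` and `time.process_time` are standard-library clocks. CPU time (user + system) does not include time spent sleeping or waiting, which is why it can differ from the elapsed time.

```python
import time


def workload():
    """Hypothetical stand-in for the real work of a script."""
    total = sum(i * i for i in range(200_000))  # CPU-bound part
    time.sleep(0.05)                            # waiting, uses almost no CPU
    return total


wall_start = time.perf_counter()  # wall-clock (elapsed) time
cpu_start = time.process_time()   # CPU time (user + system) of this process

result = workload()

wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start
print(f"wall: {wall:.3f} s, cpu: {cpu:.3f} s")
```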

## Collect fine grained data

- **profilers** monitor the execution of your script, record statistics, and thus can **provide an understanding of the performance characteristics of your code**
- here we consider [py-spy](https://github.com/benfred/py-spy), a sampling-based runtime profiler for Python
  - simply put, `py-spy` examines your program at regular intervals and records which function (or rather line) is currently being executed
- it does not require any modification of your code!
![sampling](./figures/sampling.svg)
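
To make the idea of sampling concrete, here is a toy, in-process sketch. It is *not* how `py-spy` is implemented (`py-spy` inspects the program from a separate process); it only illustrates "look at regular intervals and count what is running". The names `sample_thread` and `busy_work` are hypothetical, and `sys._current_frames` is CPython-specific.

```python
import collections
import sys
import threading
import time


def sample_thread(thread_id, counts, stop_event, interval=0.001):
    """Periodically record which function the given thread is executing."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)


def busy_work(duration=0.3):
    """Hypothetical hot function: spin for roughly `duration` seconds."""
    end = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < end:
        total += 1
    return total


counts = collections.Counter()
stop_event = threading.Event()
sampler = threading.Thread(
    target=sample_thread,
    args=(threading.get_ident(), counts, stop_event),
)
sampler.start()
busy_work()
stop_event.set()
sampler.join()

# Functions that were seen most often took up most of the runtime.
print(counts.most_common(3))
```

The more samples land in a function, the larger its share of the runtime; a flamegraph visualizes exactly these counts, stacked by call depth.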

## Using `py-spy`
- you can simply profile your script with `py-spy record -o profile.svg python myprogram.py`
  - to make timings accurate, it needs to collect enough data; you can control the "sampling rate" using the `-r` argument
  - get more info on arguments with `-h`
- `py-spy` will produce a "flamegraph" like the following (here `profile.svg`; you can open it with, e.g., Firefox)
![flamegraph](./figures/flamegraph.svg)

## Demo

Using a simple script, Jakob will explain how to read flamegraphs.

1. show code
2. draw call graph on the whiteboard
3. execute py-spy
4. students should analyze flamegraph

## Exercise

It's time to put theory into practice. We have prepared an example script (see [./profiling/numerical_integration.py](./profiling/numerical_integration.py)) which numerically computes the integral of a function and measures the error with respect to analytical integration.

0. Fork & clone this repository.
1. Familiarize yourself with the script and execute it (`python numerical_integration.py -h` for some help).
2. Use the workflow (_time -> py-spy -> edit_, and repeat) to reduce the script's execution time. **Make sure not to break the tests.**
3. Commit your changes in a new branch and create a PR. Include the duration before/after optimization in the PR message.

#### Hints
- plateaus indicate optimization opportunities
- focus on your code; for now, ignore time spent in other libraries

Afterwards we will discuss the results jointly.

## Refresher: numerical integration

![RiemannSum](figures/MidRiemann2.svg)

Riemann sum: $\int_a^b dx f(x) \approx \sum_{i = 0}^{n - 1} f(a + (i + 0.5) \Delta x) \Delta x$ with $\Delta x = (b - a)/n$

here $a=0, b=2, n=4$
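
As a minimal sketch, the formula above translates into plain Python as follows. This is not the code from the exercise script; `midpoint_riemann_sum` and the integrand `square` are hypothetical names chosen for illustration.

```python
def midpoint_riemann_sum(f, a, b, n):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def square(x):
    """Hypothetical integrand; its exact integral over [0, 2] is 8/3."""
    return x * x


approx = midpoint_riemann_sum(square, a=0.0, b=2.0, n=4)
exact = 8.0 / 3.0
print(f"approx = {approx}, error = {abs(approx - exact):.4f}")
```

Increasing `n` shrinks the error, at the price of more function evaluations.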

## Exercise discussion

What did we learn?
- ...

## Profiling conclusion

- Before optimizing, first finish your code & write tests!
- Then *measure* to find functions(/lines) that take up most of the time.
- Only optimize the relevant functions(/lines), measure again, and *know when to stop*!
  - 1min script you run 5 times
  - 8h script you run 1000 times
- To gain some basic data, you can use builtin tools
  - `time` (commandline)
  - `%timeit` (ipython, jupyter)
  - `import timeit; timeit.timeit('some_func()', globals=globals())` (requires code changes)
- profilers collect more fine grained data
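
A minimal sketch of timing a single function with the standard-library `timeit` module. `some_func` is a hypothetical placeholder; here it is passed as a callable, which avoids the need for `globals`.

```python
import timeit


def some_func():
    """Hypothetical function to be timed."""
    return sum(i * i for i in range(1_000))


# Call some_func() n_runs times; timeit returns the total duration in seconds.
n_runs = 1_000
total = timeit.timeit(some_func, number=n_runs)
per_call = total / n_runs
print(f"{per_call * 1e6:.1f} µs per call")
```

Averaging over many runs smooths out noise that a single measurement would pick up.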

## Beyond `py-spy`
- [py-spy](https://github.com/benfred/py-spy) is just one of many profilers; alternatives include
  - [cProfile](https://docs.python.org/3/library/profile.html) (builtin!) + [snakeviz](https://github.com/jiffyclub/snakeviz)
  - [pyinstrument](https://github.com/joerick/pyinstrument)
  - [austin](https://github.com/P403n1x87/austin)
- Here we focus on profiling *runtime*, but maybe you are limited by *memory*
  - [memray](https://github.com/bloomberg/memray)
- With modern tools, **profiling is easy! Use it!**
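
For comparison, a minimal sketch using the builtin `cProfile` together with `pstats` to print the most expensive calls. `slow_square_sum` is a hypothetical hot spot; in contrast to `py-spy`, this variant requires changes to your code.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Hypothetical hot spot: sum of squares in pure Python."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.runcall(slow_square_sum, 200_000)  # profile a single call

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```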

![InhalaExhala](figures/inhala.jpg)

### Optimization: what to do (in order of [subjective] increasing complexity)

- **Do nothing**
- Vectorization (`numpy`!!)
- Data structures and algorithms
- Memoization / caching
- Non-Python libraries (`blas`, `openblas`, `blis`, `atlas`, Intel `mkl`, ...)
- Buy better hardware
- **Numba**
- **Cython** / pythran
- **Parallelization** (->tomorrow)
- GPUs (`cuda`, `opencl`, `directml`, ...)
- Low-level code
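
As one example from the list, memoization can be a one-line change. The following minimal sketch uses `functools.lru_cache` from the standard library; `expensive` is a hypothetical function that is called repeatedly with the same argument.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def expensive(n):
    """Hypothetical costly computation; results are cached per argument."""
    return sum(i * i for i in range(n))


# Only the first call computes the result, the other 99 hit the cache.
results = [expensive(10_000) for _ in range(100)]
info = expensive.cache_info()
print(info)
```

This only pays off if the function is pure (same input, same output) and is actually called repeatedly with the same arguments, which is something the profile can tell you.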