# Spatial Processing Benchmarks in Python
### Krzysztof Dyba

In [1]:
# display multiple outputs from single cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

## Introduction

The primary way to perform a benchmark in Python is to use the `Timer` class from the `{timeit}` library.
You can view the documentation interactively in Jupyter Notebook by placing a question mark (`?`) before the object name or by using the built-in `help()` function.
However, first we need to import the library.

In [2]:
import timeit
?timeit.Timer
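
The `?` prefix only works in IPython and Jupyter. As a minimal sketch, the same documentation can be reached from any Python session with `help()` or through the docstring attribute (the variable name `doc` is only illustrative):

```python
import timeit

# help() prints the full documentation and works in any Python session
help(timeit.Timer)

# the raw docstring is also available as an attribute
doc = timeit.Timer.__doc__
print(doc)
```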

Using this class is a bit more complicated than benchmarking in R.
Let's test it as before with the example of sampling numbers with replacement.
For this purpose, we can use the `choices()` function from the `{random}` library.

In [3]:
import random
random.choices(range(1, 100), k = 5)

[62, 82, 75, 53, 57]

Now we will use the `repeat()` function, which performs the benchmark multiple times.
We need to define:
  1. expression as text
  2. global namespace (`globals`)
  3. number of executions of the expression (`number`)
  4. number of repetitions of the test (`repeat`)

The result will be a list of timings in seconds, one per repetition.

In [4]:
n = 1_000_000
t = timeit.repeat("random.choices(range(1, 100), k = n)",
                  globals = globals(), number = 1, repeat = 5)
t

[0.4491137099999998,
 0.3680256270000002,
 0.36668360799999977,
 0.38022859900000006,
 0.3628546049999999]
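
The same benchmark can also be written with the `Timer` class mentioned at the beginning. The sketch below assumes a smaller `n` than in the text so that it runs quickly, and the names `timer` and `t_timer` are only illustrative:

```python
import random
import timeit

n = 100_000  # smaller than in the text so the sketch runs quickly

# build a Timer once; the statement is passed as text, as with timeit.repeat()
timer = timeit.Timer("random.choices(range(1, 100), k = n)",
                     globals = {"random": random, "n": n})

# repeat() returns one total time (in seconds) per repetition
t_timer = timer.repeat(repeat = 5, number = 1)
t_timer
```

Passing an explicit dictionary instead of `globals()` exposes only the names the statement actually needs.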

From these values, we can calculate the basic statistics (like mean and standard deviation).
Statistical functions can be found in the `{statistics}` library.

In [5]:
import statistics
round(statistics.mean(t), 2)
round(statistics.stdev(t), 4)

0.39

0.0362
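
The `timeit` documentation notes that the minimum of the repeated timings is usually the most informative value, because slower runs mostly reflect interference from other processes rather than the code itself. A minimal sketch, reusing the timings printed above (they are machine-specific, and the names `t_min` and `t_median` are only illustrative):

```python
import statistics

# timings copied from the output above; they are machine-specific
t = [0.4491137099999998, 0.3680256270000002, 0.36668360799999977,
     0.38022859900000006, 0.3628546049999999]

# the fastest run is the one least disturbed by other processes
t_min = round(min(t), 4)
# the median is less sensitive to a single slow run than the mean
t_median = round(statistics.median(t), 4)

print(t_min)     # 0.3629
print(t_median)  # 0.368
```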

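For Jupyter Notebooks, there is an easier alternative.
We can use the `timeit` magic command, prefixed with a percent symbol:
  - `%timeit` to benchmark a single line of code
  - `%%timeit` to benchmark an entire cell

The `-r` option sets the number of repetitions and `-n` sets the number of executions per repetition.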

In [None]:
%timeit -r 5 -n 1 random.choices(range(1, 100), k = n)

In [None]:
%%timeit -r 5 -n 1
random.choices(range(1, 100), k = n)

**Exercise**

Compare the time needed to compute the mean of a list using `statistics.mean()` and `sum() / len()`.

## Part I: Raster Data

