# Continuous Performance Analysis for Python

*Following the talk by Arthur Pastel*

This notebook contains some code samples. They illustrate my blog article summarizing the talks I attended at the (very great) PyConFR 2023 in Bordeaux, France.

The code samples shown here were mostly provided at the conference; I rewrote them with little change and, of course, added comments.

/!\ This notebook does not cover the main part of Arthur's talk, the presentation of [Codspeed](https://codspeed.io/). Codspeed is still in development and is not designed for data analysis or notebook use: it is meant to test the performance of programs that run on production servers. That said, the other tools presented during the talk are available today and work in a notebook, so they are what we cover here.

## Creating a Demo function (Toy Algorithm)

We create a function that computes the n-th [Fibonacci number](https://en.wikipedia.org/wiki/Fibonacci_number) recursively. We will run our performance tests on it.

In [1]:
def fibonacci(n: int) -> int:
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

In [2]:
fibonacci(6)

8

## The basic approach

The easiest way to measure performance is to read a clock before and after the call and take the difference.

For that, we can use ```time.perf_counter```, which is a high resolution timer, as you can see below.

*We also have ```time.time```, which uses the system clock, but it is less precise, so always prefer ```time.perf_counter``` when measuring the performance of functions.*
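To make the comparison between the two clocks concrete, here is a small sketch that asks Python what it knows about each one through ```time.get_clock_info```. The reported resolution depends on the operating system, so the printed numbers will differ from one machine to another; the ```clocks``` dictionary is just a name chosen for this example.

```python
import time

# Collect the metadata Python exposes for both clocks
clocks = {name: time.get_clock_info(name) for name in ("perf_counter", "time")}

for name, info in clocks.items():
    print(
        f"{name:>12}: resolution={info.resolution} s, "
        f"monotonic={info.monotonic}, adjustable={info.adjustable}"
    )
```

A monotonic, non-adjustable clock cannot jump backwards if the system time is changed in the middle of a measurement, which is one more reason to use ```time.perf_counter``` for timing code.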

In [3]:
import time

start = time.perf_counter()
fibonacci(10)
end = time.perf_counter()

elapsed_time = round((end - start) * 10**6, 3)
print(f"Elapsed time : {elapsed_time} µs")

Elapsed time : 161.0 µs


A (big) problem with this approach is that it depends on the hardware. Another is that it depends on whatever else the system is executing at the time. A quick fix for the latter is to use basic statistics: run the test many times and take the mean.

In [4]:
samples = []
for sample in range(100):
    start = time.perf_counter()
    fibonacci(10)
    end = time.perf_counter()
    samples.append(end - start)

mean_perf = sum(samples) / len(samples)
mean_perf_micros = round(mean_perf * 10**6, 3)
print(f"Mean : {mean_perf_micros} µs")

Mean : 38.44 µs


As you can see, the difference can be very big: the single run above took 161 µs, while the mean over 100 runs is 38.44 µs. At the conference, Arthur's slides showed around 12 to 13 µs, roughly a third of my mean. Maybe he has a faster computer than mine, or maybe JupyterLab affects code performance.
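Since a single number hides this kind of variation, it can help to look at a few statistics instead of only the mean. The sketch below is one possible variant of the loop above, using ```timeit.repeat``` and the ```statistics``` module from the standard library. The values of ```repeat``` and ```number``` are arbitrary choices for illustration, not recommendations from the talk.

```python
import statistics
import timeit


def fibonacci(n: int) -> int:
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


# Each sample is the total time for `number` consecutive calls
number = 200
samples = timeit.repeat(lambda: fibonacci(10), repeat=10, number=number)

# Convert each sample to a per-call time in microseconds
per_call_us = [sample / number * 10**6 for sample in samples]

summary = {
    "min": min(per_call_us),
    "median": statistics.median(per_call_us),
    "mean": statistics.fmean(per_call_us),
    "stdev": statistics.stdev(per_call_us),
}
for label, value in summary.items():
    print(f"{label:>6} : {value:.3f} µs")
```

The minimum and the median are usually less sensitive than the mean to a few slow runs caused by other activity on the machine.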

## Using specific libraries

[pytest-benchmark](https://anaconda.org/anaconda/pytest-benchmark) is a dedicated library for running performance tests. It takes care of all the steps above (it runs the function several times and applies a statistical approach), and it also makes some adjustments to the Python environment that could otherwise affect performance. You can also [check the official documentation](https://pytest-benchmark.readthedocs.io/en/stable/).

The ipytest library allows us to use pytest with jupyter notebooks:

In [5]:
import ipytest
import pytest

ipytest.autoconfig()

In [6]:
%%ipytest -qq
def test_fibo_5(benchmark):
    @benchmark
    def _():
        fibonacci(5)


def test_fibo_10(benchmark):
    @benchmark
    def _():
        fibonacci(10)


def test_fibo_15(benchmark):
    @benchmark
    def _():
        fibonacci(15)

[32m.[0m[32m.[0m[32m.[0m[32m                                                                                          [100%][0m

[33m---------------------------------------------------------------------------------------- benchmark: 3 tests ----------------------------------------------------------------------------------------[0m
Name (time in us)           Min               Max          Mean        StdDev        Median           IQR   Outliers  OPS (Kops/s)  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------
test_fibo_5        2.4000 (1.0)  1,373.8000 (1.0)  2.9909 (1.0)  8.2103 (1.0)  2.5000 (1.0)  0.0000 (1.0)  136;20825      334.3497

The advantage of the ```@benchmark``` decorator is that you can add setup code that runs before the benchmarked function without it counting toward the measured time:

```python
def test_with_setup(benchmark):
    # Setup code goes here: it runs before the benchmark and is not timed
    n = 10

    @benchmark
    def _():
        # Only the body of this function is measured
        fibonacci(n)
```
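To see why the setup code stays out of the measurement, here is a deliberately simplified, hypothetical stand-in for the fixture. ```toy_benchmark``` is a name made up for this sketch and is not how pytest-benchmark is implemented. It only shows the general idea: a decorator that calls the decorated function immediately, several times, and reads the clock only around those calls.

```python
import statistics
import time
from typing import Callable


def fibonacci(n: int) -> int:
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


def toy_benchmark(rounds: int = 50) -> Callable[[Callable[[], object]], dict]:
    """Hypothetical helper: times the decorated function `rounds` times."""

    def decorator(func: Callable[[], object]) -> dict:
        samples = []
        for _ in range(rounds):
            start = time.perf_counter()  # the timed region starts here
            func()
            samples.append(time.perf_counter() - start)  # and ends here
        return {
            "rounds": rounds,
            "min": min(samples),
            "mean": statistics.fmean(samples),
        }

    return decorator


# Setup: executed once, before any timing happens
n = 10


@toy_benchmark(rounds=50)
def result():
    fibonacci(n)


# After decoration, `result` holds the statistics, not the function
print(f"Mean : {result['mean'] * 10**6:.3f} µs over {result['rounds']} rounds")
```

Because the clock is read only inside the decorator, anything computed before the decorated function is defined (here, ```n = 10```) cannot affect the reported time.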