### How to run benchmarks

Benchmarks are generated by a matrix-like combination (see below) that writes all the commands in the file `run_benchmarks.sh`

You can now run these benchmarks using `/bin/bash run_benchmarks.sh`.

The results file is a list of JSON objects generated by the `atomic_performance_benchmark.py` script.

Once you're done running the benchmarks, check the `Statistical analysis of experiments.ipynb` to see if the [Coefficient of variation](https://en.wikipedia.org/wiki/Coefficient_of_variation) wasn't too high (indicating noise).

### How to generate benchmarks

To generate your own benchmarks (changing number of threads or iterations), change the values from the matrix below and execute the cells. It'll write the lines in `run_benchmarks.sh`:

```bash
python -X gil=0 performance/atomic_performance_benchmark.py remove_elements --collection list --num_threads 100 >> results/results_83b8f2.txt
python -X gil=0 performance/atomic_performance_benchmark.py remove_elements --collection list --num_threads 100 >> results/results_83b8f2.txt
python -X gil=0 performance/atomic_performance_benchmark.py remove_elements --collection list --num_threads 100 >> results/results_83b8f2.txt
```

### Recommendations

OSs will schedule multiple threads on multiple cores and interweave them with other processes running on your system, so these benchmarks will always suffer from some "noise". Make sure you don't have heavy computing processes consuming CPU before doing so.

### How to install Python 3.13 free threaded?

I'm using [`uv`](https://github.com/astral-sh/uv) to install free threaded python in replicable environments. To do so, follow the install instructions in [`uv`'s repo](https://github.com/astral-sh/uv) and then run:

```bash
$ uv python install cpython-3.13.0+freethreaded-macos-x86_64-none
```

### A note about hyperfine and other benchmark tools

I'm trying to scope the benchmarks to ONLY the multi-threaded portion of the code. Each benchmark has some considerable time to setup, so I can't use a tool that looks as the process as a whole. The time is measured as "wall clock time" using `time.monotonic()` inside the process of the benchmark.

I decided against `timeit` or `pyperf` because I don't know how the internals work. I'm not sure if there are subprocesses created or something else. That's why I prefer to write the bash commands to start the process by my shell.

In [73]:
import json
import pandas as pd
import matplotlib.pyplot as plt
import uuid

In [2]:
import seaborn as sns

In [3]:
import subprocess
from itertools import product

In [31]:
from IPython.display import display, HTML, clear_output, Markdown

In [None]:
RUNS = 10

THREADS = [1, 10, 100]

In [72]:
benchmarks = [{
    "subcommand": "create_elements",
    "matrix": {
        "gil": [0, 1],
        "collection": ["list", "set", "dict"],
        "num_threads": THREADS,
    },
}, {
    "subcommand": "remove_elements",
    "matrix": {
        "gil": [0, 1],
        "collection": ["list", "set", "dict"],
        "num_threads": THREADS,
    },
}, {
    "subcommand": "check_length",
    "matrix": {
        "gil": [0, 1],
        "collection": ["list", "set", "dict"],
        "num_threads": THREADS,
    },
}, {
    "subcommand": "move_elements",
    "matrix": {
        "gil": [0, 1],
        "collection": ["list", "set", "dict"],
        "num_threads": THREADS,
    },
}, {
    "subcommand": "increment_elements",
    "matrix": {
        "gil": [0, 1],
        "collection": ["list", "dict"],
        "num_threads": THREADS,
        "num_elements": [100000]
    },
}, {
    "subcommand": "increment_integer",
    "matrix": {
        "gil": [0, 1],
        "collection": ["list"],
        "num_threads": THREADS,
        "num_iterations": [10000]
    },
}]

In [75]:
OUTPUT_FILE = "run_benchmarks.sh"

In [86]:
# WARNING! Create the dir

!mkdir -p results

In [83]:
results_short_uuid = str(uuid.uuid4()).split("-")[0]
command_template = (
    "python -X gil={gil} performance/atomic_performance_benchmark.py "
    "{subcommand} --num_threads {num_threads} --collection {collection} " + f"results/results_{results_short_uuid}.txt"
)
del command_template
base_template = (
    "python -X gil={gil} performance/atomic_performance_benchmark.py "
    "{subcommand}"
)
combinations_per_benchmark = {}
fp = open(OUTPUT_FILE, "w")
for benchmark in benchmarks:
    configuration_params = benchmark['matrix'].keys()
    command_template = base_template
    for param in configuration_params:
        if param in {'gil', 'subcommand'}:
            continue
        command_template += f" --{param} " + "{" + f"{param}" + "}"

    subcommand = benchmark['subcommand']
    matrix = benchmark['matrix']
    combinations = [
        dict(zip(matrix.keys(), values))
        for values in product(*benchmark["matrix"].values())
    ]
    combinations_per_benchmark[subcommand] = combinations
    
    for combo in combinations:
        tpl = command_template
        params = {**combo, "subcommand": subcommand}
        cmd = tpl.format(**params)
        params.setdefault('num_iterations', 1)
        message_md = f"**Benchmark**: {subcommand}"
        for run in range(RUNS):
            print(cmd + f" >> results_{results_short_uuid}.txt", file=fp)

In [84]:
!head run_benchmarks.sh

python -X gil=0 performance/atomic_performance_benchmark.py create_elements --collection list --num_threads 1 >> results_42c72135.txt
python -X gil=0 performance/atomic_performance_benchmark.py create_elements --collection list --num_threads 1 >> results_42c72135.txt
python -X gil=0 performance/atomic_performance_benchmark.py create_elements --collection list --num_threads 1 >> results_42c72135.txt
python -X gil=0 performance/atomic_performance_benchmark.py create_elements --collection list --num_threads 1 >> results_42c72135.txt
python -X gil=0 performance/atomic_performance_benchmark.py create_elements --collection list --num_threads 1 >> results_42c72135.txt
python -X gil=0 performance/atomic_performance_benchmark.py create_elements --collection list --num_threads 1 >> results_42c72135.txt
python -X gil=0 performance/atomic_performance_benchmark.py create_elements --collection list --num_threads 1 >> results_42c72135.txt
python -X gil=0 performance/atomic_performance_benchmark.py cr

In [85]:
!wc -l run_benchmarks.sh

     881 run_benchmarks.sh
