# Scalability Analysis

Parameters to this notebook that you might want to tweak:

In [None]:
benchmark_name = "scalability_benchmark"  # The name of the benchmark as defined with Google Benchmark
output_filename = "scaling.png"  # The image name to save the result to
benchmark_program = "./bench"  # The path to the compiled benchmark program
hyperthreading = True  # Whether hyperthreading is enabled on the machine (if so, the thread count is halved to match the physical cores)

In [None]:
import json
import matplotlib.pyplot as plt
import os
import pandas
import subprocess

Create the environment for our benchmark run:

In [None]:
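env = os.environ.copy()
count = os.cpu_count() or 1  # os.cpu_count() may return None if the count cannot be determined
if hyperthreading:
    count = max(1, count // 2)  # Use one thread per physical core
env["OMP_NUM_THREADS"] = str(count)
env.setdefault("OMP_PROC_BIND", "spread")  # Keep a binding policy that the user has already set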
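
The same environment logic can be sketched as a small function, which makes it easy to check how the hyperthreading flag and an existing `OMP_PROC_BIND` setting interact. The function name `make_benchmark_env` and its parameters are hypothetical and not part of the notebook:

```python
import os


def make_benchmark_env(logical_cpus, hyperthreading, base_env=None):
    """Return a copy of the environment with the OpenMP settings applied.

    Hypothetical helper: the name and signature are illustrative only.
    """
    env = dict(os.environ if base_env is None else base_env)
    count = logical_cpus or 1  # Fall back to 1 if the CPU count is unknown
    if hyperthreading:
        count = max(1, count // 2)  # One thread per physical core
    env["OMP_NUM_THREADS"] = str(count)
    # setdefault only writes the value if the variable is not set yet
    env.setdefault("OMP_PROC_BIND", "spread")
    return env


# An existing OMP_PROC_BIND value is preserved, the thread count is halved
example = make_benchmark_env(8, True, base_env={"OMP_PROC_BIND": "close"})
print(example["OMP_NUM_THREADS"], example["OMP_PROC_BIND"])
```

Because `setdefault` is used, a binding policy exported in the shell before starting the notebook takes precedence over `"spread"`.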

Run the benchmark and parse its JSON output into a Python data structure:

In [None]:
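process = subprocess.run(
    # Pass the arguments as a list so that a program path containing spaces still works
    [benchmark_program, f"--benchmark_filter={benchmark_name}/*", "--benchmark_format=json"],
    env=env,
    stdout=subprocess.PIPE,
    check=True,  # Raise if the benchmark program exits with a non-zero status
)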

In [None]:
data = json.loads(process.stdout.decode())
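
For orientation, the parsed result is a dictionary with a `context` entry describing the machine and a `benchmarks` list with one record per run. The sketch below parses a hand-written sample of that shape; all values are made up for illustration and only a few of the fields that Google Benchmark emits are shown:

```python
import json

# Hand-written sample in the shape of Google Benchmark's JSON output.
# The numbers are invented and most fields are omitted for brevity.
sample_output = """
{
  "context": {"num_cpus": 8},
  "benchmarks": [
    {"name": "scalability_benchmark/1", "run_type": "iteration",
     "per_family_instance_index": 0, "cpu_time": 800.0, "time_unit": "ns"},
    {"name": "scalability_benchmark/2", "run_type": "iteration",
     "per_family_instance_index": 1, "cpu_time": 400.0, "time_unit": "ns"}
  ]
}
"""

parsed = json.loads(sample_output)
for record in parsed["benchmarks"]:
    print(record["name"], record["cpu_time"], record["time_unit"])
```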

Parse the scalability data into a pandas dataframe:

In [None]:
df = pandas.DataFrame(data["benchmarks"])  # Build the frame directly from the list of records

Keep only the raw measurements and add the thread count and speedup columns:

In [None]:
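# Drop aggregate rows (e.g. mean, median); copy to avoid modifying a view of the original frame
df = df[df["run_type"] == "iteration"].copy()
# Assumes the benchmark instances were registered for 1, 2, 3, ... threads in that order
df["num_threads"] = df["per_family_instance_index"] + 1
# The single-threaded time is the baseline; select it by position, not by index label
tseq = df.loc[df["num_threads"] == 1, "cpu_time"].iloc[0]
df["speedup"] = tseq / df["cpu_time"]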
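
The speedup for n threads is the single-threaded time divided by the time measured with n threads. As a minimal check of the column logic, the sketch below applies the same steps to a small hand-made table; the timings are invented and the variable names `sample` and `t_seq` are illustrative only:

```python
import pandas

# Invented timings: three raw measurements plus one aggregate row to be filtered out
records = [
    {"run_type": "iteration", "per_family_instance_index": 0, "cpu_time": 800.0},
    {"run_type": "iteration", "per_family_instance_index": 1, "cpu_time": 400.0},
    {"run_type": "iteration", "per_family_instance_index": 2, "cpu_time": 320.0},
    {"run_type": "aggregate", "per_family_instance_index": 2, "cpu_time": 320.0},
]
sample = pandas.DataFrame(records)

# Keep raw measurements only and derive the thread count from the instance index
sample = sample[sample["run_type"] == "iteration"].copy()
sample["num_threads"] = sample["per_family_instance_index"] + 1

# Baseline: the time of the single-threaded run
t_seq = sample.loc[sample["num_threads"] == 1, "cpu_time"].iloc[0]
sample["speedup"] = t_seq / sample["cpu_time"]

print(sample[["num_threads", "speedup"]])
```

With these numbers, two threads reach a speedup of 2.0 (ideal), while three threads reach only 2.5, which would show up as the measured curve falling below the dashed line in the plot.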

Plot the measured speedup against the perfect speedup in the notebook:

In [None]:
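fig, ax = plt.subplots()
ax.plot(df["num_threads"], df["num_threads"], linestyle="--", label="Perfect Speedup")
df.plot("num_threads", "speedup", ax=ax, label="Measured Speedup")
ax.set_xlabel("Number of threads")
ax.set_ylabel("Speedup")
ax.legend()  # Do not rebind ax: legend() returns a Legend object, not the axes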

Additionally, save to an image file:

In [None]:
fig.savefig(output_filename)