---
title: "End-to-End Benchmarking"
---

## Processing benchmark results

### Import dependencies

The CarbonPlan team put together some utilities for parsing, processing, and visualizing the benchmarking results in [carbonplan_benchmarks](https://github.com/carbonplan/benchmark-maps). We'll use those utilities along with the [HoloViz](https://holoviz.org/) suite of tools for visualization and [pandas](https://pandas.pydata.org/) as the underlying analysis tool.

In [1]:
import carbonplan_benchmarks.analysis as cba
import hvplot.pandas  # registers the .hvplot accessor used on DataFrames below
import holoviews as hv
import pandas as pd

pd.options.plotting.backend = "holoviews"

In [2]:
%load_ext autoreload
%autoreload 2

from carbonplan_benchmarks import analysis as cba


### Show individual results

In [3]:
baseline_fp = "s3://carbonplan-benchmarks/benchmark-data/482049d/baselines.json"
metadata_base_fp = "s3://carbonplan-benchmarks/benchmark-data"
url_filter = 'carbonplan-benchmarks.s3.us-west-2.amazonaws.com/data/'

In [18]:
plt_opts = {'width': 500, 'height': 200}


def plot_individual_result(metadata_fp, snapshot_fp):
    """Plot requests, frames, and screenshot RMSE for the first run in a metadata file."""
    # Load the first run (run=0) and the baseline snapshots, then process them.
    metadata, trace_events = cba.load_data(metadata_path=metadata_fp, run=0)
    snapshots = cba.load_snapshots(snapshot_path=snapshot_fp)
    data = cba.process_run(metadata=metadata, trace_events=trace_events, snapshots=snapshots)
    summary = cba.create_summary(metadata=metadata, data=data)  # not used by the plots below
    # Limit the x-axis to the time span of the first action.
    xlims = (data['action_data'].loc[0, 'start_time'], data['action_data'].loc[0, 'end_time'])
    requests_plt = cba.plot_requests(data['request_data'], url_filter=url_filter).opts(**plt_opts, xlim=xlims)
    frames_plt = cba.plot_frames(data['frames_data'], yl=2.5).opts(**plt_opts, xlim=xlims)
    rmse_plt = cba.plot_screenshot_rmse(screenshot_data=data['screenshot_data'], metadata=metadata).opts(**plt_opts)
    # Stack the three plots in a single column.
    return (requests_plt + frames_plt + rmse_plt).cols(1)

#### Results from one of the V2 runs

In [19]:
plot_individual_result(f"{metadata_base_fp}/207af76/data-2023-08-06T20-16-27.json", baseline_fp)

#### Results from one of the V3 runs

In [21]:
plot_individual_result(f"{metadata_base_fp}/4c65e4e/data-2023-08-07T03-07-19.json", baseline_fp)

### Load all benchmark results

First, list the metadata files associated with each benchmarking run. Their paths are relative to `metadata_base_fp`, and the baseline images that the tests will be compared against are at `baseline_fp`; both were defined above.

In [6]:
metadata_files = [
    "207af76/data-2023-08-06T20-16-27.json",
    "75a6745/data-2023-08-06T23-04-38.json",
    "75a6745/data-2023-08-06T22-08-04.json",
    "75a6745/data-2023-08-06T21-32-42.json",
    "4c65e4e/data-2023-08-07T03-07-19.json",
    "4c65e4e/data-2023-08-07T03-16-10.json",
    "4c65e4e/data-2023-08-07T03-34-57.json",
    "4c65e4e/data-2023-08-07T04-04-31.json",
]

Now, use the utilities from `carbonplan_benchmarks` to load the metadata and baseline images into DataFrames, process those results, and create a summary DataFrame for all runs. Given the large size of the trace files, this may take up to several minutes.

In [7]:
snapshots = cba.load_snapshots(snapshot_path=baseline_fp)
summary_dfs = []
nruns = 50
for file in metadata_files:
    fp = f"{metadata_base_fp}/{file}"
    for run in range(nruns):
        if (run % 10) == 0:
            print(f"Processing run {run} of {file}")
        metadata, trace_events = cba.load_data(metadata_path=fp, run=run)
        data = cba.process_run(
            metadata=metadata, trace_events=trace_events, snapshots=snapshots
        )
        summary_dfs.append(cba.create_summary(metadata=metadata, data=data, url_filter=url_filter))
summary = pd.concat(summary_dfs)

Processing run 0 of 207af76/data-2023-08-06T20-16-27.json
Processing run 10 of 207af76/data-2023-08-06T20-16-27.json
Processing run 20 of 207af76/data-2023-08-06T20-16-27.json
Processing run 30 of 207af76/data-2023-08-06T20-16-27.json
Processing run 40 of 207af76/data-2023-08-06T20-16-27.json
Processing run 0 of 75a6745/data-2023-08-06T23-04-38.json
Processing run 10 of 75a6745/data-2023-08-06T23-04-38.json
Processing run 20 of 75a6745/data-2023-08-06T23-04-38.json
Processing run 30 of 75a6745/data-2023-08-06T23-04-38.json
Processing run 40 of 75a6745/data-2023-08-06T23-04-38.json
Processing run 0 of 75a6745/data-2023-08-06T22-08-04.json
Processing run 10 of 75a6745/data-2023-08-06T22-08-04.json
Processing run 20 of 75a6745/data-2023-08-06T22-08-04.json
Processing run 30 of 75a6745/data-2023-08-06T22-08-04.json
Processing run 40 of 75a6745/data-2023-08-06T22-08-04.json
Processing run 0 of 75a6745/data-2023-08-06T21-32-42.json
Processing run 10 of 75a6745/data-2023-08-06T21-32-42.json
...

## Visualize results

In [8]:
summary[['approach','zarr_version','chunk_size','zoom','duration','fps','filtered_requests','request_duration','request_percent']]

|  | approach | zarr_version | chunk_size | zoom | duration | fps | filtered_requests | request_duration | request_percent |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | direct-client | v2 | 1 | 0.0 | 1044.629 | 54.564826 | 3 | 891.465 | 85.337953 |
| 0 | direct-client | v2 | 1 | 0.0 | 965.975 | 38.303269 | 3 | 825.523 | 85.460079 |
| 0 | direct-client | v2 | 1 | 0.0 | 892.225 | 43.710947 | 3 | 761.602 | 85.359859 |
| 0 | direct-client | v2 | 1 | 0.0 | 956.304 | 36.599240 | 3 | 819.009 | 85.643164 |
| 0 | direct-client | v2 | 1 | 0.0 | 876.892 | 42.194478 | 3 | 756.874 | 86.313252 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 0 | direct-client | v3 | 25 | 0.0 | 2824.118 | 27.265150 | 13 | 2559.694 | 90.636935 |
| 0 | direct-client | v3 | 25 | 0.0 | 6430.100 | 7.931447 | 13 | 6177.064 | 96.064820 |
| 0 | direct-client | v3 | 25 | 0.0 | 2626.289 | 20.180567 | 13 | 2367.954 | 90.163497 |
| 0 | direct-client | v3 | 25 | 0.0 | 2610.521 | 17.621004 | 13 | 2366.830 | 90.665043 |
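The values in the summary suggest that `request_percent` is `request_duration` expressed as a percentage of `duration`; the first row, for example, gives 891.465 / 1044.629 ≈ 85.338%. The sketch below reproduces that arithmetic in plain Python. The helper `request_breakdown` is hypothetical and not part of `carbonplan_benchmarks`, and treating the non-request time as the simple remainder of the action duration is an assumption rather than the library's documented definition.

```python
def request_breakdown(duration_ms, request_duration_ms):
    """Split an action's duration into request and non-request time.

    Hypothetical helper for illustration. Assumption: non-request time is
    the remainder of the action duration; the library may define it
    differently.
    """
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    request_percent = 100.0 * request_duration_ms / duration_ms
    non_request_ms = duration_ms - request_duration_ms
    return request_percent, non_request_ms


# First row of the summary shown above: duration and request_duration in ms.
percent, remainder = request_breakdown(1044.629, 891.465)
print(f"{percent:.3f}% of the action was spent on requests")
print(f"{remainder:.3f} ms was spent elsewhere")
```

The same check can be repeated for any other row as a quick sanity test of the summary columns.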


In [9]:
cmap=['#E1BE6A', '#40B0A6']
plt_opts={'width': 600, 'height': 400}


In [10]:
summary.hvplot.box(
    y="duration",
    by=["chunk_size", 'zarr_version'],
    c="zarr_version",
    cmap=cmap,
    ylabel="Action duration (ms)",
    xlabel="Chunk size (MB)",
    legend=False
).opts(**plt_opts)
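A box plot like this one centers each box on the group median. As a rough numerical companion, the sketch below computes the median duration per `(zarr_version, chunk_size)` group using only the standard library. It uses just the nine rows displayed in the summary excerpt, so the numbers illustrate the calculation and are not the full benchmark result. The function name `median_duration_by_group` is made up for this example.

```python
from collections import defaultdict
from statistics import median

# (zarr_version, chunk_size, duration in ms), copied from the rows displayed
# in the summary excerpt; the full summary contains many more runs.
rows = [
    ("v2", 1, 1044.629),
    ("v2", 1, 965.975),
    ("v2", 1, 892.225),
    ("v2", 1, 956.304),
    ("v2", 1, 876.892),
    ("v3", 25, 2824.118),
    ("v3", 25, 6430.100),
    ("v3", 25, 2626.289),
    ("v3", 25, 2610.521),
]


def median_duration_by_group(rows):
    """Return {(zarr_version, chunk_size): median duration in ms}."""
    groups = defaultdict(list)
    for zarr_version, chunk_size, duration in rows:
        groups[(zarr_version, chunk_size)].append(duration)
    return {key: median(values) for key, values in groups.items()}


medians = median_duration_by_group(rows)
for (zarr_version, chunk_size), value in sorted(medians.items()):
    print(f"{zarr_version}, chunk size {chunk_size}: median {value:.3f} ms")
```

On the full DataFrame, the pandas equivalent would be `summary.groupby(["chunk_size", "zarr_version"])["duration"].median()`.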


In [11]:
summary.hvplot.box(
    y="request_duration",
    by=["chunk_size", 'zarr_version'],
    c="zarr_version",
    cmap=cmap,
    ylabel="Time spent on requests (ms)",
    xlabel="Chunk size (MB)",
    legend=False
).opts(**plt_opts)


In [12]:
summary.hvplot.box(
    y="filtered_requests",
    by=["chunk_size", 'zarr_version'],
    c="zarr_version",
    cmap=cmap,
    ylabel="Number of dataset requests",
    xlabel="Chunk size (MB)",
    legend=False
).opts(**plt_opts)

In [13]:
summary.hvplot.box(
    y="total_requests",
    by=["chunk_size", 'zarr_version'],
    c="zarr_version",
    cmap=cmap,
    ylabel="Total number of requests",
    xlabel="Chunk size (MB)",
    legend=False
).opts(**plt_opts)
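The two request counts plotted above presumably differ only in whether `url_filter` is applied: `filtered_requests` would count requests to the benchmark dataset, while `total_requests` would count every request the page made. The sketch below illustrates that distinction under the assumption that the filter is a plain substring match on the request URL; how `carbonplan_benchmarks` actually applies the filter is not shown in this notebook. The `count_requests` helper and the example URLs are invented for illustration.

```python
url_filter = "carbonplan-benchmarks.s3.us-west-2.amazonaws.com/data/"


def count_requests(urls, url_filter=None):
    """Count requests, optionally keeping only URLs that contain url_filter.

    Hypothetical helper: assumes the filter is a simple substring match.
    """
    if url_filter is None:
        return len(urls)
    return sum(url_filter in url for url in urls)


# Hypothetical request URLs, invented for illustration only.
urls = [
    "https://carbonplan-benchmarks.s3.us-west-2.amazonaws.com/data/example.zarr/chunk-0",
    "https://carbonplan-benchmarks.s3.us-west-2.amazonaws.com/data/example.zarr/chunk-1",
    "https://example.com/static/app.js",
]

total_requests = count_requests(urls)
filtered_requests = count_requests(urls, url_filter)
print(f"{filtered_requests} of {total_requests} requests match the dataset filter")
```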

In [14]:
summary.hvplot.box(
    y="non_request_duration",
    by=["chunk_size", 'zarr_version'],
    c="zarr_version",
    cmap=cmap,
    ylabel="Time not spent on requests (ms)",
    xlabel="Chunk size (MB)",
    legend=False
).opts(**plt_opts)

In [15]:
summary.hvplot.box(
    y="request_percent",
    by=["chunk_size", 'zarr_version'],
    c="zarr_version",
    cmap=cmap,
    ylabel="Percent of time spent on requests",
    xlabel="Chunk size (MB)",
    legend=False
).opts(**plt_opts)

In [16]:
summary.hvplot.box(
    y="fps",
    by=["chunk_size", 'zarr_version'],
    c="zarr_version",
    cmap=cmap,
    ylabel="Frames per second (FPS)",
    xlabel="Chunk size (MB)",
    legend=False
).opts(**plt_opts)
