---
title: "End-to-End Benchmarking"
---

## Processing benchmark results

### Import dependencies

The CarbonPlan team put together some utilities for parsing, processing, and visualizing the benchmarking results in [carbonplan_benchmarks](https://github.com/carbonplan/benchmark-maps). We'll use those utilities along with the [HoloViz](https://holoviz.org/) suite of tools for visualization and [Pandas](https://pandas.pydata.org/) as the underlying analysis tool.

In [None]:
import carbonplan_benchmarks.analysis as cba
import hvplot
import pandas as pd
pd.options.plotting.backend = 'holoviews'


### Load benchmark results

First, define the path to the baseline images that the tests will be compared against and the paths to the metadata files associated with each benchmarking run.

In [None]:
baseline_fp = 's3://carbonplan-benchmarks/benchmark-data/baselines.json'
metadata_base_fp = 's3://carbonplan-benchmarks/benchmark-data'
metadata_files = [
    'data-2023-08-04T01-14-24.json',
    'data-2023-08-04T01-15-30.json',
    'data-2023-08-04T01-16-27.json',
    'data-2023-08-04T01-17-25.json',
    'data-2023-08-04T01-18-37.json',
    'data-2023-08-04T01-19-47.json',
    'data-2023-08-04T01-21-02.json',
    'data-2023-08-04T01-22-08.json'
]
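
The file names above appear to encode the time each run was recorded. As a minimal sketch, assuming that `data-YYYY-MM-DDTHH-MM-SS.json` naming pattern holds, the timestamps can be recovered with the standard library, which is handy for sorting runs or checking how far apart they were. The helper `run_timestamp` is hypothetical and not part of `carbonplan_benchmarks`.

```python
from datetime import datetime

# First two file names from the list above.
metadata_files = [
    "data-2023-08-04T01-14-24.json",
    "data-2023-08-04T01-15-30.json",
]


def run_timestamp(file_name: str) -> datetime:
    """Hypothetical helper: parse the timestamp assumed to be in the file name."""
    return datetime.strptime(file_name, "data-%Y-%m-%dT%H-%M-%S.json")


timestamps = [run_timestamp(name) for name in metadata_files]
elapsed = timestamps[1] - timestamps[0]
print(elapsed.total_seconds())  # 66.0
```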

Now, use the utilities from `carbonplan_benchmarks` to load the metadata and baseline images into DataFrames, process those results, and create a summary DataFrame for all runs.

In [None]:
snapshots = cba.load_snapshots(snapshot_path=baseline_fp)
summary_dfs = []
for file in metadata_files:
    fp = f"{metadata_base_fp}/{file}"
    metadata, trace_events = cba.load_data(metadata_path=fp, run=0)
    data = cba.process_run(metadata=metadata, trace_events=trace_events, snapshots=snapshots)
    summary_dfs.append(cba.create_summary(metadata=metadata, data=data))
summary = pd.concat(summary_dfs)
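
The loop above collects one DataFrame per run and concatenates them once at the end. The sketch below illustrates that pattern on stand-in data and shows how `ignore_index=True` changes the resulting index. The `make_summary` function and the file names are hypothetical placeholders; they say nothing about what `cba.create_summary` actually returns.

```python
import pandas as pd

files = ["run-a.json", "run-b.json", "run-c.json"]  # hypothetical file names


def make_summary(file_name: str) -> pd.DataFrame:
    """Hypothetical stand-in that returns a single-row DataFrame per run."""
    return pd.DataFrame({"source_file": [file_name], "zoom": [0]})


frames = [make_summary(f) for f in files]

# Default concat keeps each frame's own index, so labels repeat.
stacked = pd.concat(frames)
# ignore_index=True assigns a fresh 0..n-1 index instead.
renumbered = pd.concat(frames, ignore_index=True)

print(list(stacked.index), list(renumbered.index))
```

Whether to renumber is a judgment call: a unique index makes label-based row selection unambiguous, while the default preserves whatever index each per-run frame carried.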

In [None]:
summary.head(n=8)


## Visualize results

First, let's see how the duration of each action changes as a function of the zoom level. Keep in mind that the underlying dataset has only four pyramid levels, so zoom=4 does not need to fetch any new data.

In [None]:
summary.plot.scatter(x='zoom', y='duration', by='zarr_version')
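
A table can complement the scatter plot when several runs share the same zoom level. The sketch below pivots a small synthetic DataFrame to get the median duration per zoom level and Zarr version. The column names mirror those used in the plot, but the values and the `"v2"`/`"v3"` labels are made-up placeholders, not benchmark results.

```python
import pandas as pd

# Synthetic placeholder data; not real benchmark output.
demo = pd.DataFrame(
    {
        "zoom": [0, 0, 1, 1],
        "zarr_version": ["v2", "v3", "v2", "v3"],
        "duration": [1.0, 2.0, 3.0, 4.0],
    }
)

# Rows: zoom level; columns: Zarr version; cells: median duration.
table = demo.pivot_table(
    index="zoom", columns="zarr_version", values="duration", aggfunc="median"
)
print(table)
```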

Now, let's instead show the duration as a function of the chunk size.

In [None]:
summary.plot.scatter(x='chunk_size', y='duration', by='zarr_version')

Next, let's look at the request duration as a function of the chunk size.

In [None]:
summary.plot.scatter(x='chunk_size', y='request_duration', by='zarr_version')

Lastly, let's look at the fraction of time that's spent fetching data as a function of the chunk size.

In [None]:
summary.plot.scatter(x='chunk_size', y='request_percent', by='zarr_version').opts(ylim=(0, 100))
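
For intuition about the last plot, here is a minimal sketch of how a percentage like this could be derived from the two duration columns. It assumes `request_percent` is `request_duration` divided by `duration`, times 100; that formula is an assumption for illustration, not confirmed from the `carbonplan_benchmarks` source, and the values are synthetic.

```python
import pandas as pd

# Synthetic placeholder values; not real benchmark output.
demo = pd.DataFrame(
    {
        "chunk_size": [1, 5, 10],
        "duration": [4000.0, 4000.0, 5000.0],
        "request_duration": [1000.0, 3000.0, 2500.0],
    }
)

# Assumed definition: share of the total duration spent on requests.
demo["request_percent"] = demo["request_duration"] / demo["duration"] * 100
print(demo)
```

Under this definition the values fall between 0 and 100 as long as the request duration does not exceed the total duration, which matches the `ylim=(0, 100)` used in the plot.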