datacube-benchmark

Utilities for benchmarking Zarr datacubes — generate synthetic stores with different chunking schemes, compressors, and dtypes, then measure read performance under realistic access patterns.

Companion package to the Datacube Guide, which documents common pitfalls when producing and consuming multi-dimensional data products.

Installation

pip install datacube-benchmark

Python 3.12+ is required.

Quickstart

Create a synthetic Zarr store on local disk and time a few random-access patterns against it:

from pathlib import Path

import obstore as obs
import zarr

import datacube_benchmark

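# Create the target directory and wrap it in an obstore store.
path = Path.cwd() / "data" / "test.zarr"
path.mkdir(parents=True, exist_ok=True)
store = obs.store.LocalStore(str(path))

# Write a synthetic datacube through the store.
zarr_store = datacube_benchmark.create_zarr_store(store)

# Open the array that was written and time each access pattern against it.
# zarr_format replaces the deprecated zarr_version keyword in zarr-python 3.
arr = zarr.open_array(zarr_store, zarr_format=3, path="data")
results = datacube_benchmark.benchmark_access_patterns(arr, num_samples=10)
print(results)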

create_zarr_store takes target sizes and chunk shapes as strings or pint quantities (e.g. "1 GB", "10 MB"), and writes through an obstore store — so the same call works against a local directory, S3, GCS, or Azure by swapping the store.
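
To get a feel for how a target size and a chunk size relate, the arithmetic can be sketched in plain Python. This is only an illustration, not the package's implementation: parse_size, chunk_count, and elements_per_chunk are hypothetical helpers, and the decimal (SI) unit table is an assumption about how strings such as "1 GB" are interpreted.

```python
import math

# Decimal (SI) multipliers. This is an assumption for illustration;
# the package may interpret unit strings differently (e.g. via pint).
UNIT_BYTES = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> int:
    """Convert a string such as '10 MB' into a byte count (hypothetical helper)."""
    value, unit = text.split()
    return int(float(value) * UNIT_BYTES[unit])


def chunk_count(target_size: str, chunk_size: str) -> int:
    """Number of uncompressed chunks needed to hold target_size, rounding up."""
    return math.ceil(parse_size(target_size) / parse_size(chunk_size))


def elements_per_chunk(chunk_size: str, itemsize: int) -> int:
    """Uncompressed elements that fit in one chunk for a dtype of itemsize bytes."""
    return parse_size(chunk_size) // itemsize


print(chunk_count("1 GB", "10 MB"))      # chunks in a 1 GB array with 10 MB chunks
print(elements_per_chunk("10 MB", 4))    # float32 values per 10 MB chunk
```

Under these assumptions a 1 GB array with 10 MB chunks splits into 100 chunks, which is the number of objects a full read has to fetch before compression is taken into account.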

What's in the box

create_zarr_store, create_or_open_zarr_store, create_or_open_zarr_array, create_empty_dataarray — build synthetic Zarr datacubes at a target size, resolution, and chunk shape.
benchmark_zarr_array — time random reads against one access pattern ("point", "time_series", "spatial_slice", "full") and return summary statistics with units attached.
benchmark_access_patterns — run all four access patterns and return the combined results as a pandas.DataFrame.
benchmark_dataset_open — time xarray.open_dataset on a Zarr store.
Config — a dataclass collecting the common knobs (compressor, target array size, sample counts, concurrency).

See the API reference for the full signatures and parameter docs.
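
The four access pattern names can be read as different index selections on a three-dimensional cube. The sketch below shows one plausible interpretation; it assumes a (time, y, x) dimension order and that each pattern selects what its name suggests, neither of which is confirmed here. sample_selection and selection_size are hypothetical helpers, not part of datacube_benchmark.

```python
import random
from math import prod

Selection = tuple[slice, slice, slice]


def sample_selection(pattern: str, shape: tuple[int, int, int], rng: random.Random) -> Selection:
    """Return a (time, y, x) selection for one random read of the given pattern.

    The meaning attached to each pattern name is an assumption for illustration.
    """
    nt, ny, nx = shape
    t, y, x = rng.randrange(nt), rng.randrange(ny), rng.randrange(nx)
    if pattern == "point":
        # One value at a random time and location.
        return (slice(t, t + 1), slice(y, y + 1), slice(x, x + 1))
    if pattern == "time_series":
        # Every time step at one random location.
        return (slice(0, nt), slice(y, y + 1), slice(x, x + 1))
    if pattern == "spatial_slice":
        # The whole spatial plane at one random time step.
        return (slice(t, t + 1), slice(0, ny), slice(0, nx))
    if pattern == "full":
        # The entire array.
        return (slice(0, nt), slice(0, ny), slice(0, nx))
    raise ValueError(f"unknown access pattern: {pattern!r}")


def selection_size(selection: Selection) -> int:
    """Number of elements covered by a selection."""
    return prod(s.stop - s.start for s in selection)


shape = (365, 180, 360)  # illustrative daily cube on a 1-degree grid
rng = random.Random(0)
for pattern in ("point", "time_series", "spatial_slice", "full"):
    print(pattern, selection_size(sample_selection(pattern, shape, rng)))
```

Comparing the element counts against the chunk shape shows why a layout that favours one pattern can penalise another: a time series read touches every chunk along the time axis, while a spatial slice touches every chunk in the plane.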

License

MIT
