python-cli-bench

a benchmark harness for python CLIs using hyperfine.

features

wall-time benchmarks with hyperfine (warm/cold cache)
import time analysis with python -X importtime
call tree profiling with pyinstrument
memory profiling with tracemalloc
Welch's t-test for statistically significant regression detection
terminal visualization with plotext
JSON export for CI integration

install

# with uv
uv tool install python-cli-bench

# with pip
pip install python-cli-bench

# hyperfine is required for benchmarks
brew install hyperfine  # macOS

quickstart

# create a bench.toml in your project
cd your-project
cli-bench init

# edit bench.toml to configure your CLI commands
# then run benchmarks
cli-bench run

# save results for comparison
cli-bench run -o baseline.json

# compare against baseline (with Welch's t-test)
cli-bench run -c baseline.json

configuration

create a bench.toml in your project root:

[project]
name = "mycli"
import_path = "mycli.cli"          # for profile/imports analysis
version_module = "mycli"           # to detect version
typer_app_path = "mycli.cli:app"   # for memory profiling (if using typer)

[[commands]]
name = "mycli --help"
args = ["mycli", "--help"]
category = "startup"
description = "baseline import cost"

[[commands]]
name = "mycli --version"
args = ["mycli", "--version"]
category = "startup"
description = "minimal output"

[[commands]]
name = "mycli some-command"
args = ["mycli", "some-command"]
category = "local"
description = "local command without network"

usage

# run benchmarks
cli-bench run                    # all commands
cli-bench run --category startup # specific category
cli-bench run -o results.json    # save results
cli-bench run -c baseline.json   # compare against baseline

# analysis tools
cli-bench profile                # pyinstrument call tree
cli-bench imports                # import time breakdown
cli-bench imports --top 50       # show more modules

# visualization
cli-bench plot results.json
cli-bench plot results.json -c baseline.json

# CI-friendly compare outputs
cli-bench compare baseline.json head.json --fail-on-regression --threshold 15
cli-bench compare baseline.json head.json --summary-md compare.md --digest-json digest.json

statistical comparison

when comparing against a baseline, the tool uses Welch's t-test to determine if differences are statistically significant (p < 0.05). results marked with ? are not statistically significant and should be interpreted with caution.

license

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.github/workflows		.github/workflows
src/cli_bench		src/cli_bench
tests		tests
.gitignore		.gitignore
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

python-cli-bench

features

install

quickstart

configuration

usage

statistical comparison

license

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

python-cli-bench

features

install

quickstart

configuration

usage

statistical comparison

license

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages