refactor(bench): migrate benchmarks to codspeed exec CLI by not-matthias · Pull Request #16 · CodSpeedHQ/valgrind-codspeed

not-matthias · 2026-06-02T09:39:57Z

Replace the pytest-codspeed harness (bench.py) with a small generator
(generate_config.py) that emits a codspeed.yml of exec-harness targets,
run via codspeed run. Each target's name is kept byte-identical to the
old pytest test id (test_valgrind[<version>, <cmd>, <config>]) so the
dashboard names are unchanged.
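
The naming rule can be sketched as follows. This is a minimal illustration, not the contents of generate_config.py: the function names are hypothetical, and the command and config values are a subset copied from the benchmark names reported further down this page.

```python
from itertools import product

# Subset of the commands and configs seen in the reported benchmark names.
# The real COMMANDS/CONFIGS live in generate_config.py and may differ.
COMMANDS = ["echo Hello, World!", "python3 testdata/test.py"]
CONFIGS = ["no-inline", "inline", "full-no-inline", "full-with-inline"]


def target_name(version: str, cmd: str, config: str) -> str:
    """Rebuild the old pytest test id for one benchmark target."""
    return f"test_valgrind[{version}, {cmd}, {config}]"


def target_names(version: str) -> list[str]:
    """One name per (command, config) pair for a single Valgrind version."""
    return [target_name(version, cmd, cfg) for cmd, cfg in product(COMMANDS, CONFIGS)]


names = target_names("valgrind-3.25.1")
print(names[0])  # test_valgrind[valgrind-3.25.1, echo Hello, World!, no-inline]
```

Because the name is plain string formatting over the same three fields, any change in spacing or separator would produce a different dashboard name, which is why byte-identity matters here.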

The exec-harness reports the URI as exec_harness::<name> (prefix is
hardcoded in the runner) instead of bench.py::<name>; the platform keys
history on the full URI, so this starts a fresh history series with the
same display names.
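
To make the URI change concrete, here is a small sketch. The helper names are hypothetical and the split on `::` only illustrates the idea; it is not how the runner or the platform is implemented.

```python
def benchmark_uri(prefix: str, name: str) -> str:
    """Join a harness prefix and a benchmark name into a full URI."""
    return f"{prefix}::{name}"


def display_name(uri: str) -> str:
    """Everything after the first '::' separator (illustrative only)."""
    return uri.split("::", 1)[1]


name = "test_valgrind[valgrind-3.25.1, echo Hello, World!, no-inline]"
old_uri = benchmark_uri("bench.py", name)
new_uri = benchmark_uri("exec_harness", name)

print(old_uri == new_uri)                               # False
print(display_name(old_uri) == display_name(new_uri))   # True
```

The full URIs differ, so a history keyed on them starts over, while the part shown as the name stays the same.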

CI now matrixes over the Valgrind version only (one job per version) and
generates that version's config from valgrind --version. Per-version
jobs preserve failure isolation: the exec-harness aborts the whole batch
on a single non-zero command, so upstream versions stay separate and keep
continue-on-error.
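
The isolation argument can be modelled with a toy simulation. It assumes only what is stated above (one non-zero exit code fails the whole batch); the exit codes and the `batch_succeeds` helper are made up for illustration.

```python
def batch_succeeds(exit_codes: list[int]) -> bool:
    """A batch succeeds only if every command in it exits with 0."""
    return all(code == 0 for code in exit_codes)


# Hypothetical exit codes: one command fails under one version only.
per_version = {
    "valgrind-3.26.0": [0, 0, 0],
    "valgrind-3.25.1": [0, 1, 0],
}

# One job per version: a failure stays inside its own job.
isolated = {version: batch_succeeds(codes) for version, codes in per_version.items()}

# A single combined batch: the same failure takes everything down.
combined = batch_succeeds([code for codes in per_version.values() for code in codes])

print(isolated)  # {'valgrind-3.26.0': True, 'valgrind-3.25.1': False}
print(combined)  # False
```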

greptile-apps · 2026-06-02T09:43:53Z

Greptile Summary

This PR replaces the per-command matrix + bench.py pytest-codspeed harness with a generator script (generate_config.py) that emits a codspeed.yml consumed by the CodSpeed exec harness. The CI matrix now iterates over Valgrind versions only, with each job generating its own config from valgrind --version, keeping benchmark names byte-identical to the old pytest IDs.

- Workflow simplification: The matrix collapses from cmd × valgrind to valgrind only; fail-fast: false is added so matrix jobs are isolated, and the timeout is increased from 20 to 30 minutes to accommodate the larger per-job benchmark batch (5 commands × 4 configs).
- Harness migration: bench.py and pytest.ini are deleted; generate_config.py produces a JSON-in-YAML config that the CodSpeed action picks up via config: bench/codspeed.yml.
- URI change: The platform will start a fresh history series under the new exec_harness::<name> URI prefix while preserving display names. This is acknowledged and intentional in the PR description.
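
The "JSON-in-YAML" approach mentioned in the summary works because YAML 1.2 accepts JSON syntax, so a generator can serialize with the standard `json` module and still write a `.yml` file. The sketch below assumes a hypothetical schema: the `benchmarks`, `name`, and `exec` keys are placeholders and are not taken from the CodSpeed documentation or from generate_config.py.

```python
import json

# Hypothetical schema: these key names are placeholders only.
config = {
    "benchmarks": [
        {
            "name": "test_valgrind[valgrind-3.25.1, echo Hello, World!, no-inline]",
            "exec": "valgrind --tool=callgrind echo Hello, World!",
        }
    ]
}

# json.dumps double-quotes every string, so characters that are special in
# YAML ('[', ',', '!') inside the names need no extra escaping.
yaml_text = json.dumps(config, indent=2)

# Round-trip check: the emitted text carries exactly the same data.
print(json.loads(yaml_text) == config)  # True
```

One benefit of this design is that the generator needs no YAML library at all.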

Confidence Score: 5/5

The change is a clean harness swap with no functional risk to the benchmarks themselves; no data-flow or logic paths are altered.

All three matrix variants (3.26.0, 3.25.1, local) now run independently with fail-fast: false. The version-label and benchmark-name derivation in generate_config.py mirrors the old bench.py logic exactly. The exec_cmd string construction is correct for shell execution. The only observable side-effect is a fresh history series on the CodSpeed dashboard, which is explicitly documented in the PR description.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/codspeed.yml | Matrix simplified to valgrind versions only; adds fail-fast: false, bumps timeout to 30 min, adds a Generate CodSpeed config step, and uses the new config-based CodSpeed action invocation. No logic errors found. |
| bench/generate_config.py | New script generates a codspeed.yml from hardcoded COMMANDS/CONFIGS. Version label derivation mirrors old bench.py logic exactly. exec_cmd construction via join() is correct for shell execution. |
| bench/bench.py | Deleted: fully replaced by generate_config.py and codspeed exec harness. |
| .gitignore | Adds /bench/codspeed.yml and /bench/codspeed.yaml to .gitignore since the file is now generated at CI time. |
| bench/pytest.ini | Deleted: no longer needed after removal of pytest-based bench harness. |
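
As a rough illustration of the exec_cmd construction noted for generate_config.py, the sketch below joins the pieces with spaces and then tokenizes the result the way a POSIX shell roughly would. `build_exec_cmd` is a hypothetical helper; the only Valgrind flag used is `--tool=callgrind`, taken from the sequence diagram on this page, and the per-config flags of the real script are not reproduced.

```python
import shlex


def build_exec_cmd(valgrind_args: list[str], cmd: str) -> str:
    """Join valgrind, its arguments, and the target command into one string."""
    return " ".join(["valgrind", *valgrind_args, cmd])


exec_cmd = build_exec_cmd(["--tool=callgrind"], "echo Hello, World!")
print(exec_cmd)  # valgrind --tool=callgrind echo Hello, World!

# shlex.split approximates POSIX shell word splitting: the target command is
# not quoted as a single argument, so each word becomes its own argv entry.
print(shlex.split(exec_cmd))
# ['valgrind', '--tool=callgrind', 'echo', 'Hello,', 'World!']
```

Leaving the command unquoted is what lets Valgrind receive `echo` as the program and the remaining words as its arguments.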

Sequence Diagram

sequenceDiagram
    participant CI as GitHub Actions (matrix: valgrind version)
    participant Gen as generate_config.py
    participant VG as valgrind --version
    participant CS as CodSpeedHQ/action
    participant EH as exec harness

    CI->>VG: valgrind --version
    VG-->>Gen: e.g. valgrind-3.26.0
    Gen->>Gen: build COMMANDS x CONFIGS targets
    Gen->>CI: writes bench/codspeed.yml
    CI->>CS: mode walltime config bench/codspeed.yml
    loop each benchmark target
        CS->>EH: "exec_cmd valgrind --tool=callgrind cmd"
        EH-->>CS: callgrind output
    end
    CS-->>CI: results uploaded under exec_harness name

Reviews (3): Last reviewed commit: "refactor(bench): migrate benchmarks to c..."

codspeed-hq · 2026-06-02T09:47:55Z

Merging this PR will create unknown performance changes

🆕 60 new benchmarks
⏩ 60 skipped benchmarks¹

Performance Changes

| | Benchmark | `BASE` | `HEAD` | Efficiency |
| --- | --- | --- | --- | --- |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, full-no-inline]` | N/A | 6.8 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, full-with-inline]` | N/A | 5.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, no-inline]` | N/A | 2.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, no-inline]` | N/A | 4.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, full-with-inline]` | N/A | 7.7 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, no-inline]` | N/A | 496.5 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, inline]` | N/A | 4.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, full-no-inline]` | N/A | 5.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, inline]` | N/A | 4.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, full-with-inline]` | N/A | 729.7 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, inline]` | N/A | 7.2 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, full-no-inline]` | N/A | 3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, inline]` | N/A | 663.9 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, full-with-inline]` | N/A | 6.9 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, full-no-inline]` | N/A | 8.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, full-no-inline]` | N/A | 563.1 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, full-with-inline]` | N/A | 8.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, no-inline]` | N/A | 3.1 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, inline]` | N/A | 3.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, no-inline]` | N/A | 4.1 s | N/A |
| ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Comparing cod-2766-migrate-valgrind-benchmarks-to-exec-cli (d2dd609) with master (4a64bb2)

¹ 60 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
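
The "60 new, 60 skipped" figures are what one would expect if the report compares benchmarks by full URI, as the PR description says the platform does. The sketch below reproduces the arithmetic under that assumption. The command and config values are copied from the table above; the version labels are placeholders, since only one real label appears in the first 20 rows.

```python
from itertools import product

VERSIONS = ["version-a", "version-b", "version-c"]  # placeholder labels
COMMANDS = [
    "python3 testdata/test.py",
    "stress-ng --cpu 1 --cpu-ops 10",
    "stress-ng --cpu 4 --cpu-ops 10",
    "testdata/take_strings-aarch64 varbinview_non_null",
    "echo Hello, World!",
]
CONFIGS = ["no-inline", "inline", "full-no-inline", "full-with-inline"]

names = [
    f"test_valgrind[{version}, {cmd}, {config}]"
    for version, cmd, config in product(VERSIONS, COMMANDS, CONFIGS)
]

base = {f"bench.py::{name}" for name in names}      # URIs known on master
head = {f"exec_harness::{name}" for name in names}  # URIs produced by this PR

new = head - base      # present only on HEAD
skipped = base - head  # present only on BASE

print(len(new), len(skipped))  # 60 60
```

Every name appears on both sides, but no URI does, so each benchmark is counted once as new and once as skipped.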


not-matthias force-pushed the cod-2766-migrate-valgrind-benchmarks-to-exec-cli branch from dece53d to 0b22c7b Compare June 2, 2026 09:40

greptile-apps Bot reviewed Jun 2, 2026

Comment thread bench/generate_config.py

Comment thread bench/generate_config.py

not-matthias requested review from GuillaumeLagrange and art049 June 2, 2026 10:16

not-matthias force-pushed the cod-2766-migrate-valgrind-benchmarks-to-exec-cli branch from 1799be9 to d2dd609 Compare June 2, 2026 10:30

GuillaumeLagrange approved these changes Jun 2, 2026

not-matthias merged commit d2dd609 into master Jun 2, 2026
9 checks passed

not-matthias deleted the cod-2766-migrate-valgrind-benchmarks-to-exec-cli branch June 2, 2026 12:17
