refactor(bench): migrate benchmarks to codspeed exec CLI #16

Merged: not-matthias merged 1 commit into master from cod-2766-migrate-valgrind-benchmarks-to-exec-cli on Jun 2, 2026

@not-matthias
Member

Replace the pytest-codspeed harness (bench.py) with a small generator
(generate_config.py) that emits a codspeed.yml of exec-harness targets,
run via codspeed run. Each target's name is kept byte-identical to the
old pytest test id (test_valgrind[<version>, <cmd>, <config>]) so the
dashboard names are unchanged.
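
The naming scheme can be sketched as follows. This is a minimal illustration, not the actual generate_config.py: the command and config lists are read off the benchmark names reported on this PR, and `target_name` is a hypothetical helper.

```python
from itertools import product

# Commands and configs as they appear in the benchmark names on this PR.
COMMANDS = [
    "python3 testdata/test.py",
    "stress-ng --cpu 1 --cpu-ops 10",
    "stress-ng --cpu 4 --cpu-ops 10",
    "testdata/take_strings-aarch64 varbinview_non_null",
    "echo Hello, World!",
]
CONFIGS = ["no-inline", "inline", "full-no-inline", "full-with-inline"]


def target_name(version: str, cmd: str, config: str) -> str:
    """Hypothetical helper: rebuild the old pytest test id for one target."""
    return f"test_valgrind[{version}, {cmd}, {config}]"


# One name per (command, config) pair for a single Valgrind version.
names = [
    target_name("valgrind-3.25.1", cmd, config)
    for cmd, config in product(COMMANDS, CONFIGS)
]
```

With 5 commands and 4 configs this gives 20 targets per Valgrind version, which matches the 60 benchmarks reported across the three matrix entries.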

The exec-harness reports the URI as exec_harness::<name> (prefix is
hardcoded in the runner) instead of bench.py::<name>; the platform keys
history on the full URI, so this starts a fresh history series with the
same display names.
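
To make the effect of the prefix change concrete, here is a small sketch. It assumes the display name is whatever follows the first `::` in the URI; `display_name` is a hypothetical helper, not platform code.

```python
def old_uri(name: str) -> str:
    # URI shape reported by the pytest-codspeed harness.
    return f"bench.py::{name}"


def new_uri(name: str) -> str:
    # URI shape reported by the exec-harness.
    return f"exec_harness::{name}"


def display_name(uri: str) -> str:
    # Assumption: the display name is the part after the first "::".
    return uri.split("::", 1)[1]


name = "test_valgrind[valgrind-3.25.1, echo Hello, World!, inline]"
same_display = display_name(old_uri(name)) == display_name(new_uri(name))
same_history_key = old_uri(name) == new_uri(name)
```

Here `same_display` is true while `same_history_key` is false, which is why the dashboard names stay put but the history series restarts.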

CI now matrixes over the Valgrind version only (one job per version) and
generates that version's config from valgrind --version. Per-version
jobs preserve failure isolation: the exec-harness aborts the whole batch
on a single non-zero command, so upstream versions stay separate and keep
continue-on-error.
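
The abort-on-first-failure behaviour that motivates the per-version jobs can be simulated in a few lines. This is only a model of the behaviour described above, not the exec-harness implementation; `run_batch` is a hypothetical name.

```python
import subprocess
import sys


def run_batch(commands):
    """Run commands in order and stop at the first non-zero exit status.

    Returns (completed_commands, failed_command_or_None).
    """
    completed = []
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # The rest of the batch is never executed.
            return completed, cmd
        completed.append(cmd)
    return completed, None


ok = [sys.executable, "-c", "pass"]
bad = [sys.executable, "-c", "raise SystemExit(3)"]
completed, failed = run_batch([ok, bad, ok])
```

In this model the third command never runs. If all Valgrind versions shared one batch, a failure on one version would likewise hide results for the others.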

@not-matthias not-matthias force-pushed the cod-2766-migrate-valgrind-benchmarks-to-exec-cli branch from dece53d to 0b22c7b Compare June 2, 2026 09:40
@greptile-apps

greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR replaces the per-command matrix + bench.py pytest-codspeed harness with a generator script (generate_config.py) that emits a codspeed.yml consumed by the CodSpeed exec harness. The CI matrix now iterates over Valgrind versions only, with each job generating its own config from valgrind --version, keeping benchmark names byte-identical to the old pytest IDs.

  • Workflow simplification: The matrix collapses from cmd × valgrind to valgrind only; fail-fast: false is added so matrix jobs are isolated, and the timeout is increased from 20 to 30 minutes to accommodate the larger per-job benchmark batch (5 commands × 4 configs).
  • Harness migration: bench.py and pytest.ini are deleted; generate_config.py produces a JSON-in-YAML config that the CodSpeed action picks up via config: bench/codspeed.yml.
  • URI change: The platform will start a fresh history series under the new exec_harness::<name> URI prefix while preserving display names — this is acknowledged and intentional in the PR description.
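
The "JSON-in-YAML" approach works because a JSON document is also valid YAML, so a generator can get by with the standard library alone. The sketch below illustrates the idea only: the field names `benchmarks`, `name` and `exec` are placeholders rather than the real codspeed.yml schema, and the per-config Valgrind flags are omitted.

```python
import json


def build_config(version, commands, configs):
    """Build a config dict with one target per (command, config) pair."""
    targets = []
    for cmd in commands:
        for config in configs:
            targets.append(
                {
                    # Name kept identical to the old pytest test id.
                    "name": f"test_valgrind[{version}, {cmd}, {config}]",
                    # Placeholder field: the shell command to execute.
                    "exec": " ".join(["valgrind", "--tool=callgrind", cmd]),
                }
            )
    # Placeholder top-level key; the real schema may differ.
    return {"benchmarks": targets}


config = build_config(
    "valgrind-3.26.0", ["echo Hello, World!"], ["inline", "no-inline"]
)
# JSON text is valid YAML, so it can be written to a .yml file as-is.
text = json.dumps(config, indent=2)
```

Emitting JSON avoids a YAML library dependency and sidesteps quoting issues for names that contain commas, brackets, or exclamation marks.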

Confidence Score: 5/5

The change is a clean harness swap with no functional risk to the benchmarks themselves; no data-flow or logic paths are altered.

All three matrix variants (3.26.0, 3.25.1, local) now run independently with fail-fast: false. The version-label and benchmark-name derivation in generate_config.py mirrors the old bench.py logic exactly. The exec_cmd string construction is correct for shell execution. The only observable side-effect is a fresh history series on the CodSpeed dashboard, which is explicitly documented in the PR description.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/codspeed.yml | Matrix simplified to valgrind versions only; adds fail-fast: false, bumps timeout to 30 min, adds a Generate CodSpeed config step, and uses the new config-based CodSpeed action invocation. No logic errors found. |
| bench/generate_config.py | New script generates a codspeed.yml from hardcoded COMMANDS/CONFIGS. Version label derivation mirrors old bench.py logic exactly. exec_cmd construction via join() is correct for shell execution. |
| bench/bench.py | Deleted; fully replaced by generate_config.py and codspeed exec harness. |
| .gitignore | Adds /bench/codspeed.yml and /bench/codspeed.yaml to .gitignore since the file is now generated at CI time. |
| bench/pytest.ini | Deleted; no longer needed after removal of pytest-based bench harness. |

Sequence Diagram

sequenceDiagram
    participant CI as GitHub Actions (matrix: valgrind version)
    participant Gen as generate_config.py
    participant VG as valgrind --version
    participant CS as CodSpeedHQ/action
    participant EH as exec harness

    CI->>VG: valgrind --version
    VG-->>Gen: e.g. valgrind-3.26.0
    Gen->>Gen: build COMMANDS x CONFIGS targets
    Gen->>CI: writes bench/codspeed.yml
    CI->>CS: mode walltime config bench/codspeed.yml
    loop each benchmark target
        CS->>EH: "exec_cmd valgrind --tool=callgrind cmd"
        EH-->>CS: callgrind output
    end
    CS-->>CI: results uploaded under exec_harness name
@codspeed-hq

codspeed-hq Bot commented Jun 2, 2026

Merging this PR will create unknown performance changes

🆕 60 new benchmarks
⏩ 60 skipped benchmarks (see footnote 1)

Performance Changes

| | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, full-no-inline]` | N/A | 6.8 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, full-with-inline]` | N/A | 5.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, no-inline]` | N/A | 2.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, no-inline]` | N/A | 4.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, full-with-inline]` | N/A | 7.7 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, no-inline]` | N/A | 496.5 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, inline]` | N/A | 4.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, full-no-inline]` | N/A | 5.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, inline]` | N/A | 4.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, full-with-inline]` | N/A | 729.7 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, inline]` | N/A | 7.2 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, testdata/take_strings-aarch64 varbinview_non_null, full-no-inline]` | N/A | 3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, inline]` | N/A | 663.9 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, python3 testdata/test.py, full-with-inline]` | N/A | 6.9 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, full-no-inline]` | N/A | 8.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, echo Hello, World!, full-no-inline]` | N/A | 563.1 ms | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, full-with-inline]` | N/A | 8.5 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, no-inline]` | N/A | 3.1 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 4 --cpu-ops 10, inline]` | N/A | 3.3 s | N/A |
| 🆕 | `test_valgrind[valgrind-3.25.1, stress-ng --cpu 1 --cpu-ops 10, no-inline]` | N/A | 4.1 s | N/A |
| ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.


Comparing cod-2766-migrate-valgrind-benchmarks-to-exec-cli (d2dd609) with master (4a64bb2)

Footnotes

  1. 60 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Replace the pytest-codspeed harness (bench.py) with a small generator
(generate_config.py) that emits a codspeed.yml of exec-harness targets,
run via `codspeed run`. Each target's name is kept byte-identical to the
old pytest test id (test_valgrind[<version>, <cmd>, <config>]) so the
dashboard names are unchanged.

The exec-harness reports the URI as `exec_harness::<name>` (prefix is
hardcoded in the runner) instead of `bench.py::<name>`; the platform keys
history on the full URI, so this starts a fresh history series with the
same display names.

CI now matrixes over the Valgrind version only (one job per version) and
generates that version's config from `valgrind --version`. Per-version
jobs preserve failure isolation: the exec-harness aborts the whole batch
on a single non-zero command, so upstream versions stay separate and keep
continue-on-error.
@not-matthias not-matthias force-pushed the cod-2766-migrate-valgrind-benchmarks-to-exec-cli branch from 1799be9 to d2dd609 Compare June 2, 2026 10:30
@not-matthias not-matthias merged commit d2dd609 into master Jun 2, 2026
9 checks passed
@not-matthias not-matthias deleted the cod-2766-migrate-valgrind-benchmarks-to-exec-cli branch June 2, 2026 12:17