Symptom
CPU Tests CI fails on PRs with a consistent pattern: the push-triggered run passes, while the pull_request-triggered run on the same SHA fails. This was reproduced on two distinct commits of PR #232.
In both cases the push run starts a few seconds earlier and succeeds; the pull_request run executes concurrently and fails. The failure is not random: it is always the run that starts second.
Root cause
.github/workflows/cpu_test.yml triggers on both push and pull_request, and on a PR branch both events fire. There is no concurrency: block, so two full ~8-minute CPU-test jobs run in parallel at the same SHA against the same external resources (HF Hub auth + dataset downloads, libero assets, etc.). The second job to start loses the race against pre-existing flaky tests in the datasets/HF-Hub network family — the same flake class @shuheng-liu has documented in #229 ("pre-existing failures: datasets/HF-Hub network, test_optional_keys's SimpleNamespace mock missing camera_keys, test_hub").
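For illustration, the double trigger described above corresponds to an `on:` section of roughly the following shape. This is an assumed sketch, since the actual contents of cpu_test.yml are not quoted here, and any branch or path filters the real file carries are omitted:

```yaml
# Assumed shape of the trigger section in .github/workflows/cpu_test.yml.
# A commit pushed to a branch that has an open PR matches both events,
# so two workflow runs start for the same SHA.
on:
  push:
  pull_request:
```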
Fix
Add a concurrency: block at the top of cpu_test.yml so duplicate runs at the same ref dedupe instead of racing:
concurrency:
  group: cpu-test-${{ github.ref }}
  cancel-in-progress: true
This:
- eliminates the duplicate-run race (and the systematic PR-trigger failure it causes),
- saves ~8 min of CPU compute per PR push,
- behaves correctly for force-pushes (cancels the in-flight run for the previous SHA on the same ref).
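One caveat worth checking before relying on this: `github.ref` is `refs/heads/<branch>` for a push event but `refs/pull/<number>/merge` for a pull_request event, so a group keyed on `github.ref` alone may put the two runs in different groups and leave them running in parallel. A minimal sketch of a key that resolves to the branch name for both events, assuming the PR branch lives in the same repository (for fork PRs the push event fires in the fork, so no duplicate run occurs upstream):

```yaml
# Sketch only; an alternative group key, not a verified drop-in.
# github.head_ref is the PR's source branch and is set only on pull_request events.
# github.ref_name is the short branch (or tag) name, used here as the push fallback.
concurrency:
  group: cpu-test-${{ github.head_ref || github.ref_name }}
  cancel-in-progress: true
```

With a shared key, the later-starting run cancels the earlier one, so one of the two checks on the PR would show as cancelled rather than failed.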
The same pattern probably also belongs on pre-commit.yml and the claude bot workflows for the same compute-saving reason, but those don't seem to suffer from the flake.
Out of scope
The datasets/HF-Hub network, test_optional_keys, and test_hub flakes. Those are pre-existing and tracked separately in the notes on PR #229 (fix(pi07): gate optional prefix tokens in low- and high-level planners); the concurrency fix mitigates them by reducing parallel pressure on HF Hub but doesn't eliminate them when a single run hits an HF-side hiccup.
Repro / evidence
PR #232 (a no-op test deletion) hit this twice, on commits 0bf66d5 and c5a7e30. The PR has been left as-is so the failure is still observable in CI logs.