Deferred from PR #1032 review.
Original reviewer comment: #1032 (comment)
Context:
scripts/update-incremental-report.ts computes ↑/↓ trend annotations against the immediately prior release via findPrevRelease(history, i), but when that prior release has null metrics for a given column (e.g. 3.9.5 had null build/rebuild data because both engine workers were SIGKILL'd), the trend cell is silently emitted as blank.
For 3.9.6 wasm, this hides large regressions vs 3.9.4 (the last release with build data):
- Full build: 7.6s → 14.0s (~84% slower)
- No-op rebuild: 19ms → 131ms (~589% slower)
A casual reader of the summary table sees no annotation and may assume parity with the prior release.
Fix:
In scripts/update-incremental-report.ts (engineRow and trend callsites), when the previous release's metric is null, walk further back through history and compare against the most recent prior release that has a non-null value for that specific metric. Either inline the search per-metric or extend findPrevRelease with a metric-aware variant.
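The metric-aware variant could look something like the sketch below. The Release shape and the name findPrevReleaseWithMetric are illustrative assumptions, not the actual types or identifiers in scripts/update-incremental-report.ts:

```typescript
// Hypothetical shape: each release carries a version string and a map of
// metric names to values, where null means the benchmark produced no data.
type Release = { version: string; metrics: Record<string, number | null> };

// Walk backward from the release just before index i and return the first
// one with a non-null value for the requested metric, or null if none exists.
function findPrevReleaseWithMetric(
  history: Release[],
  i: number,
  metric: string,
): Release | null {
  for (let j = i - 1; j >= 0; j--) {
    if (history[j].metrics[metric] != null) return history[j];
  }
  return null;
}

// Example mirroring the situation above: 3.9.5 has null build data,
// so the trend for 3.9.6 should compare against 3.9.4.
const history: Release[] = [
  { version: "3.9.4", metrics: { build: 7.6 } },
  { version: "3.9.5", metrics: { build: null } },
  { version: "3.9.6", metrics: { build: 14.0 } },
];
console.log(findPrevReleaseWithMetric(history, 2, "build")?.version); // "3.9.4"
```

Because the search is per-metric, a release that is null for build but not for rebuild would still serve as the baseline for the rebuild column.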
Also worth investigating separately: are the 3.9.6 wasm numbers themselves a real regression, or environmental noise from the runner that ran the 3.9.6 benchmark?