Summary
Memory note (d1_benchmark_status.md) from 2026-05-12 claimed the D1 benchmark gate code (runner, metrics, baseline, CI workflow) was not yet built in the extension repo despite issue #86 being closed. As of PR #111 (2026-05-15), the runner / metrics / baseline / per-type gate logic clearly DO exist (we used them locally during PR-4). What's unclear is whether the gate is wired into a GitHub Actions workflow that fails CI on regression.
What to verify
- Inspect .github/workflows/ for a job that runs npm run benchmark against benchmark/baseline.json on PR.
- Confirm the threshold check (--threshold 1.0 per runner.ts) and the per-type gate (TP+FP >= 3 per runner.ts:265) both trip the workflow on a real regression.
- If the workflow is missing, add one that runs on PR and blocks merge on metric drops.
- If the workflow exists but isn't required, update branch protection.
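If the workflow does turn out to be missing, a minimal sketch of what it could look like follows. The file name, job id, and Node version are assumptions, not taken from the repo. It also assumes that npm run benchmark exits non-zero when either the threshold check or the per-type gate fails, and that the runner locates benchmark/baseline.json on its own; both should be confirmed against runner.ts before relying on this.

```yaml
# Hypothetical file: .github/workflows/benchmark.yml (name is an assumption)
name: benchmark

on:
  pull_request:

jobs:
  benchmark:            # hypothetical job id; this is the name branch protection would reference
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20   # assumption; match the version the repo actually uses
      - run: npm ci
      # Assumes the benchmark script exits non-zero on regression,
      # which is what makes this step (and the job) fail.
      - run: npm run benchmark -- --threshold 1.0
```

A failing job only blocks merge once it is listed as a required status check in branch protection, so the last bullet above still applies after the workflow is added.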
Acceptance criteria
- .github/workflows/ contains a benchmark job triggered on PR.
- d1_benchmark_status.md updated or deleted depending on findings.
Reference
Memory d1_benchmark_status.md (stale).