feat(ci): harden e2e diff bot (coverage, regression, retries, safety)#2593

Merged
esengine merged 1 commit into main-v2 from feat/e2e-bot-harden
Jun 1, 2026

@esengine esengine commented Jun 1, 2026

Closes the gaps from review:

  • Changed-line coverage: the percentage of the PR's changed source lines that the tests execute.
  • Regression gate: `go build ./...` is folded into the pass criterion.
  • Best-of-N: `/e2e diff xN` retries up to N times (≤5) until a run passes; a single run is labelled as one sample.
  • Safety: the triggering login is passed via an env var (not shell interpolation); pricing is now configurable via repo vars.

All smoke-tested locally (coverage 50%-partial case, by-assertion vs compile-only, retry loop).
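
The best-of-N retry described above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the `result` type, the `run` and `reset` callbacks, and the ranking in `better` (a passing run beats a failing one, then higher changed-line coverage wins) are all assumptions made for the example. Only the cap of 5 attempts and the stop-at-first-pass behaviour come from the description.

```go
package main

import "fmt"

// maxAttempts mirrors the cap of 5 stated in the PR description.
const maxAttempts = 5

// result is a hypothetical summary of one agent run.
type result struct {
	attempt  int
	passed   bool
	coverage float64 // changed-line coverage in the range 0..1
}

// better reports whether a should be preferred over b.
// Assumed ranking: passing beats failing, then higher coverage wins.
func better(a, b result) bool {
	if a.passed != b.passed {
		return a.passed
	}
	return a.coverage > b.coverage
}

// bestOfN runs up to n attempts (clamped to 1..maxAttempts), calls reset
// before every attempt after the first, and stops at the first passing run.
func bestOfN(n int, run func(attempt int) result, reset func() error) (result, error) {
	if n < 1 {
		n = 1
	}
	if n > maxAttempts {
		n = maxAttempts
	}
	var best result
	for i := 1; i <= n; i++ {
		if i > 1 {
			if err := reset(); err != nil {
				return best, fmt.Errorf("reset before attempt %d: %w", i, err)
			}
		}
		r := run(i)
		r.attempt = i
		if i == 1 || better(r, best) {
			best = r
		}
		if r.passed {
			break
		}
	}
	return best, nil
}

func main() {
	// Simulated outcomes: two failing runs, then a pass.
	outcomes := []result{{coverage: 0.2}, {coverage: 0.4}, {passed: true, coverage: 0.5}}
	resets := 0
	best, err := bestOfN(5,
		func(attempt int) result { return outcomes[attempt-1] },
		func() error { resets++; return nil },
	)
	if err != nil {
		panic(err)
	}
	fmt.Printf("best: attempt %d, passed=%v, coverage=%.0f%%, resets=%d\n",
		best.attempt, best.passed, best.coverage*100, resets)
}
```

The sketch keeps only a summary of the best run. A real implementation would presumably also need to preserve that run's output (for example its diff) before resetting the tree for the next attempt.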

- Changed-line coverage: report what % of the PR's changed source lines
  the tests actually execute (parses a coverprofile against the diff's
  new line ranges), turning "N tests added" into an adequacy signal.
- Regression gate: run `go build ./...` after the agent and fold it into
  the pass criterion, so the agent can't green its tests while breaking
  the build elsewhere.
- Best-of-N: the agent is stochastic, so `/e2e diff xN` retries up to N
  times (≤5) until a run passes, resetting the tree between attempts and
  keeping the best; a single run is labelled as one sample.
- Safety: pass the triggering login through an env var instead of
  interpolating it into the shell; make provider pricing configurable via
  repo variables (cost was a hardcoded placeholder).
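
As a rough illustration of the changed-line coverage idea, the sketch below parses the text format written by `go test -coverprofile` and checks which changed lines fall inside an executed block. It is not the bot's implementation: the `block` type, the function names, and the choice to count only instrumented lines in the denominator are assumptions for the example. It also assumes the changed line numbers have already been extracted from the diff and that file paths in the diff have been normalised to match the paths in the profile.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// block is one coverprofile entry: a span of source lines and its hit count.
type block struct {
	file               string
	startLine, endLine int
	count              int
}

// parseProfile reads coverprofile text. Each entry has the form
// file:startLine.startCol,endLine.endCol numStmts count
func parseProfile(profile string) ([]block, error) {
	var blocks []block
	sc := bufio.NewScanner(strings.NewReader(profile))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "mode:") {
			continue
		}
		i := strings.LastIndex(line, ":")
		if i < 0 {
			return nil, fmt.Errorf("malformed profile line: %q", line)
		}
		b := block{file: line[:i]}
		var startCol, endCol, numStmts int
		if _, err := fmt.Sscanf(line[i+1:], "%d.%d,%d.%d %d %d",
			&b.startLine, &startCol, &b.endLine, &endCol, &numStmts, &b.count); err != nil {
			return nil, fmt.Errorf("malformed profile line %q: %w", line, err)
		}
		blocks = append(blocks, b)
	}
	return blocks, sc.Err()
}

// changedLineCoverage counts how many changed lines are instrumented
// (coverable) and how many of those sit in a block that ran at least once.
// Lines outside every block (comments, declarations) are ignored.
func changedLineCoverage(blocks []block, changed map[string][]int) (covered, coverable int) {
	for file, lines := range changed {
		for _, ln := range lines {
			instrumented, hit := false, false
			for _, b := range blocks {
				if b.file != file || ln < b.startLine || ln > b.endLine {
					continue
				}
				instrumented = true
				if b.count > 0 {
					hit = true
				}
			}
			if instrumented {
				coverable++
				if hit {
					covered++
				}
			}
		}
	}
	return covered, coverable
}

func main() {
	// Hypothetical profile: one executed block and one unexecuted block.
	profile := "mode: set\n" +
		"example.com/m/a.go:3.1,5.2 2 1\n" +
		"example.com/m/a.go:7.1,9.2 2 0\n"
	blocks, err := parseProfile(profile)
	if err != nil {
		panic(err)
	}
	// Hypothetical changed lines, as if taken from the diff's new line ranges.
	changed := map[string][]int{"example.com/m/a.go": {4, 8, 20}}
	covered, coverable := changedLineCoverage(blocks, changed)
	if coverable > 0 {
		fmt.Printf("changed-line coverage: %d/%d (%.0f%%)\n",
			covered, coverable, 100*float64(covered)/float64(coverable))
	}
}
```

Counting only instrumented lines keeps comments and blank lines in the diff from dragging the percentage down. Whether the bot makes the same choice is not stated in the description.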
@esengine esengine merged commit da59857 into main-v2 Jun 1, 2026
@esengine esengine deleted the feat/e2e-bot-harden branch June 1, 2026 13:20
@github-actions github-actions Bot added the v2 Go rewrite (1.x) — main-v2 branch, active development label Jun 1, 2026