fix(triggers): revalidate work-item freshness before implementation dispatch (MNG-1053)#1378
Conversation
…ispatch (MNG-1053)
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
nhopeatall
left a comment
There was a problem hiding this comment.
Summary
Requesting changes: the freshness gate is in the right place, but two traced paths still allow incorrect dispatch behavior.
Code Issues
Blocking
- src/triggers/shared/implementation-freshness-gate.ts:359 - PR verification assumes the
runAgentWithCredentialsGitHub token scope, but the shared pipeline is also used by dashboard manual/retry jobs throughtriggerManualRun, which only establishes PM credentials. For an implementation retry with a recent successful run and PR URL,githubClient.getPR()throwsNo GitHub client in scope; the gate then returnsneeds_human_reconciliation, blocking manual reimplementation even when the PR is closed-unmerged. If the only signal ispr_work_items, the same missing token leaveshaveOtherEvidencefalse and the existing open PR is ignored. - src/triggers/shared/implementation-freshness-gate.ts:481 - Checklist read failures fall open when there is no separate DB/PR evidence. A stale PM job whose live
Implementation Stepschecklist was completed while queued, but whosegetChecklists()call transiently fails before any PR/run evidence exists, returnsdispatchableand starts the duplicate implementation. The work item explicitly required checklist read uncertainty to fail closed before starting from stale context.
🕵️ codex · gpt-5.5 · run details
| let inspectedPRs: InspectedPR[] = []; | ||
| let erroredPRs: CandidatePR[] = []; | ||
| if (input.project.repo && candidates.length > 0) { | ||
| const verification = await inspectPullRequests(input.project.repo, candidates); |
There was a problem hiding this comment.
This assumes every caller of the shared pipeline is already inside withGitHubToken, but dashboard manual/retry jobs call runAgentExecutionPipeline through triggerManualRun with only PM credentials. In that path githubClient.getPR() throws No GitHub client in scope; with a recent successful run plus PR URL the gate turns that into needs_human_reconciliation, so an intentional manual reimplementation is blocked even if the PR is closed-unmerged. With only a pr_work_items candidate, haveOtherEvidence is false and an existing open PR can be missed. Please either guarantee GitHub scope for all implementation pipeline callers or avoid using the scoped singleton from paths that cannot satisfy it.
| } | ||
| } | ||
|
|
||
| if (!signals.checklistsResult && otherSignalsBesidesChecklist) { |
There was a problem hiding this comment.
This still dispatches when getChecklists() fails and there is no other DB/PR evidence. That misses the core stale-checklist case: a PM coalesced implementation job can sit in the queue, a human completes Implementation Steps, and then a transient PM read failure makes checklistsResult null with no other signals yet, so the duplicate implementation starts. The work item called for checklist read uncertainty to fail closed before dispatching from stale context.
|
@aaight address code review concerns |
|
@zbigniewsobiecki addressed the review concerns and pushed
Verification: 🕵️ codex · gpt-5.5 · run details |
nhopeatall
left a comment
There was a problem hiding this comment.
LGTM - the follow-up commit resolves the prior blockers: PR verification now establishes the implementation GitHub persona scope itself, checklist read uncertainty fails closed, and the pipeline short-circuits before budget checks, work-item persistence, lifecycle mutation, or agent startup.
🕵️ codex · gpt-5.5 · run details
Summary
Adds an implementation freshness gate inside the shared worker execution pipeline so stale or duplicate
implementationdispatches stop before the agent starts mutating work-item state. PM router adapters intentionally embed a pre-resolvedTriggerResultfor delayed/coalesced PM jobs, which means the router-level work-item lock cannot see a sibling implementation run that already completed via post-completion review chaining or a manual run, a terminal checklist an operator finished while the job sat in the PM coalesce window, or a PR that already exists for the work item. The gate runs the last-mile check using live PM + DB + GitHub state.This addresses MNG-1053 and closes the duplicate-agent-run class observed on MNG-1013 (see MNG-1032).
What changed
src/triggers/shared/implementation-freshness-gate.ts(new)dispatchable | already_implemented | active_implementation | implementation_pr_exists | needs_human_reconciliationwith a human-readable message and structured evidence.dispatchableimmediately whenagentType !== 'implementation'or noworkItemIdis resolved, soreview/respond-to-review/respond-to-ci/respond-to-pr-commentfollow-up agents keep their existing dispatch path.provider.getChecklists(workItemId). Terminal checklists are recognised by exact heading match against"Implementation Steps"and"Acceptance Criteria"— narrow on purpose so unrelated checklists (dependencies, friction, etc.) cannot accidentally block reimplementation. Non-empty fully-complete terminal checklists returnalready_implemented.countActiveRuns({ projectId, workItemId, agentType: 'implementation' })within the existingDEFAULT_STALE_RUN_THRESHOLD_MSwindow and inspects recent successful runs viagetRunsByWorkItem.pr_work_items+agent_runs.prUrl, de-duplicates by PR number, and verifies each throughgithubClient.getPR()using the existing persona-token scope established byrunAgentWithCredentials. Open PRs returnimplementation_pr_exists; merged PRs returnalready_implemented; closed-unmerged PRs do not permanently block reimplementation.needs_human_reconciliation. A successful implementation run without aprUrlis unexpected →needs_human_reconciliation. When evidence is empty and helpers fail, the gate staysdispatchableso we never block on the gate's own bugs.postFreshnessSkipNoticeupdates the existingackCommentIdwhen present, falls back toaddComment, and never throws.src/triggers/shared/agent-execution.tsrunFreshnessGate(context)helper called betweenvalidateAgentExecutionLifecycleandcheckPreRunBudget/persistAgentWorkItemLinks/prepareAgentExecutionLifecycle. Returnstrueto short-circuit; the pipeline exits normally so router cleanup releases locks without retrying.dispatchablerather than blocking a legitimate run.src/triggers/shared/agent-execution-runtime.tsandagent-execution-types.tsAgentExecutionContextnow carries the activePMProvider(already created forPMLifecycleManager) so the gate reuses one provider instance.Tests
tests/unit/triggers/implementation-freshness-gate.test.ts(new, 23 tests): agent-type gating, terminal checklist coverage, partial / empty / unrelated checklist handling, active run blocking, open/merged/closed-unmerged PR semantics, recent-run-derived PR candidates, fail-closed branches for PR lookup failure + checklist read failure + successful run without PR, ack comment update vs fallback addComment, and never-throws guarantee for the comment poster.tests/unit/triggers/agent-execution.test.ts: 8 new pipeline integration tests verifying the gate runs only for implementation+workItemId, blocks beforerunAgent/prepareForAgent/checkBudgetExceeded, does not invokehandleFailure/cleanupProcessingon normal skips, posts the durable skip notice, and survives evaluator errors without crashing the pipeline.Documentation
docs/architecture/03-trigger-system.md: Mermaid pipeline diagram updated with the freshness gate node and a paragraph describing its scope, evidence, and uncertainty semantics.docs/architecture/10-resilience.md: new "Implementation freshness gate (worker-side)" subsection under Concurrency Controls covering layering with the router-level work-item lock.CLAUDE.md(andAGENTS.mdvia symlink): one-paragraph operator overview alongside the existing concurrency notes.Test plan
npx vitest run tests/unit/triggers/implementation-freshness-gate.test.ts— 23 tests pass.npx vitest run tests/unit/triggers/agent-execution.test.ts— 39 tests pass (8 new freshness-gate integration tests).npx vitest run tests/unit/pm/webhook-handler.test.ts tests/unit/router/work-item-lock.test.ts tests/unit/router/agent-type-lock.test.ts— existing concurrency guards stay green.npx vitest run tests/unit/api/router.test.ts tests/unit/api/routers/projects.test.ts— appRouter import chain unchanged.npx vitest run tests/unit/architecture-docs.test.ts— doc drift guards stay green after the doc updates.npm test— full unit suite 9712 passed / 28 skipped.npm run typecheck— clean.npm run lint— clean (only preexisting non-fixable warnings remain).🕵️ claude-code · claude-opus-4-7 · run details