fix(workflow-executor): reject duplicate triggers of in-flight runs (PRD-468) by Scra3 · Pull Request #1634 · ForestAdmin/agent-nodejs

Scra3 · 2026-06-08T13:21:21Z

What

A duplicate trigger for a run already executing on this instance was silently accepted (200) after a wasted orchestrator round-trip — the inFlightRuns check sat after getAvailableRun and only returned.

triggerPoll now checks inFlightRuns up front (assertRunNotInFlight), before the orchestrator round-trip, and throws the boundary RunAlreadyInFlightError → handleTrigger maps it to 400 so the front knows the trigger was rejected, not silently accepted.

Important: this is an optimization, not a concurrency guard

inFlightRuns is per-instance and a deployment can run several executors, so an executor-side check can never be the duplicate-execution guard. Genuine concurrency dedup is the orchestrator's job: it atomically claims a pending run (UPDATE … WHERE runState='pending' … FOR UPDATE SKIP LOCKED), so concurrent triggers — on this executor or another — get nothing back.

The up-front check is purely a best-effort local short-circuit to skip a useless orchestrator round-trip when this instance already knows it's running the run.

Changes

assertRunNotInFlight(runId) early in triggerPoll (+ explanatory comment on the intent)
new boundary RunAlreadyInFlightError, mapped to 400 in handleTrigger
tests: the concurrent-trigger test asserts a single execution because the orchestrator denies the second dispatch (the dedup is orchestrator-driven by design)

fixes PRD-468

🤖 Generated with Claude Code

Note

Reject duplicate triggers of in-flight workflow runs with a 400 error

Runner.triggerPoll previously silently skipped duplicate triggers; it now throws RunAlreadyInFlightError via a new assertRunNotInFlight helper when the run ID is already being processed.
The POST /runs/:runId/trigger endpoint in executor-http-server.ts catches RunAlreadyInFlightError and returns a 400 response with the error message.
Behavioral Change: callers that previously received a silent no-op on duplicate trigger requests will now receive a 400 error response.

^{Macroscope summarized 1e0f685.}

linear-code · 2026-06-08T13:21:25Z

PRD-468

qltysh · 2026-06-08T13:22:27Z

2 new issues

Tool	Category	Rule	Count
qlty	Structure	Function with many returns (count = 5): handleTrigger	1
qlty	Structure	Function with high complexity (count = 11): handleTrigger	1

qltysh · 2026-06-08T13:27:26Z

Coverage Impact

Unable to calculate total coverage change because base branch coverage was not found.

Modified Files with Diff Coverage (3)

Rating	File	% Diff	Uncovered Line #s
	packages/workflow-executor/src/runner.ts	100.0%
	packages/workflow-executor/src/http/executor-http-server.ts	100.0%
	packages/workflow-executor/src/errors.ts	100.0%
	Total	100.0%

🚦 See full report on Qlty Cloud »

🛟 Help

Diff Coverage: Coverage for added or modified lines of code (excludes deleted files). Learn more.
Total Coverage: Coverage for the whole repository, calculated as the sum of all File Coverage. Learn more.
File Coverage: Covered Lines divided by Covered Lines plus Missed Lines. (Excludes non-executable lines including blank lines and comments.)
- Indirect Changes: Changes to File Coverage for files that were not modified in this PR. Learn more.

matthv · 2026-06-08T15:29:40Z

Nice work — the boundary error + the assertRunNotInFlight helper + the 400 mapping are clean, and the early short-circuit (no orchestrator round-trip on a re-trigger) is a real improvement. One thing I'd like to settle before merging though, around the concurrency guarantee.

Description ↔ code mismatch

The description says the in-flight check is done twice (early + late) and that "the late check is the correctness guard … removing it in favor of the early one alone would reintroduce double execution". But the diff actually removes the late check (if (inFlightRuns.has(step.runId)) return) and keeps only the early assertRunNotInFlight(runId). Macroscope's summary also describes a single check. So either the description is stale, or the late check was dropped unintentionally.

Why this matters — getAvailableRun is only atomic for pending runs

I checked the orchestrator side. GET /api/workflow-orchestrator/available-run/:id only claims atomically when the run is pending:

if (runState === Pending)  → claimPendingRunById()   // atomic, see below
if (runState === Started)  → no claim, just hydrate + return the run

claimPendingRunById is solid for the pending case:

UPDATE "workflowRuns" SET "lockedAt"=NOW(), "runState"='loading'
WHERE id IN ( SELECT wr.id ... WHERE wr."runState"='pending' AND wr.id=:runId ...
              FOR UPDATE SKIP LOCKED )
RETURNING *

→ two concurrent calls: one claims, the other gets null. 👍

But for a started run there's no claim — two concurrent getAvailableRun both return the step. So:

pending run → orchestrator dedups → early-check-only is enough ✅
started run + truly simultaneous triggers → both pass the early check (before either registers in inFlightRuns) and the orchestrator returns the step to both → double execution. This is exactly the case the removed late check covered.

The test gives false confidence here: getAvailableRun.mockResolvedValueOnce(step).mockResolvedValueOnce(null) reproduces the pending-claim behavior. A started run would resolve step both times, so the current test wouldn't catch the regression.

Suggested resolution (either):

Restore the late check (cheap, closes the started race), and reconcile the description; or
Prove the trigger path never hits a started run concurrently (always goes through pending) — then early-check-only is correct, but please fix the description and add a test that exercises the started case (getAvailableRun returning the step twice).

Minor (non-blocking): 400 vs 409 — "already being processed" reads more like a 409 Conflict, though I see the ticket explicitly asked for 400, so up to you.

…PRD-468) A duplicate trigger for a run already executing on this instance was silently accepted (200) after a wasted orchestrator round-trip — the inFlightRuns check sat after getAvailableRun and only `return`ed. triggerPoll now checks inFlightRuns up front via assertRunNotInFlight, before the orchestrator round-trip, throwing the new boundary RunAlreadyInFlightError (mapped to 400 by handleTrigger, so the front knows the trigger was rejected, not silently accepted). This is a best-effort local optimization, NOT a concurrency guard: inFlightRuns is per-instance and a deployment can run several executors, so genuine duplicate-execution prevention is the orchestrator's job — it atomically claims a pending run, and concurrent triggers (on this or another executor) get nothing back. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Scra3 · 2026-06-08T17:47:08Z

Thanks for the deep read — you're right that the commit message was stale (it described a late check that the diff removed), I've fixed the description + commit message.

On the substance: I'm intentionally not restoring the late check. inFlightRuns is per-instance and we run multiple executors, so an executor-side check can't be the concurrency guard for the started-run race you describe — it would only close it same-instance while giving false confidence about the cross-executor case. Genuine duplicate-execution prevention has to live in the orchestrator's atomic claim, which is the real guarantee.

So the up-front assertRunNotInFlight is scoped as what it is: a best-effort local short-circuit to avoid a useless orchestrator round-trip when this instance already knows it's running the run. I added a comment to that effect, and the concurrent test now reads as "one execution because the orchestrator denies the second dispatch" (dedup is orchestrator-driven).

Re 400 vs 409: agreed 409 reads better, keeping 400 per the ticket.

Base automatically changed from fix/prd-442-activity-log-target to feat/prd-214-server-step-mapper June 8, 2026 13:22

Scra3 force-pushed the feature/prd-468-executor-runs-the-same-workflow-run-twice-on-concurrent branch 4 times, most recently from fe27c82 to 431cb08 Compare June 8, 2026 13:41

Scra3 force-pushed the feature/prd-468-executor-runs-the-same-workflow-run-twice-on-concurrent branch from 431cb08 to 1e0f685 Compare June 8, 2026 17:46

matthv approved these changes Jun 9, 2026

View reviewed changes

Scra3 merged commit 5969656 into feat/prd-214-server-step-mapper Jun 9, 2026
30 checks passed

Scra3 deleted the feature/prd-468-executor-runs-the-same-workflow-run-twice-on-concurrent branch June 9, 2026 08:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(workflow-executor): reject duplicate triggers of in-flight runs (PRD-468)#1634

fix(workflow-executor): reject duplicate triggers of in-flight runs (PRD-468)#1634
Scra3 merged 1 commit into
feat/prd-214-server-step-mapperfrom
feature/prd-468-executor-runs-the-same-workflow-run-twice-on-concurrent

Scra3 commented Jun 8, 2026 •

edited by macroscopeapp Bot

Loading

Uh oh!

linear-code Bot commented Jun 8, 2026

Uh oh!

qltysh Bot commented Jun 8, 2026 •

edited

Loading

Uh oh!

qltysh Bot commented Jun 8, 2026 •

edited

Loading

Uh oh!

matthv commented Jun 8, 2026

Uh oh!

Scra3 commented Jun 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Scra3 commented Jun 8, 2026 • edited by macroscopeapp Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Important: this is an optimization, not a concurrency guard

Changes

Reject duplicate triggers of in-flight workflow runs with a 400 error

Uh oh!

linear-code Bot commented Jun 8, 2026

Uh oh!

qltysh Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

2 new issues

Uh oh!

qltysh Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

matthv commented Jun 8, 2026

Uh oh!

Scra3 commented Jun 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Scra3 commented Jun 8, 2026 •

edited by macroscopeapp Bot

Loading

qltysh Bot commented Jun 8, 2026 •

edited

Loading

qltysh Bot commented Jun 8, 2026 •

edited

Loading