(fix): prevent duplicate wfIDs in syncer#21524
Conversation
|
👋 justinkaseman, thanks for creating this pull request! To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team. Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks! |
|
✅ No conflicts with other open PRs targeting |
|
I see you updated files related to
|
There was a problem hiding this comment.
Pull request overview
This PR fixes race conditions caused by duplicate workflow IDs appearing in reconciliation metadata when workflows are associated with multiple DON families or multiple sources. It adds deduplication guards in generateReconciliationEvents and clears stale pending events that could cause permanent invariant violations. Additionally, it removes the unused Hooks struct and redundant WaitForDon call at startup.
Changes:
- Added
workflowsSeenchecks to skip duplicate workflow IDs for both active (no engine) and paused status paths ingenerateReconciliationEvents. - Added stale pending event cleanup when an active engine is already registered (the
case truepath), preventing invariant violations across sources. - Removed unused
Hooksstruct, its initialization, and a redundantWaitForDoncall that is already performed on every tick insyncUsingReconciliationStrategy.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
core/services/workflows/syncer/v2/workflow_registry.go |
Adds duplicate-ID guards, stale pending event cleanup, and removes unused Hooks type and redundant startup logic |
core/services/workflows/syncer/v2/workflow_registry_test.go |
Adds three test cases covering duplicate IDs with pending events, stale pending events with active engines, and duplicate paused workflows |
|
| for _, wfMeta := range workflowMetadata { | ||
| workflowMetadataMap[wfMeta.WorkflowID.Hex()] = wfMeta | ||
| } | ||
|
|
There was a problem hiding this comment.
Consider deduping here by just dumping it into a map and logging a warn if there's a duplicate




Changes
Since the Workflow Syncer invokes the event handler via goroutines, duplicate events can cause races.
Three scenarios can lead to unintended workflow loading behavior:
These happen because of a lack of checks for
workflowsSeen[id]. Adds guards for these cases.Additionally this PR:
OnStartFailure, and it is not used. The place where it is invoked is in an initialw.workflowDonNotifier.WaitForDon(ctx)call. This is redundant, becausesyncUsingReconciliationStrategycalls this on every tick.