refactor(core): Move event and telemetry handling into workers in queue mode #7138
…-with-current-state # Conflicts: # packages/cli/src/commands/worker.ts
…-with-current-state
…-with-current-state
…-with-current-state # Conflicts: # packages/cli/src/AbstractServer.ts
…-with-current-state
…-with-current-state # Conflicts: # packages/cli/src/eventbus/eventBus.controller.ts
…-with-current-state
…rker # Conflicts: # packages/cli/src/commands/worker.ts # packages/cli/src/eventbus/MessageEventBus/MessageEventBus.ts
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## master #7138 +/- ##
=======================================
Coverage 32.40% 32.40%
=======================================
Files 3276 3278 +2
Lines 198005 198019 +14
Branches 21652 21646 -6
=======================================
+ Hits 64162 64167 +5
- Misses 132782 132791 +9
Partials 1061 1061
☔ View full report in Codecov by Sentry.
…rker # Conflicts: # packages/cli/src/commands/worker.ts # packages/cli/src/worker/workerCommandHandler.ts
Not a big deal, just suggesting some leftover code that was still in this PR.
@@ -193,7 +199,7 @@ export class Worker extends BaseCommand {
 );
 await additionalData.hooks.executeHookFunctions('workflowExecuteAfter', [failedExecution]);
 }
-return { success: true };
+return { success: true, error: error as ExecutionError };
We can also remove this property, since it is not in use at the moment.
That one was actually being used, but I just noticed that merging master in removed it...
@@ -23,6 +23,7 @@ export interface JobData {

 export interface JobResponse {
 success: boolean;
+error?: ExecutionError;
Nit: maybe remove this one as well?
1 flaky test on run #2165 ↗︎
Details:
cypress/e2e/24-ndv-paired-item.cy.ts • 1 flaky test
This comment has been generated by cypress-bot as a result of this project's GitHub integration settings.
✅ All Cypress E2E specs passed
Got released with |
This PR adds `status` to run data so that `determineFinalExecutionStatus` resolves correctly on execution failure and removes the cleanup that is being duplicated in a worker hook.

Followup to #7138

Should fix:
- #7705
- https://linear.app/n8n/issue/PAY-964/no-execution-found-after-execution-fails
- https://linear.app/n8n/issue/PAY-1010/execution-deletion-in-queue-mode-not-complying-with-settings
Motivation
In queue mode, every finished execution caused the main instance to pull all execution data from the database, unflatten it, and then use it to send out event log events and telemetry events, as well as to return the required responses to Respond to Webhook nodes etc.
This could cause OOM errors when the data was large, since it had to be fully unpacked and transformed on the main instance's side, using up a lot of memory (and time).
This PR attempts to limit this behaviour to the cases where it is actually required, for example when the data has to be forwarded to some waiting webhook.
Changes
Execution data is only required when the active execution has a `postExecutePromise` attached to it. These usually forward the data to some other endpoint (e.g. a listening webhook connection).

By adding a helper, `getPostExecutePromiseCount()`, we can decide that when nothing is listening at all, there is no reason to pull the data on the main instance.

Previously, there would always be postExecutePromises because the telemetry events were called. Now, these have been moved into the workers, which have had the various InternalHooks calls added to their hook function arrays, so they issue these telemetry and event calls themselves.
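The decision described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: apart from `getPostExecutePromiseCount`, every name here (`ActiveExecutionsSketch`, `attachListener`, `handleJobFinished`, `loadFromDatabase`) is hypothetical, and the loader is synchronous only to keep the example short.

```typescript
// All names except getPostExecutePromiseCount are hypothetical.
type ExecutionResult = { executionId: string; payload: string };
type Listener = (result: ExecutionResult) => void;

class ActiveExecutionsSketch {
  // executionId -> listeners waiting for the finished execution's data
  private listeners = new Map<string, Listener[]>();

  add(executionId: string): void {
    this.listeners.set(executionId, []);
  }

  // E.g. a webhook connection that must receive the execution result.
  attachListener(executionId: string, listener: Listener): void {
    const current = this.listeners.get(executionId);
    if (current === undefined) throw new Error(`Unknown execution ${executionId}`);
    current.push(listener);
  }

  getPostExecutePromiseCount(executionId: string): number {
    return this.listeners.get(executionId)?.length ?? 0;
  }

  // Hand the result (if any) to every listener and forget the execution.
  finish(executionId: string, result?: ExecutionResult): void {
    if (result !== undefined) {
      for (const listener of this.listeners.get(executionId) ?? []) listener(result);
    }
    this.listeners.delete(executionId);
  }
}

// Called on the main instance when a worker reports a finished job.
// `loadFromDatabase` stands in for the expensive fetch-and-unflatten step.
// Returns true if the full data was loaded, false if the load was skipped.
function handleJobFinished(
  active: ActiveExecutionsSketch,
  executionId: string,
  loadFromDatabase: (id: string) => ExecutionResult,
): boolean {
  if (active.getPostExecutePromiseCount(executionId) === 0) {
    active.finish(executionId); // nothing is listening: skip the load
    return false;
  }
  active.finish(executionId, loadFromDatabase(executionId));
  return true;
}
```

With no listener attached, `handleJobFinished` never calls the loader, which is the memory saving this change is after. With a listener attached, the data is loaded once and forwarded.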
This results in all event log messages now being logged on the worker's event log, and the worker's eventbus being the one to send out the events to destinations. The main event log does pretty much nothing.

We are no longer logging executions on the main event log, because this would require all events to be replicated 1:1 from the workers to the main instance(s) (this IS possible and implemented, see the worker's `replicateToRedisEventLogFunction`, but it is not enabled, to reduce the amount of traffic over Redis).

Partial events in the main log could confuse the recovery process and would, ironically, result in the recovery corrupting the execution data by treating the executions as crashed.
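As a rough illustration of this arrangement, the sketch below logs and delivers events locally on the worker, and mirrors them elsewhere only if a replicator is explicitly supplied. The class, the event shape and the callback type are assumptions made for the example; they do not reflect the actual eventbus API or the signature of `replicateToRedisEventLogFunction`.

```typescript
// Hypothetical event shape and class; not the actual eventbus API.
type EventMessage = { eventName: string; executionId: string };
type Replicator = (message: EventMessage) => void;

class WorkerEventBusSketch {
  readonly localLog: EventMessage[] = [];
  readonly sentToDestinations: EventMessage[] = [];
  private replicator: Replicator | undefined;

  // Pass a replicator only if events should also be mirrored to main.
  constructor(replicator?: Replicator) {
    this.replicator = replicator;
  }

  send(message: EventMessage): void {
    this.localLog.push(message); // the worker's own event log
    this.sentToDestinations.push(message); // stand-in for destination delivery
    // Optional mirroring; left off by default to avoid the extra traffic.
    if (this.replicator !== undefined) this.replicator(message);
  }
}
```

Constructing the bus without a replicator corresponds to the default described above: the worker's log is complete, and nothing partial ends up in the main log.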
Refactor
I have also used the opportunity to reduce duplicate code and move some of the hook functionality into `packages/cli/src/executionLifecycleHooks/shared/sharedHookFunctions.ts` in preparation for a future full refactor of the hooks.
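To illustrate the idea of shared hook functions, here is a minimal sketch in which one helper builds hook functions that any hook set can reuse, instead of each hook set carrying its own copy. The types and function names (`HookSet`, `makeTelemetryHooks`, `mergeHooks`, `runHooks`) are hypothetical; only the hook names `workflowExecuteBefore` and `workflowExecuteAfter` are taken from the existing code, and the contents of the real shared file are not shown here.

```typescript
// Hypothetical types; the real hook signatures are not shown here.
type HookFn = (executionId: string) => void;
type HookSet = { workflowExecuteBefore: HookFn[]; workflowExecuteAfter: HookFn[] };

// Shared helper: defined once, reused by every hook set that needs it.
function makeTelemetryHooks(emit: (event: string) => void): HookSet {
  return {
    workflowExecuteBefore: [(id) => emit(`started:${id}`)],
    workflowExecuteAfter: [(id) => emit(`finished:${id}`)],
  };
}

// Concatenate hook arrays so a caller can add its own functions on top.
function mergeHooks(...sets: HookSet[]): HookSet {
  const merged: HookSet = { workflowExecuteBefore: [], workflowExecuteAfter: [] };
  for (const set of sets) {
    merged.workflowExecuteBefore.push(...set.workflowExecuteBefore);
    merged.workflowExecuteAfter.push(...set.workflowExecuteAfter);
  }
  return merged;
}

// Run every function registered under one hook name, in order.
function runHooks(hooks: HookSet, name: keyof HookSet, executionId: string): void {
  for (const fn of hooks[name]) fn(executionId);
}
```

A worker could then combine the shared telemetry hooks with its own functions through `mergeHooks`, which keeps the shared part in one place.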