[Task Manager] cancel expired tasks as part of the available workers check #88483
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
LGTM
* master: (33 commits) [Security Solution][Case] Fix patch cases integration test with alerts (elastic#88311) [Security Solutions][Detection Engine] Removes duplicate API calls (elastic#88420) Fix log msg (elastic#88370) [Test] Add tag cloud visualization to dashboard in functional test for reporting (elastic#87600) removing kibana-core-ui from codeowners (elastic#88111) [Alerting] Migrate Event Log plugin to TS project references (elastic#81557) [Maps] fix zooming while drawing shape filter logs errors in console (elastic#88413) Porting fixes 1 (elastic#88477) [APM] Explicitly set environment for cross-service links (elastic#87481) chore(NA): remove mocha junit ci integrations (elastic#88129) [APM] Only display relevant sections for rum agent in service overview (elastic#88410) [Enterprise Search] Automatically mock shared logic files (elastic#88494) [APM] Disable Create custom link button on Transaction details page for read-only users [Docs] clean-up vega map reference documenation (elastic#88487) [Security Solution] Fix Timeline event details layout (elastic#88377) Change DELETE to POST for _bulk_delete to avoid incompatibility issues (elastic#87914) [Monitoring] Change cloud messaging on no data page (elastic#88375) [Uptime] clear ping state when PingList component in unmounted (elastic#88321) [APM] Consistent terminology for latency and throughput (elastic#88452) fix copy (elastic#88481) ...
LGTM
@elasticmachine merge upstream
💚 Build Succeeded
…check (elastic#88483) When a task expires, it continues to reside in the queue until `TaskPool.cancelExpiredTasks()` is called. We call this in `TaskPool.run()`, but `run` won't get called if there is no capacity, because we gate the poller on `TaskPool.availableWorkers()`. This means that if you have as many expired tasks as you have workers, your poller will continually restart, but the queue will remain full and Task Manager is then incapable of taking on any more work. This is what caused `[Task Poller Monitor]: Observable Monitor: Hung Observable...`
…check (#88483) (#88874)
Summary
Addresses 1 of 3 problems identified in #87874.
When a task expires, it continues to reside in the queue until `TaskPool.cancelExpiredTasks()` is called. We call this in `TaskPool.run()`, but `run` won't get called if there is no capacity, because we gate the poller on `TaskPool.availableWorkers()`. This means that if you have as many expired tasks as you have workers, your poller will continually restart, but the queue will remain full and Task Manager is then incapable of taking on any more work. This is what caused `[Task Poller Monitor]: Observable Monitor: Hung Observable...`
This PR cancels expired tasks as part of the available workers check, so expired tasks no longer hold on to capacity.
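The failure mode described above can be illustrated with a minimal sketch. This is not the actual Task Manager code: `SketchTaskPool`, `RunningTask`, `expiresAt`, and `availableWorkersWithoutCleanup` are hypothetical names invented for illustration, and only the method names `availableWorkers` and `cancelExpiredTasks` are borrowed from the description. It assumes a pool that counts every queued task against capacity until it is explicitly cancelled.

```typescript
// Hypothetical, simplified model of a task pool. This is NOT Kibana's TaskPool;
// it only mirrors the behavior described in the summary.
interface RunningTask {
  id: string;
  expiresAt: number; // epoch millis after which the task counts as expired
}

class SketchTaskPool {
  private running: RunningTask[] = [];
  private readonly maxWorkers: number;

  constructor(maxWorkers: number) {
    this.maxWorkers = maxWorkers;
  }

  add(task: RunningTask): void {
    this.running.push(task);
  }

  // Drops every task whose expiry time has passed.
  cancelExpiredTasks(now: number): void {
    this.running = this.running.filter((task) => task.expiresAt > now);
  }

  // Behavior before the change (assumed): expired tasks still occupy slots,
  // so a pool full of expired tasks reports no capacity.
  availableWorkersWithoutCleanup(): number {
    return this.maxWorkers - this.running.length;
  }

  // Behavior after the change (assumed): expired tasks are cancelled as part
  // of the capacity check, so their slots are freed before the poller is gated.
  availableWorkers(now: number): number {
    this.cancelExpiredTasks(now);
    return this.maxWorkers - this.running.length;
  }
}

// Two workers, both held by tasks that expired at t=100.
const pool = new SketchTaskPool(2);
pool.add({ id: "a", expiresAt: 100 });
pool.add({ id: "b", expiresAt: 100 });

const before = pool.availableWorkersWithoutCleanup(); // capacity check without cleanup
const after = pool.availableWorkers(200); // capacity check that cancels expired tasks first
console.log({ before, after });
```

In this sketch the first check reports no free workers even though every task has expired, which is the condition under which a poller gated on capacity would never reach `run`. The second check frees both slots.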
Checklist
Delete any items that are not applicable to this PR.
- Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
- Documentation was added for features that require explanation or tutorials
- Any UI touched in this PR is usable by keyboard only (learn more about keyboard accessibility)
- Any UI touched in this PR does not create any new axe failures (run axe in browser: FF, Chrome)
- If a plugin configuration key changed, check if it needs to be whitelisted in the cloud and added to the docker list
- This renders correctly on smaller devices using a responsive layout. (You can test this in your browser)
- This was checked for cross-browser compatibility

For maintainers