(WIP) Wait for initialization during KubernetesTaskRunner startup #15041
Closed
georgew5656 wants to merge 2 commits into apache:master
added 2 commits
September 26, 2023 11:51
This pull request has been marked as stale due to 60 days of inactivity.
This pull request/issue has been closed due to lack of activity.
This fix attempts to bring the KubernetesTaskRunner more in line with the HttpRemoteTaskRunner (https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/overlord/hrtr/HttpRemoteTaskRunner.java#L560) with respect to startup initialization.
Currently, when the overlord becomes the leader while using the KubernetesTaskRunner, it adds all of the running tasks to its task map but doesn't wait for the underlying thread pool to finish syncing state from Kubernetes. This change attempts to wait for that sync, although startup doesn't fail if the sync cannot finish completely.
Description
Make a best-effort attempt to fully sync state from Kubernetes before becoming the overlord leader when running mm-less ingestion.
In the start() method, after adding all the jobs in Kubernetes to the tasks map, try to wait for the underlying thread pool to finish syncing state from K8s.
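The waiting step described above could look roughly like the sketch below. This is not the code from this PR: the class name `BestEffortStartupSync`, the method `awaitInitialSync`, and the single overall timeout are assumptions made for illustration. It only shows the general pattern of waiting on the futures returned by a thread pool up to a deadline, while treating timeouts and task failures as non-fatal.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Druid: illustrates a best-effort wait
// on futures submitted to a thread pool during startup.
public class BestEffortStartupSync
{
  /**
   * Waits for the given futures until a shared deadline expires.
   * Returns how many futures completed successfully within the deadline.
   * Timeouts and task failures are tolerated, so startup is never failed here.
   */
  public static int awaitInitialSync(List<Future<?>> syncFutures, long timeoutMillis)
  {
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    int completed = 0;
    for (Future<?> future : syncFutures) {
      // Remaining budget shared by all futures; zero means "only check if already done".
      final long remainingNanos = Math.max(deadline - System.nanoTime(), 0L);
      try {
        future.get(remainingNanos, TimeUnit.NANOSECONDS);
        completed++;
      }
      catch (TimeoutException | ExecutionException | CancellationException e) {
        // Best effort: a real implementation would log this and move on.
      }
      catch (InterruptedException e) {
        // Restore the interrupt flag and stop waiting.
        Thread.currentThread().interrupt();
        break;
      }
    }
    return completed;
  }
}
```

In this sketch, a start() method would collect the futures returned when submitting each existing job to the executor and then call `awaitInitialSync(futures, timeoutMillis)` before returning. Using one shared deadline rather than a per-future timeout keeps the total startup delay bounded regardless of how many tasks are running.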
Release note
Improvements to the overlord lifecycle when running mm-less ingestion
Key changed/added classes in this PR
KubernetesTaskRunner
This PR has: