Support Kafka supervisor adopting running tasks between versions #6958
Comments
I'm working on this issue.
Is this the cause of #6854?
It sounds like it probably is.
The documentation for rolling updates should then at least reflect that there can be several minutes of query downtime when doing a rolling update of the Overlord.
It looks like there is intent to fix this before the next release anyway.
Correct me if I'm wrong, but this is scheduled for the 0.15 release, right?
Upgrading Overlord nodes in 0.14.0 can result in a few minutes of query downtime for KIS tasks, as noticed in apache#6854. This behavior should be documented until it is fixed by apache#6958 in 0.15.
Motivation
A Kafka supervisor expects to 'own' all index_kafka type tasks running on its configured datasource. When the supervisor starts up, it checks TaskStorage to get a list of all the active tasks, and for each running Kafka task for the datasource, it checks whether the task is 'adoptable' or has to be terminated because it does not match the supervisor's configuration. A task is adoptable by a supervisor if it was created with the same strategy for assigning Kafka partitions, has a set of starting offsets that matches the state described in the metadata store, and has the same data schema and tuning config. All of this information is represented in the sequenceName hash that is calculated on task creation and is stored as part of the task metadata.

Currently, when checking for adoptability, the supervisor generates a hash based on its expected starting offsets and its data schema / tuning configuration, and then compares the hash to the one generated when the task was created. If they match, the supervisor adopts the task and starts tracking its lifecycle; otherwise the task is killed and a new one is created from the supervisor's configuration.
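The current check described above can be sketched roughly as follows. This is NOT Druid's actual code: the names and signatures are illustrative assumptions, and a plain string join stands in for the real sequenceName hash so the example stays deterministic.

```java
// Hypothetical sketch of the current adoptability check; all names are
// illustrative, not Druid's actual API.
public class CurrentAdoptCheck {
    // Stand-in for the sequenceName hash computed at task creation.
    // (Druid hashes the serialized spec; a string join is used here
    // purely so the example is deterministic and readable.)
    static String sequenceName(String assignmentStrategy,
                               String startingOffsets,
                               String serializedSpec) {
        return assignmentStrategy + "|" + startingOffsets + "|" + serializedSpec;
    }

    // Current logic: hash the *supervisor's* view of the spec and compare
    // it to the hash stored with the task when the task was created.
    static boolean isAdoptable(String taskSequenceName,
                               String supStrategy,
                               String supOffsets,
                               String supSerializedSpec) {
        return sequenceName(supStrategy, supOffsets, supSerializedSpec)
                .equals(taskSequenceName);
    }

    public static void main(String[] args) {
        // Task created before an upgrade, spec serialized by the old classes.
        String stored = sequenceName("uniform", "0:100", "{\"granularity\":\"hour\"}");

        // After the upgrade, the same logical spec serializes with an extra
        // field, so the supervisor's hash no longer matches and the task is
        // killed even though the specs are equivalent.
        System.out.println(isAdoptable(
                stored, "uniform", "0:100",
                "{\"granularity\":\"hour\",\"newField\":null}")); // prints "false"
    }
}
```

The failure mode is visible in `main`: equality of the stored hash is only preserved if the serialized form of the spec is byte-for-byte stable across versions.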
If between Druid versions the definition of DataSchema or KafkaTuningConfig changes (for example, a new field is added), this implementation will cause the calculated hash to differ and the task to be rejected, even if the ingestion specs are 'equivalent'. This is because the logic compares the stored hash, generated using the old ingestion schema definition, with one generated using the new schema definition, which may be backward compatible but produces a different hash. As a result, doing a rolling update of the Overlord may cause indexing tasks to terminate unnecessarily.

Proposed Changes
Instead of the supervisor computing the sequenceName hash from its own configuration and comparing it to the hash the task was created with, the supervisor should recompute the task's hash using the task's own data schema and tuning config. Changes in the data schema and tuning config class definitions are then reflected in both hashes being compared, so a proper comparison can be made.
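A hedged sketch of the proposed comparison (again with illustrative names and a string join standing in for the real hash, not Druid's API): both hashes are recomputed by the same current code path, so a serialization change between versions affects both sides of the comparison equally.

```java
// Hypothetical sketch of the proposed check; names are illustrative.
public class ProposedAdoptCheck {
    // Stand-in for the sequenceName hash over the schema/tuning portion
    // of the spec (the offset and partition-strategy checks still apply
    // separately and are omitted here for brevity).
    static String hashSpec(String dataSchema, String tuningConfig) {
        return dataSchema + "|" + tuningConfig;
    }

    // Proposed logic: ignore the hash stored at task-creation time and
    // recompute both the task's and the supervisor's hashes with the
    // current class definitions before comparing.
    static boolean isAdoptable(String taskDataSchema, String taskTuningConfig,
                               String supDataSchema, String supTuningConfig) {
        return hashSpec(taskDataSchema, taskTuningConfig)
                .equals(hashSpec(supDataSchema, supTuningConfig));
    }

    public static void main(String[] args) {
        // Equivalent specs now match regardless of how the task's stored
        // hash was originally computed; a genuinely different spec fails.
        System.out.println(isAdoptable("schemaA", "tuneA", "schemaA", "tuneA")); // prints "true"
        System.out.println(isAdoptable("schemaA", "tuneA", "schemaB", "tuneA")); // prints "false"
    }
}
```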
Changed Interfaces
None
Migration
None
Alternatives
An alternative implementation might be to exclude the data schema and tuning config from the hash altogether, meaning that a supervisor would adopt an existing running task without consideration of the ingestion parameters, and would just let it run to completion. Once that task completes, the supervisor would then spawn subsequent tasks that match the supervisor's new configuration regardless of how the previous task was configured.
I think the complexity here comes in when you consider replica tasks and what a supervisor should do if there are replicas and a task fails. Since replicas are expected to do the same thing and generate identical segments independently, the supervisor would need to spawn a replacement task that has a configuration that matches the old configuration being used by the replica rather than using its own configuration.
There may be other complications in not including the ingestion spec as part of the sequenceName hash. Ultimately, the proposed change feels more straightforward and accomplishes the objective of supporting task adoption across versions.
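The replica wrinkle from the alternative can be made concrete with a small sketch (all names hypothetical): a replacement for a failed replica of an adopted, pre-update task would have to clone that task's old spec rather than use the supervisor's current one, or the replicas would stop producing identical segments.

```java
// Hypothetical sketch of replacement-spec selection under the alternative
// design; names and the boolean flag are illustrative assumptions.
public class ReplicaReplacement {
    static String specForReplacement(String failedReplicaSpec,
                                     String supervisorSpec,
                                     boolean failedTaskPredatesSupervisorConfig) {
        // Replicas must run the same spec to generate identical segments,
        // so a replacement joining an old replica group must inherit the
        // old spec instead of the supervisor's current configuration.
        return failedTaskPredatesSupervisorConfig ? failedReplicaSpec : supervisorSpec;
    }

    public static void main(String[] args) {
        System.out.println(specForReplacement("oldSpec", "newSpec", true));  // prints "oldSpec"
        System.out.println(specForReplacement("oldSpec", "newSpec", false)); // prints "newSpec"
    }
}
```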