Support Kafka supervisor adopting running tasks between versions #6958

Closed
dclim opened this issue Jan 30, 2019 · 6 comments · Fixed by #7212

dclim (Contributor) commented Jan 30, 2019

Motivation

A Kafka supervisor expects to 'own' all index_kafka type tasks running on its configured datasource. When the supervisor starts up, it reads the list of active tasks from TaskStorage and, for each running Kafka task on that datasource, decides whether the task is 'adoptable' or has to be terminated because it does not match the supervisor's configuration. A task is adoptable by a supervisor if it was created with the same strategy for assigning Kafka partitions, has a set of starting offsets that matches the state described in the metadata store, and has the same data schema and tuning config. All of this information is represented in the sequenceName hash that is calculated on task creation and stored as part of the task metadata.
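
For illustration, here is a minimal sketch of how such a sequenceName hash can be derived. This is an approximation, not the actual KafkaSupervisor code; the method shape, parameter names, and the use of Guava hashing are assumptions.

```java
import com.google.common.base.Joiner;
import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

class SequenceNameSketch
{
  // Illustrative only: hash the starting offsets together with the JSON-serialized
  // data schema and tuning config, then embed a short hash in the sequence name.
  static String generateSequenceName(
      String dataSource,
      Map<Integer, Long> startingOffsets,  // Kafka partition -> starting offset
      String dataSchemaJson,               // serialized DataSchema
      String tuningConfigJson              // serialized KafkaTuningConfig
  )
  {
    // Sort the offsets so the hash does not depend on map iteration order.
    String offsetsStr = new TreeMap<>(startingOffsets).toString();
    String hash = Hashing.sha1()
        .hashString(offsetsStr + dataSchemaJson + tuningConfigJson, StandardCharsets.UTF_8)
        .toString()
        .substring(0, 15);
    return Joiner.on("_").join("index_kafka", dataSource, hash);
  }
}
```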

Currently, when checking for adoptability, the supervisor generates a hash based on its expected starting offsets and its data schema / tuning configuration, and then compares the hash to the one generated when the task was created. If they match, the supervisor adopts and starts tracking the lifecycle of the task; otherwise the task is killed and a new one is created from the supervisor's configuration.

If the definition of DataSchema or KafkaTuningConfig changes between Druid versions (for example, a new field is added), the calculated hash will differ and the task will be rejected, even if the ingestion specs are 'equivalent'. This is because the logic compares the stored hash, which was generated with the old ingestion schema definition, against one generated with the new definition, which may be backward-compatible but still produce a different hash. As a result, a rolling update of the Overlord may cause indexing tasks to terminate unnecessarily.
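
To make the failure mode concrete, here is a hypothetical example (the field names and JSON strings are invented for illustration): a backward-compatible change that adds a field with a default value still changes the serialized form, and therefore the hash.

```java
import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;

class HashDriftExample
{
  public static void main(String[] args)
  {
    // Hypothetical serializations of an *equivalent* tuning config, produced by the
    // class definition before and after an upgrade that adds a field with a default.
    String oldJson = "{\"type\":\"kafka\",\"maxRowsInMemory\":75000}";
    String newJson = "{\"type\":\"kafka\",\"maxRowsInMemory\":75000,\"logParseExceptions\":false}";

    // The ingestion behavior is identical, but the hashes (and so any sequenceName
    // computed from them) are not, so the stored hash no longer matches.
    System.out.println(Hashing.sha1().hashString(oldJson, StandardCharsets.UTF_8));
    System.out.println(Hashing.sha1().hashString(newJson, StandardCharsets.UTF_8));
  }
}
```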

Proposed Changes

Instead of the supervisor computing the sequenceName hash for its own configuration and comparing this to the hash the task was created with, the supervisor should re-compute the task's hash from the task's own data schema and tuning config. That way, changes to the data schema and tuning config class definitions are reflected in both sides of the comparison, so equivalent specs produce matching hashes and a proper comparison can be made.
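
Building on the generateSequenceName sketch above, a hedged sketch of the proposed check follows; the class, parameters, and helper are hypothetical, not Druid's actual API.

```java
import java.util.Map;

class AdoptabilityCheckSketch
{
  // Today, the supervisor compares its own computed sequence name against the name
  // stored when the task was created (produced by older class definitions). The
  // proposal instead recomputes the task's hash with the *current* classes.
  static boolean isAdoptable(
      String dataSource,
      Map<Integer, Long> taskStartingOffsets,
      String taskDataSchemaJson,        // task's own spec, re-serialized by the current classes
      String taskTuningConfigJson,
      String supervisorDataSchemaJson,  // supervisor's current spec
      String supervisorTuningConfigJson
  )
  {
    // Recompute the task's sequence name from the task's own data schema and tuning config.
    String recomputedTaskName = SequenceNameSketch.generateSequenceName(
        dataSource, taskStartingOffsets, taskDataSchemaJson, taskTuningConfigJson);

    // Compute the name the supervisor expects for its own configuration.
    String supervisorName = SequenceNameSketch.generateSequenceName(
        dataSource, taskStartingOffsets, supervisorDataSchemaJson, supervisorTuningConfigJson);

    // Both names are now produced by the same class definitions, so equivalent
    // specs compare equal even across a version upgrade.
    return supervisorName.equals(recomputedTaskName);
  }
}
```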

Changed Interfaces

None

Migration

None

Alternatives

An alternative implementation might be to exclude the data schema and tuning config from the hash altogether, meaning that a supervisor would adopt an existing running task without consideration of the ingestion parameters, and would just let it run to completion. Once that task completes, the supervisor would then spawn subsequent tasks that match the supervisor's new configuration regardless of how the previous task was configured.

I think the complexity here comes in when you consider replica tasks and what a supervisor should do if there are replicas and a task fails. Since replicas are expected to do the same thing and generate identical segments independently, the supervisor would need to spawn a replacement task that has a configuration that matches the old configuration being used by the replica rather than using its own configuration.

There may be other complications in not including the ingestion spec as part of the sequenceName hash. Ultimately, the proposed change feels more straightforward and accomplishes the objective of supporting task adoption across versions.

dclim added the Proposal label Jan 30, 2019
justinborromeo (Contributor) commented

I'm working on this issue

pdeva (Contributor) commented Mar 11, 2019

Is this the cause of #6854?

gianm (Contributor) commented Mar 11, 2019

It sounds like it probably is.

pdeva (Contributor) commented Mar 11, 2019

The documentation for rolling updates should then at least note that there can be several minutes of query downtime when doing a rolling update of the Overlord. If any configuration value can be adjusted to make this downtime shorter, it should be listed in that documentation as well.

gianm (Contributor) commented Mar 11, 2019

It looks like there is intent to fix this before the next release anyway.

pdeva (Contributor) commented Mar 11, 2019

Correct me if I am wrong, but this is scheduled for the 0.15 release, right?
Shouldn't we at least update the 0.14 release, where this issue will still exist?

pdeva added a commit to pdeva/druid that referenced this issue Mar 12, 2019
Upgrading Overlord nodes in 0.14.0 can result in a few minutes of query downtime for KIS tasks, as noticed in apache#6854. This behavior should be documented until it is fixed by apache#6958 in 0.15.
jihoonson added this to the 0.14.1 milestone May 31, 2019