Fix NPE in TaskLockbox that prevents overlord leadership #6512

Merged 2 commits on Oct 25, 2018

clintropolis (Member) commented Oct 24, 2018

This fixes an NPE that prevents the overlord from assuming leadership when an extension that provides indexing-task-related Jackson modules is not loaded. The failure produces errors such as:

2018-10-24T00:56:46,415 ERROR [LeaderSelector[/demo/overlord/_OVERLORD]] org.apache.druid.curator.discovery.CuratorDruidLeaderSelector - listener becomeLeader() failed. Unable to become leader: {class=org.apache.druid.curator.discovery.CuratorDruidLeaderSelector, exceptionType=class java.lang.NullPointerException, exceptionMessage=null}
java.lang.NullPointerException
    at org.apache.druid.indexing.overlord.TaskLockbox.syncFromStorage(TaskLockbox.java:105) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:109) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1.isLeader(CuratorDruidLeaderSelector.java:98) [druid-server-0.13.0-incubating.jar:0.13.0-incubating]
..

accompanied by:

2018-10-24T00:56:46,407 ERROR [LeaderSelector[/demo/overlord/_OVERLORD]] org.apache.druid.metadata.SQLMetadataStorageActionHandler - Encountered exception while deserializing task payload, setting task to null
com.fasterxml.jackson.databind.JsonMappingException: Could not resolve type id 'missingType' into a subtype of [simple type, class org.apache.druid.data.input.impl.ParseSpec]: known type ids = [ParseSpec, csv, javascript, json, jsonLowercase, regex, timeAndDims, tsv]
 at [Source: N/A; line: -1, column: -1] (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser["parseSpec"])
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148) ~[jackson-databind-2.6.7.jar:2.6.7]
    at com.fasterxml.jackson.databind.DeserializationContext.unknownTypeException(DeserializationContext.java:967) ~[jackson-databind-2.6.7.jar:2.6.7]

The fix filters null results out of the list returned by implementations of TaskStorage.getActiveTasks. Tasks whose payload cannot be deserialized are now treated as not active, I guess, since there is nothing we can do with them.
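For illustration, the filtering described above can be sketched roughly as follows. This is a minimal, self-contained sketch and not the actual Druid code: the class `ActiveTaskFilterSketch`, the helper `deserializeOrNull`, and the use of plain strings in place of `Task` objects are all hypothetical stand-ins. It only assumes what the log above shows, namely that a payload which fails to deserialize is set to null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names exist in Druid.
final class ActiveTaskFilterSketch
{
  // Assumption: stands in for the set of task types whose Jackson modules are loaded.
  static final List<String> KNOWN_TYPES = Arrays.asList("index", "kill");

  // Assumption: mimics the "setting task to null" behavior from the log above.
  // An unknown type cannot be deserialized, so the result is null.
  static String deserializeOrNull(String storedType)
  {
    return KNOWN_TYPES.contains(storedType) ? "Task[" + storedType + "]" : null;
  }

  // Drops null entries so callers never see a task they cannot act on.
  static List<String> getActiveTasks(List<String> storedTypes)
  {
    return storedTypes.stream()
                      .map(ActiveTaskFilterSketch::deserializeOrNull)
                      .filter(Objects::nonNull)
                      .collect(Collectors.toList());
  }

  public static void main(String[] args)
  {
    // "missingType" has no registered module, so it is filtered out.
    System.out.println(getActiveTasks(Arrays.asList("index", "missingType", "kill")));
    // prints [Task[index], Task[kill]]
  }
}
```

The trade-off is that an undeserializable task silently drops out of the active list instead of blocking leadership, which matches the reasoning that nothing can be done with such a task anyway.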

…that provides indexing task type is not loaded
@clintropolis clintropolis changed the title Fix NPE in TaskLockbox that prevents overlord leadership Fix NPE in TaskLockbox that prevents overlord leadership Oct 24, 2018
jihoonson (Contributor) left a comment

LGTM with a nit.

@@ -183,7 +183,7 @@ public void setStatus(TaskStatus status)
     try {
       final ImmutableList.Builder<Task> listBuilder = ImmutableList.builder();
       for (final TaskStuff taskStuff : tasks.values()) {
-        if (taskStuff.getStatus().isRunnable()) {
+        if (taskStuff.getStatus().isRunnable() && taskStuff.getTask() != null) {

I guess task is never null in HeapMemoryTaskStorage?

clintropolis (Member, Author) replied:

Good point 👍

@jihoonson jihoonson added the Bug label Oct 24, 2018
@jihoonson jihoonson added this to the 0.13.1 milestone Oct 24, 2018
dclim (Contributor) commented Oct 31, 2018

@clintropolis could you backport this to 0.13.0-incubating?

clintropolis added a commit to clintropolis/druid that referenced this pull request Oct 31, 2018
* fix NPE that prevents overlord from assuming leadership if extension that provides indexing task type is not loaded

* heh
fjy pushed a commit that referenced this pull request Nov 1, 2018
* fix NPE that prevents overlord from assuming leadership if extension that provides indexing task type is not loaded

* heh
clintropolis added a commit to implydata/druid-public that referenced this pull request Feb 5, 2019
apache#6564)

* fix NPE that prevents overlord from assuming leadership if extension that provides indexing task type is not loaded

* heh