_cat/pending_tasks returns "unknown" tasks #9354
Labels
:Data Management/Stats
Statistics tracking and retrieval APIs
>enhancement
good first issue
low hanging fruit
help wanted
adoptme
v1.5.0
Comments
Apparently the reason for this is as follows: when a task times out, the timeout action is placed onto the pending tasks queue without an insertion order and with an unknown source. This works fine on the master, but serialization to other nodes fails (which is not actually a problem, but isn't pretty). We should:
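A minimal sketch of the idea behind the eventual fix: every runnable submitted to the cluster update executor carries an explicit priority and records when it was enqueued, so the pending-tasks report never has to fall back to "unknown". All names here (the class, its methods) are illustrative, not the actual Elasticsearch implementation.

```java
// Illustrative sketch (hypothetical names): a prioritized runnable that
// records its enqueue time, so a pending-tasks report can show a real
// priority and time-in-queue instead of "unknown".
abstract class TimedPrioritizedRunnable implements Runnable, Comparable<TimedPrioritizedRunnable> {
    private final int priority;          // lower value = more urgent
    private final long creationNanos;    // captured when the task is created/enqueued

    TimedPrioritizedRunnable(int priority) {
        this.priority = priority;
        this.creationNanos = System.nanoTime();
    }

    int priority() {
        return priority;
    }

    long timeInQueueMillis() {
        return (System.nanoTime() - creationNanos) / 1_000_000L;
    }

    @Override
    public int compareTo(TimedPrioritizedRunnable other) {
        // Priority queue ordering: more urgent (lower value) tasks run first.
        return Integer.compare(this.priority, other.priority);
    }
}
```

Because the base class always captures a priority and a creation timestamp, even a timeout action submitted to the queue would report meaningful values.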
clintongormley
added
>enhancement
help wanted
adoptme
:Data Management/Stats
Statistics tracking and retrieval APIs
good first issue
low hanging fruit
v1.5.0
labels
Jan 20, 2015
bleskes
added a commit
to bleskes/elasticsearch
that referenced
this issue
Feb 12, 2015
…ds that go into InternalClusterService.updateTasksExecutor At the moment we sometimes submit generic runnables, which makes life slightly harder when generating the pending task list, which has to account for them. This commit adds an abstract TimedPrioritizedRunnable class which should always be used. This class also automatically measures time in queue, which is needed for pending task reporting. Relates to elastic#8077 Closes elastic#9354
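The "no priority exists for [-12]" error on non-master nodes can be pictured with a small sketch: priorities travel across the wire as a single byte, and the reader rejects any value it does not recognize. The enum below is hypothetical (not the actual Elasticsearch code); it only mirrors the failure mode described in this issue.

```java
// Hypothetical sketch of the deserialization failure: a task whose
// priority byte was never set to a known value (e.g. -12) cannot be
// decoded on another node, producing the 500 reported below.
enum TaskPriority {
    IMMEDIATE((byte) 0), URGENT((byte) 1), HIGH((byte) 2),
    NORMAL((byte) 3), LOW((byte) 4), LANGUID((byte) 5);

    private final byte value;

    TaskPriority(byte value) {
        this.value = value;
    }

    byte toByte() {
        return value;
    }

    static TaskPriority fromByte(byte b) {
        for (TaskPriority p : values()) {
            if (p.value == b) {
                return p;
            }
        }
        // Mirrors the error message seen when querying a non-master node.
        throw new IllegalArgumentException(
            "can't deserialize task - no priority exists for [" + b + "]");
    }
}
```

On the master node the task object never round-trips through serialization, which is why the listing merely shows "unknown" there instead of failing outright.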
bleskes
added a commit
to bleskes/elasticsearch
that referenced
this issue
Feb 12, 2015
…ds that go into InternalClusterService.updateTasksExecutor At the moment we sometimes submit generic runnables, which makes life slightly harder when generating the pending task list, which has to account for them. This commit adds an abstract TimedPrioritizedRunnable class which should always be used. This class also automatically measures time in queue, which is needed for pending task reporting. Relates to elastic#8077 Closes elastic#9354 Closes elastic#9671
bleskes
added a commit
to bleskes/elasticsearch
that referenced
this issue
Feb 12, 2015
…ds that go into InternalClusterService.updateTasksExecutor At the moment we sometimes submit generic runnables, which makes life slightly harder when generating the pending task list, which has to account for them. This commit adds an abstract TimedPrioritizedRunnable class which should always be used. This class also automatically measures time in queue, which is needed for pending task reporting. Relates to elastic#8077 Closes elastic#9354 Closes elastic#9671
mute
pushed a commit
to mute/elasticsearch
that referenced
this issue
Jul 29, 2015
…ds that go into InternalClusterService.updateTasksExecutor At the moment we sometimes submit generic runnables, which makes life slightly harder when generating the pending task list, which has to account for them. This commit adds an abstract TimedPrioritizedRunnable class which should always be used. This class also automatically measures time in queue, which is needed for pending task reporting. Relates to elastic#8077 Closes elastic#9354 Closes elastic#9671
We experienced a buildup of pending tasks in the cluster this weekend after some nodes dropped, and eventually the cluster stopped processing tasks altogether. When I checked the tasks with _cat/pending_tasks, I saw some that appeared to be invalid:
7083 12m URGENT shard-started ([session-2014-12-27][24], node[RlRevMXMSOqc3qORtp3xew], [P], s[INITIALIZING]), reason [after recovery from gateway]
-12 12m unknown
7092 12m URGENT shard-started ([session-2014-12-31][8], node[vEgbZbjXTVixtbpmh6Dh5A], [P], s[INITIALIZING]), reason [after recovery from gateway]
When calling _cat/pending_tasks from a node that wasn't the current master, we would occasionally get a 500 saying "can't deserialize task - no priority exists for [-12]".
It was suggested to us that this happens when different JVM versions are running in the same cluster, but that isn't the case in our deployment. There is an ES support ticket open for this overall outage (6726) that may provide some more context.