index.unassigned.node_left.delayed_timeout not working stably in 1.7 #12566
Comments
@mkliu when the node left the cluster, did you see the log message about the delay in the ES logs? It should look like:
(where N is a number). Can you paste what it says?
@dakrone hmm, it's actually not doing a reroute. As described in the first post, I had to manually kick-start it in the end. The
goes on and on and on and on.
@mkliu can you increase the logging level for your cluster to DEBUG and make the master log available so I can take a look?
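For anyone following along, here is a minimal sketch of how the log level could be raised at runtime through the cluster update settings API. The host in `ES` and the default logger name `cluster.routing` are assumptions for illustration, not something stated in this thread; the helper names are hypothetical.

```python
import json
import urllib.request

ES = "http://localhost:9200"  # assumption: a node reachable locally


def debug_logging_request(logger="cluster.routing", base=ES):
    """Build a transient cluster-settings update that sets one logger to DEBUG.

    The logger name is an assumption; pick whichever logger you need.
    """
    body = {"transient": {"logger." + logger: "DEBUG"}}
    return "PUT", base + "/_cluster/settings", body


def send(method, url, body):
    """Send a JSON request and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send(*debug_logging_request())` would apply the change. Because the setting is transient, it is dropped on a full cluster restart.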
No further feedback. Closing.
For example, in the case below (data retrieved from _cluster/health):
Right after I kill the node:
I set the timeout to 30s. The node came back about 10s later, but shards only gradually started recovering ~1.5 min later, and not at the speed I'm expecting. I also don't know why there are relocating_shards.
Worst of all, sometimes after a while it looks as if recovery has stopped, and I need to manually reroute the unassigned shards.
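The steps described above (set the timeout, watch _cluster/health, reroute by hand) can be sketched roughly as follows. This is only an illustration under assumptions: the host in `ES` and all helper names are hypothetical, and the code only builds and sends plain REST requests.

```python
import json
import urllib.request

ES = "http://localhost:9200"  # assumption: a node reachable locally


def delayed_timeout_request(timeout="30s", base=ES):
    """Build the settings update that applies the delayed timeout to all indices."""
    body = {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}
    return "PUT", base + "/_all/_settings", body


def reroute_request(base=ES):
    """Build a reroute call with no commands, asking the master to re-run allocation."""
    return "POST", base + "/_cluster/reroute", {}


def summarize_health(health):
    """Keep only the _cluster/health fields discussed in this issue."""
    keys = ("status", "relocating_shards", "initializing_shards", "unassigned_shards")
    return {k: health.get(k) for k in keys}


def send(method, url, body=None):
    """Send a JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

In use, `send(*delayed_timeout_request("30s"))` sets the timeout, `summarize_health(send("GET", ES + "/_cluster/health"))` can be polled after killing the node, and `send(*reroute_request())` is the manual kick-start.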