Nodes leaving cluster #17501
Comments
I think it's important to understand why the nodes left the cluster. Do you see anything in the logs, such as long GC pauses? Do you monitor the memory usage of the nodes? Also, when a node leaves, do you see anything of note in that node's logs?
I just checked my logs and I don't see anything specific about nodes leaving the cluster. I have to run a query or check KOPF to realize they are missing. I remember that when I first started setting things up, nodes would leave if they couldn't find an active host in discovery.zen.ping.unicast.hosts. As for memory, I've set ES_HEAP_SIZE to 32g, but I don't know if that's relevant. If one of the nodes leaves the cluster again, I'll try to get more info.
It just happened again. This is what my logs say:
They point to a memory problem, but I've set 32g, which should be enough, and according to KOPF I'm not using a lot of memory either. Each of my instances has its discovery zen settings configured, so I don't know why they can no longer find the master, which didn't leave the cluster. Also, I noticed that the only nodes that leave every time are the ones where I run three instances on one machine, with 32g for each instance, but I should still have at least 100g of unused RAM on these machines.
This line indicates you are indeed having memory problems:
Once a node runs out of memory it becomes unreliable and does need to be restarted. You mention that KOPF indicates you have enough memory. Can you elaborate?
When you see the message "unable to create new native thread" this generally means that you have an issue limiting the number of processes that the elasticsearch user can create. The exact resolution varies from system to system, but in general on Linux you would look at |
It's not relevant to the issue that you're seeing in the logs ("unable to create new native threads"). However, you'll actually see better results if you drop the heap slightly below 32g. On the version of Elasticsearch that you're running, when a node starts up you'll see a message that says
but if you drop the heap below 32g you'll be able to take advantage of compressed oops
This actually gives you more usable heap, and the smaller pointers are friendlier to memory bandwidth and CPU caches.
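As a rough illustration of where the 32g boundary comes from: with compressed oops, HotSpot stores object references in 32 bits and scales them by the object alignment, which defaults to 8 bytes. The sketch below only reproduces that arithmetic; the function name is made up for this example, and the exact cutoff on a given JVM and platform sits somewhat below the theoretical figure, so treat it as an approximation rather than a tuning rule.

```python
# Back-of-the-envelope arithmetic for the compressed-oops heap limit.
# Assumptions: 32-bit compressed references and HotSpot's default
# 8-byte object alignment. The real cutoff is slightly lower in practice.

POINTER_BITS = 32       # size of a compressed reference, in bits
OBJECT_ALIGNMENT = 8    # bytes; every object starts on an 8-byte boundary


def max_compressed_heap_bytes(pointer_bits=POINTER_BITS, alignment=OBJECT_ALIGNMENT):
    """Largest heap addressable when each reference counts aligned slots."""
    return (2 ** pointer_bits) * alignment


limit = max_compressed_heap_bytes()
print("theoretical compressed-oops limit:", limit // 2 ** 30, "GiB")
```

A heap set at or above this size forces the JVM back to full 64-bit references, which is why a heap slightly below 32g can hold more objects than one set to exactly 32g.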
In KOPF I can check heap usage: my instances usually use around 1 GB of a 31.81 GB max, so they're nowhere near the limit. That's why I find this really weird. I do have some nodes with a low max RAM, but they're not the ones that leave the cluster and are actually quite stable. I haven't checked limits.conf yet; I'll do it ASAP.
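For the process limit discussed earlier in the thread, one quick way to see what applies to a given user is to read it from a process started as that user. This is a minimal sketch using Python's standard `resource` module on Linux, where threads count against the per-user process limit; the helper names are made up for this example, and it has to be run as the user that runs Elasticsearch to be meaningful.

```python
import resource


def nproc_limits():
    """Return the (soft, hard) limit on processes/threads for the current user."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = nproc_limits()
print("max user processes: soft=%s hard=%s" % (describe(soft), describe(hard)))
```

Note that an already running node keeps the limits it was started with; on Linux those can be read from `/proc/<pid>/limits` for the node's process ID.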
@jasontedor has already provided the answer, and given that this topic is more suited to the forums than GitHub, I'm going to close this issue.
Hello,
I'm using Elasticsearch 2.3 with JVM 1.7 on CentOS 6.6.
I recently set up an Elasticsearch cluster with 14 nodes, but I've run into a problem I can't understand.
Sometimes nodes leave the cluster for no apparent reason, and they won't come back unless I restart the instance.
It's kind of annoying because it messes up all the sharding and things like that.
I don't really understand why, because on every single instance I filled in
discovery.zen.ping.unicast.hosts:
with the hosts of all the instances.