Elastic Server automatically shutdown #13600
Comments
Hi, I am having the same problem. I have a 4-node cluster and nodes seem to crash randomly. The nodes are running on Ubuntu 15.04, kernel version 3.19.0.28. This is a small installation, so each node is configured with 512MB of RAM. I have tried increasing it and it didn't make any difference.
Below is the content of the /var/crash/_usr_lib_jvm_java-7-oracle_jre_bin_java.112.crash:
Below is the content of my /etc/elasticsearch/elasticsearch.yml:
Below is the output from "curl -XGET http://192.168.1.131:9200/_cluster/health?pretty":
Any idea about what is going on? I am running Elasticsearch 1.7.1, installed from the Debian .deb file, pulled directly from the elasticsearch.org website. Thanks a lot, |
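For anyone who wants to watch for a node dropping out rather than discovering it the next morning, the health check above can be scripted. This is only a minimal sketch: the function names `fetch_cluster_health` and `missing_nodes` are made up for illustration, and the expected node count of 4 is taken from the cluster size described in this comment. It relies on the `number_of_nodes` field that the `_cluster/health` response includes.

```python
import json
import urllib.request


def fetch_cluster_health(base_url, timeout=5.0):
    """GET <base_url>/_cluster/health and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def missing_nodes(health, expected_nodes):
    """Return how many nodes are absent compared with the expected cluster size."""
    return max(expected_nodes - health["number_of_nodes"], 0)


if __name__ == "__main__":
    # Address taken from the curl command above; 4 is the assumed cluster size.
    health = fetch_cluster_health("http://192.168.1.131:9200")
    print(health["status"], "missing nodes:", missing_nodes(health, 4))
```

Running this from cron on a machine outside the cluster would give a rough timestamp for each crash, which can then be matched against the crash files.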
If it is always the same server that is crashing and all servers have the same version of JVM and OS, it might be faulty memory on this server. Could you try running a memory diagnostics test to rule out faulty memory? |
Unfortunately, it isn't always the same server. Also, all servers are running as a Virtual Machine on the same physical server, so if it was faulty memory, wouldn't that affect all the virtual machines? |
Yes, correct; I am in the same situation as well. Just to clarify, my nodes are running on Windows Server 2008 R2 virtual machines. |
There is a good chance it is libsigar causing this. try deleting the 2.0 doesn't rely on sigar at all |
OK... I have removed the lib/sigar directory on 2 of my 4 nodes. Let's see if it makes any difference. |
Hi, so far, my cluster has been pretty stable. Unfortunately, the host that runs it crashed and I had to restart everything, but for the last 48 hours, it has been stable. I'll keep monitoring it and will report back. |
I have looked at my cluster this morning and it looks like the java executable has crashed again.
Below is the content of the .crash file:
I have been searching for further information and, unfortunately, can't find anything as to why the java process crashed. Thanks a lot, |
I'd suggest upgrading your JVM. Also, giving Elasticsearch 512MB is pretty damn limited, especially with 300 indices. That is probably unrelated to this JVM crash though. |
I have upgraded the RAM on the VMs to 1GB, although I don't see any out of memory error messages. |
@dbblackdiamond Have you upgraded the JVM, which was my primary advice? |
@clintongormley: yes I have upgraded my virtual machines from 512MB of RAM to 1GB of RAM. I haven't touched the Java settings though. |
@clintongormley: I used Microsoft's Debug Diagnostic Tool, which gave me an exception report showing that the exception occurred in java.exe. |
@clintongormley: I have increased the memory even more on one of my servers. I increased it to 4GB of RAM for the server and changed the JVM Heap Settings to use 2GB of RAM. So far, things have been stable for all servers. I'll keep monitoring it. |
@clintongormley: despite increasing the JVM Heap Setting from 512MB to 2GB of RAM, the java process still crashed. Is there any way to debug what is causing the java process to crash? |
@dbblackdiamond I repeat: Have you upgraded the JVM, which was my primary advice? |
@clintongormley: My apologies, but I must have misunderstood what you meant by "upgrading the JVM". I thought you were talking about upgrading the resources available to the JVM, i.e. RAM, which I did, but it seems that you meant upgrading the JDK software version. What version would you recommend? The latest Java 8 version? |
You want Java 8 update 40 or above. For the record, when the JVM crashes, you want to first find the crash log, otherwise nobody can debug anything :) See http://docs.oracle.com/javase/7/docs/webnotes/tsg/TSG-VM/html/felog.html#gbwcy for some hints at where it might be. |
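To make the "find the crash log" step concrete, here is a rough sketch of a search for HotSpot fatal error logs. By default HotSpot names the file `hs_err_pid<pid>.log` and writes it to the JVM's working directory, falling back to the OS temp directory if that is not writable; the `-XX:ErrorFile` option can redirect it elsewhere, in which case this search will not find it. The function name `find_hs_err_logs` is made up for illustration, and which directory is actually the working directory of the Elasticsearch process depends on how it was installed and started.

```python
import glob
import os
import tempfile


def find_hs_err_logs(directories):
    """Return hs_err_pid*.log files found directly in the given directories, newest first."""
    matches = []
    for directory in directories:
        # glob returns an empty list for directories that do not exist.
        matches.extend(glob.glob(os.path.join(directory, "hs_err_pid*.log")))
    return sorted(set(matches), key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    # Assumed candidates: the current directory and the OS temp directory.
    # Add the directory Elasticsearch is started from on your system.
    for path in find_hs_err_logs([os.getcwd(), tempfile.gettempdir()]):
        print(path)
```

The header of the newest file (the failing frame and the JVM version line) is usually the most useful part to paste into a bug report.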
@rmuir: Thanks a lot for the recommendation. I have updated 2 out of my 3 servers with Java 8 update 60 and I'll monitor the cluster to see if it is more stable.
Hopefully that makes some sense to you. |
hope your upgrade helps. I can't find any exact bug matching that, except https://bugs.openjdk.java.net/browse/JDK-8129961... |
@clintongormley : I have upgraded java on 2 out of my 3 elasticsearch servers:
and found them down this morning, so the upgrade hasn't helped, as the only server that stayed up was the non-upgraded one. Any other ideas? What do you think I should do? Thanks a lot in advance, |
@dbblackdiamond try removing the entire |
@rmuir: thanks a lot for the suggestion, but I have already done that. |
OK, I'd report the crash to Oracle. Not much we can do here, I am afraid. |
Maybe even try hotspot-gc-use (http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use) directly and mention that the crash looks a lot like https://bugs.openjdk.java.net/browse/JDK-8129961. Perhaps it affects more than just Java 9... |
Based on what @rmuir said, it doesn't sound like there is much we can do here. Closing |
I have three Elasticsearch servers, all masters, each with 32 GB of RAM in total, of which 16 GB is allocated to Elasticsearch.
For the last few days, one of the three servers has been going down at random, with java.exe being killed automatically. No error is logged in the Elasticsearch logs.
I am using a virtual environment with a network file system to store data and logs.
Elasticsearch version: 1.7.0
Index count: around 300
Total documents across the whole of Elasticsearch: around 60 million
Below is my config:
cluster.name: Cluster1
cluster.routing.allocation.disk.threshold_enabled: false
script.disable_dynamic: false
node.name: "Master1"
node.master: true
node.data: true
index.query.bool.max_clause_count: 50100
indices.fielddata.cache.size: 25%
indices.fielddata.cache.expire: 5m
action.disable_delete_all_indices: true
indices.cluster.send_refresh_mapping: false
index.cache.field.type: soft
path.data: \nas5\Elasticsearch\Data
path.logs: \nas5\Elasticsearch\Logs\Master1
bootstrap.mlockall: true
http.max_content_length: 999mb
indices.recovery.max_bytes_per_sec: 100mb
indices.recovery.concurrent_streams: 5
The above config is the same for all three servers: Master1, Master2, and Master3.