How to regulate off-heap usage? #26269
That would not be the limit because Netty (our underlying networking framework) uses its own direct buffer space. With the settings we provide by default, that should lead to up to another
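For readers looking for where that extra direct buffer space is governed, a minimal jvm.options sketch (the value is illustrative, not the thread's actual numbers):

```
# Cap direct ByteBuffer allocations for the whole JVM; by default Netty derives
# its own direct-buffer ceiling from this same limit.
-XX:MaxDirectMemorySize=2g
```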
|
Elasticsearch version (
Plugins installed: []
JVM version (
OS version (
Description of the problem including expected versus actual behavior:
The heap is fine; it seems to be non-heap. With 5.3.0, the RSS was always ~100% of the heap size. With 5.5.1, the RSS grows until ~200% of the heap size and then varies between ~150% and ~200%. We confirmed this by reducing the heap size. We used to get OOMs because our heap size was half of RAM, but now it's fine with the heap size at a quarter of RAM.
Steps to reproduce:
Provide logs (if relevant):
|
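As an aside, a rough way to observe the heap-versus-RSS gap described above (a sketch; the pid lookup assumes the standard Elasticsearch main class):

```
pid=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n1)
# resident set size of the whole process, in kB
ps -o rss= -p "$pid"
# heap capacities and usages, in kB, for comparison against -Xmx
jstat -gc "$pid"
```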
@Shizuu Can you verify the JVM options that Elasticsearch is started with? They are printed in the logs on startup; can you share the log line here? |
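For reference, the node prints its arguments in a single log line at startup; a sketch for pulling it out (the log path is an assumption):

```
# the startup line looks like: ... JVM arguments [-Xms..., -Xmx..., ...]
grep "JVM arguments" /var/log/elasticsearch/*.log | tail -n 1
```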
I disabled the disabling of explicit GC and enabled that netty debug logging:
My environment doesn't allow me to enable & query NMT. Instead of enabling explicit GC's, does adding [1] https://netty.io/4.0/xref/io/netty/util/internal/PlatformDependent.html#141 |
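For anyone whose environment does allow it, enabling and querying Native Memory Tracking is roughly this (a sketch, not from the thread):

```
# 1) add to jvm.options and restart the node:
#      -XX:NativeMemoryTracking=summary
# 2) then query the running JVM:
jcmd "$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n1)" VM.native_memory summary
```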
And in case this helps, this is what cmd-line looks like:
|
You still have disabling of explicit GC there. |
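For context, the flag in question lives in jvm.options; a sketch of what to comment out so that explicit GCs (System.gc()) are allowed again while debugging:

```
# jvm.options: comment out (or remove) this line to re-allow explicit GCs
#-XX:+DisableExplicitGC
```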
Here it is :
Explicit GC is disabled (I took care to check in jvm.options when I updated to 5.5.1). Before posting the issue here I tried to:
|
You have JMX and an agent installed. Would you please remove those and see if the issue continues to reproduce? I want to be clear: I am not suggesting these are the cause (I doubt they are), rather I need to get this down to the essence that I can reproduce before I can make progress on this. The more variables we remove, the simpler our effort will be. And maybe we get lucky before rolling up our sleeves. |
No problem, I can remove them. We use them mostly to monitor the JVM.
But RSS is still growing above 6G. |
Thanks for the quick turnaround. Let's try another thing: from Elasticsearch 5.3.0 to 5.5.1, we upgraded Netty 4 from 4.1.7 to 4.1.11, and Lucene from 6.4.1 to 6.6.0. Those are quite some changes, on top of new features and so on (but the most likely culprits for leaking native memory are Netty and Lucene). Let's start by seeing if it's something new in Netty 4. Can you start Elasticsearch with |
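The exact setting is cut off above; my assumption is that it was the switch back to the bundled Netty 3 transport, which in 5.x would look roughly like this in elasticsearch.yml:

```
# assumption: the elided suggestion was the Netty 3 fallback modules
transport.type: netty3
http.type: netty3
```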
I think you nailed it. The RSS stabilized around 4.7G. 👍 |
I would not call it nailing it yet. We only confirmed a strong hunch I had that it was Netty 4, now I need to understand what changed in Netty 4 that is doing this. 😄 |
So let's go back to the beginning: now we have a culprit, now we need to find what in Netty 4 is causing this. What can you offer in the way of a reproduction? Any straightforward steps I can follow that reliably reproduces the issue? With a reproduction in hand, and knowing where to look, it would be time to roll up our sleeves and find the root cause. |
I don't know exactly how to reproduce it. Another thing I just noticed: our active master is using way more CPU than before. |
I tried a few different approaches tonight to see if I could come up with a reproduction and I do not have one. What is your method for ingesting documents into Elasticsearch? |
We use bulk indexing across all 24 data nodes (we are in the process of building ingest nodes). I had an idea: since the CPU usage is abnormally high, I checked the hot_threads API on the active master and I got this: |
With Netty 4, the same threads are taking 100% of a core: |
@Shizuu how big is your cluster state at the moment? Looking at the hot threads, all these threads are doing is sending mappings over the network. |
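A quick way to gauge cluster state size with the cluster state API (a sketch):

```
# total serialized size of the cluster state, in bytes
curl -s 'localhost:9200/_cluster/state' | wc -c
# or just the metadata section, where the mappings live
curl -s 'localhost:9200/_cluster/state/metadata' | wc -c
```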
Yes, we have a pretty big cluster state because we use a lot of types; even for similar documents we can have dozens of types (I know it's bad, we plan to reduce that to be ready for 6.0 & 7.0). |
Little update: I reproduced an equivalent cluster state in another environment, doing heavy indexing and triggering a lot of mapping changes, but didn't manage to reproduce the issue. Then I asked myself whether the issue is really linked to indexing and mapping changes. And the result is that the issue was still there on the active master:
So I don't know, maybe my cluster is in a weird state after the upgrade. |
I found the culprit! And I managed to reproduce the issue (kinda). I searched for why the master was always spreading mapping information over the network, even when the mapping was not changing. What is on these nodes? Kibana. 15 instances of Kibana on each node, so 30 total. I managed to isolate the query Kibana was doing that was generating this load on the master; here it is:
And each Kibana instance is doing this query every 4 seconds. I tried to spam this query on our staging environment that has Elasticsearch 5.5.1, and I observed the same behavior: RSS increasing and a lot of threads on the master. Side note, though: Kibana 5.3.0 does not make this request. In conclusion:
I'm still wondering why this specific request is working with mapping information, because the response has little to do with mappings (even without the filter_path). What do you think of all of this? (And sorry @erik-stephens, I may have hijacked your issue for something completely unrelated.) |
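For what it's worth, the "same request every 4 seconds" pattern can be replayed with a trivial loop; the URL below is only a placeholder, since the actual Kibana query isn't shown above:

```
# placeholder endpoint -- substitute the real query Kibana was sending
url='localhost:9200/_cluster/health'
while true; do
  curl -s "$url" > /dev/null
  sleep 4
done
```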
Thank you, this is massively helpful! I will see if it reproduces on my side as well and start serious debugging now that we have a possible reproduction. |
@Shizuu No worries, your involvement is welcome! To help summarize things on our end:
|
Sadly, the leak does not reproduce as reported. Is there something special about your cluster state (aside from being large)? |
We have some very long types, but I don't think this is it. Did you test with " |
I'm trying to replicate the issue on another cluster we upgraded to 5.5.1, without success for now, but it has a small cluster state. I'm trying to generate a bigger state and to figure out what in it might be triggering the issue. |
Yes, I was using |
Okayyyy, I have done countless tests: tried to generate many indices, mappings, etc., played with options we have activated, etc. ... no luck. But I think this time I have a reliable reproduction (if not, well... never mind!) So you need:
Then :
In the end, I found the issue is caused by indices with too many types and with very long type names (I tried the same mapping with shorter types like 'type-001', 'type-002', etc., but the issue wasn't triggered). |
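A rough sketch of generating that kind of mapping against a 5.x test cluster (many types per index, with deliberately long type names); the index names, counts, and field names here are made up:

```
long_type="this-is-a-deliberately-very-long-type-name-to-inflate-the-cluster-state"
for i in $(seq 1 20); do
  mappings=""
  for t in $(seq 1 30); do
    mappings+="\"${long_type}-${t}\":{\"properties\":{\"field1\":{\"type\":\"keyword\"}}},"
  done
  # create one index with 30 long-named types
  curl -s -XPUT "localhost:9200/repro-${i}" \
       -H 'Content-Type: application/json' \
       -d "{\"mappings\":{${mappings%,}}}" > /dev/null
done
```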
First, thank you for all your time on this. Second, I'm right there with you the last few days. I will immediately try your reproduction report. |
This happened. That's okay though, it's expected. Did something problematic happen after this? |
It didn't happen in 5.3.0 so we were surprised. |
I tested your reproduction on 5.3.0 and it behaved the same as on 5.5.2. |
I do not think there is a leak here. I think that Netty and the JVM are behaving as expected, with Netty using a pool of direct byte buffers up to the maximum direct memory. I compiled

The reason that Netty 3 does not behave this way is due to it using unpooled heap byte buffers for the large writes that result from sending the compressed cluster state across the wire. These heap byte buffers will be copied to direct byte buffers managed by the JVM as opposed to being pooled by Netty (the JVM will use a cached direct buffer, or allocate a new one if no cached direct buffer is available). If you start the JVM with

The reason that you only see this with

You can control this by setting max direct memory, or by tuning components of the Netty buffer pool.

Thank you for the report, and I appreciate the responsiveness and the effort to work through this. Issues like this can be serious and nasty, so we investigate them thoroughly, treating them as guilty until proven innocent. In this case, I am going to close this as a non-issue. |
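Not from the thread itself, but for readers looking for the knobs mentioned at the end, these are the kinds of settings involved (illustrative values; the Netty properties are passed as system properties in jvm.options):

```
# cap direct memory for the whole JVM (Netty's buffer pool respects this limit)
-XX:MaxDirectMemorySize=1g
# or take Netty's pooled direct arenas out of the picture
-Dio.netty.allocator.type=unpooled
-Dio.netty.noPreferDirect=true
```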
With
Before focusing on Netty, I used this script to quantify how much is attributed to elasticsearch/lucene; it looks like it is not honoring that setting:
Does that analysis look correct? Any other ideas/settings to regulate that usage? Thanks! |
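The script itself isn't shown above; a rough stand-in that sums resident memory from smaps for comparison against -Xmx plus -XX:MaxDirectMemorySize (the pid lookup assumes the standard main class):

```
pid=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n1)
# total resident memory across all mappings, in kB
awk '/^Rss:/ { sum += $2 } END { printf "RSS from smaps: %d kB\n", sum }' "/proc/${pid}/smaps"
```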
To address the question of whether re-allowing explicit GCs will help: should a non-explicit GC attempt to free the native buffers? We are seeing full GCs without a corresponding release of native memory. I was able to re-enable explicit GCs via
Also, I appreciate that this might not be an Elasticsearch issue and perhaps more a Lucene or JVM issue; I'm bugging Elasticsearch first, hoping someone has already tackled this kind of problem. It sounds like most people are running on a dedicated machine (as opposed to a container) and that letting it grow unbounded, managed by the OS, is acceptable, but it does beg the question of why |
I just learned that mmap'd files are not considered direct memory, so that at least explains why MaxDirectMemorySize wasn't having any impact. The memory isn't locked (according to our configs & smaps output), so I'm still trying to figure out why we're OOM'ing when so much is available in the fs cache. I need to look at the system level and stop bugging y'all :)
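One way to see those file-backed (mmap'd Lucene) mappings, which count toward RSS but not toward MaxDirectMemorySize (a sketch; the extensions are common Lucene segment file suffixes):

```
pid=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n1)
# file-backed mappings of Lucene segment files
pmap -x "$pid" | grep -E '\.(cfs|fdt|tim|dvd|doc)' | head -n 20
```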
@xgwu For background, our elasticsearch process is co-hosted with another process that could attempt to allocate a lot of memory quickly. If your elasticsearch process is the only major process, then I doubt you'll have much risk of an OOM situation. My understanding is that this usage is a good thing, in that the system is putting unused memory to good use. It becomes not so good once that usage starts putting pressure on the kernel to allocate new pages quickly enough. If it can't, then it will oom-kill processes, looking at your large elasticsearch process as a good target. I experimented with increasing

In short, it's a Linux memory management issue and not an elasticsearch issue. Some references that helped me: |
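The sentence above is cut off; my guess is it refers to the kernel's free-memory watermark, which on Linux is tuned roughly like this (the value is illustrative):

```
# current kernel reserve
sysctl vm.min_free_kbytes
# raise it so the kernel keeps more free pages on hand (~1 GiB here)
sudo sysctl -w vm.min_free_kbytes=1048576
```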
@erik-stephens Thank you very much for the enlightenment! Then I have nothing to worry about since our elasticsearch is the only major process on the host. |
@erik-stephens Our Elasticsearch process eventually consumed over 100GB of memory and got OOM-killed by the OS. It's the only major process running on the box, so I think there is still something wrong. After more investigation, I suspect it's related to the Java ByteBuffer leak described in this blog. I just added |
FYI: for my particular case, the culprit of the off-heap memory leak seems to be G1 GC on the JVM version ( |
I seem to have a similar issue with Elasticsearch 7.7.0.
This is the java -version output: openjdk 14 2020-03-17. My RAM percent shows it's 99% full and buffer/cache has 80-90 GB. Here are all 10 nodes in the ES cluster; all of them show almost 99% ram.percent
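For reference, ram.percent as reported by the cat nodes API reflects OS-level memory usage, which includes the filesystem cache, so ~99% on its own is not alarming; a sketch of the query:

```
curl -s 'localhost:9200/_cat/nodes?v&h=name,ram.percent,ram.current,ram.max,heap.percent,heap.current'
```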
|
Elasticsearch version (bin/elasticsearch --version):
Version: 5.5.0, Build: 260387d/2017-06-30T23:16:05.735Z, JVM: 1.8.0_131

Plugins installed: []

JVM version (java -version):
openjdk version "1.8.0_66-internal"
OpenJDK Runtime Environment (build 1.8.0_66-internal-b17)
OpenJDK 64-Bit Server VM (build 25.66-b17, mixed mode)

OS version (uname -a if on a Unix-like system):
Linux 4.4.0-72-generic #93-Ubuntu SMP Fri Mar 31 14:07:41 UTC 2017 x86_64 GNU/Linux

Docker version:
Docker version 1.6.2, build 7c8fca2
Description of the problem including expected versus actual behavior:
I expect the process RSS to not greatly exceed the sum of -Xmx & -XX:MaxDirectMemorySize. On a system with 128G of system memory, the elasticsearch process quickly reaches that sum (30G + 32G) and increases slowly until a host level OOM (OS, not JVM). I've ruled out MetaSpace and things related to threads (num+stack size, thread memory pools). Please refer to this thread for more background.
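A quick way to confirm the limits the JVM actually resolved for that sum (a sketch; run with the same JVM that runs Elasticsearch):

```
# prints the effective MaxDirectMemorySize; 0 means the JVM defaults it to the max heap size
java -XX:+PrintFlagsFinal -version | grep -i MaxDirectMemorySize
```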
Steps to reproduce:
Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query, etc. The easier you make it for
us to reproduce it, the more likely that somebody will take the time to look at it.
Provide logs (if relevant):