Possible memory issue with jruby / java / linux. #4367

Open
TheWudu opened this Issue Dec 7, 2016 · 8 comments


@TheWudu
TheWudu commented Dec 7, 2016

Hi!

We are experiencing some problems with the memory management of JRuby, the JVM, or Linux (we are not sure which). As we cannot tell whether the issue belongs to JRuby or not, I would like to get your opinion on it. We no longer know how to proceed with this issue.

Our environment:

  • JRuby 9.1.5.0
  • Linux 3.13.0-95-generic #142~precise1-Ubuntu SMP Fri Aug 12 18:20:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Options:
    -Djruby.shell=/bin/sh
    -Djffi.boot.library.path=
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.authenticate=false
    -Djava.rmi.server.hostname=localhost
    -Djava.security.egd=file:/dev/./urandom
    -Djruby.home=//.rvm/rubies/jruby-9.1.5.0
    -Djruby.lib=//.rvm/rubies/jruby-9.1.5.0/lib
    -Djruby.script=jruby
    -Djruby.daemon.module.name=Trinidad
    -Xmx3193m
    -Xms3193m
    -XX:PermSize=512m
    -XX:+UseG1GC
    -XX:MaxMetaspaceSize=768m
    -XX:MetaspaceSize=512m
    -Xss2048k
    -Xbootclasspath/a://.rvm/rubies/jruby-9.1.5.0/lib/jruby.jar
    -Dcom.sun.management.jmxremote
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.port=1098
    -Djava.rmi.server.hostname=
    -Dfile.encoding=UTF-8
    -Dcommons.daemon.process.id=11164
    -Dcommons.daemon.process.parent=11163
    -Dcommons.daemon.version=1.0.8 abort
  • Framework: Trinidad + Sinatra

When we look at the htop output, or at our graphs, the server needs more and more memory over time. E.g. currently it shows
VIRT: 11.2G
RES: 6738M
and it has been running since last Friday. The server itself has 7984 MB of RAM.

When I connect to the process using VisualVM, I see that it uses ~1.5 GB of heap memory, which is below the configured limit (3193 MB), and about 175 MB of metaspace, which is far below the configured limit of 768 MB. It has between 58 and 62 threads (daemon and live), quite constant, with a few spikes up to 72 that are opened and closed immediately.
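
For reference, the heap and metaspace figures can also be sampled from the shell (a minimal sketch, assuming the JDK's jstat tool is on the PATH; <pid> stands for the Trinidad process id):

    # utilization percentages per region (S0/S1/E/O/M), sampled every 10 seconds
    jstat -gcutil <pid> 10000

    # one-shot capacities and usage in KB, including metaspace
    jstat -gc <pid>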

The problem is that the operating system believes the process needs that much memory, while the Java monitoring shows a completely different picture. When I restart the service (i.e. restart the process), the memory is freed again.

Can you give any hints on how to debug this further? Or maybe you know of something else that could cause such an issue?
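
One way we could narrow down where the discrepancy lives (a rough sketch, assuming the standard Linux ps and pmap utilities; <pid> is a placeholder for our process id) would be to compare the OS view of the process with the JVM's own accounting:

    # OS view: virtual and resident size of the process
    ps -o pid,vsz,rss,cmd -p <pid>

    # per-mapping breakdown; large anonymous mappings outside the Java heap
    # would point towards native allocations (JNI/FFI, direct buffers, thread stacks)
    pmap -x <pid> | sort -k3 -n | tail -20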

Any help would be appreciated,
Martin

@kares
Member
kares commented Dec 7, 2016

there have been a couple of similar reports like this lately, but somehow they always turned out to be app specific.
my first advice is to get a heap dump and start poking around, or set up a JVM monitoring tool and watch when memory usage increases. you should not need that much metaspace unless you're doing hot redeploys.
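
e.g. something like this (a sketch only, using the stock JDK tools; <pid> is the running JRuby process):

    # force a full GC and dump only live objects into a file MAT can open
    jmap -dump:live,format=b,file=/tmp/jruby-heap.hprof <pid>

    # or the jcmd equivalent on newer JDKs
    jcmd <pid> GC.heap_dump /tmp/jruby-heap.hprof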

we can certainly look at a heap dump, but it's time-consuming ;( and sometimes needs source code access.
... are you seeing the "leak-like" behaviour only recently, did you change something, is it getting more load?
(all relevant to consider/think about if you're only running into this lately)

@TheWudu
TheWudu commented Dec 7, 2016

We did not have these issues with jruby-1.7.x before, and the load did not really increase. We have had about 30-40k rpm on average over the last 7 days, with peaks up to ~150k rpm. The heap dumps do not show anything specific. If I take a dump now it is about 1 GB, even though htop reports about 11.3 GB of VIRT memory, and MAT tells me it is only about 300 MB of data on the heap, which matches the VisualVM output.
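
A quick way to cross-check the live set without a full dump (a sketch, assuming the JDK's jmap; the :live option forces a full GC first):

    # class histogram of live objects only; the Total line should roughly match MAT's figure
    jmap -histo:live <pid> | tail -5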

@TheWudu
TheWudu commented Dec 7, 2016 (edited)

Additionally, the Leak Suspects report from MAT tells me:
20,790 instances of "org.jruby.ir.IRMethod", loaded by "" occupy 71,470.79 KB (21.69%) bytes.

8,804 instances of "org.jruby.MetaClass", loaded by "" occupy 64,254.32 KB (19.50%) bytes.

Biggest instances:

org.jruby.MetaClass @ 0x725658710 - 9,759.79 KB (2.96%) bytes.
org.jruby.MetaClass @ 0x72d22c4e8 - 6,511.81 KB (1.98%) bytes.

4,938 instances of "org.jruby.RubyClass", loaded by "" occupy 38,600.81 KB (11.72%) bytes.

Biggest instances:

org.jruby.RubyClass @ 0x700a7ebb0 - 4,300.45 KB (1.31%) bytes.

But as these "leak suspects" are part of the 300 MB on the heap, I do not expect them to be an issue.

@kares
Member
kares commented Dec 7, 2016

so if you're sure there's nothing crazy going on during requests, can you let it (does it) fail with an OoME?
... that would confirm whether there's a heap problem or one elsewhere.
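
e.g. (standard HotSpot flags, added next to the existing -Xmx/-Xms options - a sketch, the dump path is just an example):

    # automatically capture a heap dump if the Java heap really is the problem
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:HeapDumpPath=/tmp/jruby-oome.hprof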

The problem is that the operating system believes the process needs that much memory, while the Java monitoring shows a completely different picture. When I restart the service (i.e. restart the process), the memory is freed again.

so it must be native memory then? just guessing - there's only so much one can do without looking into it.
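
one thing that might help here (a sketch of HotSpot's Native Memory Tracking - note it adds some overhead and only accounts for JVM-internal native memory, not arbitrary JNI/FFI mallocs):

    # start the JVM with tracking enabled (add to the existing options)
    -XX:NativeMemoryTracking=summary

    # later, take a baseline and diff against it once memory has grown
    jcmd <pid> VM.native_memory baseline
    jcmd <pid> VM.native_memory summary.diff
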
naively, did you try going to 9.1.6.0?

@headius
Member
headius commented Dec 11, 2016

It does sound like a native memory leak based on the numbers so far, especially if Java tools report a significantly smaller heap.

@TheWudu
TheWudu commented Dec 13, 2016

Yes, I suspect the same. Any clue how to find the leak? Any cool tools, maybe?

@kares we have not tried JRuby 9.1.6.0 yet.

@headius
Member
headius commented Dec 14, 2016

First thing would be to confirm 9.1.6.0 has the problem. There were many fixes in that release.
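
(Assuming RVM manages the rubies, as the paths above suggest, the upgrade would be roughly:)

    rvm install jruby-9.1.6.0
    rvm use jruby-9.1.6.0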

After that, there's a link on the wiki to flags for profiling memory, tools for getting and analyzing heap dumps, etc.

@headius
Member
headius commented Dec 14, 2016

Oh, I realize now that some of those tools may not be super helpful for native memory leaks, and you already mentioned you've used MAT to analyze a heap dump. OK... so let's get you on 9.1.6.0 and see if there's still a problem.
