
java.lang.OutOfMemoryError: Direct buffer memory error with artery #22133

Open
tpantelis opened this issue Jan 11, 2017 · 7 comments
Labels: 1 - triaged, t:remoting:artery

Comments

@tpantelis

Saw this exception with artery enabled:

15:30:51 2017-01-10 15:21:02,070 | ERROR | ult-dispatcher-3 | kka://opendaylight-cluster-data) | 203 - com.typesafe.akka.slf4j - 2.4.16 | Aeron error, Direct buffer memory
15:30:51 java.lang.OutOfMemoryError: Direct buffer memory
15:30:51 at java.nio.Bits.reserveMemory(Bits.java:693)[:1.8.0_111]
15:30:51 at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)[:1.8.0_111]
15:30:51 at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)[:1.8.0_111]
15:30:51 at akka.remote.artery.EnvelopeBufferPool.acquire(BufferPool.scala:40)[215:com.typesafe.akka.remote:2.4.16]
15:30:51 at akka.remote.artery.AeronSource$Fragments$$anon$2.onFragment(AeronSource.scala:59)[215:com.typesafe.akka.remote:2.4.16]
15:30:51 at io.aeron.FragmentAssembler.onFragment(FragmentAssembler.java:82)[218:io.aeron.client:1.0.4]
15:30:51 at io.aeron.logbuffer.TermReader.read(TermReader.java:74)[218:io.aeron.client:1.0.4]
15:30:51 at io.aeron.Image.poll(Image.java:211)[218:io.aeron.client:1.0.4]
15:30:51 at io.aeron.Subscription.poll(Subscription.java:132)[218:io.aeron.client:1.0.4]
15:30:51 at akka.remote.artery.AeronSource$$anonfun$akka$remote$artery$AeronSource$$pollTask$1.apply$mcZ$sp(AeronSource.scala:38)[215:com.typesafe.akka.remote:2.4.16]
15:30:51 at akka.remote.artery.TaskRunner.executeTasks(TaskRunner.scala:171)[215:com.typesafe.akka.remote:2.4.16]
15:30:51 at akka.remote.artery.TaskRunner.run(TaskRunner.scala:150)[215:com.typesafe.akka.remote:2.4.16]
15:30:51 at java.lang.Thread.run(Thread.java:745)[:1.8.0_111]

I had set maximum-frame-size higher to allow for larger messages, which tripped the JVM's MaxDirectMemorySize limit (64M by default here). The settings can be adjusted to alleviate this, but I think the EnvelopeBufferPool should provide some protection against it. Even ensuring that maximum-frame-size * buffer-pool-size is comfortably less than MaxDirectMemorySize doesn't seem safe, since buffer-pool-size "is not a hard upper limit on number of created buffers. Additional buffers will be created if needed".

Perhaps if ByteBuffer.allocateDirect throws OutOfMemoryError, the pool could fall back to a HeapByteBuffer via ByteBuffer.allocate.
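
A minimal sketch of that fallback, with a hypothetical helper name (the real EnvelopeBufferPool.acquire is structured differently):

    import java.nio.ByteBuffer

    object FallbackAllocation {
      // Hypothetical sketch of the suggestion above, not the actual
      // EnvelopeBufferPool code: attempt a direct allocation and fall
      // back to a heap buffer when direct memory is exhausted.
      def allocateEnvelopeBuffer(size: Int): ByteBuffer =
        try ByteBuffer.allocateDirect(size)
        catch {
          case _: OutOfMemoryError => ByteBuffer.allocate(size) // heap-backed
        }
    }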

@patriknw
Member

patriknw commented Jan 11, 2017

We should absolutely document this (I know about it, but it fell through the cracks).

Is there a way to get the max limit at runtime so we could issue a warning?
The JVM args could be retrieved from RuntimeMXBean, but that is not so nice.
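
For illustration, a hedged sketch of that RuntimeMXBean approach (assumes Scala 2.13+ for CollectionConverters; this only sees an explicitly passed flag, not the JVM's computed default):

    import java.lang.management.ManagementFactory
    import scala.jdk.CollectionConverters._

    object DirectMemoryFlag {
      // Scan the JVM input arguments for an explicit
      // -XX:MaxDirectMemorySize flag; None means the flag was not set
      // and the JVM default applies.
      def explicitMaxDirectMemory(): Option[String] =
        ManagementFactory.getRuntimeMXBean.getInputArguments.asScala
          .find(_.startsWith("-XX:MaxDirectMemorySize="))
          .map(_.stripPrefix("-XX:MaxDirectMemorySize="))
    }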

Trying to catch OutOfMemoryError doesn't sound attractive. If direct memory is exhausted, allocations will probably start failing in other places where we are not in control.

@tpantelis
Author

The only viable way is via RuntimeMXBean. There is sun.misc.VM, but of course you wouldn't want to use that.

@drewhk
Member

drewhk commented Jan 12, 2017

Nevertheless, we can keep a counter of created DirectByteBuffers fairly easily and fall back to heap-based buffers once that limit is reached (and likely log a warning). This allows for some graceful degradation compared to an OOME.
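
An illustrative sketch of that counter idea (names hypothetical, not Akka API):

    import java.nio.ByteBuffer
    import java.util.concurrent.atomic.AtomicLong

    // Illustrative sketch only: track how much direct memory has been
    // handed out and degrade to heap buffers past a budget, instead of
    // risking an OutOfMemoryError.
    class BudgetedAllocator(directBudgetBytes: Long) {
      private val directInUse = new AtomicLong(0L)

      def allocate(size: Int): ByteBuffer =
        if (directInUse.addAndGet(size) <= directBudgetBytes)
          ByteBuffer.allocateDirect(size)
        else {
          directInUse.addAndGet(-size) // undo the failed reservation
          ByteBuffer.allocate(size)    // heap fallback; log a warning here
        }

      // The budget stays accurate only if direct buffers are released,
      // which is the concern raised two comments below.
      def release(buf: ByteBuffer): Unit =
        if (buf.isDirect) directInUse.addAndGet(-buf.capacity())
    }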

@drewhk added the 1 - triaged and t:remoting:artery labels on Jan 12, 2017
@drewhk added this to the 2.4.17 milestone on Jan 12, 2017
@tpantelis
Author

Agree - sounds like a good solution.

@patriknw
Member

Wouldn't that mean that we must always return the buffers to the pool? I don't think we have that as a hard requirement for all error situations currently.

@patriknw modified the milestone: 2.4.17 on Feb 10, 2017
@nodefactory-bk
Contributor

Just saw this on Akka 2.6.14 in a very memory-constrained app.
It happens pretty rarely, though.

Not sure if it is the same problem, but this was the closest open ticket Google found.

I have this set for frame sizes:

    artery {
      advanced {
        maximum-frame-size = 8MB
      }
    }

And the process is limited to a 64MB heap.

The exception:

[2021-05-25 09:11:59.707 354145529] [xxx-next-xxx.dispatchers.logging-4] [akka.actor.ActorSystemImpl(xxx-next)] ERROR akka.actor.ActorSystemImpl  - Uncaught error from thread [xxx-next-akka.remote.default-remote-dispatcher-5]: Direct buffer memory, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[xxx-next]
java.lang.OutOfMemoryError: Direct buffer memory
	at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
	at akka.remote.artery.EnvelopeBufferPool.acquire(EnvelopeBufferPool.scala:34)
	at akka.remote.artery.Encoder$$anon$1.onPush(Codecs.scala:117)
	at akka.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:541)
	at akka.stream.impl.fusing.GraphInterpreter.processEvent(GraphInterpreter.scala:495)
	at akka.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:390)
	at akka.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:625)
	at akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:502)
	at akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:600)
	at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:775)
	at akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:790)
	at akka.actor.Actor.aroundReceive(Actor.scala:537)
	at akka.actor.Actor.aroundReceive$(Actor.scala:535)
	at akka.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:691)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:579)
	at akka.actor.ActorCell.invoke(ActorCell.scala:547)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
	at akka.dispatch.Mailbox.run(Mailbox.scala:231)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

@johanandren
Member

The direct buffers are allocated outside of the heap (for performance/efficiency). What was mentioned previously is MaxDirectMemorySize, which sets the total limit for such direct memory allocations for the JVM.

Before deciding to allocate a new direct buffer for Artery, the encoder tries to share buffers in a pool. The default size of that pool is 128 buffers, so that's 8 MB * 128 = 1 GB of direct memory to start with, probably not a great idea in a "memory constrained" environment. (It could be worth looking into the large-message channel if you can isolate the large messages and leave the regular channel closer to the default values; see the sketch below.)
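
A hedged config sketch of that split (the destination path and sizes are illustrative; the keys are the standard Artery settings):

    akka.remote.artery {
      # Route only the actors that receive big payloads over the dedicated
      # large-message channel (path below is illustrative).
      large-message-destinations = ["/user/large-payload-handler"]
      advanced {
        maximum-large-frame-size = 8MB  # applies to the large channel only
        maximum-frame-size = 256KiB     # keep the regular channel at its default
      }
    }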
