
OutOfDirectMemoryError/large number of PooledUnsafeDirectByteBuf allocated #6343

Closed

lkoe opened this issue Feb 10, 2017 · 11 comments

@lkoe

lkoe commented Feb 10, 2017

We're using Netty via Apache Camel (camel-netty4). The component acts as a server and is only supposed to read messages from clients.
When starting our system, a number of clients (around 20-25) connect to the TCP server. During that process we started to experience issues with the available direct memory being exhausted quickly.

io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 469762048, max: 477626368)
	at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:624)
	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:578)
	at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:686)
	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:675)
	at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:237)
	at io.netty.buffer.PoolArena.allocate(PoolArena.java:221)
	at io.netty.buffer.PoolArena.allocate(PoolArena.java:141)
	at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:262)
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:170)
	at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:131)
	at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:73)
	at io.netty.channel.socket.nio.NioDatagramChannel.doReadMessages(NioDatagramChannel.java:242)
	at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:75)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:610)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:551)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:465)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:437)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873)
	at java.lang.Thread.run(Thread.java:745)

After tinkering with JVM and Netty parameters (e.g. "-XX:MaxDirectMemorySize=1G -Dio.netty.allocator.pageSize=8192 -Dio.netty.allocator.maxOrder=10") the whole shebang can start successfully.
However, I am very sceptical about the amount of direct memory Netty allocates. When taking a heap dump of the server application I can see that there are 63 (!) instances of PooledUnsafeDirectByteBuf held in memory.
Even after extended no-load periods, none of these PooledUnsafeDirectByteBuf instances ever seem to be released.

Activating leak detection ("-Dio.netty.leakDetectionLevel=paranoid") yielded nothing.
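
For completeness, the programmatic equivalent of that flag (assuming the standard ResourceLeakDetector API) is:

// Equivalent to -Dio.netty.leakDetectionLevel=paranoid; set before any channels or buffers are created
io.netty.util.ResourceLeakDetector.setLevel(io.netty.util.ResourceLeakDetector.Level.PARANOID);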

Am I missing something very obvious? Is this behaviour expected? What is the recommended way to cope with this?

Netty version

4.1.5

JVM version (e.g. java -version)

Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

OS version (e.g. uname -a)

Linux *** 3.12.59-60.45-default #1 SMP Sat Jun 25 06:19:03 UTC 2016 (396c69d) x86_64 x86_64 x86_64 GNU/Linux

@Scottmitch
Member

Please update to the latest Netty (4.1.8) and report back if you can still reproduce.

We retain the objects from the PooledByteBufAllocator via the Recycler, so that in itself is not necessarily an indication of a problem. You can use -Dio.netty.recycler.maxCapacityPerThread to tweak the limits or to disable the Recycler.

@johnou
Contributor

johnou commented Feb 11, 2017

You can disable the recycler with -Dio.netty.recycler.maxCapacity=0 -Dio.netty.recycler.maxCapacity.default=0

@lkoe
Author

lkoe commented Feb 14, 2017

I can observe the same behaviour using the latest Netty (4.1.8). I also tested the Recycler options, but they had no noticeable effect on my problem.

To reproduce the behaviour I created a small test project.
It contains a JUnit test which sets up a simple Netty TCP server and then uses plain Socket connections to send messages.
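
For illustration, a minimal sketch of that kind of setup (hypothetical, not the actual test project; port, message, and client count are placeholders):

// Hypothetical sketch of the reproducer described above: a plain Netty TCP server that
// just consumes bytes, plus raw blocking Socket "clients" that each send one small message.
import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.ByteBuf;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class MultipleClientsRepro {
    public static void main(String[] args) throws Exception {
        EventLoopGroup boss = new NioEventLoopGroup(1);
        EventLoopGroup workers = new NioEventLoopGroup();
        try {
            Channel server = new ServerBootstrap()
                    .group(boss, workers)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            ch.pipeline().addLast(new SimpleChannelInboundHandler<ByteBuf>() {
                                @Override
                                protected void channelRead0(ChannelHandlerContext ctx, ByteBuf msg) {
                                    // just consume; SimpleChannelInboundHandler releases the buffer
                                }
                            });
                        }
                    })
                    .bind(9999).sync().channel();

            // ~25 short-lived clients, mimicking the startup scenario described above
            for (int i = 0; i < 25; i++) {
                try (Socket s = new Socket("127.0.0.1", 9999)) {
                    s.getOutputStream().write("hello\n".getBytes(StandardCharsets.UTF_8));
                }
            }
            server.close().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}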

Running the "testMultipleClients" test and using some well-placed breakpoints (e.g. io.netty.util.internal.PlatformDependent.incrementMemoryCounter(int)) I can determine that for each new client connection a new PooledUnsafeDirectByteBuf is reserved - up to a limit of 9 on my local machine.
So the first 9 connections each lead to a new buffer allocation; all following connections make do with those and don't reserve anything additional. This means - as far as I understand - that 9*16MB=144MB of direct memory have been allocated.

Rifling through the Netty code I couldn't quite figure out which factors/properties lead to that particular number (9) of PooledUnsafeDirectByteBuf.

In any case, in our real application scenario the number of reserved PooledUnsafeDirectByteBuf apparently grows beyond 64 - although none of the clients put particular load onto the server (after all, the error happens when the system is just starting up). Due to that rather high number of buffers we run into the OutOfDirectMemoryError originally reported. With 1GB of MaxDirectMemorySize, 64 buffers of 16MB each just fit, hence the conclusion that more than 64 must have been requested (and none freed).

What I can't figure out:

  1. Am I onto something here or hunting ghosts?
  2. What settings/machine specifics (e.g. number of CPUs) factor into the number of reserved PooledUnsafeDirectByteBuf?
  3. Is it OK to just tweak Netty settings to get past the startup scenario, where the OutOfDirectMemoryError occurs?

To add to the last point - once everything has started normally I can put quite some pressure onto the server application (for instance using tcpkali) with a sizable number of clients (> 100) sending a sizable number of messages/second (> 100) without encountering this particular issue (or any issue at all).

@Scottmitch
Member

Thanks for reporting back and for the reproducer @lkoe ... we will investigate and get back to you.

@normanmaurer
Member

normanmaurer commented Feb 15, 2017

@lkoe this is most likely because of the default configuration of PooledByteBufAllocator, which creates cores * 2 PoolArenas; each of these arenas will in the end have at least 16MB of allocated direct memory. This is done to reduce lock contention on the arenas themselves, but it can also have a "high" memory overhead depending on your memory constraints. You can override this either by specifying it in the constructor of PooledByteBufAllocator or by using the following system properties:

-Dio.netty.allocator.numHeapArenas=
-Dio.netty.allocator.numDirectArenas=

Hope this helps.
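
For illustration, the constructor route might look roughly like this (a sketch, not taken from any project here; the five-argument PooledByteBufAllocator constructor with preferDirect, nHeapArena, nDirectArena, pageSize, maxOrder is assumed):

// Sketch: use 2 direct arenas instead of the default cores * 2, and attach the allocator to the bootstrap.
PooledByteBufAllocator alloc = new PooledByteBufAllocator(true /* preferDirect */, 2, 2, 8192, 11);
ServerBootstrap b = new ServerBootstrap();
b.option(ChannelOption.ALLOCATOR, alloc)
 .childOption(ChannelOption.ALLOCATOR, alloc);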

@Viyond

Viyond commented Dec 8, 2017

So, what is the conclusion? I encountered the same problem. @lkoe

@lkoe
Author

lkoe commented Jan 16, 2018

@Viyond no real conclusion. In the end we tweaked some parameters to make the immediate exception go away in our environments.

We set these JVM params: -XX:MaxDirectMemorySize=1G -Dio.netty.allocator.pageSize=8192 -Dio.netty.allocator.maxOrder=10
In my (limited) understanding this results in 8 MB of reserved direct memory per allocated buffer chunk, thus fitting 128 buffers into 1G of direct memory. This number was high enough not to trigger the exception in our environment anymore. However, this is all still quite mysterious to me and in no way a recommendation of sorts.
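
The arithmetic behind that, as I understand the allocator's chunk sizing (this is my reading, not an official formula):

// Per-arena chunk size = pageSize shifted left by maxOrder
int pageSize = 8192;                  // -Dio.netty.allocator.pageSize
int maxOrder = 10;                    // -Dio.netty.allocator.maxOrder
int chunkSize = pageSize << maxOrder; // 8192 * 1024 = 8388608 bytes = 8 MiB per chunk
// With the defaults (pageSize=8192, maxOrder=11) this gives 16 MiB,
// matching the 16777216-byte allocations in the original stack trace.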

@sumitchawla

sumitchawla commented Mar 3, 2018

Is there a tool that can print everything that is allocated in direct memory? In my case, with 1G of direct memory, I am getting a similar exception. I am using the settings above (-Dio.netty.allocator.pageSize=8192 -Dio.netty.allocator.maxOrder=10), and when I obtain a heap dump I see just 30 instances of io.netty.buffer.PooledUnsafeDirectByteBuf.

io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 8388608 byte(s) of direct memory (used: 1023410176, max: 1029177344)
	at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:506)
	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:460)
	at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:701)
	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:690)

My machine has 32 cores.
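
One way to inspect this at runtime (a sketch; it assumes the pooled allocator metric API available in recent Netty 4.1.x releases):

// Sketch: print what the default pooled allocator believes it has reserved in direct memory.
PooledByteBufAllocatorMetric m = PooledByteBufAllocator.DEFAULT.metric();
System.out.println("direct arenas: " + m.numDirectArenas());
System.out.println("chunk size:    " + m.chunkSize());
System.out.println("used direct:   " + m.usedDirectMemory() + " bytes");
// PooledByteBufAllocator.DEFAULT.dumpStats() also returns a detailed per-arena report.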

@konsultaner

@normanmaurer Could this issue have something to do with an unreleased slice of the incoming byteBuf?

protected void decode(ChannelHandlerContext channelHandlerContext, ByteBuf byteBuf, List<Object> list) {
    while (byteBuf.isReadable()) {
        list.add(byteBuf.slice(..., ...)); // <- not completely released?
    }
}

@normanmaurer
Member

@konsultaner actually you would need to call byteBuf.retainedSlice(...) here to ensure you retain the buffer, as it will be released once decode(...) completes.
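
A sketch of that pattern (the frame length is hypothetical, just to show retainedSlice plus advancing the reader index; the downstream handler then becomes responsible for releasing each slice):

@Override
protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
    final int frameLength = 128; // hypothetical fixed frame size, for illustration only
    while (in.readableBytes() >= frameLength) {
        // retainedSlice keeps the slice valid after decode() returns and the input buffer is released
        out.add(in.retainedSlice(in.readerIndex(), frameLength));
        in.skipBytes(frameLength);
    }
}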

@javisst

javisst commented Mar 27, 2018

We have had the same issue for months now, together with the async MongoDB driver and with SSL activated.
The issue only occurred during bulk operations with the MongoDB driver. The JVM settings

-XX:MaxDirectMemorySize=1G -Dio.netty.allocator.pageSize=8192 -Dio.netty.allocator.maxOrder=10

finally helped. Thank you very much!

I attached my stack trace. In case you're interested I can provide you with a sample project.

org.springframework.dao.InvalidDataAccessResourceUsageException: Unexpected exception; nested exception is com.mongodb.MongoInternalException: Unexpected exception
	at org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:91)
	at org.springframework.data.mongodb.core.ReactiveMongoTemplate.potentiallyConvertRuntimeException(ReactiveMongoTemplate.java:2210)
	at org.springframework.data.mongodb.core.ReactiveMongoTemplate.lambda$translateException$55(ReactiveMongoTemplate.java:2193)
	at reactor.core.publisher.Flux.lambda$onErrorMap$24(Flux.java:5420)
	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:88)
	at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:120)
	at com.mongodb.reactivestreams.client.internal.ObservableToPublisher$1.onError(ObservableToPublisher.java:73)
	at com.mongodb.async.client.AbstractSubscription.onError(AbstractSubscription.java:123)
	at com.mongodb.async.client.MongoIterableSubscription$1.onResult(MongoIterableSubscription.java:52)
	at com.mongodb.async.client.MongoIterableSubscription$1.onResult(MongoIterableSubscription.java:48)
	at com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:49)
	at com.mongodb.async.client.AsyncOperationExecutorImpl$1$1.onResult(AsyncOperationExecutorImpl.java:70)
	at com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:49)
	at com.mongodb.operation.FindOperation$3.onResult(FindOperation.java:819)
	at com.mongodb.operation.OperationHelper$ReferenceCountedReleasingWrappedCallback.onResult(OperationHelper.java:353)
	at com.mongodb.operation.CommandOperationHelper$1.onResult(CommandOperationHelper.java:385)
	at com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:49)
	at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor$2.onResult(DefaultServer.java:205)
	at com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:49)
	at com.mongodb.connection.CommandProtocolImpl$1.onResult(CommandProtocolImpl.java:100)
	at com.mongodb.connection.DefaultConnectionPool$PooledConnection$1.onResult(DefaultConnectionPool.java:458)
	at com.mongodb.connection.UsageTrackingInternalConnection$2.onResult(UsageTrackingInternalConnection.java:110)
	at com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:49)
	at com.mongodb.connection.InternalStreamConnection$2$1.onResult(InternalStreamConnection.java:364)
	at com.mongodb.connection.InternalStreamConnection$2$1.onResult(InternalStreamConnection.java:359)
	at com.mongodb.connection.InternalStreamConnection$MessageHeaderCallback$MessageCallback.onResult(InternalStreamConnection.java:628)
	at com.mongodb.connection.InternalStreamConnection$MessageHeaderCallback$MessageCallback.onResult(InternalStreamConnection.java:618)
	at com.mongodb.connection.InternalStreamConnection$5.failed(InternalStreamConnection.java:493)
	at com.mongodb.connection.netty.NettyStream.readAsync(NettyStream.java:232)
	at com.mongodb.connection.netty.NettyStream.handleReadResponse(NettyStream.java:266)
	at com.mongodb.connection.netty.NettyStream.access$600(NettyStream.java:66)
	at com.mongodb.connection.netty.NettyStream$InboundBufferHandler.exceptionCaught(NettyStream.java:333)
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256)
	at io.netty.handler.ssl.SslHandler.exceptionCaught(SslHandler.java:1030)
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1381)
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
	at io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:933)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:112)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:157)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.mongodb.MongoInternalException: Unexpected exception
	at com.mongodb.connection.InternalStreamConnection.translateReadException(InternalStreamConnection.java:542)
	at com.mongodb.connection.InternalStreamConnection.access$1300(InternalStreamConnection.java:74)
	... 25 common frames omitted
Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 65536 byte(s) of direct memory (used: 10438830, max: 10485760)
	at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:640)
	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:594)
	at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.allocateDirect(UnpooledUnsafeNoCleanerDirectByteBuf.java:30)
	at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:68)
	at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.<init>(UnpooledUnsafeNoCleanerDirectByteBuf.java:25)
	at io.netty.buffer.UnsafeByteBufUtil.newUnsafeDirectByteBuf(UnsafeByteBufUtil.java:595)
	at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:327)
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
	at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
	at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
	... 7 common frames omitted

Netty version 4.1.22.Final
