Avoid contention on netty channel promise #1321

Closed
wants to merge 1 commit into from

Conversation

merlimat
Contributor

@merlimat merlimat commented Apr 6, 2018

Using a profiler, I have seen that there can be heavy contention between BK threads and the Netty IO thread, due to the check of the channel write condition that was recently added for monitoring purposes.

The problem lies in the fact that a BK thread calls writeAndFlush() on the PCBC, gets the ChannelFuture back, and then adds a listener to that future.

The write operation, though, is completed in the Netty IO thread, and the promise is also triggered from that thread. So there is contention between the calling thread adding the listener and the IO thread completing the promise.

If we add the listener before doing the write on the channel, we can avoid the contention. Another option would be to do the write from the Netty IO thread as well.
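
To illustrate the ordering described above, here is a minimal sketch that uses only the JDK. It is not the patch in this PR: `FakeChannel` and `writeWithListenerFirst` are hypothetical names, and `CompletableFuture` stands in for Netty's promise (it does not lock the way `DefaultPromise` does, so the sketch shows only the ordering, not the contention itself). In Netty, the analogous steps would presumably be `channel.newPromise()`, then `promise.addListener(...)`, then `channel.writeAndFlush(msg, promise)`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class ListenerBeforeWrite {

    /** Hypothetical stand-in for a channel whose writes complete on one IO thread. */
    static final class FakeChannel {
        private final ExecutorService ioThread = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fake-io-thread");
            t.setDaemon(true); // do not keep the JVM alive for this sketch
            return t;
        });

        /** Hands the write to the IO thread, which then completes the caller's promise. */
        void writeAndFlush(Object msg, CompletableFuture<Void> promise) {
            ioThread.execute(() -> promise.complete(null));
        }
    }

    /** Performs one write and returns the name of the thread that ran the callback. */
    static String writeWithListenerFirst(FakeChannel channel, Object msg) {
        AtomicReference<String> callbackThread = new AtomicReference<>();
        CompletableFuture<Void> promise = new CompletableFuture<>();

        // 1. Attach the listener while the promise is still private to this thread,
        //    so no other thread can be completing it at the same time.
        CompletableFuture<Void> done = promise.thenRun(
                () -> callbackThread.set(Thread.currentThread().getName()));

        // 2. Only now hand the promise to the IO thread by issuing the write.
        channel.writeAndFlush(msg, promise);

        done.join(); // wait for the callback, for demonstration purposes only
        return callbackThread.get();
    }

    public static void main(String[] args) {
        System.out.println(writeWithListenerFirst(new FakeChannel(), "addEntry"));
    }
}
```

Because the callback is registered before the write is submitted, the calling thread never touches the promise after the IO thread can see it, and the callback runs on the IO thread when the write completes.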

@merlimat merlimat added this to the 4.7.0 milestone Apr 6, 2018
@merlimat merlimat self-assigned this Apr 6, 2018
@sijie
Member

sijie commented Apr 6, 2018

@merlimat what is the impact of this change? How much of a performance improvement can we get? Do you have a microbenchmark for that?

Contributor

@eolivelli eolivelli left a comment

+1 good catch!

@merlimat
Contributor Author

merlimat commented Apr 6, 2018

> what is the impact of this change? like how much performance change we can get? do you have a microbenchmark for that?

I cannot directly quantify the impact yet. I'm working on removing other contention points in the Pulsar code. The results only become visible once most contention points are removed.

| Monitor Class | Total Blocked Time | Maximum Blocked Time | Average Blocked Time | Std Dev Blocked Time | Distinct Threads | Count | Distinct Addresses |
| --- | --- | --- | --- | --- | --- | --- | --- |
| io.netty.channel.DefaultChannelPromise | 7.53794229E8 ns | 41358171 ns | 1.6750982866666667E7 ns | 6124170.655041036 ns | 15 | 45 | 45 |

Basically, this means that this mutex has blocked IO threads for a total of about 753 ms (the profiling was done over 5 minutes at 100K writes/s). All of that is added to latency.
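
As a quick check on how to read the profiler row, the sketch below converts the reported figures to milliseconds. It assumes the "Count" column (45) is the number of blocking events; the class and method names are hypothetical.

```java
import java.util.Locale;

public class BlockedTimeStats {

    /** Average blocked time per event, in nanoseconds. */
    static double averageNs(double totalNs, long events) {
        return totalNs / events;
    }

    /** Converts nanoseconds to milliseconds. */
    static double toMillis(double ns) {
        return ns / 1_000_000.0;
    }

    public static void main(String[] args) {
        double totalNs = 7.53794229E8; // "Total Blocked Time" from the profiler row
        long events = 45;              // "Count" column, assumed to be blocking events

        System.out.printf(Locale.ROOT, "total blocked: %.1f ms%n", toMillis(totalNs));
        System.out.printf(Locale.ROOT, "average per event: %.2f ms%n",
                toMillis(averageNs(totalNs, events)));
    }
}
```

Under that assumption, the total divided by the count comes to roughly 16.75 ms per blocking event, which agrees with the "Average Blocked Time" column.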

void io.netty.util.concurrent.DefaultPromise.checkNotifyWaiters()	45
boolean io.netty.util.concurrent.DefaultPromise.setValue0(Object)	45
boolean io.netty.util.concurrent.DefaultPromise.setSuccess0(Object)	45
boolean io.netty.util.concurrent.DefaultPromise.trySuccess(Object)	45
void io.netty.util.internal.PromiseNotificationUtil.trySuccess(Promise, Object, InternalLogger)	45
void io.netty.channel.ChannelOutboundBuffer.safeSuccess(ChannelPromise)	45
boolean io.netty.channel.ChannelOutboundBuffer.remove()	45
void io.netty.channel.ChannelOutboundBuffer.removeBytes(long)	45
int io.netty.channel.epoll.AbstractEpollStreamChannel.writeBytesMultiple(ChannelOutboundBuffer, IovArray)	45
int io.netty.channel.epoll.AbstractEpollStreamChannel.doWriteMultiple(ChannelOutboundBuffer)	45
void io.netty.channel.epoll.AbstractEpollStreamChannel.doWrite(ChannelOutboundBuffer)	45
void io.netty.channel.AbstractChannel$AbstractUnsafe.flush0()	45
void io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.flush0()	45
void io.netty.channel.AbstractChannel$AbstractUnsafe.flush()	45
void io.netty.channel.DefaultChannelPipeline$HeadContext.flush(ChannelHandlerContext)	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush0()	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush()	45
ChannelHandlerContext io.netty.channel.AbstractChannelHandlerContext.flush()	45
void io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelHandlerContext)	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush0()	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush()	45
ChannelHandlerContext io.netty.channel.AbstractChannelHandlerContext.flush()	45
void io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelHandlerContext)	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush0()	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush()	45
ChannelHandlerContext io.netty.channel.AbstractChannelHandlerContext.flush()	45
void io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelHandlerContext)	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush0()	45
void io.netty.channel.AbstractChannelHandlerContext.invokeFlush()	45
ChannelHandlerContext io.netty.channel.AbstractChannelHandlerContext.flush()	45
void io.netty.channel.ChannelDuplexHandler.flush(ChannelHandlerContext)	45
void org.apache.bookkeeper.proto.AuthHandler$ClientSideHandler.write(ChannelHandlerContext, Object, ChannelPromise)	45
void io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(Object, ChannelPromise)	45
void io.netty.channel.AbstractChannelHandlerContext.invokeWrite(Object, ChannelPromise)	45
void io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext, Object, ChannelPromise)	45
void io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext, Object, ChannelPromise)	45
void io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext, Object, ChannelPromise)	45
void io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run()	45
void io.netty.util.concurrent.AbstractEventExecutor.safeExecute(Runnable)	45
boolean io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(long)	45
void io.netty.channel.epoll.EpollEventLoop.run()	45
void io.netty.util.concurrent.SingleThreadEventExecutor$5.run()	45
void io.netty.util.concurrent.FastThreadLocalRunnable.run()	45
void java.lang.Thread.run()	45

@merlimat merlimat closed this in 956b37d Apr 6, 2018
@merlimat merlimat deleted the channel-promise-contention branch April 6, 2018 19:06