Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release ping/pong command buffer #283

Merged
merged 1 commit into from
Mar 10, 2017
Merged

Conversation

saandrews
Copy link
Contributor

@saandrews saandrews commented Mar 9, 2017

Motivation

cmdPing and cmdPong are static ByteBuf which ideally does not required to be ref counted. We are wrapping it with another byteBuf which is released by netty. However, the original cmd buffer ref count increases to int max and server throws the following exception. This causes clients to close connection and reconnect and this cycle repeats until broker restart:

c.y.pulsar.broker.service.ServerCnx  - [-] Got exception: refCnt: 2147483647, increment: 1
io.netty.util.IllegalReferenceCountException: refCnt: 2147483647, increment: 1
        at io.netty.buffer.AbstractReferenceCountedByteBuf.retain(AbstractReferenceCountedByteBuf.java:66) ~[netty-all-4.0.39.Final.jar:4.0.39.Final]
        at io.netty.buffer.AbstractRecyclableDerivedByteBuf.init(AbstractRecyclableDerivedByteBuf.java:27) ~[pulsar-common-1.15.7.jar:4.0.39.Final]
        at io.netty.buffer.RecyclableDuplicateByteBuf.init(RecyclableDuplicateByteBuf.java:59) ~[pulsar-common-1.15.7.jar:4.0.39.Final]
        at io.netty.buffer.RecyclableDuplicateByteBuf.create(RecyclableDuplicateByteBuf.java:42) ~[pulsar-common-1.15.7.jar:4.0.39.Final]
        at com.yahoo.pulsar.common.api.Commands.newPong(Commands.java:416) ~[pulsar-common-1.15.7.jar:na]
        at com.yahoo.pulsar.common.api.PulsarHandler.handlePing(PulsarHandler.java:77) ~[pulsar-common-1.15.7.jar:na]
        at com.yahoo.pulsar.common.api.PulsarDecoder.channelRead(PulsarDecoder.java:183) ~[pulsar-common-1.15.7.jar:na]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366) [netty-all-4.0.39.Final.jar:4.0.39.Final]

Modifications

Release the command buffer before returning.

Result

Command buffer will not reach int max ref count and fail ping request.

@merlimat merlimat changed the title Rlease ping/pong command buffer Release ping/pong command buffer Mar 9, 2017
static ByteBuf newPing() {
return RecyclableDuplicateByteBuf.create(cmdPing).retain();
public static ByteBuf newPing() {
ByteBuf pingCmdBuf = RecyclableDuplicateByteBuf.create(cmdPing).retain();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't already the pingCmdBuf be released when it gets written on the channel? That release should then be propagated to the cmdBuf buffer.

Another option here would be to make the cmdPing buffer to be unpooled, given that anyway we're going to have 1 single instance of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats what I was wondering. With one static instance of cmdPing, why do we need pooled buffer. Can we just "return cmdPing"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It cannot be just return cmdPing because the buffer has a read pointer, once you write on the channel, the read pointer will be moved and the buffer will look empty next time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we still need to create a duplicate of the original cmdPing, though if the cmdPing itself is not pooled, the refcounting will be disabled there

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's better to unpool cmdPing and keep one ByteBuf for ping/pong as they are static.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be multiple threads sending the ping on different connections. The ByteBuf itself is not thread safe, so that's why you need a duplicate buffer object (referencing the same bytes).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doh. of course.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return RecyclableDuplicateByteBuf.create(cmdPing)

seems to work as well. With the old logic, it's ref count is initialized to 1 when we get it from recycler"RecyclableDuplicateByteBuf buf = RECYCLER.get();". The additional retain() is causing netty not to release it since it's ref count is not 0 after decrementing by 1?

Does it mean we have a leak? I don't see anything reported by leak detector though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a real leak, just a recycler leak. It means that objects are getting GCed instead of being reused in the recycler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated it to remove the retain().

@rdhabalia rdhabalia added this to the 1.17 milestone Mar 9, 2017
@merlimat merlimat added the type/bug The PR fixed a bug or issue reported a bug label Mar 9, 2017
@saandrews saandrews force-pushed the master branch 2 times, most recently from cee70dd to 384fe4d Compare March 10, 2017 08:03
@merlimat
Copy link
Contributor

@saandrews I think we still need to make the buffer unpooled, the retain() was there since we at least once saw the ref count for the cmdPing going to 0

@saandrews
Copy link
Contributor Author

I see. I have moved the buffer to unpooled as well.

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants