Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DnsNameResolver TCP fallback test and message leaks #9647

Merged
merged 4 commits into from Oct 14, 2019

Conversation

@njhill
Copy link
Member

njhill commented Oct 8, 2019

Motivation

A memory leak related to DNS resolution was reported in #9634, specifically linked to the TCP retry fallback functionality that was introduced relatively recently. Upon inspection it's apparent that there
are some error paths where the original UDP response might not be fully released, and more significantly the TCP response actually leaks every time on the fallback success path.

It turns out that a bug in the unit test meant that the intended TCP fallback path was not actually exercised, so it did not expose the main leak in question.

Modifications

  • Fix DnsNameResolverTest#testTruncated0 dummy server fallback logic to first read transaction id of retried query and use it in replayed response
  • Adjust semantic of internal DnsQueryContext#finish method to always take refcount ownership of passed in envelope
  • Reorder some logic in DnsResponseHandler fallback handling to verify the context of the response is expected, and ensure that the query response are either released or propagated in all cases. This also reduces a number of redundant retain/release pairings

Result

Fixes #9634

@netty-bot

This comment has been minimized.

Copy link

netty-bot commented Oct 8, 2019

Can one of the admins verify this patch?

@njhill

This comment has been minimized.

Copy link
Member Author

njhill commented Oct 8, 2019

The changes look bigger than they are due to some reordering and indentation changes, recommend viewing diff with w=1 :)

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 9, 2019

@netty-bot test this please

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 9, 2019

@njhill please rebase

}
logger.warn("{} Received a DNS response with an unexpected ID: {}",
channel, queryId);
response.release();

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer Oct 9, 2019

Member

nit: release first before log.


@Override
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
if (logger.isDebugEnabled()) {

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer Oct 9, 2019

Member

I think we should preserve the logic and fallback to the original truncated response if we did not receive a TCP response just like we did before:

https://github.com/netty/netty/pull/9647/files#diff-6678a05ef56f2434e7aa20acbe9d97e9L1277

Also beside this we should not log the exception if we received the response before. As the exception may be just a "connection reset" and so harmless. Also consider calling ctx.close() in any case to ensure the channel is closed in all cases.

This comment has been minimized.

Copy link
@njhill

njhill Oct 10, 2019

Author Member

Now fixed. I didn’t add an explicit close since the channel is closed immediately in the listener below.

@@ -181,30 +181,22 @@ public void run() {
}
}

// Takes ownership of passed envelope

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer Oct 9, 2019

Member

why not make this a real java doc ?

envelope.release();
}
@SuppressWarnings("unchecked")
private boolean setSuccess(AddressedEnvelope<? extends DnsResponse, InetSocketAddress> envelope) {

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer Oct 9, 2019

Member

lets rename this to trySuccess as this is what it does ;)

njhill added 3 commits Oct 2, 2019
Motivation

A memory leak related to DNS resolution was reported in #9634,
specifically linked to the TCP retry fallback functionality that was
introduced relatively recently. Upon inspection it's apparent that there
are some error paths where the original UDP response might not be fully
released, and more significantly the TCP response actually leaks every
time on the fallback success path.

It turns out that a bug in the unit test meant that the intended TCP
fallback path was not actually exercised, so it did not expose the main
leak in question.

Modifications

- Fix DnsNameResolverTest#testTruncated0 dummy server fallback logic to
first read transaction id of retried query and use it in replayed
response
- Adjust semantic of internal DnsQueryContext#finish method to always
take refcount ownership of passed in envelope
- Reorder some logic in DnsResponseHandler fallback handling to verify
the context of the response is expected, and ensure that the query
response are either released or propagated in all cases. This also
reduces a number of redundant retain/release pairings

Result

Fixes #9634
@njhill njhill force-pushed the njhill:dnsresolver-leaks branch from e1c0e3e to fa84a02 Oct 9, 2019
@njhill

This comment has been minimized.

Copy link
Member Author

njhill commented Oct 9, 2019

@normanmaurer addressed comments, PTAL

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 10, 2019

@netty-bot test this please

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 10, 2019

@dziemba can you verify this one as well ?

@dziemba

This comment has been minimized.

Copy link

dziemba commented Oct 10, 2019

@normanmaurer I've built a local snapshot from this branch (fa84a02) and ran the reproduction script and some additional internal test suites we have.

Looks all good now, no more leaks 🎉

Thanks a lot for the quick fix!

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 12, 2019

@netty-bot test this please

@normanmaurer normanmaurer added this to the 4.1.43.Final milestone Oct 12, 2019
@normanmaurer normanmaurer merged commit 2e5dd28 into netty:4.1 Oct 14, 2019
3 checks passed
3 checks passed
pull request validation (centos6-java11) Build finished.
Details
pull request validation (centos6-java12) Build finished.
Details
pull request validation (centos6-java8) Build finished.
Details
@njhill njhill deleted the njhill:dnsresolver-leaks branch Oct 14, 2019
normanmaurer added a commit that referenced this pull request Oct 16, 2019
Motivation

A memory leak related to DNS resolution was reported in #9634,
specifically linked to the TCP retry fallback functionality that was
introduced relatively recently. Upon inspection it's apparent that there
are some error paths where the original UDP response might not be fully
released, and more significantly the TCP response actually leaks every
time on the fallback success path.

It turns out that a bug in the unit test meant that the intended TCP
fallback path was not actually exercised, so it did not expose the main
leak in question.

Modifications

- Fix DnsNameResolverTest#testTruncated0 dummy server fallback logic to
first read transaction id of retried query and use it in replayed
response
- Adjust semantic of internal DnsQueryContext#finish method to always
take refcount ownership of passed in envelope
- Reorder some logic in DnsResponseHandler fallback handling to verify
the context of the response is expected, and ensure that the query
response are either released or propagated in all cases. This also
reduces a number of redundant retain/release pairings

Result

Fixes #9634
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.