Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DNS status codes not visible to the query future's listener #3976

Closed
dmk23 opened this issue Jul 12, 2015 · 12 comments
Closed

DNS status codes not visible to the query future's listener #3976

dmk23 opened this issue Jul 12, 2015 · 12 comments
Assignees
Labels
Milestone

Comments

@dmk23
Copy link

dmk23 commented Jul 12, 2015

I decided to gather a clearer picture of what is causing failed DNS resolutions by calculating the counts of each DnsResponseCode in the query listener.

To my surprise I found that the only status code visible in the listener (after future.sync().getNow()) is NOERROR. Everything else causes an exception (usually UnknownHostException or "query ID space exhausted") without returning an actual DNS response to look at.

This behavior seems wrong since it mixes up all the codes to the user. For example if we get NXDOMAIN we'd treat it as a valid record of domain's non-existence, but in case of a SERVFAIL, NOTIMP or REFUSED we'd know to retry the query to a different resolver (or do whatever else).

The behavior should be amended to always give user a chance to see the actual response...

@dmk23
Copy link
Author

dmk23 commented Jul 12, 2015

One more note. Even if we try to use the current API and process UnknownHostException we can quickly see that this is not very user friendly. Essentially we are forced to parse this text messages, all while Netty wastes CPU cycles and piles up heap garbage creating it:

2015-07-11 22:57:30,504 ERROR [epollEventLoopGroup-2-1] smtp.DNS: java.net.UnknownHostException: failed to resolve DefaultDnsQuestion(pelicangraphics.net IN MX) after 3 attempts:
        from /198.40.251.131:53: response code: NXDomain(3) with 0 answer(s) and 1 authority resource(s)
        from /12.159.64.132:53: response code: NXDomain(3) with 0 answer(s) and 1 authority resource(s)
        from /74.205.155.40:53: response code: NXDomain(3) with 0 answer(s) and 1 authority resource(s)

In many cases we could reasonably accept the first NXDomain as the valid answer since it would normally come from the upstream / authority nameservers and we'd have no reason to doubt them.

@trustin
Copy link
Member

trustin commented Jul 12, 2015

I agree that some status codes should be treated as success. Besides NXDOMAIN, what would you consider successful by default?

@trustin trustin added this to the 4.1.0.Beta6 milestone Jul 12, 2015
@trustin
Copy link
Member

trustin commented Jul 12, 2015

I'm also going to add a dedicated exception for DNS queries so that we can retrieve the necessary information more easily.

@dmk23
Copy link
Author

dmk23 commented Jul 12, 2015

I am not sure dedicated exception is the best way to go. Exception can only transmit text, while advanced user would want to look at the DNS responses "as is" and decide what to do with them. I'd suggest an option to disable all exceptions / warnings and just pass all responses "as is" to the listener.

The scheme of passing back all the errors would mean there is always just one resolution attempt so perhaps there should be an API to manually retry the same query from the listener.

Not sure what other responses beside NXDOMAIN to consider successes. What I was hoping to do is to look at the distribution of raw responses over the full domain set, review / analyze them and only then consider the right ways to handle them. But at the moment I can only get NOERRORs, so not sure about what other replies are common and what to do about them...

@dmk23
Copy link
Author

dmk23 commented Jul 12, 2015

I should add that the retrial strategy for high-volume DNS resolution is non-trivial. If you have too few resolvers you are going to run into rate limits very quickly. But if you have too many resolvers some % of them will misbehave from time to time and you have to keep track of that.

That's why it is important to have full control over what responses to retry and how....

@trustin
Copy link
Member

trustin commented Jul 12, 2015

We can add a property to an exception so you can access the information you need in a machine friendly way. But I agree with you that query() should raise an exception only when it really failed to get a response. As a result, query() will not retry as long as a server responds and you'll have to implement retry mechanism by yourself.

Sounds good?

@dmk23
Copy link
Author

dmk23 commented Jul 12, 2015

I am still not sure why it is necessary to always wrap every non-NOERROR response in an exception. As it stands DNS resolver already generates lots of GC.

Why not just have a lightweight config option that would simply pass all raw responses (even FORMERR) to the user? Yes, it should generate the exception only if there is no response, e.g. a timeout. It probably should not pre-generate any text either, unless getMessage() or toString() is called.

I'd suggest combining these API amendments with making it easier to retrieve the original question and lifting the parallelism limits per #3973 (comment) and #3972

@trustin
Copy link
Member

trustin commented Jul 12, 2015

Well, maybe I was unclear. What I said is to make query() does not fail the promise unless it really failed (timeout, I/O error).

@trustin
Copy link
Member

trustin commented Jul 12, 2015

See #3979 and let me know what you think.

@trustin trustin added the defect label Jul 12, 2015
@trustin trustin self-assigned this Jul 12, 2015
trustin added a commit that referenced this issue Jul 12, 2015
Related issues:
- #3976
- #3973

Motivation:

Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop.

Also, it is impossible to get the response whose code is not NOERROR
mechanically.

Modifications:

- Make query() not retry
  - Move the retry logic to DnsNameResolver.resolve() instead.
- Make query() fail the promise only when I/O error occurred or it
  failed to get a response
- Add DnsNameResolverException and use it when query() fails so that the
  resolver can give more information about the failure

Result:

- Full control over query()
- More information when query() fails
- TODO: Implement the cache for resolve()
@dmk23
Copy link
Author

dmk23 commented Jul 14, 2015

I'd like to test it out - at the moment Beta6-SNAPSHOT still does not return errors.

Let me know when there is something I can fully evaluate - it is hard for me to figure out what's going on just by looking at the pull request. Ideally incorporating my last couple comments...

@trustin
Copy link
Member

trustin commented Jul 14, 2015

It's just a pull request at the moment, so we did not deploy it as a snapshot. Will deploy a snapshot once it is merged.

trustin added a commit that referenced this issue Aug 1, 2015
…resolveAll()

Related issues:
- #3971
- #3973
- #3976
- #4035

Motivation:

1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.

2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()

Modifications:

- Changes related with DnsNameResolver.query()
  - Make query() not retry
    - Move the retry logic to DnsNameResolver.resolve() instead.
  - Make query() fail the promise only when I/O error occurred or it
    failed to get a response
  - Add DnsNameResolverException and use it when query() fails so that
    the resolver can give more information about the failure
  - query() does not cache anymore.

- Changes related with NameResolver.resolveAll()
  - Add NameResolver.resolveAll()
  - Add SimpleNameResolver.doResolveAll()

- Changes related with DnsNameResolver.resolve() and resolveAll()
  - Make DnsNameResolveContext abstract so that DnsNameResolver can
    decide to get single or multiple addresses from it
  - Re-implement cache so that the cache works for resolve() and
    resolveAll()

- Miscellaneous changes
  - Use ObjectUtil.checkNotNull() wherever possible
  - Add InternetProtocolFamily.addressType() to remove repetitive
    switch-case blocks in DnsNameResolver(Context)
  - Do not raise an exception when decoding a truncated DNS response

Result:

- Full control over query()
- (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
trustin added a commit that referenced this issue Aug 8, 2015
…resolveAll()

Related issues:
- #3971
- #3973
- #3976
- #4035

Motivation:

1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.

2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()

Modifications:

- Changes related with DnsNameResolver.query()
  - Make query() not retry
    - Move the retry logic to DnsNameResolver.resolve() instead.
  - Make query() fail the promise only when I/O error occurred or it
    failed to get a response
  - Add DnsNameResolverException and use it when query() fails so that
    the resolver can give more information about the failure
  - query() does not cache anymore.

- Changes related with NameResolver.resolveAll()
  - Add NameResolver.resolveAll()
  - Add SimpleNameResolver.doResolveAll()

- Changes related with DnsNameResolver.resolve() and resolveAll()
  - Make DnsNameResolveContext abstract so that DnsNameResolver can
    decide to get single or multiple addresses from it
  - Re-implement cache so that the cache works for resolve() and
    resolveAll()

- Miscellaneous changes
  - Use ObjectUtil.checkNotNull() wherever possible
  - Add InternetProtocolFamily.addressType() to remove repetitive
    switch-case blocks in DnsNameResolver(Context)
  - Do not raise an exception when decoding a truncated DNS response

Result:

- Full control over query()
- (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
trustin added a commit that referenced this issue Aug 9, 2015
…resolveAll()

Related issues:
- #3971
- #3973
- #3976
- #4035

Motivation:

1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.

2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()

Modifications:

- Changes related with DnsNameResolver.query()
  - Make query() not retry
    - Move the retry logic to DnsNameResolver.resolve() instead.
  - Make query() fail the promise only when I/O error occurred or it
    failed to get a response
  - Add DnsNameResolverException and use it when query() fails so that
    the resolver can give more information about the failure
  - query() does not cache anymore.

- Changes related with NameResolver.resolveAll()
  - Add NameResolver.resolveAll()
  - Add SimpleNameResolver.doResolveAll()

- Changes related with DnsNameResolver.resolve() and resolveAll()
  - Make DnsNameResolveContext abstract so that DnsNameResolver can
    decide to get single or multiple addresses from it
  - Re-implement cache so that the cache works for resolve() and
    resolveAll()

- Miscellaneous changes
  - Use ObjectUtil.checkNotNull() wherever possible
  - Add InternetProtocolFamily.addressType() to remove repetitive
    switch-case blocks in DnsNameResolver(Context)
  - Do not raise an exception when decoding a truncated DNS response

Result:

- Full control over query()
- A user can now retrieve all addresses via (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
trustin added a commit that referenced this issue Aug 12, 2015
…resolveAll()

Related issues:
- #3971
- #3973
- #3976
- #4035

Motivation:

1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.

2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()

Modifications:

- Changes related with DnsNameResolver.query()
  - Make query() not retry
    - Move the retry logic to DnsNameResolver.resolve() instead.
  - Make query() fail the promise only when I/O error occurred or it
    failed to get a response
  - Add DnsNameResolverException and use it when query() fails so that
    the resolver can give more information about the failure
  - query() does not cache anymore.

- Changes related with NameResolver.resolveAll()
  - Add NameResolver.resolveAll()
  - Add SimpleNameResolver.doResolveAll()

- Changes related with DnsNameResolver.resolve() and resolveAll()
  - Make DnsNameResolveContext abstract so that DnsNameResolver can
    decide to get single or multiple addresses from it
  - Re-implement cache so that the cache works for resolve() and
    resolveAll()

- Miscellaneous changes
  - Use ObjectUtil.checkNotNull() wherever possible
  - Add InternetProtocolFamily.addressType() to remove repetitive
    switch-case blocks in DnsNameResolver(Context)
  - Do not raise an exception when decoding a truncated DNS response

Result:

- Full control over query()
- A user can now retrieve all addresses via (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
trustin added a commit that referenced this issue Aug 18, 2015
…resolveAll()

Related issues:
- #3971
- #3973
- #3976
- #4035

Motivation:

1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.

2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()

Modifications:

- Changes related with DnsNameResolver.query()
  - Make query() not retry
    - Move the retry logic to DnsNameResolver.resolve() instead.
  - Make query() fail the promise only when I/O error occurred or it
    failed to get a response
  - Add DnsNameResolverException and use it when query() fails so that
    the resolver can give more information about the failure
  - query() does not cache anymore.

- Changes related with NameResolver.resolveAll()
  - Add NameResolver.resolveAll()
  - Add SimpleNameResolver.doResolveAll()

- Changes related with DnsNameResolver.resolve() and resolveAll()
  - Make DnsNameResolveContext abstract so that DnsNameResolver can
    decide to get single or multiple addresses from it
  - Re-implement cache so that the cache works for resolve() and
    resolveAll()
  - Add 'traceEnabled' property to enable/disable trace information

- Miscellaneous changes
  - Use ObjectUtil.checkNotNull() wherever possible
  - Add InternetProtocolFamily.addressType() to remove repetitive
    switch-case blocks in DnsNameResolver(Context)
  - Do not raise an exception when decoding a truncated DNS response

Result:

- Full control over query()
- A user can now retrieve all addresses via (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
trustin added a commit that referenced this issue Aug 18, 2015
…resolveAll()

Related issues:
- #3971
- #3973
- #3976
- #4035

Motivation:

1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.

2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()

Modifications:

- Changes related with DnsNameResolver.query()
  - Make query() not retry
    - Move the retry logic to DnsNameResolver.resolve() instead.
  - Make query() fail the promise only when I/O error occurred or it
    failed to get a response
  - Add DnsNameResolverException and use it when query() fails so that
    the resolver can give more information about the failure
  - query() does not cache anymore.

- Changes related with NameResolver.resolveAll()
  - Add NameResolver.resolveAll()
  - Add SimpleNameResolver.doResolveAll()

- Changes related with DnsNameResolver.resolve() and resolveAll()
  - Make DnsNameResolveContext abstract so that DnsNameResolver can
    decide to get single or multiple addresses from it
  - Re-implement cache so that the cache works for resolve() and
    resolveAll()
  - Add 'traceEnabled' property to enable/disable trace information

- Miscellaneous changes
  - Use ObjectUtil.checkNotNull() wherever possible
  - Add InternetProtocolFamily.addressType() to remove repetitive
    switch-case blocks in DnsNameResolver(Context)
  - Do not raise an exception when decoding a truncated DNS response

Result:

- Full control over query()
- A user can now retrieve all addresses via (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
trustin added a commit that referenced this issue Aug 18, 2015
…resolveAll()

Related issues:
- #3971
- #3973
- #3976
- #4035

Motivation:

1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.

2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()

Modifications:

- Changes related with DnsNameResolver.query()
  - Make query() not retry
    - Move the retry logic to DnsNameResolver.resolve() instead.
  - Make query() fail the promise only when I/O error occurred or it
    failed to get a response
  - Add DnsNameResolverException and use it when query() fails so that
    the resolver can give more information about the failure
  - query() does not cache anymore.

- Changes related with NameResolver.resolveAll()
  - Add NameResolver.resolveAll()
  - Add SimpleNameResolver.doResolveAll()

- Changes related with DnsNameResolver.resolve() and resolveAll()
  - Make DnsNameResolveContext abstract so that DnsNameResolver can
    decide to get single or multiple addresses from it
  - Re-implement cache so that the cache works for resolve() and
    resolveAll()
  - Add 'traceEnabled' property to enable/disable trace information

- Miscellaneous changes
  - Use ObjectUtil.checkNotNull() wherever possible
  - Add InternetProtocolFamily.addressType() to remove repetitive
    switch-case blocks in DnsNameResolver(Context)
  - Do not raise an exception when decoding a truncated DNS response

Result:

- Full control over query()
- A user can now retrieve all addresses via (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
@normanmaurer normanmaurer modified the milestones: 4.1.0.Beta6, 4.1.0.RC1 Sep 2, 2015
@dmk23
Copy link
Author

dmk23 commented Sep 12, 2015

I think this is now superceded by #4208 and can be closed

@dmk23 dmk23 closed this as completed Sep 12, 2015
@normanmaurer normanmaurer modified the milestones: 4.1.0.CR1, 4.1.0.Beta8 Nov 10, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants