host named "dns" causes DNS error #20972

Closed
jossef opened this issue Nov 7, 2019 · 8 comments

Comments

@jossef

jossef commented Nov 7, 2019

Stack: Python 3.7, grpcio==1.23.0

When connecting to a target whose hostname is dns, I always get a DNS error, even though the hostname itself resolves correctly.

import grpc
import dns_pb2
import dns_pb2_grpc

channel = grpc.insecure_channel('dns:1234')  # host "dns", port 1234
client = dns_pb2_grpc.MyServiceStub(channel)
request = dns_pb2.FooRequest()
client.Foo(request)

This fails with:

grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "DNS resolution failed"
	debug_error_string = "{"created":"@1573135109.222914505","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3818,"referenced_errors":[{"created":"@1573135080.978586454","description":"Resolver transient failure","file":"src/core/ext/filters/client_channel/resolving_lb_policy.cc","file_line":268,"referenced_errors":[{"created":"@1573135080.978580745","description":"DNS resolution failed","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc","file_line":357,"grpc_status":14,"referenced_errors":[{"created":"@1573135080.978495506","description":"C-ares status is not ARES_SUCCESS: Could not contact DNS servers","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244,"referenced_errors":[{"created":"@1573135080.978479846","description":"C-ares status is not ARES_SUCCESS: Could not contact DNS servers","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244}]}]}]}]}"
>
C-ares status is not ARES_SUCCESS: Could not contact DNS servers

To reproduce, add the following entry to your /etc/hosts file, then run any simple Python client-server example, connecting via the dns hostname:

127.0.0.1 dns
@jossef jossef changed the title from 'cannot connect to host named "dns"' to 'host named "dns" causes DNS error' Nov 7, 2019
@gnossen
Contributor

gnossen commented Nov 7, 2019

CC @apolcyn

@hcaseyal hcaseyal assigned lidizheng and unassigned hcaseyal Nov 9, 2019
@lidizheng lidizheng assigned apolcyn and unassigned lidizheng Nov 11, 2019
@apolcyn
Contributor

apolcyn commented Nov 12, 2019

This is easy to reproduce, and looks like a kind of bug in the way that we form the uri when creating a channel.

Note that gRPC channels technically require the target string to be a full URI with "scheme", "authority", and "path" components, e.g. "dns:///foo.com". However, if the "scheme" component is omitted, we prepend the default "dns" scheme. See https://github.com/grpc/grpc/blob/master/doc/naming.md for details. In the insecure channel creation code, the "maybe add default scheme" step is basically done in this call:

ResolverRegistry::AddDefaultPrefixIfNeeded(target);

Internally, that parses the URI and attempts to look up a "resolver factory" registered for the parsed URI's scheme. If no factory exists, we prepend the default scheme to the URI, try again, and then use the new prefixed target as the server target for the channel. Otherwise, we leave the target as is. That code path is the difference between creating a channel with target "dns" and one with target "dns2".

Usually this works, because when no scheme is provided in the URI, the "scheme" parsed will be bogus (it is actually the "path" component of the URI) and is unlikely to match any of the schemes that name resolvers are registered for. However, when the bogus scheme happens to match a scheme that a resolver is registered for, we don't prepend a default scheme and so are left with a malformed server URI to use with the channel.
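
The behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not gRPC's actual C++ code: the REGISTERED_SCHEMES set, the regular expression, and the Python function name add_default_prefix_if_needed are all assumptions made for this example.

```python
import re

# Hypothetical set of schemes that have a registered resolver factory.
REGISTERED_SCHEMES = {"dns", "unix", "ipv4", "ipv6"}
DEFAULT_PREFIX = "dns:///"

# Anything before the first ':' that looks like a URI scheme.
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def add_default_prefix_if_needed(target: str) -> str:
    """Prepend the default scheme unless the parsed scheme is registered."""
    match = _SCHEME_RE.match(target)
    if match and match.group(1) in REGISTERED_SCHEMES:
        # The "scheme" matches a registered resolver, so the target is
        # left alone, even if it was really meant as host:port.
        return target
    return DEFAULT_PREFIX + target


for example in ("dns2:1234", "dns:1234", "dns:///dns:1234"):
    print(example, "->", add_default_prefix_if_needed(example))
```

In this sketch "dns2:1234" becomes "dns:///dns2:1234", while "dns:1234" comes back unchanged because "dns" is mistaken for a registered scheme, which mirrors the failure reported in this issue.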

A fix might involve making the URI parser code stricter, so that it fails when the target URI doesn't have a scheme.

@jossef it would be good to know how bad the impact is of this bug though - is inability to resolve targets of "dns" a blocker for you? I'm actually a little surprised by use of "dns" as a hostname.

@jossef
Author

jossef commented Nov 12, 2019

@apolcyn @yarelm
Thanks for the detailed answer.

The impact of this bug: an hour of troubleshooting :).

The inability to resolve targets named "dns" is annoying, but for me it is not a blocker. I've worked around it by naming the host dns-<suffix>.

Until this issue is fixed, I suggest adding a warning for the developer, stating that the hostname "dns" will not resolve due to this issue.

As for the use of "dns" as a hostname: one of my microservices is a DNS service. That is why.

@mehrdada
Member

mehrdada commented Nov 12, 2019

@jossef The right way to fix this is to just prepend all your endpoints with dns:///. dns:///dns:port will work, as will dns:///1.2.3.4:port, etc.
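
As a minimal sketch of that suggestion, a small helper can build fully qualified targets so the scheme is never left to be guessed. The helper name fully_qualified_target is hypothetical, and the example only builds the string; it does not open a channel.

```python
def fully_qualified_target(host: str, port: int) -> str:
    """Build a target that always carries an explicit dns:/// scheme."""
    return f"dns:///{host}:{port}"


target = fully_qualified_target("dns", 1234)
print(target)
# The result would then be passed on, e.g. grpc.insecure_channel(target).
```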

@juanjops

juanjops commented Jan 28, 2020

If the URI I use in a virtual machine is "http://172.18.0.3:8501", why do I still get the same problem? I am the user who opened "tensorflow/serving#1535".

@zengzhengrong

Quoting @mehrdada: "@jossef The right way to fix this is to just prepend all your endpoints with dns:///. dns:///dns:port will work, as will dns:///1.2.3.4:port, etc."

I tested this with Consul DNS, but it failed to connect.

import grpc

service_uri = 'dns:///127.0.0.1:8600/web.service.consul'
grpc.insecure_channel(service_uri)

Error message:

  File "/usr/local/lib64/python3.6/site-packages/grpc/_channel.py", line 826, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib64/python3.6/site-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "Socket closed"
        debug_error_string = "{"created":"@1585193665.606107501","description":"Error received from peer ipv4:127.0.0.1:8600","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Socket closed","grpc_status":14}"

The dig command can resolve web.service.consul.
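
One possible reading, offered as a guess since nobody in the thread confirmed it: the naming document linked earlier describes the DNS target form as dns:[//authority/]host[:port], where the authority names the DNS server to query. In the target posted above the authority is empty, so 127.0.0.1:8600 lands in the path and is treated as the endpoint to connect to, which would be consistent with the "Error received from peer ipv4:127.0.0.1:8600" message. The sketch below uses Python's standard urllib.parse.urlsplit purely as a stand-in parser to show how the two spellings split; gRPC has its own URI parser, and whether a given gRPC build honors the authority component is not established here.

```python
from urllib.parse import urlsplit


def describe(target: str) -> dict:
    """Split a target into scheme, authority, and path for inspection."""
    parts = urlsplit(target)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
    }


# Target as posted: three slashes, so the authority is empty.
as_posted = describe("dns:///127.0.0.1:8600/web.service.consul")
# Same pieces with two slashes: the DNS server becomes the authority.
with_authority = describe("dns://127.0.0.1:8600/web.service.consul")

print(as_posted)
print(with_authority)
```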

@S-UP

S-UP commented Mar 27, 2020

I am facing a similar issue. Python 3.x and the Google Translate API work fine when I run the script directly, but the following error occurs when the same script runs via cron (in an otherwise tested setup).

    Error: 503 DNS resolution failed
    Traceback (most recent call last):
      File "/home/MY-USER-NAME/anaconda3/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 57, in error_remapped_callable
      File "/home/MY-USER-NAME/anaconda3/lib/python3.6/site-packages/grpc/_channel.py", line 690, in __call__
      File "/home/MY-USER-NAME/anaconda3/lib/python3.6/site-packages/grpc/_channel.py", line 592, in _end_unary_response_blocking
    grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
            status = StatusCode.UNAVAILABLE
            details = "DNS resolution failed"
            debug_error_string = "{"created":"@1584629013.091398712","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3934,"referenced_errors":[{"created":"@1584629013.091395769","description":"Resolver transient failure","file":"src/core/ext/filters/client_channel/resolving_lb_policy.cc","file_line":262,"referenced_errors":[{"created":"@1584629013.091394954","description":"DNS resolution failed","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc","file_line":370,"grpc_status":14,"referenced_errors":[{"created":"@1584629013.091389655","description":"C-ares status is not ARES_SUCCESS: Could not contact DNS servers","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244,"referenced_errors":[{"created":"@1584629013.091380513","description":"C-ares status is not ARES_SUCCESS: Could not contact DNS servers","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244}]}]}]}]}"
    >

Details are posted here: https://stackoverflow.com/questions/60811596/error-503-dns-resolution-failed-in-google-translate-api-but-only-when-executing

@stale

stale bot commented May 6, 2020

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 days. Thank you for your contributions!

@stale stale bot closed this as completed May 13, 2020