stripping trailing dot from URL prevents correct dns-resolving #3022

Closed
thz opened this Issue Sep 20, 2018 · 10 comments

Contributor

thz commented Sep 20, 2018

I did this

I used a trailing dot in the URL to prevent system resolvers from appending suffixes from the search list.

I need to do this because appending the suffix eventually triggers a lookup through a different upstream resolver (split DNS setup), which adds a long delay.

The trailing dot is the correct way to express that a name is absolute (RFC 1034).

While I can understand the TLS/SNI and perhaps "virtual-host-matching" issues based on trailing dots in the URL (mostly discussed in #716), this issue here is really just about the name resolving.

I expected the following

I expected the trailing dot in the hostname to prevent DNS lookups with the system's DNS search list entries appended.
Because curl removed the dot even when resolving the hostname, this did not happen.
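
For readers unfamiliar with search lists, the expectation above can be illustrated with a simplified model of how a stub resolver chooses which names to query. This is only a sketch: the function name `candidate_names`, the `ndots` default, and the example search domains are assumptions for illustration, loosely following resolv.conf semantics. Real resolvers differ in the details.

```python
def candidate_names(name, search_list, ndots=1):
    """Return the query names a stub resolver would try, in order.

    Simplified model (an assumption, not any specific resolver's code):
    a trailing dot marks the name as absolute, so no suffix is appended.
    """
    if name.endswith("."):
        # Absolute name: query it exactly as given.
        return [name]
    suffixed = [f"{name}.{suffix}." for suffix in search_list]
    as_is = name + "."
    if name.count(".") >= ndots:
        # Enough dots: try the name itself first, then the search list.
        return [as_is] + suffixed
    # Too few dots: try the search list first, then the name itself.
    return suffixed + [as_is]


# Hypothetical search list for a split DNS setup.
search = ["corp.example", "lab.example"]
print(candidate_names("example.com.", search))
print(candidate_names("example.com", search))
```

With the trailing dot, only one candidate remains. Without it, the search-list variants stay in play as fallbacks, which is where the extra lookups described in this report can come from.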

References

bagder added the "name lookup" label Sep 20, 2018

Member

bagder commented Sep 20, 2018

Some additional data:

I don't see this functionality documented in POSIX for getaddrinfo.

As outlined in #716, the HTTP/1.1 spec section 5.4 clearly says "If the target URI includes an authority component, then a client MUST send a field-value for Host that is identical to that authority component" (emphasis added by me). Sending a Host: header with such a trailing dot breaks numerous HTTP servers out there, and we should not do that.

Contributor

thz commented Sep 20, 2018

I am completely fine with stripping the dot of the Host: header.
The problem is in resolving the name for connecting to the server.

Contributor

thz commented Sep 20, 2018

getaddrinfo specification is on a higher level (above nsswitch) and only references RFC 1034 when it comes to name resolving. This makes sense, as there could just as well be non-DNS backends for resolving hosts.

Member

bagder commented Sep 20, 2018

> getaddrinfo specification is on a higher level

getaddrinfo is exactly the API we work with (mostly), so that's the specification and docs we have to work with. It not being specified in POSIX makes it unreliable to depend on, even if there are (several?) implementations that work like this.

> I am completely fine with stripping the dot of the Host: header.

It's just a violation of the HTTP/1.1 spec...

(if someone wants to work on a PR for this, I'll just warn that #3017 might land soon and will change hostname handling a little bit)

Contributor

thz commented Sep 21, 2018

> getaddrinfo is exactly the API we work with (mostly), so that's the specification and docs we have to work with. It not being specified in POSIX makes it unreliable to depend on, even if there are (several?) implementations that work like this.

POSIX is the interface to the operating system. But beyond that there are numerous standards (be it HTTP, IP, DNS, ...). Trailing dots in domain names are covered in RFC 1034, section 3.1.

The POSIX interface of getaddrinfo takes care of translating names to addresses. These are not necessarily names from DNS but could also come from NIS/YP, LDAP and others. Still, DNS is quite prominent here and the documentation refers to it ("In many cases it is implemented by the Domain Name System, as documented in RFC 1034, RFC 1035, and RFC 1886.").
So POSIX does not specify trailing-dot handling, because that is outside its scope. Following the references leads to the respective standard, though.

> > I am completely fine with stripping the dot of the Host: header.
>
> It's just a violation of the HTTP/1.1 spec...

"I am completely fine" was highly exaggerated. What I meant is that I do not care within the scope of this issue, because this issue is not about the Host: header at all.
This issue is only about the name used as input for DNS resolving (or getaddrinfo) to establish the TCP connection. Currently curl removes the trailing dot before name resolution, and that causes issues.

Contributor

thz commented Sep 21, 2018

In combination with --resolve this feels even worse:

% ltrace -e 'getaddrinfo' -f curl --resolve example.com.:1234:192.0.2.1 http://example.com.:1234/foo
[pid 28812] libcurl.so.4->getaddrinfo("example.com", "1234", 0x555ddbb0d810, 0x7f8ee0356e50)        = 0
[pid 28812] +++ exited (status 0) +++

The dot is stripped from the hostname in the URI but not from the cache entry created by --resolve. So even though I use --resolve with the exact same host as found in the URI, the entry does not apply.
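
The mismatch can be modeled in a few lines. This is a hypothetical sketch, not curl's actual cache code: the "host:port" key format and the helper names `strip_trailing_dot` and `cache_key` are assumptions made only to show why the entry is never found.

```python
def strip_trailing_dot(host):
    """Drop a single trailing dot, if present."""
    return host[:-1] if host.endswith(".") else host


def cache_key(host, port):
    """Build a hypothetical cache key; the format is an assumption."""
    return f"{host}:{port}"


# Entry stored verbatim from: --resolve example.com.:1234:192.0.2.1
cache = {cache_key("example.com.", 1234): "192.0.2.1"}

# The URL host is stripped before the lookup, so the key no longer matches.
url_host = strip_trailing_dot("example.com.")
miss = cache.get(cache_key(url_host, 1234))

# Treating both sides the same way (here: keeping the dot) yields a hit.
hit = cache.get(cache_key("example.com.", 1234))

print(miss)
print(hit)
```

Whichever normalization is chosen, the point of the sketch is that it has to be applied identically when the entry is stored and when it is looked up.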

So the question about dot-stripping or not is to be considered in at least 3 layers:

Member

bagder commented Oct 18, 2018

Are you interested in working on a pull-request for this?

Contributor

thz commented Oct 18, 2018

Which kind of approach would be acceptable?
Changing the dot-behavior for the resolving part only, but "fixing" it by not stripping the dot?

Member

bagder commented Oct 18, 2018

I'd say: keep the dot present for name resolving, but strip it for SNI and the Host: header.
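
One way to picture that split is to carry two names per host: one for the resolver and one for everything sent on the wire. The sketch below is illustrative only; `HostNames`, `split_host`, and the field names are hypothetical and do not describe how curl stores hostnames internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostNames:
    resolve_name: str  # passed to the resolver, trailing dot preserved
    wire_name: str     # used for SNI and the Host: header, dot stripped


def split_host(url_host):
    """Derive both names from the host exactly as written in the URL."""
    wire = url_host[:-1] if url_host.endswith(".") else url_host
    return HostNames(resolve_name=url_host, wire_name=wire)


names = split_host("example.com.")
print(names.resolve_name)  # name handed to name resolution
print(names.wire_name)     # name used for SNI and Host:
```

Keeping both names side by side means the resolver input never has to be reconstructed from the stripped form.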

Contributor

thz commented Oct 18, 2018

Then, yes.
I would like to look into this, but cannot promise much capacity.
