Absolute domain names get trailing dot stripped from host request header #8290
I brought this subject to the httpbis group once, didn't help much. It is messy.
While there hasn't been a lot of encouragement in that thread, there are also very few reports of real damage from handling absolute domains in URLs/Host request header fields. Given the massive number of requests being made by the major browsers, if merely supporting absolute URLs caused problems, they would probably either have been fixed by now, or the browser makers would have changed the behaviour of their software. I've certainly never seen anyone complaining about wget handling it that way either. As I see it, there are really four cases here:
The first three all work just fine if the user agent preserves the trailing dot in the Host header (either from a redirect, like with pyropus.ca., or from a user typing it in themselves, or from a link from elsewhere on the net). Only the fourth case breaks, and those servers never generate a redirect Host: header or link using the absolute version, so the only people that would get bitten in this case would be people who manually add a trailing dot to a link or domain name -- and pretty much nobody, to within rounding error, does that. So given that there are no real problems being prevented by not supporting/preserving absolute hostnames in Host: headers, and that supporting them does actually fix a problem (as well as making the library's behaviour closer to the most common user-agents), it seems to me it would be worth making the change.
It looks like this used to work in curl and changed at some point.
I think you're forgetting cases though by looking at this as HTTP only. For example, SNI says there should be no trailing dot in there, so you'd either need to send different names in the separate fields (to comply) or not, and there will be servers assuming one way or the other. But yes, I presume we should make curl work closer to what the browsers do with the trailing dots, since they are the main clients that people are generally adapting to. How do they populate the SNI field with trailing dots?
The browsers leave the dot in the Host: header and strip it from the SNI field as that spec requires. This is what seems to work well for them. It seems like a reasonable approach, and as Ryan mentions above, curl may even have been doing this previously?
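To make that split concrete, here is a minimal sketch in Python (not curl's C, and not taken from any browser's source). The helper name `names_for_request` is hypothetical; it only illustrates keeping the name as given for the Host: header while dropping one trailing dot for the SNI field.

```python
def names_for_request(url_host: str) -> tuple:
    """Return (host_header, sni_name) for a host name taken from a URL.

    Hypothetical helper for illustration only: the Host: header keeps
    the name exactly as given, while the SNI name loses one trailing dot.
    """
    host_header = url_host
    sni_name = url_host[:-1] if url_host.endswith(".") else url_host
    return host_header, sni_name


for host in ("pyropus.ca", "pyropus.ca."):
    print(host, "->", names_for_request(host))
```

Under this scheme both spellings produce the same SNI name, and only the Host: header differs.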
Oh, and as far as other protocols go - this two-values-are-different issue can't come up unless the agent has to send the SNI field and also send the hostname-with-a-dot in a request header field or similar. I don't think that would apply to ftp or gopher or some of the other protocols curl supports - haven't thought about it long enough to eliminate all of them.
... and if we did, we changed because it caused a problem in the past...
Since we need to drop the dot from the SNI, an HTTPS server cannot differentiate between a host name with a trailing dot and one without. That makes you question how HTTP and HTTPS can be made that different... Update: I think we've even had issues with HTTPS servers not liking the trailing dot when the SNI (virtual host) is set up to not use a dot.
I'm not an expert, but I believe SNI has to have the dot stripped - but the various domains listed above seem to work fine with and without it in the Host: header. The only problems I've been able to find have been by me appending a dot to the domain name manually when typing in a URL with some webservers that can't handle it correctly (no domain configured errors etc). But those domains do not generate links or redirect URLs containing the trailing dot, so you can't run into that problem "normally" - i.e. you have to stick your hand in the saw blade to get the negative result. Maybe some HTTPS servers don't like absolute host names in the Host: request header - but again, they won't be generating such URLs so you can only run into problems by adding it yourself. Plenty of other sites handle it just fine.
Okay, that commit message seems to be saying they needed to remove the trailing dot from the SNI value (which is true) but they also did the same for the Host: HTTP request header without including a reason for that. I think they just didn't consider the breakage doing that would cause. The SNI change is correct, the Host: header field one is not. Can that part be reverted so curl users can execute requests with absolute domain names again?
It is just source code, everything is possible. I doubt the commit in question will agree to "just be reverted" though after all this time and some later follow-up tweaks, so it needs some proper massaging and most likely also a test case or two added to verify.
Reverts 5de8d84 (May 2014, shipped in 7.37.0) and the follow-up changes done afterward. Keep the dot in names for everything except the SNI to make curl behave more similar to current browsers. This means 'name' and 'name.' send the same SNI for different 'Host:' headers.
Updated test 1322 accordingly.
Fixes #8290
Reported-by: Charles Cazabon
Closes #8320
Thank you!
curl doesn't honour the domain name part of the redirected URL if it is an absolute name; curl strips the trailing dot. This causes problems loading some pages where the servers are configured to redirect non-absolute requests to the absolute ones:
... so that loops until max-redirects is hit.
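The loop can be reproduced with a toy model. The sketch below is in Python and makes several assumptions: `toy_server` is a hypothetical stand-in for a server that redirects non-absolute names to the absolute one, `follow` is a hypothetical client, and neither reflects curl's real code or the real server's configuration.

```python
def toy_server(host_header: str):
    """Toy server: answer 200 for the absolute name, else redirect to it.

    Returns (status, host_in_location); the second value is None on 200.
    """
    if host_header.endswith("."):
        return 200, None
    return 301, host_header + "."


def follow(host: str, strip_trailing_dot: bool, max_redirects: int = 5):
    """Follow redirects and return (final_status, redirects_followed)."""
    redirects = 0
    while True:
        status, location_host = toy_server(host)
        if status != 301 or redirects == max_redirects:
            return status, redirects
        redirects += 1
        # A client that strips the dot turns the redirect target back
        # into the original name, so the server redirects again.
        if strip_trailing_dot and location_host.endswith("."):
            host = location_host[:-1]
        else:
            host = location_host


print(follow("pyropus.ca", strip_trailing_dot=True))
print(follow("pyropus.ca", strip_trailing_dot=False))
```

In this model the stripping client ends on a 301 after exhausting `max_redirects`, while the client that preserves the dot gets a 200 after a single redirect.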
The RFCs seem to say that the location value should be used as presented by the server. curl should only strip the dot from the SNI, not from the Host: request header.
This has come up a couple of times before:
#716
#3022
... but those seem to be about requests to servers that accidentally (?) include the trailing dot, i.e. the servers are not expecting it. The server itself isn't issuing a redirect to the absolute version of its domain name. So accidental breakage in the cases listed shouldn't be a major issue, I think, while the current behaviour does break sites that are using this feature deliberately.
As for other user agents:
I didn't find any other open/closed issues related to this, and the known bugs page doesn't have anything relevant.
I did this
I expected the following
curl to resend the request with the Host: header value set to pyropus.ca. (including the trailing dot).
curl/libcurl version
operating system
Linux 5.14.16 #1 SMP Thu Nov 4 11:18:07 CST 2021 x86_64 GNU/Linux
Debian 11 / Bullseye