Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcurl violates RFC7230 when constructing a proxy request with an explicit port ":80" in the URL #6769

Closed
mcb30 opened this issue Mar 22, 2021 · 2 comments

Comments

@mcb30
Copy link

@mcb30 mcb30 commented Mar 22, 2021

I did this

With anything listening locally on port 3128 (e.g. nc -l 3128), issue a request that uses a proxy server and specifies port 80 explicitly within the request URL:

$ curl -v -x http://localhost:3128 http://example.org:80/index.html
*   Trying ::1:3128...
* Connected to localhost (::1) port 3128 (#0)
> GET http://example.org:80/index.html HTTP/1.1
> Host: example.org
> User-Agent: curl/7.71.1
> Accept: */*
> Proxy-Connection: Keep-Alive

Note that:

  • The HTTP request target correctly uses the absolute-form as defined by RFC 7230 section 5.3.2, since we have used the -x option to specify the use of a proxy server
  • The HTTP request target URI authority portion has the value example.org:80 (i.e. including the port number)
  • The HTTP Host header contains the value example.org (i.e. not including the port number)

This violates RFC 7230 section 5.4, which states in part that

If the target URI includes an authority component, then a
client MUST send a field-value for Host that is identical to that
authority component, excluding any userinfo subcomponent and its "@"
delimiter

This discrepancy between target URI authority portion and Host header causes a failure when the request happens to pass through HAProxy, which will report a 400 Bad Request error due to the mismatch.

Within curl, the Host header is constructed to explicitly omit the port number if it matches the default (80 for HTTP or 443 for HTTPS):

curl/lib/http.c

Lines 2108 to 2123 in 03c8cef

if(((conn->given->protocol&CURLPROTO_HTTPS) &&
(conn->remote_port == PORT_HTTPS)) ||
((conn->given->protocol&CURLPROTO_HTTP) &&
(conn->remote_port == PORT_HTTP)) )
/* if(HTTPS on port 443) OR (HTTP on port 80) then don't include
the port number in the host string */
data->state.aptr.host = aprintf("Host: %s%s%s\r\n",
conn->bits.ipv6_ip?"[":"",
host,
conn->bits.ipv6_ip?"]":"");
else
data->state.aptr.host = aprintf("Host: %s%s%s:%d\r\n",
conn->bits.ipv6_ip?"[":"",
host,
conn->bits.ipv6_ip?"]":"",
conn->remote_port);

and the target URI is constructed to exclude the userinfo subcomponent but will leave the port number present even if it would be omitted from the Host header:

curl/lib/http.c

Lines 2176 to 2188 in 03c8cef

if(strcasecompare("http", data->state.up.scheme)) {
/* when getting HTTP, we don't want the userinfo the URL */
uc = curl_url_set(h, CURLUPART_USER, NULL, 0);
if(uc) {
curl_url_cleanup(h);
return CURLE_OUT_OF_MEMORY;
}
uc = curl_url_set(h, CURLUPART_PASSWORD, NULL, 0);
if(uc) {
curl_url_cleanup(h);
return CURLE_OUT_OF_MEMORY;
}
}

For reference, the relevant code within HAProxy that rejects the mismatched request target URI and Host header seems to be: https://github.com/haproxy/haproxy/blob/19d14710e941a366afd5b4ff8720090c011c83c1/src/h1.c#L871-L896

I expected the following

curl should construct a request that conforms to RFC 7230. This could be achieved by any of:

  • Retaining the port number within the Host header unconditionally, or
  • Retaining the port number within the Host header when issuing a request via a proxy, or
  • Stripping default port numbers from the request URI target using the same logic as in Curl_http_host()

I am happy to put together a pull request if a maintainer could indicate which of the above would be the preferred approach.

curl/libcurl version

curl 7.71.1 (x86_64-redhat-linux-gnu) libcurl/7.71.1 OpenSSL/1.1.1i-fips zlib/1.2.11 brotli/1.0.9 libidn2/2.3.0 libpsl/0.21.1 (+libidn2/2.3.0) libssh/0.9.5/openssl/zlib nghttp2/1.43.0
Release-Date: 2020-07-01
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS brotli GSS-API HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz Metalink NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets

operating system

Linux 5.10.16-200.fc33.x86_64 #1 SMP Sun Feb 14 03:02:32 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

mcb30 added a commit to mcb30/cvmfs that referenced this issue Mar 22, 2021
The S3 uploader will currently always include an explicit port number
within URIs (e.g. "http://mybucket.s3.us-east-1.amazonaws.com:80")
even when using the default HTTP port 80.

This is not incorrect, but unfortunately triggers a bug within libcurl
(see curl/curl#6769) that causes it to
construct requests that will be rejected if they happen to pass
through HAProxy.

Work around this libcurl bug by omitting an explicit port number from
the constructed URI when the default port is used.

Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
@bagder
Copy link
Member

@bagder bagder commented Mar 22, 2021

Let me address the three suggestions one by one for completeness:

Retaining the port number within the Host header unconditionally, or

This doesn't work on the Internet. Servers out there will break on port numbers for default ports in the Host: header. We learned that decades ago.

Retaining the port number within the Host header when issuing a request via a proxy, or

I fear that this will end up in the same bucket as the above, since the Host: header will most probably be passed along as-is to the remote server.

Stripping default port numbers from the request URI target using the same logic as in Curl_http_host()

I believe this alternative has the highest chance of working with the least amount of friction and problems.

@bagder
Copy link
Member

@bagder bagder commented Mar 22, 2021

It struck me that the fix is really easy! PR coming up in a jiffy.

bagder added a commit that referenced this issue Mar 22, 2021
To make sure the Host: header and the URL provide the same authority
portion when sent to the proxy, strip the default port number from the
URL if one was provided.

Reported-by: Michael Brown
Fixes #6769
Closes #[fill in]
@bagder bagder closed this in 3bbf62b Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

3 participants