Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcurl 8.7 regression workaround #640

Merged

Conversation

dscho
Copy link
Member

@dscho dscho commented Apr 11, 2024

This is a companion to git-for-windows#4906, designed to get the CI builds to pass again.

peff added 3 commits April 2, 2024 13:27
In get_active_slot(), we return a CURL handle that may have been used
before (reusing them is good because it lets curl reuse the same
connection across many requests). We set a few curl options back to
defaults that may have been modified by previous requests.

We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
defaults to "-1"). This usually doesn't matter because most POSTs will
set both fields together anyway. But there is one exception: when
handling a large request in remote-curl's post_rpc(), we don't set
_either_, and instead set a READFUNCTION to stream data into libcurl.

This can interact weirdly with a stale POSTFIELDSIZE setting, because
curl will assume it should read only some set number of bytes from our
READFUNCTION. However, it has worked in practice because we also
manually set a "Transfer-Encoding: chunked" header, which libcurl uses
as a clue to set the POSTFIELDSIZE to -1 itself.

So everything works, but we're better off resetting the size manually
for a few reasons:

  - there was a regression in curl 8.7.0 where the chunked header
    detection didn't kick in, causing any large HTTP requests made by
    Git to fail. This has since been fixed (but not yet released). In
    the issue, curl folks recommended setting it explicitly to -1:

      curl/curl#13229 (comment)

    and it indeed works around the regression. So even though it won't
    be strictly necessary after the fix there, this will help folks who
    end up using the affected libcurl versions.

  - it's consistent with what a new curl handle would look like. Since
    get_active_slot() may or may not return a used handle, this reduces
    the possibility of heisenbugs that only appear with certain request
    patterns.

Note that the recommendation in the curl issue is to actually drop the
manual Transfer-Encoding header. Modern libcurl will add the header
itself when streaming from a READFUNCTION. However, that code wasn't
added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
support back to 7.19.5, so those older versions still need the manual
header.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our documentation claims we support curl versions back to 7.19.5. But we
can no longer compile with that version since adding an unconditional
use of CURLOPT_RESOLVE in 511cfd3 (http: add custom hostname to IP
address resolutions, 2022-05-16). That feature wasn't added to libcurl
until 7.21.3.

We could add #ifdefs to make this work back to 7.19.5. But given that
nobody noticed the compilation failure in the intervening two years, it
makes more sense to bump the version in the documentation to 7.21.3
(which is itself over 13 years old).

We could perhaps go forward even more (which would let us drop some
cruft from git-curl-compat.h), but this should be an obviously safe
jump, and we can move forward later.

Note that user-visible syntax for CURLOPT_RESOLVE has grown new features
in subsequent curl versions. Our documentation mentions "+" and "-"
entries, which require more recent versions than 7.21.3. We could
perhaps clarify that in our docs, but it's probably not worth cluttering
them with restrictions of ancient curl versions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of curl 7.66.0, we don't need to manually specify a "chunked"
Transfer-Encoding header. Instead, modern curl deduces the need for it
in a POST that has a POSTFIELDSIZE of -1 and uses READFUNCTION rather
than POSTFIELDS.

That version is recent enough that we can't just drop the header; we
need to do so conditionally. Since it's only a single line, it seems
like the simplest thing would just be to keep setting it unconditionally
(after all, the #ifdefs are much longer than the actual code). But
there's another wrinkle: HTTP/2.

Curl may choose to use HTTP/2 under the hood if the server supports it.
And in that protocol, we do not use the chunked encoding for streaming
at all. Most versions of curl handle this just fine by recognizing and
removing the header. But there's a regression in curl 8.7.0 and 8.7.1
where it doesn't, and large requests over HTTP/2 are broken (which t5559
notices). That regression has since been fixed upstream, but not yet
released.

Make the setting of this header conditional, which will let Git work
even with those buggy curl versions. And as a bonus, it serves as a
reminder that we can eventually clean up the code as we bump the
supported curl versions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@dscho dscho self-assigned this Apr 11, 2024
@dscho dscho merged commit 1b360c3 into microsoft:vfs-2.44.0 Apr 12, 2024
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants