libcurl 7.58 regression with HTTP/2 #2365

Closed
JustArchi opened this Issue Mar 6, 2018 · 7 comments

Comments

Projects
None yet
2 participants
@JustArchi

JustArchi commented Mar 6, 2018

I did this

#include <stdio.h>
#include <curl/curl.h>
#include <string.h>

int main(void)
{
  CURL *curl;

  curl_global_init(CURL_GLOBAL_DEFAULT);

  curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "https://discordapp.com/api/v6/gateway");
    curl_easy_setopt(curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2_0);

    char errbuf[CURL_ERROR_SIZE];
    curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, errbuf);
    errbuf[0] = 0;

    int count = 1;

    for (count = 1; count < 1500; count++) {
        CURLcode res = curl_easy_perform(curl);
        if (res != CURLE_OK) {
            size_t len = strlen(errbuf);
            fprintf(stderr, "\nlibcurl: (%d) ", res);
            if (len) {
                fprintf(stderr, "%s%s", errbuf, ((errbuf[len - 1] != '\n') ? "\n" : ""));
            } else {
                fprintf(stderr, "%s\n", curl_easy_strerror(res));
            }

            break;
        }

        fprintf(stdout, "OK: #%d\n", count);
    }

    curl_easy_cleanup(curl);
  }

  curl_global_cleanup();

  return 0;
}

I expected the following

for loop going past 1000th request.

I got this

libcurl: (16) Error in the HTTP2 framing layer at 1000th request

image

curl/libcurl version

curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.0.2n zlib/1.2.8 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) libssh2/1.8.0 nghttp2/1.31.0 librtmp/2.3
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL

operating system

Debian Testing

extra info

Initially I reported this issue as part of dotnet/corefx at dotnet/corefx#27730

The issue itself is a weird regression that I managed to reproduce only with one specific server until now, but it's possible that more are affected. This could potentially hint that it's some server-specific rate-limiting or likewise, but then the reproducible case works perfectly fine on older libcurl 7.52.1, also with HTTP/2, which makes me believe that even if it's some server-related issue, it wasn't triggered before.

Side note: I think that 7.57 also worked fine and the regression happened with 7.58, but I could be wrong. It's definitely possible to reproduce with 7.58, and definitely not possible to reproduce with 7.52.1.

The issue is very weird since it always affects exactly 1000th request being sent. Doing a similar thing but with launching a new process with each request, curl in particular, does not lead to this issue - it has to be a part of a single process. You could also do 2 processes spawning 600 requests each and that also finishes just fine.

While it's possible that it's some specific server-related issue, considering that old version of libcurl works fine while new one doesn't, I believe it's some weird regression that I'm sadly unable to analyze further due to very limited knowledge in terms of C and libcurl. I'd like to apologize in advance if I messed up something.

Thank you in advance for looking into the issue.

@bagder bagder added the HTTP/2 label Mar 6, 2018

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 9, 2018

Owner

It certainly can't be a coincidence that it stops exactly at 1000. It feels like it might be related to the max number of streams or something. Running the same test against for example https://daniel.haxx.se/, it works perfectly beyond 1000. This indicates that at least it is related to something the server does.

Owner

bagder commented Mar 9, 2018

It certainly can't be a coincidence that it stops exactly at 1000. It feels like it might be related to the max number of streams or something. Running the same test against for example https://daniel.haxx.se/, it works perfectly beyond 1000. This indicates that at least it is related to something the server does.

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 9, 2018

Owner

It's not related to MAX_CONCURRENT_STREAMS, it gets set to 128...

Owner

bagder commented Mar 9, 2018

It's not related to MAX_CONCURRENT_STREAMS, it gets set to 128...

@JustArchi

This comment has been minimized.

Show comment Hide comment
@JustArchi

JustArchi Mar 10, 2018

Running the same test against for example https://daniel.haxx.se/, it works perfectly beyond 1000. This indicates that at least it is related to something the server does.

Yes, I can confirm that I reproduced it with only that one server until now. At the same time I couldn't reproduce it with exactly the same server on older libcurl. I also couldn't reproduce it with affected libcurl but splitted across several processes, e.g. 1200 requests in 2x 600.

JustArchi commented Mar 10, 2018

Running the same test against for example https://daniel.haxx.se/, it works perfectly beyond 1000. This indicates that at least it is related to something the server does.

Yes, I can confirm that I reproduced it with only that one server until now. At the same time I couldn't reproduce it with exactly the same server on older libcurl. I also couldn't reproduce it with affected libcurl but splitted across several processes, e.g. 1200 requests in 2x 600.

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 10, 2018

Owner
@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 10, 2018

Owner

I captured the entire run with wireshark and here's what I learned:

The server sends a GOAWAY (with error NO_ERROR) when the 1000th stream ends. This message is defined by the spec like this:

The GOAWAY frame (type=0x7) is used to initiate shutdown of a connection or to signal serious error conditions. GOAWAY allows an endpoint to gracefully stop accepting new streams while still finishing processing of previously established streams.

I think the GOAWAY should cause the connection to get closed but I figure the next request with libcurl should then simply create a new one and continue. This case should not be considered an error for libcurl.

Owner

bagder commented Mar 10, 2018

I captured the entire run with wireshark and here's what I learned:

The server sends a GOAWAY (with error NO_ERROR) when the 1000th stream ends. This message is defined by the spec like this:

The GOAWAY frame (type=0x7) is used to initiate shutdown of a connection or to signal serious error conditions. GOAWAY allows an endpoint to gracefully stop accepting new streams while still finishing processing of previously established streams.

I think the GOAWAY should cause the connection to get closed but I figure the next request with libcurl should then simply create a new one and continue. This case should not be considered an error for libcurl.

bagder added a commit that referenced this issue Mar 10, 2018

http2: GOAWAY NO_ERROR is not an error, just end of connection
Reported-by: Łukasz Domeradzki
Fixes #2365
@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 10, 2018

Owner

With #2375 applied your example above reaches all the way to 1500 for me, by creating a new connection after stream 1000 ends. As one would like it to do.

Owner

bagder commented Mar 10, 2018

With #2375 applied your example above reaches all the way to 1500 for me, by creating a new connection after stream 1000 ends. As one would like it to do.

@JustArchi

This comment has been minimized.

Show comment Hide comment
@JustArchi

JustArchi Mar 10, 2018

Yeah, that was the goal. Thank you a lot for that!

Yeah, that was the goal. Thank you a lot for that!

bagder added a commit that referenced this issue Mar 11, 2018

http2: mark the connection for close on GOAWAY
Assisted-by: Jay Satiro
Reported-by: Łukasz Domeradzki
Fixes #2365

@bagder bagder closed this in 8b498a8 Mar 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment