Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http/2 responses without content length headers can cause timeouts #4496

Closed
TvdW opened this issue Oct 16, 2019 · 9 comments
Labels

Comments

@TvdW
Copy link

@TvdW TvdW commented Oct 16, 2019

This is a continuation of #4043, I guess. A slightly more parallel version of the code from that issue can trigger this bug.

Code: https://gist.github.com/TvdW/b07e146eb648fd5ba05f70e87dee0a3b

curl 7.67.0-DEV (x86_64-unknown-linux-gnu) libcurl/7.67.0-DEV OpenSSL/1.0.2k-fips zlib/1.2.7 brotli/1.0.1 c-ares/1.15.0 nghttp2/1.39.1
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli Debug GSS-API HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz NTLM NTLM_WB SPNEGO SSL TrackMemory UnixSockets

After a few seconds of doing a lot of requests, the code ends with

* Operation timed out after 5005 milliseconds with 0 bytes received
* Kill stream: Disconnected with pending data
* multi_done
* stopped the pause stream!
* Connection still in use 1, no more multi_done now!
* Operation timed out after 5004 milliseconds with 0 bytes received
* Kill stream: Disconnected with pending data

This does appear to be a race condition, though, in which data must come back before all requests have been submitted. The server listed in the .c snippet runs in an Amsterdam datacenter, but if that's too far then the Go code for it is at https://gist.github.com/TvdW/d90c1b6d69d140e274b8d0eb7788e623

Some things I've noticed:

  • Only triggers with PIPEWAIT (I've created #4494 so this can be done with the curl command line in the future)
  • Triggers when the server doesn't return a Content-Length header (like #4043)
  • Clearly a race condition: triggers with <20 requests on localhost, takes hundreds of requests against a public server. Adding 50ms delay on the server also hides the problem.

Feel free to hit the server I've set up for this with however many requests are needed to test, it can handle it.

@bagder bagder added the HTTP/2 label Oct 16, 2019
@bagder

This comment has been minimized.

Copy link
Member

@bagder bagder commented Oct 16, 2019

I ran your example code and after running for some time it ended as you described, which I presume then reproduced the problem. I'm on it.

@bagder

This comment has been minimized.

Copy link
Member

@bagder bagder commented Oct 17, 2019

@TvdW, do things look better for you with this?

--- a/lib/http2.c
+++ b/lib/http2.c
@@ -846,10 +846,11 @@ static int on_stream_close(nghttp2_session *session, int32_t stream_id,
       return NGHTTP2_ERR_CALLBACK_FAILURE;
 
     stream->closed = TRUE;
     httpc = &conn->proto.httpc;
     drain_this(data_s, httpc);
+    Curl_expire(data_s, 0, EXPIRE_RUN_NOW);
     httpc->error_code = error_code;
 
     /* remove the entry from the hash as the stream is now gone */
     rv = nghttp2_session_set_stream_user_data(session, stream_id, 0);
     if(rv) {
@TvdW

This comment has been minimized.

Copy link
Author

@TvdW TvdW commented Oct 17, 2019

Haha, well, that seems to fix the test case I provided but not my original issue 😄 I'll report back once I know more

@TvdW

This comment has been minimized.

Copy link
Author

@TvdW TvdW commented Oct 17, 2019

So, eh, this happened

./curl -m5 -Z https://domisc-01.server.tvdw.eu/curl-{1..1000} --no-progress-meter >/dev/null

Main difference is I updated the Go server to write a few bytes in the output instead of returning an empty response

@bagder

This comment has been minimized.

Copy link
Member

@bagder bagder commented Oct 17, 2019

What am I looking for there? I suppose that doesn't trigger the problem for me because I can't see anything fishy...

@TvdW

This comment has been minimized.

Copy link
Author

@TvdW TvdW commented Oct 17, 2019

Oh wow, okay. Back to the drawing board.

fwiw, this is what happens for me:

$ ./curl -m5 -Z https://domisc-01.server.tvdw.eu/curl-{1..1000} >/dev/null
DL% UL%  Dled  Uled  Xfers  Live   Qd Total     Current  Left    Speed
--  --  12103     0  1000     2    67 --:--:--  0:00:05 --:--:--  2414      curl: (28) Operation timed out after 5001 milliseconds with 0 out of 0 bytes received
curl: (28) Operation timed out after 5000 milliseconds with 0 out of 0 bytes received
--  --  12103     0  1000     0    67 --:--:--  0:00:05 --:--:--  2414     
@TvdW

This comment has been minimized.

Copy link
Author

@TvdW TvdW commented Oct 17, 2019

Some subtle tweaking later, this reproduces it for me again: https://gist.github.com/TvdW/f832c9a7752826c8260e625a77eb6c8d

Tested it from several networks and computers this time just to be sure. With this iteration of the code I'm able to reproduce it from a network ~150ms away as well.

bagder added a commit that referenced this issue Oct 18, 2019
To make sure that transfer is being dealt with. Streams without
Content-Length need a final read to notice the end-of-stream state.

Reported-by: Tom van der Woerdt
Fixes #4496
bagder added a commit that referenced this issue Oct 18, 2019
... and thus should return 0, not EAGAIN.

Reported-by: Tom van der Woerdt
Fixes #4496
@bagder

This comment has been minimized.

Copy link
Member

@bagder bagder commented Oct 18, 2019

Another attempt, this one seems to fix your updated recipe as well!

@TvdW

This comment has been minimized.

Copy link
Author

@TvdW TvdW commented Oct 18, 2019

LGTM! 😃 Thanks!

@bagder bagder closed this in 59041f0 Oct 18, 2019
bagder added a commit that referenced this issue Oct 18, 2019
... and thus should return 0, not EAGAIN.

Reported-by: Tom van der Woerdt
Fixes #4496
Closes #4505
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.