New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http2 assert drain_total and state.drain gets out of sync #1680

Closed
Andersbakken opened this Issue Jul 13, 2017 · 2 comments

Comments

Projects
None yet
4 participants
@Andersbakken
Contributor

Andersbakken commented Jul 13, 2017

I can reproduce an issue with http2 in curl 7.53 (and seemingly nothing has changed in this area in 7.54)

I did this

  • Issue 10 concurrent get requests to a local nginx server for a 1kb resource
  • Every time one of them finishes I issue another one until 10000 completes

I expected the following

No asserts, instead I get this one every time:

http2_handle_stream_close: Assertion `httpc->drain_total >= data->state.drain' failed.

I believe it's related to retrying of requests since it seems to shortly follow this debug statement every time:

Issue another request to this URL: ...

It seems more or less innocuous to me and maybe the code that checks that state.drain > total_drain could just be defensive but I'm not 100% sure what unintended consequences that might have.

I compiled the app like this:

/usr/local/x86_64-toolchain-linux-gnu-master/bin/x86_64-toolchain-linux-gnu-gcc -I/usr/local/x86_64-toolchain-linux-gnu-master/include/ main.c -lcurl -lnghttp2 -L/usr/local/x86_64-toolchain-linux-gnu-master/lib -lssl -lcrypto -lz -Wl,-rpath,/usr/local/x86_64-toolchain-linux-gnu-master/lib
http2assert.zip

In the attached zip there is a small example program (main.c), an nginx.conf and a log from a run with --verbose (lots and lots of output, it'll kill your terminal if you don't redirect)

curl/libcurl version

7.53
[curl -V output]
curl 7.53.0-DEV (i686-pc-linux-gnu) libcurl/7.53.0-DEV OpenSSL/1.0.2f zlib/1.2.8 nghttp2/1.21.0
Protocols: http https smb smbs
Features: Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy

operating system

Linux

@bagder bagder added the HTTP/2 label Jul 13, 2017

@bagder bagder assigned bagder and unassigned bagder Aug 12, 2017

@sunilvirus

This comment has been minimized.

Show comment
Hide comment
@sunilvirus

sunilvirus Oct 29, 2017

I am facing same issue in more or less the same setup.
I am doing a bunch of concurrent post request to nginx sever.

Few additional data points if they help.

  1. I see this error typically after a some error. Mostly its timeout.
    Not doing some cleanup properly on errors ?

  2. I don't see the same assertion when I am posting (same content) to the nghttp2 server
    I am not sure if one is better than the other in their protocol implementation

I am facing same issue in more or less the same setup.
I am doing a bunch of concurrent post request to nginx sever.

Few additional data points if they help.

  1. I see this error typically after a some error. Mostly its timeout.
    Not doing some cleanup properly on errors ?

  2. I don't see the same assertion when I am posting (same content) to the nghttp2 server
    I am not sure if one is better than the other in their protocol implementation

@dhakiptk

This comment has been minimized.

Show comment
Hide comment
@dhakiptk

dhakiptk Feb 5, 2018

I am getting the same assertion failure, once every couple of hours for my application(details below)

curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.5 brotli/1.0.2 c-ares/1.13.0 nghttp2/1.29.0
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz brotli TLS-SRP HTTP2 UnixSockets HTTPS-proxy

A little detail about my application -
I make http2 call to m http(HTTP2 + TLS) servers waiting a max of n milliseconds for the response using multi_socket_action API

I also use http multiplexing(but not pipelining) by setting the suitable option on the multi-handle and k out of m requests could be going to the same server i.e multi-plexed over the same TCP connection using http2.

After max n milliseconds I remove all the easy handle using curl_multi_remove_handle API regardless of whether individual transfers are "DONE" or not.

The same underlying connection is re-used(which prior to version 7.58 was not the case, #2237) for the next subsequent request to the server.

Here is the backtrace
(gdb) bt
#0 0x0000003aea2325e5 in raise () from /lib64/libc.so.6
#1 0x0000003aea233dc5 in abort () from /lib64/libc.so.6
#2 0x0000003aea22b70e in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000003aea22b7d0 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000738984 in http2_handle_stream_close (
conn=0x7f710c001048, data=0x7f73a1464e48, stream=0x7f73a3f0e0a8,
err=0x7f73d89f654c) at http2.c:1322
#5 0x0000000000739351 in http2_recv (conn=0x7f710c001048,
sockindex=0,
mem=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., len=16384, err=0x7f73d89f654c)
at http2.c:1557
#6 0x000000000075987a in Curl_read (conn=0x7f710c001048,
sockfd=3208,
buf=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., sizerequested=16384, n=0x7f73d89f65e8)
at sendf.c:741
#7 0x0000000000769916 in readwrite_data (data=0x7f73a1464e48,
conn=0x7f710c001048, k=0x7f73a1464f08, didwhat=0x7f73d89f667c,
done=0x7f73d89f671b, comeback=0x7f73d89f66f7) at transfer.c:475
#8 0x000000000076adb0 in Curl_readwrite (conn=0x7f710c001048,
data=0x7f73a1464e48, done=0x7f73d89f671b, comeback=0x7f73d89f66f7)
at transfer.c:1113
#9 0x000000000072c575 in multi_runsingle (multi=0x7f73a08bafa8,
now=..., data=0x7f73a1464e48) at multi.c:1871
#10 0x000000000072d86d in multi_socket (multi=0x7f73a08bafa8,
checkall=false, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2551
#11 0x000000000072e1b3 in curl_multi_socket_action (
multi=0x7f73a08bafa8, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2700

(gdb) p httpc->drain_total
$27 = 2
(gdb) p data->state.drain
$28 = 3

dhakiptk commented Feb 5, 2018

I am getting the same assertion failure, once every couple of hours for my application(details below)

curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.5 brotli/1.0.2 c-ares/1.13.0 nghttp2/1.29.0
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz brotli TLS-SRP HTTP2 UnixSockets HTTPS-proxy

A little detail about my application -
I make http2 call to m http(HTTP2 + TLS) servers waiting a max of n milliseconds for the response using multi_socket_action API

I also use http multiplexing(but not pipelining) by setting the suitable option on the multi-handle and k out of m requests could be going to the same server i.e multi-plexed over the same TCP connection using http2.

After max n milliseconds I remove all the easy handle using curl_multi_remove_handle API regardless of whether individual transfers are "DONE" or not.

The same underlying connection is re-used(which prior to version 7.58 was not the case, #2237) for the next subsequent request to the server.

Here is the backtrace
(gdb) bt
#0 0x0000003aea2325e5 in raise () from /lib64/libc.so.6
#1 0x0000003aea233dc5 in abort () from /lib64/libc.so.6
#2 0x0000003aea22b70e in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000003aea22b7d0 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000738984 in http2_handle_stream_close (
conn=0x7f710c001048, data=0x7f73a1464e48, stream=0x7f73a3f0e0a8,
err=0x7f73d89f654c) at http2.c:1322
#5 0x0000000000739351 in http2_recv (conn=0x7f710c001048,
sockindex=0,
mem=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., len=16384, err=0x7f73d89f654c)
at http2.c:1557
#6 0x000000000075987a in Curl_read (conn=0x7f710c001048,
sockfd=3208,
buf=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., sizerequested=16384, n=0x7f73d89f65e8)
at sendf.c:741
#7 0x0000000000769916 in readwrite_data (data=0x7f73a1464e48,
conn=0x7f710c001048, k=0x7f73a1464f08, didwhat=0x7f73d89f667c,
done=0x7f73d89f671b, comeback=0x7f73d89f66f7) at transfer.c:475
#8 0x000000000076adb0 in Curl_readwrite (conn=0x7f710c001048,
data=0x7f73a1464e48, done=0x7f73d89f671b, comeback=0x7f73d89f66f7)
at transfer.c:1113
#9 0x000000000072c575 in multi_runsingle (multi=0x7f73a08bafa8,
now=..., data=0x7f73a1464e48) at multi.c:1871
#10 0x000000000072d86d in multi_socket (multi=0x7f73a08bafa8,
checkall=false, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2551
#11 0x000000000072e1b3 in curl_multi_socket_action (
multi=0x7f73a08bafa8, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2700

(gdb) p httpc->drain_total
$27 = 2
(gdb) p data->state.drain
$28 = 3

bagder added a commit that referenced this issue Apr 19, 2018

http2: clear the "drain counter" when a stream is closed
This fixes the notorious "httpc->drain_total >= data->state.drain"
assert.

Reported-by: Anders Bakken

Fixes #1680

bagder added a commit that referenced this issue Apr 19, 2018

http2: clear the "drain counter" when a stream is closed
This fixes the notorious "httpc->drain_total >= data->state.drain"
assert.

Reported-by: Anders Bakken

Fixes #1680

@bagder bagder closed this in 7645c6b Apr 20, 2018

@lock lock bot locked as resolved and limited conversation to collaborators Jul 19, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.