http2 assert drain_total and state.drain gets out of sync #1680

Closed
Andersbakken opened this Issue Jul 13, 2017 · 2 comments

Comments

Projects
None yet
4 participants
@Andersbakken
Contributor

Andersbakken commented Jul 13, 2017

I can reproduce an issue with http2 in curl 7.53 (and seemingly nothing has changed in this area in 7.54)

I did this

  • Issue 10 concurrent get requests to a local nginx server for a 1kb resource
  • Every time one of them finishes I issue another one until 10000 completes

I expected the following

No asserts, instead I get this one every time:

http2_handle_stream_close: Assertion `httpc->drain_total >= data->state.drain' failed.

I believe it's related to retrying of requests since it seems to shortly follow this debug statement every time:

Issue another request to this URL: ...

It seems more or less innocuous to me and maybe the code that checks that state.drain > total_drain could just be defensive but I'm not 100% sure what unintended consequences that might have.

I compiled the app like this:

/usr/local/x86_64-toolchain-linux-gnu-master/bin/x86_64-toolchain-linux-gnu-gcc -I/usr/local/x86_64-toolchain-linux-gnu-master/include/ main.c -lcurl -lnghttp2 -L/usr/local/x86_64-toolchain-linux-gnu-master/lib -lssl -lcrypto -lz -Wl,-rpath,/usr/local/x86_64-toolchain-linux-gnu-master/lib
http2assert.zip

In the attached zip there is a small example program (main.c), an nginx.conf and a log from a run with --verbose (lots and lots of output, it'll kill your terminal if you don't redirect)

curl/libcurl version

7.53
[curl -V output]
curl 7.53.0-DEV (i686-pc-linux-gnu) libcurl/7.53.0-DEV OpenSSL/1.0.2f zlib/1.2.8 nghttp2/1.21.0
Protocols: http https smb smbs
Features: Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy

operating system

Linux

@bagder bagder added the HTTP/2 label Jul 13, 2017

@bagder bagder assigned bagder and unassigned bagder Aug 12, 2017

@sunilvirus

This comment has been minimized.

Show comment Hide comment
@sunilvirus

sunilvirus Oct 29, 2017

I am facing same issue in more or less the same setup.
I am doing a bunch of concurrent post request to nginx sever.

Few additional data points if they help.

  1. I see this error typically after a some error. Mostly its timeout.
    Not doing some cleanup properly on errors ?

  2. I don't see the same assertion when I am posting (same content) to the nghttp2 server
    I am not sure if one is better than the other in their protocol implementation

I am facing same issue in more or less the same setup.
I am doing a bunch of concurrent post request to nginx sever.

Few additional data points if they help.

  1. I see this error typically after a some error. Mostly its timeout.
    Not doing some cleanup properly on errors ?

  2. I don't see the same assertion when I am posting (same content) to the nghttp2 server
    I am not sure if one is better than the other in their protocol implementation

@dhakiptk

This comment has been minimized.

Show comment Hide comment
@dhakiptk

dhakiptk Feb 5, 2018

I am getting the same assertion failure, once every couple of hours for my application(details below)

curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.5 brotli/1.0.2 c-ares/1.13.0 nghttp2/1.29.0
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz brotli TLS-SRP HTTP2 UnixSockets HTTPS-proxy

A little detail about my application -
I make http2 call to m http(HTTP2 + TLS) servers waiting a max of n milliseconds for the response using multi_socket_action API

I also use http multiplexing(but not pipelining) by setting the suitable option on the multi-handle and k out of m requests could be going to the same server i.e multi-plexed over the same TCP connection using http2.

After max n milliseconds I remove all the easy handle using curl_multi_remove_handle API regardless of whether individual transfers are "DONE" or not.

The same underlying connection is re-used(which prior to version 7.58 was not the case, #2237) for the next subsequent request to the server.

Here is the backtrace
(gdb) bt
#0 0x0000003aea2325e5 in raise () from /lib64/libc.so.6
#1 0x0000003aea233dc5 in abort () from /lib64/libc.so.6
#2 0x0000003aea22b70e in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000003aea22b7d0 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000738984 in http2_handle_stream_close (
conn=0x7f710c001048, data=0x7f73a1464e48, stream=0x7f73a3f0e0a8,
err=0x7f73d89f654c) at http2.c:1322
#5 0x0000000000739351 in http2_recv (conn=0x7f710c001048,
sockindex=0,
mem=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., len=16384, err=0x7f73d89f654c)
at http2.c:1557
#6 0x000000000075987a in Curl_read (conn=0x7f710c001048,
sockfd=3208,
buf=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., sizerequested=16384, n=0x7f73d89f65e8)
at sendf.c:741
#7 0x0000000000769916 in readwrite_data (data=0x7f73a1464e48,
conn=0x7f710c001048, k=0x7f73a1464f08, didwhat=0x7f73d89f667c,
done=0x7f73d89f671b, comeback=0x7f73d89f66f7) at transfer.c:475
#8 0x000000000076adb0 in Curl_readwrite (conn=0x7f710c001048,
data=0x7f73a1464e48, done=0x7f73d89f671b, comeback=0x7f73d89f66f7)
at transfer.c:1113
#9 0x000000000072c575 in multi_runsingle (multi=0x7f73a08bafa8,
now=..., data=0x7f73a1464e48) at multi.c:1871
#10 0x000000000072d86d in multi_socket (multi=0x7f73a08bafa8,
checkall=false, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2551
#11 0x000000000072e1b3 in curl_multi_socket_action (
multi=0x7f73a08bafa8, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2700

(gdb) p httpc->drain_total
$27 = 2
(gdb) p data->state.drain
$28 = 3

dhakiptk commented Feb 5, 2018

I am getting the same assertion failure, once every couple of hours for my application(details below)

curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.5 brotli/1.0.2 c-ares/1.13.0 nghttp2/1.29.0
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz brotli TLS-SRP HTTP2 UnixSockets HTTPS-proxy

A little detail about my application -
I make http2 call to m http(HTTP2 + TLS) servers waiting a max of n milliseconds for the response using multi_socket_action API

I also use http multiplexing(but not pipelining) by setting the suitable option on the multi-handle and k out of m requests could be going to the same server i.e multi-plexed over the same TCP connection using http2.

After max n milliseconds I remove all the easy handle using curl_multi_remove_handle API regardless of whether individual transfers are "DONE" or not.

The same underlying connection is re-used(which prior to version 7.58 was not the case, #2237) for the next subsequent request to the server.

Here is the backtrace
(gdb) bt
#0 0x0000003aea2325e5 in raise () from /lib64/libc.so.6
#1 0x0000003aea233dc5 in abort () from /lib64/libc.so.6
#2 0x0000003aea22b70e in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000003aea22b7d0 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000738984 in http2_handle_stream_close (
conn=0x7f710c001048, data=0x7f73a1464e48, stream=0x7f73a3f0e0a8,
err=0x7f73d89f654c) at http2.c:1322
#5 0x0000000000739351 in http2_recv (conn=0x7f710c001048,
sockindex=0,
mem=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., len=16384, err=0x7f73d89f654c)
at http2.c:1557
#6 0x000000000075987a in Curl_read (conn=0x7f710c001048,
sockfd=3208,
buf=0x7f73a14981a8 "HTTP/2 204 \r\ncontent-type: text/html; charset=ISO-8859-1\r\nx-content-type-options: nosniff\r\ndate: Sun, 04 Feb 2018 14:00:25 GMT\r\nserver: xbfe\r\nx-xss-protection: 1; mode=block\r\nx-frame-options: SAMEORIG"..., sizerequested=16384, n=0x7f73d89f65e8)
at sendf.c:741
#7 0x0000000000769916 in readwrite_data (data=0x7f73a1464e48,
conn=0x7f710c001048, k=0x7f73a1464f08, didwhat=0x7f73d89f667c,
done=0x7f73d89f671b, comeback=0x7f73d89f66f7) at transfer.c:475
#8 0x000000000076adb0 in Curl_readwrite (conn=0x7f710c001048,
data=0x7f73a1464e48, done=0x7f73d89f671b, comeback=0x7f73d89f66f7)
at transfer.c:1113
#9 0x000000000072c575 in multi_runsingle (multi=0x7f73a08bafa8,
now=..., data=0x7f73a1464e48) at multi.c:1871
#10 0x000000000072d86d in multi_socket (multi=0x7f73a08bafa8,
checkall=false, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2551
#11 0x000000000072e1b3 in curl_multi_socket_action (
multi=0x7f73a08bafa8, s=3208, ev_bitmask=1,
running_handles=0x7f73d89f69dc) at multi.c:2700

(gdb) p httpc->drain_total
$27 = 2
(gdb) p data->state.drain
$28 = 3

bagder added a commit that referenced this issue Apr 19, 2018

http2: clear the "drain counter" when a stream is closed
This Fixes the notorious "httpc->drain_total >= data->state.drain"
assert.

Reported-by: Anders Bakken

Fixes #1680

bagder added a commit that referenced this issue Apr 19, 2018

http2: clear the "drain counter" when a stream is closed
This Fixes the notorious "httpc->drain_total >= data->state.drain"
assert.

Reported-by: Anders Bakken

Fixes #1680

@bagder bagder closed this in 7645c6b Apr 20, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment