New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion fails in http2.c #2800

Closed
yandreiy opened this Issue Jul 27, 2018 · 21 comments

Comments

Projects
None yet
3 participants
@yandreiy

yandreiy commented Jul 27, 2018

I created a small reproducible sample to help with debugging, that gives a crash after around 1 minute of running.
The app is here: https://github.com/yandreiy/curl-crash-reproduction
The error and stack trace:

app: http2.c:1681: http2_recv: Assertion `httpc->drain_total >= data->state.drain' failed.

Thread 2 "app" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff5d66700 (LWP 9271)]
0x00007ffff6faa428 in __GI_raise (sig=...) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6faa428 in __GI_raise (sig=...) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff6fac02a in __GI_abort () at abort.c:89
#2  0x00007ffff6fa2bd7 in __assert_fail_base (fmt=..., assertion=..., file=..., line=..., function=...) at assert.c:92
#3  0x00007ffff6fa2c82 in __GI___assert_fail (assertion=..., file=..., line=..., function=...) at assert.c:101
#4  0x00007ffff7b6ff98 in http2_recv (conn=..., sockindex=..., mem=..., len=..., err=...) at http2.c:1681
#5  0x00007ffff7b1f207 in Curl_read (conn=..., sockfd=..., buf=..., sizerequested=..., n=...) at sendf.c:749
#6  0x00007ffff7b3ad05 in readwrite_data (data=..., conn=..., k=..., didwhat=..., done=..., comeback=...)
    at transfer.c:482
#7  0x00007ffff7b3c1ee in Curl_readwrite (conn=..., data=..., done=..., comeback=...) at transfer.c:1125
#8  0x00007ffff7b4a53a in multi_runsingle (multi=..., now=..., data=...) at multi.c:1914
#9  0x00007ffff7b4ad93 in curl_multi_perform (multi=..., running_handles=...) at multi.c:2181
#10 0x0000000000402eab in HTTPRequestHandler::run (this=...) at HTTPRequestHandler.cpp:130
#11 0x00000000004027f1 in std::__invoke_impl<void, void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*>(std::__invoke_memfun_deref, void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*&&) (__f=..., __t=...)
    at /usr/include/c++/6/functional:227
#12 0x000000000040276a in std::__invoke<void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*>(void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*&&) (__fn=..., __args#0=...) at /usr/include/c++/6/functional:251
#13 0x000000000040270c in std::_Mem_fn_base<void (HTTPRequestHandler::*)(), true>::operator()<HTTPRequestHandler*>(HTTPRequestHandler*&&) const (this=..., __args#0=...) at /usr/include/c++/6/functional:604
#14 0x00000000004026d7 in std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=...) at /usr/include/c++/6/functional:1391
#15 0x0000000000402612 in std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)>::operator()() (this=...) at /usr/include/c++/6/functional:1380
#16 0x00000000004025e2 in std::thread::_State_impl<std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)> >::_M_run() (this=...) at /usr/include/c++/6/thread:197
#17 0x00007ffff761457f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#18 0x00007ffff78ea6ba in start_thread (arg=...) at pthread_create.c:333
#19 0x00007ffff707c41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) f 4
#4  0x00007ffff7b6ff98 in http2_recv (conn=..., sockindex=..., mem=..., len=..., err=...) at http2.c:1681
1681          DEBUGASSERT(httpc->drain_total >= data->state.drain);

I did this

Add the line to /etc/hosts, so the request resolves to an error.

1.1.1.1   my-url

Run the sample application, which sends requests to "my-url".

I expected the following

No assert occuring

curl/libcurl version

curl 7.61.0 (x86_64-pc-linux-gnu) libcurl/7.61.0 OpenSSL/1.0.2g zlib/1.2.8 c-ares/1.14.0 nghttp2/1.32.0
Release-Date: 2018-07-11
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy

operating system

Ubuntu 16.04.4 LTS

@bagder bagder added the HTTP/2 label Jul 27, 2018

@bagder

This comment has been minimized.

Member

bagder commented Jul 27, 2018

Any chance you can retry this with curl from git? We've fixed a few http2 related issues recently so chances are this doesn't trigger anymore.

@yandreiy

This comment has been minimized.

yandreiy commented Jul 27, 2018

I tried it but the crash still occurs.

curl 7.61.1-DEV (x86_64-pc-linux-gnu) libcurl/7.61.1-DEV OpenSSL/1.0.2g zlib/1.2.8 c-ares/1.14.0 nghttp2/1.32.0
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy 
@bagder

This comment has been minimized.

Member

bagder commented Jul 27, 2018

Thanks for testing that. I'll try with your test app within a few days and see what I can do!

@bagder bagder self-assigned this Jul 27, 2018

@bagder

This comment has been minimized.

Member

bagder commented Jul 28, 2018

I can reproduce the issue using your code. It just takes a while and quite a lot of transfers until it happens, so not exactly quick to figure out. But it's a start!

bagder added a commit that referenced this issue Jul 29, 2018

http2: clear the drain counter in multi_done
Reported-by: Andrei Virtosu
Fixes #2800
@bagder

This comment has been minimized.

Member

bagder commented Jul 29, 2018

With the fix in #2809, the problem no longer reproduces for me!

@bagder bagder closed this in c3b2974 Jul 30, 2018

@yandreiy

This comment has been minimized.

yandreiy commented Jul 31, 2018

Excellent, thanks @bagder!

xquery added a commit to xquery/curl that referenced this issue Aug 9, 2018

@yandreiy

This comment has been minimized.

yandreiy commented Aug 15, 2018

@bagder Tested this again with latest master (39cb713) and the assert still occurs - with the reproduction sample.
Could you take another look?

@bagder bagder reopened this Aug 15, 2018

@bagder

This comment has been minimized.

Member

bagder commented Aug 15, 2018

I can't reproduce. Here's the single-file version of your recipe that I'm using: issue-2800.cpp

Can you post a fresh stack trace of this with the current git version of curl?

@yandreiy yandreiy closed this Aug 16, 2018

@yandreiy yandreiy reopened this Aug 16, 2018

@yandreiy

This comment has been minimized.

yandreiy commented Aug 16, 2018

I get the assert again, using the single file issue-2800.cpp
One important step before testing is to add the line in etc/hosts:

1.1.1.1   my-url

without it, the assert won't occur.

This is the back-trace:

#0  0x00007ffff78f24bd in write () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007ffff659fcb5 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#2  0x00007ffff659dcdc in BIO_write () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#3  0x00007ffff68f6bf2 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#4  0x00007ffff68f8e13 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#5  0x00007ffff68f4a22 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#6  0x00007ffff7b876a2 in ossl_close (connssl=...) at vtls/openssl.c:1227
#7  0x00007ffff7b8774b in Curl_ossl_close (conn=..., sockindex=...) at vtls/openssl.c:1244
#8  0x00007ffff7b8f581 in Curl_ssl_close (conn=..., sockindex=...) at vtls/vtls.c:552
#9  0x00007ffff7b27fac in Curl_disconnect (data=..., conn=..., dead_connection=...) at url.c:794
#10 0x00007ffff7b46d14 in multi_done (connp=..., status=..., premature=...) at multi.c:625
#11 0x00007ffff7b49a3b in multi_runsingle (multi=..., now=..., data=...) at multi.c:2005
#12 0x00007ffff7b4a066 in curl_multi_perform (multi=..., running_handles=...) at multi.c:2190
#13 0x0000000000401c97 in HTTPRequestHandler::run (this=...) at issue-2800.cpp:164
#14 0x0000000000404aad in std::__invoke_impl<void, void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*>(std::__invoke_memfun_deref, void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*&&) (__f=..., __t=...)
    at /usr/include/c++/6/functional:227
#15 0x0000000000404a26 in std::__invoke<void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*>(void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*&&) (__fn=..., __args#0=...) at /usr/include/c++/6/functional:251
#16 0x00000000004049c8 in std::_Mem_fn_base<void (HTTPRequestHandler::*)(), true>::operator()<HTTPRequestHandler*>(HTTPRequestHandler*&&) const (this=..., __args#0=...) at /usr/include/c++/6/functional:604
#17 0x0000000000404993 in std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=...) at /usr/include/c++/6/functional:1391
#18 0x00000000004048ce in std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)>::operator()() (this=...) at /usr/include/c++/6/functional:1380
#19 0x000000000040489e in std::thread::_State_impl<std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)> >::_M_run() (this=...) at /usr/include/c++/6/thread:197
#20 0x00007ffff761357f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#21 0x00007ffff78e96ba in start_thread (arg=...) at pthread_create.c:333
#22 0x00007ffff707b41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

curl/libcurl version

curl 7.61.1-DEV (x86_64-pc-linux-gnu) libcurl/7.61.1-DEV OpenSSL/1.0.2g zlib/1.2.8 c-ares/1.14.0 nghttp2/1.32.0
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy

@bagder

This comment has been minimized.

Member

bagder commented Aug 16, 2018

That's not the assert though. This is a signal that originates from OpenSSL on close, and you sort of asked for this when you disabled libcurl's use of signals. You can avoid this "crash" by making sure you don't set:

curl_easy_setopt(handle, CURLOPT_NOSIGNAL, 1L);
@ThomasObenaus

This comment has been minimized.

ThomasObenaus commented Aug 16, 2018

Hey @bagder. We had problems/ crashes in our application. I don't remember exactly but we had crashes that had callstacks ending up in CurlSpay() and similar methods.

Based on this information from the docu

This option is here to allow multi-threaded unix applications to still set/use all timeout options etc, without risking getting signals

We decided to set curl_easy_setopt(handle, CURLOPT_NOSIGNAL, 1L);
I guess we somehow misunderstood this option. Can you please explain when to use it and when not?

And then the problems where gone.

@bagder

This comment has been minimized.

Member

bagder commented Aug 16, 2018

The CURLOPT_NOSIGNAL man page is pretty detailed.

If you want to

  1. handle signals yourself for whatever reason,
  2. use libcurl with the sync name resolver and avoid it causing signals (which it otherwise uses to timeout slow resolves)

... then you set NOSIGNAL to 1L and be prepared that signals might be generated so you better have a signal handler to deal with them.

@yandreiy

This comment has been minimized.

yandreiy commented Aug 16, 2018

Just a quick test with CURLOPT_NOSIGNAL set to 0L, I get similar crash:

#0  0x00007ffff78f24bd in write () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007ffff659fcb5 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#2  0x00007ffff659dcdc in BIO_write () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#3  0x00007ffff68f6bf2 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#4  0x00007ffff68f72e5 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#5  0x00007ffff7b8d75d in ossl_send (conn=..., sockindex=..., mem=..., len=..., curlcode=...) at vtls/openssl.c:3595
#6  0x00007ffff7b6c8aa in send_callback (h2=..., data=..., length=..., flags=..., userp=...) at http2.c:373
#7  0x00007ffff6b48599 in nghttp2_session_send (session=...) at nghttp2_session.c:3218
#8  0x00007ffff7b7043f in http2_send (conn=..., sockindex=..., mem=..., len=..., err=...) at http2.c:2069
#9  0x00007ffff7b1d859 in Curl_write (conn=..., sockfd=..., mem=..., len=..., written=...) at sendf.c:337
#10 0x00007ffff7b16b2c in Curl_add_buffer_send (in=..., conn=..., bytes_written=..., included_body_bytes=..., 
    socketindex=...) at http.c:1177
#11 0x00007ffff7b1ac6f in Curl_http (conn=..., done=...) at http.c:2858
#12 0x00007ffff7b480e7 in multi_do (connp=..., done=...) at multi.c:1271
#13 0x00007ffff7b48ef9 in multi_runsingle (multi=..., now=..., data=...) at multi.c:1681
#14 0x00007ffff7b4a066 in curl_multi_perform (multi=..., running_handles=...) at multi.c:2190
#15 0x0000000000401c97 in HTTPRequestHandler::run (this=...) at issue-2800.cpp:164
#16 0x0000000000404aad in std::__invoke_impl<void, void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*>(std::__invoke_memfun_deref, void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*&&) (__f=..., __t=...)
    at /usr/include/c++/6/functional:227
#17 0x0000000000404a26 in std::__invoke<void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*>(void (HTTPRequestHandler::* const&)(), HTTPRequestHandler*&&) (__fn=..., __args#0=...) at /usr/include/c++/6/functional:251
#18 0x00000000004049c8 in std::_Mem_fn_base<void (HTTPRequestHandler::*)(), true>::operator()<HTTPRequestHandler*>(HTTPRequestHandler*&&) const (this=..., __args#0=...) at /usr/include/c++/6/functional:604
#19 0x0000000000404993 in std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=...) at /usr/include/c++/6/functional:1391
#20 0x00000000004048ce in std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)>::operator()() (this=...) at /usr/include/c++/6/functional:1380
#21 0x000000000040489e in std::thread::_State_impl<std::_Bind_simple<std::_Mem_fn<void (HTTPRequestHandler::*)()> (HTTPRequestHandler*)> >::_M_run() (this=...) at /usr/include/c++/6/thread:197
#22 0x00007ffff761357f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#23 0x00007ffff78e96ba in start_thread (arg=...) at pthread_create.c:333
#24 0x00007ffff707b41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
@ThomasObenaus

This comment has been minimized.

ThomasObenaus commented Aug 16, 2018

Ok. If I remember correctly we also based our decision on https://stackoverflow.com/questions/21887264/why-libcurl-needs-curlopt-nosignal-option-and-what-are-side-effects-when-it-is.
And thought, that curl might throw signals on it's own.

But if I understand correctly, then curl throws only signals if no async nameresolver (like c-ares) is used, right?
So we are safe since we are using c-ares?

@bagder

This comment has been minimized.

Member

bagder commented Aug 16, 2018

Just a quick test with CURLOPT_NOSIGNAL set to 0L, I get similar crash:

Then what is that crash? That happens within OpenSSL...

@bagder

This comment has been minimized.

Member

bagder commented Aug 16, 2018

curl throws only signals if no async nameresolver (like c-ares) is used, right?
So we are safe since we are using c-ares?

Correct. With c-ares or the threaded resolver, curl will not use signals for anything.

@yandreiy

This comment has been minimized.

yandreiy commented Aug 16, 2018

@bagder Ok, thanks for explaining. What do you suggest we do to fix it, is it a problem in openssl ?

@bagder

This comment has been minimized.

Member

bagder commented Aug 16, 2018

@yandreiy what's the reason for the crash?

@yandreiy

This comment has been minimized.

yandreiy commented Aug 17, 2018

@yandreiy what's the reason for the crash?

Thread 2 "issue" received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7ffff5d65700 (LWP 4693)]
0x00007ffff78f24bd in write () at ../sysdeps/unix/syscall-template.S:84
84	../sysdeps/unix/syscall-template.S: No such file or directory.
@bagder

This comment has been minimized.

Member

bagder commented Aug 17, 2018

See, that's a signal. It should've been ignored by libcurl:

See frame #13-14 in your stack trace:

#13 0x00007ffff7b48ef9 in multi_runsingle (multi=..., now=..., data=...) at multi.c:1681
#14 0x00007ffff7b4a066 in curl_multi_perform (multi=..., running_handles=...) at multi.c:2190

If you look in the curl_multi_perform function you can see how the libcurl functions ignore and restore the sigpipe ignore:

curl/lib/multi.c

Lines 2167 to 2169 in ba58ce6

sigpipe_ignore(data, &pipe_st);
result = multi_runsingle(multi, now, data);
sigpipe_restore(&pipe_st);

Can you figure out why it doesn't do that for you? By observation, it looks like you still set CURLOPT_NOSIGNAL to 1...

@bagder

This comment has been minimized.

Member

bagder commented Aug 21, 2018

On further thoughts: the assert that was the reason for this issue is fixed and does not seem to appear again.

The later SIGPIPE issue is different. If you still experience this without asking for it by telling libcurl to disable its signal handler, please submit a new issue. I consider this assert bug fixed and this issue closed.

bagder added a commit that referenced this issue Aug 21, 2018

libcurl-thread.3: expand somewhat on the NO_SIGNAL motivation
Multi-threaded applictions basically MUST set CURLOPT_NO_SIGNAL to 1L to
avoid the risk of getting a SIGPIPE.

Either way, a multi-threaded application that uses libcurl/openssl needs
to have a signhandler for or ignore SIGPIPE on its own.

Based on discussions in #2800

@bagder bagder closed this Aug 22, 2018

bagder added a commit that referenced this issue Aug 23, 2018

libcurl-thread.3: expand somewhat on the NO_SIGNAL motivation
Multi-threaded applictions basically MUST set CURLOPT_NO_SIGNAL to 1L to
avoid the risk of getting a SIGPIPE.

Either way, a multi-threaded application that uses libcurl/openssl needs
to have a signhandler for or ignore SIGPIPE on its own.

Based on discussions in #2800
Closes #2904

falconindy added a commit to falconindy/curl that referenced this issue Sep 10, 2018

falconindy added a commit to falconindy/curl that referenced this issue Sep 10, 2018

libcurl-thread.3: expand somewhat on the NO_SIGNAL motivation
Multi-threaded applictions basically MUST set CURLOPT_NO_SIGNAL to 1L to
avoid the risk of getting a SIGPIPE.

Either way, a multi-threaded application that uses libcurl/openssl needs
to have a signhandler for or ignore SIGPIPE on its own.

Based on discussions in curl#2800
Closes curl#2904

@lock lock bot locked as resolved and limited conversation to collaborators Nov 20, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.