munmap_chunk(): invalid pointer, "nuraft_w_0" received signal SIGABRT, Aborted #188

Closed
kishorekrd opened this issue Mar 28, 2021 · 12 comments


@kishorekrd

kishorekrd commented Mar 28, 2021

Getting the following crash randomly on Ubuntu 18.04. Is there any fix or mitigation?

munmap_chunk(): invalid pointer

Thread 21 "nuraft_w_0" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fff74ae8700 (LWP 21466)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
Sun Mar 28 06:37:01 UTC 2021
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff5d4c921 in __GI_abort () at abort.c:79
#2  0x00007ffff5d95967 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff5ec2b0d "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff5d9c9da in malloc_printerr (str=str@entry=0x7ffff5ec4720 "munmap_chunk(): invalid pointer") at malloc.c:5342
#4  0x00007ffff5da3fbc in munmap_chunk (p=0x7fff480018e0) at malloc.c:2846
#5  __GI___libc_free (mem=0x7fff480018f0) at malloc.c:3127
#6  0x00005555565fa90e in OPENSSL_free (orig_ptr=0x7fff480018f8) at external/boringssl/src/crypto/mem.c:154
#7  0x000055555657e70d in bio_free (bio=0x7fff0c0014e8) at external/boringssl/src/crypto/bio/pair.c:144
#8  0x000055555657c3a9 in BIO_free (bio=0x7fff0c0014e8) at external/boringssl/src/crypto/bio/bio.c:103
#9  0x00005555566c610c in asio::ssl::detail::engine::~engine (this=0x555558680e70, __in_chrg=<optimized out>) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/ssl/detail/impl/engine.ipp:66
#10 asio::ssl::detail::stream_core::~stream_core (this=0x555558680e70, __in_chrg=<optimized out>) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/ssl/detail/stream_core.hpp:54
#11 0x00005555566c621a in asio::ssl::stream<asio::basic_stream_socket<asio::ip::tcp>&>::~stream (this=0x555558680e68, __in_chrg=<optimized out>) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/ssl/stream.hpp:120
#12 nuraft::asio_rpc_client::~asio_rpc_client (this=0x555558680e10, __in_chrg=<optimized out>) at /home/azureuser/barrel/thirdparty/NuRaft/src/asio_service.cxx:854
#13 0x0000555556714b3e in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555558680e00) at /usr/include/c++/7/bits/shared_ptr_base.h:154
#14 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:684
#15 std::__shared_ptr<nuraft::rpc_client, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#16 std::__shared_ptr<nuraft::rpc_client, (__gnu_cxx::_Lock_policy)2>::operator= (__r=..., this=0x5555586adc50) at /usr/include/c++/7/bits/shared_ptr_base.h:1213
#17 std::shared_ptr<nuraft::rpc_client>::operator= (__r=..., this=0x5555586adc50) at /usr/include/c++/7/bits/shared_ptr.h:319
#18 nuraft::peer::recreate_rpc (this=0x5555586adc30, config=std::shared_ptr<nuraft::srv_config> (use count 3, weak count 0) = {...}, ctx=...) at /home/azureuser/barrel/thirdparty/NuRaft/src/peer.cxx:205
#19 0x000055555670eb85 in nuraft::raft_server::request_prevote (this=this@entry=0x55555872b490) at /home/azureuser/barrel/thirdparty/NuRaft/src/handle_vote.cxx:80
#20 0x000055555670a4db in nuraft::raft_server::handle_election_timeout (this=0x55555872b490) at /home/azureuser/barrel/thirdparty/NuRaft/src/handle_timeout.cxx:288
#21 0x00005555566c1034 in std::__invoke_impl<void, void (*&)(std::shared_ptr<nuraft::delayed_task>&, std::error_code), std::shared_ptr<nuraft::delayed_task>&, std::error_code const&> (__f=<optimized out>) at /usr/include/c++/7/bits/invoke.h:60
#22 std::__invoke<void (*&)(std::shared_ptr<nuraft::delayed_task>&, std::error_code), std::shared_ptr<nuraft::delayed_task>&, std::error_code const&> (__fn=@0x7fff74abcef0: 0x5555566b3990 <_timer_handler_(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>) at /usr/include/c++/7/bits/invoke.h:95
#23 std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>::__call<void, std::error_code const&, 0ul, 1ul>(std::tuple<std::error_code const&>&&, std::_Index_tuple<0ul, 1ul>) (__args=..., this=0x7fff74abcef0) at /usr/include/c++/7/functional:467
#24 std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>::operator()<std::error_code const&, void>(std::error_code const&) (this=0x7fff74abcef0) at /usr/include/c++/7/functional:551
#25 asio::detail::binder1<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, std::error_code>::operator()() (this=0x7fff74abcef0) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/detail/bind_handler.hpp:64
#26 asio::asio_handler_invoke<asio::detail::binder1<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, std::error_code> >(asio::detail::binder1<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, std::error_code>&, ...) (function=...) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/handler_invoke_hook.hpp:68
#27 asio_handler_invoke_helpers::invoke<asio::detail::binder1<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, std::error_code>, std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)> >(asio::detail::binder1<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, std::error_code>&, std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>&) (context=..., function=...) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/detail/handler_invoke_helpers.hpp:37
#28 asio::detail::handler_work<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, asio::system_executor>::complete<asio::detail::binder1<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, std::error_code> >(asio::detail::binder1<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>, std::error_code>&, std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)>&) (this=<synthetic pointer>, handler=..., function=...) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/detail/handler_work.hpp:81
#29 asio::detail::wait_handler<std::_Bind<void (*(std::shared_ptr<nuraft::delayed_task>, std::_Placeholder<1>))(std::shared_ptr<nuraft::delayed_task>&, std::error_code)> >::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) (owner=0x55555860be90, base=0x7fff080133a0) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/detail/wait_handler.hpp:71
#30 0x00005555566bed25 in asio::detail::scheduler_operation::complete (bytes_transferred=<optimized out>, ec=..., owner=0x55555860be90, this=<optimized out>) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/detail/scheduler_operation.hpp:39
#31 asio::detail::scheduler::do_run_one (ec=..., this_thread=..., lock=..., this=0x55555860be90) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/detail/impl/scheduler.ipp:400
#32 asio::detail::scheduler::run (this=0x55555860be90, ec=...) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/detail/impl/scheduler.ipp:153
#33 0x00005555566b3dc5 in asio::io_context::run (this=0x55555860bc20) at /home/azureuser/barrel/thirdparty/NuRaft/asio/asio/include/asio/impl/io_context.ipp:61
#34 nuraft::asio_service_impl::worker_entry (this=0x55555860bc20) at /home/azureuser/barrel/thirdparty/NuRaft/src/asio_service.cxx:1563
#35 0x00007ffff74e26df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#36 0x00007ffff7bbb6db in start_thread (arg=0x7fff74ae8700) at pthread_create.c:463
#37 0x00007ffff5e2d71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 7
#7  0x000055555657e70d in bio_free (bio=0x7fff0c0014e8) at external/boringssl/src/crypto/bio/pair.c:144
144     external/boringssl/src/crypto/bio/pair.c: No such file or directory.
(gdb) p *bio
$7 = {method = 0x555556f911e0 <methods_biop>, init = 0, shutdown = 1, flags = 0, retry_reason = 0, num = 0, references = 0, ptr = 0x7fff480018f8, next_bio = 0x0, num_read = 0, num_write = 0}
(gdb) frame 6
#6  0x00005555565fa90e in OPENSSL_free (orig_ptr=0x7fff480018f8) at external/boringssl/src/crypto/mem.c:154
154     external/boringssl/src/crypto/mem.c: No such file or directory.
(gdb) p orig_ptr
$8 = (void *) 0x7fff480018f8
(gdb) p (struct bio_bio_st *)orig_ptr
$9 = (struct bio_bio_st *) 0x7fff480018f8
(gdb) p *(struct bio_bio_st *)orig_ptr
$10 = {peer = 0x0, closed = 0, len = 0, offset = 0, size = 0, buf = 0x0, request = 0}
(gdb) info local
ptr = 0x7fff480018f0
size = 56
(gdb) p *(struct bio_bio_st *)0x7fff480018f0
$11 = {peer = 0x0, closed = 0, len = 0, offset = 0, size = 0, buf = 0x0, request = 0}
@greensky00
Contributor

Hi @kishorekrd

Looks like you are using BoringSSL. Do you see the same issue even with OpenSSL?

@kishorekrd
Author

Yes, I have seen 3 more different crashes in BoringSSL while running NuRaft. I did not change any NuRaft configuration; I built NuRaft with the default configuration. I don't see BoringSSL installed on my system. Here is the list of SSL packages:

$ sudo dpkg-query -l | grep -i ssl
ii  libcurl4:amd64                         7.58.0-2ubuntu3.12                          amd64        easy-to-use client-side URL transfer library (OpenSSL flavour)
ii  libcurl4-openssl-dev:amd64             7.58.0-2ubuntu3.12                          amd64        development files and documentation for libcurl (OpenSSL flavour)
ii  libflac8:amd64                         1.3.2-1                                     amd64        Free Lossless Audio Codec - runtime C library
ii  libio-socket-ssl-perl                  2.060-3~ubuntu18.04.1                       all          Perl module implementing object oriented interface to SSL sockets
ii  libnet-smtp-ssl-perl                   1.04-1                                      all          Perl module providing SSL support to Net::SMTP
ii  libnet-ssleay-perl                     1.84-1ubuntu0.2                             amd64        Perl module for Secure Sockets Layer (SSL)
ii  libssl-dev:amd64                       1.1.1-1ubuntu2.1~18.04.9                    amd64        Secure Sockets Layer toolkit - development files
ii  libssl1.0.0:amd64                      1.0.2n-1ubuntu5.6                           amd64        Secure Sockets Layer toolkit - shared libraries
ii  libssl1.1:amd64                        1.1.1-1ubuntu2.1~18.04.9                    amd64        Secure Sockets Layer toolkit - shared libraries
ii  libwavpack1:amd64                      5.1.0-2ubuntu1.5                            amd64        audio codec (lossy and lossless) - library
ii  libxmlsec1-openssl:amd64               1.2.25-1build1                              amd64        Openssl engine for the XML security library
ii  libzstd1:amd64                         1.3.3+dfsg-2ubuntu1.2                       amd64        fast lossless compression algorithm
**ii  openssl                                1.1.1-1ubuntu2.1~18.04.9                    amd64        Secure Sockets Layer toolkit - cryptographic utility**
ii  perl-openssl-defaults:amd64            3build1                                     amd64        version compatibility baseline for Perl OpenSSL packages
ii  python3-certifi                        2018.1.18-2                                 all          root certificates for validating SSL certs and verifying TLS hosts (python3)
ii  python3-openssl                        17.5.0-1ubuntu1                             all          Python 3 wrapper around the OpenSSL library
ii  python3-service-identity               16.0.0-2                                    all          Service identity verification for pyOpenSSL (Python 3 module)

@greensky00
Contributor

greensky00 commented Mar 30, 2021

I think it is compiled with a custom library, not from the system path.

OPENSSL_free (orig_ptr=0x7fff480018f8) at external/boringssl/src/crypto/mem.c:154

What CMake message do you get for the SSL library path (when you run cmake ..)? It should look like this if it uses the system library:

-- Open SSL library path: /usr/lib/x86_64-linux-gnu/libssl.a

If it does not, you can manually add the path /usr/lib/x86_64-linux-gnu to CMakeLists.txt so that it is scanned first:

    find_path(OPENSSL_LIBRARY_PATH
              NAMES libssl.a
              PATHS /usr/lib/x86_64-linux-gnu
                    ${PROJECT_SOURCE_DIR}
                    ${DEPS_PREFIX}/lib
                    ${DEPS_PREFIX}/lib64
                    /usr/local/opt/openssl/lib
                    ${LIB_PATH_HINT})

@kishorekrd
Author

kishorekrd commented Mar 30, 2021

Yes, I see it

-- deps prefix is not given
-- Open SSL library path: /usr/lib/x86_64-linux-gnu/libssl.a
-- Output library file name: libnuraft.a
-- Configuring done
-- Generating done

Even though NuRaft is built with OpenSSL, it is somehow using BoringSSL. Is there any way I can force NuRaft to use OpenSSL?

@greensky00
Contributor

greensky00 commented Mar 30, 2021

From what you shared, I think the following is the case:

  • Your NuRaft is built with OpenSSL (NuRaft is a static library, so it does not itself contain the OpenSSL objects).
  • But your executable binary (which uses NuRaft) is linked with BoringSSL.

To avoid such wrong linkage, you may need to

  • Build NuRaft with BoringSSL, or
  • Use OpenSSL for your executable binary.

@kishorekrd
Author

Thanks for the response. If I don't want to use SSL right now, can I use mock_ssl by disabling SSL?
cmake -DDISABLE_SSL=1 ../

@greensky00
Contributor

Yes, you can. But does this happen even when you don't set the enable_ssl_ option below?

asio_service::options asio_opt;
asio_opt.enable_ssl_ = true;

https://github.com/eBay/NuRaft/blob/master/docs/enabling_ssl.md

@kishorekrd
Author

Yes, that's correct. I have the following settings:

asio_service::options asio_opt;
asio_opt.enable_ssl_ = false;

What is the default behavior?

@greensky00
Contributor

The default is false. I just wonder whether this issue happens with the option set to false.

@kishorekrd
Author

Even with false, the issue still happens. I think the main issue is mixing OpenSSL and BoringSSL.
As you mentioned:

To avoid such wrong linkage, you may need to

  • Build NuRaft with BoringSSL, or
  • Use OpenSSL for your executable binary.

@kishorekrd
Author

Hi, even though I disabled SSL for the NuRaft build, I am still seeing asio run some SSL code.

Jun 13 20:39:27 -- Build type is not given, use default.
Jun 13 20:39:27 -- Build type: RelWithDebInfo
Jun 13 20:39:27 -- Build Install Prefix : /home/user/thirdparty/NuRaft/install
Jun 13 20:39:27 -- ASIO include path: /home/azureuser/work2/barrel/thirdparty/NuRaft/asio/asio/include
Jun 13 20:39:27 -- deps prefix is not given
Jun 13 20:39:27 -- ---- DISABLED SSL ----
Jun 13 20:39:27 -- Output library file name: libnuraft.a

How can I disable SSL completely for both NuRaft and asio?

How can I build NuRaft with BoringSSL?
I see the flag OPENSSL_IS_BORINGSSL in the asio code. How do I use this flag?

@greensky00
Contributor

@kishorekrd
If the -DDISABLE_SSL option is on, NuRaft does not even include the SSL library (LIBSSL must be empty):

NuRaft/CMakeLists.txt

Lines 130 to 139 in 714db11

    if (NOT OPENSSL_LIBRARY_PATH)
        message(STATUS "Use system's side OpenSSL library")
        set(LIBSSL ssl)
        set(LIBCRYPTO crypto)
    else ()
        message(STATUS "Open SSL library path: ${OPENSSL_LIBRARY_PATH}/libssl.a")
        set(LIBSSL ${OPENSSL_LIBRARY_PATH}/libssl.a)
        set(LIBCRYPTO ${OPENSSL_LIBRARY_PATH}/libcrypto.a)
    endif ()
endif ()

You can check symbols in libnuraft.a:

$ nm libnuraft.a -C | grep ssl

If it is compiled correctly, the only thing you should see is mock_ssl.
