
[BUG] coredump in 3.4.6-1 #9301

Closed

daoxian opened this issue Jun 20, 2019 · 3 comments



daoxian commented Jun 20, 2019

ArangoDB 3.4.6-1, RocksDB storage engine, CentOS 7.4, single-server mode.
coredump info:

Failed to read a valid object file image from memory. Core was generated by `/usr/sbin/arangod -c /home/admin/arangodb/data_3.4.5_singlemode_partition/singl'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000001f36d89 in je_large_dalloc ()
Missing separate debuginfos, use: debuginfo-install arangodb3-3.4.6-1.1.x86_64
(gdb) bt
#0 0x0000000001f36d89 in je_large_dalloc ()
#1 0x00000000010b6ef3 in void std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > >::_M_range_insert<__gnu_cxx::__normal_iterator<arangodb::pregel::SenderMessage<double> const*, std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > > > >(__gnu_cxx::__normal_iterator<arangodb::pregel::SenderMessage<double>*, std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > > >, __gnu_cxx::__normal_iterator<arangodb::pregel::SenderMessage<double> const*, std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > > >, __gnu_cxx::__normal_iterator<arangodb::pregel::SenderMessage<double> const*, std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > > >, std::forward_iterator_tag) ()
#2 0x00000000010b7289 in arangodb::pregel::ArrayInCache<arangodb::pregel::SenderMessage<double> >::mergeCache(arangodb::pregel::WorkerConfig const&, arangodb::pregel::InCache<arangodb::pregel::SenderMessage<double> > const*) ()
#3 0x0000000000ea0eb7 in arangodb::pregel::Worker<arangodb::pregel::HITSValue, signed char, arangodb::pregel::SenderMessage<double> >::_processVertices(unsigned long, arangodb::pregel::RangeIterator<arangodb::pregel::VertexEntry>&) ()
#4 0x0000000000eb1710 in arangodb::pregel::Worker<arangodb::pregel::HITSValue, signed char, arangodb::pregel::SenderMessage<double> >::_startProcessing()::{lambda(bool)#1}::operator()(bool) const ()
#5 0x00000000008bdc5c in asio::detail::completion_handler<arangodb::rest::Scheduler::post(std::function<void (bool)>, bool)::{lambda()#2}>::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) ()
#6 0x00000000008c5e43 in arangodb::SchedulerThread::run() ()
#7 0x0000000001204499 in arangodb::Thread::startThread(void*) ()
#8 0x00000000012c4559 in ThreadStarter(void*) ()
#9 0x00000000027d1a08 in start ()
#10 0x0000000000000000 in ?? ()

daoxian changed the title from "[BUG] coredump in 3.5.6-1" to "[BUG] coredump in 3.4.6-1" on Jun 20, 2019

daoxian commented Jun 20, 2019

The same backtrace with debuginfo symbols installed:

#0  atomic_load_p (mo=atomic_memory_order_acquire, a=<optimized out>) at include/jemalloc/internal/atomic.h:55
#1  extent_arena_get (extent=0x0) at include/jemalloc/internal/extent_inlines.h:49
#2  je_large_dalloc (tsdn=0x7f5ef2ff2f98, extent=0x0) at src/large.c:347
#3  0x00000000010b6ef3 in deallocate (this=0x7f7c4f5b2138, __p=<optimized out>) at /usr/include/c++/6.4.0/ext/new_allocator.h:110
#4  deallocate (__a=..., __n=<optimized out>, __p=<optimized out>) at /usr/include/c++/6.4.0/bits/alloc_traits.h:462
#5  _M_deallocate (this=0x7f7c4f5b2138, __n=<optimized out>, __p=<optimized out>) at /usr/include/c++/6.4.0/bits/stl_vector.h:178
#6  std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > >::_M_range_insert<__gnu_cxx::__normal_iterator<arangodb::pregel::SenderMessage<double> const*, std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > > > > (this=0x7f7c4f5b2138, __position=...,
    __first=..., __last=...) at /usr/include/c++/6.4.0/bits/vector.tcc:685
#7  0x00000000010b7289 in _M_insert_dispatch<__gnu_cxx::__normal_iterator<arangodb::pregel::SenderMessage<double> const*, std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > > > > (__last=..., __first=..., __pos=..., this=<optimized out>) at /usr/include/c++/6.4.0/bits/stl_vector.h:1375
#8  insert<__gnu_cxx::__normal_iterator<arangodb::pregel::SenderMessage<double> const*, std::vector<arangodb::pregel::SenderMessage<double>, std::allocator<arangodb::pregel::SenderMessage<double> > > > > (__last=..., __first=..., __position=..., this=<optimized out>) at /usr/include/c++/6.4.0/bits/stl_vector.h:1100
#9  arangodb::pregel::ArrayInCache<arangodb::pregel::SenderMessage<double> >::mergeCache (this=0x7fd38d6f1a00, config=..., otherCache=0x7fd38d6f1f80)
    at /work/ArangoDB/arangod/Pregel/IncomingCache.cpp:151
#10 0x0000000000ea0eb7 in arangodb::pregel::Worker<arangodb::pregel::HITSValue, signed char, arangodb::pregel::SenderMessage<double> >::_processVertices (this=0x7f4b5be54310,
    threadId=<optimized out>, vertexIterator=...) at /work/ArangoDB/arangod/Pregel/Worker.cpp:418
#11 0x0000000000eb1710 in arangodb::pregel::Worker<arangodb::pregel::HITSValue, signed char, arangodb::pregel::SenderMessage<double> >::_startProcessing()::{lambda(bool)#1}::operator()(bool) const (__closure=0x7feb5d9376e0) at /work/ArangoDB/arangod/Pregel/Worker.cpp:335
#12 0x00000000008bdc5c in operator() (__args#0=<optimized out>, this=0x7f5ef2ff1e48) at /usr/include/c++/6.4.0/functional:2127
#13 operator() (__closure=0x7f5ef2ff1e40) at /work/ArangoDB/arangod/Scheduler/Scheduler.cpp:327
#14 asio_handler_invoke<arangodb::rest::Scheduler::post(std::function<void(bool)>, bool)::<lambda()> > (function=...)
    at /work/ArangoDB/3rdParty/asio/1.12/include/asio/handler_invoke_hook.hpp:68
#15 invoke<arangodb::rest::Scheduler::post(std::function<void(bool)>, bool)::<lambda()>, arangodb::rest::Scheduler::post(std::function<void(bool)>, bool)::<lambda()> > (context=...,
    function=...) at /work/ArangoDB/3rdParty/asio/1.12/include/asio/detail/handler_invoke_helpers.hpp:37
#16 complete<arangodb::rest::Scheduler::post(std::function<void(bool)>, bool)::<lambda()> > (this=<synthetic pointer>, handler=..., function=...)
    at /work/ArangoDB/3rdParty/asio/1.12/include/asio/detail/handler_work.hpp:81
#17 asio::detail::completion_handler<arangodb::rest::Scheduler::post(std::function<void(bool)>, bool)::<lambda()> >::do_complete(void *, asio::detail::operation *, const asio::error_code &, std::size_t) (owner=<optimized out>, base=<optimized out>) at /work/ArangoDB/3rdParty/asio/1.12/include/asio/detail/completion_handler.hpp:69
#18 0x00000000008c5e43 in complete (bytes_transferred=0, ec=..., owner=0x7ff9fb07ae00, this=0x7fd0a72ddb40)
    at /work/ArangoDB/3rdParty/asio/1.12/include/asio/detail/scheduler_operation.hpp:39
#19 do_run_one (ec=..., this_thread=..., lock=..., this=<optimized out>) at /work/ArangoDB/3rdParty/asio/1.12/include/asio/detail/impl/scheduler.ipp:400
#20 run_one (ec=..., this=0x7ff9fb07ae00) at /work/ArangoDB/3rdParty/asio/1.12/include/asio/detail/impl/scheduler.ipp:174
#21 run_one (this=<optimized out>) at /work/ArangoDB/3rdParty/asio/1.12/include/asio/impl/io_context.ipp:76
#22 arangodb::SchedulerThread::run (this=0x7ff2103645b0) at /work/ArangoDB/arangod/Scheduler/Scheduler.cpp:212
#23 0x0000000001204499 in runMe (this=0x7ff2103645b0) at /work/ArangoDB/lib/Basics/Thread.cpp:347
#24 arangodb::Thread::startThread (arg=0x7ff2103645b0) at /work/ArangoDB/lib/Basics/Thread.cpp:82
#25 0x00000000012c4559 in ThreadStarter (data=0x7ff9c78e8240) at /work/ArangoDB/lib/Basics/threads-posix.cpp:71
#26 0x00000000027d1a08 in start (p=0x7f5ef2ff4ae0) at src/thread/pthread_create.c:150
#27 0x00000000027d2fe9 in __clone () at src/thread/x86_64/clone.s:21
Backtrace stopped: frame did not save the PC
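Note the extent=0x0 argument in frame #2: jemalloc's extent lookup found no live extent for the pointer being freed, which is the classic signature of a double free or of freeing a stale pointer. A minimal, hypothetical C++ illustration of that failure class (not ArangoDB code):

```cpp
#include <cstdlib>

int main() {
    // A 1 MiB allocation goes through jemalloc's "large" size class, i.e.
    // the same deallocation path as je_large_dalloc in frame #2 above
    // (assuming the binary is linked against jemalloc, as arangod is).
    void* p = std::malloc(1 << 20);
    std::free(p);
    // Double free: the extent metadata for p is gone, so the deallocation
    // path dereferences a stale/NULL extent and crashes.
    std::free(p);
    return 0;
}
```

In this issue nothing frees twice so literally; the equivalent effect can fall out of two threads racing on the same std::vector buffer, as described in the next comment.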


mpoeter (Member) commented Jun 21, 2019

To me this looks very much like a race on ArrayInCache::_shardMap (or more specifically on the entries of that map).
BTW - this backtrace looks identical to this one: #9302 (comment)
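To make the suspected race concrete, here is a minimal standalone sketch (assumed code, not the actual ArangoDB implementation) of two threads inserting into the same vector-valued map entry without synchronization, which is the pattern mergeCache would hit if _shardMap entries are shared between worker threads:

```cpp
#include <thread>
#include <unordered_map>
#include <vector>

// Stand-in for pregel::SenderMessage<double>; the real type lives in ArangoDB.
struct SenderMessage { double value = 0.0; };

// Stand-in for ArrayInCache::_shardMap: shard id -> message buffer.
std::unordered_map<int, std::vector<SenderMessage>> shardMap;

// Racy merge: vector::insert may reallocate and free the old buffer while
// another thread is inserting into the same vector, so one thread can free
// a buffer the other is still using (or free it twice) -- which then blows
// up inside the allocator, as in the je_large_dalloc frame above.
void mergeCacheUnsynchronized(int shard, const std::vector<SenderMessage>& other) {
    auto& entry = shardMap.find(shard)->second;  // entry exists; lookup itself is read-only
    entry.insert(entry.end(), other.begin(), other.end());  // the race is here
}

int main() {
    shardMap.emplace(0, std::vector<SenderMessage>{});
    std::vector<SenderMessage> batch(1024);
    std::thread t1([&] { for (int i = 0; i < 1000; ++i) mergeCacheUnsynchronized(0, batch); });
    std::thread t2([&] { for (int i = 0; i < 1000; ++i) mergeCacheUnsynchronized(0, batch); });
    t1.join();
    t2.join();  // undefined behavior: typically crashes in the allocator's free path
    return 0;
}
```

The usual fix for this pattern would be a mutex per shard entry (or merging under the cache's existing lock); whether that is the right fix here depends on the actual locking scheme in IncomingCache.cpp.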

maxkernbach (Contributor) commented

Since the same core dump has been provided in the issue referenced above, I am closing this one as a duplicate; further discussion can continue in #9302.
