TiFlash crash in failpoint test #7678

@lilinghai

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

[2023/06/19 06:54:16.425 +08:00] [ERROR] [BaseDaemon.cpp:376] [########################################] [source=BaseDaemon] [thread_id=1585]
[2023/06/19 06:54:16.425 +08:00] [ERROR] [BaseDaemon.cpp:377] ["(from thread 263) Received signal Segmentation fault(11)."] [source=BaseDaemon] [thread_id=1585]
[2023/06/19 06:54:16.425 +08:00] [ERROR] [BaseDaemon.cpp:405] ["Address: NULL pointer."] [source=BaseDaemon] [thread_id=1585]
[2023/06/19 06:54:16.426 +08:00] [ERROR] [BaseDaemon.cpp:413] ["Access: read."] [source=BaseDaemon] [thread_id=1585]
[2023/06/19 06:54:16.426 +08:00] [ERROR] [BaseDaemon.cpp:425] ["Unknown si_code."] [source=BaseDaemon] [thread_id=1585]
[2023/06/19 06:54:16.438 +08:00] [WARN] [ExchangeReceiver.cpp:1004] ["connection end. meet error: true, err msg: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,, current alive connections: 2"] [source="MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:16> ExchangeReceiver_69 tunnel9+16"] [thread_id=1209]
[2023/06/19 06:54:16.438 +08:00] [WARN] [ExchangeReceiver.cpp:1004] ["connection end. meet error: true, err msg: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,, current alive connections: 2"] [source="MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:16> ExchangeReceiver_75 tunnel14+16"] [thread_id=240]
[2023/06/19 06:54:16.451 +08:00] [WARN] [ExchangeReceiver.cpp:1023] ["Finish receiver channels, meet error: true, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:16> ExchangeReceiver_69"] [thread_id=1209]
[2023/06/19 06:54:16.453 +08:00] [WARN] [ExchangeReceiver.cpp:1023] ["Finish receiver channels, meet error: true, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:16> ExchangeReceiver_75"] [thread_id=240]
[2023/06/19 06:54:16.491 +08:00] [WARN] [ExchangeReceiver.cpp:1004] ["connection end. meet error: true, err msg: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,, current alive connections: 2"] [source="MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:19> ExchangeReceiver_139 tunnel15+19"] [thread_id=941]
[2023/06/19 06:54:16.492 +08:00] [WARN] [ExchangeReceiver.cpp:1023] ["Finish receiver channels, meet error: true, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:19> ExchangeReceiver_139"] [thread_id=941]
[2023/06/19 06:54:16.497 +08:00] [WARN] [TiRemoteBlockInputStream.h:74] ["remote reader meets error: Receiver state: ERROR, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="TiRemote(ExchangeReceiver) MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:19> ExchangeReceiver ExchangeReceiver_139 ExchangeReceiver_139"] [thread_id=285]
[2023/06/19 06:54:16.497 +08:00] [WARN] [TiRemoteBlockInputStream.h:74] ["remote reader meets error: Receiver state: ERROR, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="TiRemote(ExchangeReceiver) MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:19> ExchangeReceiver ExchangeReceiver_139 ExchangeReceiver_139"] [thread_id=324]
[2023/06/19 06:54:16.498 +08:00] [WARN] [TiRemoteBlockInputStream.h:74] ["remote reader meets error: Receiver state: ERROR, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="TiRemote(ExchangeReceiver) MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:19> ExchangeReceiver ExchangeReceiver_139 ExchangeReceiver_139"] [thread_id=745]
[2023/06/19 06:54:16.498 +08:00] [WARN] [TiRemoteBlockInputStream.h:74] ["remote reader meets error: Receiver state: ERROR, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="TiRemote(ExchangeReceiver) MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:19> ExchangeReceiver ExchangeReceiver_139 ExchangeReceiver_139"] [thread_id=330]
[2023/06/19 06:54:16.498 +08:00] [WARN] [TiRemoteBlockInputStream.h:74] ["remote reader meets error: Receiver state: ERROR, error message: Exchange receiver meet error : From MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:14>: Code: 0, e.displayText() = DB::Exception: write to tunnel tunnel14+15 which is already closed, , e.what() = DB::Exception,"] [source="TiRemote(ExchangeReceiver) MPP<query:<query_ts:1687128853723203110, local_query_id:9879, server_id:1736808, start_ts:442270706204147713>,task_id:19> ExchangeReceiver ExchangeReceiver_139 ExchangeReceiver_139"] [thread_id=343]
[2023/06/19 06:54:16.499 +08:00] [ERROR] [BaseDaemon.cpp:569] ["\n       0xad20b5d\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+181537629]\n                \tlibs/libdaemon/src/BaseDaemon.cpp:220\n  0x7f65e1cb2d90\t<unknown symbol> [libc.so.6+347536]\n       0x42b0989\tlong std::__1::__cxx_atomic_fetch_sub<long>(std::__1::__cxx_atomic_base_impl<long>*, long, std::__1::memory_order) [tiflash+69929353]\n                \t/usr/local/bin/../include/c++/v1/atomic:1082\n       0x42b06f3\tstd::__1::__atomic_base<long, true>::fetch_sub(long, std::__1::memory_order) [tiflash+69928691]\n                \t/usr/local/bin/../include/c++/v1/atomic:1736\n       0x42be7a7\tMemoryTracker::free(long) [tiflash+69986215]\n                \tdbms/src/Common/MemoryTracker.cpp:182\n       0x42be811\tMemoryTracker::free(long) [tiflash+69986321]\n                \tdbms/src/Common/MemoryTracker.cpp:197\n       0xbd6dfc9\tDB::MemTrackerWrapper::free(unsigned long) [tiflash+198631369]\n                \tdbms/src/Flash/Mpp/TrackedMppDataPacket.h:81\n       0xbd6df29\tDB::MemTrackerWrapper::freeAll() [tiflash+198631209]\n                \tdbms/src/Flash/Mpp/TrackedMppDataPacket.h:104\n       0x41ddfc5\tDB::MemTrackerWrapper::~MemTrackerWrapper() [tiflash+69066693]\n                \tdbms/src/Flash/Mpp/TrackedMppDataPacket.h:99\n       0x41e489e\tDB::TrackedMppDataPacket::~TrackedMppDataPacket() [tiflash+69093534]\n                \tdbms/src/Flash/Mpp/TrackedMppDataPacket.h:111\n       0x41e485b\tvoid std::__1::destroy_at<DB::TrackedMppDataPacket>(DB::TrackedMppDataPacket*) [tiflash+69093467]\n                \t/usr/local/bin/../include/c++/v1/__memory/construct_at.h:50\n       0x41e47f9\tvoid std::__1::allocator_traits<std::__1::allocator<DB::TrackedMppDataPacket> >::destroy<DB::TrackedMppDataPacket, void, void>(std::__1::allocator<DB::TrackedMppDataPacket>&, DB::TrackedMppDataPacket*) [tiflash+69093369]\n                
\t/usr/local/bin/../include/c++/v1/__memory/allocator_traits.h:317\n       0x41e471d\tstd::__1::__shared_ptr_emplace<DB::TrackedMppDataPacket, std::__1::allocator<DB::TrackedMppDataPacket> >::__on_zero_shared() [tiflash+69093149]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:313\n       0x42adf3c\tstd::__1::__shared_count::__release_shared() [tiflash+69918524]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:177\n       0x42adef6\tstd::__1::__shared_weak_count::__release_shared() [tiflash+69918454]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:219\n       0xbd6a2b0\tstd::__1::shared_ptr<DB::TrackedMppDataPacket>::~shared_ptr() [tiflash+198615728]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:959\n       0xbfe8d4e\tDB::ReceivedMessage::~ReceivedMessage() [tiflash+201231694]\n                \tdbms/src/Flash/Mpp/ReceivedMessage.h:26\n       0xbfe8cfb\tvoid std::__1::destroy_at<DB::ReceivedMessage>(DB::ReceivedMessage*) [tiflash+201231611]\n                \t/usr/local/bin/../include/c++/v1/__memory/construct_at.h:50\n       0xbfe8c99\tvoid std::__1::allocator_traits<std::__1::allocator<DB::ReceivedMessage> >::destroy<DB::ReceivedMessage, void, void>(std::__1::allocator<DB::ReceivedMessage>&, DB::ReceivedMessage*) [tiflash+201231513]\n                \t/usr/local/bin/../include/c++/v1/__memory/allocator_traits.h:317\n       0xbfe8a7d\tstd::__1::__shared_ptr_emplace<DB::ReceivedMessage, std::__1::allocator<DB::ReceivedMessage> >::__on_zero_shared() [tiflash+201230973]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:313\n       0x42adf3c\tstd::__1::__shared_count::__release_shared() [tiflash+69918524]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:177\n       0x42adef6\tstd::__1::__shared_weak_count::__release_shared() [tiflash+69918454]\n                
\t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:219\n       0xbe99c60\tstd::__1::shared_ptr<DB::ReceivedMessage>::~shared_ptr() [tiflash+199859296]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:959\n       0xbfe0c95\tDB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage::~DataWithMemoryUsage() [tiflash+201198741]\n                \tdbms/src/Common/LooseBoundedMPMCQueue.h:258\n       0xbfe0c6b\tvoid std::__1::destroy_at<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage>(DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage*) [tiflash+201198699]\n                \t/usr/local/bin/../include/c++/v1/__memory/construct_at.h:50\n       0xbfe0b39\tvoid std::__1::allocator_traits<std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage> >::destroy<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage, void, void>(std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage>&, DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage*) [tiflash+201198393]\n                \t/usr/local/bin/../include/c++/v1/__memory/allocator_traits.h:317\n       0xbfe09b4\tstd::__1::__deque_base<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage, std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage> >::clear() [tiflash+201198004]\n                \t/usr/local/bin/../include/c++/v1/deque:1253\n       0xbfe08d9\tstd::__1::__deque_base<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage, std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage> >::~__deque_base() [tiflash+201197785]\n    
            \t/usr/local/bin/../include/c++/v1/deque:1190\n       0xbfdf215\tstd::__1::deque<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage, std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::DataWithMemoryUsage> >::~deque() [tiflash+201191957]\n                \t/usr/local/bin/../include/c++/v1/deque:1272\n       0xbfe0e62\tDB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >::~LooseBoundedMPMCQueue() [tiflash+201199202]\n                \tdbms/src/Common/LooseBoundedMPMCQueue.h:31\n       0xbfe0deb\tvoid std::__1::destroy_at<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> > >(DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >*) [tiflash+201199083]\n                \t/usr/local/bin/../include/c++/v1/__memory/construct_at.h:50\n       0xbfe0d89\tvoid std::__1::allocator_traits<std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> > > >::destroy<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >, void, void>(std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> > >&, DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >*) [tiflash+201198985]\n                \t/usr/local/bin/../include/c++/v1/__memory/allocator_traits.h:317\n       0xbfdda4d\tstd::__1::__shared_ptr_emplace<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> >, std::__1::allocator<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> > > >::__on_zero_shared() [tiflash+201185869]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:313\n       0x42adf3c\tstd::__1::__shared_count::__release_shared() [tiflash+69918524]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:177\n       0x42adef6\tstd::__1::__shared_weak_count::__release_shared() [tiflash+69918454]\n                
\t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:219\n       0x4200290\tstd::__1::shared_ptr<DB::LooseBoundedMPMCQueue<std::__1::shared_ptr<DB::ReceivedMessage> > >::~shared_ptr() [tiflash+69206672]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:959\n       0x4200222\tDB::ReceivedMessageQueue::~ReceivedMessageQueue() [tiflash+69206562]\n                \tdbms/src/Flash/Mpp/ReceivedMessageQueue.h:37\n       0x42001d3\tstd::__1::default_delete<DB::ReceivedMessageQueue>::operator()(DB::ReceivedMessageQueue*) const [tiflash+69206483]\n                \t/usr/local/bin/../include/c++/v1/__memory/unique_ptr.h:57\n       0x4200160\tstd::__1::unique_ptr<DB::ReceivedMessageQueue, std::__1::default_delete<DB::ReceivedMessageQueue> >::reset(DB::ReceivedMessageQueue*) [tiflash+69206368]\n                \t/usr/local/bin/../include/c++/v1/__memory/unique_ptr.h:318\n       0x41dffc7\tstd::__1::unique_ptr<DB::ReceivedMessageQueue, std::__1::default_delete<DB::ReceivedMessageQueue> >::~unique_ptr() [tiflash+69074887]\n                \t/usr/local/bin/../include/c++/v1/__memory/unique_ptr.h:272\n       0x42017f1\tDB::ExchangeReceiverBase<DB::GRPCReceiverContext>::~ExchangeReceiverBase() [tiflash+69212145]\n                \tdbms/src/Flash/Mpp/ExchangeReceiver.cpp:390\n       0xb1384c5\tDB::ExchangeReceiver::~ExchangeReceiver() [tiflash+185828549]\n                \tdbms/src/Flash/Mpp/ExchangeReceiver.h:251\n       0xb13849b\tvoid std::__1::destroy_at<DB::ExchangeReceiver>(DB::ExchangeReceiver*) [tiflash+185828507]\n                \t/usr/local/bin/../include/c++/v1/__memory/construct_at.h:50\n       0xb138439\tvoid std::__1::allocator_traits<std::__1::allocator<DB::ExchangeReceiver> >::destroy<DB::ExchangeReceiver, void, void>(std::__1::allocator<DB::ExchangeReceiver>&, DB::ExchangeReceiver*) [tiflash+185828409]\n                \t/usr/local/bin/../include/c++/v1/__memory/allocator_traits.h:317\n       
0xb1376bd\tstd::__1::__shared_ptr_emplace<DB::ExchangeReceiver, std::__1::allocator<DB::ExchangeReceiver> >::__on_zero_shared() [tiflash+185824957]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:313\n       0x42adf3c\tstd::__1::__shared_count::__release_shared() [tiflash+69918524]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:177\n       0x42adef6\tstd::__1::__shared_weak_count::__release_shared() [tiflash+69918454]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:219\n       0xb129ae0\tstd::__1::shared_ptr<DB::ExchangeReceiver>::~shared_ptr() [tiflash+185768672]\n                \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:959\n       0xb139577\tDB::TiRemoteBlockInputStream<DB::ExchangeReceiver>::~TiRemoteBlockInputStream() [tiflash+185832823]\n                \tdbms/src/DataStreams/TiRemoteBlockInputStream.h:38\n       0xb13afdb\tvoid std::__1::destroy_at<DB::TiRemoteBlockInputStream<DB::ExchangeReceiver> >(DB::TiRemoteBlockInputStream<DB::ExchangeReceiver>*) [tiflash+185839579]\n                \t/usr/local/bin/../include/c++/v1/__memory/construct_at.h:50"] [source=BaseDaemon] [thread_id=1585]

2. What did you expect to see? (Required)

3. What did you see instead? (Required)

4. What is your TiFlash version? (Required)

[2023/06/19 03:16:24.043 +08:00] [INFO] [client.go:514] ["Cluster version information"] [type=tikv] [version=7.2.0-alpha] [git_hash=0daa38c454c8d518a62f372f5fe679c1e858871e]
[2023/06/19 03:16:24.043 +08:00] [INFO] [client.go:514] ["Cluster version information"] [type=tidb] [version=7.2.0-alpha] [git_hash=5bd56cb5d5597e106a511007d42c93ee4fc20f50]
[2023/06/19 03:16:24.043 +08:00] [INFO] [client.go:514] ["Cluster version information"] [type=pd] [version=7.2.0-alpha] [git_hash=1c5ca17da988a71e7293a8ec2de3d5b13d841fa2]
[2023/06/19 03:16:24.043 +08:00] [INFO] [client.go:514] ["Cluster version information"] [type=tiflash] [version=7.2.0-alpha] [git_hash=7aafea8835af8f8c77f48fb197fec660eb9f850d]
