[DocDB] Data race in accessing queue_state_.current_term leading to failures in tsan runs #13850

Closed
bmatican opened this issue Sep 2, 2022 · 4 comments
Labels: 2.14 Backport Required, area/docdb, kind/bug, kind/failing-test, priority/high

Comments

@bmatican
Contributor

bmatican commented Sep 2, 2022

bmatican added the kind/failing-test, area/docdb, and status/awaiting-triage labels on Sep 2, 2022
yugabyte-ci added the kind/bug and priority/medium labels on Sep 2, 2022
bmatican added the priority/high label and removed the priority/medium label on Sep 2, 2022
yugabyte-ci assigned yusong-yan and unassigned rthallamko3 on Sep 2, 2022
yugabyte-ci removed the status/awaiting-triage label on Sep 2, 2022
@rthallamko3
Contributor

@yusong-yan, Basava recently made changes to the remote bootstrap logic. Can you check the failures, sync with Amit on whether Basava's changes caused them, and address the test failure?

@rthallamko3
Contributor

It looks like the test was added by Basava in commit 739ea8f. @yusong-yan, consult @amitanandaiyer if you have any questions about the intent of the test.

The failures are only in centos-clang12-tsan and alma8-gcc11-fastdebug build types, so it looks like a race condition in the test.

@bmatican
Contributor Author

bmatican commented Sep 8, 2022

It does seem to be a race, and one that TSAN catches: https://jenkins.dev.yugabyte.com/job/github-yugabyte-db-centos-master-clang12-tsan/903/artifact/build/tsan-clang12-dynamic-ninja/yb-test-logs/tests-integration-tests__remote_bootstrap-itest/RemoteBootstrapITest_TestLongRemoteBootstrapsAcrossServers.log

[ts-4] WARNING: ThreadSanitizer: data race (pid=6208)
[ts-4]   Write of size 8 at 0x7b5800160028 by thread T72 (mutexes: write M581381393255692144, write M587573894283919480):
[ts-4]     #0 yb::consensus::PeerMessageQueue::SetLeaderMode(yb::OpId const&, long, yb::OpId const&, yb::consensus::RaftConfigPB const&) ${BUILD_ROOT}/../../src/yb/consensus/consensus_queue.cc:258:29 (libconsensus.so+0x1d26a9)
[ts-4]     #1 yb::consensus::RaftConsensus::RefreshConsensusQueueAndPeersUnlocked() ${BUILD_ROOT}/../../src/yb/consensus/raft_consensus.cc:3092:11 (libconsensus.so+0x250fe1)
[ts-4]     #2 yb::consensus::RaftConsensus::ReplicateConfigChangeUnlocked(std::__1::shared_ptr<yb::consensus::ReplicateMsg> const&, yb::consensus::RaftConfigPB const&, yb::consensus::ChangeConfigType, std::__1::function<void (yb::Status const&)>) ${BUILD_ROOT}/../../src/yb/consensus/raft_consensus.cc:3062:3 (libconsensus.so+0x25cc57)
[ts-4]     #3 yb::consensus::RaftConsensus::ChangeConfig(yb::consensus::ChangeConfigRequestPB const&, std::__1::function<void (yb::Status const&)> const&, boost::optional<yb::tserver::TabletServerErrorPB_Code>*) ${BUILD_ROOT}/../../src/yb/consensus/raft_consensus.cc:2582:5 (libconsensus.so+0x25c1a2)
[ts-4]     #4 yb::tserver::ConsensusServiceImpl::ChangeConfig(yb::consensus::ChangeConfigRequestPB const*, yb::consensus::ChangeConfigResponsePB*, yb::rpc::RpcContext) ${BUILD_ROOT}/../../src/yb/tserver/tablet_service.cc:1952:25 (libtserver.so+0x4f0305)
[ts-4]     #5 yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>) const::'lambda'(yb::consensus::ChangeConfigRequestPB const*, yb::consensus::ChangeConfigResponsePB*, yb::rpc::RpcContext)::operator()(yb::consensus::ChangeConfigRequestPB const*, yb::consensus::ChangeConfigResponsePB*, yb::rpc::RpcContext) const ${BUILD_ROOT}/src/yb/consensus/consensus.service.cc:333:9 (libconsensus_proto.so+0xed6af)
[ts-4]     #6 auto yb::rpc::HandleCall<yb::rpc::RpcCallPBParamsImpl<yb::consensus::ChangeConfigRequestPB, yb::consensus::ChangeConfigResponsePB>, yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>) const::'lambda'(yb::consensus::ChangeConfigRequestPB const*, yb::consensus::ChangeConfigResponsePB*, yb::rpc::RpcContext)>(std::__1::shared_ptr<yb::rpc::InboundCall>, yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>) const::'lambda'(yb::consensus::ChangeConfigRequestPB const*, yb::consensus::ChangeConfigResponsePB*, yb::rpc::RpcContext)) ${BUILD_ROOT}/../../src/yb/rpc/local_call.h:124:7 (libconsensus_proto.so+0xed563)
[ts-4]     #7 yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>) const ${BUILD_ROOT}/src/yb/consensus/consensus.service.cc:331:7 (libconsensus_proto.so+0xed2ef)
[ts-4]     #8 decltype(std::__1::forward<yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3&>(fp)(std::__1::forward<std::__1::shared_ptr<yb::rpc::InboundCall> >(fp0))) std::__1::__invoke<yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3&, std::__1::shared_ptr<yb::rpc::InboundCall> >(yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3&, std::__1::shared_ptr<yb::rpc::InboundCall>&&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/type_traits:3694:1 (libconsensus_proto.so+0xed24b)
[ts-4]     #9 void std::__1::__invoke_void_return_wrapper<void, true>::__call<yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3&, std::__1::shared_ptr<yb::rpc::InboundCall> >(yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3&, std::__1::shared_ptr<yb::rpc::InboundCall>&&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/__functional_base:348:9 (libconsensus_proto.so+0xed1b8)
[ts-4]     #10 std::__1::__function::__alloc_func<yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3, std::__1::allocator<yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3>, void (std::__1::shared_ptr<yb::rpc::InboundCall>)>::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>&&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1558:16 (libconsensus_proto.so+0xed173)
[ts-4]     #11 std::__1::__function::__func<yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3, std::__1::allocator<yb::consensus::ConsensusServiceIf::InitMethods(scoped_refptr<yb::MetricEntity> const&)::$_3>, void (std::__1::shared_ptr<yb::rpc::InboundCall>)>::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>&&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1732:12 (libconsensus_proto.so+0xec40c)
[ts-4]     #12 std::__1::__function::__value_func<void (std::__1::shared_ptr<yb::rpc::InboundCall>)>::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>&&) const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1885:16 (libpg_client_proto.so+0x28a326)
[ts-4]     #13 std::__1::function<void (std::__1::shared_ptr<yb::rpc::InboundCall>)>::operator()(std::__1::shared_ptr<yb::rpc::InboundCall>) const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2560:12 (libpg_client_proto.so+0x285e48)
[ts-4]     #14 yb::consensus::ConsensusServiceIf::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) ${BUILD_ROOT}/src/yb/consensus/consensus.service.cc:271:3 (libconsensus_proto.so+0xe7b6a)
[ts-4]     #15 yb::rpc::ServicePoolImpl::Handle(std::__1::shared_ptr<yb::rpc::InboundCall>) ${BUILD_ROOT}/../../src/yb/rpc/service_pool.cc:270:19 (libyrpc.so+0x3ce79d)
[ts-4]     #16 yb::rpc::InboundCall::InboundCallTask::Run() ${BUILD_ROOT}/../../src/yb/rpc/inbound_call.cc:237:13 (libyrpc.so+0x2ddbd6)
[ts-4]     #17 yb::rpc::(anonymous namespace)::Worker::Execute() ${BUILD_ROOT}/../../src/yb/rpc/thread_pool.cc:104:15 (libyrpc.so+0x3ec29c)
[ts-4]     #18 decltype(*(std::__1::forward<yb::rpc::(anonymous namespace)::Worker*&>(fp0)).*fp()) std::__1::__invoke<void (yb::rpc::(anonymous namespace)::Worker::*&)(), yb::rpc::(anonymous namespace)::Worker*&, void>(void (yb::rpc::(anonymous namespace)::Worker::*&)(), yb::rpc::(anonymous namespace)::Worker*&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/type_traits:3635:1 (libyrpc.so+0x3ed673)
[ts-4]     #19 std::__1::__bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<>, __is_valid_bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<> >::value>::type std::__1::__apply_functor<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, 0ul, std::__1::tuple<> >(void (yb::rpc::(anonymous namespace)::Worker::*&)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>&, std::__1::__tuple_indices<0ul>, std::__1::tuple<>&&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2857:12 (libyrpc.so+0x3ed5e8)
[ts-4]     #20 std::__1::__bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<>, __is_valid_bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<> >::value>::type std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>::operator()<>() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2890:20 (libyrpc.so+0x3ed59d)
[ts-4]     #21 decltype(std::__1::forward<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&>(fp)()) std::__1::__invoke<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&>(std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/type_traits:3694:1 (libyrpc.so+0x3ed559)
[ts-4]     #22 void std::__1::__invoke_void_return_wrapper<void, true>::__call<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&>(std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/__functional_base:348:9 (libyrpc.so+0x3ed4e9)
[ts-4]     #23 std::__1::__function::__alloc_func<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>, std::__1::allocator<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&> >, void ()>::operator()() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1558:16 (libyrpc.so+0x3ed4b1)
[ts-4]     #24 std::__1::__function::__func<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>, std::__1::allocator<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&> >, void ()>::operator()() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1732:12 (libyrpc.so+0x3ec75d)
[ts-4]     #25 std::__1::__function::__value_func<void ()>::operator()() const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1885:16 (libyb-redis.so+0x1ff3a4)
[ts-4]     #26 std::__1::function<void ()>::operator()() const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2560:12 (libyb-redis.so+0x1f45e9)
[ts-4]     #27 yb::Thread::SuperviseThread(void*) ${BUILD_ROOT}/../../src/yb/util/thread.cc:774:3 (libyb_util.so+0x6d4508)
[ts-4] 
[ts-4]   Previous read of size 8 at 0x7b5800160028 by thread T60 (mutexes: write M959120897900741544):
[ts-4]     #0 yb::consensus::PeerMessageQueue::GetRemoteBootstrapRequestForPeer(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, yb::consensus::StartRemoteBootstrapRequestPB*) ${BUILD_ROOT}/../../src/yb/consensus/consensus_queue.cc:866:37 (libconsensus.so+0x1d6270)
[ts-4]     #1 yb::consensus::Peer::SendNextRequest(yb::consensus::RequestTriggerMode) ${BUILD_ROOT}/../../src/yb/consensus/consensus_peers.cc:244:24 (libconsensus.so+0x1b399b)
[ts-4]     #2 yb::consensus::Peer::ProcessResponse() ${BUILD_ROOT}/../../src/yb/consensus/consensus_peers.cc:465:5 (libconsensus.so+0x1b45cf)
[ts-4]     #3 decltype(*(std::__1::forward<std::__1::shared_ptr<yb::consensus::Peer>&>(fp0)).*fp()) std::__1::__invoke<void (yb::consensus::Peer::*&)(), std::__1::shared_ptr<yb::consensus::Peer>&, void>(void (yb::consensus::Peer::*&)(), std::__1::shared_ptr<yb::consensus::Peer>&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/type_traits:3635:1 (libconsensus.so+0x1c91ab)
[ts-4]     #4 std::__1::__bind_return<void (yb::consensus::Peer::*)(), std::__1::tuple<std::__1::shared_ptr<yb::consensus::Peer> >, std::__1::tuple<>, __is_valid_bind_return<void (yb::consensus::Peer::*)(), std::__1::tuple<std::__1::shared_ptr<yb::consensus::Peer> >, std::__1::tuple<> >::value>::type std::__1::__apply_functor<void (yb::consensus::Peer::*)(), std::__1::tuple<std::__1::shared_ptr<yb::consensus::Peer> >, 0ul, std::__1::tuple<> >(void (yb::consensus::Peer::*&)(), std::__1::tuple<std::__1::shared_ptr<yb::consensus::Peer> >&, std::__1::__tuple_indices<0ul>, std::__1::tuple<>&&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2857:12 (libconsensus.so+0x1c9119)
[ts-4]     #5 std::__1::__bind_return<void (yb::consensus::Peer::*)(), std::__1::tuple<std::__1::shared_ptr<yb::consensus::Peer> >, std::__1::tuple<>, __is_valid_bind_return<void (yb::consensus::Peer::*)(), std::__1::tuple<std::__1::shared_ptr<yb::consensus::Peer> >, std::__1::tuple<> >::value>::type std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>::operator()<>() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2890:20 (libconsensus.so+0x1c90c1)
[ts-4]     #6 decltype(std::__1::forward<std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>&>(fp)()) std::__1::__invoke<std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>&>(std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/type_traits:3694:1 (libconsensus.so+0x1c9051)
[ts-4]     #7 void std::__1::__invoke_void_return_wrapper<void, true>::__call<std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>&>(std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/__functional_base:348:9 (libconsensus.so+0x1c8fe1)
[ts-4]     #8 std::__1::__function::__alloc_func<std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>, std::__1::allocator<std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&> >, void ()>::operator()() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1558:16 (libconsensus.so+0x1c8fa1)
[ts-4]     #9 std::__1::__function::__func<std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&>, std::__1::allocator<std::__1::__bind<void (yb::consensus::Peer::*)(), std::__1::shared_ptr<yb::consensus::Peer>&> >, void ()>::operator()() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1732:12 (libconsensus.so+0x1c7cad)
[ts-4]     #10 std::__1::__function::__value_func<void ()>::operator()() const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1885:16 (libyb-redis.so+0x1ff3a4)
[ts-4]     #11 std::__1::function<void ()>::operator()() const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2560:12 (libyb-redis.so+0x1f45e9)
[ts-4]     #12 yb::rpc::OutboundCall::InvokeCallbackSync() ${BUILD_ROOT}/../../src/yb/rpc/outbound_call.cc:348:3 (libyrpc.so+0x316d4b)
[ts-4]     #13 yb::rpc::InvokeCallbackTask::Run() ${BUILD_ROOT}/../../src/yb/rpc/outbound_call.cc:125:10 (libyrpc.so+0x316cef)
[ts-4]     #14 yb::rpc::(anonymous namespace)::Worker::Execute() ${BUILD_ROOT}/../../src/yb/rpc/thread_pool.cc:104:15 (libyrpc.so+0x3ec29c)
[ts-4]     #15 decltype(*(std::__1::forward<yb::rpc::(anonymous namespace)::Worker*&>(fp0)).*fp()) std::__1::__invoke<void (yb::rpc::(anonymous namespace)::Worker::*&)(), yb::rpc::(anonymous namespace)::Worker*&, void>(void (yb::rpc::(anonymous namespace)::Worker::*&)(), yb::rpc::(anonymous namespace)::Worker*&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/type_traits:3635:1 (libyrpc.so+0x3ed673)
[ts-3] I0908 19:36:45.676331  6150 leader_election.cc:164] Ignoring peer {ts1_peer_id} vote because its member type is PRE_VOTER
[ts-3] I0908 19:36:45.676617  6150 leader_election.cc:216] T {test-workload_tablet_id1} P {ts3_peer_id} [CANDIDATE]: Term 2 pre-election: Requesting vote from peer {ts4_peer_id}
[ts-3] I0908 19:36:45.677485  6150 leader_election.cc:216] T {test-workload_tablet_id1} P {ts3_peer_id} [CANDIDATE]: Term 2 pre-election: Requesting vote from peer {ts5_peer_id}
[ts-4]     #16 std::__1::__bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<>, __is_valid_bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<> >::value>::type std::__1::__apply_functor<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, 0ul, std::__1::tuple<> >(void (yb::rpc::(anonymous namespace)::Worker::*&)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>&, std::__1::__tuple_indices<0ul>, std::__1::tuple<>&&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2857:12 (libyrpc.so+0x3ed5e8)
[ts-4]     #17 std::__1::__bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<>, __is_valid_bind_return<void (yb::rpc::(anonymous namespace)::Worker::*)(), std::__1::tuple<yb::rpc::(anonymous namespace)::Worker*>, std::__1::tuple<> >::value>::type std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>::operator()<>() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2890:20 (libyrpc.so+0x3ed59d)
[ts-4]     #18 decltype(std::__1::forward<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&>(fp)()) std::__1::__invoke<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&>(std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/type_traits:3694:1 (libyrpc.so+0x3ed559)
[ts-4]     #19 void std::__1::__invoke_void_return_wrapper<void, true>::__call<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&>(std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>&) /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/__functional_base:348:9 (libyrpc.so+0x3ed4e9)
[ts-4]     #20 std::__1::__function::__alloc_func<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>, std::__1::allocator<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&> >, void ()>::operator()() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1558:16 (libyrpc.so+0x3ed4b1)
[ts-4]     #21 std::__1::__function::__func<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&>, std::__1::allocator<std::__1::__bind<void (yb::rpc::(anonymous namespace)::Worker::* const&)(), yb::rpc::(anonymous namespace)::Worker* const&> >, void ()>::operator()() /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1732:12 (libyrpc.so+0x3ec75d)
[ts-4]     #22 std::__1::__function::__value_func<void ()>::operator()() const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:1885:16 (libyb-redis.so+0x1ff3a4)
[ts-4]     #23 std::__1::function<void ()>::operator()() const /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20220806014430-c2f02d2024-centos7-x86_64-clang12/installed/tsan/libcxx/include/c++/v1/functional:2560:12 (libyb-redis.so+0x1f45e9)
[ts-4]     #24 yb::Thread::SuperviseThread(void*) ${BUILD_ROOT}/../../src/yb/util/thread.cc:774:3 (libyb_util.so+0x6d4508)
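
Reading the report above: `SetLeaderMode` writes an 8-byte field while holding two mutexes, and `GetRemoteBootstrapRequestForPeer` reads the same address while holding a different one, so no common lock orders the two accesses. The following is a minimal sketch of that shape, not the actual YugabyteDB code. `ToyQueue`, `CurrentTerm`, and `RunWriterAndReader` are hypothetical names, and the racy reader is shown only as a comment so the sketch itself has no undefined behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the queue; not the real PeerMessageQueue.
class ToyQueue {
 public:
  // Writer path (compare SetLeaderMode in the report): updates the term
  // while holding queue_lock_.
  void SetLeaderMode(int64_t term) {
    std::lock_guard<std::mutex> l(queue_lock_);
    current_term_ = term;
  }

  // Racy shape, kept as a comment on purpose:
  //   int64_t RacyTerm() {
  //     std::lock_guard<std::mutex> l(some_other_lock_);
  //     return current_term_;  // a different mutex does not order this read
  //   }

  // Lock-correct reader: takes the same mutex as the writer and returns a copy.
  int64_t CurrentTerm() {
    std::lock_guard<std::mutex> l(queue_lock_);
    return current_term_;
  }

 private:
  std::mutex queue_lock_;
  int64_t current_term_ = 0;
};

// Runs one writer and one reader concurrently, then returns the term
// observed after both threads have been joined.
int64_t RunWriterAndReader(int64_t last_term) {
  ToyQueue q;
  std::thread writer([&q, last_term] {
    for (int64_t t = 1; t <= last_term; ++t) q.SetLeaderMode(t);
  });
  std::thread reader([&q] {
    for (int i = 0; i < 1000; ++i) (void)q.CurrentTerm();
  });
  writer.join();
  reader.join();
  return q.CurrentTerm();
}
```

Built with `-fsanitize=thread`, a version that used the commented-out reader would be expected to produce a report of the same form as the one above, while this version should stay quiet because both paths go through `queue_lock_`.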

yusong-yan pushed a commit that referenced this issue Sep 21, 2022
…rossServer

Summary:
1. The test failed on both TSAN and Fastdebug builds because the newly added servers were removed once the **follower_unavailable_considered_failed_sec** timeout expired. TSAN and Fastdebug builds run more slowly, which probably caused the premature removal: the logs show that the newly added servers took more than 10 seconds to complete the pending commit. Raising **follower_unavailable_considered_failed_sec** from 10 to 20 gives those servers enough time to complete the commit, and the test now passes on both TSAN and Fastdebug.

2. Fixed the race condition reported by TSAN, which was caused by reading **queue_state_.current_term** without holding the lock.

3. Marked queue_state_ with GUARDED_BY(queue_lock_) and added REQUIRE(queue_lock_) to the associated functions. While doing so, I noticed another potential race similar to the one above: **queue_state_.current_term** and **queue_state_.active_config** were read without holding the lock. The fix is the same as before: cache those values inside the lock scope, then use the cached values outside it.

Test Plan: Ran the test 100 times on both TSAN and Fastdebug builds.

Reviewers: rthallam, amitanand

Reviewed By: rthallam, amitanand

Subscribers: bogdan, ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D19506
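
The annotation and caching approach described in the commit summary above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the diff: the macros are defined locally after the pattern in Clang's Thread Safety Analysis documentation (where the requirement macro is conventionally spelled `REQUIRES`), and `Mutex`, `MutexLock`, `QueueState`, `AnnotatedQueue`, `SetStateUnlocked`, and `DescribeRequest` are hypothetical names. The string `active_config` stands in for the real config type.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>

// Clang thread-safety attributes; they expand to nothing on other compilers.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif
#define CAPABILITY(x) TSA(capability(x))
#define SCOPED_CAPABILITY TSA(scoped_lockable)
#define GUARDED_BY(x) TSA(guarded_by(x))
#define ACQUIRE(...) TSA(acquire_capability(__VA_ARGS__))
#define RELEASE(...) TSA(release_capability(__VA_ARGS__))
#define REQUIRES(...) TSA(requires_capability(__VA_ARGS__))

// Annotated wrapper so the analysis knows this type is a lock.
class CAPABILITY("mutex") Mutex {
 public:
  void Lock() ACQUIRE() { mu_.lock(); }
  void Unlock() RELEASE() { mu_.unlock(); }

 private:
  std::mutex mu_;
};

// RAII guard: acquires in the constructor, releases in the destructor.
class SCOPED_CAPABILITY MutexLock {
 public:
  explicit MutexLock(Mutex* mu) ACQUIRE(mu) : mu_(mu) { mu_->Lock(); }
  ~MutexLock() RELEASE() { mu_->Unlock(); }

 private:
  Mutex* mu_;
};

struct QueueState {
  int64_t current_term = 0;
  std::string active_config;  // stand-in for the real config message
};

class AnnotatedQueue {
 public:
  void SetLeaderMode(int64_t term, const std::string& config) {
    MutexLock l(&queue_lock_);
    SetStateUnlocked(term, config);
  }

  // Caches both fields inside one lock scope, then uses only the cached
  // copies after the lock has been released.
  std::string DescribeRequest() {
    int64_t term = 0;
    std::string config;
    {
      MutexLock l(&queue_lock_);
      term = queue_state_.current_term;
      config = queue_state_.active_config;
    }
    return "term=" + std::to_string(term) + " config=" + config;
  }

 private:
  // Callers must already hold queue_lock_; Clang checks this at compile time.
  void SetStateUnlocked(int64_t term, const std::string& config)
      REQUIRES(queue_lock_) {
    queue_state_.current_term = term;
    queue_state_.active_config = config;
  }

  Mutex queue_lock_;
  QueueState queue_state_ GUARDED_BY(queue_lock_);
};
```

When compiled with Clang and `-Wthread-safety`, reading `queue_state_` outside a `MutexLock` scope in a class like this should produce a compile-time warning, which is the kind of mistake TSAN otherwise only finds at run time. Copying both fields in one critical section also means the term and config come from the same moment.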
rthallamko3 reopened this on Mar 8, 2023
@rthallamko3
Contributor

Reopening for backports.

rthallamko3 changed the title from "[DocDB] flaky test: RemoteBootstrapITest.TestLongRemoteBootstrapsAcrossServers" to "[DocDB] Data race in accessing queue_state_.current_term leading to failures in tsan runs" on Mar 14, 2023
yusong-yan pushed a commit that referenced this issue Mar 15, 2023
…oteBootstrapsAcrossServer

Summary:
Original commit: 82bcb2e159106d72d7725777d171fcf93449bd11 / D19506.

1. Fixed the race condition reported by TSAN, which was caused by reading **queue_state_.current_term** without holding the lock.

2. Marked queue_state_ with GUARDED_BY(queue_lock_) and added REQUIRE(queue_lock_) to the associated functions. While doing so, I noticed another potential race similar to the one above: **queue_state_.current_term** and **queue_state_.active_config** were read without holding the lock. The fix is the same as before: cache those values inside the lock scope, then use the cached values outside it.

Test Plan: Ran the test 100 times on both TSAN and Fastdebug builds.

Reviewers: amitanand, rthallam, qhu

Reviewed By: rthallam, qhu

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D23478