Program terminated with signal 11, Segmentation fault.
#0 0x00007f4e5a46bbb9 in Get<google::protobuf::RepeatedPtrField<yb::log::LogEntryPB>::TypeHandler> (this=0x6b504d20, index=<optimized out>)
at /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20200829090443-f431681041-centos/installed/uninstrumented/include/google/protobuf/repeated_field.h:1523
1523 /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20200829090443-f431681041-centos/installed/uninstrumented/include/google/protobuf/repeated_field.h: No such file or directory.
(gdb) #0 0x00007f4e5a46bbb9 in Get<google::protobuf::RepeatedPtrField<yb::log::LogEntryPB>::TypeHandler> (this=0x6b504d20, index=<optimized out>)
at /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20200829090443-f431681041-centos/installed/uninstrumented/include/google/protobuf/repeated_field.h:1523
#1 Get (index=<optimized out>, this=<optimized out>) at /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20200829090443-f431681041-centos/installed/uninstrumented/include/google/protobuf/repeated_field.h:1989
#2 entry (index=<optimized out>, this=<optimized out>) at src/yb/consensus/log.pb.h:773
#3 MaxReplicateOpId (this=0x6b504d00) at ../../src/yb/consensus/log.cc:243
#4 ToString (this=0x6b504d00) at ../../src/yb/consensus/log.cc:183
#5 ToString (this=<optimized out>, this=<optimized out>) at ../../src/yb/util/blocking_queue.h:232
#6 ToString<yb::BlockingQueue<yb::log::LogEntryBatch*, yb::DefaultLogicalSize> > (value=...) at ../../src/yb/util/tostring.h:73
#7 AsString<yb::BlockingQueue<yb::log::LogEntryBatch*, yb::DefaultLogicalSize> > (t=...) at ../../src/yb/util/tostring.h:279
#8 yb::TaskStream<yb::log::LogEntryBatch>::ToString (this=<optimized out>) at ../../src/yb/util/taskstream.h:80
#9 0x00007f4e5a45e1b3 in ToString (this=<optimized out>) at ../../src/yb/consensus/log.cc:322
#10 yb::log::Log::WaitForSafeOpIdToApply (this=0x17f3a300, min_allowed=..., duration=...) at ../../src/yb/consensus/log.cc:1089
#11 0x00007f4e5a7203bb in yb::consensus::RaftConsensus::WaitForSafeOpIdToApply (this=0xe6a94b0, op_id=...) at ../../src/yb/consensus/raft_consensus.cc:3105
#12 0x00007f4e5a7495df in yb::consensus::ReplicaState::ApplyPendingOperationsUnlocked (this=this@entry=0x176498c0, committed_op_id=..., could_stop=..., could_stop@entry=...) at ../../src/yb/consensus/replica_state.cc:903
#13 0x00007f4e5a74a23f in yb::consensus::ReplicaState::AdvanceCommittedOpIdUnlocked (this=0x176498c0, committed_op_id=..., could_stop=...) at ../../src/yb/consensus/replica_state.cc:852
#14 0x00007f4e5a730551 in yb::consensus::RaftConsensus::MarkOperationsAsCommittedUnlocked (this=this@entry=0xe6a94b0, request=..., deduped_req=..., last_from_leader=...) at ../../src/yb/consensus/raft_consensus.cc:1975
#15 0x00007f4e5a736a27 in yb::consensus::RaftConsensus::UpdateReplica (this=this@entry=0xe6a94b0, request=request@entry=0x6aed5ba0, response=response@entry=0x2b9ad3f0) at ../../src/yb/consensus/raft_consensus.cc:1748
#16 0x00007f4e5a737973 in yb::consensus::RaftConsensus::Update (this=this@entry=0xe6a94b0, request=request@entry=0x6aed5ba0, response=response@entry=0x2b9ad3f0, deadline=...) at ../../src/yb/consensus/raft_consensus.cc:1304
#17 0x00007f4e5b3c2c01 in yb::tserver::ConsensusServiceImpl::UpdateConsensus (this=this@entry=0x1feef00, req=req@entry=0x6aed5ba0, resp=resp@entry=0x2b9ad3f0, context=...) at ../../src/yb/tserver/tablet_service.cc:2111
#18 0x00007f4e57c4608a in yb::consensus::ConsensusServiceIf::Handle (this=0x1feef00, call=...) at src/yb/consensus/consensus.service.cc:100
#19 0x00007f4e53a13d09 in yb::rpc::ServicePoolImpl::Handle (this=0x2672240, incoming=...) at ../../src/yb/rpc/service_pool.cc:262
#20 0x00007f4e539bab34 in yb::rpc::InboundCall::InboundCallTask::Run (this=<optimized out>) at ../../src/yb/rpc/inbound_call.cc:212
#21 0x00007f4e53a262f8 in yb::rpc::(anonymous namespace)::Worker::Execute (this=<optimized out>) at ../../src/yb/rpc/thread_pool.cc:105
#22 0x00007f4e5225110f in operator() (this=0x4e0d018) at /home/yugabyte/yb-software/yugabyte-2.3.0.0-b167-centos-x86_64/linuxbrew-xxxxxxxxxxxx/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:2267
#23 yb::Thread::SuperviseThread (arg=0x4e0cfc0) at ../../src/yb/util/thread.cc:760
#24 0x00007f4e4ca6c694 in start_thread (arg=0x7f4cc4a67700) at pthread_create.c:333
#25 0x00007f4e4c1a941d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Summary:
When WaitForSafeOpIdToApply takes too long, we dump the appender state.
If that state contains an empty batch, dumping it crashes the process.
This diff updates MaxReplicateOpId to handle an empty batch, so the process no longer crashes in the above scenario.
Test Plan: Jenkins
Reviewers: bogdan
Reviewed By: bogdan
Subscribers: ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D9374
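
For illustration, the kind of guard the summary describes could look like the sketch below. This is not the code from the diff: `OpIdSketch` and `LogEntryBatchSketch` are hypothetical stand-ins for the real yb::log types, and the sketch assumes the maximum replicate op id is the op id of the last entry in the batch, which is what the backtrace suggests (MaxReplicateOpId indexing into the repeated field).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real yb::log types.
struct OpIdSketch {
  int64_t term;
  int64_t index;
};

struct LogEntryBatchSketch {
  std::vector<OpIdSketch> entries;  // Op ids of the entries in this batch.

  // Mirrors an unchecked indexed getter: the caller must pass a valid index.
  const OpIdSketch& entry(size_t index) const {
    assert(index < entries.size());
    return entries[index];
  }

  // Returns the op id of the last entry, or nullptr for an empty batch.
  // Assumption: "max" means the last entry in the batch.
  const OpIdSketch* MaxReplicateOpId() const {
    if (entries.empty()) {
      return nullptr;  // Guard: never index into an empty batch.
    }
    return &entry(entries.size() - 1);
  }

  // Safe to call while dumping state, even when the batch is empty.
  std::string ToString() const {
    const OpIdSketch* max_op_id = MaxReplicateOpId();
    if (max_op_id == nullptr) {
      return "{ empty batch }";
    }
    return "{ max_op_id: " + std::to_string(max_op_id->term) + "." +
           std::to_string(max_op_id->index) + " }";
  }
};
```

With this shape, calling `ToString()` on a default-constructed `LogEntryBatchSketch` returns a placeholder string instead of reading past the end of the entry list. Whether the real fix returns a sentinel op id or skips the field entirely is not stated in the summary.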
Seems to trace back to this diff: https://phabricator.dev.yugabyte.com/D8972
Since we DFATAL on these long waits, this code path is probably not exercised much. I suspect the slowdown from DNS queries was contributing to this. I was seeing these in the log