[DocDB] Contention profile shows contention in PeerMessageQueue and LogMessageQueue #13783

Closed
karan-yb opened this issue Aug 26, 2022 · 0 comments
Assignees
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue

Comments

karan-yb commented Aug 26, 2022

Jira Link: DB-3302

Description

This contention profile was captured from internal testing of a customer issue.


33978240 2132 @ 00007f2e0878b157 00007f2e120413cd 00007f2e12030a22 00007f2e12031945 00007f2e12040a36 00007f2e11d8663d 00007f2e11d8c0cd 00007f2e11d90cac 00007f2e087aa984 00007f2e087a715f 00007f2e03ef3694 00007f2e0363041d
@ 0x7f2e0878b156 yb::(anonymous namespace)::SubmitSpinLockProfileData()
@ 0x7f2e120413cc yb::consensus::LogCache::EvictThroughOp()
@ 0x7f2e12030a21 yb::consensus::PeerMessageQueue::ResponseFromPeer()
@ 0x7f2e12031944 yb::consensus::PeerMessageQueue::LocalPeerAppendFinished()
@ 0x7f2e12040a35 yb::consensus::LogCache::LogCallback()
@ 0x7f2e11d8663c yb::log::Log::Appender::GroupWork()
@ 0x7f2e11d8c0cc yb::log::Log::Appender::ProcessBatch()
@ 0x7f2e11d90cab yb::TaskStream<>::Run()
@ 0x7f2e087aa983 yb::ThreadPool::DispatchThread()
@ 0x7f2e087a715e yb::Thread::SuperviseThread()
@ 0x7f2e03ef3693 start_thread
@ 0x7f2e0363041c __clone


3277077504 17947 @ 00007f2e0878b157 00007f2e1204357c 00007f2e120436f7 00007f2e12025542 00007f2e1205385a 00007f2e1204f5d7 00007f2e12486587 00007f2e12486cdc 00007f2e12487130 00007f2e087aa984 00007f2e087a715f 00007f2e03ef3694 00007f2e0363041d
@ 0x7f2e0878b156 yb::(anonymous namespace)::SubmitSpinLockProfileData()
@ 0x7f2e1204357b yb::consensus::LogCache::PrepareAppendOperations()
@ 0x7f2e120436f6 yb::consensus::LogCache::AppendOperations()
@ 0x7f2e12025541 yb::consensus::PeerMessageQueue::AppendOperations()
@ 0x7f2e12053859 yb::consensus::RaftConsensus::AppendNewRoundsToQueueUnlocked()
@ 0x7f2e1204f5d6 yb::consensus::RaftConsensus::ReplicateBatch()
@ 0x7f2e12486586 yb::tablet::PreparerImpl::ReplicateSubBatch()
@ 0x7f2e12486cdb yb::tablet::PreparerImpl::ProcessAndClearLeaderSideBatch()
@ 0x7f2e1248712f yb::tablet::PreparerImpl::Run()
@ 0x7f2e087aa983 yb::ThreadPool::DispatchThread()
@ 0x7f2e087a715e yb::Thread::SuperviseThread()
@ 0x7f2e03ef3693 start_thread
@ 0x7f2e0363041c __clone


4552624000 3758 @ 00007f2e0878b157 00007f2e12041554 00007f2e1207961e 00007f2e12079dab 00007f2e1207a30b 00007f2e1205ad72 00007f2e12027bf1 00007f2e087aa984 00007f2e087a715f 00007f2e03ef3694 00007f2e0363041d
@ 0x7f2e0878b156 yb::(anonymous namespace)::SubmitSpinLockProfileData()
@ 0x7f2e12041553 yb::consensus::LogCache::TrackOperationsMemory()
@ 0x7f2e1207961d yb::consensus::ReplicaState::ApplyPendingOperationsUnlocked()
@ 0x7f2e12079daa yb::consensus::ReplicaState::AdvanceCommittedOpIdUnlocked()
@ 0x7f2e1207a30a yb::consensus::ReplicaState::UpdateMajorityReplicatedUnlocked()
@ 0x7f2e1205ad71 yb::consensus::RaftConsensus::UpdateMajorityReplicated()
@ 0x7f2e12027bf0 yb::consensus::PeerMessageQueue::NotifyObserversOfMajorityReplOpChangeTask()
@ 0x7f2e087aa983 yb::ThreadPool::DispatchThread()
@ 0x7f2e087a715e yb::Thread::SuperviseThread()
@ 0x7f2e03ef3693 start_thread
@ 0x7f2e0363041c __clone
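
To compare the three stacks, it can help to turn each sample header into an average wait per contention event. The sketch below is a reading aid, not part of the issue. It assumes the two leading integers on each header line are total wait cycles and the number of contention events, as in the pprof-style contention format; the issue itself does not state this. `ContentionSample`, `ParseSampleHeader`, and `AverageCyclesPerEvent` are hypothetical names introduced here.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// One sample header line, e.g. "33978240 2132 @ 00007f2e0878b157 ...".
// The field meanings are an assumption (pprof-style contention format).
struct ContentionSample {
  std::uint64_t wait_cycles = 0;  // assumed: total cycles spent waiting
  std::uint64_t events = 0;       // assumed: number of contended acquisitions
};

// Hypothetical helper: parses the two leading integers and requires the "@"
// separator that follows them. Symbolized frame lines ("@ 0x... name") do not
// start with an integer, so they yield std::nullopt.
std::optional<ContentionSample> ParseSampleHeader(const std::string& line) {
  std::istringstream in(line);
  ContentionSample sample;
  std::string separator;
  if (!(in >> sample.wait_cycles >> sample.events >> separator) ||
      separator != "@") {
    return std::nullopt;
  }
  return sample;
}

// Average wait cycles per contention event; returns 0 when there are no events.
double AverageCyclesPerEvent(const ContentionSample& sample) {
  return sample.events == 0
             ? 0.0
             : static_cast<double>(sample.wait_cycles) /
                   static_cast<double>(sample.events);
}
```

Under that assumption, feeding each header line to `ParseSampleHeader` and comparing `AverageCyclesPerEvent` across stacks separates a lock that is contended often from one that is held for a long time per acquisition.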

@karan-yb karan-yb added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Aug 26, 2022
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Aug 26, 2022
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Aug 26, 2022
karan-yb added a commit that referenced this issue Sep 1, 2022
…make sure that we release log messages outside of logcache lock.

Summary:
This change makes locking consistent for LogCache eviction. Before this change, the PeerMessageQueue lock was sometimes acquired for log eviction. Eviction does not need the queue lock, since it depends only on the low flushed watermark.
The LogCache lock also already synchronizes operations performed on the cache.

In addition to the above change, we move the release of memory in LogCache outside of the lock. This should help avoid the slowness observed because of underlying heap/tcmalloc contention.

Note: UpdateMetric seems to be called after Evict every time, although there is no dependency on it. Keeping it the same for now to minimize changes.

Test Plan:
./build/latest/tests-consensus/consensus_queue-test
./build/latest/tests-consensus/consensus_peer-test
./build/latest/tests-consensus/log_cache-test

Reviewers: rthallam, sergei, mbautin, bogdan

Reviewed By: mbautin

Subscribers: timur, ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D19198
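
The second part of the commit summary (releasing memory outside of the LogCache lock) follows a common pattern: detach the evicted entries while holding the lock, then destroy them after the lock is dropped. The sketch below illustrates that pattern only. It is not the actual YugabyteDB code; `SketchLogCache`, `CachedMessage`, and `EvictThrough` are hypothetical names, and a plain `std::mutex` stands in for whatever lock the real cache uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cached log message; not the real YugabyteDB type.
struct CachedMessage {
  std::string payload;
};

// Simplified cache keyed by op index. Names and layout are illustrative only.
class SketchLogCache {
 public:
  void Append(std::int64_t index, std::string payload) {
    // Allocate before taking the lock so the allocator is not called under it.
    auto msg = std::make_unique<CachedMessage>(CachedMessage{std::move(payload)});
    std::lock_guard<std::mutex> guard(lock_);
    entries_[index] = std::move(msg);
  }

  // Evicts every entry with index <= through_index and returns how many were
  // evicted. Message payloads are destroyed only after the lock is released.
  std::size_t EvictThrough(std::int64_t through_index) {
    std::vector<std::unique_ptr<CachedMessage>> to_free;
    {
      std::lock_guard<std::mutex> guard(lock_);
      auto end = entries_.upper_bound(through_index);
      for (auto it = entries_.begin(); it != end; ++it) {
        to_free.push_back(std::move(it->second));  // detach, do not free yet
      }
      // The map nodes themselves are still freed here, under the lock; only
      // the (typically much larger) message payloads are deferred.
      entries_.erase(entries_.begin(), end);
    }
    // Lock is released; the potentially slow deallocation happens here.
    const std::size_t evicted = to_free.size();
    to_free.clear();
    return evicted;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> guard(lock_);
    return entries_.size();
  }

 private:
  mutable std::mutex lock_;
  std::map<std::int64_t, std::unique_ptr<CachedMessage>> entries_;
};
```

With this shape, a thread appending to the cache waits only for the pointer moves and the map erase, not for the allocator to release every evicted payload.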