[BACKPORT 2.16.6][#18465] docdb: Dump PeerManager to tserver debug ui consensus page

Summary:
Original commit: 12a44ba / D27568
To debug a case where a follower might not be tracked on the leader side, we can dump PeerManager to the consensus state page and get a clear view of it.

Dumping the peer manager requires acquiring the peer manager lock and each peer's lock. If a peer is in a bad state (deadlock, etc.), the HTTP request might time out.
Jira: DB-7438
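
The locking caveat above can be illustrated with a small standalone model. This is only a sketch: `ToyPeer`, `ToyPeerManager`, and `DumpStallsWhilePeerLocked` are hypothetical names that are not part of this commit, and plain `std::mutex` stands in for whatever lock types the real classes use. It shows the manager lock being taken first, then each peer lock, and how a peer lock that is never released stalls the whole dump.

```cpp
#include <chrono>
#include <future>
#include <map>
#include <memory>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for Peer: only the locking shape mirrors the commit.
class ToyPeer {
 public:
  explicit ToyPeer(std::string uuid) : uuid_(std::move(uuid)) {}

  void DumpToHtml(std::ostream& out) const {
    std::lock_guard<std::mutex> lock(peer_lock_);  // Inner lock: per-peer state.
    out << "<li>Peer " << uuid_ << "</li>";
  }

  // Exposed only so the demo can simulate a peer that holds on to its lock.
  std::mutex& peer_lock() const { return peer_lock_; }

 private:
  const std::string uuid_;
  mutable std::mutex peer_lock_;
};

// Hypothetical stand-in for PeerManager.
class ToyPeerManager {
 public:
  std::shared_ptr<ToyPeer> Add(const std::string& uuid) {
    auto peer = std::make_shared<ToyPeer>(uuid);
    std::lock_guard<std::mutex> lock(lock_);
    peers_[uuid] = peer;
    return peer;
  }

  std::string DumpToHtml() const {
    std::ostringstream out;
    out << "<ul>";
    std::lock_guard<std::mutex> lock(lock_);  // Outer lock: the peer map.
    for (const auto& entry : peers_) {
      entry.second->DumpToHtml(out);  // Blocks here if a peer lock is stuck.
    }
    out << "</ul>";
    return out.str();
  }

 private:
  mutable std::mutex lock_;
  std::map<std::string, std::shared_ptr<ToyPeer>> peers_;
};

// Holds `peer`'s lock for `wait` and reports whether a concurrent dump failed
// to finish in that window. The lock is released before returning so the dump
// thread can complete.
bool DumpStallsWhilePeerLocked(const ToyPeerManager& manager, const ToyPeer& peer,
                               std::chrono::milliseconds wait) {
  std::unique_lock<std::mutex> stuck(peer.peer_lock());
  auto dump = std::async(std::launch::async, [&manager] { return manager.DumpToHtml(); });
  const bool stalled = dump.wait_for(wait) == std::future_status::timeout;
  stuck.unlock();
  dump.get();
  return stalled;
}
```

In this model the stalled dump also keeps the manager lock held, so anything else that needs the peer map would wait behind it. Whether that matters in the real PeerManager depends on code outside this diff.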

Test Plan: Manual test: Start an rf-3 universe and create a tablet. Run curl `LEADER_IP:9000/tablet-consensus-status?id=TABLET_ID` to check the result.
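
If the manual check were scripted, it might look roughly like the sketch below. Both helpers are hypothetical and not part of this commit: `BuildConsensusStatusUrl` assumes an `http://` scheme in front of the address from the test plan, and `CountDumpedPeers` assumes the page has already been fetched into a string and relies only on the literal markers this diff writes (`<h2>Peer Manager</h2>`, `Peer:`, and the trailing `<hr/>`).

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Builds the URL from the test plan. The "http://" scheme is an assumption;
// port 9000 and the path come from the test plan.
std::string BuildConsensusStatusUrl(const std::string& leader_ip,
                                    const std::string& tablet_id) {
  return "http://" + leader_ip + ":9000/tablet-consensus-status?id=" + tablet_id;
}

// Counts "Peer:" entries inside the "Peer Manager" section of a fetched page.
// Returns std::nullopt when the section is missing.
std::optional<std::size_t> CountDumpedPeers(const std::string& html) {
  const std::string header = "<h2>Peer Manager</h2>";
  const std::string marker = "Peer:";
  const auto begin = html.find(header);
  if (begin == std::string::npos) return std::nullopt;
  // The section ends at the <hr/> that DumpStatusHtml writes after the dump.
  auto end = html.find("<hr/>", begin);
  if (end == std::string::npos) end = html.size();
  std::size_t count = 0;
  for (auto pos = html.find(marker, begin); pos != std::string::npos && pos < end;
       pos = html.find(marker, pos + marker.size())) {
    ++count;
  }
  return count;
}
```

Because the section is written only when the role is LEADER, a missing section may simply mean the request went to a follower.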

Reviewers: qhu, bogdan

Reviewed By: qhu, bogdan

Subscribers: bogdan, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D28077
Huqicheng authored and karan-yb committed Aug 24, 2023
1 parent 38722b3 commit 23a160a
Showing 5 changed files with 32 additions and 0 deletions.
14 changes: 14 additions & 0 deletions src/yb/consensus/consensus_peers.cc
@@ -69,6 +69,7 @@
#include "yb/util/status_format.h"
#include "yb/util/threadpool.h"
#include "yb/util/tsan_util.h"
#include "yb/util/url-coding.h"

using namespace std::literals;
using namespace std::placeholders;
@@ -199,6 +200,19 @@ Status Peer::SignalRequest(RequestTriggerMode trigger_mode) {
  return status;
}

void Peer::DumpToHtml(std::ostream& out) const {
  const auto peer_pb_str = EscapeForHtmlToString("Peer PB: " + peer_pb_.DebugString());
  out << "Peer:" << std::endl;
  std::lock_guard lock(peer_lock_);
  out << Format(
      "<ul><li>$0</li><li>$1</li><li>$2</li><li>$3</li></ul>",
      EscapeForHtmlToString(Format("State: $0", state_)),
      EscapeForHtmlToString(Format("Current Heartbeat Id: $0", cur_heartbeat_id_)),
      EscapeForHtmlToString(Format("Failed Attempts: $0", failed_attempts_)),
      peer_pb_str)
      << std::endl;
}

void Peer::SendNextRequest(RequestTriggerMode trigger_mode) {
  auto retain_self = shared_from_this();
  DCHECK(performing_update_mutex_.is_locked()) << "Cannot send request";
2 changes: 2 additions & 0 deletions src/yb/consensus/consensus_peers.h
@@ -179,6 +179,8 @@ class Peer : public std::enable_shared_from_this<Peer> {
    return failed_attempts_;
  }

  void DumpToHtml(std::ostream& out) const;

 private:
  void SendNextRequest(RequestTriggerMode trigger_mode);

12 changes: 12 additions & 0 deletions src/yb/consensus/peer_manager.cc
@@ -153,6 +153,18 @@ void PeerManager::ClosePeersNotInConfig(const RaftConfigPB& config) {
  }
}

void PeerManager::DumpToHtml(std::ostream& out) const {
  out << "<h2>Peer Manager</h2>" << std::endl;
  out << "<ul>" << std::endl;
  std::lock_guard lock(lock_);
  for (const auto& entry : peers_) {
    out << "<li>" << std::endl;
    entry.second->DumpToHtml(out);
    out << "</li>" << std::endl;
  }
  out << "</ul>" << std::endl;
}

std::string PeerManager::LogPrefix() const {
  return MakeTabletLogPrefix(tablet_id_, local_uuid_);
}
2 changes: 2 additions & 0 deletions src/yb/consensus/peer_manager.h
@@ -91,6 +91,8 @@ class PeerManager {
  // Closes connections to those peers that are not in config.
  virtual void ClosePeersNotInConfig(const RaftConfigPB& config);

  virtual void DumpToHtml(std::ostream& out) const;

 private:
  std::string LogPrefix() const;

2 changes: 2 additions & 0 deletions src/yb/consensus/raft_consensus.cc
@@ -3206,6 +3206,8 @@ void RaftConsensus::DumpStatusHtml(std::ostream& out) const {
    role = state_->GetActiveRoleUnlocked();
  }
  if (role == PeerRole::LEADER) {
    peer_manager_->DumpToHtml(out);
    out << "<hr/>" << std::endl;
    out << "<h2>Queue overview</h2>" << std::endl;
    out << "<pre>" << EscapeForHtmlToString(queue_->ToString()) << "</pre>" << std::endl;
    out << "<hr/>" << std::endl;
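
The diff escapes every peer field before writing it into the list markup, which keeps text such as a protobuf debug string from being interpreted as HTML. A rough standalone sketch of that pattern follows. `EscapeHtml` and `RenderPeer` are hypothetical helpers, not the functions from `yb/util/url-coding.h`, and the set of characters the real `EscapeForHtmlToString` escapes is assumed rather than confirmed by this commit.

```cpp
#include <sstream>
#include <string>

// Hypothetical minimal escaper. The real EscapeForHtmlToString lives in
// yb/util/url-coding.h and may escape a different set of characters.
std::string EscapeHtml(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (char c : in) {
    switch (c) {
      case '&': out += "&amp;"; break;
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      default: out += c;
    }
  }
  return out;
}

// Renders one peer in a reduced version of the layout used by
// Peer::DumpToHtml: a "Peer:" label followed by a nested list whose items are
// escaped individually, so only the list tags themselves are raw HTML.
std::string RenderPeer(const std::string& state, const std::string& peer_pb) {
  std::ostringstream out;
  out << "Peer:\n"
      << "<ul><li>" << EscapeHtml("State: " + state) << "</li>"
      << "<li>" << EscapeHtml("Peer PB: " + peer_pb) << "</li></ul>\n";
  return out.str();
}
```

Escaping each field separately, rather than the whole formatted string, is what lets the `<ul>` and `<li>` tags survive while the field contents are neutralized.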
