Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage/reports: change node liveness considerations #43825

merged 4 commits into from Jan 14, 2020
Changes from 1 commit
File filter...
Filter file types
Jump to…
Jump to file
Failed to load files.


Just for now


storage/reports: change node liveness considerations

Before this patch, whether a node was "alive" or "dead" did not matter
for the under-replicated ranges count in system.replication_stats. This patch
makes replicas on dead stores be ignored when counting replicas for
purposes of the underreplicated counter.
Liveness used to matter for the unavailable count and for the
system.critical_localities report (and continues to matter).

Note that this change interestingly means that a range can be considered
both under-replicated and over-replicated at the same time - if there's
too many replicas, but sufficiently many of them are dead.

This patch also changes the liveness criteria across the board for the
reports that cared about liveness (unavailable ranges, under-replicated
ranges and critical localities). The code was buggy in that a
decomissioning node was considered live for 5 minutes after it stopped
heartbeating its liveness record, whereas a non-decomissioning one was
only considered live for a few seconds. The patch fixes it by using the
same logic to make liveness determinations as the underreplicated metric
does - you're dead as soon as the liveness record expires regardless of
decomissioning status.

Release note (sql change): Ranges are now considered under-replicated by
the system.replication_stats report when one of the replicas is
unresponsive (or the respective node is not running).
  • Loading branch information
andreimatei committed Jan 8, 2020
commit bd9a8af71faba146aa971527993e38f63a3ee88d
@@ -443,7 +443,8 @@ type node struct {
locality string
stores []store
// dead, if set, indicates that the node is to be considered dead for the
// purposes of reports generation.
// purposes of reports generation. In production, this corresponds to a node
// with an expired liveness record.
dead bool

@@ -395,15 +395,26 @@ func (v *replicationStatsVisitor) countRange(
key ZoneKey, replicationFactor int, r *roachpb.RangeDescriptor,
) {
voters := len(r.Replicas().Voters())
underReplicated := replicationFactor > voters
overReplicated := replicationFactor < voters
var liveNodeCount int
var liveVoters int
for _, rep := range r.Replicas().Voters() {
if v.nodeChecker(rep.NodeID) {
unavailable := liveNodeCount < (len(r.Replicas().Voters())/2 + 1)

// TODO(andrei): This unavailability determination is naive. We need to take
// into account two different quorums when the range is in the joint-consensus
// state. See #43836.
unavailable := liveVoters < (voters/2 + 1)
// TODO(andrei): In the joint-consensus state, this under-replication also
// needs to consider the number of live replicas in each quorum. For example,
// with 2 VoterFulls, 1 VoterOutgoing, 1 VoterIncoming, if the outgoing voter
// is on a dead node, the range should be considered under-replicated.
underReplicated := replicationFactor > liveVoters
overReplicated := replicationFactor < voters
// Note that a range can be under-replicated and over-replicated at the same
// time if it has many replicas, but sufficiently many of them are on dead
// nodes., unavailable, underReplicated, overReplicated)
@@ -283,29 +283,37 @@ func TestReplicationStatsReport(t *testing.T) {
splits: []split{
// No problem.
{key: "/Table/t1/pk/100", stores: []int{1, 2, 3}},
// Under-replicated.
{key: "/Table/t1/pk/101", stores: []int{1}},
// Under-replicated.
{key: "/Table/t1/pk/102", stores: []int{1, 2}},
// Under-replicated because 4 is dead.
{key: "/Table/t1/pk/103", stores: []int{1, 2, 4}},
// Under-replicated and unavailable.
{key: "/Table/t1/pk/102", stores: []int{3}},
{key: "/Table/t1/pk/104", stores: []int{3}},
// Over-replicated.
{key: "/Table/t1/pk/103", stores: []int{1, 2, 3, 4}},
{key: "/Table/t1/pk/105", stores: []int{1, 2, 3, 4}},
// Under-replicated and over-replicated.
{key: "/Table/t1/pk/106", stores: []int{1, 2, 4, 5}},
nodes: []node{
{id: 1, stores: []store{{id: 1}}},
{id: 2, stores: []store{{id: 2}}},
{id: 3, stores: []store{{id: 3}}, dead: true},
{id: 4, stores: []store{{id: 4}}},
{id: 3, stores: []store{{id: 3}}},
{id: 4, stores: []store{{id: 4}}, dead: true},
{id: 5, stores: []store{{id: 3}}, dead: true},
exp: []replicationStatsEntry{
object: "t1.p1",
zoneRangeStatus: zoneRangeStatus{
numRanges: 4,
numRanges: 7,
unavailable: 1,
underReplicated: 2,
overReplicated: 1,
underReplicated: 5,
overReplicated: 2,
@@ -27,7 +27,6 @@ import (
@@ -195,18 +194,9 @@ func (stats *Reporter) update(
return storeDescs

livenessStatus := stats.liveness.GetLivenessStatusMap()
isLiveMap := stats.liveness.GetIsLiveMap()
isNodeLive := func(nodeID roachpb.NodeID) bool {
status, ok := livenessStatus[nodeID]
if !ok {
return false
switch status {
case storagepb.NodeLivenessStatus_LIVE, storagepb.NodeLivenessStatus_DECOMMISSIONING:
return true
return false
return isLiveMap[nodeID].IsLive

// Create the visitors that we're going to pass to visitRanges() below.
ProTip! Use n and p to navigate between commits in a pull request.
You can’t perform that action at this time.