It would be useful to be able to tell what status each node sees. For example, in a split-brain situation,
each node sees only part of the cluster, even though the monitoring system can still see all the nodes.
Reporting every possible node-to-node connection could be expensive, because the number of connections is quadratic in the number of nodes.
Instead, each node would report only the number of live and unreachable nodes it sees.
This can be reported once per node (not per shard) and can easily be added to the gossiper.
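To put rough numbers on the cost argument above, here is a minimal sketch that compares the two reporting schemes. The function names are hypothetical and only illustrate the arithmetic; they are not part of Scylla. It assumes one time series per directed node-to-node pair in the first scheme and two gauges per node in the second.

```python
def per_pair_series(n: int) -> int:
    """Series count if every node reports its view of every other node.

    Assumption: one series per directed (observer, peer) pair.
    """
    return n * (n - 1)


def per_node_series(n: int, gauges_per_node: int = 2) -> int:
    """Series count if every node reports only aggregate gauges.

    Assumption: two gauges per node (live and unreachable).
    """
    return n * gauges_per_node


if __name__ == "__main__":
    for n in (3, 6, 100):
        print(f"nodes={n} per_pair={per_pair_series(n)} per_node={per_node_series(n)}")
```

The per-pair count grows quadratically while the per-node count grows linearly, which is the motivation for reporting only the two aggregate numbers.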
amnonh added a commit to amnonh/scylla that referenced this issue Feb 20, 2022
This patch adds two gauges:
scylla_gossip_live - how many live nodes the gossiper sees
scylla_gossip_unreachable - how many nodes the gossiper tries to connect
to but cannot.
Both metrics are reported once per node (not per shard). They give
visibility into how a specific node sees the cluster.
For example, take a split-brain 6-node cluster (3 and 3). Each node would
report that it sees 2 live nodes, but the monitoring system would see that
there are, in fact, 6 nodes.
Example of a two-node cluster, both nodes running:
```
scylla_gossip_live{shard="0"} 1.000000
scylla_gossip_unreachable{shard="0"} 0.000000
```
Example of a two-node cluster, one node down:
```
scylla_gossip_live{shard="0"} 0.000000
scylla_gossip_unreachable{shard="0"} 1.000000
```
Fixes scylladb#10102
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
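The split-brain example in the commit message can be sketched as a small simulation. This is not Scylla code: `gossip_gauges` and `looks_partitioned` are hypothetical helpers, and the node names are made up. It assumes the gauges exclude the reporting node itself (consistent with the two-node example, where a healthy node reports 1 live node) and that the monitoring system can scrape every running node.

```python
def gossip_gauges(node, cluster, partitions):
    """Return (live, unreachable) as seen by `node`, excluding itself.

    Assumption: a node sees exactly the peers on its own side of the
    partition as live and every other cluster member as unreachable.
    """
    my_side = next(p for p in partitions if node in p)
    live = len(my_side) - 1
    unreachable = len(cluster) - len(my_side)
    return live, unreachable


def looks_partitioned(views):
    """Heuristic: flag a possible split when some scraped node reports
    fewer live peers than the number of other nodes monitoring can scrape.
    """
    scraped = len(views)
    return any(live + 1 < scraped for live, _ in views.values())


cluster = {"n1", "n2", "n3", "n4", "n5", "n6"}

# Split brain: two groups of three that cannot reach each other.
split = [{"n1", "n2", "n3"}, {"n4", "n5", "n6"}]
split_views = {n: gossip_gauges(n, cluster, split) for n in sorted(cluster)}

# Healthy cluster: a single group containing every node.
healthy = [set(cluster)]
healthy_views = {n: gossip_gauges(n, cluster, healthy) for n in sorted(cluster)}

if __name__ == "__main__":
    for name, (live, unreachable) in split_views.items():
        print(f"{name}: live={live} unreachable={unreachable}")
    print("split detected:", looks_partitioned(split_views))
    print("healthy flagged:", looks_partitioned(healthy_views))
```

Under these assumptions each node in the split cluster reports 2 live and 3 unreachable peers, while the monitoring side still counts 6 scraped nodes. That mismatch is the signal the two gauges are meant to expose.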