@d-helios luckily, I've added a metric for the node state, scylla_node_operation_mode, so we can check whether a node has been in joining mode for more than X minutes.
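A minimal sketch of that check, assuming the scylla_node_operation_mode samples have already been fetched as (unix timestamp, value) pairs sorted by time. The function name, the sample format, and the numeric value used for the joining mode are assumptions for illustration and should be checked against the Scylla version in use.

```python
from typing import Sequence, Tuple

# Assumption: the numeric value that scylla_node_operation_mode reports
# while a node is joining. Verify this against your Scylla version.
JOINING_MODE = 2


def joining_too_long(
    samples: Sequence[Tuple[float, int]],
    max_minutes: float,
    joining_mode: int = JOINING_MODE,
) -> bool:
    """Return True if the node has been continuously joining for at least max_minutes.

    samples: (unix_timestamp_seconds, operation_mode) pairs, oldest first.
    """
    # If the most recent sample is not "joining", there is nothing to alert on.
    if not samples or samples[-1][1] != joining_mode:
        return False

    # Walk backwards to find where the current uninterrupted joining run began.
    run_start = samples[-1][0]
    for ts, mode in reversed(samples):
        if mode != joining_mode:
            break
        run_start = ts

    return samples[-1][0] - run_start >= max_minutes * 60
```

For example, `joining_too_long(samples, max_minutes=30)` would flag a node whose latest uninterrupted joining run spans 30 minutes or more of scraped samples.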
Motivation
Monitoring should be able to identify nodes that are too slow to join the cluster, or perhaps simply stuck.
Possible implementation
Once a new node is created and Scylla is started, the monitoring stack should start scraping metrics from Scylla and the operating system.
If the new node is in the UJ (Up/Joining) state:
if the diff for the last X minutes is lower than Y over Z minutes, trigger an alert.
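One possible reading of the rule above, sketched under assumptions: "diff" is taken to be the increase of some progress value for the joining node over the last X minutes, and the alert fires when that increase is below Y. Which metric to diff is not specified here, so the sample format and the function name are hypothetical, and the "over Z minutes" part (holding the condition for a while before firing) is not modelled.

```python
from typing import Sequence, Tuple


def progress_stalled(
    samples: Sequence[Tuple[float, float]],
    window_minutes: float,
    min_delta: float,
) -> bool:
    """Return True if the value grew by less than min_delta over the last window.

    samples: (unix_timestamp_seconds, value) pairs, oldest first, for a
    hypothetical progress metric of a node in the UJ state.
    """
    if len(samples) < 2:
        return False

    latest_ts, latest_value = samples[-1]
    window_start = latest_ts - window_minutes * 60

    # Without history reaching back to the start of the window we cannot judge.
    if samples[0][0] > window_start:
        return False

    # Use the last sample at or before the window start as the baseline.
    baseline = samples[0][1]
    for ts, value in samples:
        if ts > window_start:
            break
        baseline = value

    return latest_value - baseline < min_delta
```

Returning False when the history is too short is a deliberate choice in this sketch: a freshly started node should not alert before a full window of samples exists.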