This is a C# implementation of the Phi Accrual Failure Detector (PDF version here).
The interval time unit is 100 nanoseconds because timestamps are generated with ToFileTimeUtc.
I’ll introduce a Clock abstraction to remove this limitation in the future.
We use a very large initial interval since the "right" average depends on the cluster size and it’s better to err high (false negatives, which will be corrected by waiting a bit longer) than low (false positives, which cause "flapping").
Choose
var failureDetector = new PhiFailureDetector(
    capacity: 100, // Store at most 100 heartbeat points
    initialHeartbeatInterval: 2000,
    phiFunc: PhiFailureDetector.Exponential
);
// Record every heartbeat received from the peer.
communicationService[peerId].onHeartBeat += (ignored1, ignored2) => {
    failureDetector.Report();
};
// Suspect the peer once phi exceeds the chosen threshold.
communicationService.watch(peerId, () => failureDetector.Phi() > threshold);
3 components:
- Monitoring
- Interpretation
- Action
The phi function should be chosen to match the distribution of the heartbeat intervals.
See CASSANDRA-2597 for more details.
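To make the link between the heartbeat distribution and the phi function concrete, here is a minimal sketch for the exponential case. It assumes inter-arrival times are exponentially distributed, so the probability that the next heartbeat arrives later than t is exp(-t / mean), and phi is the negative base-10 logarithm of that probability. The class PhiSketch and the method ExponentialPhi are hypothetical names used for illustration only; they are not this library's API and are not necessarily how PhiFailureDetector.Exponential is implemented.

```csharp
using System;

// Hypothetical helper for illustration; not part of the library.
public static class PhiSketch
{
    // Both arguments must use the same time unit
    // (for example the 100-nanosecond ticks returned by ToFileTimeUtc).
    public static double ExponentialPhi(double timeSinceLastHeartbeat, double meanInterval)
    {
        if (meanInterval <= 0)
            throw new ArgumentOutOfRangeException(nameof(meanInterval));

        // Assumption: P(later than t) = exp(-t / mean), so
        // phi = -log10(exp(-t / mean)) = t / (mean * ln 10).
        return timeSinceLastHeartbeat / (meanInterval * Math.Log(10));
    }
}
```

Under this assumption phi grows linearly with the silence time: it reaches 1 when the silence lasts mean * ln 10 (about 2.3 mean intervals), and each further unit of phi means the observed silence is ten times less likely. A heavier-tailed or more regular heartbeat distribution would call for a different function.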