Failure Detection and Consensus in Distributed Systems with F#
Author: Natallia Dzenisenka
This publication covers failure detectors and consensus, two fundamental distributed algorithms. They are essential for building available, fault-tolerant, and resilient distributed systems.
Many academic papers describe the theory behind distributed algorithms and prove their correctness, but they rarely illustrate the other valuable aspects of implementing those algorithms in practice.
To build distributed systems, it's important to understand the end-to-end process of working with distributed algorithms. This includes the underlying networking and the details of implementing an algorithm in a programming language.
F# is a powerful open-source programming language that provides many advantages for distributed system development. It brings together concepts that benefit concurrent and distributed programming, such as computation expressions, function composition, strong typing, and type inference.
This publication demonstrates how F# can be used to build resilient distributed systems. It describes the theory behind failure detector and consensus algorithms, and presents implementations of a variety of failure detector algorithms and of a failure-detector-based consensus algorithm.
Table Of Contents
Part I: Theory behind Failure Detectors
- Failure Detectors
- Failures Are Everywhere
- Discovering Failures
- Failure Detectors Of The Real World
- Problems Solved With Failure Detectors
- Properties Of A Failure Detector
- Types Of Failure Detectors
- Failure Detectors In Asynchronous Environment
- Reducibility Of Failure Detectors
Part II: Implementation of Networking
Part III: Implementation of Failure Detectors
- Implementing A Node
- Ping-Ack Failure Detector
- Heartbeat Failure Detector
- Heartbeat Failure Detector With Adjustable Timeout
- Heartbeat Failure Detector With Sliding Window
- Heartbeat Failure Detector With Suspect Level
- Gossipping Failure Detector