This repository contains formal models of core Elasticsearch algorithms and is directly related to implementation efforts around data replication and cluster coordination. The models in this repository might represent past, current and future designs of Elasticsearch and can differ to their implementations in substantial ways. The formal models mainly serve to illustrate some of the high-level concepts and help to validate resiliency-related aspects.
The cluster coordination TLA+ model ensures the consistency of cluster state updates and represents the core cluster coordination and metadata replication algorithm implemented in Elasticsearch 7.0. It consists of two files:
- TLA+ specification which has a direct one-to-one implementation in Elasticsearch
- TLC model checking configuration
The data replication TLA+ model describes the Elasticsearch sequence number based data replication approach, implemented since Elasticsearch 6.0, which consists of two files:
A TLA+ model of how the engine handles replication requests.
The alternative cluster coordination TLA+ model consists of two files:
The alternative cluster consensus Isabelle model consists of the following theories:
- Basic definitions
- An implementation in functional style
- An implementation in monadic style, along with a proof it's equivalent to the previous
- The proof that each slot is consistent, based on Lamport's Synod algorithm
- The proof that the implementation ensures consistency
- Install the TLA Toolbox
- If on Mac OS, move the downloaded app to the Applications folder first
- Read some documentation
How to run the model checker in headless mode:
- Download tla2tools.jar
- Run the model checker once in TLA+ Toolbox on desktop (can be aborted once started). This generates the folder
elasticsearch.toolbox/model/
that contains all model files that are required to run the model checker in headless mode. - Copy the above folder and
tla2tools.jar
to the server running in headless mode. cd
to the folder and runjava -Xmx30G -cp ../tla2tools.jar tlc2.TLC MC -deadlock -workers 12
. The setting-Xmx30G
denotes the amount of memory to allocate to the model checker and-workers 12
the number of worker threads (should be equal to the number of cores on machine). The setting-deadlock
ensures that TLC explores the full reachable state space, not searching for deadlocks.