Formal models of core Elasticsearch algorithms
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.

README.md

Formal models of core Elasticsearch algorithms

This repository contains formal models of core Elasticsearch algorithms and is directly related to ongoing implementation efforts around data replication and cluster consensus. We consider this work-in-progress: Models as well as implementations are still evolving and might differ in substantial ways. The formal models mainly serve to illustrate some of the high-level concepts and help to validate resiliency-related aspects.

Models

Data replication

The data replication model consists of two files:

Cluster consensus

The cluster consensus TLA+ model consists of two files:

The cluster consensus Isabelle model consists of the following theories:

Replica engine

A TLA+ model of how the engine handles replication requests.

Zen with terms

An alternative idea for improving the consistency of cluster state updates, effectively by adding the notion of a term to the existing Zen algorithm.

How to edit/run TLA+:

How to run the model checker in headless mode:

  • Download tla2tools.jar
  • Run the model checker once in TLA+ Toolbox on desktop (can be aborted once started). This generates the folder elasticsearch.toolbox/model/ that contains all model files that are required to run the model checker in headless mode.
  • Copy the above folder and tla2tools.jar to the server running in headless mode.
  • cd to the folder and run java -Xmx30G -cp ../tla2tools.jar tlc2.TLC MC -deadlock -workers 12. The setting -Xmx30G denotes the amount of memory to allocate to the model checker and -workers 12 the number of worker threads (should be equal to the number of cores on machine). The setting -deadlock ensures that TLC explores the full reachable state space, not searching for deadlocks.