A VM pause (due to GC, high IO load, etc) can cause the loss of inserted documents #10426

aphyr opened this issue Apr 4, 2015 · 10 comments

aphyr commented Apr 4, 2015

Following up on #7572 and #10407, I've found that Elasticsearch will lose inserted documents even in the event of a node hiccup due to garbage collection, swapping, disk failure, IO panic, virtual machine pauses, VM migration, etc. https://gist.github.com/aphyr/b8c98e6149bc66a2d839 shows a log where we pause an elasticsearch primary via SIGSTOP and SIGCONT. Even though no operations can take place against the suspended node during this time, and a new primary for the cluster comes to power, it looks like the old primary is still capable of acking inserts which are not replicated to the new primary--somewhere right before or right after the pause. The result is the loss of ~10% of acknowledged inserts.

You can replicate these results with Jepsen (commit e331ff3578), by running lein test :only elasticsearch.core-test/create-pause in the elasticsearch directory.
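
To make the failure mode concrete, here is a minimal sketch, in Java rather than Jepsen's actual Clojure nemesis, of what the pause amounts to: SIGSTOP freezes the target process (much like a very long GC pause or VM migration), and SIGCONT resumes it with its old view of the cluster. The pid argument and the 30-second duration are illustrative assumptions.

// Illustrative sketch only, not Jepsen's nemesis code. Assumes a POSIX
// system where the standard `kill` utility is on the PATH.
public class PauseNemesis {
    static void signal(String sig, long pid) throws Exception {
        new ProcessBuilder("kill", "-" + sig, Long.toString(pid))
                .inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        long pid = Long.parseLong(args[0]); // pid of the primary's JVM
        signal("STOP", pid);                // node is frozen; it cannot observe the new master election
        Thread.sleep(30_000);               // hold the pause for an arbitrary 30 seconds
        signal("CONT", pid);                // node resumes, still believing it is the primary
    }
}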

Looking through the Elasticsearch cluster state code (which I am by no means qualified to understand or evaluate), I get the... really vague, probably incorrect impression that Elasticsearch might make a couple assumptions:

  1. Primaries are considered authoritative "now", without a logical clock that identifies what "now" means.
  2. Operations like "insert a document" don't... seem... to carry a logical clock with them allowing replicas to decide whether or not the operation supersedes their state, which means that messages delayed in flight can show up and cause interesting things to happen.

Are these at all correct? Have you considered looking into an epoch/term/generation scheme? If primaries are elected uniquely for a certain epoch, you can tag each operation with that epoch and use it to reject invalid requests from the logical past--invariants around advancing the epoch, in turn, can enforce the logical monotonicity of operations. It might make it easier to tamp down race conditions like this.
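
To make the suggestion concrete, here is a minimal sketch of the term check being proposed (the names are illustrative, not Elasticsearch code): each replication request carries the term of the primary that issued it, and a replica refuses anything stamped with an older term.

// Minimal sketch of the epoch/term idea; not Elasticsearch's implementation.
class TermCheckingReplica {
    private long highestTermSeen = 0;

    // Returns true if the write was applied, false if it was rejected because
    // it came from a primary elected in an older term (the logical past).
    synchronized boolean applyIndexOp(long primaryTerm, String docId, String source) {
        if (primaryTerm < highestTermSeen) {
            return false; // sender has been deposed; refuse instead of acking
        }
        highestTermSeen = primaryTerm;
        // ... apply the write to the local shard copy here ...
        return true;
    }
}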

bleskes commented Apr 4, 2015

Thx @aphyr. In general, I can give a quick answer to one part while we research the rest:

Have you considered looking into an epoch/term/generation scheme?

This is indeed the current plan.

bleskes commented Apr 10, 2015

We have made some effort to reproduce this failure. In general, we see GC as just another disruption that can happen, the same way we view network issues and file corruptions. If anyone is interested in the work we do there, the org.elasticsearch.test.disruption package and DiscoveryWithServiceDisruptionsTests are a good place to look.

In the Jepsen runs that failed for us, Jepsen created an index and then paused the JVM of the master node, to which the primary of one of the index shards happened to be allocated. At the time the JVM was paused, no other replica of this shard was fully initialized after the initial creation. Because the master JVM was paused, the other nodes elected another master, but that cluster had no copies left for that specific shard. This left the cluster in a red state. When the node is unpaused it rejoins the cluster. The shard is not re-allocated because we require a quorum of copies to assign a primary (in order to make sure we do not reuse an out-of-date copy). As such, the cluster stays red and all the data previously indexed into this shard is not available for searches.
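
As a rough illustration of the allocation rule described above (the names and the exact quorum formula here are assumptions, not Elasticsearch's actual allocator), a shard copy is only promoted to primary when a quorum of the configured copies is available, so a single possibly out-of-date copy cannot silently become primary on its own:

// Sketch of "require a quorum of copies before assigning a primary";
// the majority formula is an assumption used for illustration.
class PrimaryAllocationSketch {
    static boolean canAssignPrimary(int availableCopies, int configuredCopies) {
        int quorum = configuredCopies / 2 + 1; // majority of all configured copies
        return availableCopies >= quorum;
    }

    public static void main(String[] args) {
        // 1 primary + 1 replica configured = 2 copies; only the rejoining
        // node's copy is available, so the shard stays unassigned (red).
        System.out.println(canAssignPrimary(1, 2)); // false
    }
}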

When we changed Jepsen to wait for all replicas to be assigned before starting the nemesis, the failure doesn't happen anymore. This change, and some other improvements, are part of this PR to Jepsen.

That said, because a GC pause and an unresponsive network are similar in nature, there is still a small window in which documents can be lost; this is captured by #7572 and documented on the resiliency status page.

@aphyr can you confirm whether, with the changes in the PR, you see the same behavior?

aphyr commented Apr 15, 2015

Thanks for this, @bleskes! I have been super busy with a few other issues but this is the last one I have to clear before talks go! I'll take a look tomorrow morning. :)

bleskes commented Apr 21, 2015

@aphyr re our previous discussion of:

Have you considered looking into an epoch/term/generation scheme?
This is indeed the current plan.

If you're curious - I've opened a (high-level) issue describing our current thinking - see #10708.

aphyr commented Apr 28, 2015

I've merged your PR, and can confirm that ES still drops documents when a primary process is paused.

{:valid? false,
 :lost "#{1761}",
 :recovered
 "#{0 2..3 8 30 51 73 97 119 141 165 187 211 233 257 279 302 324 348 371 394 436 457 482 504 527 550 572 597 619 642 664 688 711 734 758 781 804 827 850 894 911 934 957 979 1003 1025 1049 1071 1092 1117 1138 1163 1185 1208 1230 1253 1277 1299 1342 1344 1350 1372 1415 1439 1462 1485 1508 1553 1576 1599 1623 1645 1667 1690 1714 1736 1779 1803 1825 1848 1871 1893 1917 1939 1964 1985 2010 2031 2054 2077 2100 2123 2146 2169 2192}",
 :ok "#{0..1344 1346..1392 1394..1530 1532..1760 1762..2203}",
 :recovered-frac 24/551,
 :unexpected-frac 0,
 :unexpected "#{}",
 :lost-frac 1/2204,
 :ok-frac 550/551}

dakrone commented Apr 28, 2015

@aphyr thanks for running it! I think the PR helps rule out the index not being in a green state before the test starts as one cause of document loss (though not the only cause). I will keep running the test with additional logging to try to reproduce the failure you see.

colings86 added the :Distributed/CRUD label Apr 24, 2018
@elasticmachine

Pinging @elastic/es-distributed

edeak commented Jan 7, 2020

any updates on this?

@matthiasg

@edeak indeed! A little unnerving to see this issue still open (either it is done and leaving it open was just an oversight, or they did not fix it?). Either way, a shame given all the work that @aphyr put into this.

ywelsch commented Feb 13, 2020

The issues found here were caused by problems in both the data replication subsystem and the cluster coordination subsystem, on which data replication also relies for correctness. All known issues in this area relating to this problem have since been fixed. As part of the sequence numbers effort, we've introduced primary terms, which allow rejecting invalid requests from the logical past. With the new cluster coordination subsystem introduced in ES 7 (#32006), the remaining known coordination-level issues ("Repeated network partitions can cause cluster state updates to be lost") have been fixed as well.
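
As a rough illustration of how primary terms combine with sequence numbers (illustrative types only, not the actual implementation), every operation can be ordered by the pair (primary term, sequence number), so a write stamped with a deposed primary's term is recognizably from the logical past:

// Illustrative Java 16+ record, not Elasticsearch code: the term takes
// precedence, so operations from an older, deposed primary's term sort
// before, and can be rejected in favor of, operations from the current term.
record OpId(long primaryTerm, long seqNo) implements Comparable<OpId> {
    @Override
    public int compareTo(OpId other) {
        int byTerm = Long.compare(primaryTerm, other.primaryTerm);
        return byTerm != 0 ? byTerm : Long.compare(seqNo, other.seqNo);
    }
}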

ywelsch closed this as completed Feb 13, 2020