Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for internal versioning for concurrency control #38254

Merged
merged 51 commits into from Feb 5, 2019

Conversation

Projects
None yet
5 participants
@bleskes
Copy link
Member

commented Feb 2, 2019

Elasticsearch has long supported compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page:

When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document

We recently introduced a new sequence number based approach that doesn't suffer from this dirty reads problem.

This PR removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.

Notes:

  1. I'm opening this PR to allow downstream projects to test their code to see it doesn't use versioning.
  2. I will open a separate PR with deprecation warnings for 6.x
  3. I still need to add breaking changes documentation
  4. This PR contain some minor fixes I run into that I will extract to a separate PR before merging.

Relates to #1078

@elasticmachine

This comment has been minimized.

Copy link

commented Feb 2, 2019

bleskes added some commits Feb 2, 2019

@spalger spalger referenced this pull request Feb 2, 2019

Merged

Remove dependency on doc versions #29906

4 of 6 tasks complete

bleskes added some commits Feb 3, 2019

spalger added a commit to spalger/kibana that referenced this pull request Feb 5, 2019

Remove dependency on doc versions (elastic#29906)
See elastic/elasticsearch#38254

Using the `version` parameter to implement optimistic concurrency is not going to be supported in 7.0, so we need to replace our usage of document version with the new `_seq_no` and `_primary_term` parameters. These fields are returned in the same way that `_version` was returned on all read/write requests except for search, where it needs to be requested by sending `seq_no_primary_term: true` in the body of the search request. These parameters are sent back to Elasticsearch on write requests with the `if_seq_no` and `if_primary_term` parameters, and are functionally equivalent to sending a `version` in a write request before elastic/elasticsearch#38254.

To make these updates I searched the code base for uses of a `version` and `_version`, then triaged each usage, so I'm fairly confident that I got everything but it's possible something slipped through the cracks, so if you know of any usage of the document version field please help me out by double checking that I converted it.

- [x] **Saved Objects**:  @elastic/kibana-platform, @elastic/es-security - for BWC and ergonomics the `version` provided by the Saved Objects client/API was not removed, it was converted from a number to a string whose value is `base64(json([_seq_no, _primary_term]))`. This allows the Saved Objects API and its consumers to remain mostly unmodified, as long as the underlying value in the version field is irrelevant. This was the case for all usages in Kibana, only thing that needed updating was tests and TS types.

- [x] **Reporting/esqueue**: @joelgriffith, @tsullivan - the version parameter was used here specifically for implementing optimistic concurrency, and since its usage was contained within the esqueue module I just updated it to use the new `_seq_no` and `_primary_term` fields.

- [x] **Task Manager**: @tsullivan @njd5475 - Like esqueue this module uses version for optimistic concurrency but the usage is contained with the module so I just updated it to use, store, and request the `_seq_no` and `_primary_term` fields.

- [ ] **ML**: @elastic/ml-ui - Best I could tell the only "version" in the ML code refers to the stack version, elastic@077245f

- [ ] **Beats CM**: @elastic/beats - Looks like the references to `_version` in the code is only in the types but not in the code itself. I updated the types to use `_seq_no` and `_primary_term`, and their camelCase equivalents where appropriate. I did find a method that used one of the types referencing version but when investigating its usage it seemed the only consumer of that method was itself so i removed it. elastic@52d890f

- [x] **Spaces (tests)**: @elastic/kibana-security - The spaces test helpers use saved objects with versions in a number of places, so I updated them to use the new string versions where the version was predictable, and removed the assertion on version where it wasn't. We test the version in the saved objects code so this should be fine.
@bleskes

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2019

@elasticmachine run elasticsearch-ci/default-distro

bleskes added some commits Feb 5, 2019

@jasontedor

This comment has been minimized.

Copy link
Member

commented Feb 5, 2019

@elasticmachine test this please

@bleskes

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2019

@elasticmachine run elasticsearch-ci/1

bleskes added some commits Feb 5, 2019

@bleskes bleskes merged commit 033ba72 into elastic:master Feb 5, 2019

8 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci/1 Build finished.
Details
elasticsearch-ci/2 Build finished.
Details
elasticsearch-ci/default-distro Build finished.
Details
elasticsearch-ci/docbldesx Build finished.
Details
elasticsearch-ci/docs-check Build finished.
Details
elasticsearch-ci/oss-distro-docs Build finished.
Details
elasticsearch-ci/packaging-sample Build finished.
Details

@bleskes bleskes deleted the bleskes:cas_remove_internal_versioning branch Feb 5, 2019

bleskes added a commit that referenced this pull request Feb 5, 2019

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 11, 2019

Merge branch 'master' into retention-lease-ccr
* master:
  Add an authentication cache for API keys (elastic#38469)
  Fix exit code in certutil packaging test (elastic#38393)
  Enable logs for intermittent test failure (elastic#38426)
  Disable BWC to backport recovering retention leases (elastic#38477)
  Enable bwc tests now that elastic#38443 is backported. (elastic#38462)
  Fix Master Failover and DataNode Leave Blocking Snapshot (elastic#38460)
  Recover retention leases during peer recovery (elastic#38435)
  Set update mappings mater node timeout to 30 min (elastic#38439)
  Assert job is not null in FullClusterRestartIT (elastic#38218)
  Update ilm-api.asciidoc, point to REMOVE policy (elastic#38235) (elastic#38463)
  SQL: Fix esType for DATETIME/DATE and INTERVALS (elastic#38179)
  Handle deprecation header-AbstractUpgradeTestCase (elastic#38396)
  XPack: core/ccr/Security-cli migration to java-time (elastic#38415)
  Disable bwc tests for elastic#38443 (elastic#38456)
  Bubble-up exceptions from scheduler (elastic#38317)
  Re-enable TasksClientDocumentationIT.testCancelTasks (elastic#38234)
  Allow custom authorization with an authorization engine  (elastic#38358)
  CRUDDocumentationIT fix documentation references
  Remove support for internal versioning for concurrency control (elastic#38254)

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Feb 12, 2019

Remove support for internal versioning for concurrency control (elast…
…ic#38254)

Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page:

> When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document 

We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. 

This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.

Relates to elastic#1078
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.