Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sequence numbers based optimistic concurrency control support to Engine #36467

Merged
merged 7 commits into from Dec 13, 2018

Conversation

bleskes
Copy link
Contributor

@bleskes bleskes commented Dec 11, 2018

This PR add support to engine operations for resolving and verifying the sequence number and primary term of the last modification to a document before performing an operation. This is infrastructure to move our optimistic concurrency control API to use sequence numbers instead of internal versioning.

Relates #36148
Relates #10708

PS there are still discussions about the parameter naming. I'll rename these once we have settled on something.

@bleskes bleskes added >enhancement v7.0.0 :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. v6.6.0 labels Dec 11, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@s1monw s1monw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I am not happy with the naming. I think cas should not be in there. Can we just name them expectedSeqId expectedPrimaryTerm for now? That's what for instance Integer.compareAndSet uses as it's parameter names.

final long seqNo;
final long term;
if (loadSeqNo) {
NumericDocValues seqNos = context.reader().getNumericDocValues(SeqNoFieldMapper.NAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should clean this up in master once backported. can you put a comment in there?

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

Thanks @s1monw and @dnhatn. I incorporated your suggestion.

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

run gradle build tests 1

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

The build failure was regarding refresh listeners. I don't think it's a related but probably signals a concurrency bug. I tried to reproduce this without success (I run ~300 time). Capturing the info here in case it's needed.

15:42:40 FAILURE 18.0s J0 | RefreshListenersTests.testConcurrentRefresh <<< FAILURES!
15:42:40    > Throwable #1: java.lang.AssertionError
15:42:40    > 	at __randomizedtesting.SeedInfo.seed([2848E94A583B5A6E:EF0293D9D997F804]:0)
15:42:40    > 	at org.elasticsearch.index.shard.RefreshListenersTests.lambda$testConcurrentRefresh$8(RefreshListenersTests.java:282)
15:42:40    > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:847)
15:42:40    > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:821)
15:42:40    > 	at org.elasticsearch.index.shard.RefreshListenersTests.testConcurrentRefresh(RefreshListenersTests.java:282)
./gradlew :server:test -Dtests.seed=2848E94A583B5A6E -Dtests.class=org.elasticsearch.index.shard.RefreshListenersTests -Dtests.method="testConcurrentRefresh" -Dtests.security.manager=true -Dtests.locale=pt-PT -Dtests.timezone=UCT -Dcompiler.java=11 -Druntime.java=8

@ywelsch
Copy link
Contributor

ywelsch commented Dec 12, 2018

The build failure was regarding refresh listeners. I don't think it's a related

Similar assertion recently failed on master (#24418 (comment)), so looks to be unrelated to your changes here

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

run gradle build tests 1

@bleskes bleskes merged commit f6b5d7e into elastic:master Dec 13, 2018
@bleskes bleskes deleted the cas_seq_no_engine branch December 13, 2018 07:08
bleskes added a commit to bleskes/elasticsearch that referenced this pull request Dec 13, 2018
…Engine (elastic#36467)

This commit add support to engine operations for resolving and verifying the sequence number and
primary term of the last modification to a document before performing an operation. This is
infrastructure to move our (optimistic concurrency control)[http://en.wikipedia.org/wiki/Optimistic_concurrency_control] API to use sequence numbers instead of internal versioning.

Relates elastic#36148
Relates elastic#10708
bleskes added a commit to bleskes/elasticsearch that referenced this pull request Dec 13, 2018
…Engine (elastic#36467)

This commit add support to engine operations for resolving and verifying the sequence number and
primary term of the last modification to a document before performing an operation. This is
infrastructure to move our (optimistic concurrency control)[http://en.wikipedia.org/wiki/Optimistic_concurrency_control] API to use sequence numbers instead of internal versioning.

Relates elastic#36148
Relates elastic#10708
bleskes added a commit that referenced this pull request Dec 13, 2018
…Engine (#36467)

This commit add support to engine operations for resolving and verifying the sequence number and
primary term of the last modification to a document before performing an operation. This is
infrastructure to move our (optimistic concurrency control)[http://en.wikipedia.org/wiki/Optimistic_concurrency_control] API to use sequence numbers instead of internal versioning.

Relates #36148
Relates #10708
tlrx added a commit that referenced this pull request Jan 9, 2019
tlrx added a commit that referenced this pull request Jan 10, 2019
tlrx added a commit that referenced this pull request Jan 11, 2019
tlrx added a commit that referenced this pull request Jan 15, 2019
tlrx added a commit that referenced this pull request Jan 15, 2019
tlrx added a commit that referenced this pull request Jan 22, 2019
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jan 23, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates elastic#36467
Closes elastic#37684
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jan 23, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates elastic#36467
Closes elastic#37684
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jan 24, 2019
dnhatn added a commit that referenced this pull request Jan 24, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates #36467
Closes #37684
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jan 24, 2019
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jan 28, 2019
dnhatn added a commit that referenced this pull request Jan 28, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates #36467
Closes #37684
tlrx added a commit that referenced this pull request Jan 29, 2019
tlrx added a commit that referenced this pull request Jan 29, 2019
tlrx added a commit that referenced this pull request Jan 29, 2019
tlrx added a commit that referenced this pull request Jan 30, 2019
tlrx added a commit that referenced this pull request Jan 30, 2019
talevy pushed a commit to talevy/elasticsearch that referenced this pull request Feb 15, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates elastic#36467
Closes elastic#37684
talevy added a commit that referenced this pull request Feb 16, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates #36467
Closes #37684
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 26, 2019
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 27, 2019
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 27, 2019
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 27, 2019
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 28, 2019
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Mar 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v6.6.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants