Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Never leave stale delete tombstones in version map #29619

Merged
merged 5 commits into from Apr 20, 2018

Conversation

Projects
None yet
5 participants
@dnhatn
Copy link
Contributor

commented Apr 19, 2018

Today the VersionMap does not clean up a stale delete tombstone if it
does not require safe access. However, in a very rare situation due to
concurrent refreshes, the safe-access flag may be flipped over then an
engine accidentally consult that stale delete tombstone.

This commit ensures to never leave stale delete tombstones in a version
map by always pruning delete tombstones when putting a new index entry
regardless of the value of the safe-access flag.

Never leave stale delete tombstones in version map
Today the VersionMap does not clean up a stale delete tombstone if it
does not require safe access. However, in a very rare situation due to
concurrent refreshes, the safe-access flag may be flipped over then an
engine accidentally consult that stale delete tombstone.

This commit ensures to never leave stale delete tombstones in a version
map by always pruning delete tombstones when putting a new index entry
regardless of the value of the safe-access flag.
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Apr 19, 2018

@dnhatn

This comment has been minimized.

Copy link
Contributor Author

commented Apr 19, 2018

@DaveCTurner, I have a test simulating the violated case found by the formal model. However, I discussed with @bleskes and decided not to include it because the version-map test may be enough. You can refer this test at 2cff7fb.

@DaveCTurner
Copy link
Contributor

left a comment

I think this looks good, but I'm not going to be able to check it in detail before tomorrow now.

@bleskes
Copy link
Member

left a comment

prod code lgtm. I left a question on the tests.

@@ -312,6 +312,10 @@ void maybePutIndexUnderLock(BytesRef uid, IndexVersionValue version) {
if (maps.isSafeAccessMode()) {
putIndexUnderLock(uid, version);
} else {
// The existing delete tombstone is no longer the latest version (.i.e stale)
// when a newer version is processed regardless of the safe mode of the version map.
// Therefore, it's better to always prune that tombstone to avoid accidental accesses.

This comment has been minimized.

Copy link
@bleskes

bleskes Apr 19, 2018

Member

How about:
// Even though we don't store a record of the indexing operation (and mark as unsafe) we should still remove any previous delete for this uuid. Note that this is expected to not hurt performance because the tombstone is small (if not empty) when usafe is relevant.

This comment has been minimized.

Copy link
@dnhatn

dnhatn Apr 19, 2018

Author Contributor

I've updated the comment.

@@ -396,6 +397,50 @@ public void testPruneTombstonesWhileLocked() throws InterruptedException, IOExce
assertEquals(0, map.getAllTombstones().size());
}

public void testNeverLeaveStaleDeleteTombstone() throws Exception {

This comment has been minimized.

Copy link
@bleskes

bleskes Apr 19, 2018

Member

can you explain why we need concurrency here? (i.e., why a single threaded test is not enough)

This comment has been minimized.

Copy link
@dnhatn

dnhatn Apr 19, 2018

Author Contributor

A single threaded test is good enough ;)

@dnhatn

This comment has been minimized.

Copy link
Contributor Author

commented Apr 19, 2018

@bleskes I've addressed your comments. Can you please have another look? Thank you.

@dnhatn dnhatn requested a review from bleskes Apr 19, 2018

@bleskes
Copy link
Member

left a comment

LGTM. Left some comments. No need for another review.

} else {
versionMap.maybePutIndexUnderLock(uid, new IndexVersionValue(randomTranslogLocation(), version, 1, 1));
}
VersionValue storedValue = versionMap.getUnderLock(uid);

This comment has been minimized.

Copy link
@bleskes

bleskes Apr 19, 2018

Member

can we also check that this return value correlates with what we expect? i.e., if the doc was just deleted it should be deleted, if it was indexed it should never be marked as deleted? (and found if we're in safe mode). Also can you check this property holds also not directly after indexing/deleting? (i.e., after refresh, enforceSafe access etc). Said differently - maybe also randomly do nothing under lock?

@@ -396,6 +397,32 @@ public void testPruneTombstonesWhileLocked() throws InterruptedException, IOExce
assertEquals(0, map.getAllTombstones().size());
}

public void testNeverLeaveStaleDeleteTombstone() throws Exception {

This comment has been minimized.

Copy link
@bleskes

bleskes Apr 19, 2018

Member

Maybe name the test: "testRandomlyDeletingAndIndexingDoc"? it tests much more, which is good.

dnhatn added some commits Apr 19, 2018

@dnhatn

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2018

Thanks @bleskes and @DaveCTurner for reviewing.

@DaveCTurner I am merging this to catch the release train but feel free to leave your feedbacks.

@dnhatn dnhatn merged commit 9cf8b01 into elastic:master Apr 20, 2018

3 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci Build finished.
Details
elasticsearch-ci/packaging-sample Build finished.
Details

@dnhatn dnhatn deleted the dnhatn:never-leave-stale-tombstone branch Apr 20, 2018

dnhatn added a commit that referenced this pull request Apr 20, 2018

TEST: Unmute testPrimaryRelocationWhileIndexing
Previously we did not put an indexing to a version map if that map does
not require safe access but removed the existing delete tombstone only
if assertion enabled. In #29585, we removed the side-effect caused by
assertion then this test started failing. This failure can be explained
as follows:

- Step 1: Index a doc then delete that doc

- Step 2: The version map can switch to unsafe mode because of
  concurrent refreshes (implicitly called by flushes)

- Step 3: Index a document - the version map won't add this version
  value and won't prune the tombstone (previously it did)

- Step 4: Delete a document - this will return NOT_FOUND instead of
  DELETED because of the stale delete tombstone

This failure is actually fixed by #29619 in which we never leave stale
delete tombstones

Closes #29626

dnhatn added a commit that referenced this pull request Apr 20, 2018

Never leave stale delete tombstones in version map (#29619)
Today the VersionMap does not clean up a stale delete tombstone if it
does not require safe access. However, in a very rare situation due to
concurrent refreshes, the safe-access flag may be flipped over then an
engine accidentally consult that stale delete tombstone.

This commit ensures to never leave stale delete tombstones in a version
map by always pruning delete tombstones when putting a new index entry
regardless of the value of the safe-access flag.

dnhatn added a commit that referenced this pull request Apr 20, 2018

TEST: Unmute testPrimaryRelocationWhileIndexing
Previously we did not put an indexing to a version map if that map does
not require safe access but removed the existing delete tombstone only
if assertion enabled. In #29585, we removed the side-effect caused by
assertion then this test started failing. This failure can be explained
as follows:

- Step 1: Index a doc then delete that doc

- Step 2: The version map can switch to unsafe mode because of
  concurrent refreshes (implicitly called by flushes)

- Step 3: Index a document - the version map won't add this version
  value and won't prune the tombstone (previously it did)

- Step 4: Delete a document - this will return NOT_FOUND instead of
  DELETED because of the stale delete tombstone

This failure is actually fixed by #29619 in which we never leave stale
delete tombstones

Closes #29626

martijnvg added a commit that referenced this pull request Apr 20, 2018

Merge remote-tracking branch 'es/master' into ccr
* es/master: (32 commits)
  TEST: Unmute testPrimaryRelocationWhileIndexing
  Remove remaining tribe node references (#29574)
  Never leave stale delete tombstones in version map (#29619)
  Do not serialize common stats flags using ordinal (#29600)
  Remove stale comment from JVM stats (#29625)
  TEST: Mute testPrimaryRelocationWhileIndexing
  Remove bulk fallback for write thread pool (#29609)
  Fix an incorrect reference to 'zero_terms_docs' in match_phrase queries.
  Update the version compatibility for zero_terms_query in match_phrase.
  Account translog location to ram usage in version map
  Remove extra spaces from changelog
  Add support to match_phrase query for zero_terms_query. (#29598)
  Fix incorrect references to 'zero_terms_docs' in query parsing error messages. (#29599)
  Build: Move java home checks to pre-execution phase (#29548)
  Avoid side-effect in VersionMap when assertion enabled (#29585)
  [Tests] Remove accidental logger usage
  Add tests for ranking evaluation with aliases (#29452)
  Deprecate use of `htmlStrip` as name for HtmlStripCharFilter (#27429)
  Update plan for the removal of mapping types. (#29586)
  [Docs] Add rankEval method for Jva HL client
  ...

martijnvg added a commit that referenced this pull request Apr 20, 2018

Merge remote-tracking branch 'es/6.x' into ccr-6.x
* es/6.x: (28 commits)
  TEST: Unmute testPrimaryRelocationWhileIndexing
  Never leave stale delete tombstones in version map (#29619)
  Do not serialize common stats flags using ordinal (#29600)
  Remove stale comment from JVM stats (#29625)
  TEST: Mute testPrimaryRelocationWhileIndexing
  Remove 7.0.0 from 6.x changelog (#29621)
  Add support to match_phrase query for zero_terms_query. (#29598)
  Account translog location to ram usage in version map
  Avoid side-effect in VersionMap when assertion enabled (#29585)
  Build: Move java home checks to pre-execution phase (#29548)
  Add tests for ranking evaluation with aliases (#29452)
  [Test] Fix assertion in SearchDocumentationIT
  Deprecate use of `htmlStrip` as name for HtmlStripCharFilter (#27429)
  test: Assert deprecated http.enebled setting warning
  Update plan for the removal of mapping types. (#29586)
  [Docs] Add rankEval method for Jva HL client
  Make ranking evaluation details accessible for client
  Rename the bulk thread pool to write thread pool (#29593)
  [Test] Minor changes to rank_eval tests (#29577)
  test: Assert deprecated http.enebled setting warning
  ...

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.