Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable soft-deletes by default on 7.0.0 or later #36141

Merged
merged 19 commits into from Dec 11, 2018

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Dec 2, 2018

This change enables soft-deletes by default on ES 7.0.0 or later.

This change enables soft-deletes by default for 7.0+ indices.
@dnhatn dnhatn added >enhancement v7.0.0 :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. labels Dec 2, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm but there are two things that I wanted to double check:

  1. is there no documentation change needed?
  2. did you look at the consequences of enabling soft deletes regardless of the index version? I can see how it will make some ops based recoveries fail and go back to file copy but I think that will be rare enough to consider an unconditional default that holds for old data as well.

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this is the default there are several places in the CCR docs where this can be cleaned up. Also, the CCR requirements docs should be re-worded.

@dnhatn
Copy link
Member Author

dnhatn commented Dec 3, 2018

did you look at the consequences of enabling soft deletes regardless of the index version? I can see how it will make some ops based recoveries fail and go back to file copy but I think that will be rare enough to consider an unconditional default that holds for old data as well.

@bleskes There are two reasons that I enabled soft-deletes with the index-version condition:

  1. An engine with soft-deletes disabled can't open a soft-deletes Lucene index. If a new index is created in a mixed cluster (7.0 and 6.x) and a 6.x replica won't be able to open a 7.0 soft-deletes index.

  2. CCR requires soft-deletes enabled on the leader before following. If a 7.x leader index has soft-deletes enabled by default, a 6.x follower won't be able to recognize it. This approach does not cover well this case.

We need to set soft-deletes default setting explicitly when creating a new index in MetaDataCreateIndexService#IndexCreationTask. WDYT?

@bleskes
Copy link
Contributor

bleskes commented Dec 3, 2018

An engine with soft-deletes disabled can't open a soft-deletes Lucene index.

I take it you mean the 6.x node will not see that soft deletes are enabled because it has a different default?

CCR requires soft-deletes enabled on the leader before following.

This is a problem in your solution too?

We need to set soft-deletes default setting explicitly when creating a new index in MetaDataCreateIndexService#IndexCreationTask

I think it's good to explore this (although, I don't think it's enough). My goal is to discuss what the implications would be to make 7.x engines always run with soft deletes, so we can potentially remove the translog retention logic (beyond the global checkpoint aspects). I'm still debating this personally but I wanted to raise the discussion.

@s1monw
Copy link
Contributor

s1monw commented Dec 3, 2018

An engine with soft-deletes disabled can't open a soft-deletes Lucene index. If a new index is created in a mixed cluster (7.0 and 6.x) and a 6.x replica won't be able to open a 7.0 soft-deletes index.

can you explain why this is the case?

@dnhatn
Copy link
Member Author

dnhatn commented Dec 3, 2018

@s1monw If we use an implicit default value, it's possible to have a primary on 7.0 with soft-deletes enabled (by default), and a replica on 6.x with soft-deletes disabled (also by default). Then that replica won't be able to open a Lucene index copied from the primary because we set soft-deletes field in IndexWriterConfig iff soft-deletes is enabled. Setting soft-deletes setting explicitly when creating an index would solve these two issues entirely.

@dnhatn
Copy link
Member Author

dnhatn commented Dec 3, 2018

@bleskes I have updated this PR to set the soft-deletes setting explicitly when creating a new index.
@jasontedor I have adjusted the ccr doc.

Could you please have another look? Thank you!

@s1monw
Copy link
Contributor

s1monw commented Dec 3, 2018

IndexWriterConfig iff soft-deletes is enabled. Setting soft-deletes setting explicitly when creating an index would solve these two issues entirely.

yeah so I think we don't have to do that. We can just always set it and then we don't have any issues?

@dnhatn
Copy link
Member Author

dnhatn commented Dec 3, 2018

yeah so I think we don't have to do that. We can just always set it and then we don't have any issues?

@s1monw Yes, that's correct. Another issue is the validation in CCR that requires an explicit value.

@jasontedor
Copy link
Member

Another issue is the validation in CCR that requires an explicit value.

Let’s work to remove this then. I don’t like what you had to do in create index service for example. I’m okay saying if not explicitly set, check the index creating version and assume false if < some cutoff (e.g., 7.0.0 for simplicity) else true.

@dnhatn
Copy link
Member Author

dnhatn commented Dec 4, 2018

Thanks everyone for the inputs. I will provide the default value for the setting and adjust other parts accordingly.

@dnhatn dnhatn changed the title Enable soft-deletes by default for 7.0+ indices Enable soft-deletes by default on 7.0.0 or later Dec 4, 2018
Tim-Brooks pushed a commit to Tim-Brooks/elasticsearch that referenced this pull request Dec 4, 2018
Today we configure the soft-deletes field iff soft-deletes enabled.
Although this choice was correct, it prevents an engine with
soft-deletes disabled from opening a Lucene index with soft-deletes.
Moreover, this change should not have any side-effect if a Lucene index
does not have any soft-deletes.

Relates elastic#36141
dnhatn added a commit that referenced this pull request Dec 4, 2018
Today we configure the soft-deletes field iff soft-deletes enabled.
Although this choice was correct, it prevents an engine with
soft-deletes disabled from opening a Lucene index with soft-deletes.
Moreover, this change should not have any side-effect if a Lucene index
does not have any soft-deletes.

Relates #36141
@@ -101,7 +101,8 @@ which returns something similar to:
"translog_generation" : "2",
"max_seq_no" : "-1",
"sync_id" : "AVvFY-071siAOuFGEO9P", <1>
"max_unsafe_auto_id_timestamp" : "-1"
"max_unsafe_auto_id_timestamp" : "-1",
"min_retained_seq_no": "0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this added?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We store the min_retained_seq_no of SoftDeletesPolicy in a Lucene index commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is not part of this pr?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this change is a part of this PR. We bake this value iff soft-deletes is enabled. Previously, we did not enable soft-deletes but now we do by default; therefore a Lucene commit also by default has it.

dnhatn added a commit that referenced this pull request Dec 5, 2018
Today we configure the soft-deletes field iff soft-deletes enabled.
Although this choice was correct, it prevents an engine with
soft-deletes disabled from opening a Lucene index with soft-deletes.
Moreover, this change should not have any side-effect if a Lucene index
does not have any soft-deletes.

Relates #36141
* elastic/master: (36 commits)
  Add check for minimum required Docker version (elastic#36497)
  Minor search controller changes (elastic#36479)
  Add default methods to DocValueFormat (elastic#36480)
  Fix the mixed cluster REST test explain/11_basic_with_types.
  Modify `BigArrays` to take name of circuit breaker (elastic#36461)
  Move LoggedExec to minimumRuntime source set (elastic#36453)
  added 6.5.4 version
  Add test logging for elastic#35644
  Tests- added helper methods to ESRestTestCase for checking warnings (elastic#36443)
  SQL: move requests' parameters to requests JSON body (elastic#36149)
  [Zen2] Respect the no_master_block setting (elastic#36478)
  Require soft-deletes when access changes snapshot (elastic#36446)
  Switch more tests to zen2 (elastic#36367)
  [Painless] Add extensive tests for def to primitive casts (elastic#36455)
  Add setting to bypass Rollover action (elastic#36235)
  Try running CI against Zulu (elastic#36391)
  [DOCS] Reworked the shard allocation filtering info.  (elastic#36456)
  Log [initial_master_nodes] on formation failure (elastic#36466)
  converting ForbiddenPatternsTask to .java (elastic#36194)
  fixed typo
  ...
@jasontedor
Copy link
Member

@dnhatn I pushed one more documentation change. It is from documentation that I added in #36333 after you opened this pull request.

@dnhatn
Copy link
Member Author

dnhatn commented Dec 11, 2018

Thanks everyone :).

@dnhatn dnhatn merged commit 51800de into elastic:master Dec 11, 2018
@dnhatn dnhatn deleted the default-soft-deletes branch December 11, 2018 23:58
dnhatn added a commit that referenced this pull request Feb 25, 2019
Today when users upgrade to 7.0, existing indices will automatically
switch to soft-deletes without an opt-out option. With this change, 
we only enable soft-deletes by default for new indices.

Relates #36141
dnhatn added a commit that referenced this pull request Feb 25, 2019
Today when users upgrade to 7.0, existing indices will automatically
switch to soft-deletes without an opt-out option. With this change, 
we only enable soft-deletes by default for new indices.

Relates #36141
dnhatn added a commit that referenced this pull request Feb 26, 2019
Today when users upgrade to 7.0, existing indices will automatically
switch to soft-deletes without an opt-out option. With this change, 
we only enable soft-deletes by default for new indices.

Relates #36141
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants