New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index deletes not applied when cluster UUID has changed #16825

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
5 participants
@abeyad
Contributor

abeyad commented Feb 26, 2016

If a node was isolated from the cluster while a delete was happening, the node will ignore the deleted operation when rejoining as we couldn't detect whether the new master genuinely deleted the indices or it is a new fresh "reset" master that was started without the old data folder. We can now be smarter and detect these reset masters and actually delete the indices on the node if its not the case of a reset master.

Note that this new protection doesn't hold if the node was shut down. In that case it's indices will still be imported as dangling indices.

Closes #11665

@abeyad

This comment has been minimized.

Show comment
Hide comment
@abeyad

abeyad Feb 26, 2016

Contributor

@bleskes @dakrone @brwe feedback would be appreciated :)

Contributor

abeyad commented Feb 26, 2016

@bleskes @dakrone @brwe feedback would be appreciated :)

@dakrone

View changes

Show outdated Hide outdated core/src/main/java/org/elasticsearch/cluster/ClusterChangedEvent.java
@dakrone

View changes

Show outdated Hide outdated core/src/main/java/org/elasticsearch/cluster/ClusterChangedEvent.java
@@ -134,10 +142,18 @@ public boolean indexRoutingTableChanged(String index) {
return deleted == null ? Collections.<String>emptyList() : deleted;
}
/**
* Returns <code>true</code> iff the metadata for the cluster has changed between

This comment has been minimized.

@bleskes

bleskes Feb 29, 2016

Member

can we add a note saying this is an reference level equality and not a true equal.

@bleskes

bleskes Feb 29, 2016

Member

can we add a note saying this is an reference level equality and not a true equal.

This comment has been minimized.

@abeyad

abeyad Feb 29, 2016

Contributor

Done, I also added that note to every other reference equality check in the class

@abeyad

abeyad Feb 29, 2016

Contributor

Done, I also added that note to every other reference equality check in the class

public boolean metaDataChanged() {
return state.metaData() != previousState.metaData();
}
/**
* Returns <code>true</code> iff the {@link IndexMetaData} for a given index

This comment has been minimized.

@bleskes

bleskes Feb 29, 2016

Member

same here re type of equality.

@bleskes

bleskes Feb 29, 2016

Member

same here re type of equality.

}
// Tests that the indices change list is correct as well as metadata equality when the metadata has changed.
private static void metaDataChangesCheck(final boolean changeClusterUUID) {

This comment has been minimized.

@bleskes

bleskes Feb 29, 2016

Member

I think we can fold this test into one test and rarely change the cluster uuid?

@bleskes

bleskes Feb 29, 2016

Member

I think we can fold this test into one test and rarely change the cluster uuid?

This comment has been minimized.

@abeyad

abeyad Feb 29, 2016

Contributor

@bleskes I'm not clear on what is meant here? The testMetaDataChangesOnNoMasterChange and testMetaDataChangesOnNewClusterUUID tests each use this helper method as the test is essentially the same except for whether we change the cluster UUID or not. I figured we needed a test to check metadata changes when the cluster UUID is changed and one when it is not.

@abeyad

abeyad Feb 29, 2016

Contributor

@bleskes I'm not clear on what is meant here? The testMetaDataChangesOnNoMasterChange and testMetaDataChangesOnNewClusterUUID tests each use this helper method as the test is essentially the same except for whether we change the cluster UUID or not. I figured we needed a test to check metadata changes when the cluster UUID is changed and one when it is not.

@bleskes

View changes

Show outdated Hide outdated core/src/test/java/org/elasticsearch/cluster/ClusterChangedEventTests.java
@bleskes

This comment has been minimized.

Show comment
Hide comment
@bleskes

bleskes Feb 29, 2016

Member

I left some very minor comments. Looks good. I thought we had a test somewhere about index deletion with master change (which we should be able to enable now), but maybe I'm wrong. @brwe do you remember?

Member

bleskes commented Feb 29, 2016

I left some very minor comments. Looks good. I thought we had a test somewhere about index deletion with master change (which we should be able to enable now), but maybe I'm wrong. @brwe do you remember?

@bleskes

This comment has been minimized.

Show comment
Hide comment
@bleskes

bleskes Feb 29, 2016

Member

also re the pr description:

This commit fixes the issue by only reimporting indices from data nodes if and only
if the cluster UUID on the master node is different from the cluster
UUID of the previous cluster state on the data node

This inaccurate - on disk dangling indices will still be imported - for example if a node was off while the index was deleted. This change only deals with the case where a node was isolated from the cluster while a delete happened (but was up!) and where a master failure happened during a delete operation (but after the delete was already committed). I think we should something like:

If a node was isloated from the cluster while a delete was happening, the node will ignore the deleted operation when rejoining as we couldn't detect whether the new master genuinely deleted the indices or is a new fresh "resetted" master that was started without the old data folder. We can now be smarter and detect these resetted masters and actually delete the indices if this is not the case.

Note that this new protection doesn't hold if the node was shut down. In that case it's indices will still be imported as dangling indices.

Member

bleskes commented Feb 29, 2016

also re the pr description:

This commit fixes the issue by only reimporting indices from data nodes if and only
if the cluster UUID on the master node is different from the cluster
UUID of the previous cluster state on the data node

This inaccurate - on disk dangling indices will still be imported - for example if a node was off while the index was deleted. This change only deals with the case where a node was isolated from the cluster while a delete happened (but was up!) and where a master failure happened during a delete operation (but after the delete was already committed). I think we should something like:

If a node was isloated from the cluster while a delete was happening, the node will ignore the deleted operation when rejoining as we couldn't detect whether the new master genuinely deleted the indices or is a new fresh "resetted" master that was started without the old data folder. We can now be smarter and detect these resetted masters and actually delete the indices if this is not the case.

Note that this new protection doesn't hold if the node was shut down. In that case it's indices will still be imported as dangling indices.

@brwe

This comment has been minimized.

Show comment
Hide comment
@brwe

brwe Feb 29, 2016

Contributor

@brwe do you remember?

I think that is the one that @abeyad enabled here: https://github.com/elastic/elasticsearch/pull/16825/files#diff-c454159f60a0127854028ccb5502500eL1076 I know of no other one.

Contributor

brwe commented Feb 29, 2016

@brwe do you remember?

I think that is the one that @abeyad enabled here: https://github.com/elastic/elasticsearch/pull/16825/files#diff-c454159f60a0127854028ccb5502500eL1076 I know of no other one.

@bleskes

This comment has been minimized.

Show comment
Hide comment
@bleskes

bleskes Feb 29, 2016

Member

@brwe I missed that one line in the white space fixing. Jet lag... :( thanks!

Member

bleskes commented Feb 29, 2016

@brwe I missed that one line in the white space fixing. Jet lag... :( thanks!

@abeyad

This comment has been minimized.

Show comment
Hide comment
@abeyad

abeyad Feb 29, 2016

Contributor

I fixed the issues raised by @bleskes and @dakrone and all tests pass. I also updated the PR description and will change the commit message as well once the PR is approved.

Contributor

abeyad commented Feb 29, 2016

I fixed the issues raised by @bleskes and @dakrone and all tests pass. I also updated the PR description and will change the commit message as well once the PR is approved.

@bleskes

This comment has been minimized.

Show comment
Hide comment
@bleskes

bleskes Mar 1, 2016

Member

LGTM. Thanks @abeyad

Member

bleskes commented Mar 1, 2016

LGTM. Thanks @abeyad

Ali Beyad
Index deletes not applied when cluster UUID has changed
If a node was isolated from the cluster while a delete was happening,
the node will ignore the deleted operation when rejoining as we couldn't
detect whether the new master genuinely deleted the indices or it is a
new fresh "reset" master that was started without the old data folder.
We can now be smarter and detect these reset masters and actually delete
the indices on the node if its not the case of a reset master.

Note that this new protection doesn't hold if the node was shut down. In
that case it's indices will still be imported as dangling indices.

Closes #11665

@abeyad abeyad closed this in 83d1e09 Mar 1, 2016

@abeyad abeyad added v2.3.0 and removed review v2.3.0 labels Mar 1, 2016

abeyad pushed a commit to abeyad/elasticsearch that referenced this pull request Mar 1, 2016

Ali Beyad
Index deletes not applied when cluster UUID has changed
This commit backports commit 83d1e09
from master to 2.x.

Relates #16825

abeyad pushed a commit that referenced this pull request Mar 10, 2016

Ali Beyad
Index deletes not applied when cluster UUID has changed
This commit backports commit 83d1e09
from master to 2.x.

Relates #16825
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment