Changes in unassigned info and version might not be transferred as part of cluster state diffs #12387
Conversation
@bleskes could you take a look when you have a chance? Somehow, adding version to the equals check triggers retry failures in RecoveryPercolatorTests. Any idea why?
elastic#12242 introduced a unique id for an assignment of a shard to a node. We should use these ids to drive the decisions made by IndicesClusterStateService when processing the new cluster state sent by the master. If the local shard has a different allocation id than the new cluster state, the shard will be removed and a new one will be created. This fixes a couple of subtle bugs, most notably that a node previously got confused if an incoming cluster state had a newly allocated shard in the initializing state while the local copy was started (which can happen if cluster state updates are bulk processed). In that case, the node would previously have re-used the local copy instead of initializing a new one. Also, a set of utility methods was introduced on ShardRouting to do various types of matching with other shard routings, giving control over what exactly should be matched (same shard id, same allocation id, all but version and shard info, etc.). This is useful here, but also prepares the ground for the change needed in elastic#12387 (making ShardRouting.equals strict and perform exact equality).
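The allocation-id-driven matching described above can be sketched roughly as follows. This is a hypothetical, simplified class, not the actual Elasticsearch ShardRouting; the method names mirror the kinds of matching the comment describes, but the real signatures may differ:

```java
import java.util.Objects;

// Simplified stand-in for ShardRouting: decisions key off the allocation id
// (the unique id of one shard-to-node assignment), not just the shard id.
final class ShardRoutingSketch {
    final String index;
    final int shardId;
    final String allocationId; // unique per assignment of this shard to a node
    final long version;

    ShardRoutingSketch(String index, int shardId, String allocationId, long version) {
        this.index = index;
        this.shardId = shardId;
        this.allocationId = allocationId;
        this.version = version;
    }

    /** Same logical shard (index + shard id), regardless of which assignment it is. */
    boolean isSameShard(ShardRoutingSketch other) {
        return index.equals(other.index) && shardId == other.shardId;
    }

    /** Same assignment of the shard to a node: the allocation ids must also match. */
    boolean isSameAllocation(ShardRoutingSketch other) {
        return isSameShard(other) && Objects.equals(allocationId, other.allocationId);
    }
}
```

Under this sketch, a node processing an incoming cluster state would keep its local shard only when `isSameAllocation` holds; a matching shard id with a different allocation id means the local copy belongs to an old assignment and must be removed and re-created.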
I had a look. I like the change (didn't do a proper review). The source of trouble was IndicesClusterStateService being confused by the new equals semantics. I fixed this in #12397. Once it's in we can try again.
Closes elastic#12397
Force-pushed from 1b23371 to 323f106
@dakrone rebased, force pushed, and this can use a review
- public static boolean mapsEqualIgnoringArrayOrder(Map<String, Object> first, Map<String, Object> second) {
+ public static String mapsEqualIgnoringArrayOrder(String path, Map<String, Object> first, Map<String, Object> second) {
This is kind of confusing. Instead of trying to use this method as both a comparing method and an explaining method, maybe create an additional method called differenceBetweenMaps (or something) that returns the explanation, and leave this one as a boolean method? Otherwise the name doesn't really match its return type, and in the comparison it's confusing to be using a string as the expected null value instead of the built-in JUnit explanation parameter on all the assert* methods.
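The suggested split could look roughly like this. A minimal sketch with a hypothetical class name, and with the array-order-insensitive comparison the real helper performs omitted for brevity (only flat key/value differences are reported here):

```java
import java.util.Map;
import java.util.Objects;

final class MapDiffSketch {
    /** Returns a human-readable description of the first difference, or null if the maps match. */
    static String differenceBetweenMaps(String path, Map<String, Object> first, Map<String, Object> second) {
        for (Map.Entry<String, Object> e : first.entrySet()) {
            String childPath = path + "/" + e.getKey();
            if (!second.containsKey(e.getKey())) {
                return "missing key at " + childPath;
            }
            if (!Objects.equals(e.getValue(), second.get(e.getKey()))) {
                return "values differ at " + childPath;
            }
        }
        if (!first.keySet().equals(second.keySet())) {
            return "extra keys at " + path;
        }
        return null; // no difference found
    }

    /** Boolean form, so the method name matches its return type again. */
    static boolean mapsEqualIgnoringArrayOrder(Map<String, Object> first, Map<String, Object> second) {
        return differenceBetweenMaps("", first, second) == null;
    }
}
```

A test could then assert the boolean form, and on failure print the explanation from differenceBetweenMaps via JUnit's message parameter, rather than asserting that a string equals null.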
Left a couple of comments about the testing methods, but the actual diff code looks good to me.
LGTM
…rt of cluster state diffs The equalTo logic of ShardRouting doesn't take version and unassignedInfo into account when comparing shard routings. Since cluster state diffs rely on equals to detect the changes that need to be sent out, this omission might lead to changes not being properly propagated to other nodes in the cluster. Closes elastic#12387
Force-pushed from a1efe96 to 3354a81
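The equals bug described in this thread can be reduced to a minimal sketch. This is a hypothetical simplified class, not the real ShardRouting: if version and unassignedInfo don't participate in equals, a version-only change compares equal to the old entry, so a diff-based cluster state update would consider it unchanged and never send it; making equals strict fixes that:

```java
import java.util.Objects;

// Simplified routing entry whose equals is strict: version and
// unassignedInfo participate, so diffs can detect version-only changes.
final class RoutingEntry {
    final String allocationId;
    final long version;
    final String unassignedInfo; // stands in for the real UnassignedInfo object

    RoutingEntry(String allocationId, long version, String unassignedInfo) {
        this.allocationId = allocationId;
        this.version = version;
        this.unassignedInfo = unassignedInfo;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RoutingEntry)) return false;
        RoutingEntry other = (RoutingEntry) o;
        return Objects.equals(allocationId, other.allocationId)
                && version == other.version
                && Objects.equals(unassignedInfo, other.unassignedInfo);
    }

    @Override
    public int hashCode() {
        return Objects.hash(allocationId, version, unassignedInfo);
    }
}
```

With the old, lenient semantics the two entries below would have compared equal and the version bump would have been silently dropped from the diff; with strict equality the change is visible and gets propagated.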