Optional Delayed Allocation on Node leave #11712

kimchy · 2015-06-17T02:22:43Z

Allows setting a delayed allocation timeout on unassigned shards when a node leaves the cluster. The cluster then waits a specified period for the node to come back so that the shards can be assigned back to it, which reduces shard movement and unnecessary relocations.

The setting is an index-level setting, index.unassigned.node_left.delayed_timeout, and defaults to 0 (== no delayed allocation). We might want to change the default, but let's do that in a separate change so we can come up with the best value for it. The setting can be updated dynamically.

When shards are delayed, a log message with "info" level will notify which shards are being delayed and for how long.

An implementation note: we really only need to care about delaying allocation of unassigned replica shards. If the primary shard is unassigned, we are going to wait for a copy of it anyway, so the only case where delaying allocation matters is replicas.
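
The replica-only rule in the note above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `ReplicaDelaySketch`, `UnassignedShard` and `shouldDelay` are hypothetical names, and the sketch assumes a timeout of 0 means no delay, as the description states.

```java
// Hypothetical sketch: these names are illustrative and are not taken from the Elasticsearch code base.
final class ReplicaDelaySketch {

    /** Minimal stand-in for an unassigned shard copy. */
    static final class UnassignedShard {
        final boolean primary;   // true for the primary copy, false for a replica
        final boolean nodeLeft;  // true if the shard became unassigned because its node left

        UnassignedShard(boolean primary, boolean nodeLeft) {
            this.primary = primary;
            this.nodeLeft = nodeLeft;
        }
    }

    /** Returns true if allocation of this shard should be held back for the configured timeout. */
    static boolean shouldDelay(UnassignedShard shard, long delayedTimeoutMillis) {
        if (delayedTimeoutMillis <= 0) {
            return false; // 0 means no delayed allocation
        }
        if (!shard.nodeLeft) {
            return false; // only shards orphaned by a departing node are candidates
        }
        return !shard.primary; // primaries wait for an existing copy anyway, so only replicas are delayed
    }

    public static void main(String[] args) {
        UnassignedShard replica = new UnassignedShard(false, true);
        UnassignedShard primary = new UnassignedShard(true, true);
        System.out.println("replica delayed: " + shouldDelay(replica, 60_000L));
        System.out.println("primary delayed: " + shouldDelay(primary, 60_000L));
    }
}
```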

bleskes · 2015-06-17T07:05:37Z

core/src/main/java/org/elasticsearch/cluster/routing/RoutingService.java

+            }
+            if (scheduled) {
+                logger.info("delaying unassigned shard allocation, shards: {}", sb);
+            }
        } else {
            FutureUtils.cancel(scheduledRoutingTableFuture);


Don't we need to cancel all the futures in delayedShardsToReroute?

I will refactor it so that this is no longer relevant.

bleskes · 2015-06-17T07:33:41Z

Did a review cycle. I like how things come together. One concern I had was the implementation in RoutingService where we maintain a queue of pending reroutes per unassigned (delayed) shard. I think it will be simpler to just use the single future we already have and set it every time to the next expected change moment (i.e., the minimum delay of all unassigned shards). On a setting change we can do a reroute all the time (which we might do already). Am I missing something?
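
The single-future idea suggested above might look roughly like this. It is only a sketch under stated assumptions: it uses the JDK `ScheduledExecutorService` in place of Elasticsearch's `ThreadPool`, and `SingleFutureRerouteSketch`, `nextDelayMillis` and `reschedule` are hypothetical names, not the ones in the PR.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one pending future, always aimed at the earliest delay expiry.
final class SingleFutureRerouteSketch {
    private final ScheduledExecutorService scheduler;
    private ScheduledFuture<?> pending; // the single future; null when nothing is delayed

    SingleFutureRerouteSketch(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    /** Smallest strictly positive remaining delay, or -1 if no shard is currently delayed. */
    static long nextDelayMillis(long[] remainingDelaysMillis) {
        long min = -1;
        for (long delay : remainingDelaysMillis) {
            if (delay > 0 && (min < 0 || delay < min)) {
                min = delay;
            }
        }
        return min;
    }

    /** Cancels the previous future and schedules a new one for the minimum remaining delay. */
    synchronized ScheduledFuture<?> reschedule(long[] remainingDelaysMillis, Runnable reroute) {
        if (pending != null) {
            pending.cancel(false); // drop the stale schedule, do not interrupt a running task
        }
        long next = nextDelayMillis(remainingDelaysMillis);
        pending = next < 0 ? null : scheduler.schedule(reroute, next, TimeUnit.MILLISECONDS);
        return pending;
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        SingleFutureRerouteSketch sketch = new SingleFutureRerouteSketch(pool);
        // Three delayed shards and one that is already due; the future targets the 100 ms one.
        ScheduledFuture<?> future = sketch.reschedule(
                new long[] {500, 100, 0, 300}, () -> System.out.println("reroute"));
        future.get(); // wait for the scheduled reroute to run
        pool.shutdown();
    }
}
```

With this shape, a cluster state or setting change only needs to call `reschedule` again with the recomputed delays, so there is no per-shard queue of futures to cancel.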

bleskes · 2015-06-17T10:55:32Z

One more little thing: we need to do some docs work as well, since it's an important change. I can help if need be.

kimchy · 2015-06-17T19:13:44Z

I pushed another round, mainly simplifying the code, adding more unit tests, and addressing comments. @bleskes once we agree on this as the way forward, I will add docs

s1monw · 2015-06-17T19:50:46Z

core/src/main/java/org/elasticsearch/cluster/routing/RoutingService.java

+                registeredNextDelaySetting = nextDelaySetting;
+                TimeValue nextDelay = TimeValue.timeValueMillis(UnassignedInfo.findNextDelayedAllocationIn(settings, event.state()));
+                logger.info("delaying allocation for [{}] unassigned shards, next check in [{}]", UnassignedInfo.getNumberOfDelayedUnassigned(settings, event.state()), nextDelay);
+                registeredNextDelayFuture = threadPool.schedule(nextDelay, ThreadPool.Names.SAME, new Runnable() {


can we use AbstractRunnable here?

kimchy · 2015-06-18T08:13:00Z

@s1monw applied another round of changes

bleskes · 2015-06-18T09:41:03Z

core/src/main/java/org/elasticsearch/cluster/routing/UnassignedInfo.java

+    }
+
+    /**
+     * The delay in millis when delaying assigning the shard need to expire in.


Got confused by this and had to go to the code :) - I think this will be clearer: "returns the time in milliseconds until this unassigned shard can be reassigned."

will change
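
For illustration, the value described by the suggested Javadoc could be computed along these lines. This is a hedged sketch, not the method from `UnassignedInfo`: `RemainingDelaySketch`, `remainingDelayMillis` and its parameters are hypothetical, and the sketch assumes the delay is measured from the moment the shard became unassigned.

```java
// Hypothetical sketch of "time in milliseconds until this unassigned shard can be reassigned".
final class RemainingDelaySketch {

    /**
     * @param delayedTimeoutMillis configured timeout; 0 or less means no delayed allocation
     * @param unassignedAtMillis   timestamp at which the shard became unassigned (assumed input)
     * @param nowMillis            current timestamp on the same clock
     * @return remaining delay in milliseconds, never negative
     */
    static long remainingDelayMillis(long delayedTimeoutMillis, long unassignedAtMillis, long nowMillis) {
        if (delayedTimeoutMillis <= 0) {
            return 0L; // delayed allocation is disabled
        }
        long elapsed = nowMillis - unassignedAtMillis;
        return Math.max(0L, delayedTimeoutMillis - elapsed); // clamp once the timeout has expired
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // A shard that became unassigned 20 seconds ago, with a 60 second timeout.
        System.out.println(remainingDelayMillis(60_000L, now - 20_000L, now) + " ms remaining");
    }
}
```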

kimchy · 2015-06-18T11:56:44Z

@s1monw @bleskes pushed another set of changes

bleskes · 2015-06-18T13:07:36Z

LGTM +1

s1monw · 2015-06-18T14:01:26Z

LGTM makes sense @kimchy

Allow to set delayed allocation timeout on unassigned shards when a node leaves the cluster. This allows to wait for the node to come back for a specific period in order to try and assign the shards back to it to reduce shards movements and unnecessary relocations.

The setting is an index level setting under `index.unassigned.node_left.delayed_timeout` and defaults to 0 (== no delayed allocation). We might want to change the default, but let's do it in a different change to come up with the best value for it. The setting can be updated dynamically.

When shards are delayed, a log message with "info" level will notify how many shards are being delayed.

An implementation note, we really only need to care about delaying allocation on unassigned replica shards. If the primary shard is unassigned, anyhow we are going to wait for a copy of it, so really the only case to delay allocation is for replicas.

close elastic#11712

kimchy · 2015-06-18T15:33:13Z

pushed to master and 1.x, @clintongormley I forgot to add the docs, where do you think it makes sense to document this?

clintongormley · 2015-06-18T19:47:44Z

@kimchy i'd say in the Index Shard Allocation page: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-allocation.html

plus a note on the cluster health page

bleskes · 2015-06-18T19:57:13Z

I think it’s also good to mention on the (rolling) upgrade docs: docs/reference/setup/upgrade.asciidoc

Relates to: elastic#11712

kimchy added v2.0.0-beta1 review v1.7.0 labels Jun 17, 2015

kimchy force-pushed the delayed_allocation_2 branch from 95ad53a to da7e279 Compare June 17, 2015 03:01

bleskes reviewed Jun 17, 2015

s1monw reviewed Jun 17, 2015

kimchy force-pushed the delayed_allocation_2 branch from b816349 to 93214d7 Compare June 18, 2015 07:36

bleskes reviewed Jun 18, 2015

kimchy force-pushed the delayed_allocation_2 branch from 0851073 to 792a545 Compare June 18, 2015 14:06

kimchy merged commit 792a545 into elastic:master Jun 18, 2015

kevinkluge removed the review label Jun 18, 2015

kimchy added >feature release highlight labels Jun 18, 2015

kimchy deleted the delayed_allocation_2 branch June 18, 2015 15:34

clintongormley added the :Allocation label Jun 18, 2015

clintongormley mentioned this pull request Jun 29, 2015

Docs: Documented delayed allocation settings #11921

Merged

szroland pushed a commit to szroland/elasticsearch that referenced this pull request Jun 30, 2015

Docs: Documented delayed allocation settings

53cb4e6

Relates to: elastic#11712

jpountz mentioned this pull request Aug 26, 2015

Allow for 'grace period expiration' before shard reallocation? #3569

Closed

lcawl added :Distributed/Distributed and removed :Allocation labels Feb 13, 2018
