Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid extra reroutes of delayed shards in RoutingService #12532

Merged
merged 1 commit into from Aug 5, 2015

Conversation

dakrone
Copy link
Member

@dakrone dakrone commented Jul 29, 2015

In order to avoid extra reroutes, RoutingService should avoid
scheduling a reroute of any shards where the delay is negative. To make
sure that we don't encounter a race condition between the
GatewayAllocator thinking a shard is delayed and RoutingService thinking
it is not, the GatewayAllocator will update the RoutingService with the
last time it checked in order to use a consistent "view" of the delay.

Resolves #12456
Relates to #12515 and #12456

This is a PR against 1.7 and will be forward-ported

@dakrone
Copy link
Member Author

dakrone commented Jul 29, 2015

@kimchy can you take a look?

@@ -57,6 +57,7 @@
private AtomicBoolean rerouting = new AtomicBoolean();
private volatile long registeredNextDelaySetting = Long.MAX_VALUE;
private volatile ScheduledFuture registeredNextDelayFuture;
private volatile long lastAllocateUnassignedShardsRun = System.currentTimeMillis();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to have 0 have the special "I've not even tried yet" meaning? I think setting it to now is kind-of funky because you aren't actually doing an allocation when you build the object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could set this to 0, but either way, gateway allocator should always run at least once before these does. So this is just because I wanted a non-null value.

@@ -167,12 +167,12 @@ public long getAllocationDelayTimeoutSetting(Settings settings, Settings indexSe
/**
* The time in millisecond until this unassigned shard can be reassigned.
*/
public long getDelayAllocationExpirationIn(Settings settings, Settings indexSettings) {
public long getDelayAllocationExpirationIn(long lastUnassignedShardsAllocated, Settings settings, Settings indexSettings) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we call this timestamp? I think the "lastXXX" part depends on the context of calling it. If we do, would be good to rename the other ones in this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll rename to unassignedShardsAllocatedTimestamp

@kimchy
Copy link
Member

kimchy commented Aug 4, 2015

left a few comments, LGTM otherwise.

For master forward patching, the code changed a bit, so it will be slightly different. I think we should in GatewayAllocator take a timestamp, and pass it to ReplicaAllocator#allocateUnassigned, and have a simplified one allocateUnassigned that passes System.currentTimeInMillis.

In order to avoid extra reroutes, `RoutingService` should avoid
scheduling a reroute of any shards where the delay is negative. To make
sure that we don't encounter a race condition between the
GatewayAllocator thinking a shard is delayed and RoutingService thinking
it is not, the GatewayAllocator will update the RoutingService with the
last time it checked in order to use a consistent "view" of the delay.

Resolves elastic#12456
Relates to elastic#12515 and elastic#12456
@dakrone dakrone merged commit 93ac27a into elastic:1.7 Aug 5, 2015
@dakrone
Copy link
Member Author

dakrone commented Aug 5, 2015

Merged this, but for the 2.0 forward port I am going to open another PR.

@dakrone dakrone deleted the avoid-extra-reroutes branch August 7, 2015 13:42
@lcawl lcawl added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. and removed :Allocation labels Feb 13, 2018
@clintongormley clintongormley added :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) and removed :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement v1.7.2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants