Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid extra reroutes of delayed shards in RoutingService #12532

Merged
merged 1 commit into from Aug 5, 2015

Conversation

Projects
None yet
5 participants
@dakrone
Copy link
Member

dakrone commented Jul 29, 2015

In order to avoid extra reroutes, RoutingService should avoid
scheduling a reroute of any shards where the delay is negative. To make
sure that we don't encounter a race condition between the
GatewayAllocator thinking a shard is delayed and RoutingService thinking
it is not, the GatewayAllocator will update the RoutingService with the
last time it checked in order to use a consistent "view" of the delay.

Resolves #12456
Relates to #12515 and #12456

This is a PR against 1.7 and will be forward-ported

@dakrone

This comment has been minimized.

Copy link
Member Author

dakrone commented Jul 29, 2015

@kimchy can you take a look?

@nik9000

View changes

src/main/java/org/elasticsearch/cluster/routing/RoutingService.java Outdated
@@ -57,6 +57,7 @@
private AtomicBoolean rerouting = new AtomicBoolean();
private volatile long registeredNextDelaySetting = Long.MAX_VALUE;
private volatile ScheduledFuture registeredNextDelayFuture;
private volatile long lastAllocateUnassignedShardsRun = System.currentTimeMillis();

This comment has been minimized.

Copy link
@nik9000

nik9000 Jul 29, 2015

Contributor

Would it be useful to have 0 have the special "I've not even tried yet" meaning? I think setting it to now is kind-of funky because you aren't actually doing an allocation when you build the object.

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 29, 2015

Author Member

I could set this to 0, but either way, gateway allocator should always run at least once before these does. So this is just because I wanted a non-null value.

internalCluster().stopRandomNode(InternalTestCluster.nameFilter(node_1));

// This might run slowly on older hardware
ensureGreen(TimeValue.timeValueMinutes(2));

This comment has been minimized.

Copy link
@nik9000

nik9000 Jul 29, 2015

Contributor

Yikes! Is that really 2 minutes of 1 cpu running 100%?

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 29, 2015

Author Member

Some of our CI nodes are reeeeeaaaalllly slow, and since this creates 25 indices with 10 total shards each, I just wanted to make sure it doesn't time out. I do not expect it to take 2 minutes regardless (it's much faster for me). Another reason this test in annotated with @Slow

This comment has been minimized.

Copy link
@kimchy

kimchy Aug 4, 2015

Member

can we configure the delayed allocation to not be the default (1m) but something high enough to trigger what we are trying to fix, like 200ms? This will speed up the test.

This comment has been minimized.

Copy link
@dakrone

dakrone Aug 4, 2015

Author Member

Sure, I already randomize it between 4 and 15 seconds (since it should find the lowest delay setting), but I can lower it even more.

@kimchy

View changes

src/main/java/org/elasticsearch/cluster/routing/UnassignedInfo.java Outdated
@@ -167,12 +167,12 @@ public long getAllocationDelayTimeoutSetting(Settings settings, Settings indexSe
/**
* The time in millisecond until this unassigned shard can be reassigned.
*/
public long getDelayAllocationExpirationIn(Settings settings, Settings indexSettings) {
public long getDelayAllocationExpirationIn(long lastUnassignedShardsAllocated, Settings settings, Settings indexSettings) {

This comment has been minimized.

Copy link
@kimchy

kimchy Aug 4, 2015

Member

can we call this timestamp? I think the "lastXXX" part depends on the context of calling it. If we do, would be good to rename the other ones in this change.

This comment has been minimized.

Copy link
@dakrone

dakrone Aug 4, 2015

Author Member

Sure, I'll rename to unassignedShardsAllocatedTimestamp


@Override
public void onFailure(Throwable t) {
logger.warn("failed to schedule/execute reroute post unassigned shard", t);

This comment has been minimized.

Copy link
@kimchy

kimchy Aug 4, 2015

Member

we probably want to rest here as well: registeredNextDelaySetting = Long.MAX_VALUE;

This comment has been minimized.

Copy link
@dakrone

dakrone Aug 4, 2015

Author Member

Yes good idea, otherwise it assumes one was running.

@kimchy

This comment has been minimized.

Copy link
Member

kimchy commented Aug 4, 2015

left a few comments, LGTM otherwise.

For master forward patching, the code changed a bit, so it will be slightly different. I think we should in GatewayAllocator take a timestamp, and pass it to ReplicaAllocator#allocateUnassigned, and have a simplified one allocateUnassigned that passes System.currentTimeInMillis.

Avoid extra reroutes of delayed shards in RoutingService
In order to avoid extra reroutes, `RoutingService` should avoid
scheduling a reroute of any shards where the delay is negative. To make
sure that we don't encounter a race condition between the
GatewayAllocator thinking a shard is delayed and RoutingService thinking
it is not, the GatewayAllocator will update the RoutingService with the
last time it checked in order to use a consistent "view" of the delay.

Resolves #12456
Relates to #12515 and #12456

@dakrone dakrone force-pushed the dakrone:avoid-extra-reroutes branch to 93ac27a Aug 5, 2015

@dakrone dakrone merged commit 93ac27a into elastic:1.7 Aug 5, 2015

1 check passed

CLA Commit author has signed the CLA
Details

@dakrone dakrone removed the v2.0.0-beta1 label Aug 5, 2015

@dakrone

This comment has been minimized.

Copy link
Member Author

dakrone commented Aug 5, 2015

Merged this, but for the 2.0 forward port I am going to open another PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.