
Add a best effort waiting for ongoing recoveries to cancel on close #6741

Closed
wants to merge 2 commits

Conversation


@bleskes bleskes commented Jul 4, 2014

Currently one can close the engine while there are still ongoing recoveries. This is not a problem in itself, because the engine is closed in tandem with the shard it belongs to, which in turn cancels the recoveries. It does, however, cause some issues in our tests, where we check that no resources were left behind after an index was deleted; that check trips if the recoveries have not yet been cancelled.

…n close

```java
// engine is closed)
if (onGoingRecoveries.get() > 0) {
    logger.trace("best effort waiting for current [{}] ongoing recoveries to finish before closing the engine", onGoingRecoveries.get());
    long waitUntil = System.currentTimeMillis() + 30000; // wait for 30s
```
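The diff hunk above is truncated. As a point of reference, a minimal self-contained sketch of the pattern it implements — a bounded wait/notify loop over a recovery counter, giving up after the deadline — might look like the following. The class and method names here are illustrative only, not the actual Engine code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of a best-effort bounded wait for ongoing recoveries.
// Not Elasticsearch code: names and structure are hypothetical.
public class BestEffortWait {
    private final AtomicInteger onGoingRecoveries = new AtomicInteger();

    public void incRecovery() {
        onGoingRecoveries.incrementAndGet();
    }

    public void decRecovery() {
        if (onGoingRecoveries.decrementAndGet() == 0) {
            synchronized (this) {
                notifyAll(); // wake up any closer waiting on the counter
            }
        }
    }

    /** Wait up to timeoutMillis for recoveries to finish, then give up silently. */
    public synchronized void awaitRecoveriesDone(long timeoutMillis) {
        long waitUntil = System.currentTimeMillis() + timeoutMillis;
        try {
            while (onGoingRecoveries.get() > 0) {
                long remaining = waitUntil - System.currentTimeMillis();
                if (remaining <= 0) {
                    return; // best effort: closing the shard cancels recoveries anyway
                }
                wait(remaining);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop waiting, proceed with close
        }
    }
}
```

The wait is deliberately "best effort": timing out is not an error, since the shard close that follows cancels any recoveries still in flight.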
bleskes (PR author) commented on the diff:
The 30s should be reduced to 10s and be made configurable. This may be called during cluster state publishing, which times out after 30s by default.


s1monw commented Jul 9, 2014

I don't understand this PR. Why do we need to wait on the recovery, and which resources are you talking about here?

@bleskes bleskes added v1.4.0 and removed v1.3.0 labels Jul 9, 2014

bleskes commented Jul 9, 2014

push to 1.4 pending more discussion


s1monw commented Sep 8, 2014

@bleskes I think we should have a dedicated API for this that our tests can call before the index is deleted. I think it could be useful even in production, when you want to wait until everything is stable. We might be able to extend or leverage the ClusterHealth API to do this; it might already be capable of it?
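For reference, the 1.x cluster health API already exposes wait parameters in this direction. A request along these lines (a sketch; whether these parameters actually cover the ongoing-recovery case discussed here is exactly the open question in this thread) blocks until the cluster is green and no shards are relocating, up to a timeout:

```
GET /_cluster/health?wait_for_status=green&wait_for_relocating_shards=0&timeout=30s
```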

@clintongormley clintongormley changed the title [Engine] add a best effort waiting for ongoing recoveries to cancel on close Internal: Add a best effort waiting for ongoing recoveries to cancel on close Sep 8, 2014

bleskes commented Sep 8, 2014

@s1monw the problem is that the master may think things are done while the nodes have not yet completed acting on it. We could add something that checks all the nodes, but that feels like overkill.

I think we should just close this PR until we find a better solution. Agreed?


s1monw commented Sep 8, 2014

agreed :)

@bleskes bleskes closed this Sep 8, 2014
@clintongormley clintongormley added the :Distributed/Recovery Anything around constructing a new shard, either from a local or a remote source. label Jun 7, 2015
@clintongormley clintongormley changed the title Internal: Add a best effort waiting for ongoing recoveries to cancel on close Add a best effort waiting for ongoing recoveries to cancel on close Jun 7, 2015
Labels
:Distributed/Recovery Anything around constructing a new shard, either from a local or a remote source. >enhancement v1.4.0.Beta1 v2.0.0-beta1

3 participants