Recovery should wipe the shard state file before starting recovery #10053

Closed
s1monw opened this issue Mar 10, 2015 · 4 comments · Fixed by #10179
Contributor

s1monw commented Mar 10, 2015

When we start recovery of a shard, we should wipe the state file of the copy if it's present. Otherwise gateway allocation can get confused and interpret a shard that is not fully recovered (e.g. due to a recovery failure) as a valid copy, since we only write the state when the shard is started.

Contributor Author

s1monw commented Mar 10, 2015

@brwe can you take care of this?

Contributor

bleskes commented Mar 10, 2015

I wonder if the correct time to wipe any _state file is just before the temp file rename. Until then, the recovery doesn’t touch any non-temp files, so if the recovery is cancelled, we leave the target shard intact.


Contributor Author

s1monw commented Mar 15, 2015

@bleskes agreed. We should remove it before we rename the first file.
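
For readers following along, the ordering agreed on above can be sketched roughly as follows. This is a minimal illustration using plain `java.nio.file`, not Elasticsearch's actual recovery code: the class name `RecoveryFinalizer`, the `recovery.` temp-file prefix, and the `index` and `_state` directory layout are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and directory layout are assumptions,
// not the real Elasticsearch implementation.
public class RecoveryFinalizer {
    static final String TEMP_PREFIX = "recovery."; // assumed temp-file prefix
    static final String STATE_DIR = "_state";      // assumed state directory name
    static final String INDEX_DIR = "index";       // assumed index directory name

    // Collects matching entries up front so the directory is not
    // modified while it is still being iterated.
    static List<Path> list(Path dir, String glob) throws IOException {
        List<Path> result = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return result;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                result.add(p);
            }
        }
        return result;
    }

    // Deletes every file in the shard's state directory, if there is one.
    static void wipeShardState(Path shardDir) throws IOException {
        for (Path p : list(shardDir.resolve(STATE_DIR), "*")) {
            Files.deleteIfExists(p);
        }
    }

    // Wipes the state first, then renames temp files to their final names.
    // Up to the first rename only temp files have been written, so a
    // recovery cancelled earlier leaves the target shard untouched.
    static void finalizeRecovery(Path shardDir) throws IOException {
        wipeShardState(shardDir);
        for (Path temp : list(shardDir.resolve(INDEX_DIR), TEMP_PREFIX + "*")) {
            String finalName = temp.getFileName().toString().substring(TEMP_PREFIX.length());
            Files.move(temp, temp.resolveSibling(finalName), StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

With this ordering, a failure partway through the renames leaves no state file behind, so the half-recovered copy would not look like a valid one to allocation.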

Contributor

brwe commented Mar 17, 2015

Just for reference, here is the relevant test failure: http://build-us-00.elasticsearch.org/job/es_core_1x_small/1800/

brwe assigned s1monw and unassigned brwe Mar 19, 2015
s1monw added a commit to s1monw/elasticsearch that referenced this issue Mar 20, 2015
Today we leave the shard state behind even if a recovery is half finished.
In rare conditions this causes shards that have never been fully recovered
to be recovered and promoted as primaries.

Closes elastic#10053