bug 2123: do not try to (confusingly) "repair" missing buckets, just fail #2232

Merged
latobarita merged 1 commit into stellar:master from bug-2123-repair-is-futile
Aug 17, 2019

Conversation

graydon
Contributor

@graydon graydon commented Aug 16, 2019

As mentioned in #2123, the "repair" code is too clever for its own good: it is wishful thinking about content-addressed buckets that fails to recognize that it has, at best, a one-in-64 chance of working, and that it could only ever work when a bucket went missing after being published (and before being replaced with some other local one).

In practice this never happened: the only cases of bucket corruption or missing buckets we've seen in the field were fsync failures and people misconfiguring their BUCKET_DIR. In those cases the "repair" work that kicks in gets lost in the error logs, and the user misinterprets their stalled server as a broken history archive, because the work "can't find" the buckets it's looking for in the archive and emits an error about that failure to download.

This PR just removes the attempt at repair. It was never going to work anyway, and it's better to be honest and tell the user so with a clear exception. A corrupt bucket dir is actually corrupt, and the best repair approach is to kill the node, fast-catch-up to a state before the failure, then let it replay forwards.
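
For readers who want the shape of the change, here is a minimal sketch of the fail-fast behaviour described above. It is illustrative Python, not the stellar-core C++ code; `MissingBucketError`, `bucket_path`, `require_buckets_present`, and the `bucket-<hash>.xdr` file naming are all assumptions made up for the example.

```python
from pathlib import Path
from typing import Iterable


class MissingBucketError(RuntimeError):
    """Hypothetical error: a referenced bucket is absent from the bucket dir."""


def bucket_path(bucket_dir: Path, bucket_hash: str) -> Path:
    # Assumed naming scheme, used only for this sketch.
    return bucket_dir / f"bucket-{bucket_hash}.xdr"


def require_buckets_present(bucket_dir: Path, referenced: Iterable[str]) -> None:
    """Raise a clear error if any referenced bucket file is missing.

    No download or "repair" is attempted; the caller is expected to stop.
    """
    missing = [h for h in referenced if not bucket_path(bucket_dir, h).is_file()]
    if missing:
        raise MissingBucketError(
            f"{len(missing)} bucket(s) missing from {bucket_dir}: "
            f"{', '.join(missing)}. The bucket directory is corrupt or "
            "BUCKET_DIR is misconfigured; stop the node and catch up to a "
            "state before the failure."
        )
```

The point of the sketch is that the error names the local bucket directory as the problem, so it cannot be mistaken for a failed download from a history archive.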

@MonsieurNicolas
Contributor

r+ cbbbb1a

latobarita added a commit that referenced this pull request Aug 16, 2019
bug 2123: do not try to (confusingly) "repair" missing buckets, just fail

Reviewed-by: MonsieurNicolas
@latobarita latobarita merged commit cbbbb1a into stellar:master Aug 17, 2019
@graydon graydon deleted the bug-2123-repair-is-futile branch January 3, 2020 01:29
3 participants