Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean all logs after recovery #146

Closed
wants to merge 1 commit into from

Conversation

Tpt
Copy link
Contributor

@Tpt Tpt commented Oct 14, 2022

Allows to make sure no old log is staying after proper recovery

Fixes #145

Allows to make sure no old log is staying after proper recovery

Fixes paritytech#145
@Tpt Tpt requested review from arkpar and cheme October 14, 2022 10:09
@Tpt
Copy link
Contributor Author

Tpt commented Oct 14, 2022

I am not sure it's a clean fix of #145. It prevents more a cause of the bug than fixing it really.

@Tpt Tpt removed the request for review from cheme October 14, 2022 10:11
@arkpar
Copy link
Member

arkpar commented Oct 14, 2022

Could you also add a test from #145 please.

The issue is indeed that the old log was not removed and was replayed, screwing the header on the value table. There's a TODO to detect this issue at validation, which should have prevented it as well, once implemented.

When validating, logs are removed out of order. Each successfully validated and enacted log is added to the cleanup queue. But if there's an error, the erroneous log is dropped immediately and the cleanup queue is not processed at all.

While this fix does indeed cleans up everything, it is still in the wrong order. So if an (unlikely) crash happens when the logs are being cleaned, the bug would still be there.

@Tpt
Copy link
Contributor Author

Tpt commented Oct 14, 2022

@arkpar Thank you for your detailed explanation. So, I understand correctly the fix to do is to implement the TODO to detect this issue at validation.

And, maybe, process the cleanup queue in case of a validation error. But if we have the validation then it should not be required anymore if I understand correctly.

I am closing this PR and I'm going to write an other one.

@Tpt Tpt closed this Oct 14, 2022
@arkpar
Copy link
Member

arkpar commented Oct 14, 2022

Fixing the TODO wont be enough in the general case. If you don't change the number of value table entries, but just modify an existing value, you'll end up with a slightly different version of this bug. The log would be valid, but the value will be old. So I would not bother with it for now.

I suppose the most relevant fix would be to

  1. Not drop the log in clear_replay_log, but instead add it to the cleanup_queue
  2. Process the cleanup queue even the database is shut down with error in kill_logs.

@Tpt
Copy link
Contributor Author

Tpt commented Oct 14, 2022

@arkpar Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

After recovery on corrupted log, existing values might get overriden after unrelated commit processing
3 participants