This repository has been archived by the owner on Mar 9, 2019. It is now read-only.

Add DB.Check(). #98

Merged
merged 1 commit into from
Mar 29, 2014


benbjohnson
Member

Work-in-progress

This pull request adds a consistency check to DB and bolt fsck to the CLI.

It currently implements the following checks:

  • Check that all reachable pages are branch or leaf pages.
  • Check that all pages below the high water mark are either reachable or in the freelist.
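
The two checks listed above can be sketched on a toy page model. This is only an illustration, not the code in this pull request: the `page` and `pageKind` types, the `check` function, and its parameters are all hypothetical, and the "invalid type" message is made up. Only the "unreachable unfreed" wording comes from the output quoted later in this thread.

```go
package main

import "fmt"

// pageKind is a hypothetical stand-in for an on-disk page type.
type pageKind int

const (
	kindBranch pageKind = iota
	kindLeaf
	kindFreelist // any kind other than branch or leaf is invalid inside the tree
)

// page is a toy page: a kind plus child page ids (used by branches only).
type page struct {
	kind     pageKind
	children []int
}

// check walks the tree from root and returns one error per problem found.
// first is the lowest data page id; highWater is one past the highest
// allocated page id. Both names are assumptions made for this sketch.
func check(pages map[int]page, root int, freelist []int, first, highWater int) []error {
	var errs []error
	reachable := map[int]bool{}

	// Check 1: every reachable page must be a branch or a leaf.
	var walk func(id int)
	walk = func(id int) {
		if reachable[id] {
			return
		}
		reachable[id] = true
		p, ok := pages[id]
		if !ok || (p.kind != kindBranch && p.kind != kindLeaf) {
			errs = append(errs, fmt.Errorf("page %d: invalid type", id))
			return
		}
		if p.kind == kindBranch {
			for _, child := range p.children {
				walk(child)
			}
		}
	}
	walk(root)

	// Check 2: every page below the high water mark must be either
	// reachable from the root or listed in the freelist.
	freed := map[int]bool{}
	for _, id := range freelist {
		freed[id] = true
	}
	for id := first; id < highWater; id++ {
		if !reachable[id] && !freed[id] {
			errs = append(errs, fmt.Errorf("page %d: unreachable unfreed", id))
		}
	}
	return errs
}

func main() {
	pages := map[int]page{
		2: {kind: kindBranch, children: []int{3, 4}},
		3: {kind: kindLeaf},
		4: {kind: kindLeaf},
	}
	// Page 5 is on the freelist; page 6 is neither reachable nor freed.
	for _, err := range check(pages, 2, []int{5}, 2, 7) {
		fmt.Println(err)
	}
}
```

In this toy run, page 6 is the only page reported, because it sits below the high water mark but is neither in the tree nor in the freelist.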

I still need to add more checks, but this one already uncovered an issue in TestBucketDeleteQuick that I have yet to resolve.

Fixes: #96

/cc @pkieltyka @tv42

@tv42
Contributor

tv42 commented Mar 28, 2014

Could you please call it bolt check or something? fsck = file system check, and it's a needlessly obscure term.

@benbjohnson
Member Author

Good call. I like "check". I was drawing a blank when naming it.

@pkieltyka

Yea +1


@benbjohnson changed the title from "Add fsck." to "Add DB.Check()." on Mar 29, 2014
@benbjohnson
Member Author

Ok, it's now changed to DB.Check() and bolt check. I also fixed a freelist leak in Bucket.Delete().

@benbjohnson added a commit that referenced this pull request on Mar 29, 2014
@benbjohnson merged commit fcce876 into boltdb:master on Mar 29, 2014
@benbjohnson deleted the fsck branch on March 29, 2014 20:28
@pkieltyka

Hey @benbjohnson, would it be correct to run db.Check() right after opening the database? I was getting a bunch of errors on a freshly created db with just a few keys, and the "bolt check" command was returning "..., page 10: unreachable unfreed, ...". Still WIP, I suppose?

@benbjohnson
Member Author

@pkieltyka Yeah, it should be valid. This PR was just to add the "check" and verify the existing test cases. I have a "check" bug in a different project that uses Bolt that I'm tracking down.

Can you PR a test case for Bolt that shows the check failure? It should be easy to track down the issue once there's a test case.

@pkieltyka

Turns out the test code in my app was not properly closing the db, so on subsequent opens the check returned errors, which was actually correct. I tried to write a test case in Bolt, but the only way I could see to simulate an improper close was to run the test multiple times. Either way, Bolt is doing the right thing by reporting 'page X: unreachable unfreed' after a bad close.

Can the check also fix these issues, or does it perhaps do that already?

@tv42
Contributor

tv42 commented Mar 30, 2014

What's a "bad close"? There's no excuse for having a corrupt database after anything. @pkieltyka, please create a new issue with information on how this occurred for you.

@pkieltyka

@tv42 here it is: #100

@benbjohnson
Member Author

@pkieltyka I'm AFK this weekend but I'll look tomorrow. Thanks for the test case. It really helps. @tv42 is correct that the db should never be in a corrupted state. The worst that should happen is that an uncommitted transaction gets rolled back.

Bolt doesn't protect against partial meta page writes but there is an issue open for that.

Development

Successfully merging this pull request may close these issues.

cmd/bolt-fsck