compact: add cli flag to control NoSync option #290
I'd be interested in numbers on the "huge" impact on performance. On the stuff that got me looking at fsync behavior, though, the difference was in the multiple-orders-of-magnitude range; like, the same task went from 0.01s to 5s. (This was a pathological case involving creating, in parallel, 256 boltdb databases.)
I have a ~100GB database and gave up running it without this option as it took ~∞ (it worked previously when I ran it overnight). Using this PR it took an expected, reasonably long amount of time, maybe half an hour or so. I'll measure it more scientifically for you and try to measure it again with your recent PR to get a sense of what effect completely disabling fsync has.
okay, so I've run the

The baseline measurement is the current HEAD of this repo (

time go run cmd/bbolt/main.go compact -o compact.bolt.db bolt.db
3021692928 -> 1807912960 bytes (gain=1.67x)
88.32s user 29.51s system 39% cpu 5:00.67 total

The second measurement is this PR with both flags set to true (

time go run cmd/bbolt/main.go compact -no-sync -no-freelist-sync -o compact.bolt.db bolt.db
3021692928 -> 1793495040 bytes (gain=1.68x)
38.59s user 4.63s system 126% cpu 34.120 total

In both cases the resultant file was

as an aside: I combined this PR with #291 but didn't measure any significant difference.
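To put the two runs side by side, the arithmetic can be sketched in a few lines of Go. The input values are copied from the `time` output above; the helper name `speedup` is made up for this example. On these figures the wall-clock time drops by roughly 8.8x, and the size ratios match the reported gains.

```go
package main

import "fmt"

// speedup returns how many times smaller improved is than baseline.
func speedup(baseline, improved float64) float64 {
	return baseline / improved
}

func main() {
	// Wall-clock totals from the two runs, in seconds.
	baselineWall := 5*60 + 0.67 // "5:00.67 total"
	noSyncWall := 34.120        // "34.120 total"

	// CPU time (user + system) from the two runs, in seconds.
	baselineCPU := 88.32 + 29.51
	noSyncCPU := 38.59 + 4.63

	fmt.Printf("wall-clock speedup: %.1fx\n", speedup(baselineWall, noSyncWall))
	fmt.Printf("CPU-time reduction: %.1fx\n", speedup(baselineCPU, noSyncCPU))

	// Size ratios, matching the "gain" values printed by compact.
	fmt.Printf("baseline gain: %.2fx\n", speedup(3021692928, 1807912960))
	fmt.Printf("no-sync gain:  %.2fx\n", speedup(3021692928, 1793495040))
}
```

This is a single run on one machine, so the ratio is indicative rather than a benchmark.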
I would actually be happy to default these flags to

It seems appropriate for the

I think it's natural for a user to assume that, if the operation emits a fatal error, the resultant database file is corrupt, and to restart the operation.
cmd/bbolt/main.go

    Defaults to false

    -no-freelist-sync BOOL
        Skip syncing freelists to disk (fast but unsafe)
This is safe. The downside is that the entire database has to be scanned to recompute the freelist at startup.
(If you use this together with NoSync, it is unsafe. But the root cause is the NoSync flag, not this one)
Should I remove this option?
What does 'safe' mean in this context? If you mean safe from corruption due to kernel I/O errors, I'm not super worried in this context; that's unlikely, right? Like a power failure or someone unplugging the disk?
The compact command can either succeed or fail; if the target db is corrupt for any reason we can just fail loudly with an error.
Hi, this PR has been open for over a year now. Is there anything preventing it from being merged?
Sorry for the late response. A couple of points from my side:
Anyway, please rebase this PR first. cc @ptabor as well.
Signed-off-by: missinglink <insomnia@rcpt.at>
PR has been rebased, I have also removed the

This is good to merge
LGTM
Thank you @missinglink
IMHO the default value of

Is there any use case for
The compact CLI command is very slow by default; this PR adds two new CLI flags which can be used to control the db.NoSync and db.NoFreelistSync values when creating the destination database.

Usage:

Enabling these flags had a huge impact on performance, resulting in significantly expedited compact operations.

I would potentially advocate defaulting them to true, but that's a discussion for another day.
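As an illustration of the flag handling described here, the following is a minimal sketch using only Go's standard `flag` package. It is not the code from this PR: the `compactOptions` struct and `parseCompactArgs` function are hypothetical names, and the flag set shown is the one from the description (`-no-sync`, `-no-freelist-sync`, both defaulting to false), which may differ from what was finally merged. In a real command the parsed booleans would presumably be copied onto the destination database's NoSync and NoFreelistSync settings before the copy starts.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// compactOptions is a hypothetical container for the parsed arguments.
type compactOptions struct {
	srcPath        string
	dstPath        string
	noSync         bool // would map to db.NoSync on the destination
	noFreelistSync bool // would map to db.NoFreelistSync on the destination
}

// parseCompactArgs parses arguments in the style of
// "compact [flags] -o DST SRC". Both sync flags default to false.
func parseCompactArgs(args []string) (compactOptions, error) {
	var opts compactOptions
	fs := flag.NewFlagSet("compact", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	fs.StringVar(&opts.dstPath, "o", "", "destination database path")
	fs.BoolVar(&opts.noSync, "no-sync", false, "skip fsync on the destination (fast but unsafe)")
	fs.BoolVar(&opts.noFreelistSync, "no-freelist-sync", false, "skip syncing the freelist to disk")
	if err := fs.Parse(args); err != nil {
		return opts, err
	}
	if fs.NArg() != 1 {
		return opts, fmt.Errorf("expected exactly one source path, got %d", fs.NArg())
	}
	opts.srcPath = fs.Arg(0)
	if opts.dstPath == "" {
		return opts, fmt.Errorf("an output path is required (-o)")
	}
	return opts, nil
}

func main() {
	opts, err := parseCompactArgs([]string{
		"-no-sync", "-no-freelist-sync", "-o", "compact.bolt.db", "bolt.db",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", opts)
}
```

Keeping the defaults at false means the slower, durable behaviour stays the default and the faster path is an explicit opt-in.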