CLI: Rename data file during "format" command #1171

Open
sentientwaffle opened this issue Aug 29, 2023 · 4 comments

@sentientwaffle
Member

Right now, tigerbeetle format <file> writes directly to <file>.

But if tigerbeetle format does not complete properly (e.g. it is interrupted by Ctrl+C, or one of the disk writes fails) then the file shouldn't be used by a replica.

To make it harder to mistake an incompletely formatted data file for a completely formatted one, tigerbeetle format should write to <file>.(random suffix), and then rename it to <file> only when it is done writing.

(This should probably be implemented in main.zig, not vsr/format.zig.)
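The write-then-rename idea above can be sketched roughly as follows. This is a minimal Python illustration, not TigerBeetle code: `format_data_file` and `write_contents` are hypothetical names, and the 8-byte hex suffix is an arbitrary choice for the "random suffix".

```python
import os
import secrets
from typing import BinaryIO, Callable


def format_data_file(path: str, write_contents: Callable[[BinaryIO], object]) -> None:
    """Write to `<path>.<random suffix>`, then rename to `path` once writing is done.

    `write_contents` is a hypothetical callback that writes the whole data file.
    """
    tmp_path = f"{path}.{secrets.token_hex(8)}"
    # Mode "xb" creates the file exclusively: it fails if the name already exists.
    with open(tmp_path, "xb") as f:
        write_contents(f)
        f.flush()
        os.fsync(f.fileno())  # push the contents to disk before the rename
    # Reached only if every write above succeeded. Note that os.replace
    # overwrites an existing destination; refusing to do so is a separate choice.
    os.replace(tmp_path, path)
```

If `write_contents` raises or the process is interrupted, nothing ever appears at `path`; the partial file is left under the suffixed name, where it is visibly not a finished data file. Making the rename itself durable (for example by syncing the parent directory) is left out of this sketch.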

@jorangreef
Member

jorangreef commented Aug 29, 2023

This is a good idea.

Thinking about this some more the last few minutes.

How can we solve this (the user experience issue) from another angle? For example, how can we indicate this in the superblock, rather than rely on the file system, to provide a better error when a data file is not completely formatted?

The reason is that this would also work for raw block devices, where we can't rename, and that a rename on Windows immediately after writing to a file can sometimes fail when an antivirus intervenes. The AV wants to scan the file before it allows the rename, so IIRC you can get flaky EPERM errors.
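For illustration only, one common way to cope with a transiently failing rename is to retry with a short backoff. Nobody in this thread proposes this; it is a sketch of a possible mitigation. `rename_with_retry` and its parameters are hypothetical, and the `rename` argument exists only so the behaviour can be exercised without a real antivirus in the way.

```python
import os
import time
from typing import Callable


def rename_with_retry(
    src: str,
    dst: str,
    attempts: int = 5,
    delay_s: float = 0.05,
    rename: Callable[[str, str], None] = os.replace,
) -> None:
    """Retry a rename that fails with PermissionError, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            rename(src, dst)
            return
        except PermissionError:
            # Python raises PermissionError for EPERM/EACCES (and for
            # "access denied" on Windows). Give up after the last attempt.
            if attempt == attempts - 1:
                raise
            time.sleep(delay_s * (2 ** attempt))
```

A retry loop only papers over the flakiness, and it does nothing for raw block devices, which is the argument for also detecting an incomplete format from the data itself.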

@sentientwaffle
Member Author

I think we already do that automatically -- when formatting, we first write the WAL, then we write the superblock (trailers then header). We could probably improve error messages.

But renaming moves the failure up a level -- the issue is visible from the file system, without running anything at all.
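The ordering mentioned above, where the header is written last, can be sketched like this. It is a simplified Python illustration and does not reflect TigerBeetle's real superblock or WAL layout: `HEADER_SIZE`, `MAGIC`, the SHA-256 checksum, and all function names are assumptions made for the example.

```python
import hashlib
import os
from typing import BinaryIO

HEADER_SIZE = 64       # hypothetical size of the header region
MAGIC = b"FMT-DONE"    # hypothetical marker meaning "format completed"


def write_body(f: BinaryIO, body: bytes) -> None:
    """Step 1: zero the header region, write everything else, and sync."""
    f.seek(0)
    f.write(bytes(HEADER_SIZE))
    f.write(body)
    f.flush()
    os.fsync(f.fileno())


def write_header(f: BinaryIO, body: bytes) -> None:
    """Step 2: write the header last, so it acts as the commit marker."""
    header = (MAGIC + hashlib.sha256(body).digest()).ljust(HEADER_SIZE, b"\0")
    f.seek(0)
    f.write(header)
    f.flush()
    os.fsync(f.fileno())


def format_in_place(path: str, body: bytes) -> None:
    with open(path, "wb") as f:
        write_body(f, body)
        write_header(f, body)


def is_fully_formatted(path: str) -> bool:
    """Return True only if the header is present and matches the body."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
        body = f.read()
    if len(header) < HEADER_SIZE or not header.startswith(MAGIC):
        return False
    digest = header[len(MAGIC):len(MAGIC) + 32]
    return digest == hashlib.sha256(body).digest()
```

A check like `is_fully_formatted` is where a clearer error message could be produced, and it needs no rename, so it would apply to a block device as well. The trade-off is the one noted above: it only helps once something actually opens and inspects the file.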

@jorangreef
Member

> (This should probably be implemented in main.zig, not vsr/format.zig.)

If I understand correctly, I actually think this should be in VSR, since VSR includes stable storage, and since the superblock is part of VSR as a framework, not state machine-specific. If it were state machine-specific, then it would be in main.zig.

@jorangreef
Member

How can we detect the "incomplete format" issue for block devices (and provide a nicer error message)? Should we add code for that? Or could we attack it by minimizing the risk of it happening in the first place?
