Error: schema is corrupt #7064
squalus added a commit to squalus/nix that referenced this issue (Sep 19, 2022):
- call close explicitly in writeFile to prevent the close exception from being ignored
- fsync after writing schema file to flush data to disk
- fsync schema file parent to flush metadata to disk

NixOS#7064
squalus added a commit with the same message to squalus/nix that referenced this issue (Sep 20, 2022).
#7065 takes care of points 1-3, but I'll keep this open because the atomic file write (point 4) could still be done to improve this.
Minion3665 pushed a commit with the same message to Minion3665/nix that referenced this issue (Feb 23, 2023).
Describe the bug
I have seen multiple cases of "schema is corrupt" error messages in a production environment. This tends to happen on NixOS systems that have unexpected power cuts.
In this case, it's an ext4 file system and the schema file is empty.
Steps To Reproduce
I have a minimal test case that uses NixOS tests to simulate a power cut and reproduce the problem: https://github.com/squalus/nix-durability-tests. It can be run on several different file systems.
This will hopefully print a "schema is corrupt" error message.
Expected behavior
The schema file should never be invalid, even if there's an unexpected power cut.
nix-env --version output
nix-env (Nix) 2.8.1
Additional context
Some possible causes:
1. Errors from close(2) are ignored in nix::writeFile. (From man close: Failing to check the return value when closing a file may lead to silent loss of data.)
2. fsync(2) is not run on the file after writing the contents. This means the data may not be fully flushed to disk.
3. fsync(2) is not run on the parent directory after closing the file. This means the directory may have outdated contents. (This wouldn't cause an empty file, but it could cause a mismatch. I haven't yet observed this problem.)
4. The file is not written atomically, i.e. by writing a temporary file and moving it into place with rename(2), like in https://github.com/google/renameio.

Point 2 was addressed in this PR, but it was never merged: #1956
More background: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
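To make the four points concrete, here is a minimal sketch of a durable, atomic file write. It is written in Python rather than Nix's C++ only because the os module maps directly onto close(2), fsync(2) and rename(2). It is not the code from #7065 or from renameio. The function name write_file_durably and the ".tmp-" prefix are made up for illustration, and it assumes a POSIX system where a directory can be opened read-only and fsynced.

```python
import os
import tempfile


def write_file_durably(path: str, data: bytes) -> None:
    """Hypothetical helper: after a crash, path holds either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # Point 4: write to a temporary file in the same directory, so the later
    # rename stays on one file system. mkstemp creates it with mode 0600;
    # a real implementation may need to adjust the permissions.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        try:
            view = memoryview(data)
            while view:
                # os.write may write fewer bytes than requested.
                written = os.write(fd, view)
                view = view[written:]
            # Point 2: flush the file contents to disk before renaming.
            os.fsync(fd)
        finally:
            # Point 1: os.close raises OSError on failure, so a close error
            # is not silently dropped.
            os.close(fd)
        # Point 4: atomically move the new file into place (rename(2) on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Point 3: fsync the parent directory so the new directory entry is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

For example, write_file_durably("/tmp/example/schema", b"10\n") (an arbitrary path and contents) replaces the file in one step. A power cut at any point should leave either the previous complete file or the new complete file, never an empty one, provided the file system honours fsync.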