
Fix possible data race when committing block files #12696

Closed
wants to merge 1 commit into
base: master
from

Conversation

8 participants
@eklitzke
Member

eklitzke commented Mar 15, 2018

It was recently pointed out to me that calling fsync() or fdatasync() on a new file is not sufficient to ensure it's persisted to disk, because the existence of the file itself is recorded in the parent directory's metadata, not in the file's own inode. This means that ensuring a new file is actually committed also requires an fsync() on the parent directory. There are lots of discussions about this topic online, e.g. here. This only applies to new files; calling fsync() on an existing file is always sufficient.

In theory this means that the block index can race block persistence: a poorly timed power outage could cause us to commit data to the block index that then gets lost in the filesystem. The change here ensures that we call fsync() on the blocks directory after committing new block files. I checked the LevelDB source code, and it already does this when updating its write-ahead log. In theory this could happen at the same time as a chain split, in which case you could come back online missing a block you had already committed to the index, leaving the index and the block files permanently out of sync. This seems pretty far fetched, but we should handle this case correctly anyway.

I'm also using a new autoconf macro, AC_CHECK_FUNCS(). It checks that a function is available and, if so, defines a HAVE_* macro, analogous to AC_CHECK_HEADERS(). Right now autoscan complains a lot that we're not using it, so I figured I might as well start while I was in this part of the code.
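A minimal configure.ac fragment illustrating the macro (the exact function list is an assumption for illustration, not the PR's diff):

```m4
# Probe for fdatasync(); when the link check succeeds this defines
# HAVE_FDATASYNC in config.h, analogous to the HAVE_* macros that
# AC_CHECK_HEADERS produces for headers.
AC_CHECK_FUNCS([fdatasync])
```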

Apparently Windows doesn't have a similar method of syncing filesystem metadata, though I'm not an expert on that platform.

Also not strictly related to this change, but I have been working on a lot of platform-specific PRs recently and want to refactor util.h and util.cpp so the platform-specific bits are isolated from the generic util stuff. I intend to create an issue later today to describe how I think that should be done so I can get feedback before starting that work.

@eklitzke

Member Author

eklitzke commented Mar 15, 2018

Apparently macOS erroneously reports that it has fdatasync(), discussion here: haskell/unix#37

Fix possible data race when committing block files
It was recently pointed out to me that calling fsync() or fdatasync() on a new
file is not sufficient to ensure it's persisted to disk, as the existence of the
file itself is stored in the directory inode. This means that ensuring that a
new file is actually committed also requires an fsync() on the parent directory.

This change ensures that we call fsync() on the blocks directory after
committing new block files.

@eklitzke eklitzke force-pushed the eklitzke:fsync branch to 4894e36 Mar 15, 2018

@promag
Copy link
Member

promag left a comment

utACK 4894e36.

* Sync directory contents. This is required on some environments to ensure that
* newly created files are committed to disk.
*/
void DirectoryCommit(const fs::path &dirname);

@promag

promag Mar 15, 2018

Member

nit, fs:path& dirname. The same in the implementation.

@donaloconnor
Copy link
Contributor

donaloconnor left a comment

utACK 4894e36

@donaloconnor

Contributor

donaloconnor commented Mar 15, 2018

Just in case (I'm not very familiar with this part of Bitcoin): is there a reason this isn't done for the undo data also? If it makes sense there then maybe it's okay to do it once at the end of the function instead of in both?

@eklitzke

Member Author

eklitzke commented Mar 16, 2018

While I was looking at this again (originally to commit the undo data as @donaloconnor suggested), I decided it would be nice to sync the parent directory only when we know there's actually a new block file on disk. The block files are 128 MB, so most of the time we're flushing we don't need to sync the parent directory.

The full logic for all of the paths that can cause a file to be created is really confusing. First I tried adding a global variable that tracks whether a new file has been opened, so that we sync the directory only when that's set. Then I tried to understand the fFinalize flag so I could add a doxygen docstring to FlushBlockToDisk(). The full code path for how this gets called is pretty confusing and the methods aren't well documented, but it seemed like the flag was true in the same cases where there would be a new file, since we "finalize" (i.e. truncate) the file when it is not "known to already reside on disk". But why exactly we would need to truncate a file that doesn't exist yet is unclear. I think it's to support reorgs, but it was kind of hard to wrap my head around.

Since this is pretty important logic I feel like some of these code paths at least need better comments. I am going to take another look through the code tomorrow and see if I can work it out. Maybe we end up always syncing the parent directory anyway since that's always safe, but I want to make sure I have a better grasp of all of the logic in this file.

@ryanofsky
Copy link
Contributor

ryanofsky left a comment

Realise you are still working on this, but utACK 4894e36 for the current version of the code, which does look good. The PR description is a little out of date, though; the part about AC_CHECK_FUNCS should probably be removed.

@@ -1614,6 +1614,7 @@ void static FlushBlockFile(bool fFinalize = false)
if (fFinalize)
TruncateFile(fileOld, vinfoBlockFile[nLastBlockFile].nSize);
FileCommit(fileOld);
DirectoryCommit(GetDataDir() / "blocks");


@ryanofsky

ryanofsky Mar 22, 2018

Contributor

Realize you are currently working on this, but if you decide to go with this code as is, you should definitely add a comment here noting that this call isn't needed when no new files are being added, in case somebody wants to optimize this later.


@luke-jr

luke-jr Jun 12, 2018

Member

Won't it be a reasonably fast no-op if the directory hasn't changed?

#elif HAVE_FDATASYNC
fdatasync(fileno(file));
#else
fsync(fileno(file));


@ryanofsky

ryanofsky Mar 22, 2018

Contributor

You aren't changing this, but it would be nice if these functions returned errors so failures could be logged.

void DirectoryCommit(const fs::path &dirname)
{
#ifndef WIN32
FILE* file = fsbridge::fopen(dirname, "r");
fsync(fileno(file));

@luke-jr

luke-jr Jun 12, 2018

Member

Is fsync really enough here, or should we be calling FileCommit on it?

@fanquake

Member

fanquake commented Jul 25, 2018

Closing with "Up for grabs".
