Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix possible data race when committing block files #14501

Merged
merged 7 commits into from Jan 7, 2021
Merged

Conversation

@luke-jr
Copy link
Member

@luke-jr luke-jr commented Oct 17, 2018

Reviving #12696

@luke-jr luke-jr force-pushed the luke-jr:fsync_dir branch Oct 17, 2018
src/util.cpp Outdated Show resolved Hide resolved
@luke-jr
Copy link
Member Author

@luke-jr luke-jr commented Nov 22, 2018

(GitHub and DrahtBot are misdetecting a rebase needed here; it is a clean merge still... can't at least DrahtBot get fixed??)

@ken2812221
Copy link
Contributor

@ken2812221 ken2812221 commented Nov 22, 2018

You have to move src/util.cpp to src/util/system.cpp?

@luke-jr
Copy link
Member Author

@luke-jr luke-jr commented Nov 22, 2018

git follows the move just fine.

@MarcoFalke
Copy link
Member

@MarcoFalke MarcoFalke commented Nov 22, 2018

@luke-jr There are different merge strategies and flags (https://git-scm.com/docs/git-merge#git-merge-mergetool) and GitHub uses a rather strict one, which they were unable to disclose to me.

It appear that this pull request hasn't had any review yet, so doing a rebase comes at no cost at all.

@DrahtBot
Copy link
Contributor

@DrahtBot DrahtBot commented Sep 30, 2019

There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.
@DrahtBot DrahtBot closed this Sep 30, 2019
@meshcollider meshcollider reopened this Oct 13, 2019
@luke-jr luke-jr force-pushed the luke-jr:fsync_dir branch from 62fb7bb to 425b654 Oct 13, 2019
@luke-jr
Copy link
Member Author

@luke-jr luke-jr commented Oct 13, 2019

Rebased

@DrahtBot
Copy link
Contributor

@DrahtBot DrahtBot commented Oct 30, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@adaminsky
Copy link
Contributor

@adaminsky adaminsky commented Aug 7, 2020

Is this still being worked on? With #19614 recently merged, all that needs to be done is add the DirectoryCommit function. I'm happy to help.

I saw the previous discussion about if calling fsync on an unchanged directory has overhead, and my understanding is that the directory's dirty bit will not be set, so no disk writes will occur. Therefore, the current method looks good to me.

@luke-jr luke-jr force-pushed the luke-jr:fsync_dir branch from 425b654 to 3a0c9b5 Aug 20, 2020
@luke-jr
Copy link
Member Author

@luke-jr luke-jr commented Aug 20, 2020

Rebased and added a fix for #19614 (which was completely broken)

eklitzke and others added 4 commits Mar 15, 2018
It was recently pointed out to me that calling fsync() or fdatasync() on a new
file is not sufficient to ensure it's persisted to disk, a the existence of the
file itself is stored in the directory inode. This means that ensuring that a
new file is actually committed also requires an fsync() on the parent directory.

This change ensures that we call fsync() on the blocks directory after
committing new block files.
@luke-jr luke-jr force-pushed the luke-jr:fsync_dir branch from 3a0c9b5 to ef71229 Aug 25, 2020
fanquake added a commit that referenced this pull request Aug 31, 2020
… LevelDB

c4b85ba Bugfix: Define and use HAVE_FDATASYNC correctly outside LevelDB (Luke Dashjr)

Pull request description:

  Fixes a bug introduced in #19614

  The LevelDB-specific fdatasync check was only using `AC_SUBST`, which works for Makefiles, but doesn't define anything for C++. Furthermore, the #define is typically 0 or 1, never undefined.

  This fixes both issues by defining it and checking its value instead of whether it is merely defined.

  Pulled out of #14501 by fanquake's request

ACKs for top commit:
  fanquake:
    ACK c4b85ba - thanks for catching and fixing my mistake.
  laanwj:
     Code review ACK c4b85ba

Tree-SHA512: 91d5d426ba000b4f3ee7e2315635e24bbb23ceff16269ddf4f65a63d25fc9e9cf94a3b236eed2f8031cc36ddcf78aeb5916efcb244f415943a8a12f907ede8f9
sidhujag added a commit to syscoin/syscoin that referenced this pull request Aug 31, 2020
…outside LevelDB

c4b85ba Bugfix: Define and use HAVE_FDATASYNC correctly outside LevelDB (Luke Dashjr)

Pull request description:

  Fixes a bug introduced in bitcoin#19614

  The LevelDB-specific fdatasync check was only using `AC_SUBST`, which works for Makefiles, but doesn't define anything for C++. Furthermore, the #define is typically 0 or 1, never undefined.

  This fixes both issues by defining it and checking its value instead of whether it is merely defined.

  Pulled out of bitcoin#14501 by fanquake's request

ACKs for top commit:
  fanquake:
    ACK c4b85ba - thanks for catching and fixing my mistake.
  laanwj:
     Code review ACK c4b85ba

Tree-SHA512: 91d5d426ba000b4f3ee7e2315635e24bbb23ceff16269ddf4f65a63d25fc9e9cf94a3b236eed2f8031cc36ddcf78aeb5916efcb244f415943a8a12f907ede8f9
FILE* file = fsbridge::fopen(dirname, "r");
if (file) {
fsync(fileno(file));
fclose(file);

This comment has been minimized.

@Empact

Empact Sep 2, 2020
Member

How about log the possible errors here? E.g.

bitcoin/src/util/system.cpp

Lines 1033 to 1034 in 0adb80f

if (fsync(fileno(file)) != 0 && errno != EINVAL) {
LogPrintf("%s: fsync failed: %d\n", __func__, errno);

This comment has been minimized.

@laanwj

laanwj Dec 21, 2020
Member

If you do that, please restrict it to logging once. Logging fsync errors every time can get very verbose on some network file systems otherwise, which have lots of corner cases with regard to synchronization (more so on directories, we had to patch leveldb for this once).

@laanwj
Copy link
Member

@laanwj laanwj commented Jan 7, 2021

Code review ACK ef71229

@laanwj laanwj merged commit 86a8b35 into bitcoin:master Jan 7, 2021
5 checks passed
5 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@cirrus-ci
x86_64 Linux [GOAL: install] [focal] [depends, sanitizers: thread (TSan), no gui] Task Summary
Details
@cirrus-ci
x86_64 Linux [GOAL: install] [focal] [no depends, only system libs, fuzzers under valgrind] Task Summary
Details
@cirrus-ci
x86_64 Linux [GOAL: install] [focal] [no depends, only system libs, sanitizers: address/leak (ASan + LSan) + undefined (UBSan) + integer] Task Summary
Details
@laanwj laanwj removed this from Blockers in High-priority for review Jan 7, 2021
sidhujag added a commit to syscoin/syscoin that referenced this pull request Jan 7, 2021
ef71229 util: Check for file being NULL in DirectoryCommit (Luke Dashjr)
4574904 Fix possible data race when committing block files (Evan Klitzke)
220bb16 util: Introduce DirectoryCommit commit function to sync a directory (Evan Klitzke)
ce5cbae util.h: Document FileCommit function (Evan Klitzke)
844d650 util: Prefer Mac-specific F_FULLSYNC over fdatasync in FileCommit (Evan Klitzke)
f6cec0b util: Refactor FileCommit from an #if sequence nested in #else, to a sequence of #elif (Evan Klitzke)

Pull request description:

  Reviving bitcoin#12696

ACKs for top commit:
  laanwj:
    Code review ACK ef71229

Tree-SHA512: 07d650990ef4c18d645dee3f9a199a940683ad17557d79d93979a76c4e710d8d70e6eae01d1a5991494a24a7654eb7db868be0c34a31e70b2509945d95bc9cce
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

10 participants