New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util: Make LockDirectory thread-safe, consistent, and fix OpenBSD 6.2 build #12422
Conversation
@meshcollider A question about this code - what is supposed to be the semantics of According to the documentation http://www.boost.org/doc/libs/master/doc/html/interprocess/synchronization_mechanisms.html#interprocess.synchronization_mechanisms.file_lock
The result of that will depend if the lock was taken in the same thread (? I'd expect process, in the case of interprocess locks). I think the intent of the function is to return * Or will it? or will it first try to construct a new Edit: I've tested this, and it seems
|
Also (as noted by goatpig on IRC) this function is currently not thread-safe. That's no issue right now for 0.16, I think, as this is only used by the init thread at init time, but will be with dynamic loading of wallets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the intent of the function is to return true if the process acquired the lock, or already had it, and false if it didn't have the lock already and failed to acquire it? Correct?
Yep, exactly.
A bit of background: when only a single directory was being locked (i.e. when the datadir locking function was separate to the walletdir locking function), the lock was declared as static. But because of the abstraction into a more general directory locking function, @ryanofsky suggested changing it to a map of locks, one per directory. Didn't realize this would cause issues, the proposed change looks good to me. utACK dce11b8
f50e675
to
df8592e
Compare
@meshcollider I've added a unit test, test_LockDirectory. Can you please verify I'm testing the intended behavior of the function? |
95d4f3f
to
ac2bf0f
Compare
So this is interesting: my test already found a divergence between the behavior on Linux and Windows. On Linux,
On Windows, both cases fail:
|
I guess this is due to inconsistent behavior of try_lock. Probably best not to rely on this behavior. Maybe we should replace: lock = locks.emplace(...).first->second;
if (!lock->try_lock()) {
return false;
} with something like: auto inserted = locks.emplace(...);
if (inserted.second && !inserted.first.second->try_lock()) {
locks.erase(inserted.first)
return false;
} |
I agree - I think we shouldn't insert locks that aren't held into the map at all. This makes it possible to use 'already in map' as early-out. While implementing and testing this, I ran into something really frustrating, I found the |
Pushed a new commit, which hopefully fixes the tests on windows: util: Fix multiple use of LockDirectoryThis commit fixes problems with calling LockDirectory multiple times on the same directory, or from multiple threads.
|
src/util.cpp
Outdated
return false; | ||
} | ||
if (probe_only) { | ||
lock.unlock(); | ||
lock->unlock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It think this explicit unlock is unnecessary now, as the lock will fall out of scope when locks.emplace
is not called to move it to the map.
src/util.cpp
Outdated
static std::map<std::string, std::unique_ptr<boost::interprocess::file_lock>> locks; | ||
// Protect the map with a mutex | ||
static std::mutex cs; | ||
std::unique_lock<std::mutex> ulock(cs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::lock_guard should suffice for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK ca4c802. Code looks good though obviously can be squashed. Left minor comment but feel free to ignore it (or complain about it, as may be your wont).
src/test/util_tests.cpp
Outdated
@@ -603,4 +607,71 @@ BOOST_AUTO_TEST_CASE(test_ParseFixedPoint) | |||
BOOST_CHECK(!ParseFixedPoint("1.", 8, &amount)); | |||
} | |||
|
|||
static const std::string LOCKNAME = ".lock"; | |||
|
|||
static void TestOtherThread(fs::path dirname, bool *result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe define TestOtherThread as a lambda instead of an external function. TestOther code would seem easier to understand in the context where it launches instead of out here. Also this would allow test case to be self contained and not need external LOCKNAME/TestOther declarations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually I do prefer (especially) the TestOtherProcess code to be in a self-contained function instead of in-line, because the stuff runs in a separate process. Also making the separate-process test work on windows (not going to do so in this pull) would involve some factoring in that direction anyhow.
If the LOCKNAME constant is a problem it could be passed in as parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah I do see the argument for making it self-contained, too. Really not sure here...
Edit: however, unlike for the thread, inlining TestOtherProcess won't put the code in a context where it's easier to understand, but at the beginning of the function in fork()
else.
ca4c802
to
6a0a3d7
Compare
added commit:
squashed e548d69 2018_01_openbsd_util_fix_v0 -> 6a0a3d7 |
Hmm though it compiles, the new test is failing on OpenBSD, which was the original goal of this PR. Talking about scope drift. Will investigate :)
Fixed: this had to do with different handling of SIGCHLD on BSD versus Linux. |
fs::path pathLockFile = directory / lockfile_name; | ||
FILE* file = fsbridge::fopen(pathLockFile, "a"); // empty lock file; created if it doesn't exist. | ||
|
||
// If a lock for this directory already exists in the map, don't try to re-lock it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe note that this just working around the lack of std::map::try_emplace, which was added in c++17.
Edit: Heh, they even use this exact construction as an example:
Unlike insert or emplace, these functions do not move from rvalue arguments if the insertion does not happen, which makes it easy to manipulate maps whose values are move-only types, such as std::map<std::string, std::unique_ptr>.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused here, to be honest. Does try_emplace
avoid constructing the object entirely if it already exists in the map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See theuni@aa04588 for my attempt at addressing the notes above.
} | ||
|
||
// Create empty lock file if it doesn't exist. | ||
FILE* file = fsbridge::fopen(pathLockFile, "a"); | ||
if (file) fclose(file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May as well bail early if the fopen fails, we're just going to end up in a boost exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played with that thought, but decided against it - that would just add extra code for an error case that is handled by boost, later?
boost::interprocess::file_lock& lock = locks.emplace(pathLockFile.string(), pathLockFile.string().c_str()).first->second; | ||
if (!lock.try_lock()) { | ||
auto lock = MakeUnique<boost::interprocess::file_lock>(pathLockFile.string().c_str()); | ||
if (!lock->try_lock()) { | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize it isn't a regression, but this really needs to be logged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In both cases it's used, it's logged in the caller function, so I think that is redundant?
boost::interprocess::file_lock& lock = locks.emplace(pathLockFile.string(), pathLockFile.string().c_str()).first->second; | ||
if (!lock.try_lock()) { | ||
auto lock = MakeUnique<boost::interprocess::file_lock>(pathLockFile.string().c_str()); | ||
if (!lock->try_lock()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the probe_only case, this won't be unlocked when lock
destructs, as interprocess::file_lock isn't RAII. Was it intended to work that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh are you sure? the whole intent of keeping the things in a std::map is that they get unlocked when they are destructed. Anecdotally this seems to work on windows, at least, the call to ReleaseDirectoryLocks()
seems to release the locks. I can try further.
Edit: shouldn't -daemon
be broken if true? It's what uses probe_only
to probe (acquire and release) the lock from the parent process, before getting it in the child process. At least on Linux it seems to work...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation is confusing here (http://www.boost.org/doc/libs/1_66_0/doc/html/interprocess/synchronization_mechanisms.html#interprocess.synchronization_mechanisms.file_lock):
A file locking is a class that has process lifetime. This means that if a process holding a file lock ends or crashes, the operating system will automatically unlock it. This feature is very useful in some situations where we want to assure automatic unlocking even when the process crashes and avoid leaving blocked resources in the system. A file lock is constructed using the name of the file as an argument:
This would suggest we don't even need to hold on to the file_lock
after locking them, as the locks always have process lifetime. But the test outcomes seem to suggest RAII behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grr, the "theuni requested changes" is a bit overstated... There are a few fixes here and I can't produce any real-world issues, so utACK from me for 0.16 for the sake of not dragging it out.
Further discussion for master, or 0.16 if you still feel like messing with it:
Strange, your tests results (RAII behavior) don't line up with what I see from the docs/code. I'll play with tests.
You mentioned:
The
FILE* file = fsbridge::fopen(pathLockFile, "a");
wipes the 'we own this lock' administration
And from the Boost docs:
In POSIX, when two file descriptors are used to lock a file if a descriptor is closed, all file locks set by the calling process are cleared.
Is it possible that it was the fclose rather than the fopen? Because that would explain everything neatly, I think. In the probe_only
case, You create a lock, lock it, then destroy it. Then on the next invocation you fopen/fclose it, which basically gives a clean start before you create a new lock, lock again, etc.
To be clear, my concern is that in the near future we might adapt LockDirectory() and add UnlockDirectory() for some feature like multi-wallet-dirs, only to be hit with a hard-to-diagnose bug when UnlockDirectory() quietly doesnt' work as intended.
The boost docs suggest using RAII generics here:
scoped_lock and sharable_lock can be used to make file locking easier in the presence of exceptions, just like with mutexes.
And file_lock's destructor is pretty straightforward, I don't see how it could be doing any unlocking:
// m_file_hnd: typedef int (void* for win)
// ipcdetail::close_file: ::close (CloseHandle() for win)
inline file_lock::~file_lock()
{
if(m_file_hnd != ipcdetail::invalid_file()){
ipcdetail::close_file(m_file_hnd);
m_file_hnd = ipcdetail::invalid_file();
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that it was the fclose rather than the fopen? Because that would explain everything neatly, I think. In the probe_only case, You create a lock, lock it, then destroy it. Then on the next invocation you fopen/fclose it, which basically gives a clean start before you create a new lock, lock again, etc.
Whoa, sneaky. Yes, that's possible. Let's try the tests that by using an already-existing directory with an existing lock file, without that fopen
part.
LGTM, yep. Ugh OS inconsistencies make things like this so fun. Code changes look good though. cfields changes too, utACK |
Now that I had to add |
In the new commit I added unit tests for lock probing, to see if |
To check this, I applied the following patch, removing all manual file and directory handling: diff --git a/src/test/util_tests.cpp b/src/test/util_tests.cpp
index 4b2da3e..80d76e6 100644
--- a/src/test/util_tests.cpp
+++ b/src/test/util_tests.cpp
@@ -647,7 +647,7 @@ static void TestOtherProcess(fs::path dirname, std::string lockname, int fd)
BOOST_AUTO_TEST_CASE(test_LockDirectory)
{
- fs::path dirname = fs::temp_directory_path() / fs::unique_path();
+ fs::path dirname("/tmp/locktest");
const std::string lockname = ".lock";
#ifndef WIN32
// Revert SIGCHLD to default, otherwise boost.test will catch and fail on
@@ -667,10 +667,6 @@ BOOST_AUTO_TEST_CASE(test_LockDirectory)
}
BOOST_CHECK_EQUAL(close(fd[0]), 0); // Parent: close child end
#endif
- // Lock on non-existent directory should fail
- BOOST_CHECK_EQUAL(LockDirectory(dirname, lockname), false);
-
- fs::create_directories(dirname);
// Probing lock on new directory should succeed
BOOST_CHECK_EQUAL(LockDirectory(dirname, lockname, true), true);
@@ -730,7 +726,6 @@ BOOST_AUTO_TEST_CASE(test_LockDirectory)
#endif
// Clean up
ReleaseDirectoryLocks();
- fs::remove_all(dirname);
}
BOOST_AUTO_TEST_SUITE_END()
diff --git a/src/util.cpp b/src/util.cpp
index dcf7ed3..aa29a04 100644
--- a/src/util.cpp
+++ b/src/util.cpp
@@ -392,10 +392,6 @@ bool LockDirectory(const fs::path& directory, const std::string lockfile_name, b
return true;
}
- // Create empty lock file if it doesn't exist.
- FILE* file = fsbridge::fopen(pathLockFile, "a");
- if (file) fclose(file);
-
try {
auto lock = MakeUnique<boost::interprocess::file_lock>(pathLockFile.string().c_str());
if (!lock->try_lock()) { Then manually created the lock directory and file: mkdir /tmp/locktest && touch /tmp/locktest/.lock Result:
... yep, it just works. Curious.
If you can't find a problem with my methodology above, I agree, let's get back to this discussion after 0.16. |
|
This commit fixes problems with calling LockDirectory multiple times on the same directory, or from multiple threads. It also fixes the build on OpenBSD. - Wrap the boost::interprocess::file_lock in a std::unique_ptr inside the map that keeps track of per-directory locks. This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no observable effect otherwise. - Protect the locks map using a mutex. - Make sure that only locks that are successfully acquired are inserted in the map. - Open the lock file for appending only if we know we don't have the lock yet - The `FILE* file = fsbridge::fopen(pathLockFile, "a");` wipes the 'we own this lock' administration, likely because it opens a new fd for the locked file then closes it.
Add a unit test for LockDirectory, introduced in bitcoin#11281.
46d4632
to
1d4cbd2
Compare
squashed into two commits (one that makes the changes to LockDatadirectory, one that adds the unit test) |
…ix OpenBSD 6.2 build 1d4cbd2 test: Add unit test for LockDirectory (Wladimir J. van der Laan) fc888bf util: Fix multiple use of LockDirectory (Wladimir J. van der Laan) Pull request description: Wrap the `boost::interprocess::file_lock` in a `std::unique_ptr` inside the map that keeps track of per-directory locks. This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no effect otherwise. Also add a unit test, make the function thread-safe, and fix Linux versus Windows behavior inconsistency. Meant to fix #12413. Tree-SHA512: 1a94c714c932524a51212c46e8951c129337d57b00fd3da5a347c6bcf6a947706cd440f39df935591b2079995136917f71ca7435fb356f6e8a128c509a62ec32
This commit fixes problems with calling LockDirectory multiple times on the same directory, or from multiple threads. It also fixes the build on OpenBSD. - Wrap the boost::interprocess::file_lock in a std::unique_ptr inside the map that keeps track of per-directory locks. This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no observable effect otherwise. - Protect the locks map using a mutex. - Make sure that only locks that are successfully acquired are inserted in the map. - Open the lock file for appending only if we know we don't have the lock yet - The `FILE* file = fsbridge::fopen(pathLockFile, "a");` wipes the 'we own this lock' administration, likely because it opens a new fd for the locked file then closes it. Github-Pull: #12422 Rebased-From: fc888bf Tree-SHA512: c8b8942fad9ea9d9e55f8e5e360f7cf3a8b00cd17e6bc5ec5895e1e6ddcbca796e62e82856e82f0562869a624d75ad510e108077461bb47a87b2b52be0aba866
This commit fixes problems with calling LockDirectory multiple times on the same directory, or from multiple threads. It also fixes the build on OpenBSD. - Wrap the boost::interprocess::file_lock in a std::unique_ptr inside the map that keeps track of per-directory locks. This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no observable effect otherwise. - Protect the locks map using a mutex. - Make sure that only locks that are successfully acquired are inserted in the map. - Open the lock file for appending only if we know we don't have the lock yet - The `FILE* file = fsbridge::fopen(pathLockFile, "a");` wipes the 'we own this lock' administration, likely because it opens a new fd for the locked file then closes it. Github-Pull: bitcoin#12422 Rebased-From: fc888bf Tree-SHA512: c8b8942fad9ea9d9e55f8e5e360f7cf3a8b00cd17e6bc5ec5895e1e6ddcbca796e62e82856e82f0562869a624d75ad510e108077461bb47a87b2b52be0aba866
Add a unit test for LockDirectory, introduced in bitcoin#11281. Github-Pull: bitcoin#12422 Rebased-From: 1d4cbd2 Tree-SHA512: 8186f4b22c65153e30ec1e0b68be20fd59d34e1fe565d99f3b5a302780125953b867fb14c37144de3fd7baf5751df94bf3901eebbb7b820491ca452706d4e205
…, and fix OpenBSD 6.2 build 1d4cbd2 test: Add unit test for LockDirectory (Wladimir J. van der Laan) fc888bf util: Fix multiple use of LockDirectory (Wladimir J. van der Laan) Pull request description: Wrap the `boost::interprocess::file_lock` in a `std::unique_ptr` inside the map that keeps track of per-directory locks. This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no effect otherwise. Also add a unit test, make the function thread-safe, and fix Linux versus Windows behavior inconsistency. Meant to fix bitcoin#12413. Tree-SHA512: 1a94c714c932524a51212c46e8951c129337d57b00fd3da5a347c6bcf6a947706cd440f39df935591b2079995136917f71ca7435fb356f6e8a128c509a62ec32
…, and fix OpenBSD 6.2 build 1d4cbd2 test: Add unit test for LockDirectory (Wladimir J. van der Laan) fc888bf util: Fix multiple use of LockDirectory (Wladimir J. van der Laan) Pull request description: Wrap the `boost::interprocess::file_lock` in a `std::unique_ptr` inside the map that keeps track of per-directory locks. This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no effect otherwise. Also add a unit test, make the function thread-safe, and fix Linux versus Windows behavior inconsistency. Meant to fix bitcoin#12413. Tree-SHA512: 1a94c714c932524a51212c46e8951c129337d57b00fd3da5a347c6bcf6a947706cd440f39df935591b2079995136917f71ca7435fb356f6e8a128c509a62ec32
…, and fix OpenBSD 6.2 build 1d4cbd2 test: Add unit test for LockDirectory (Wladimir J. van der Laan) fc888bf util: Fix multiple use of LockDirectory (Wladimir J. van der Laan) Pull request description: Wrap the `boost::interprocess::file_lock` in a `std::unique_ptr` inside the map that keeps track of per-directory locks. This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no effect otherwise. Also add a unit test, make the function thread-safe, and fix Linux versus Windows behavior inconsistency. Meant to fix bitcoin#12413. Tree-SHA512: 1a94c714c932524a51212c46e8951c129337d57b00fd3da5a347c6bcf6a947706cd440f39df935591b2079995136917f71ca7435fb356f6e8a128c509a62ec32
63e0be6 [Remove] By-pass logprint-scanner restriction. (furszy) 280ced3 utils: Fix broken Windows filelock (Chun Kuan Lee) be89860 utils: Convert Windows args to utf-8 string (Chun Kuan Lee) e8cfa6e Call unicode API on Windows (Chun Kuan Lee) 1a02a8a tests: Add test case for std::ios_base::ate (Chun Kuan Lee) 2e57cd4 Move boost/std fstream to fsbridge (furszy) 9d8bcd4 utils: Add fsbridge fstream function wrapper (Chun Kuan Lee) d59d48d utils: Convert fs error messages from multibyte to utf-8 (ken2812221) 9ef58cc Logging: use "fmterr" variable name for errors instead of general "e" that can be used by any other function. (furszy) dd94241 utils: Use _wfopen and _wreopen on Windows (Chun Kuan Lee) 3993641 add unicode compatible file_lock for Windows (Chun Kuan Lee) 48349f8 Provide relevant error message if datadir is not writable. (murrayn) Pull request description: As the software is currently using the ANSI encoding on Windows, the user's language settings could affect the proper functioning of the node/wallet, to the point of not be able to open some non-ASCII name files and directories. This solves the Windows encoding issues, completing the entire bitcoin#13869 work path (and some other required backports). Enabling for example users that use non-english characters in directories and file names to be accepted. Backported PRs: * bitcoin#12422. * bitcoin#12630. * bitcoin#13862. * bitcoin#13866. * bitcoin#13877. * bitcoin#13878. * bitcoin#13883. * bitcoin#13884. * bitcoin#13886. * bitcoin#13888. * bitcoin#14192. * bitcoin#13734. * bitcoin#14426. This is built on top of other two PRs that i have open #2423 and #2369. Solves old issues #940 and #2163. TODO: * Backport `assert_start_raises_init_error` and `ErrorMatch` in TestNode` (bitcoin#12718) ACKs for top commit: Fuzzbawls: ACK 63e0be6 random-zebra: ACK 63e0be6 and merging... Tree-SHA512: cb1f7c23abb5b7b3af50bba18652cc2cad93fd7c2fca9c16ffd3fee34c4c152a3b666dfa87fe6b44c430064dcdee4367144dcb4a41203c91b0173b805bdb3d7d
Wrap the
boost::interprocess::file_lock
in astd::unique_ptr
inside the map that keeps track of per-directory locks.This fixes a build issue with the clang 4.0.0+boost-1.58.0p8 version combo on OpenBSD 6.2, and should have no effect otherwise.
Also add a unit test, make the function thread-safe, and fix Linux versus Windows behavior inconsistency.
Meant to fix #12413.