CI failure: ThreadSanitizer: data race /usr/lib/llvm-12/bin/../include/c++/v1/ios:523:12 in std::__1::ios_base::width() const
#23366
Comments
Indeed, this is not an issue with #15719, but with master
Looking at https://cirrus-ci.com/github/bitcoin/bitcoin/master, I guess this problem does not happen reliably on master, but is nondeterministic? IIRC, when I was seeing similar TSAN errors months ago in #15719, the problem seemed to be deterministic and would happen every time. I think the TSAN suppression seems safe, but I am not completely confident about it. And even if it is thread safe, it still means multiple threads could be writing to stdout at the same time and producing garbled output. Also, I'm not sure why this bug would start happening now if #21526 was merged a month ago, so I'm wondering if something else might have changed recently? Or if this happens infrequently enough to not be noticed? Or if this was noticed but was just ignored for a little while? It would be nice if there was a way to easily grep cirrus logs and find all the instances of this.
As mentioned The problem is, it seems that io manipulators are not threadsafe: the stream members are not atomic, and there are no locks: https://github.com/llvm/llvm-project/blob/release/12.x/libcxx/include/ios#L372 But many methods call This is done here: https://github.com/llvm/llvm-project/blob/main/libcxx/include/locale#L1400 which is called whenever e.g. So even when the standard says that concurrent access to
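To illustrate why an ordinary formatted write touches the width member at all, here is a minimal sketch. The names `WidthTrace` and `TraceWidth` are hypothetical and come from neither libc++ nor Bitcoin Core. It relies only on the standard behavior that a formatted inserter resets the field width to 0 after it writes, so every such `<<` both reads and modifies stream state:

```cpp
#include <cassert>
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical record of the stream's width around one formatted insertion.
struct WidthTrace {
    std::streamsize before_insert;
    std::streamsize after_insert;
    std::string text;
};

// Uses a private ostringstream, so there is no sharing and no race here.
// The point is only to show that operator<< writes to the width member.
WidthTrace TraceWidth(int value, int requested)
{
    assert(requested >= 0);
    std::ostringstream os;
    os << std::setw(requested);          // stores `requested` in the stream
    const std::streamsize before = os.width();
    os << value;                         // pads the output, then resets width to 0
    const std::streamsize after = os.width();
    return WidthTrace{before, after, os.str()};
}
```

If two threads do this on the same stream object without synchronization, the read in one thread and the reset in another are unordered accesses to the same non-atomic member, which is the kind of access pattern TSAN reports.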
Interesting that no one else ran into this before us. Maybe a bug should be reported, since I couldn't find any: https://bugs.llvm.org/buglist.cgi?quicksearch=race%20ios_base&list_id=226152
96c7db9 test: Add ios_base::width tsan suppression (Hennadii Stepanov)

Pull request description:

This PR:
- adds tsan suppression for intermittent failures in CI
```
SUMMARY: ThreadSanitizer: data race /usr/lib/llvm-12/bin/../include/c++/v1/ios:523:12 in std::__1::ios_base::width() const
```
- fixes bitcoin#23366

ACKs for top commit:
laanwj: Concept and code review ACK 96c7db9

Tree-SHA512: fcad296e8da4a6d94dcbb011c3d9b3d07f6983818be16cfff8341a035fa6abe2777ae72409c9bc83083097660408a850c1e9cd6f0ad3ea7976e4a4768f1e1858
I guess #23370 resolved CI failures, but from Martin's comment #23366 (comment), it seems like the suppression isn't ideal because there is a real data race. I think good followups would be to:
Could maybe mark this issue up for grabs if this makes sense and is of interest.
If the boost test logging has internal locking, that seems a fine alternative. Are you referring to
I was being vague because I didn't look into where these cout writes were coming from. But from the cirrus TSAN output above, one To be able to remove the suppression, all these cout writes would need to be protected with a lock. We can lock I'm not sure if this would be worth it, but it seems something like this would be needed to remove the suppression.
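For reference, protecting writes with a lock could look roughly like the sketch below. `g_stream_mutex`, `LockedWrite`, and `WriteFromThreads` are hypothetical names used for illustration only and are not existing Bitcoin Core or Boost.Test APIs. It assumes every writer goes through the same helper:

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical global mutex guarding one shared output stream.
std::mutex g_stream_mutex;

// Writes one complete line while holding the mutex, so stream state
// (width, flags, buffer) is never touched by two threads at once.
void LockedWrite(std::ostream& os, const std::string& line)
{
    std::lock_guard<std::mutex> lock(g_stream_mutex);
    os << line << '\n';
}

// Starts num_threads writers on the same stream and returns the number
// of lines they were asked to write in total.
int WriteFromThreads(std::ostream& os, int num_threads, int lines_per_thread)
{
    assert(num_threads > 0 && lines_per_thread > 0);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&os, t, lines_per_thread] {
            for (int i = 0; i < lines_per_thread; ++i) {
                LockedWrite(os, "thread " + std::to_string(t) + " line " + std::to_string(i));
            }
        });
    }
    for (auto& th : threads) th.join();
    return num_threads * lines_per_thread;
}
```

The catch is that this only helps if all writers cooperate. Any code that writes to the same stream without taking the mutex would still race, which is why output coming from inside a third-party library is the hard part.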
Someone (not me) created a bug for exactly this issue, with a nice short reproducer: https://bugs.llvm.org/show_bug.cgi?id=52509 In C++20 there will be
bugs.llvm was deleted. The bug is now here: llvm/llvm-project#51851
Maybe we can set https://en.cppreference.com/w/cpp/io/ios_base/sync_with_stdio as a work-around to get thread-safety? Edit: nvm, this should already be set by default?
I guess
I'm seeing this race failure in two unrelated PRs during the validation_chainstate_tests/chainstate_update_tip case:
https://cirrus-ci.com/task/5125235994263552?logs=ci#L4140 from #22702
https://cirrus-ci.com/task/5269464468946944?logs=ci#L4142 from #21206
I'm not sure if this is a real bug or spurious, but I did work around a similar problem previously (in #15719) by adding a suppression:
bitcoin/test/sanitizer_suppressions/tsan
Lines 33 to 40 in 81be7ff