Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: define BOOST_NO_CONFIG #26148

Closed
wants to merge 3 commits into from
Closed

Conversation

fanquake
Copy link
Member

Define BOOST_NO_CONFIG, which according to Boost is what we should be doing:

// define this to disable all config options,
// excluding the user config. Use if your
// setup is fully ISO compliant, and has no
// useful extensions, or for autoconf generated
// setups:
// #define BOOST_NO_CONFIG

Defining BOOST_NO_CONFIG, means we need some additional defines to tell Boost to avoid trying to use code that has been removed in C++17. Namely std::auto_ptr (BOOST_NO_AUTO_PTR) and std::bind1st (BOOST_NO_CXX98_BINDERS).

Partially split out of #24742, as this should be ok to do independently. With this change, we'll be able to prune another ~80 Boost headers from our depends bundle.

@hebasto
Copy link
Member

hebasto commented Sep 21, 2022

we'll be able to prune another ~80 Boost headers from our depends bundle.

Concept ACK on it.

@fanquake
Copy link
Member Author

Hopefully fixed all the Boost Process nonsense by telling ASIO to be standalone.

Copy link

@amovfx amovfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I built to try and repro the check error. But everything seems working on my end.
I dug thought the error and it looks like python tests are failing.
Hope this helps.
OK Traceback (most recent call last): File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/create_cache.py", line 27, in <module> CreateCache().main() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 156, in main exit_code = self.shutdown() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 311, in shutdown self.stop_nodes() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 567, in stop_nodes node.stop_node(wait=wait, wait_until_stopped=False) File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_node.py", line 338, in stop_node self.stop(wait=wait) File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_node.py", line 186, in __getattr__ assert self.rpc_connected and self.rpc is not None, self._node_msg("Error: no RPC connection") AssertionError: [node 0] Error: no RPC connection 2022-09-21T15:15:08.767000Z TestFramework (INFO): Initializing test directory /tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20220921_151507/cache 2022-09-21T15:15:09.525000Z TestFramework (ERROR): Unexpected exception caught during testing Traceback (most recent call last): File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 132, in main self.setup() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 294, in setup self.setup_chain() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 385, in setup_chain self._initialize_chain() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 779, in _initialize_chain self.start_node(CACHE_NODE_ID) File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 534, in start_node node.wait_for_rpc_connection() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_node.py", line 227, in wait_for_rpc_connection raise FailedToStartError(self._node_msg( test_framework.test_node.FailedToStartError: [node 0] bitcoind exited with status 66 during initialization 2022-09-21T15:15:09.576000Z TestFramework (INFO): Stopping nodes [node 0] Cleaning up leftover process Running Unit Tests for Test Framework Modules Traceback (most recent call last): File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_runner.py", line 842, in <module> main() File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_runner.py", line 479, in main run_tests( File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_runner.py", line 533, in run_tests subprocess.check_output([sys.executable, tests_dir + 'create_cache.py'] + flags + ["--tmpdir=%s/cache" % tmpdir]) File "/usr/lib/python3.10/subprocess.py", line 420, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "/usr/lib/python3.10/subprocess.py", line 524, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['/usr/bin/python3', '/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/create_cache.py', '--cachedir=/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/cache', '--timeout-factor=40', '--configfile=/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/../config.ini', '--tmpdir=/tmp/cirrus-ci-build/ci/scratch/test_runner//test_runner_₿_🏃_20220921_151507/cache']' returned non-zero exit status 1.

@fanquake
Copy link
Member Author

Hopefully fixed all the Boost Process nonsense by telling ASIO to be standalone.

Looks like the only issue now is in the TSan job https://github.com/bitcoin/bitcoin/runs/8474237074:

WARNING: ThreadSanitizer: data race (pid=28184)
  Read of size 4 at 0x7b1c000103ac by main thread (mutexes: write M131597):
    #0 void boost::signals2::detail::connection_body_base::dec_slot_refcount<boost::signals2::detail::connection_body_base>(boost::signals2::detail::garbage_collecting_lock<boost::signals2::detail::connection_body_base>&) const /tmp/cirrus-ci-build/depends/x86_64-pc-linux-gnu/include/boost/signals2/connection.hpp:124:11 (bitcoind+0x176ec7)
    #1 void boost::signals2::detail::connection_body_base::nolock_disconnect<boost::signals2::detail::connection_body_base>(boost::signals2::detail::garbage_collecting_lock<boost::signals2::detail::connection_body_base>&) const /tmp/cirrus-ci-build/depends/x86_64-pc-linux-gnu/include/boost/signals2/connection.hpp:77:13 (bitcoind+0x176af3)
    #2 boost::signals2::detail::connection_body_base::disconnect() /tmp/cirrus-ci-build/depends/x86_64-pc-linux-gnu/include/boost/signals2/connection.hpp:69:11 (bitcoind+0x176af3)
    #3 boost::signals2::connection::disconnect() const /tmp/cirrus-ci-build/depends/x86_64-pc-linux-gnu/include/boost/signals2/connection.hpp:270:25 (bitcoind+0x172098)
    #4 AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) src/init.cpp:1612:46 (bitcoind+0x162ba5)
    #5 AppInit(node::NodeContext&, int, char**) src/bitcoind.cpp:234:43 (bitcoind+0x148799)
    #6 main src/bitcoind.cpp:278:13 (bitcoind+0x148799)

@amovfx
Copy link

amovfx commented Sep 21, 2022

I would really like to learn how to debug and fix something like this. I just ran all the functional tests locally and they passed.
I guess I need to configure for Thread Sanitizers? This is new to me.

I'm trying this
https://jonatack.github.io/articles/how-to-compile-bitcoin-core-and-run-the-tests

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 23, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@jarolrod
Copy link
Member

concept ack

@theuni
Copy link
Member

theuni commented Sep 23, 2022

One observation while working through this... I tried compiling with -Werror on accident, and a bunch of things jumped out at me:

It seems a compiler config doesn't get chosen: https://github.com/boostorg/config/blob/1cff5e37bb69f42ffab73438de61e34c71258f19/include/boost/config.hpp#L34

In turn, this means that things like BOOST_FALLTHROUGH don't get set at all, as they're set by the compiler config: https://github.com/boostorg/config/blob/1cff5e37bb69f42ffab73438de61e34c71258f19/include/boost/config/detail/suffix.hpp#L999

As a result we get a bunch of warnings (or errors) about it.

Arguably that's a boost bug as [[fallthrough]] is now standardized with c++17 and is no-longer compiler specific, but it makes me feel a little uneasy about other such features.

@theuni
Copy link
Member

theuni commented Sep 23, 2022

Working through this some more:

-DBOOST_NO_CONFIG ends up disabling BOOST_COMPILER_CONFIG, BOOST_STDLIB_CONFIG, and BOOST_PLATFORM_CONFIG.

After reproducing the problem locally, I disabled each one by one until discovering that the culprit is BOOST_PLATFORM_CONFIG.

For linux, that ends up setting a few defines: https://github.com/boostorg/config/blob/1cff5e37bb69f42ffab73438de61e34c71258f19/include/boost/config/platform/linux.hpp

And pulls in some posix defines as well: https://github.com/boostorg/config/blob/1cff5e37bb69f42ffab73438de61e34c71258f19/include/boost/config/detail/posix_features.hpp

As a next step, I added defines until tests passed (or at least didn't crash in the very beginning as before, I haven't let them run all the way through yet) using the following defines:

-DBOOST_NO_USER_CONFIG -DBOOST_NO_AUTO_PTR -DBOOST_NO_CXX98_BINDERS -DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION -DBOOST_NO_CXX98_FUNCTION_BASE -DBOOST_NO_PLATFORM_CONFIG -DBOOST_HAS_STDINT_H -DBOOST_HAS_GETTIMEOFDAY -DBOOST_HAS_NANOSLEEP -DBOOST_HAS_PTHREAD_YIELD -DBOOST_HAS_PTHREADS -DBOOST_HAS_CLOCK_GETTIME -DBOOST_HAS_SCHED_YIELD -DBOOST_HAS_PTHREAD_MUTEXATTR_SETTYPE

I then narrowed down the culprit: -DBOOST_HAS_PTHREADS

Adding that to BOOST_CPPFLAGS eliminates the race and makes perfect sense, but I haven't yet investigated why and I'm stopping there for the day.

It seems that

for autoconf generated setups

means "for builders who have combed through every one of our defines and tested/set them as necessary". I'm not entirely opposed, but that's a scary proposition :)

@theuni
Copy link
Member

theuni commented Sep 23, 2022

Actually, the problem here is pretty immediately obvious:

https://github.com/boostorg/signals2/blob/49ed1573fa91bb8c60d0a20a5d5e36d8b6d94ea9/include/boost/signals2/mutex.hpp#L27

signals2 doesn't have a std threads implementation, only platform-specific ones. Though the interface is so braindead simple that I wonder if we could just upstream a standards-compliant one: https://github.com/boostorg/signals2/blob/49ed1573fa91bb8c60d0a20a5d5e36d8b6d94ea9/include/boost/signals2/detail/lwm_pthreads.hpp

@theuni
Copy link
Member

theuni commented Sep 26, 2022

signals2 doesn't have a std threads implementation, only platform-specific ones. Though the interface is so braindead simple that I wonder if we could just upstream a standards-compliant one: https://github.com/boostorg/signals2/blob/49ed1573fa91bb8c60d0a20a5d5e36d8b6d94ea9/include/boost/signals2/detail/lwm_pthreads.hpp

Here's a quick depends patch for boost that does this: theuni@0c308c8

Note that it is likely quite broken for boost generally as it re-enables threading for c++11 and up, whereas before it would've only been enabled when platform-specific threading was chosen. IMO this is a correct approach, but lots of other boost libs likely need to be taught to understand that.

It's enough to keep our tests from crashing at startup though.

@theuni
Copy link
Member

theuni commented Sep 26, 2022

Yeah, considering that this silently broke thread-safety, I'm pretty uneasy with this change. (Hooray for TSAN catching it though!)

At least, if we're going to go forward with it, we need some way of comparing all of the features that are now disabled because they haven't been opted-in by the compiler/stdlib/platform configs. That could be done by compiling both ways with -E -dm and diffing the output.

Judging by the patching required above, it would seem that these code-paths are not encountered very often, so it'll probably require a good bit of patching/upstreaming before we have all of our supported features turned back on.

Is there maybe some subset (of user|compiler|stdlib|platform) that could be disabled and still fix the problem with asio?

@fanquake
Copy link
Member Author

Is there maybe some subset (of user|compiler|stdlib|platform) that could be disabled and still fix the problem with asio?

I need to catch up on your discussion above, but the ASIO change, along with BOOST_NO_AUTO_PTR and BOOST_NO_CXX98_BINDERS should all be stand alone, and could be split from this PR.

@fanquake
Copy link
Member Author

Closing for now. Will circle back to this; maybe after we've followed up with Signals2.

@fanquake fanquake closed this Nov 30, 2022
@fanquake fanquake deleted the boost_no_config branch September 11, 2023 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants