Thread / shutdown cleanup #2357

merged 7 commits into from Apr 4, 2013


None yet

8 participants


The shutdown code has bothered me for a long time, with fShutdown and vnThreadsRunning and a bunch of ad-hoc Sleeps().

Reports of shutdown-on-exit bugs in 0.8 prompted me to clean it all up.

This pull reworks the way we keep track of threads, using boost::thread_groups or boost::thread pointers.

Instead of polling for a global fShutdown flag, threads now just MilliSleep(), wait on boost mutexes, or periodically call boost::this_thread::interruption_point(), and are interrupted using boost's thread interruption mechanisms.

All of the cleanup removes 400 lines of code, and, I hope, makes the code easier to follow.

Tested on OSX extensively, and compiled/tested lightly on Windows XP.


Rebased, and added -rpcthreads to --help output.

But the SIGTERM handler is not working properly, which is why the pull tester is upset...

Diapolo commented Mar 19, 2013

I really like the reduced complexity of the code, I didn't test it out yet though.


I can't reproduce the pull-tester issue in an Ubuntu VM, but I did fix a compiler warning.

Diapolo commented Mar 20, 2013

Let's see what pull-tester is doing now, seems the old entries are gone and I wanted to also take a look :).

@sipa sipa and 3 others commented on an outdated diff Mar 21, 2013
@@ -299,6 +304,7 @@ bool static Bind(const CService &addr, unsigned int flags) {
" -rpcport=<port> " + _("Listen for JSON-RPC connections on <port> (default: 8332 or testnet: 18332)") + "\n" +
" -rpcallowip=<ip> " + _("Allow JSON-RPC connections from specified IP address") + "\n" +
" -rpcconnect=<ip> " + _("Send commands to node running on <ip> (default:") + "\n" +
+ " -rpcthreads=<n> " + _("Use this mean threads to service RPC calls (default: 4)") + "\n" +
sipa Mar 21, 2013 Member


Diapolo Mar 21, 2013

^^ I vote for Set the number of threads to service RPC calls (1-16, 0=auto, default: 0) and we should perhaps consider the same routine as you added for the -par switch then.

// -rpcthreads=0 means autodetect, but nRPCThreads==0 means no concurrency
    nRPCThreads = GetArg("-rpcthreads", 0);
    if (nRPCThreads == 0)
        nRPCThreads = boost::thread::hardware_concurrency();
    if (nRPCThreads <= 1)
        nRPCThreads = 0;
    else if (nRPCThreads > MAX_RPC_THREADS)
        nRPCThreads = MAX_RPC_THREADS;
gavinandresen Mar 21, 2013 Member


More RPC threads is not better, because RPC calls are normally single-threaded anyway due to lock contention.

Diapolo Mar 21, 2013

Okay, didn't know that, but the message should be in a similar format and some checks for min / max would be also nice IMHO.

rebroad Apr 1, 2013 Contributor

"Use this mean threads" doesn't appear to me to be English.

jgarzik commented Mar 22, 2013

ACK (with some inline minor nits mentioned)

sipa commented Mar 23, 2013

I really like this change (except the polling to check whether someone interrupted... i wonder if one can call a condition variable's notify in a signal handler...). I haven't gone through the code in detail, but I see nothing terrible.

I've tested both bitcoind and bitcoin-qt under valgrind, with several ways of causing exit. All seems good.


I tested it under OSX 10.8.3 with a command line build of bitcoind and it does not stop when I send the "stop" rpc command.


There is definitely a bug with the signal handler / RPC stop on OSX at least, if I try to stop shortly after startup the "stop" get swallowed and further stops have no effect.


Fixed two issues, and made one improvement:

  1. Running -daemon, the shutdown detection thread was being started before the fork(). Oops. I moved the -daemon code to run earlier, in AppInit() (which has the added benefit of simplifying the code a bit).
  2. A request to shutdown during AppInit2() startup could get lost. I fixed that by making the detect shutdown thread run and repeatedly interrupt the main thread group until it, itself, is interrupted.


Simplified the Qt code by using a QTimer to look for fRequestShutdown getting set, instead of the more complicated boost signal --> slot --> Qt signal --> slot.


on freebsd I needed to the add boost_timer lib to the linker


right now I got this while closing with "stop":

Assertion failed: (!pthread_mutex_lock(&m)), function lock, file /usr/local/include/boost/thread/pthread/recursive_mutex.hpp


@denis2342 : when you say "right now", do you mean with this patch or without it?


yes, with the patch.

sorry for the delay, I am on the road.


Sent from my iPhone

On 29.03.2013, at 18:29, Gavin Andresen wrote:

@denis2342 : when you say "right now", do you mean with this patch or without it?

Reply to this email directly or view it on GitHub.

@laanwj laanwj commented on the diff Mar 31, 2013
- Sleep(500);
+ MilliSleep(500);
laanwj Mar 31, 2013 Member

It looks like the Sleep calls were directly replaced by MilliSleep calls. Sleep, however, takes seconds. Was it intended to multiply with a factor 1000 too?

sipa commented Mar 31, 2013

@laanwj No, Sleep() (as opposed to sleep()) was an own function in util.cpp that takes milliseconds as argument. I'm in favor of renaming it to MilliSleep() to reduce confusion.

laanwj commented Mar 31, 2013

Yes, agreed. This is certainly an improvement in naming, then.


Squashed a few commits, rebased, and changed unsigned constants from 'u' to 'U'.

I'm still puzzled as to why shutdown isn't happening immediately on the pull-tester machine, I can't reproduce that problem on either my Mac or a debian VM. I'll debug directly on the pull-tester machine next.


got it again with the latest changes while using "stop":

./bitcoind stop
Bitcoin server stopping
Assertion failed: (!pthread_mutex_lock(&m)), function lock, file /usr/local/include/boost/thread/pthread/recursive_mutex.hpp, line 105.


@denis2342 can you run under a debugger and get a call stack for that assertion? I'm guessing something is being called after main() exits, but I'm just guessing...


Sigh. That took way too long to figure out.

Boost 1.40 (running on pull-tester) versus 1.53 (running on my machine) difference; boost 1.40's thread_group cannot interrupt_all() if another thread is waiting on join_all().

I'll rework the code to be boost 1.40-compatible.

gavinandresen added some commits Mar 7, 2013
@gavinandresen gavinandresen Shutdown cleanup prep-work
Create a boost::thread_group object at the qt/bitcoind main-loop level
that will hold pointers to all the main-loop threads.

This will replace the vnThreadsRunning[] array.

For testing, ported the BitcoinMiner threads to use its
own boost::thread_group.
@gavinandresen gavinandresen Fix signed/unsigned comparison warnings 87b9931
@gavinandresen gavinandresen Rename util.h Sleep --> MilliSleep
Two reasons for this change:
1. Need to always use boost::thread's sleep, even on Windows, so the
sleeps can be interrupted (prior code used Windows' built-in Sleep).

2. I always forgot what units the old Sleep took.
Diapolo commented Apr 3, 2013

@gavinandresen Why are we supporting such an ancient Boost version anyway? I don't get that :).


Automatic sanity-testing: PASSED, see for binaries and test log.
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at
Contact BlueMatt on freenode if something looks broken.


I'm going to pull this, I suspect @denis2342 is seeing the same intermittent crash-at-shutdown problems that we saw before on OSX (or are you seeing them on freebsd?)

@gavinandresen gavinandresen merged commit a0a437c into bitcoin:master Apr 4, 2013
@gavinandresen gavinandresen deleted the gavinandresen:shutdowncleanup branch Apr 4, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment