-
Notifications
You must be signed in to change notification settings - Fork 38k
refactor: make checkqueue manage the threads by itself (also removed some boost dependencies) #14464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
test/lint/lint-includes.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to PR title boost/thread/condition_variable.hpp
should go as well? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only remove them in checkqueue, the another place to use it is in scheduler.
Concept ACK Towards a Boost free future! :-) |
src/checkqueue.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could just call this interrupted
src/checkqueue.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If an
if
only has a single-statementthen
-clause, it can appear
on the same line as theif
, without braces. In every other case,
braces are required, and thethen
andelse
clauses must appear
correctly indented on a new line.
https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#coding-style-c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-format doesn't catch this. Done manually.
src/validation.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pluralize to threads
utACK |
utACK 9aacef638f0c2159930640762daf524942052a9d |
utACK 9aacef6 |
src/checkqueue.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiny nit: only one master thread so maybe condMaster.notify_one()
is more consistent. No difference to code though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weak concept Ack.
Necessary step to get rid of boost, but conflicts with some unmerged work (#9938, JeremyRubin#1) which does the same thing.
Overall, this appears to be correct.
Before merge I'd like to see some benchmarks, the checkqueue code is very performance sensitive, sometimes the boost condition variables and mutexs difference from std's can be substantial. (I'm not kidding -- the time to wake-up and condition variable sleep behavior is one of the major performance issues with this code).
src/checkqueue.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the condition as a callback to wait.
e.g., cond.wait(lock, [&] { return interrupted || !queue.empty(); })
Addressed comments |
@JeremyRubin
This PR 625c55a70c376af9a94a2bb118c39685d3b82b33
|
That's great, thanks for checking that -- much bigger change than I expected it to be. Looking at the numbers, it seems not only to be faster on the microbenchmark, but more consistent. This merits running two additional tests, if you have energy for it:
The second test is important because one thing I noticed when working on this code previously is that it was trivially easy to get speedups on this code at the expense of general performance. E.g., getting rid of condition variables (and just spinning until jobs came in) was much faster, but meant that the OS scheduler would be giving our CheckQueue workers run-time when there was nothing to do, impacting other processes on the computer (most likely because of causing more cache invalidation, rather than the impact of losing a few cycles of a process switch). Hence my suspicion: the 20% speedup is because the standard condition variables are not using system sleep (which plays better with the OS). This can actually negatively impact the process too as our main thread might be getting context switched out while we are trying to add jobs or something rather than waiting for notify. |
src/checkqueue.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could assert(! (fMaster && nTodo == 0)) here.
If fMaster were ever true here, we would deadlock.
Hence, there is no point to check that...
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
src/checkqueue.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After tested a few times, return false
or somthing else? seem to make the node have to reindex next time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue could be that when you interrupt, you don't want to make fMaster return false causing the block to be marked false (which could be written into cache somehow).
Is interrupt guaranteed to wait until the current job is done? Making it guaranteed to wait probably could help fix that behavior... One way of doing that is to separate out when you wake up if there's an active job or not, and don't allow to quit when there is an active job until it's done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend adapting the quitting logic from https://github.com/JeremyRubin/bitcoin/blob/aa847431e266fb08289edbac3665ac7c93a81b8a/src/checkqueue.h if you want a reference, which should be more or less correct (albeit with a lot of extensions you can ignore).
This version is not correct either if I'm correct about the block getting flushed with validity cached during shutdown. Now you might incorrectly cache that a block was valid when it may not have been. You need to finish processing the block, then shut down. Or, if a shutdown is triggered, abandon the block in the master (you could make a trivalent return: false, true, shutdown). |
Actually I think that's not right, I think this version will just hang because nTodo will be != 0 and queue.empty() will be true, so the master would sleep and never get woken? |
@JeremyRubin |
Your best bet IMO is to have a CheckQueueControl mutex which is separate from the CheckQueue mutex. Make interrupting get both... |
@JeremyRubin I am going to test this to see if there is anything broken. |
Where's the recursive lock? |
In test_bitcoin, it constuct |
Interrupt should be called in the destructor of the test which should happen after any queuecontrols are destructed though? |
I think it would be better to make the checkqueue to control their worker threads instead of calling |
Could try a |
This commit add a Interrupt function for CCheckQueue that it can handle interrupt by itself instead of relying on boost thread interrupt
replace boost::mutex with debuggable Mutex replace boost::condition_variable with std::condition_variable add const specifier to fMaster and nBatchSize add clang thread safety attributes move init value of member of CCheckQueue from constructor to definition
Manage CCheckQueue workder threads by itself. Do not expose the threads outside the object.
@ken2812221 Why the close? :-) |
The CI was failed and I don't have time to find the bugs out recently, so I closed it, I might reopen this when I has more time working on this. |
bb6fcc7 refactor: Drop boost::thread stuff in CCheckQueue (Hennadii Stepanov) 6784ac4 bench: Use CCheckQueue local thread pool (Hennadii Stepanov) dba3069 test: Use CCheckQueue local thread pool (Hennadii Stepanov) 0151177 Add local thread pool to CCheckQueue (Hennadii Stepanov) 0ef9386 refactor: Use member initializers in CCheckQueue (Hennadii Stepanov) Pull request description: This PR: - gets rid of `boost::thread_group` in the `CCheckQueue` class - allows thread safety annotation usage in the `CCheckQueue` class - is alternative to #14464 (#18710 (comment), #18710 (comment)) Also, with this PR (I hope) it could be easier to resurrect a bunch of brilliant ideas from #9938. Related: #17307 ACKs for top commit: laanwj: Code review ACK bb6fcc7 LarryRuane: ACK bb6fcc7 jonatack: Code review ACK bb6fcc7 and verified rebase to master builds cleanly with unit/functional tests green Tree-SHA512: fddeb720d5a391b48bb4c6fa58ed34ccc3f57862fdb8e641745c021841c8340e35c5126338271446cbd98f40bd5484f27926aa6c3e76fa478ba1efafe72e73c1
bb6fcc7 refactor: Drop boost::thread stuff in CCheckQueue (Hennadii Stepanov) 6784ac4 bench: Use CCheckQueue local thread pool (Hennadii Stepanov) dba3069 test: Use CCheckQueue local thread pool (Hennadii Stepanov) 0151177 Add local thread pool to CCheckQueue (Hennadii Stepanov) 0ef9386 refactor: Use member initializers in CCheckQueue (Hennadii Stepanov) Pull request description: This PR: - gets rid of `boost::thread_group` in the `CCheckQueue` class - allows thread safety annotation usage in the `CCheckQueue` class - is alternative to bitcoin#14464 (bitcoin#18710 (comment), bitcoin#18710 (comment)) Also, with this PR (I hope) it could be easier to resurrect a bunch of brilliant ideas from bitcoin#9938. Related: bitcoin#17307 ACKs for top commit: laanwj: Code review ACK bb6fcc7 LarryRuane: ACK bb6fcc7 jonatack: Code review ACK bb6fcc7 and verified rebase to master builds cleanly with unit/functional tests green Tree-SHA512: fddeb720d5a391b48bb4c6fa58ed34ccc3f57862fdb8e641745c021841c8340e35c5126338271446cbd98f40bd5484f27926aa6c3e76fa478ba1efafe72e73c1
This PR removes use of the following object in checkqueue
boost::condition_variable
boost::mutex
boost::unique_lock
Instead of using
boost::thread_group::interrupt_all
to triggerboost::thread_interrupted
exception, it should callInterruptScriptCheck()
orscriptcheckqueue.Interrupt()
to interrupt the checkqueue thread.Also, it adds three functions:
StartScriptCheck()
InterruptScriptCheck()
StopScriptCheck()
for clearer logic for script check threads.