New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: guard vRecvGetData with cs_vRecv and orphan_work_set with g_cs_orphans #19911
Conversation
Concept ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
I've left a couple of style comments inline. Will review fully once this is rebased.
3c02151
to
3fe4081
Compare
The thread sanitizer is complaining about locking order; I'm looking into it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
Could I suggest some code reorganization to keep locking order constant: |
src/net_processing.cpp
Outdated
LogPrint(BCLog::NET, "Peer %d sent us a getblocktxn for a block > %i deep\n", pfrom.GetId(), MAX_BLOCKTXN_DEPTH); | ||
inv.hash = req.blockhash; | ||
WITH_LOCK(pfrom.cs_vRecv, pfrom.vRecvGetData.push_back(inv)); | ||
// The message processing loop will go around again (without pausing) and we'll respond then (without cs_main) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a slight sequencing change, but perhaps we should either call ProcessGetData()
here, or not call it in the GETDATA
processing for consistency. It seems from this comment that the only reason we weren't calling ProcessGetData()
here was that cs_main
was being held, which is no longer the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be a behavior change?
I've pushed a new series of commits which move
@jnewbery I tried to cherry-pick some of your commits but they did not move cleanly given the scope of your branch's change; I also didn't do one of the renames you did to minimize the change. Let me know if there's an appropriate way to credit you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Neha! A couple of small comments inline.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
f23710c
to
fe476e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK and almost code review ACK, still reviewing 939880b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit Move m_orphan_work_set to net_processing, you're not removing CNode::m_orphan_work_set.
Other than that, looks good!
This requires slightly reorganizing the logic in GETBLOCKTXN to maintain locking order.
-BEGIN VERIFY SCRIPT- sed -i 's/vRecvGetData/m_getdata_requests/g' src/net_processing.cpp -END VERIFY SCRIPT-
Good catch, I lost that in the last rebase! Fixed. |
Code review ACK da0988d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK da0988d.
Mind making the first rename commit a scripted-diff as well?
@@ -2363,6 +2366,8 @@ void PeerManager::ProcessMessage(CNode& pfrom, const std::string& msg_type, CDat | |||
return; | |||
} | |||
|
|||
PeerRef peer = GetPeerRef(pfrom.GetId()); | |||
if (peer == nullptr) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8803aee, nit:
if (peer == nullptr) return; | |
if (!peer) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to be consistent with other ways this is checked in the file, for example
bitcoin/src/net_processing.cpp
Line 999 in 8803aee
if (peer == nullptr) return false; |
bitcoin/src/net_processing.cpp
Line 1146 in 8803aee
if (peer == nullptr) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is required to be consistent with a bad style of the surrounding code. Are we going to check every raw/smart pointer against nullptr
explicitly everywhere in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both are fine. We don't express a preference in our style guide. I think @hebasto's suggestion is more idiomatic, and if I were writing those lines again, I'd use the !ptr
form.
@@ -3865,12 +3870,15 @@ bool PeerManager::ProcessMessages(CNode* pfrom, std::atomic<bool>& interruptMsgP | |||
{ | |||
bool fMoreWork = false; | |||
|
|||
PeerRef peer = GetPeerRef(pfrom->GetId()); | |||
if (peer == nullptr) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8803aee, nit:
if (peer == nullptr) return false; | |
if (!peer) return false; |
@@ -3887,7 +3889,10 @@ bool PeerManager::ProcessMessages(CNode* pfrom, std::atomic<bool>& interruptMsgP | |||
// this maintains the order of responses | |||
// and prevents vRecvGetData to grow unbounded | |||
if (!pfrom->vRecvGetData.empty()) return true; | |||
if (!peer->m_orphan_work_set.empty()) return true; | |||
{ | |||
LOCK(g_cs_orphans); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
673247b, nit:
LOCK(g_cs_orphans); | |
LOCK(::g_cs_orphans); |
src/net_processing.cpp
Outdated
@@ -1754,7 +1757,10 @@ void static ProcessGetData(CNode& pfrom, const CChainParams& chainparams, CConnm | |||
{ | |||
AssertLockNotHeld(cs_main); | |||
|
|||
std::deque<CInv>::iterator it = pfrom.vRecvGetData.begin(); | |||
PeerRef peer = GetPeerRef(pfrom.GetId()); | |||
if (peer == nullptr) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2d9f2fc, nit:
if (peer == nullptr) return; | |
if (!peer) return; |
src/net_processing.cpp
Outdated
PeerRef peer = GetPeerRef(pfrom.GetId()); | ||
if (peer == nullptr) return; | ||
|
||
std::deque<CInv>::iterator it = peer->vRecvGetData.begin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2d9f2fc, nit:
std::deque<CInv>::iterator it = peer->vRecvGetData.begin(); | |
auto it = peer->vRecvGetData.begin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have guidance about auto
usage in our style guide. I personally think auto
should only be used if the type is immediately obvious from the surrounding code, and it's saving redundant and verbose keystrokes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have guidance about
auto
usage in our style guide.
Yes. And I think we should have it.
I personally think
auto
should only be used if the type is immediately obvious from the surrounding code, and it's saving redundant and verbose keystrokes.
https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hebasto. I hadn't seen that particular blog post before, but I am familiar with the arguments for and against auto
. Scott Meyers also dedicates a chapter to it in Exceptional Modern C++.
Here, I think it's useful to explicitly name the type, since CInv
is part of the interface. Sure, we could switch out std::deque
for another container at some point, but to know what it->IsGenTxMsg()
is doing a few lines further down, you need to know that it
is an iterator over some container of CInv
s. I'd much rather the code told me that explicitly than having to go find out what m_getdata_requests
is from somewhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a very interesting post! I'm not sure I entirely agree with his reasoning against the readability argument, but perhaps we could debate that somewhere else :) I think this could go either way, and I prefer having the type explicit.
src/net_processing.cpp
Outdated
if (!peer->vRecvGetData.empty()) | ||
fMoreWork = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add braces or make one line?
It can't be, since |
Indeed. Sorry. |
I think all comments have been addressed except the remaining review comments which I can summarize to the following nits:
I don't think these are important, and I already have one ACK on the code so I'd prefer not to change them. However, if others disagree, I can do so. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK da0988d, I have reviewed the code and it looks correct, I agree it can be merged.
My only concern was resolved by @jnewbery. My other comments are non-blocking nits obviously.
I don't think these are important, and I already have one ACK on the code so I'd prefer not to change them. However, if others disagree, I can do so.
Now you've got two of them 😃
review ACK da0988d 🐬 Show signature and timestampSignature:
Timestamp of file with hash |
…_work_set with g_cs_orphans da0988d scripted-diff: rename vRecvGetData (Neha Narula) ba95181 Guard vRecvGetData (now in net processing) with its own mutex (Neha Narula) 2d9f2fc Move vRecvGetData to net processing (Neha Narula) 673247b Lock before checking if orphan_work_set is empty; indicate it is guarded (Neha Narula) 8803aee Move m_orphan_work_set to net_processing (Neha Narula) 9c47cb2 [Rename only] Rename orphan_work_set to m_orphan_work_set. (Neha Narula) Pull request description: Add annotations to guard `vRecvGetData` and `orphan_work_set` and fix up places where they were accessed without a lock. There is no current data race because they happen to be accessed by only one thread, but this might not always be the case. Original discussion: bitcoin#18861 (comment) ACKs for top commit: MarcoFalke: review ACK da0988d 🐬 jnewbery: Code review ACK da0988d hebasto: ACK da0988d, I have reviewed the code and it looks correct, I agree it can be merged. Tree-SHA512: 31cadd319ddc9273a87e77afc4db7339fd636e816b5e742eba5cb32927ac5cc07a672b2268d2d38a75a0f1b17d93836adab9acf7e52f26ea9a43f54efa57257e
Summary: This helps distinguish the member from any local variables. This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [1/6] bitcoin/bitcoin@9c47cb2 Test Plan: `ninja` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D10526
Summary: This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [2/6] bitcoin/bitcoin@8803aee Depends on D10526 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D10527
Summary: This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [3/6] bitcoin/bitcoin@673247b Depends on D10527 Test Plan: ``` cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DENABLE_SANITIZERS=thread ninja && ninja check check-functional ``` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D10528
Summary: This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [3/7] bitcoin/bitcoin@2d9f2fc Depends on D10528 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D10529
Summary: This requires slightly reorganizing the logic in GETBLOCKTXN to maintain locking order. This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [5/6] bitcoin/bitcoin@ba95181 see D10187 for the removal of the `!cs_main` negative lock Depends on D10529 Test Plan: ``` cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DENABLE_SANITIZERS=thread ninja && ninja check check-functional ``` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D10530
Summary: ``` -BEGIN VERIFY SCRIPT- sed -i 's/vRecvGetData/m_getdata_requests/g' src/net_processing.cpp -END VERIFY SCRIPT- ``` This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [6/6] bitcoin/bitcoin@da0988d Depends on D10530 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D10531
Add annotations to guard
vRecvGetData
andorphan_work_set
and fix up places where they were accessed without a lock. There is no current data race because they happen to be accessed by only one thread, but this might not always be the case.Original discussion: #18861 (comment)