net: guard vRecvGetData with cs_vRecv and orphan_work_set with g_cs_orphans #19911

Merged
merged 6 commits into bitcoin:master on Oct 19, 2020

Conversation

narula
Contributor

@narula narula commented Sep 7, 2020

Add annotations to guard vRecvGetData and orphan_work_set and fix up places where they were accessed without a lock. There is no current data race because they happen to be accessed by only one thread, but this might not always be the case.
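
For readers unfamiliar with the annotations mentioned above, here is a minimal sketch of how Clang's thread safety attributes tie a member to the mutex that protects it. The `TSA` macro and the `Mutex`, `LockGuard`, and `Peer` types are hypothetical stand-ins, not Bitcoin Core's actual `sync.h` wrappers, and the checks only fire when compiling with Clang and `-Wthread-safety`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

// Expand to Clang's thread safety attributes; a no-op on other compilers.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// Hypothetical annotated mutex type (not Bitcoin Core's real wrapper).
class TSA(capability("mutex")) Mutex : public std::mutex
{
};

// Hypothetical RAII guard that tells the analysis which mutex is held.
class TSA(scoped_lockable) LockGuard
{
public:
    explicit LockGuard(Mutex& mu) TSA(acquire_capability(mu)) : m_mu(mu) { m_mu.lock(); }
    ~LockGuard() TSA(release_capability()) { m_mu.unlock(); }
    LockGuard(const LockGuard&) = delete;
    LockGuard& operator=(const LockGuard&) = delete;

private:
    Mutex& m_mu;
};

struct Peer {
    Mutex m_getdata_requests_mutex;
    // Touching this member without holding the mutex is flagged by the analysis.
    std::deque<std::string> m_getdata_requests TSA(guarded_by(m_getdata_requests_mutex));
};

void QueueRequest(Peer& peer, const std::string& inv)
{
    LockGuard lock(peer.m_getdata_requests_mutex);
    peer.m_getdata_requests.push_back(inv);
}

std::size_t PendingRequests(Peer& peer)
{
    LockGuard lock(peer.m_getdata_requests_mutex);
    return peer.m_getdata_requests.size();
}
```

Removing the `LockGuard` line from either function should produce a `-Wthread-safety` warning under Clang, which is the kind of mistake the annotations are meant to catch even while only one thread touches the data.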

Original discussion: #18861 (comment)

@practicalswift
Contributor

Concept ACK

Contributor

@jnewbery jnewbery left a comment

Concept ACK.

I've left a couple of style comments inline. Will review fully once this is rebased.

@narula narula force-pushed the cs_vRecv branch 2 times, most recently from 3c02151 to 3fe4081 Compare September 7, 2020 21:15
@narula
Contributor Author

narula commented Sep 7, 2020

The thread sanitizer is complaining about locking order; I'm looking into it.

Member

@hebasto hebasto left a comment

Concept ACK.

@hebasto
Member

hebasto commented Sep 8, 2020

@narula

The thread sanitizer is complaining about locking order; I'm looking into it.

Could I suggest some code reorganization to keep locking order constant:
https://github.com/hebasto/bitcoin/commits/pr19911-0908 ?
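
To illustrate why a constant locking order matters, the sketch below uses two plain `std::mutex` objects as hypothetical stand-ins for a global lock and a per-peer lock. Every code path takes them in the same order (global first, then per-peer), so two threads can never each hold one lock while waiting for the other. It is an illustration only, not the reorganization proposed in the linked branch, and all names in it are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

std::mutex g_global_mutex; // hypothetical global lock, always taken first

struct PeerQueue {
    std::mutex mutex;         // hypothetical per-peer lock, always taken second
    std::deque<int> requests; // protected by `mutex`
};

std::size_t g_processed = 0; // protected by g_global_mutex

void Enqueue(PeerQueue& peer, int request)
{
    std::lock_guard<std::mutex> global_lock(g_global_mutex);
    std::lock_guard<std::mutex> peer_lock(peer.mutex);
    peer.requests.push_back(request);
}

// Processes everything queued so far and returns the running total.
std::size_t Drain(PeerQueue& peer)
{
    // Same order as Enqueue: a reversed order here is what a lock-order
    // checker or the thread sanitizer would report as a potential deadlock.
    std::lock_guard<std::mutex> global_lock(g_global_mutex);
    std::lock_guard<std::mutex> peer_lock(peer.mutex);
    g_processed += peer.requests.size();
    peer.requests.clear();
    return g_processed;
}
```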

LogPrint(BCLog::NET, "Peer %d sent us a getblocktxn for a block > %i deep\n", pfrom.GetId(), MAX_BLOCKTXN_DEPTH);
inv.hash = req.blockhash;
WITH_LOCK(pfrom.cs_vRecv, pfrom.vRecvGetData.push_back(inv));
// The message processing loop will go around again (without pausing) and we'll respond then (without cs_main)
Contributor

This would be a slight sequencing change, but perhaps we should either call ProcessGetData() here, or not call it in the GETDATA processing for consistency. It seems from this comment that the only reason we weren't calling ProcessGetData() here was that cs_main was being held, which is no longer the case.

Contributor Author

Would this be a behavior change?
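
For context, the `WITH_LOCK` call in the snippet above holds the lock for a single expression. A minimal sketch of that pattern follows, using a hypothetical `WithLock` helper built on `std::lock_guard` rather than Bitcoin Core's actual macro; `Peer`, `QueueGetData`, and `DrainGetData` are likewise illustrative names. It also shows the queue-now, respond-on-the-next-pass flow that the code comment describes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical helper: run `fn` while holding `mutex`, then release it.
template <typename Mutex, typename Fn>
decltype(auto) WithLock(Mutex& mutex, Fn&& fn)
{
    std::lock_guard<Mutex> lock(mutex);
    return std::forward<Fn>(fn)();
}

struct Peer {
    std::mutex getdata_mutex;
    std::deque<int> getdata_requests; // protected by getdata_mutex
};

// Message handler: only queue the request; no response is sent yet.
void QueueGetData(Peer& peer, int inv)
{
    WithLock(peer.getdata_mutex, [&] { peer.getdata_requests.push_back(inv); });
}

// Next pass of the processing loop: take everything queued so far and
// return how many requests were picked up.
std::size_t DrainGetData(Peer& peer)
{
    std::deque<int> pending;
    WithLock(peer.getdata_mutex, [&] { pending.swap(peer.getdata_requests); });
    return pending.size(); // a real node would respond to each entry here
}
```

Whether the handler responds immediately or defers to the next pass only changes when the drain step runs; the lock is held just long enough to touch the queue in both cases.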

@narula
Contributor Author

narula commented Sep 14, 2020

I've pushed a new series of commits which move vRecvGetData and orphan_work_set to Peer and then guard them appropriately. It's a bit more of a change, but happy to help move things over there if that's the end goal.

I do have some commits that move both vRecvGetData and orphan_work_set to the Peer struct in https://github.com/jnewbery/bitcoin/commits/2020-06-cs-main-split-needs-rebase (everything from "END move tx data to net processing // START move work queue data to net processing" to "END move work queue data to net processing // START move subversion to net_processing"). It's slightly orthogonal to this PR, but perhaps we should just do that now while we're touching these fields?

@jnewbery I tried to cherry-pick some of your commits but they did not move cleanly given the scope of your branch's change; I also didn't do one of the renames you did to minimize the change. Let me know if there's an appropriate way to credit you.

Contributor

@jnewbery jnewbery left a comment

Thanks Neha! A couple of small comments inline.

@DrahtBot
Contributor

DrahtBot commented Sep 19, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Member

@promag promag left a comment

Concept ACK and almost code review ACK, still reviewing 939880b.

Contributor

@jnewbery jnewbery left a comment

In commit Move m_orphan_work_set to net_processing, you're not removing CNode::m_orphan_work_set.

Other than that, looks good!

This requires slightly reorganizing the logic in GETBLOCKTXN to
maintain locking order.
-BEGIN VERIFY SCRIPT-
sed -i 's/vRecvGetData/m_getdata_requests/g' src/net_processing.cpp
-END VERIFY SCRIPT-
@narula
Contributor Author

narula commented Oct 14, 2020

In commit Move m_orphan_work_set to net_processing, you're not removing CNode::m_orphan_work_set.

Other than that, looks good!

Good catch, I lost that in the last rebase! Fixed.

@jnewbery
Contributor

Code review ACK da0988d

Member

@hebasto hebasto left a comment

Approach ACK da0988d.

Mind making the first rename commit a scripted-diff as well?

@@ -2363,6 +2366,8 @@ void PeerManager::ProcessMessage(CNode& pfrom, const std::string& msg_type, CDat
return;
}

PeerRef peer = GetPeerRef(pfrom.GetId());
if (peer == nullptr) return;
Member

8803aee, nit:

Suggested change
if (peer == nullptr) return;
if (!peer) return;

Contributor Author

I'm trying to be consistent with other ways this is checked in the file, for example

if (peer == nullptr) return false;
and
if (peer == nullptr) return;

Member

I don't think it is required to be consistent with a bad style of the surrounding code. Are we going to check every raw/smart pointer against nullptr explicitly everywhere in the code?

Contributor

I think both are fine. We don't express a preference in our style guide. I think @hebasto's suggestion is more idiomatic, and if I were writing those lines again, I'd use the !ptr form.
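
The two spellings discussed in this thread behave identically for a smart pointer, as the small sketch below shows. It assumes `PeerRef` is a `std::shared_ptr` alias; the `Peer` struct and both helper functions are hypothetical and exist only for the comparison.

```cpp
#include <cassert>
#include <memory>

struct Peer {
    int id;
};
using PeerRef = std::shared_ptr<Peer>; // assumed alias, for illustration

// Explicit comparison, as written in the PR.
bool IsMissingExplicit(const PeerRef& peer) { return peer == nullptr; }

// Contextual conversion to bool, as in the suggested change.
bool IsMissingIdiomatic(const PeerRef& peer) { return !peer; }
```

Both functions return the same result for every input, so the choice is purely one of style.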

@@ -3865,12 +3870,15 @@ bool PeerManager::ProcessMessages(CNode* pfrom, std::atomic<bool>& interruptMsgP
{
bool fMoreWork = false;

PeerRef peer = GetPeerRef(pfrom->GetId());
if (peer == nullptr) return false;
Member

8803aee, nit:

Suggested change
if (peer == nullptr) return false;
if (!peer) return false;

@@ -3887,7 +3889,10 @@ bool PeerManager::ProcessMessages(CNode* pfrom, std::atomic<bool>& interruptMsgP
// this maintains the order of responses
// and prevents vRecvGetData to grow unbounded
if (!pfrom->vRecvGetData.empty()) return true;
if (!peer->m_orphan_work_set.empty()) return true;
{
LOCK(g_cs_orphans);
Member

673247b, nit:

Suggested change
LOCK(g_cs_orphans);
LOCK(::g_cs_orphans);

@@ -1754,7 +1757,10 @@ void static ProcessGetData(CNode& pfrom, const CChainParams& chainparams, CConnm
{
AssertLockNotHeld(cs_main);

std::deque<CInv>::iterator it = pfrom.vRecvGetData.begin();
PeerRef peer = GetPeerRef(pfrom.GetId());
if (peer == nullptr) return;
Member

2d9f2fc, nit:

Suggested change
if (peer == nullptr) return;
if (!peer) return;

PeerRef peer = GetPeerRef(pfrom.GetId());
if (peer == nullptr) return;

std::deque<CInv>::iterator it = peer->vRecvGetData.begin();
Member

2d9f2fc, nit:

Suggested change
std::deque<CInv>::iterator it = peer->vRecvGetData.begin();
auto it = peer->vRecvGetData.begin();

Contributor

We don't have guidance about auto usage in our style guide. I personally think auto should only be used if the type is immediately obvious from the surrounding code, and it's saving redundant and verbose keystrokes.

Member

We don't have guidance about auto usage in our style guide.

Yes. And I think we should have it.

I personally think auto should only be used if the type is immediately obvious from the surrounding code, and it's saving redundant and verbose keystrokes.

https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/

Contributor

Thanks @hebasto. I hadn't seen that particular blog post before, but I am familiar with the arguments for and against auto. Scott Meyers also dedicates a chapter to it in Effective Modern C++.

Here, I think it's useful to explicitly name the type, since CInv is part of the interface. Sure, we could switch out std::deque for another container at some point, but to know what it->IsGenTxMsg() is doing a few lines further down, you need to know that it is an iterator over some container of CInvs. I'd much rather the code told me that explicitly than having to go find out what m_getdata_requests is from somewhere else.

Member

ok.

Contributor Author

That is a very interesting post! I'm not sure I entirely agree with his reasoning against the readability argument, but perhaps we could debate that somewhere else :) I think this could go either way, and I prefer having the type explicit.
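
As a concrete reference for the trade-off debated in this thread, the sketch below declares the same iterator both ways and checks at compile time that the types match. `CInv` here is a simplified stand-in with a made-up `IsGenTxMsg()` rule, and `CountTxRequests` is a hypothetical function, not code from the PR.

```cpp
#include <cassert>
#include <deque>
#include <type_traits>

// Simplified stand-in for the real CInv; the type == 1 rule is invented.
struct CInv {
    int type;
    bool IsGenTxMsg() const { return type == 1; }
};

int CountTxRequests(std::deque<CInv>& requests)
{
    // Explicit form: the reader sees right here that `it` walks a deque of CInv.
    std::deque<CInv>::iterator it = requests.begin();

    // auto form: the same type, but the reader has to look up `requests`.
    auto it_auto = requests.begin();
    static_assert(std::is_same<decltype(it), decltype(it_auto)>::value,
                  "auto deduces the same iterator type");
    (void)it_auto; // only declared for the comparison above

    int count = 0;
    for (; it != requests.end(); ++it) {
        if (it->IsGenTxMsg()) ++count;
    }
    return count;
}
```

The compiler treats the two declarations identically; the disagreement is only about how much the reader learns at the declaration site.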

Comment on lines 3931 to 3932
if (!peer->vRecvGetData.empty())
fMoreWork = true;
Member

2d9f2fc

nit: add braces or make one line?

@jnewbery
Contributor

Mind making the first rename commit a scripted-diff as well?

It can't be, since orphan_work_set is currently used as the name for both a member variable and a local variable.

@hebasto
Member

hebasto commented Oct 14, 2020

Mind making the first rename commit a scripted-diff as well?

It can't be, since orphan_work_set is currently used as the name for both a member variable and a local variable.

Indeed. Sorry.
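
The limitation described in this exchange can be shown with a small sketch: a scripted-diff is a blind textual substitution, so it cannot rename the member while leaving a local variable of the same name alone. `ReplaceAll` is a hypothetical helper that roughly mimics `sed 's/from/to/g'` for a literal identifier; the sample source line in the usage note is invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace every occurrence of `from` with `to`, like a global sed substitution.
std::string ReplaceAll(std::string text, const std::string& from, const std::string& to)
{
    if (from.empty()) return text; // nothing sensible to replace
    std::size_t pos = text.find(from);
    while (pos != std::string::npos) {
        text.replace(pos, from.size(), to);
        // Resume after the inserted text so the new name is not matched again.
        pos = text.find(from, pos + to.size());
    }
    return text;
}
```

Applied to a line such as `pfrom.orphan_work_set.insert(x); std::set<uint256> orphan_work_set;`, the substitution rewrites both the member access and the local declaration, which is why that rename had to be done by hand.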

@narula
Contributor Author

narula commented Oct 15, 2020

I think all comments have been addressed except the remaining review comments which I can summarize to the following nits:

  1. change if (peer == nullptr) to if (!peer) in the places where I add a new nullptr check
  2. fix up an existing if without braces in a line I touch (#19911 (comment))

I don't think these are important, and I already have one ACK on the code so I'd prefer not to change them. However, if others disagree, I can do so.

Member

@hebasto hebasto left a comment

ACK da0988d, I have reviewed the code and it looks correct, I agree it can be merged.

My only concern was resolved by @jnewbery. My other comments are non-blocking nits obviously.

I don't think these are important, and I already have one ACK on the code so I'd prefer not to change them. However, if others disagree, I can do so.

Now you've got two of them 😃

@maflcko
Member

maflcko commented Oct 17, 2020

review ACK da0988d 🐬

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

review ACK da0988daf1d665a4644ad3f1ddf3f8a8bdd88cde 🐬
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUh3Kwv/Z92YkezB0vKDt8zCSK2kekyViu2esvi5EQIv7t59sPpc1lksTFvC75lX
rmr8hVvOHb6l1pMTGEXLJHVgamn4B/xd5jnJCEGcI9GR1bamL6wRlSKLZEyhnVjV
Siyp7qPFIQPTQxUsUhgJxeLqqYWV1TLmxRvekBnwtfksqpHTBR6icrBsQpg7Nu7Q
f9uiggfUZq3UrnyzF9dJqYUG5CyGiymnRH9YUE89W+7cg/L7RllSQh6K5lrbeb9b
WwAbJL64iDR3Npv4d9bwIQkZNgadyZA8yYw/kskriD9Gt+Qxg7V0VnBbExhve3mp
vc2i6mpPqgDS0n4a+tv3pB7g4zvv9w8kLNt4wXiBIETVdbfH5acdDw3GvTOAhQBl
Nwu0KhCxCkXtWDP56vJo2D0V8WsaHMQv53DUAjsAKHot7c0ihXd+V3zhwUHhJXEP
d1BHU3W+II9DB0c2e9gXyiS/ZHWbEhzwvtdj2pVAtETJG6dt8gM9aEmdzh7Lnfga
QWbax8bo
=//W2
-----END PGP SIGNATURE-----

Timestamp of file with hash 350e455a9cfde318c3ab04c709fac25fca0a0b8205b4cfd5c3a0765e95f4dff4 -

@fanquake fanquake merged commit c92aa83 into bitcoin:master Oct 19, 2020
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Oct 19, 2020
…_work_set with g_cs_orphans

da0988d scripted-diff: rename vRecvGetData (Neha Narula)
ba95181 Guard vRecvGetData (now in net processing) with its own mutex (Neha Narula)
2d9f2fc Move vRecvGetData to net processing (Neha Narula)
673247b Lock before checking if orphan_work_set is empty; indicate it is guarded (Neha Narula)
8803aee Move m_orphan_work_set to net_processing (Neha Narula)
9c47cb2 [Rename only] Rename orphan_work_set to m_orphan_work_set. (Neha Narula)

Pull request description:

  Add annotations to guard `vRecvGetData` and `orphan_work_set` and fix up places where they were accessed without a lock. There is no current data race because they happen to be accessed by only one thread, but this might not always be the case.

  Original discussion: bitcoin#18861 (comment)

ACKs for top commit:
  MarcoFalke:
    review ACK da0988d 🐬
  jnewbery:
    Code review ACK da0988d
  hebasto:
    ACK da0988d, I have reviewed the code and it looks correct, I agree it can be merged.

Tree-SHA512: 31cadd319ddc9273a87e77afc4db7339fd636e816b5e742eba5cb32927ac5cc07a672b2268d2d38a75a0f1b17d93836adab9acf7e52f26ea9a43f54efa57257e
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 25, 2021
Summary:
This helps distinguish the member from any local variables.

This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [1/6]
bitcoin/bitcoin@9c47cb2

Test Plan: `ninja`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10526
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 25, 2021
Summary:
This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [2/6]
bitcoin/bitcoin@8803aee
Depends on D10526

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Subscribers: Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10527
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 25, 2021
Summary:
This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [3/6]
bitcoin/bitcoin@673247b

Depends on D10527

Test Plan:

```
cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DENABLE_SANITIZERS=thread
ninja && ninja check check-functional
```

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10528
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 25, 2021
Summary:
This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [4/6]
bitcoin/bitcoin@2d9f2fc

Depends on D10528

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10529
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 25, 2021
Summary:
This requires slightly reorganizing the logic in GETBLOCKTXN to
maintain locking order.

This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [5/6]
bitcoin/bitcoin@ba95181

see D10187 for the removal of the `!cs_main` negative lock

Depends on D10529

Test Plan:
```
cmake .. -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DENABLE_SANITIZERS=thread
ninja && ninja check check-functional
```

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Subscribers: Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10530
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 25, 2021
Summary:
```
-BEGIN VERIFY SCRIPT-
sed -i 's/vRecvGetData/m_getdata_requests/g' src/net_processing.cpp
-END VERIFY SCRIPT-
```
This is a backport of [[bitcoin/bitcoin#19911 | core#19911]] [6/6]
bitcoin/bitcoin@da0988d

Depends on D10530

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10531
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022