Net: Massive speedup. Net locks overhaul #9441

Merged
merged 16 commits into from Jan 13, 2017

Conversation

Member

@theuni theuni commented Dec 29, 2016

Depends on (includes) #9289. This is what I ultimately set out to fix with the net refactor.

In my (short) tests, it cuts network latency to ~50%. In some cases, more like 30%.

Test method: 1 fresh node (macbook) connected to a single other node (desktop). Testnet. Running until synced to block 100000. Results:

new = patched with this PR.
old = running #9289.
client / server
old / old: 100000 in 8:05
new / old: 100000 in 3:24
new / new: 100000 in 2:16

The results are very reproducible, always within a few seconds. Not only is it a nice improvement for the fresh node, but it compounds when its peer is running the updated code as well.

I had hoped to have the abstraction complete in time for 0.14, but it looks like that's unlikely at this point. For reference, it would look something more like theuni@1a6b10a.

Anyway, this is an attempt to fix the actual issue in time for 0.14, while putting off the last bit of the refactor until after that. It addresses the issue observed in #9415, and also cleans up the nasty locking issues.

I labeled this WIP because there are probably still several racy bits. Quite a bit of test-writing remains.

See the individual commits for the details. tl;dr: We currently either process a peer's message or read from their socket, but never both simultaneously. The changes here remove that restriction.
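
To make the tl;dr concrete, here is a minimal sketch of how reading and processing can overlap: the socket thread fills a buffer that only it touches, then hands completed messages to a separate queue under a short lock. `Peer`, `HandOff` and `PopOne` are hypothetical names chosen for illustration, not the identifiers used in this PR.

```cpp
#include <list>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a peer; names are illustrative only.
struct Peer {
    std::list<std::string> recv_buffer;   // touched only by the socket thread
    std::mutex process_mutex;             // guards process_queue only
    std::list<std::string> process_queue; // shared between both threads

    // Socket thread: move every completed message to the processing queue.
    // splice() relinks list nodes, so the lock is held for constant-time work.
    void HandOff() {
        std::lock_guard<std::mutex> lock(process_mutex);
        process_queue.splice(process_queue.end(), recv_buffer);
    }

    // Processing thread: take one message. The lock is released before the
    // caller does the (potentially slow) processing.
    bool PopOne(std::string& out) {
        std::list<std::string> one;
        {
            std::lock_guard<std::mutex> lock(process_mutex);
            if (process_queue.empty()) return false;
            one.splice(one.begin(), process_queue, process_queue.begin());
        }
        out = std::move(one.front());
        return true;
    }
};
```

Because neither side holds the lock while reading from the socket or processing a message, the two activities no longer exclude each other in this sketch.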


std::deque<CNetMessage>::iterator it = pfrom->vRecvMsg.begin();
while (!pfrom->fDisconnect && it != pfrom->vRecvMsg.end()) {
// Don't bother if send buffer is too full to respond anyway
Contributor

@rebroad rebroad Dec 29, 2016

I'm not sure if github is messing with the formatting, but this line looks unnecessarily indented.

Member Author

@theuni theuni Dec 29, 2016

Yes, the indentation was just kept to avoid creating a huge whitespace diff. This can be cleaned up in a move-only commit as a next step.

Member Author

@theuni theuni commented Dec 31, 2016

@sipa I've spent all day changing this around and breaking it up into logical commits in order to satisfy myself (and I hope others) that it's safe. I'll push up a fresh version in a few minutes and drop the WIP tag.

@theuni theuni changed the title from "WIP: Massive net speedup. Net locks overhaul." to "Net: Massive speedup. Net locks overhaul" on Dec 31, 2016
Member Author

@theuni theuni commented Dec 31, 2016

Updated. Don't let the number of commits scare you off, the diffstat isn't too bad.

I attempted to break this up into small/simple commits, in order to slowly remove the need for cs_vRecvMsg. The last commit actually removes it.

I'm satisfied that this shouldn't be introducing any new races, as only a handful of vars were actually touched on the SocketHandler thread.

Edit: time-ordered.

if(notify)
condMsgProc.notify_one();
RecordBytesRecv(nBytes);
}
Member Author

@theuni theuni Dec 31, 2016

Whoops, rebase gone bad here. It ends up right in the end. Will fix.


  auto it = pfrom->vRecvMsg.begin();
- while (!pfrom->fDisconnect && it != pfrom->vRecvMsg.end()) {
+ if (!pfrom->fDisconnect && it != pfrom->vRecvMsg.end()) {
Member Author

@theuni theuni Dec 31, 2016

Need to clarify in the commit message: It was almost always the case that only one message was processed in this while loop. See the break at line 2538.

This commit changes the behavior so that it's always one message processed per-loop. The only messages that show different behavior are the ones that used to "continue" the loop. I think it's perfectly reasonable to skip to the next node for processing in that case.

Member

@sipa sipa Jan 5, 2017

Can you clarify what the rationale is for switching from a ProcessMessages that processes multiple messages to just a single one? I'm not opposed to the change, but I'd like to understand what the reason for the change is.

Contributor

@TheBlueMatt TheBlueMatt Jan 5, 2017

Point is this is not a change. See #9441 (comment)

Contributor

@dcousens dcousens commented Jan 3, 2017

@theuni is this worth testing now?

Member Author

@theuni theuni commented Jan 3, 2017

@dcousens Very much so!

int64_t nTime = GetTimeMicros();
LOCK(cs_vRecvMsg);
nRecvBytes += nBytes;
nLastRecv = nTime * 1000;
Contributor

@TheBlueMatt TheBlueMatt Jan 3, 2017

I think you meant /, not * here?

Member Author

@theuni theuni Jan 3, 2017

Heh, sure did
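
The unit mix-up flagged above is easy to see in isolation. This sketch assumes the timestamp source returns microseconds and that the stored "last received" value is meant to be whole seconds; the thread does not state the target unit, so the divisor here is an assumption, and `MicrosToSeconds` is a hypothetical helper.

```cpp
#include <cstdint>

// Assumption: input is microseconds, output is whole seconds.
constexpr int64_t MICROS_PER_SECOND = 1000000;

// Going from a fine unit to a coarser one means dividing, not multiplying.
constexpr int64_t MicrosToSeconds(int64_t micros) {
    return micros / MICROS_PER_SECOND;
}

// Checked at compile time: 1.5 seconds truncates to 1 second.
static_assert(MicrosToSeconds(1500000) == 1, "1.5 s truncates to 1 s");
```

Multiplying instead would push the value further from seconds, which is why the reviewer suggested `/`.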


int64_t nTime = GetTimeMicros();
LOCK(cs_vRecvMsg);
nRecvBytes += nBytes;
Contributor

@TheBlueMatt TheBlueMatt Jan 3, 2017

It might be nice in the RPC for mapRecvBytesPerMsgCmd to add up to nRecvBytes (ie to increment this in the loop per-msg instead of after receiving (which would also mean no api change)).

Member Author

@theuni theuni Jan 3, 2017

Sure, that just means more locking. Though this is completely uncontested now, so that's not really a problem.
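
One way to read the suggestion above is to bump the total and the per-command counter together, once per completed message, so the two can never drift apart. The sketch below uses hypothetical names (`RecvStats`, `RecordMessage`) and omits locking; it is not the accounting code from this PR.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical accounting struct; locking is left out for brevity.
struct RecvStats {
    uint64_t total_bytes = 0;
    std::map<std::string, uint64_t> bytes_per_command;

    // Called once per completed message, so both counters move in step.
    void RecordMessage(const std::string& command, uint64_t size) {
        bytes_per_command[command] += size;
        total_bytes += size;
    }

    // Sum of all per-command counters, for comparison with total_bytes.
    uint64_t SumPerCommand() const {
        uint64_t sum = 0;
        for (const auto& entry : bytes_per_command) sum += entry.second;
        return sum;
    }
};
```

The trade-off in this sketch is that bytes belonging to a partially received message are not counted until that message completes.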

Contributor

@TheBlueMatt TheBlueMatt commented Jan 3, 2017

See IRC discussion:

<BlueMatt> cfields: hmmm, regarding #9441, do we really want to run the entire ProcessMessages loop (calling SendMessages potentially umpteen times) just to process multiple messages from the same node?
<BlueMatt> (ie remove the loop inside ProcessMessages and move it to ThreadProcessMessages)
<sipa> BlueMatt: i was wondering about that too
<sipa> BlueMatt: but there is hardly any (contentious) locking going on anymore in between
<BlueMatt> yea, but SendMessages.....
<sipa> Ah.
<sipa> i hadn't considered that
<sipa> that defeats the purpose
<sipa> hmm, i was about to say that it negatively impacts batching of invs and addrs
<sipa> but since we have explicit random delays for those, i don't think that's really an issue anymore
<BlueMatt> yea, I dont think it breaks anything
<BlueMatt> just repeatedly calls SendMessages to do nothing
<sipa> oh, it won't break anything

Member Author

@theuni theuni commented Jan 3, 2017

@TheBlueMatt If I understand you correctly, as mentioned in the summary, that's the part that was cut out here for simplicity. The next set of changes would combine the loops and move them into net_processing. See here for how it looked in a previous iteration: theuni@1a6b10a#diff-eff7adeaec73a769788bb78858815c91R271

I'd be happy to add that here, but I'm afraid it adds a significant review burden to this PR.

Member Author

@theuni theuni commented Jan 3, 2017

Also from IRC:

<BlueMatt> I assume you didnt touch cs_vSend due to https://github.com/bitcoin/bitcoin/pull/9419/commits/c214d120a363a05ba9afdccff6b4bda6e29ae7c4, cfields?
<cfields> BlueMatt: yes, cs_vSend wasn't touched because of your PR. I've just been operating under the assumption that the lock around SendMessages will be removed by one PR or another.

Member Author

@theuni theuni commented Jan 3, 2017

Digging further into the IRC conversation above, I'd like to clarify that the interaction here between ProcessMessages and SendMessages is not changed substantially in this PR.

@TheBlueMatt had missed a subtle part of the current behavior, and maybe @sipa too, so I'd just like to explicitly point out the current break here: https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L2543, added in #3180

For the sake of processing in a round-robin fashion, we currently end up running through SendMessages multiple times while draining a node's received message queue. That behavior is far from ideal, and needs to be addressed, but is not changed here*. I have plans to work on this in next steps, but not for 0.14.

*It's only changed in the case of an invalid header, or bad message checksum.
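
The round-robin behavior described above can be sketched as follows: each pass takes at most one message from every peer, so a peer with a deep queue cannot starve the others. `PeerQueue`, `ProcessOne` and `RoundRobinPass` are hypothetical names, and the sending step that runs between messages is left out.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical per-peer queue of received messages.
struct PeerQueue {
    std::deque<std::string> messages;
};

// Process at most one message from this peer; returns true if one was handled.
bool ProcessOne(PeerQueue& peer, std::vector<std::string>& processed) {
    if (peer.messages.empty()) return false;
    processed.push_back(peer.messages.front());
    peer.messages.pop_front();
    return true;
}

// One pass over all peers; returns how many messages were handled.
std::size_t RoundRobinPass(std::vector<PeerQueue>& peers,
                           std::vector<std::string>& processed) {
    std::size_t handled = 0;
    for (auto& peer : peers) {
        if (ProcessOne(peer, processed)) ++handled;
    }
    return handled;
}
```

With one peer holding two messages and another holding one, the second peer's message is handled before the first peer's second message.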

Contributor

@gmaxwell gmaxwell commented Jan 4, 2017

> cfields: hmmm, regarding #9441, do we really want to run the entire ProcessMessages loop (calling SendMessages potentially umpteen times) just to process multiple messages from the same node?
> (ie remove the loop inside ProcessMessages and move it to ThreadProcessMessages)

Clearly people weren't reading PRs on this or you would have noticed that it only processes one at a time right now. I actually fixed it to process as many as it could until it encountered lock contention, then specifically pinged people for input on the effect. After no response, I unfixed it because of concern about the latency impact. Handling multiple messages at a time is good, if they're fast ones...

theuni added 5 commits Jan 4, 2017
Surprisingly this hasn't been causing me any issues while testing, probably
because it requires lots of large blocks to be flying around.

Send/Recv corks need tests!

This will be needed so that the message processor can cork incoming messages.

These conditions are problematic to check without locking, and we shouldn't be
relying on the refcount to disconnect.

When vRecvMsg becomes a private buffer, it won't make sense to allow other
threads to mess with it anymore.
Member

@sipa sipa commented Jan 4, 2017

Needs rebase.

This is left-over from before there was proper accounting. Hitting 2x the
sendbuffer size should not be possible.
theuni added 6 commits Jan 13, 2017
In order to sleep accurately, the message handler needs to know if _any_ node
has more processing that it should do before the entire thread sleeps.

Rather than returning a value that represents whether ProcessMessages
encountered a message that should trigger a disconnect, interpret the return
value as whether or not that node has more work to do.

Also, use a global fProcessWake value that can be set by other threads,
which takes precedence (for one cycle) over the messagehandler's decision.

Note that the previous behavior was to only process one message per loop
(except in the case of a bad checksum or invalid header). That was changed in
PR bitcoin#3180.

The only change here in that regard is that the current node now falls to the
back of the processing queue for the bad checksum/invalid header cases.
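
The sleep logic in the commit message above can be sketched like this: the handler sleeps only if no node reported more work, and a wake flag set by another thread overrides that decision for one cycle. `MessageHandlerSleeper`, `Wake` and `MaybeSleep` are hypothetical names; the commit itself refers to a global `fProcessWake`, whose exact use is not shown in this thread.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical names throughout; a sketch of the sleep/wake pattern only.
class MessageHandlerSleeper {
public:
    // Called by other threads (for example the socket handler) on new work.
    void Wake() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            wake_requested_ = true;
        }
        cond_.notify_one();
    }

    // Called by the message handler after a pass over all nodes.
    // Skips sleeping if any node reported more work; otherwise waits until
    // woken or until the timeout. Returns true if a wake request was consumed.
    bool MaybeSleep(bool any_node_has_more_work,
                    std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!any_node_has_more_work) {
            cond_.wait_for(lock, timeout, [this] { return wake_requested_; });
        }
        const bool woken = wake_requested_;
        wake_requested_ = false; // a wake request only lasts for one cycle
        return woken;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool wake_requested_ = false;
};
```

Checking the flag under the same mutex that `Wake()` takes means a wake-up arriving just before the wait is not lost in this sketch.
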
This separates the storage of messages from the net and queued messages for
processing, allowing the locks to be split.
Messages are dumped very quickly from the socket handler to the processor, so
it's the depth of the processing queue that's interesting.

The socket handler checks the process queue's size during the brief message
hand-off and pauses if necessary, and the processor possibly unpauses each time
a message is popped off of its queue.
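
The pause/unpause hand-off described above might look roughly like the following. The class name, the byte-based limit and the flag name are all assumptions made for illustration; the commit message only says that the socket handler pauses when the process queue is too deep and that the processor may unpause on each pop.

```cpp
#include <atomic>
#include <cstddef>
#include <list>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical names; depth is measured in bytes of queued message payload.
class ProcessQueue {
public:
    explicit ProcessQueue(std::size_t max_bytes) : max_bytes_(max_bytes) {}

    // Socket handler side: queue a message, then decide whether to pause reads.
    void Push(std::string msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        queued_bytes_ += msg.size();
        queue_.push_back(std::move(msg));
        pause_recv_ = queued_bytes_ > max_bytes_;
    }

    // Processor side: pop one message, possibly lifting the pause.
    bool Pop(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        queued_bytes_ -= out.size();
        pause_recv_ = queued_bytes_ > max_bytes_;
        return true;
    }

    // Socket handler checks this before reading more from the socket.
    bool RecvPaused() const { return pause_recv_.load(); }

private:
    const std::size_t max_bytes_;
    std::mutex mutex_;
    std::list<std::string> queue_;
    std::size_t queued_bytes_ = 0;
    std::atomic<bool> pause_recv_{false};
};
```

Keeping the flag atomic lets the socket handler poll it without taking the queue lock, which is one possible design and not necessarily the one used here.
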
Similar to the recv flag, but this one indicates whether or not the net's send
buffer is full.

The socket handler checks the send queue when a new message is added and pauses
if necessary, and possibly unpauses after each message is drained from its buffer.
vRecvMsg is now only touched by the socket handler thread.

The accounting vars (nRecvBytes/nLastRecv/mapRecvBytesPerMsgCmd) are also
only used by the socket handler thread, with the exception of queries from
rpc/gui. These accesses are not threadsafe, but they never were. This needs to
be addressed separately.

Also, update comment describing data flow
Member Author

@theuni theuni commented Jan 13, 2017

Squashed, and I believe this is now merge-ready.

@sipa / @morcos See https://github.com/theuni/bitcoin/tree/connman-locks-tmp for the unsquashed version if you'd like. It's codewise-identical to e60360e.

Member

@morcos morcos commented Jan 13, 2017

re-utACK e60360e

Contributor

@TheBlueMatt TheBlueMatt commented Jan 13, 2017

Confirmed there is no diff-tree to e60360e from my previous utACK

Member

@sipa sipa commented Jan 13, 2017

utACK e60360e

@sipa sipa merged commit e60360e into bitcoin:master Jan 13, 2017
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
sipa added a commit that referenced this issue Jan 13, 2017
e60360e net: remove cs_vRecvMsg (Cory Fields)
991955e net: add a flag to indicate when a node's send buffer is full (Cory Fields)
c6e8a9b net: add a flag to indicate when a node's process queue is full (Cory Fields)
4d712e3 net: add a new message queue for the message processor (Cory Fields)
c5a8b1b net: rework the way that the messagehandler sleeps (Cory Fields)
c72cc88 net: remove useless comments (Cory Fields)
ef7b5ec net: Add a simple function for waking the message handler (Cory Fields)
f5c36d1 net: record bytes written before notifying the message processor (Cory Fields)
60befa3 net: handle message accounting in ReceiveMsgBytes (Cory Fields)
56212e2 net: set message deserialization version when it's actually time to deserialize (Cory Fields)
0e973d9 net: remove redundant max sendbuffer size check (Cory Fields)
6042587 net: wait until the node is destroyed to delete its recv buffer (Cory Fields)
f6315e0 net: only disconnect if fDisconnect has been set (Cory Fields)
5b4a8ac net: make GetReceiveFloodSize public (Cory Fields)
e5bcd9c net: make vRecvMsg a list so that we can use splice() (Cory Fields)
53ad9a1 net: fix typo causing the wrong receive buffer size (Cory Fields)
@fanquake fanquake moved this from In progress to Done in P2P refactor Mar 3, 2017
random-zebra added a commit to PIVX-Project/PIVX that referenced this issue Aug 27, 2020
e733939 [Cleanup] Remove CConnman::Copy(Release)NodeVector, now unused (random-zebra)
b09da57 [Refactor] Proper CConnman encapsulation of mnsync (random-zebra)
219061e [Refactor] Decouple SyncWithNode from CMasternodeSync::Process() (random-zebra)
e1f3620 [Refactor] Proper CConnman encapsulation of proposals / votes sync (random-zebra)
f38c9f9 miner: don't try to access connman if it doesn't exist (Fuzzbawls)
92ab619 Address feedback (Fuzzbawls)
9d5346a Do not set an addr time penalty when a peer advertises itself. (Gregory Maxwell)
fe12ff9 net: No longer send local address in addrMe (Wladimir J. van der Laan)
14d6917 net: misc header cleanups (Cory Fields)
34ba0de net: make proxy receives interruptible (Cory Fields)
1bd97ef net: make net processing interruptable (Fuzzbawls)
e24b4cc net: make net interruptible (Cory Fields)
814d0de net: add CThreadInterrupt and InterruptibleSleep (Cory Fields)
1443f2a net: a few small cleanups before replacing boost threads (Cory Fields)
58eabe4 net: move MAX_FEELER_CONNECTIONS into connman (Cory Fields)
874baba Convert ForEachNode* functions to take a templated function argument rather than a std::function to eliminate std::function overhead (Jeremy Rubin)
9485ab0 Made the ForEachNode* functions in src/net.cpp more pragmatic and self documenting (Jeremy Rubin)
c1e59ad minor net cleanups (Fuzzbawls)
07ae004 net: move vNodesDisconnected into CConnman (Cory Fields)
276c946 net: add nSendBufferMaxSize/nReceiveFloodSize to CConnection::Options (Cory Fields)
22a9aff net: Introduce CConnection::Options to avoid passing so many params (Cory Fields)
e4891bf net: Drop StartNode/StopNode and use CConnman directly (Cory Fields)
431575c net: pass CClientUIInterface into CConnman (Cory Fields)
48de47e net: Pass best block known height into CConnman (Cory Fields)
15eed91 net: move max/max-outbound to CConnman (Cory Fields)
2bf0921 net: move semOutbound to CConnman (Cory Fields)
481929f net: move nLocalServices/nRelevantServices to CConnman (Cory Fields)
bcee6ae net: move SendBufferSize/ReceiveFloodSize to CConnman (Cory Fields)
6865469 net: move send/recv statistics to CConnman (Cory Fields)
1cec418 net: SocketSendData returns written size (Cory Fields)
2bb9dfa net: move messageHandlerCondition to CConnman (Cory Fields)
9c5a0df net: move nLocalHostNonce to CConnman (Cory Fields)
a1394ef net: move nLastNodeId to CConnman (Cory Fields)
3804c29 net: move whitelist functions into CConnman (Cory Fields)
dbde9be net: create generic functor accessors and move vNodes to CConnman (Cory Fields)
2e02467 net: Add most functions needed for vNodes to CConnman (Cory Fields)
5667e61 net: move added node functions to CConnman (Cory Fields)
37487ed net: Add oneshot functions to CConnman (Cory Fields)
facf878 net: move ban and addrman functions into CConnman (Cory Fields)
091aaf2 net: handle nodesignals in CConnman (Cory Fields)
1e9fa0f net: move OpenNetworkConnection into CConnman (Cory Fields)
573200f net: Move socket binding into CConnman (Cory Fields)
7762b97 net: Pass CConnection to wallet rather than using the global (Fuzzbawls)
00591b8 net: Pass CConnman around as needed (Cory Fields)
2cd3d39 net: Add rpc error for missing/disabled p2p functionality (Cory Fields)
f08e316 net: Create CConnman to encapsulate p2p connections (Cory Fields)
66337dc net: move CBanDB and CAddrDB out of net.h/cpp (Fuzzbawls)
10969f6 gui: add NodeID to the peer table (Cory Fields)
58044ac Fix some locks (Pieter Wuille)
f9f8926 Do not shadow variables in networking code (Pavel Janík)

Pull request description:

  This is the finalization of the upstream PRs being tracked in #1374 to update much of our networking code to more closely follow upstream improvements. The following PRs are included here:

  - bitcoin#8466
    Do not shadow variables in networking code
  - bitcoin#8606
    Fix some locks
  - bitcoin#8085
    Begin encapsulation
  - bitcoin#9289
    drop boost::thread_group
  - bitcoin#8740
    No longer send local address in addrMe
  - bitcoin#8661
    Do not set an addr time penalty when a peer advertises itself

  Additionally, during conflict resolution and backporting of 8085, the following additional upstream PR was included:
  - bitcoin#8715
    only delete CConnman if it's been created

  Still TODO in future PRs:
  - bitcoin#9037
  - bitcoin#9441
  - bitcoin#9609
  - bitcoin#9626
  - bitcoin#9708
  - bitcoin#12381
  - bitcoin#18584

ACKs for top commit:
  furszy:
    same here, code ACK e733939. Nice work.
  furszy:
    ACK e733939.
  random-zebra:
    ACK e733939 and merging...

Tree-SHA512: 0fc3cca76d9ddc13f75fc9d48c48d215e6c0e0381377c0318436176a0e0ead73b511e0061a4b63f7052874730b6da8b0ffc2c94cba034bcc39aad4212b69ee22
random-zebra added a commit to PIVX-Project/PIVX that referenced this issue Sep 7, 2020
30d5c66 net: correct addrman logging (Fuzzbawls)
8a2b7fe Don't send layer2 messages to peers that haven't completed the handshake (Fuzzbawls)
dc10100 [bugfix] Making tier two thread interruptable. (furszy)
2ae76aa Move CNode::addrLocal access behind locked accessors (Fuzzbawls)
470482f Move CNode::addrName accesses behind locked accessors (Fuzzbawls)
35365e1 Move [clean|str]SubVer writes/copyStats into a lock (Fuzzbawls)
d816a86 Make nServices atomic (Matt Corallo)
8a66add Make nStartingHeight atomic (Matt Corallo)
567c9b5 Avoid copying CNodeStats to make helgrind OK with buggy std::string (Matt Corallo)
aea5211 Make nTimeConnected const in CNode (Matt Corallo)
cf46680 net: fix a few races. (Fuzzbawls)
c916fcf net: add a lock around hSocket (Cory Fields)
cc8a93c net: rearrange so that socket accesses can be grouped together (Cory Fields)
6f731dc Do not add to vNodes until fOneShot/fFeeler/fAddNode have been set (Matt Corallo)
07c8d33 Ensure cs_vNodes is held when using the return value from FindNode (Matt Corallo)
110a44b Delete some unused (and broken) functions in CConnman (Matt Corallo)
08a12e0 net: log an error rather than asserting if send version is misused (Cory Fields)
cd8b82c net: Disallow sending messages until the version handshake is complete (Fuzzbawls)
54b454b net: don't run callbacks on nodes that haven't completed the version handshake (Cory Fields)
2be6877 net: deserialize the entire version message locally (Fuzzbawls)
444f599 Dont deserialize nVersion into CNode (Fuzzbawls)
f30f10e net: remove cs_vRecvMsg (Fuzzbawls)
5812f9e net: add a flag to indicate when a node's send buffer is full (Fuzzbawls)
5ec4db2 net: Hardcode protocol sizes and use fixed-size types (Wladimir J. van der Laan)
de87ea6 net: Consistent checksum handling (Wladimir J. van der Laan)
d4bcd25 net: push only raw data into CConnman (Cory Fields)
b79e416 net: add CVectorWriter and CNetMsgMaker (Cory Fields)
63c51d3 net: No need to check individually for disconnection anymore (Cory Fields)
07d8c7b net: don't send any messages before handshake or after fdisconnect (Cory Fields)
9adfc7f net: Set feelers to disconnect at the end of the version message (Cory Fields)
f88c06c net: handle version push in InitializeNode (Cory Fields)
04d39c8 net: construct CNodeStates in place (Cory Fields)
40a6c5d net: remove now-unused ssSend and Fuzz (Cory Fields)
681c62d drop the optimistic write counter hack (Cory Fields)
9f939f3 net: switch all callers to connman for pushing messages (Cory Fields)
8f9011d connman is in charge of pushing messages (Cory Fields)
f558bb7 serialization: teach serializers variadics (Cory Fields)
01ea667 net: Use deterministic randomness for CNode's nonce, and make it const (Cory Fields)
de1ad13 net: constify a few CNode vars to indicate that they're threadsafe (Cory Fields)
34050a3 Move static global randomizer seeds into CConnman (Pieter Wuille)
1ce349f net: add a flag to indicate when a node's process queue is full (Fuzzbawls)
5581b47 net: add a new message queue for the message processor (Fuzzbawls)
701b578 net: rework the way that the messagehandler sleeps (Fuzzbawls)
7e55dbf net: Add a simple function for waking the message handler (Cory Fields)
47ea844 net: record bytes written before notifying the message processor (Cory Fields)
ffd4859 net: handle message accounting in ReceiveMsgBytes (Cory Fields)
8cee696 net: log bytes recv/sent per command (Fuzzbawls)
754400e net: set message deserialization version when it's time to deserialize (Fuzzbawls)
d2b8e0a net: make CMessageHeader a dumb storage class (Fuzzbawls)
cc24eff net: remove redundant max sendbuffer size check (Fuzzbawls)
32ab0c0 net: wait until the node is destroyed to delete its recv buffer (Cory Fields)
6e3f71b net: only disconnect if fDisconnect has been set (Cory Fields)
1b0beb6 net: make GetReceiveFloodSize public (Cory Fields)
229697a net: make vRecvMsg a list so that we can use splice() (Fuzzbawls)
d2d71ba net: fix typo causing the wrong receive buffer size (Cory Fields)
50bb09d Add test-before-evict discipline to addrman (Ethan Heilman)

Pull request description:

  This is a combination of multiple upstream PRs focused on optimizing the P2P networking flow after the introduction of CConnman encapsulation, and a few older PRs that were previously missed to support the later optimizations. The PRs are as follows:

  - bitcoin#9037 - net: Add test-before-evict discipline to addrman
  - bitcoin#5151 - make CMessageHeader a dumb storage class
  - bitcoin#6589 - log bytes recv/sent per command
  - bitcoin#8688 - Move static global randomizer seeds into CConnman
  - bitcoin#9050 - net: make a few values immutable, and use deterministic randomness for the localnonce
  - bitcoin#8708 - net: have CConnman handle message sending
  - bitcoin#9128 - net: Decouple CConnman and message serialization
  - bitcoin#8822 - net: Consistent checksum handling
  - bitcoin#9441 - Net: Massive speedup. Net locks overhaul
  - bitcoin#9609 - net: fix remaining net assertions
  - bitcoin#9626 - Clean up a few CConnman cs_vNodes/CNode things
  - bitcoin#9698 - net: fix socket close race
  - bitcoin#9708 - Clean up all known races/platform-specific UB at the time PR was opened
    - Excluded bitcoin/bitcoin@512731b and bitcoin/bitcoin@d8f2b8a, to be done in a separate PR

ACKs for top commit:
  furszy:
    code ACK 30d5c66, testnet sync from scratch went well and tested with #1829 on top as well and all good.
  furszy:
    mainnet sync went fine, ACK 30d5c66.
  random-zebra:
    ACK 30d5c66 and merging...

Tree-SHA512: 09689554f53115a45f810b47ff75d887fa9097ea05992a638dbb6055262aeecd82d6ce5aaa2284003399d839b6f2c36f897413da96cfa2cd3b858387c3f752c1
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021