Net: Massive speedup. Net locks overhaul #9441
Depends on (includes) #9289. This is what I ultimately set out to fix with the net refactor.
In my (short) tests, it cuts network latency to ~50%. In some cases, more like 30%.
Test method: 1 fresh node (macbook) connected to a single other node (desktop). Testnet. Running until synced to block 10000. Results:
The results are very reproducible, always within a few seconds. Not only is it a nice improvement for the fresh node, but it compounds when its peer is running the updated code as well.
I had hoped to have the abstraction complete in time for 0.14, but it looks like that's unlikely at this point. For reference, it would look something more like theuni@1a6b10a.
Anyway, this is an attempt to fix the actual issue in time for 0.14, while putting off the last bit of the refactor until after that. This addresses the issue observed in #9415, but also cleans up the nasty locking issues.
I labeled this WIP because there are probably still several racy bits. Quite a bit of test-writing remains.
See the individual commits for the details. tl;dr: We currently either process a peer's message or read from their socket, but never both simultaneously. The changes here remove that restriction.
Updated. Don't let the number of commits scare you off; the diffstat isn't too bad.
I attempted to break this up into small/simple commits, in order to slowly remove the need for cs_vRecvMsg. The last commit actually removes it.
I'm satisfied that this shouldn't be introducing any new races, as only a handful of vars were actually touched on the SocketHandler thread.
See IRC discussion:
@TheBlueMatt If I understand you correctly, as mentioned in the summary, that's the part that was cut out here for simplicity. The next set of changes would combine the loops and move them into net_processing. See here for how it looked in a previous iteration: theuni@1a6b10a#diff-eff7adeaec73a769788bb78858815c91R271
I'd be happy to add that here, but I'm afraid it adds a significant review burden to this PR.
Also from IRC:
Digging further into the IRC conversation above, I'd like to clarify that the interaction here between ProcessMessages and SendMessages is not changed substantially in this PR.
@TheBlueMatt had missed a subtle part of the current behavior, and maybe @sipa too, so I'd just like to explicitly point out the current break here: https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L2543, added in #3180
For the sake of processing in a round-robin fashion, we currently end up running through SendMessages multiple times while draining a node's received message queue. That behavior is far from ideal, and needs to be addressed, but is not changed here*. I have plans to work on this in next steps, but not for 0.14.
*It's only changed in the case of an invalid header, or bad message checksum.
Clearly people weren't reading PRs on this or you would have noticed that it only processes one at a time right now. I actually fixed it to process as many as it could until it encountered lock contention, then specifically pinged people for input on the effect. After no response, I unfixed it because of concern about the latency impact. Handling multiple messages at a time is good, if they're fast ones...
Surprisingly this hasn't been causing me any issues while testing, probably because it requires lots of large blocks to be flying around. Send/Recv corks need tests!
In order to sleep accurately, the message handler needs to know if _any_ node has more processing that it should do before the entire thread sleeps.

Rather than returning a value that represents whether ProcessMessages encountered a message that should trigger a disconnect, interpret the return value as whether or not that node has more work to do.

Also, use a global fProcessWake value that can be set by other threads, which takes precedence (for one cycle) over the messagehandler's decision.

Note that the previous behavior was to only process one message per loop (except in the case of a bad checksum or invalid header). That was changed in PR bitcoin#3180.

The only change here in that regard is that the current node now falls to the back of the processing queue for the bad checksum/invalid header cases.
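To make the sleep/wake rule above concrete, here is a minimal sketch. It is not the code from this PR: HandlerWaker, MaybeSleep and RunPass are hypothetical names, and the boolean wake_ only plays the role that fProcessWake plays in the description. Each node callback handles at most one message and reports whether it has more work; the handler sleeps only if no node has more work and no other thread has asked for a wake-up.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical wake-up state shared by the message handler and other threads.
class HandlerWaker {
public:
    // Called from any thread (for example the socket handler) when new work arrives.
    void Wake() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            wake_ = true;
        }
        cond_.notify_one();
    }

    // Called by the message handler at the end of a pass over all nodes.
    // Blocks for up to `timeout` unless some node reported more work or a
    // wake-up was requested. The wake request is consumed here, so it only
    // overrides the handler's decision for one cycle.
    // Returns true if the handler actually blocked.
    bool MaybeSleep(bool more_work, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        bool slept = false;
        if (!more_work && !wake_) {
            cond_.wait_for(lock, timeout, [this] { return wake_; });
            slept = true;
        }
        wake_ = false;
        return slept;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool wake_ = false;
};

// One round-robin pass. Each callback processes at most one message for its
// node and returns true if that node still has more work queued.
inline bool RunPass(const std::vector<std::function<bool()>>& nodes) {
    bool more_work = false;
    for (const auto& process_one : nodes) {
        if (process_one()) more_work = true;
    }
    return more_work;
}
```

A handler loop would call RunPass, then MaybeSleep with its result, and repeat. Consuming the wake request inside MaybeSleep is one way to get the "for one cycle" behavior; other designs are possible.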
This separates the storage of messages from the net and queued messages for processing, allowing the locks to be split.
Messages are dumped very quickly from the socket handler to the processor, so it's the depth of the processing queue that's interesting. The socket handler checks the process queue's size during the brief message hand-off and pauses if necessary, and the processor possibly unpauses each time a message is popped off of its queue.
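As a rough illustration of the hand-off described above (a sketch under assumptions, not the PR's implementation): Peer, Message, HandOff, PopMessage and the byte limit are all hypothetical names. The socket-handler side moves completed messages into the process queue with std::list::splice, so the lock is held only for a constant-time pointer swap plus bookkeeping, and the pause flag is recomputed at both ends.

```cpp
#include <atomic>
#include <cstddef>
#include <list>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a fully received network message.
struct Message {
    std::string payload;
};

// Illustrative per-peer state; member names are assumptions.
class Peer {
public:
    explicit Peer(std::size_t max_queued_bytes) : max_queued_bytes_(max_queued_bytes) {}

    // Socket-handler side: hand completed messages to the processor.
    // Sizes are summed outside the lock; the splice itself is O(1).
    void HandOff(std::list<Message>& completed) {
        std::size_t bytes = 0;
        for (const Message& m : completed) bytes += m.payload.size();
        std::lock_guard<std::mutex> lock(process_mutex_);
        process_queue_.splice(process_queue_.end(), completed);
        queued_bytes_ += bytes;
        pause_recv_ = queued_bytes_ > max_queued_bytes_;
    }

    // Processor side: pop one message and possibly unpause receiving.
    bool PopMessage(Message& out) {
        std::lock_guard<std::mutex> lock(process_mutex_);
        if (process_queue_.empty()) return false;
        out = std::move(process_queue_.front());
        process_queue_.pop_front();
        queued_bytes_ -= out.payload.size();
        pause_recv_ = queued_bytes_ > max_queued_bytes_;
        return true;
    }

    // Read by the socket handler to decide whether to keep reading from this peer.
    bool RecvPaused() const { return pause_recv_.load(); }

private:
    const std::size_t max_queued_bytes_;
    std::mutex process_mutex_;
    std::list<Message> process_queue_;
    std::size_t queued_bytes_ = 0;
    std::atomic<bool> pause_recv_{false};
};
```

Because the receive buffer (`completed` here) is only ever touched by the socket handler, it needs no lock of its own; only the process queue is shared.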
Similar to the recv flag, but this one indicates whether or not the net's send buffer is full. The socket handler checks the send queue when a new message is added and pauses if necessary, and possibly unpauses after each message is drained from its buffer.
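The send-side flag can be sketched the same way. Again this is only an illustration with hypothetical names (SendQueue, Push, DrainOne, the byte limit), not the PR's code: the flag is set when queued data exceeds the limit after a message is added, and cleared once draining brings the total back under it.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Illustrative send buffer with a "full" flag; names and limit are assumptions.
class SendQueue {
public:
    explicit SendQueue(std::size_t max_bytes) : max_bytes_(max_bytes) {}

    // Queue an outgoing message and pause sending if the buffer is now over the limit.
    void Push(std::string msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        bytes_ += msg.size();
        queue_.push_back(std::move(msg));
        if (bytes_ > max_bytes_) pause_send_ = true;
    }

    // Called after the front message has been fully written to the socket.
    // Unpauses once the buffered total is back within the limit.
    bool DrainOne() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        bytes_ -= queue_.front().size();
        queue_.pop_front();
        if (bytes_ <= max_bytes_) pause_send_ = false;
        return true;
    }

    // Lock-free read for threads that only need to know whether to back off.
    bool SendPaused() const { return pause_send_.load(); }

private:
    const std::size_t max_bytes_;
    std::mutex mutex_;
    std::deque<std::string> queue_;
    std::size_t bytes_ = 0;
    std::atomic<bool> pause_send_{false};
};
```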
vRecvMsg is now only touched by the socket handler thread. The accounting vars (nRecvBytes/nLastRecv/mapRecvBytesPerMsgCmd) are also only used by the socket handler thread, with the exception of queries from rpc/gui. These accesses are not threadsafe, but they never were. This needs to be addressed separately. Also, update the comment describing data flow.
e60360e net: remove cs_vRecvMsg (Cory Fields)
991955e net: add a flag to indicate when a node's send buffer is full (Cory Fields)
c6e8a9b net: add a flag to indicate when a node's process queue is full (Cory Fields)
4d712e3 net: add a new message queue for the message processor (Cory Fields)
c5a8b1b net: rework the way that the messagehandler sleeps (Cory Fields)
c72cc88 net: remove useless comments (Cory Fields)
ef7b5ec net: Add a simple function for waking the message handler (Cory Fields)
f5c36d1 net: record bytes written before notifying the message processor (Cory Fields)
60befa3 net: handle message accounting in ReceiveMsgBytes (Cory Fields)
56212e2 net: set message deserialization version when it's actually time to deserialize (Cory Fields)
0e973d9 net: remove redundant max sendbuffer size check (Cory Fields)
6042587 net: wait until the node is destroyed to delete its recv buffer (Cory Fields)
f6315e0 net: only disconnect if fDisconnect has been set (Cory Fields)
5b4a8ac net: make GetReceiveFloodSize public (Cory Fields)
e5bcd9c net: make vRecvMsg a list so that we can use splice() (Cory Fields)
53ad9a1 net: fix typo causing the wrong receive buffer size (Cory Fields)
e733939 [Cleanup] Remove CConnman::Copy(Release)NodeVector, now unused (random-zebra)
b09da57 [Refactor] Proper CConnman encapsulation of mnsync (random-zebra)
219061e [Refactor] Decouple SyncWithNode from CMasternodeSync::Process() (random-zebra)
e1f3620 [Refactor] Proper CConnman encapsulation of proposals / votes sync (random-zebra)
f38c9f9 miner: don't try to access connman if it doesn't exist (Fuzzbawls)
92ab619 Address feedback (Fuzzbawls)
9d5346a Do not set an addr time penalty when a peer advertises itself. (Gregory Maxwell)
fe12ff9 net: No longer send local address in addrMe (Wladimir J. van der Laan)
14d6917 net: misc header cleanups (Cory Fields)
34ba0de net: make proxy receives interruptible (Cory Fields)
1bd97ef net: make net processing interruptable (Fuzzbawls)
e24b4cc net: make net interruptible (Cory Fields)
814d0de net: add CThreadInterrupt and InterruptibleSleep (Cory Fields)
1443f2a net: a few small cleanups before replacing boost threads (Cory Fields)
58eabe4 net: move MAX_FEELER_CONNECTIONS into connman (Cory Fields)
874baba Convert ForEachNode* functions to take a templated function argument rather than a std::function to eliminate std::function overhead (Jeremy Rubin)
9485ab0 Made the ForEachNode* functions in src/net.cpp more pragmatic and self documenting (Jeremy Rubin)
c1e59ad minor net cleanups (Fuzzbawls)
07ae004 net: move vNodesDisconnected into CConnman (Cory Fields)
276c946 net: add nSendBufferMaxSize/nReceiveFloodSize to CConnection::Options (Cory Fields)
22a9aff net: Introduce CConnection::Options to avoid passing so many params (Cory Fields)
e4891bf net: Drop StartNode/StopNode and use CConnman directly (Cory Fields)
431575c net: pass CClientUIInterface into CConnman (Cory Fields)
48de47e net: Pass best block known height into CConnman (Cory Fields)
15eed91 net: move max/max-outbound to CConnman (Cory Fields)
2bf0921 net: move semOutbound to CConnman (Cory Fields)
481929f net: move nLocalServices/nRelevantServices to CConnman (Cory Fields)
bcee6ae net: move SendBufferSize/ReceiveFloodSize to CConnman (Cory Fields)
6865469 net: move send/recv statistics to CConnman (Cory Fields)
1cec418 net: SocketSendData returns written size (Cory Fields)
2bb9dfa net: move messageHandlerCondition to CConnman (Cory Fields)
9c5a0df net: move nLocalHostNonce to CConnman (Cory Fields)
a1394ef net: move nLastNodeId to CConnman (Cory Fields)
3804c29 net: move whitelist functions into CConnman (Cory Fields)
dbde9be net: create generic functor accessors and move vNodes to CConnman (Cory Fields)
2e02467 net: Add most functions needed for vNodes to CConnman (Cory Fields)
5667e61 net: move added node functions to CConnman (Cory Fields)
37487ed net: Add oneshot functions to CConnman (Cory Fields)
facf878 net: move ban and addrman functions into CConnman (Cory Fields)
091aaf2 net: handle nodesignals in CConnman (Cory Fields)
1e9fa0f net: move OpenNetworkConnection into CConnman (Cory Fields)
573200f net: Move socket binding into CConnman (Cory Fields)
7762b97 net: Pass CConnection to wallet rather than using the global (Fuzzbawls)
00591b8 net: Pass CConnman around as needed (Cory Fields)
2cd3d39 net: Add rpc error for missing/disabled p2p functionality (Cory Fields)
f08e316 net: Create CConnman to encapsulate p2p connections (Cory Fields)
66337dc net: move CBanDB and CAddrDB out of net.h/cpp (Fuzzbawls)
10969f6 gui: add NodeID to the peer table (Cory Fields)
58044ac Fix some locks (Pieter Wuille)
f9f8926 Do not shadow variables in networking code (Pavel Janík)

Pull request description:
This is the finalization of the upstream PRs being tracked in #1374 to update much of our networking code to more closely follow upstream improvements.

The following PRs are included here:
- bitcoin#8466 Do not shadow variables in networking code
- bitcoin#8606 Fix some locks
- bitcoin#8085 Begin encapsulation
- bitcoin#9289 drop boost::thread_group
- bitcoin#8740 No longer send local address in addrMe
- bitcoin#8661 Do not set an addr time penalty when a peer advertises itself

Additionally, during conflict resolution and backporting of 8085, the following additional upstream PR was included:
- bitcoin#8715 only delete CConnman if it's been created

Still TODO in future PRs:
- bitcoin#9037
- bitcoin#9441
- bitcoin#9609
- bitcoin#9626
- bitcoin#9708
- bitcoin#12381
- bitcoin#18584

ACKs for top commit:
furszy: same here, code ACK e733939. Nice work ☕.
furszy: ACK e733939 .
random-zebra: ACK e733939 and merging...

Tree-SHA512: 0fc3cca76d9ddc13f75fc9d48c48d215e6c0e0381377c0318436176a0e0ead73b511e0061a4b63f7052874730b6da8b0ffc2c94cba034bcc39aad4212b69ee22
30d5c66 net: correct addrman logging (Fuzzbawls)
8a2b7fe Don't send layer2 messages to peers that haven't completed the handshake (Fuzzbawls)
dc10100 [bugfix] Making tier two thread interruptable. (furszy)
2ae76aa Move CNode::addrLocal access behind locked accessors (Fuzzbawls)
470482f Move CNode::addrName accesses behind locked accessors (Fuzzbawls)
35365e1 Move [clean|str]SubVer writes/copyStats into a lock (Fuzzbawls)
d816a86 Make nServices atomic (Matt Corallo)
8a66add Make nStartingHeight atomic (Matt Corallo)
567c9b5 Avoid copying CNodeStats to make helgrind OK with buggy std::string (Matt Corallo)
aea5211 Make nTimeConnected const in CNode (Matt Corallo)
cf46680 net: fix a few races. (Fuzzbawls)
c916fcf net: add a lock around hSocket (Cory Fields)
cc8a93c net: rearrange so that socket accesses can be grouped together (Cory Fields)
6f731dc Do not add to vNodes until fOneShot/fFeeler/fAddNode have been set (Matt Corallo)
07c8d33 Ensure cs_vNodes is held when using the return value from FindNode (Matt Corallo)
110a44b Delete some unused (and broken) functions in CConnman (Matt Corallo)
08a12e0 net: log an error rather than asserting if send version is misused (Cory Fields)
cd8b82c net: Disallow sending messages until the version handshake is complete (Fuzzbawls)
54b454b net: don't run callbacks on nodes that haven't completed the version handshake (Cory Fields)
2be6877 net: deserialize the entire version message locally (Fuzzbawls)
444f599 Dont deserialize nVersion into CNode (Fuzzbawls)
f30f10e net: remove cs_vRecvMsg (Fuzzbawls)
5812f9e net: add a flag to indicate when a node's send buffer is full (Fuzzbawls)
5ec4db2 net: Hardcode protocol sizes and use fixed-size types (Wladimir J. van der Laan)
de87ea6 net: Consistent checksum handling (Wladimir J. van der Laan)
d4bcd25 net: push only raw data into CConnman (Cory Fields)
b79e416 net: add CVectorWriter and CNetMsgMaker (Cory Fields)
63c51d3 net: No need to check individually for disconnection anymore (Cory Fields)
07d8c7b net: don't send any messages before handshake or after fdisconnect (Cory Fields)
9adfc7f net: Set feelers to disconnect at the end of the version message (Cory Fields)
f88c06c net: handle version push in InitializeNode (Cory Fields)
04d39c8 net: construct CNodeStates in place (Cory Fields)
40a6c5d net: remove now-unused ssSend and Fuzz (Cory Fields)
681c62d drop the optimistic write counter hack (Cory Fields)
9f939f3 net: switch all callers to connman for pushing messages (Cory Fields)
8f9011d connman is in charge of pushing messages (Cory Fields)
f558bb7 serialization: teach serializers variadics (Cory Fields)
01ea667 net: Use deterministic randomness for CNode's nonce, and make it const (Cory Fields)
de1ad13 net: constify a few CNode vars to indicate that they're threadsafe (Cory Fields)
34050a3 Move static global randomizer seeds into CConnman (Pieter Wuille)
1ce349f net: add a flag to indicate when a node's process queue is full (Fuzzbawls)
5581b47 net: add a new message queue for the message processor (Fuzzbawls)
701b578 net: rework the way that the messagehandler sleeps (Fuzzbawls)
7e55dbf net: Add a simple function for waking the message handler (Cory Fields)
47ea844 net: record bytes written before notifying the message processor (Cory Fields)
ffd4859 net: handle message accounting in ReceiveMsgBytes (Cory Fields)
8cee696 net: log bytes recv/sent per command (Fuzzbawls)
754400e net: set message deserialization version when it's time to deserialize (Fuzzbawls)
d2b8e0a net: make CMessageHeader a dumb storage class (Fuzzbawls)
cc24eff net: remove redundant max sendbuffer size check (Fuzzbawls)
32ab0c0 net: wait until the node is destroyed to delete its recv buffer (Cory Fields)
6e3f71b net: only disconnect if fDisconnect has been set (Cory Fields)
1b0beb6 net: make GetReceiveFloodSize public (Cory Fields)
229697a net: make vRecvMsg a list so that we can use splice() (Fuzzbawls)
d2d71ba net: fix typo causing the wrong receive buffer size (Cory Fields)
50bb09d Add test-before-evict discipline to addrman (Ethan Heilman)

Pull request description:
This is a combination of multiple upstream PRs focused on optimizing the P2P networking flow after the introduction of CConnman encapsulation, and a few older PRs that were previously missed to support the later optimizations.

The PRs are as follows:
- bitcoin#9037 - net: Add test-before-evict discipline to addrman
- bitcoin#5151 - make CMessageHeader a dumb storage class
- bitcoin#6589 - log bytes recv/sent per command
- bitcoin#8688 - Move static global randomizer seeds into CConnman
- bitcoin#9050 - net: make a few values immutable, and use deterministic randomness for the localnonce
- bitcoin#8708 - net: have CConnman handle message sending
- bitcoin#9128 - net: Decouple CConnman and message serialization
- bitcoin#8822 - net: Consistent checksum handling
- bitcoin#9441 - Net: Massive speedup. Net locks overhaul
- bitcoin#9609 - net: fix remaining net assertions
- bitcoin#9626 - Clean up a few CConnman cs_vNodes/CNode things
- bitcoin#9698 - net: fix socket close race
- bitcoin#9708 - Clean up all known races/platform-specific UB at the time PR was opened
  - Excluded bitcoin/bitcoin@512731b and bitcoin/bitcoin@d8f2b8a, to be done in a separate PR

ACKs for top commit:
furszy: code ACK 30d5c66 , testnet sync from scratch went well and tested with #1829 on top as well and all good.
furszy: mainnet sync went fine, ACK 30d5c66 .
random-zebra: ACK 30d5c66 and merging...

Tree-SHA512: 09689554f53115a45f810b47ff75d887fa9097ea05992a638dbb6055262aeecd82d6ce5aaa2284003399d839b6f2c36f897413da96cfa2cd3b858387c3f752c1