P2P: parse network datastream into header/data components in socket thread #2016

Closed
wants to merge 1 commit into from

4 participants

@jgarzik
Bitcoin member

Replaces the CNode::vRecv buffer with a vector of CNetMessage objects. This simplifies
ProcessMessages() and eliminates several redundant data copies.

Overview:

  • socket thread now parses incoming message datastream into header/data components, as encapsulated by CNetMessage
  • socket thread adds each CNetMessage to a vector inside CNode
  • message thread (ProcessMessages) iterates through CNode's CNetMessage vector

Message parsing is made more strict:

  • Socket is disconnected if a message is larger than MAX_SIZE or if CMessageHeader deserialization fails (is the latter impossible?). Previously, the code would simply consume garbage data indefinitely.
  • Socket is disconnected if we fail to find pchMessageStart. We do not search through garbage to find pchMessageStart; each message must begin precisely where the previous message ends.
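
The parsing and strictness rules above can be illustrated with a minimal sketch. It is not the code from this pull request: the wire format is simplified to 4 magic bytes plus a 4-byte little-endian length, and SketchMessage, SketchPeer, kMagic, kHeaderSize and kMaxSize are hypothetical names standing in for CNetMessage, CNode, pchMessageStart, the header size and MAX_SIZE. It assumes C++11.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical simplified header: 4 magic bytes + 4-byte little-endian length.
// The real CMessageHeader also carries a command name and a checksum.
static const unsigned char kMagic[4] = {0xf9, 0xbe, 0xb4, 0xd9};
static const std::size_t kHeaderSize = 8;
static const std::uint32_t kMaxSize = 0x02000000; // stand-in for MAX_SIZE

struct SketchMessage {
    std::vector<unsigned char> header;  // grows until kHeaderSize bytes arrive
    std::vector<unsigned char> payload; // sized exactly once the length is known
    std::size_t payloadPos = 0;         // payload bytes received so far
    bool inData = false;                // true once the header has been parsed

    bool complete() const { return inData && payloadPos == payload.size(); }
};

struct SketchPeer {
    // Complete messages, followed by at most one incomplete message.
    std::vector<SketchMessage> messages;

    // Returns false when the peer should be disconnected.
    bool ReceiveBytes(const unsigned char* p, std::size_t n)
    {
        while (n > 0) {
            // Get the current incomplete message, or start a new one.
            if (messages.empty() || messages.back().complete())
                messages.push_back(SketchMessage());
            SketchMessage& msg = messages.back();

            if (!msg.inData) {
                const std::size_t take = std::min(n, kHeaderSize - msg.header.size());
                msg.header.insert(msg.header.end(), p, p + take);
                p += take;
                n -= take;
                if (msg.header.size() < kHeaderSize)
                    continue; // header still incomplete; no input is left

                // Strict framing: the magic must be exactly where a message starts.
                if (std::memcmp(msg.header.data(), kMagic, 4) != 0)
                    return false;
                const std::uint32_t len =
                    std::uint32_t(msg.header[4]) |
                    (std::uint32_t(msg.header[5]) << 8) |
                    (std::uint32_t(msg.header[6]) << 16) |
                    (std::uint32_t(msg.header[7]) << 24);
                if (len > kMaxSize)
                    return false; // oversized message
                msg.payload.resize(len); // one exact allocation
                msg.inData = true;
            } else {
                assert(msg.payloadPos < msg.payload.size());
                const std::size_t take = std::min(n, msg.payload.size() - msg.payloadPos);
                std::memcpy(msg.payload.data() + msg.payloadPos, p, take);
                msg.payloadPos += take;
                p += take;
                n -= take;
            }
        }
        return true;
    }
};
```

A caller in the socket thread would pass each chunk returned by recv() to ReceiveBytes() and close the connection when it returns false. The message thread would then only ever look at entries for which complete() is true.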

ProcessMessages() always processes a complete message, and is more efficient:

  • buffer is always precisely sized, using CDataStream::resize(), rather than progressively sized in 64k chunks. More efficient for large messages like "block".
  • whole-buffer memory copy eliminated (vRecv -> vMsg)
  • other buffer-shifting memory copies eliminated (vRecv.insert, vRecv.erase)
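
To make the sizing point concrete, here is a small sketch that compares growing a buffer in fixed chunks against sizing it once. It uses std::vector rather than CDataStream, and the function names are hypothetical. The exact reallocation counts depend on the standard library's growth policy, so treat the numbers as illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Receive `total` bytes by growing the buffer one chunk at a time and
// return how many times the vector's capacity changed along the way.
// Each capacity change may copy everything received so far.
std::size_t CountReallocsChunked(std::size_t total, std::size_t chunk)
{
    assert(chunk > 0); // a zero chunk would never make progress
    std::vector<char> buf;
    std::size_t reallocs = 0;
    while (buf.size() < total) {
        const std::size_t before = buf.capacity();
        buf.resize(std::min(total, buf.size() + chunk));
        if (buf.capacity() != before)
            ++reallocs;
    }
    return reallocs;
}

// Size the buffer exactly once, as is possible when the header has
// already announced the payload length.
std::size_t CountReallocsExact(std::size_t total)
{
    std::vector<char> buf;
    const std::size_t before = buf.capacity();
    buf.resize(total);
    return buf.capacity() != before ? 1 : 0;
}
```

For a payload of about 1 MB, the exact variant allocates once, while the chunked variant with a 64k chunk may reallocate several times before reaching the final size.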
@sipa sipa commented on an outdated diff Nov 15, 2012
src/net.cpp
@@ -627,6 +627,78 @@ void CNode::copyStats(CNodeStats &stats)
}
#undef X
+// requires LOCK(cs_vRecvMsg)
+bool CNode::ReceiveMsgBytes(const char *pch, unsigned int nBytes)
+{
+    while (nBytes > 0) {
+
+        // get current incomplete message, or create a new one
+        if (vRecvMsg.size() == 0 ||
+            vRecvMsg[vRecvMsg.size() - 1].complete())
+            vRecvMsg.push_back(CNetMessage(SER_NETWORK, nRecvVersion));
+
+        CNetMessage& msg = vRecvMsg[vRecvMsg.size() - 1];
@sipa
Bitcoin member
sipa added a note Nov 15, 2012

vRecvMsg.back();

Jeff Garzik P2P: parse network datastream into header/data components in socket thread

8af12a9
@jgarzik
Bitcoin member

@sipa: ITYM back(). Updated.

@sipa sipa commented on the diff Nov 18, 2012
src/net.h
+    {
+        unsigned int total = 0;
+        for (unsigned int i = 0; i < vRecvMsg.size(); i++)
+            total += vRecvMsg[i].vRecv.size();
+        return total;
+    }
+
+    // requires LOCK(cs_vRecvMsg)
+    bool ReceiveMsgBytes(const char *pch, unsigned int nBytes);
+
+    // requires LOCK(cs_vRecvMsg)
+    void SetRecvVersion(int nVersionIn)
+    {
+        nRecvVersion = nVersionIn;
+        for (unsigned int i = 0; i < vRecvMsg.size(); i++)
+            vRecvMsg[i].SetVersion(nVersionIn);
@sipa
Bitcoin member
sipa added a note Nov 18, 2012

I'm not sure this is right. When the receive version is set, it is only applied to new messages. In practice that probably doesn't mean anything, as the version/verack message order is quite strict.

@sipa
Bitcoin member

I like the general design, but this pull mixes a client-side optimization with a change to the network protocol policy.

I'm not against making it more strict and not trying to resync after a partial message or garbage data, but maybe that needs some discussion at least.

@mikehearn

When I didn't have resync after garbage data in bitcoinj I did see failures due to it, though it was long ago.

BTW, is it possible now to send huge numbers of messages and cause OOM conditions? Previously, if you did that, the unread data would stick around in the kernel's socket buffers and be discarded automatically. Now I guess the socket thread can read faster than the main thread can process.

@sipa
Bitcoin member

The last time I saw garbage occurring frequently was after the feb20 protocol upgrade. I suppose we can start requiring no garbage now...

There is still a receive buffer flooding check, by the way.

@BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/8af12a9f5a56de594700abb5b0d9a05adf82d64b for binaries and test log.

@sipa
Bitcoin member

I wonder why we even need that flood protection. You could just as well stop polling sockets for read events if their receive buffer is above some threshold.
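
The idea of not polling over-full peers can be sketched as follows. This is an illustration only, assuming C++11: PeerState, SelectReadable and the byte limit are hypothetical, and in a real socket loop the returned descriptors would be the ones added to the read set passed to select().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-peer bookkeeping: the socket and how many received
// bytes are still waiting to be processed by the message thread.
struct PeerState {
    int fd;
    std::size_t queuedBytes;
};

// Return the sockets that should be polled for read events. Peers whose
// queue has reached the limit are skipped, so their unread data stays in
// the kernel's socket buffer until the message thread catches up.
std::vector<int> SelectReadable(const std::vector<PeerState>& peers, std::size_t limit)
{
    std::vector<int> fds;
    for (const PeerState& peer : peers) {
        if (peer.queuedBytes < limit)
            fds.push_back(peer.fd);
    }
    return fds;
}
```

With this approach a flooding peer is throttled by TCP flow control instead of being disconnected, which is a different policy from the existing flood check.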

@sipa
Bitcoin member

For the record: I saw a node segfault with (among others) this patch, in the ~CNetMessage destructor (the CDataStream in it was not allocated, I assume uninitialized memory used as a CNetMessage).

@sipa sipa commented on the diff Dec 13, 2012
src/main.cpp
@@ -3380,7 +3366,10 @@ bool ProcessMessages(CNode* pfrom)
printf("ProcessMessage(%s, %u bytes) FAILED\n", strCommand.c_str(), nMessageSize);
}
- vRecv.Compact();
+ // remove processed messages; one incomplete message may remain
@sipa
Bitcoin member
sipa added a note Dec 13, 2012

Perhaps it's better to use an std::deque here instead of a std::vector?
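
A minimal sketch of the std::deque suggestion, assuming C++11. QueuedMessage and ProcessComplete are hypothetical names, not identifiers from the patch. Removing a processed message from the front of a std::deque is a constant-time pop_front(), whereas erasing from the front of a std::vector shifts every remaining element.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for CNetMessage: a command name and a flag that
// says whether the whole message has been received.
struct QueuedMessage {
    std::string command;
    bool complete;
};

// Process and remove complete messages from the front of the queue.
// One incomplete message may remain at the end.
std::size_t ProcessComplete(std::deque<QueuedMessage>& queue)
{
    std::size_t handled = 0;
    while (!queue.empty() && queue.front().complete) {
        // A real handler would dispatch on queue.front().command here.
        queue.pop_front();
        ++handled;
    }
    return handled;
}
```

The socket thread would append with push_back() while the message thread consumes from the front, both under the same lock.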

@BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/8af12a9f5a56de594700abb5b0d9a05adf82d64b for binaries and test log.

@sipa sipa referenced this pull request Mar 24, 2013
Merged

Network optimalizations #2409

@jgarzik
Bitcoin member

Superseded by #2409

@jgarzik jgarzik closed this Mar 24, 2013
@jgarzik jgarzik deleted the jgarzik:rx-no-buffer branch Aug 24, 2014