Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: transport abstraction #28165

Merged
merged 9 commits into from
Aug 24, 2023
Merged

net: transport abstraction #28165

merged 9 commits into from
Aug 24, 2023

Conversation

sipa
Copy link
Member

@sipa sipa commented Jul 26, 2023

This PR furthers the P2P message serialization/deserialization abstraction introduced in #16202 and #16562, in preparation for introducing the BIP324 v2 transport (making this part of #27634). However, nothing in this PR is BIP324-specific, and it contains a number of independently useful improvements.

The overall idea is to have a single object in every CNode (called m_transport) that is responsible for converting sent messages to wire bytes, and for converting received wire bytes back to messages, while having as little as possible knowledge about this conversion process in higher-level net code. To accomplish that, there is an abstract Transport class with (currently) a single V1Transport implementation.

Structurally, the above is accomplished by:

  • Merging the TransportDeserializer and TransportSerializer classes into a single Transport class, which encompasses both the sending and receiving side. For V1Transport these two sides are entirely separate, but this assumption doesn't hold for the BIP324 transport where e.g. the sending encryption key depends on the DH key negotiation data received from the other side. Merging the two means a future V2Transport can handle all this interaction without callers needing to be aware.
  • Removing the assumption that each message is sent using a computed header followed by (unmodified) data bytes. To achieve that, the sending side of Transport mirrors what the receiver side does: callers can set a message to be sent, then ask what bytes must be sent out, and then allowing them to transition to the next message.
  • Adding internal locks to protect the sending and receiving state of the V1Transport implementation. I believe these aren't strictly needed (opinions welcome) as there is no real way to use Transport objects in a multi-threaded fashion without some form of external synchronization (e.g. "get next bytes to send" isn't meaningful to call from multiple threads at the same time without mechanism to control the order they'll actually get sent). Still, I feel it's cleaner to make the object responsible for its own consistency (as we definitely do not want the entire object to be under a single external GUARDED_BY, as that'd prevent simultaneous sending and receiving).
  • Moving the conversion of messages to bytes on the sending side from PushMessage to SocketSendData, which is needed to deal with the fact that a transport may not immediately be able to send messages.

This PR is not a refactor, though some commits are. Among the semantic changes are:

  • Changing the send buffer pushback mechanism to trigger based on the memory usage of the buffer rather than the amount of bytes to be sent. This is both closer to the desired behavior, and makes the buffering independent from transport details (which is why it's included here).
  • When optimistic send is not applicable, the V1 message checksum calculation now runs in the net thread rather than the message handling thread. I believe that's generally an improvement, as the message handling thread is far more computationally bottlenecked already.
  • The checksum calculation now runs under the CNode::cs_vSend lock, which does mean no two checksum calculations for messages sent to the same node can run in parallel, even if running in separate threads. Despite that limitation, having the checksum for non-optimistic sends moved in the net thread is still an improvement, I believe.
  • Statistics for per-message-type sent bytes are now updated when the bytes are actually handed to the OS rather than in PushMessage. This is because the actual serialized sizes aren't known until they've gone through the transport object.

A fuzz test of the entire V1Transport is included. More elaborate rationale for each of the changes can be found in the commit messages.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 26, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK theStack, vasild, dergoegge
Concept ACK jonatack
Stale ACK vincenzopalazzo, Sjors, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@dergoegge
Copy link
Member

Concept ACK

Merging the two means a future V2Transport can handle all this interaction without callers needing to be aware.

🚀

@sipa sipa force-pushed the 202307_merge_sers branch from 977f812 to b7a7ed7 Compare July 31, 2023 16:54
@sipa sipa mentioned this pull request Aug 1, 2023
@jonatack
Copy link
Member

jonatack commented Aug 2, 2023

Concept ACK

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Mostly happy with b7a7ed7. Minor inline suggestions and/or making sure I understand what the code is doing.

I also checked that all intermediate commits compile and ran the new fuzzer for several hours (but didn't study it very carefully).

  • 1937a5f: -maxsendbuffer help could be changed to say "Maximum per-connection memory use for the send buffer"
  • 68e48a0: can you elaborate a bit in the commit description why nSendOffset can be dropped?

pnode->m_send_memusage += sizeof(pnode->vSendMsg.back()) + memusage::DynamicUsage(pnode->vSendMsg.back());
pnode->m_transport->MarkBytesSent(bytes.size());
}
pnode->m_send_memusage += msg.GetMemoryUsage();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

68e48a0: this is much readable than the intermediate calculation introduced in 1937a5f. It might be worth moving that commit after here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't reordering works without (even) more intermediary complexity. The issue is "net: move message serialization from PushMessage to SocketSendData" changes the data type of vSendMsg from bytes-to-be-sent to messages-to-be-sent, and the latter just don't have a known size-on-the-wire (unless an additional API to transports is added for that, or hardcoding the V1 message encoding size rules). That's why this PR first changes the notion of send buffer size: the old notion just doesn't really make sense anymore after the buffer changes introduced here.

@sipa
Copy link
Member Author

sipa commented Aug 3, 2023

68e48a0: can you elaborate a bit in the commit description why nSendOffset can be dropped?

In the current codebase, the send buffering (= remembering the to-be-sent bytes which we haven't managed to send yet) is done using vSendMsg (a queue of byte arrays) + nSendOffset (the position within vSendMsg[0] up to where we've sent things). After this PR, vSendMsg is turned into a queue of messages, which have not yet been converted to bytes-on-the-wire; this conversion is now handled by m_transport, and it's the transport that remembers what/how much has been sent yet.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Slowly working my way through, reviewed up to f97adbf (commit 3/8), looks good so far.
One potential code deduplication nit (though I'm not sure if it gains that much in readability): seems like for V1Transport, HaveBytesToSend() and DoneSendingMessage() are in an inverse relation (or however one would call that), so one could be expressed through the other, e.g.:

index 887669e32b..af0fa5c603 100644
--- a/src/net.cpp
+++ b/src/net.cpp
@@ -863,8 +863,7 @@ void V1Transport::SetMessageToSend(CSerializedNetMsg&& msg) noexcept
 
 bool V1Transport::HaveBytesToSend() const noexcept
 {
-    LOCK(m_cs_send);
-    return m_sending_header || m_bytes_sent != m_message_to_send.data.size();
+    return !DoneSendingMessage();
 }
 
 Transport::BytesToSend V1Transport::GetBytesToSend() const noexcept

Copy link

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK till 1937a5f

I will continue later on it

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@@ -72,6 +72,12 @@ void ConnmanTestMsg::NodeReceiveMsgBytes(CNode& node, Span<const uint8_t> msg_by

bool ConnmanTestMsg::ReceiveMsgFrom(CNode& node, CSerializedNetMsg&& ser_msg) const
{
/* Flush out any unsent bytes from previous messages. */
while (node.m_transport->HaveBytesToSend()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems confusing and not ideal that a function called ReceiveMsgFrom has so much code dealing with the Send part, and also the side effect of flushing the send message buffer, when the goal is just to create a header for ser_msg to be able to receive that, but not to send anything. For example, it should be possible to call ReceiveMsgFrom in situations where the send buffer has unrelated contents. Maybe it'd be better to just have the relevant code (i.e. the old prepareForTransport duplicated here to extract the header instead of changing the send parts of m_transport?)

Copy link
Member Author

@sipa sipa Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's complicated. You're right that duplicating the old prepareForTransport here would work, but I feel that's not really the right approach, as it's kind of hiding what's really going on. It'd also be incompatible with a potential future upgrade to using v2 transports inside the affected unit and fuzz tests.

What's really going on is that we're using the CNode's own sending infrastructure to construct bytes, which are then fed to the same CNode's receive infrastructure. But there is also "normal" non-test sending logic (e.g. sending of version/verack automatically) that uses the same sending infrastucture (however, failing as there is no real socket to send anything on). The flushing here was necessary to wipe the non-test messages that enter the same transport.

I have for now tried to address your concern here by introducing a more explicit FlushSendBuffer call, and adding it where necessary to make tests pass, making it clear where and what is being flushed away, however I think the more proper solution would involve:

  • Somehow prevent the sending of non-test messages entirely in test connmans (rather that hacking them away after the fact).
  • Give the test connman dedicated per-receive-node send transports which are just used for the test messages sent there. That would make it compatible with v2 operation too.

Doing those things here feel out of scope, though.

@sipa sipa force-pushed the 202307_merge_sers branch from b7a7ed7 to f9f69aa Compare August 15, 2023 22:28
@sipa
Copy link
Member Author

sipa commented Aug 15, 2023

I've made some significant changes to this PR:

  • DoneSendingMessage is gone (the caller can just call SetMessageToSend directly, which will fail if nothing can be sent at that time).
  • HaveBytesToSend is gone (the caller can just call GetBytesToSend, which will report an empty span if nothing is to be sent).
  • GetBytesToSend now takes a have_next_message as input, which lets its prediction for whether there are more bytes to send afterwards be more accurate (letting it take into account a future SetMessageToSend). This prediction is also tested now, through an Assume in the net code directly, and inside the transport simulation fuzz test.
  • A lot more comments and documentation.
  • More sanity checks in the added fuzz test.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new approach in 384c798 is a nice improvement.

I'll continue reviewing from 09972b5 onwards after you've had a chance to comment on my more comment.

@sipa sipa force-pushed the 202307_merge_sers branch 2 times, most recently from 51c8604 to d9a91ba Compare August 16, 2023 23:22
Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d9a91ba just in time for the #27981 rebase :-(

@sipa sipa force-pushed the 202307_merge_sers branch 2 times, most recently from 3e0d044 to 47eb247 Compare August 17, 2023 19:01
@sipa sipa force-pushed the 202307_merge_sers branch from 47eb247 to 44bb2ba Compare August 18, 2023 02:51
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 8a3b6f3

@DrahtBot DrahtBot requested review from vincenzopalazzo and removed request for vincenzopalazzo August 24, 2023 11:24
Copy link
Member

@dergoegge dergoegge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 8a3b6f3

@DrahtBot DrahtBot requested review from vincenzopalazzo and removed request for vincenzopalazzo August 24, 2023 12:36
@fanquake fanquake merged commit 1fa6411 into bitcoin:master Aug 24, 2023
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Sep 8, 2023
8a3b6f3 refactor: make Transport::ReceivedBytes just return success/fail (Pieter Wuille)
bb4aab9 net: move message conversion to wire bytes from PushMessage to SocketSendData (Pieter Wuille)
a1a1060 net: measure send buffer fullness based on memory usage (Pieter Wuille)
009ff8d fuzz: add bidirectional fragmented transport test (Pieter Wuille)
fb2c5ed net: make V1Transport implicitly use current chainparams (Pieter Wuille)
0de48fe net: abstract sending side of transport serialization further (Pieter Wuille)
649a83c refactor: rename Transport class receive functions (Pieter Wuille)
27f9ba2 net: add V1Transport lock protecting receive state (Pieter Wuille)
93594e4 refactor: merge transport serializer and deserializer into Transport class (Pieter Wuille)

Pull request description:

  This PR furthers the P2P message serialization/deserialization abstraction introduced in bitcoin#16202 and bitcoin#16562, in preparation for introducing the BIP324 v2 transport (making this part of bitcoin#27634). However, nothing in this PR is BIP324-specific, and it contains a number of independently useful improvements.

  The overall idea is to have a single object in every `CNode` (called `m_transport`) that is responsible for converting sent messages to wire bytes, and for converting received wire bytes back to messages, while having as little as possible knowledge about this conversion process in higher-level net code. To accomplish that, there is an abstract `Transport` class with (currently) a single `V1Transport` implementation.

  Structurally, the above is accomplished by:
  * Merging the `TransportDeserializer` and `TransportSerializer` classes into a single `Transport` class, which encompasses both the sending and receiving side. For `V1Transport` these two sides are entirely separate, but this assumption doesn't hold for the BIP324 transport where e.g. the sending encryption key depends on the DH key negotiation data received from the other side. Merging the two means a future `V2Transport` can handle all this interaction without callers needing to be aware.
  * Removing the assumption that each message is sent using a computed header followed by (unmodified) data bytes. To achieve that, the sending side of `Transport` mirrors what the receiver side does: callers can set a message to be sent, then ask what bytes must be sent out, and then allowing them to transition to the next message.
  * Adding internal locks to protect the sending and receiving state of the `V1Transport` implementation. I believe these aren't strictly needed (opinions welcome) as there is no real way to use `Transport` objects in a multi-threaded fashion without some form of external synchronization (e.g. "get next bytes to send" isn't meaningful to call from multiple threads at the same time without mechanism to control the order they'll actually get sent). Still, I feel it's cleaner to make the object responsible for its own consistency (as we definitely do not want the entire object to be under a single external GUARDED_BY, as that'd prevent simultaneous sending and receiving).
  * Moving the conversion of messages to bytes on the sending side from `PushMessage` to `SocketSendData`, which is needed to deal with the fact that a transport may not immediately be able to send messages.

  This PR is not a refactor, though some commits are. Among the semantic changes are:
  * Changing the send buffer pushback mechanism to trigger based on the memory usage of the buffer rather than the amount of bytes to be sent. This is both closer to the desired behavior, and makes the buffering independent from transport details (which is why it's included here).
  * When optimistic send is not applicable, the V1 message checksum calculation now runs in the net thread rather than the message handling thread. I believe that's generally an improvement, as the message handling thread is far more computationally bottlenecked already.
  * The checksum calculation now runs under the `CNode::cs_vSend` lock, which does mean no two checksum calculations for messages sent to the same node can run in parallel, even if running in separate threads. Despite that limitation, having the checksum for non-optimistic sends moved in the net thread is still an improvement, I believe.
  * Statistics for per-message-type sent bytes are now updated when the bytes are actually handed to the OS rather than in `PushMessage`. This is because the actual serialized sizes aren't known until they've gone through the transport object.

  A fuzz test of the entire `V1Transport` is included. More elaborate rationale for each of the changes can be found in the commit messages.

ACKs for top commit:
  theStack:
    re-ACK 8a3b6f3
  vasild:
    ACK 8a3b6f3
  dergoegge:
    Code review ACK 8a3b6f3

Tree-SHA512: 26e9a6df47f1dd3e3f3edb4874edf365728e5a8bbc9d0d4d71fb6000cb2dfde5574902c47ffcf825af6743922f2ff9d31a5a38942a196f4ca6669122e15e42e4
void initialize_p2p_transport_serialization()
{
SelectParams(ChainType::REGTEST);
g_all_messages = getAllNetMessageTypes();
std::sort(g_all_messages.begin(), g_all_messages.end());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this sort here? Anybody remembers? It is only used like:

g_all_messages[v % g_all_messages.size()]

where v is a random number. The sort seems unnecessary?

The question arised in: #29421 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it's not random, it's a fuzzer input. They're random in a way, but definitely not uniformly random.

I think the sorting is there to make sure that just reorderings in the list don't affect fuzzer behavior.

kwvg added a commit to kwvg/dash that referenced this pull request Aug 4, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Aug 4, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Aug 6, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Aug 6, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Aug 13, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Aug 27, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Aug 27, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Aug 31, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Sep 4, 2024
PastaPastaPasta added a commit to dashpay/dash that referenced this pull request Sep 4, 2024
, bitcoin#24021, bitcoin#24543, bitcoin#26844, bitcoin#25325, bitcoin#28165, partial bitcoin#20524, bitcoin#26036, bitcoin#27981 (networking backports: part 7)

76a458e fmt: apply formatting suggestions from `clang-format-diff.py` (Kittywhiskers Van Gogh)
63962ec merge bitcoin#28165: transport abstraction (Kittywhiskers Van Gogh)
c6b9186 merge bitcoin#25325: Add pool based memory resource (Kittywhiskers Van Gogh)
8c986d6 partial bitcoin#27981: Fix potential network stalling bug (Kittywhiskers Van Gogh)
13f6dc1 merge bitcoin#26844: Pass MSG_MORE flag when sending non-final network messages (Kittywhiskers Van Gogh)
caaa0fd net: use `std::deque` for `vSendMsg` instead of `std::list` (Kittywhiskers Van Gogh)
2ecba6b partial bitcoin#26036: add NetEventsInterface::g_msgproc_mutex (Kittywhiskers Van Gogh)
f6c9439 merge bitcoin#24543: Move remaining globals into PeerManagerImpl (Kittywhiskers Van Gogh)
dbe41ea refactor: move object request logic to `PeerManagerImpl` (Kittywhiskers Van Gogh)
112c4e0 merge bitcoin#24021: Rename and move PoissonNextSend functions (Kittywhiskers Van Gogh)
6d690ed merge bitcoin#23970: Remove pointless and confusing shift in RelayAddress (Kittywhiskers Van Gogh)
87205f2 merge bitcoin#21327: ignore transactions while in IBD (Kittywhiskers Van Gogh)
51ad8e4 merge bitcoin#21148: Split orphan handling from net_processing into txorphanage (Kittywhiskers Van Gogh)
cbff29a partial bitcoin#20524: Move MIN_VERSION_SUPPORTED to p2p.py (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  * Dependent on #6098

  * Dependent on #6233

  * `p2p_ibd_txrelay.py` was first introduced in [bitcoin#19423](bitcoin#19423) to test feefilter logic but on account of Dash not having feefilter capabilities, that backport was skipped over but on account of the tests introduced in [bitcoin#21327](bitcoin#21327) that test capabilities present in Dash, a minimal version of `p2p_ibd_txrelay.py` has been committed in.

  * `vSendMsg` is originally a `std::deque` and as an optimization, was changed to a `std::list` in 027a852 ([dash#3398](#3398)) but this renders us unable to backport [bitcoin#26844](bitcoin#26844) as it introduces build failures. The optimization has been reverted to make way for the backport.

    <details>

    <summary>Compile failure:</summary>

    ```
    net.cpp:959:20: error: invalid operands to binary expression ('iterator' (aka '_List_iterator<std::vector<unsigned char, std::allocator<unsigned char>>>') and 'int')
                if (it + 1 != node.vSendMsg.end()) {
                    ~~ ^ ~
    /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_bvector.h:303:3: note: candidate function not viable: no known conversion from 'iterator' (aka '_List_iterator<std::vector<unsigned char, std::allocator<unsigned char>>>') to 'ptrdiff_t' (aka 'long') for 1st argument
      operator+(ptrdiff_t __n, const _Bit_iterator& __x)
    [...]
    1 error generated.
    make[2]: *** [Makefile:11296: libbitcoin_server_a-net.o] Error 1
    make[2]: *** Waiting for unfinished jobs....
    make[2]: Leaving directory '/src/dash/src'
    make[1]: *** [Makefile:19171: all-recursive] Error 1
    make[1]: Leaving directory '/src/dash/src'
    make: *** [Makefile:799: all-recursive] Error 1
    ```

    </details>

  * The collection of `CNode` pointers in `CConnman::SocketHandlerConnected` has been changed to a `std::set` to allow for us to erase elements from `vReceivableNodes` if the node is _also_ in the set of sendable nodes and the send hasn't entirely succeeded to avoid a deadlock (i.e. backport [bitcoin#27981](bitcoin#27981))

  * When backporting [bitcoin#28165](bitcoin#28165), `denialofservice_tests` has been modified to still check with `vSendMsg` instead of `Transport::GetBytesToSend()` as changes in networking code to support LT-SEMs (level-triggered socket events mode) mean that the message doesn't get shifted from `vSendMsg` to `m_message_to_send`, as the test expects.
    * Specifically, the changes made for LT-SEM support result in the function responsible for making that shift (`Transport::SetMessageToSend()` through `CConnman::SocketSendData()`), not being called during the test runtime.

  * As checking `vSendMsg` (directly or through `nSendMsgSize`) isn't enough to determine if the queue is empty, we now also check with `to_send` from `Transport::GetBytesToSend()` to help us make that determination. This mirrors the change present in the upstream backport ([source](https://github.com/bitcoin/bitcoin/pull/28165/files#diff-00021eed586a482abdb09d6cdada1d90115abe988a91421851960e26658bed02R1324-R1327)).

  ## Breaking Changes

  * `bandwidth.message.*.bytesSent` will no longer include overhead and will now only report message size as specifics that let us calculate the overhead have been abstracted away.

  ## Checklist:

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  PastaPastaPasta:
    utACK 76a458e

Tree-SHA512: 2e47c207c1f854cfbd5b28c07dd78e12765ddb919abcd7710325df5d253cd0ba4bc30aa21545d88519e8acfe65638a57db4ca66853aca82fc355542210f4b394
@bitcoin bitcoin locked and limited conversation to collaborators Feb 14, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants