Skip to content

net: Drop v2 garbage authentication packet #28525

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 29, 2023

Conversation

real-or-random
Copy link
Contributor

@real-or-random real-or-random commented Sep 24, 2023

Note that this is a breaking change, see also bitcoin/bips#1498

The benefit is a simpler implementation:

  • The protocol state machine does not need separate states for garbage authentication and version phases.
  • The special case of "ignoring the ignore bit" is removed.
  • The freedom to choose the contents of the garbage authentication packet is removed. This simplifies testing.

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 24, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK naumenkogs, sipa, Sjors, theStack, ajtowns, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Copy link
Member

@naumenkogs naumenkogs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@ajtowns
Copy link
Contributor

ajtowns commented Sep 26, 2023

Approach ACK assuming BIP change is accepted

Copy link
Member

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK, with minor nits. I'm also on board with the corresponding BIP change (see the discussion there).

@sipa sipa mentioned this pull request Sep 19, 2023
43 tasks
See also bitcoin/bips#1498

The benefit is a simpler implementation:
 - The protocol state machine does not need separate states for garbage
   authentication and version phases.
 - The special case of "ignoring the ignore bit" is removed.
 - The freedom to choose the contents of the garbage authentication
   packet is removed. This simplifies testing.
@real-or-random
Copy link
Contributor Author

real-or-random commented Sep 27, 2023

The CI failure seems unrelated: test_framework.authproxy.JSONRPCException: Unable to create transaction. Fee exceeds maximum configured by user (e.g. -maxtxfee, maxfeerate) (-4). edit: Okay, this is #28491.

@naumenkogs
Copy link
Member

ACK e3720bc

@sipa
Copy link
Member

sipa commented Sep 27, 2023

ACK e3720bc. Re-ran the v2 transport fuzzer overnight.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

/** During GARBAUTH, the garbage received during GARB_GARBTERM. */
std::vector<uint8_t> m_recv_garbage GUARDED_BY(m_recv_mutex);
/** AAD expected in next received packet (currently used only for garbage). */
std::vector<uint8_t> m_recv_aad GUARDED_BY(m_recv_mutex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it's a bit weird to receive an AAD in one message but then verify it in the next message. Perhaps it's cleaner to to keep m_recv_garbage around and explicitly move it into m_recv_aad as soon as we start receiving bytes for the decoy/version message?

That way we can Assume that m_recv_aad is empty when start receiving bytes and clear it when we're done processing a message. (this commit already does the latter)

Copy link
Member

@sipa sipa Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's cleaner to to keep m_recv_garbage around and explicitly move it into m_recv_aad as soon as we start receiving bytes for the decoy/version message?

We receive data into m_recv_buffer, and by the time we process it, we're effectively already done with it. So what you're suggesting would just amount to moving/shrinking from m_recv_buffer to m_recv_garbage and immediately after (on the next line) moving that to m_recv_aad.

I think the better way of looking at it is not that the AAD is data received at all; it's a side effect of processing data in a particular phase that results in a requirement that the next message commit to it. For now, the only place where this happens is that during the processing of received bytes in the GARB_TERM phase the garbage needs to be committed to (but not all the data received during that phase, as it doesn't include the terminator).

That way we can Assume that m_recv_aad is empty when start receiving bytes and clear it when we're done processing a message. (this commit already does the latter).

I don't think that works? At the start of the version phase, the previous garbage will be in m_recv_aad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it's a bit weird to receive an AAD in one message but then verify it in the next message. Perhaps it's cleaner to to keep m_recv_garbage around and explicitly move it into m_recv_aad as soon as we start receiving bytes for the decoy/version message?

If I understand correctly, this is roughly equivalent to dropping the second commit again, and I don't think this was cleaner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the start of the version phase, the previous garbage will be in m_recv_aad.

This is the part that feels a bit strange to me. But may it's not too bad:

We could instead Assume that m_recv_aad is empty when we enter the APP and when we enter the APP_READY state. It can be cleared in the VERSION state and after each normal message in the APP state, which indeed this PR does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose an Assume in APP may or may not be future proof, because it depends on how AADs will be used: do we know what to expect for the next message (as is the case here for garbage), do we infer it from the current message, is it some time dependent thing or even a constant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose an Assume in APP may or may not be future proof, because it depends on how AADs will be used: do we know what to expect for the next message (as is the case here for garbage), do we infer it from the current message, is it some time dependent thing or even a constant?

Well, if we ever have a v3 that uses AAD for more than the garbage, we could drop the Assume, so I guess that's not a big deal.

On the other hand, I don't see that much value in adding an Assume. The sender and receiver code are sufficiently independent. So unless we have the exact same bug in the sender logic, a bug in the receiver logic that leads to m_recv_aad being non-empty would lead to an immediate disconnect. That is, such a bug will certainly be noticed during testing, even without the Assume. (The Assume would just make it easier to find the cause.)

My current thinking is that I don't want to touch the PR now that it has three ACKs. But I'll add an Assume if there's another good reason to touch it (or if you have strong opinions on this, of course.)

Copy link
Contributor Author

@real-or-random real-or-random Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not totally relevant to the PR but some remarks:

do we know what to expect for the next message (as is the case here for garbage)
do we infer it from the current message,

I think one of these two is true in 99.9% of the cases I can imagine.

As for inferring it from current message: Inferring it from the contents of the AEAD is excluded by design (you can't decrypt a ciphertext that doesn't verify). It would just be possible for data in the "outer" header (currently that's just the length field). But even if we want some additional data in the outer header, this doesn't really contradict the current logic in the code (e.g., we could concatenate the current-message thing to whatever m_recv_aad is, or then start Assuming it's empty).

is it some time dependent thing

That will be too fragile, i.e., we would abort the connection if clocks are off.

even a constant?

Then there will be no reason to include it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarifications. Agree that adding an Assume can wait.

@theStack
Copy link
Contributor

Concept ACK

@sipa
Copy link
Member

sipa commented Sep 27, 2023

Observation during discussion with @achow101: the old behavior is actually equivalent to the new behavior, but with the added rule that the very first packet is implicitly a decoy (whether the bit is set or not).

@real-or-random
Copy link
Contributor Author

old behavior [...] with the added rule that the very first packet is implicitly a decoy (whether the bit is set or not).

Interesting observation, and this confirms that the old behavior is somewhat unnatural.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK e3720bc

Timing wise, I plan to run this on mainnet once it's added to the integration PR.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK e3720bc

I didn't follow the full discussion in the BIP change PR yet, but to my understanding, this approach simplifies the state machine and doesn't have any drawbacks (other than adaptions needed in #24748 and a temporary incompatibility between BIP324 nodes running old/new version, obviously).

@achow101
Copy link
Member

RFM but waiting for BIP change to be merged first.

@sipa sipa mentioned this pull request Sep 28, 2023
@ajtowns
Copy link
Contributor

ajtowns commented Sep 29, 2023

ACK e3720bc - simpler and more flexible, nice

@Sjors
Copy link
Member

Sjors commented Sep 29, 2023

BIP has been merged

@Sjors
Copy link
Member

Sjors commented Sep 29, 2023

FYI this is what happens if you don't upgrade your node and try to connect to an upgrade one:

2023-09-29T10:45:39.069931Z [net] Added connection to ... peer=369
2023-09-29T10:45:39.069954Z [net] sending version (103 bytes) peer=369
2023-09-29T10:45:39.069969Z [net] send version message: version 70016, blocks=809861, them=[...]:8333, txrelay=1, peer=369
2023-09-29T10:45:39.111424Z [net] start sending v2 handshake to peer=369
2023-09-29T10:45:39.295724Z [net] received: wtxidrelay (0 bytes) peer=369
2023-09-29T10:45:39.295749Z [net] non-version message before version handshake. Message "wtxidrelay" from peer=369
2023-09-29T10:45:39.295860Z [net] received: sendaddrv2 (0 bytes) peer=369
2023-09-29T10:45:39.295883Z [net] non-version message before version handshake. Message "sendaddrv2" from peer=369
2023-09-29T10:45:39.295990Z [net] received: verack (0 bytes) peer=369
2023-09-29T10:45:39.296011Z [net] non-version message before version handshake. Message "verack" from peer=369
2023-09-29T10:46:40.011229Z [net] version handshake timeout, disconnecting peer=369
2023-09-29T10:46:40.011279Z [net] disconnecting peer=369
2023-09-29T10:46:40.011323Z [net] Cleared nodestate for peer=369

(bangs head on table for forgetting make install)

The other way around, if you upgrade but connect to a node that didn't upgrade, you get something like:

2023-09-29T11:02:27.083928Z [net] Added connection to ... peer=55
2023-09-29T11:02:27.084037Z [net] sending version (103 bytes) peer=55
2023-09-29T11:02:27.084096Z [net] send version message: version 70016, blocks=809862, them=...:8333, txrelay=1, peer=55
2023-09-29T11:02:27.117891Z [net] start sending v2 handshake to peer=55
2023-09-29T11:02:27.222922Z [net] V2 transport error: invalid message type (0 bytes contents), peer=55
2023-09-29T11:02:27.522980Z [net] socket closed for peer=55
2023-09-29T11:02:27.523016Z [net] disconnecting peer=55
2023-09-29T11:02:27.523059Z [net] Cleared nodestate for peer=55

@achow101
Copy link
Member

ACK e3720bc

@achow101 achow101 merged commit d18a8f6 into bitcoin:master Sep 29, 2023
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Oct 5, 2023
e3720bc net: Simplify v2 recv logic by decoupling AAD from state machine (Tim Ruffing)
b0f5175 net: Drop v2 garbage authentication packet (Tim Ruffing)

Pull request description:

  Note that this is a breaking change, see also bitcoin/bips#1498

  The benefit is a simpler implementation:
   - The protocol state machine does not need separate states for garbage authentication and version phases.
   - The special case of "ignoring the ignore bit" is removed.
   - The freedom to choose the contents of the garbage authentication packet is removed. This simplifies testing.

ACKs for top commit:
  naumenkogs:
    ACK e3720bc
  sipa:
    ACK e3720bc. Re-ran the v2 transport fuzzer overnight.
  ajtowns:
    ACK e3720bc - simpler and more flexible, nice
  achow101:
    ACK e3720bc
  Sjors:
    utACK e3720bc
  theStack:
    Code-review ACK e3720bc

Tree-SHA512: 16600ed868c8a346828de075c4072e37cf86440751d08ab099fe8b58ff71d8b371a90397d6a4247096796db68794275e7e0403f218859567d04838e0411dadd6
kwvg added a commit to kwvg/dash that referenced this pull request Sep 17, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Sep 17, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Sep 18, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Sep 20, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Sep 20, 2024
PastaPastaPasta added a commit to dashpay/dash that referenced this pull request Sep 24, 2024
, bitcoin#26584, bitcoin#27577, bitcoin#28419, bitcoin#28196, bitcoin#28433, bitcoin#28489, bitcoin#28525 (BIP324 backports)

63b13aa merge bitcoin#28525: Drop v2 garbage authentication packet (Kittywhiskers Van Gogh)
662394c merge bitcoin#28489: fix incorrect assumption in v2transport_test (Kittywhiskers Van Gogh)
98782c6 merge bitcoin#28433: Follow-up to BIP324 connection support (Kittywhiskers Van Gogh)
f982516 merge bitcoin#28196: BIP324 connection support (Kittywhiskers Van Gogh)
3087275 merge bitcoin#28419: introduce and use `ConsumePrivateKey` helper (Kittywhiskers Van Gogh)
dccd395 merge bitcoin#27577: give seednodes time before falling back to fixed seeds (Kittywhiskers Van Gogh)
eb4f01f merge bitcoin#26584: include local ("unreachable") peers in -netinfo table (Kittywhiskers Van Gogh)
10dc874 merge bitcoin#26982: bitcoin#25880 fixups (Kittywhiskers Van Gogh)
a36f8f2 merge bitcoin#25880: Make stalling timeout adaptive during IBD (Kittywhiskers Van Gogh)
1d77f3f merge bitcoin#26519: Add getpeerinfo test for missing version message (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  * When backporting [bitcoin#25880](bitcoin#25880), changes needed to be made to `p2p_ibd_stalling.py` to help it pass.
    * DIP3 activation had to be delayed to a block beyond the range of the test. This is to prevent block rejection arising from a missing DIP3-compliant coinbase (done with `-dip3params=2000:2000`)
    * Mock time was disabled to ensure nodes in the test do not resort to direct fetching (with mock time enabled, nodes would be considered close apart in time, which would prevent the primary node from fetching in parallel, which is behavior this test relies on) (done with `self.disable_mocktime = True`)
    * The nodes connected do not report compressed headers support (the test relies on sending `headers` messages and reworking it to use compressed headers has little benefit) (done with `services = NODE_NETWORK | NODE_BLOOM`)

  * When backporting [bitcoin#28196](bitcoin#28196), in the `v2transport_test` unit test, references to `4000000` were substituted with `MAX_PROTOCOL_MESSAGE_LENGTH` as Dash messages have a protocol limit of 3MiB ([source](https://github.com/dashpay/dash/blob/d754799580d2914a0dd6388c90100d14feec0428/src/net.h#L79-L80)) while Bitcoin messages have a protocol limit of 4MB ([source](https://github.com/bitcoin/bitcoin/blob/225718eda89d65a7041c33e1994d3bf7bd71bbdd/src/net.h#L62-L63)).
    * Additionally note that short message IDs as defined in the BIP324 spec ([source](https://github.com/bitcoin/bips/blob/22660ad3078ee9bd106e64d44662a59a1967c4bd/bip-0324.mediawiki#v2-bitcoin-p2p-message-structure)) have not been changed to include Dash-specific messages, meaning, Dash-specific messages will always take 13 bytes.
    * As `FEEFILTER` is not supported by Dash, it has been replaced with a blank string in the short IDs table. It was not removed as doing so would disturb the table's arrangement as specified in spec and require readjustment of tests to account for the change in layout.

  ## Breaking Changes

  None expected.

  ## Checklist

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  UdjinM6:
    utACK 63b13aa
  knst:
    utACK 63b13aa

Tree-SHA512: c41d2c6b1f145be513e285c8f91d00ac31fff4e4d24f611a1fdab24c8740f453b0bb28912021cdf8be4f5ce93dcff8579864727ee14b8e45894b56df524ab48d
@bitcoin bitcoin locked and limited conversation to collaborators Sep 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants