Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: Drop v2 garbage authentication packet #28525

Merged
merged 2 commits into from Sep 29, 2023

Conversation

real-or-random
Copy link
Contributor

@real-or-random real-or-random commented Sep 24, 2023

Note that this is a breaking change, see also bitcoin/bips#1498

The benefit is a simpler implementation:

  • The protocol state machine does not need separate states for garbage authentication and version phases.
  • The special case of "ignoring the ignore bit" is removed.
  • The freedom to choose the contents of the garbage authentication packet is removed. This simplifies testing.

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 24, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK naumenkogs, sipa, Sjors, theStack, ajtowns, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Copy link
Member

@naumenkogs naumenkogs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

src/net.cpp Outdated Show resolved Hide resolved
src/net.h Outdated Show resolved Hide resolved
src/net.h Outdated Show resolved Hide resolved
@ajtowns
Copy link
Contributor

ajtowns commented Sep 26, 2023

Approach ACK assuming BIP change is accepted

src/net.cpp Show resolved Hide resolved
src/net.h Outdated Show resolved Hide resolved
Copy link
Member

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK, with minor nits. I'm also on board with the corresponding BIP change (see the discussion there).

src/test/net_tests.cpp Outdated Show resolved Hide resolved
See also bitcoin/bips#1498

The benefit is a simpler implementation:
 - The protocol state machine does not need separate states for garbage
   authentication and version phases.
 - The special case of "ignoring the ignore bit" is removed.
 - The freedom to choose the contents of the garbage authentication
   packet is removed. This simplifies testing.
@real-or-random
Copy link
Contributor Author

real-or-random commented Sep 27, 2023

The CI failure seems unrelated: test_framework.authproxy.JSONRPCException: Unable to create transaction. Fee exceeds maximum configured by user (e.g. -maxtxfee, maxfeerate) (-4). edit: Okay, this is #28491.

@naumenkogs
Copy link
Member

ACK e3720bc

@sipa
Copy link
Member

sipa commented Sep 27, 2023

ACK e3720bc. Re-ran the v2 transport fuzzer overnight.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

/** During GARBAUTH, the garbage received during GARB_GARBTERM. */
std::vector<uint8_t> m_recv_garbage GUARDED_BY(m_recv_mutex);
/** AAD expected in next received packet (currently used only for garbage). */
std::vector<uint8_t> m_recv_aad GUARDED_BY(m_recv_mutex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it's a bit weird to receive an AAD in one message but then verify it in the next message. Perhaps it's cleaner to to keep m_recv_garbage around and explicitly move it into m_recv_aad as soon as we start receiving bytes for the decoy/version message?

That way we can Assume that m_recv_aad is empty when start receiving bytes and clear it when we're done processing a message. (this commit already does the latter)

Copy link
Member

@sipa sipa Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's cleaner to to keep m_recv_garbage around and explicitly move it into m_recv_aad as soon as we start receiving bytes for the decoy/version message?

We receive data into m_recv_buffer, and by the time we process it, we're effectively already done with it. So what you're suggesting would just amount to moving/shrinking from m_recv_buffer to m_recv_garbage and immediately after (on the next line) moving that to m_recv_aad.

I think the better way of looking at it is not that the AAD is data received at all; it's a side effect of processing data in a particular phase that results in a requirement that the next message commit to it. For now, the only place where this happens is that during the processing of received bytes in the GARB_TERM phase the garbage needs to be committed to (but not all the data received during that phase, as it doesn't include the terminator).

That way we can Assume that m_recv_aad is empty when start receiving bytes and clear it when we're done processing a message. (this commit already does the latter).

I don't think that works? At the start of the version phase, the previous garbage will be in m_recv_aad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it's a bit weird to receive an AAD in one message but then verify it in the next message. Perhaps it's cleaner to to keep m_recv_garbage around and explicitly move it into m_recv_aad as soon as we start receiving bytes for the decoy/version message?

If I understand correctly, this is roughly equivalent to dropping the second commit again, and I don't think this was cleaner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the start of the version phase, the previous garbage will be in m_recv_aad.

This is the part that feels a bit strange to me. But may it's not too bad:

We could instead Assume that m_recv_aad is empty when we enter the APP and when we enter the APP_READY state. It can be cleared in the VERSION state and after each normal message in the APP state, which indeed this PR does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose an Assume in APP may or may not be future proof, because it depends on how AADs will be used: do we know what to expect for the next message (as is the case here for garbage), do we infer it from the current message, is it some time dependent thing or even a constant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose an Assume in APP may or may not be future proof, because it depends on how AADs will be used: do we know what to expect for the next message (as is the case here for garbage), do we infer it from the current message, is it some time dependent thing or even a constant?

Well, if we ever have a v3 that uses AAD for more than the garbage, we could drop the Assume, so I guess that's not a big deal.

On the other hand, I don't see that much value in adding an Assume. The sender and receiver code are sufficiently independent. So unless we have the exact same bug in the sender logic, a bug in the receiver logic that leads to m_recv_aad being non-empty would lead to an immediate disconnect. That is, such a bug will certainly be noticed during testing, even without the Assume. (The Assume would just make it easier to find the cause.)

My current thinking is that I don't want to touch the PR now that it has three ACKs. But I'll add an Assume if there's another good reason to touch it (or if you have strong opinions on this, of course.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not totally relevant to the PR but some remarks:

do we know what to expect for the next message (as is the case here for garbage)
do we infer it from the current message,

I think one of these two is true in 99.9% of the cases I can imagine.

As for inferring it from current message: Inferring it from the contents of the AEAD is excluded by design (you can't decrypt a ciphertext that doesn't verify). It would just be possible for data in the "outer" header (currently that's just the length field). But even if we want some additional data in the outer header, this doesn't really contradict the current logic in the code (e.g., we could concatenate the current-message thing to whatever m_recv_aad is, or then start Assuming it's empty).

is it some time dependent thing

That will be too fragile, i.e., we would abort the connection if clocks are off.

even a constant?

Then there will be no reason to include it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarifications. Agree that adding an Assume can wait.

@theStack
Copy link
Contributor

Concept ACK

@sipa
Copy link
Member

sipa commented Sep 27, 2023

Observation during discussion with @achow101: the old behavior is actually equivalent to the new behavior, but with the added rule that the very first packet is implicitly a decoy (whether the bit is set or not).

@real-or-random
Copy link
Contributor Author

old behavior [...] with the added rule that the very first packet is implicitly a decoy (whether the bit is set or not).

Interesting observation, and this confirms that the old behavior is somewhat unnatural.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK e3720bc

Timing wise, I plan to run this on mainnet once it's added to the integration PR.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK e3720bc

I didn't follow the full discussion in the BIP change PR yet, but to my understanding, this approach simplifies the state machine and doesn't have any drawbacks (other than adaptions needed in #24748 and a temporary incompatibility between BIP324 nodes running old/new version, obviously).

@achow101
Copy link
Member

RFM but waiting for BIP change to be merged first.

@sipa sipa mentioned this pull request Sep 28, 2023
@ajtowns
Copy link
Contributor

ajtowns commented Sep 29, 2023

ACK e3720bc - simpler and more flexible, nice

@Sjors
Copy link
Member

Sjors commented Sep 29, 2023

BIP has been merged

@Sjors
Copy link
Member

Sjors commented Sep 29, 2023

FYI this is what happens if you don't upgrade your node and try to connect to an upgrade one:

2023-09-29T10:45:39.069931Z [net] Added connection to ... peer=369
2023-09-29T10:45:39.069954Z [net] sending version (103 bytes) peer=369
2023-09-29T10:45:39.069969Z [net] send version message: version 70016, blocks=809861, them=[...]:8333, txrelay=1, peer=369
2023-09-29T10:45:39.111424Z [net] start sending v2 handshake to peer=369
2023-09-29T10:45:39.295724Z [net] received: wtxidrelay (0 bytes) peer=369
2023-09-29T10:45:39.295749Z [net] non-version message before version handshake. Message "wtxidrelay" from peer=369
2023-09-29T10:45:39.295860Z [net] received: sendaddrv2 (0 bytes) peer=369
2023-09-29T10:45:39.295883Z [net] non-version message before version handshake. Message "sendaddrv2" from peer=369
2023-09-29T10:45:39.295990Z [net] received: verack (0 bytes) peer=369
2023-09-29T10:45:39.296011Z [net] non-version message before version handshake. Message "verack" from peer=369
2023-09-29T10:46:40.011229Z [net] version handshake timeout, disconnecting peer=369
2023-09-29T10:46:40.011279Z [net] disconnecting peer=369
2023-09-29T10:46:40.011323Z [net] Cleared nodestate for peer=369

(bangs head on table for forgetting make install)

The other way around, if you upgrade but connect to a node that didn't upgrade, you get something like:

2023-09-29T11:02:27.083928Z [net] Added connection to ... peer=55
2023-09-29T11:02:27.084037Z [net] sending version (103 bytes) peer=55
2023-09-29T11:02:27.084096Z [net] send version message: version 70016, blocks=809862, them=...:8333, txrelay=1, peer=55
2023-09-29T11:02:27.117891Z [net] start sending v2 handshake to peer=55
2023-09-29T11:02:27.222922Z [net] V2 transport error: invalid message type (0 bytes contents), peer=55
2023-09-29T11:02:27.522980Z [net] socket closed for peer=55
2023-09-29T11:02:27.523016Z [net] disconnecting peer=55
2023-09-29T11:02:27.523059Z [net] Cleared nodestate for peer=55

@achow101
Copy link
Member

ACK e3720bc

@achow101 achow101 merged commit d18a8f6 into bitcoin:master Sep 29, 2023
16 checks passed
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Oct 5, 2023
e3720bc net: Simplify v2 recv logic by decoupling AAD from state machine (Tim Ruffing)
b0f5175 net: Drop v2 garbage authentication packet (Tim Ruffing)

Pull request description:

  Note that this is a breaking change, see also bitcoin/bips#1498

  The benefit is a simpler implementation:
   - The protocol state machine does not need separate states for garbage authentication and version phases.
   - The special case of "ignoring the ignore bit" is removed.
   - The freedom to choose the contents of the garbage authentication packet is removed. This simplifies testing.

ACKs for top commit:
  naumenkogs:
    ACK e3720bc
  sipa:
    ACK e3720bc. Re-ran the v2 transport fuzzer overnight.
  ajtowns:
    ACK e3720bc - simpler and more flexible, nice
  achow101:
    ACK e3720bc
  Sjors:
    utACK e3720bc
  theStack:
    Code-review ACK e3720bc

Tree-SHA512: 16600ed868c8a346828de075c4072e37cf86440751d08ab099fe8b58ff71d8b371a90397d6a4247096796db68794275e7e0403f218859567d04838e0411dadd6
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants