Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the client's PTO code by allowing the server to send a PING #3161

Closed
ianswett opened this issue Oct 28, 2019 · 20 comments
Closed

Simplify the client's PTO code by allowing the server to send a PING #3161

ianswett opened this issue Oct 28, 2019 · 20 comments
Assignees
Labels
-recovery -transport design has-consensus

Comments

@ianswett
Copy link
Contributor

@ianswett ianswett commented Oct 28, 2019

At the New York QUIC interim, we dealt with the problem of the server being blocked by the amplification factor by making the client arm the PTO until it knew the server had validated the path, even if it had nothing to send.

From the recovery pseudocode:

if (endpoint is client without 1-RTT keys):
    // Client sends an anti-deadlock packet: Initial is padded
    // to earn more anti-amplification credit,
    // a Handshake packet proves address ownership.
    if (has Handshake keys):
        SendOneAckElicitingHandshakePacket()
    else:
        SendOneAckElicitingPaddedInitialPacket()`

When reviewing #3057, I realized that if we allowed the server to send a PING-only packet when the PTO fired and the amplification factor had been reached, then that would elicit a padded Initial ACK frame in response, which also fixes the issue.

@martinduke though that was a simpler solution than the current approach and this is a bit of an edge case, so simple is probably good.

The original issue was #1764

@ianswett ianswett added design -recovery labels Oct 28, 2019
@ianswett ianswett self-assigned this Oct 28, 2019
@martinduke
Copy link
Contributor

@martinduke martinduke commented Oct 28, 2019

In a strict sense, this means that a client can use N bytes to trigger the server to send 3N + epsilon bytes, but this fits in nicely with Probes generally not being subject to sending limits. It is somewhat gross but more elegant that a client timer with nothing in flight.

@ianswett ianswett added -transport and removed design labels Oct 28, 2019
@kazuho
Copy link
Member

@kazuho kazuho commented Oct 29, 2019

In a strict sense, this means that a client can use N bytes to trigger the server to send 3N + epsilon bytes

I think this might be concerning, though I could well be wrong.

Consider the case of an attacker acting as a client, using the IP address of a victim as the source address. The attacker can spoof an Initial packet that carries a Client Hello and another Initial packet that carries an ACK for the server's Initial packet that carried the Server Hello. An attacker can build the correct second packet assuming that the server uses the DCID of the client's Initial packet as it's CID (or the CID is zero length), and that the server's first packet number can be guessed.

If that ACK contains a delay that indicate to the server that the RTT is very small (e.g., 1ms), then the PTO could be as small as that value. Then, if the server's connection timeout is say 1000ms, the server might send 1000 packets to the victim before dropping the connection.

Am I missing something?

@ianswett
Copy link
Contributor Author

@ianswett ianswett commented Oct 29, 2019

PTO always uses exponential backoff, so if the first PTO was 1ms, that'd be ~10 PTOs, not 1000.

In terms of total bytes sent, 10 packets with a 1 byte payload are still smaller than a single full-sized packet.

@huitema
Copy link
Contributor

@huitema huitema commented Oct 29, 2019

Isn't this concern why we designed the Retry mechanism in the first place? If we are concerned about address spoofing attacks, the simple solution might be to generalize retry. As in, "if the server can predict that its first flight will contain more than 3 times the amount of data received with the client's Initial packets, then it SHOULD verify the source address using Retry."

@ianswett
Copy link
Contributor Author

@ianswett ianswett commented Oct 29, 2019

My concern is that the current anti-deadlock mechanism is fairly awkward and rarely needed, which makes me concerned it will be incorrectly implemented. By sending a PING only packet, the server is essentially informing the client it has something to send, but can't send it due to the amplification limit.

The deadlock happens when the client's Initial is acknowledged, but the client doesn't receive the server's Initial, so it can't decrypt any handshake packets. In this state, the client has nothing to send. It can happen even if the server's first flight fits into 3 packets. But it's expected that typically servers will bundle their ACK of the client's Initial with the server Initial, and this will be rare in practice.

@huitema
Copy link
Contributor

@huitema huitema commented Oct 29, 2019

@ianswett: Fair. Due to the risk of amplification attacks, the server is probably not willing to retransmit Initial packets on timer. In that case, the current spec is a bit too lose. It should say that "If the sender is not willing to retransmit Initial packets due to protection against amplification attacks, it MUST NOT acknowledge the client's initial packet." So basically, either the server verifies the client's address with the Retry/Token mechanism, or it MUST NOT acknowledge client initial packets before continuity can be verified by obtaining an ACK from the client.

@ianswett
Copy link
Contributor Author

@ianswett ianswett commented Oct 29, 2019

That's an interesting suggestion, but I think it require some non-trivial changes to the server's logic in addition to not sending ACKs? Something similar to what's described in #3080.

It may be worth opening up an issue for allowing the server to not arm a timer until the path has been validated, because I think that's a wider issue than just this PR?

@huitema
Copy link
Contributor

@huitema huitema commented Oct 29, 2019

I agree that having the server repeat a PING on timer may also work, and that the ACK of the randomly chosen Server's CID would make a reasonable address ownership test. ACK of the PING would naturally unlock any transmission that is pending address ownership.

Kazuho objects that this his hackable if the chosen Server's CID is guessable. We don't have a "non-guessable" requirement in the spec. In that case, the natural fix is the challenge/response mechanism.

@kazuho
Copy link
Member

@kazuho kazuho commented Oct 29, 2019

@ianswett

PTO always uses exponential backoff, so if the first PTO was 1ms, that'd be ~10 PTOs, not 1000.

Ah. Thank you for pointing that out. Though it seems that the amount of the data that the server can send could go like 4x the amount of data that the client has sent rather than the current 3x.

To paraphrase, I think what we might be trading here is the initial amount of data the server can send vs. the capability of sending PINGs. My view is that the former is a scarce resource, and that I'd prefer having the more space in the former even if requires us to have some complexity.

@martinduke
Copy link
Contributor

@martinduke martinduke commented Oct 29, 2019

@kazuho An Initial ping is maybe 45 bytes. So 10 pings makes the amplification ~3.4.

I was under the impression that the 3 multiplier was semi-arbitrary, so I'm unconcerned about a fractional increase.

@janaiyengar
Copy link
Contributor

@janaiyengar janaiyengar commented Oct 30, 2019

I have a couple of thoughts. First, this change is not necessarily an overall simplification. It makes things slightly simpler at the client, but it adds about as much code to the server. Additional code has to be added at the server to the PTO response to only send a PING and not a retransmission when the path is not yet validated.

Second, I think it does open up an interesting attack vector that is currently absent. @kazuho notes
that there is amplification, but I don't think byte amplification is the interesting one here, since the total load is going to be < 1 MTU. The interesting amplification here is packet amplification, which is a useful attack since UDP and packet processing costs at endpoints / clients is a relatively high cost. And you can get a fair number of packets back from the server for spoofing a couple of packets, in spite of the exponential backoff.

I'll grant that the attack is a bit of a stretch, and does require that the attacker guess the server's CID and packet number. Though, as @kazuho pointed out to me offline, we allow the server to use a zero-length CID.

I don't think this change is necessary. And on balance, I don't think this change is a good one.

@janaiyengar
Copy link
Contributor

@janaiyengar janaiyengar commented Oct 30, 2019

I do like the idea of sending a PING frame from the server as a signal to the client that it is blocked. I think we can suggest that as an optimization that a clever server might implement under the 3x limit, or when it reaches the 3x limit.

@kazuho
Copy link
Member

@kazuho kazuho commented Oct 30, 2019

@janaiyengar

I do like the idea of sending a PING frame from the server as a signal to the client that it is blocked. I think we can suggest that as an optimization that a clever server might implement under the 3x limit, or when it reaches the 3x limit.

While endpoints can do that, I am not sure if it is a good idea to endorse such a behavior in the specification.

The reason we have so far been fine with having the 3x limit being applied until the Handshake packets are exchanged is because we assume that the size of ServerHello would not be that large compared to the size of the ClientHello, and therefore that the chance of hitting 3x limit before path validation is minimal. It's a hack.

If we are to address the corner cases that the server hits after sending 3x amount of data, then I think we should fix the hack than putting another hack (i.e. the PING that signals 3x limit) above the existing hack.

I mean, if we think that defining a way that allows the server to unblock the 3x limit is important, we should be considering allowing the use of PATH_CHALLENGE / PATH_RESPONSE frames in Initial packets.

@ianswett
Copy link
Contributor Author

@ianswett ianswett commented Nov 3, 2019

I believe this is simpler to implement assuming one is implementing both a client and server. I already have special case code to send a PING when there are bytes in flight but nothing to send, so this is very similar.

I agree it's not necessary, but I find the simplification appealing, and when we first discussed this problem, no one suggested this solution, so I thought it was worth writing up.

If we keep the existing text that the client has to keep sending, I'm not sure doing this on the server side is a valuable optimization, so I'd prefer to stick with one mechanism or the other.

To me, the main negative is the server sending extra datagrams. And maybe that's enough to stick with what we have.

@janaiyengar
Copy link
Contributor

@janaiyengar janaiyengar commented Nov 5, 2019

I believe this is simpler to implement assuming one is implementing both a client and server. I already have special case code to send a PING when there are bytes in flight but nothing to send, so this is very similar.

When a PTO fires, the endpoint is supposed to try and retransmit something if there is data to retransmit. In this case, you'd need to special-case it so that the server doesn't in fact retransmit but sends a PING instead. That's what I meant by adding more code.

@ianswett
Copy link
Contributor Author

@ianswett ianswett commented Nov 5, 2019

It also SHOULD send a PING only packet if there are bytes in flight but nothing to send.

@janaiyengar
Copy link
Contributor

@janaiyengar janaiyengar commented Nov 5, 2019

Not new data, any data. This was deliberate. See Section 5.3.1: "When the PTO timer expires, and there is new or previously sent unacknowledged data, it MUST be sent." The point of this text was to send data if available, new or previously sent. In the case that we are discussing the server is supposed to send a PING, but it might have data to send.

@mnot mnot added this to Triage in Late Stage Processing Nov 5, 2019
@mnot mnot added the design label Nov 29, 2019
@mnot mnot moved this from Triage to Design Issues in Late Stage Processing Nov 29, 2019
@larseggert
Copy link
Member

@larseggert larseggert commented Feb 5, 2020

Discussed in ZRH. @ianswett to decide if he wants to pursue this, potentially discuss with @MikeBishop and/or @kazuho.

@ianswett
Copy link
Contributor Author

@ianswett ianswett commented Feb 5, 2020

I realized there is one major benefit of keeping the existing design, which is the client is guaranteed to have at least one RTT measurement in this circumstance, whereas the server may not.

For this reason, I'm inclined to close this with no action unless others feel strongly otherwise.

@ianswett
Copy link
Contributor Author

@ianswett ianswett commented Feb 5, 2020

@kazuho came up with the same point simultaneously, so let's close this with no action.

@ianswett ianswett added the proposal-ready label Feb 5, 2020
@project-bot project-bot bot moved this from Design Issues to Consensus Emerging in Late Stage Processing Feb 5, 2020
@LPardue LPardue moved this from Consensus Emerging to Consensus Call issued in Late Stage Processing Feb 19, 2020
@LPardue LPardue removed the proposal-ready label Feb 19, 2020
@LPardue LPardue added the call-issued label Feb 26, 2020
@LPardue LPardue added has-consensus and removed call-issued labels Mar 4, 2020
@project-bot project-bot bot moved this from Consensus Call issued to Consensus Declared in Late Stage Processing Mar 4, 2020
Late Stage Processing automation moved this from Consensus Declared to Text Incorporated Mar 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-recovery -transport design has-consensus
Projects
Late Stage Processing
  
Issue Handled
Development

No branches or pull requests

9 participants