Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error code for acknowledgment of a packet that was never sent #2319

Closed
marten-seemann opened this issue Jan 9, 2019 · 14 comments
Closed

Error code for acknowledgment of a packet that was never sent #2319

marten-seemann opened this issue Jan 9, 2019 · 14 comments
Labels
-transport design An issue that affects the design of the protocol; resolution requires consensus. has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list.

Comments

@marten-seemann
Copy link
Contributor

As discussed in #2302 (comment), I'd like to advocate for introducing a new error code for loss recovery-related errors, e.g. when an unsent packet (in a respective packet number space) is acknowledged.

In general, we have transport error codes to describe the area where an error occurred. For example, we have a STREAM_STATE_ERROR, which is used when the peer sends frames for the wrong stream type or unopened stream types. And we have a FLOW_CONTROL_ERROR, for, obviously, flow-control related errors.

Generating ACKs for multiple packet number spaces is not a trivial thing to do, and it would be of great help for debugging to have a more specific error here than the catch-all PROTOCOL_VIOLATION.

@mikkelfj
Copy link
Contributor

mikkelfj commented Jan 9, 2019

I think this is best served in the associated error message string.

In production this is a real attack, and less information to the attacker is a good thing.

@marten-seemann
Copy link
Contributor Author

Then why do we have different error codes stream states, flow control etc.? There are real attacks here as well, and I don't see why they would be less severe than loss recovery-related attacks. At least we should be consistent.
If we really say that all errors just give an attacker more information, let's kill all error codes except NO_ERROR and SERVER_BUSY.

@mikkelfj
Copy link
Contributor

mikkelfj commented Jan 9, 2019

(I sense sarcasm) MQTT tried to do that, but people were unhappy so they added more error codes, e.g. when incorrectly connecting multiple clients with the same client ID which was an easy mistake.

@mikkelfj
Copy link
Contributor

mikkelfj commented Jan 9, 2019

More seriously, an attacker could use the error code to find a way to avoid hitting gaps in optimistic acks with high probability. Granted, the attacker could probably also use PROTOCOL_VIOLATION to infer that, once it is confident it has a working implementation.

@ianswett
Copy link
Contributor

ianswett commented Jan 9, 2019

I'd like to add an error code for this.

In general, I don't think we have a consistent view on error codes and error strings at the moment. We removed a lot of error codes around the time of the Paris interim, many of which could not be sent on the wire and so really needed to be removed. But we haven't added many(if any?) back and this seems like a good one to add.

But if we think the error code is providing too much information to the peer, then error details is almost certainly even more, unless it's always empty, in which case what's the point?

@mikkelfj
Copy link
Contributor

mikkelfj commented Jan 9, 2019

The error details can be flagged out in production and used for debugging. Perhaps there should be some guidelines on that.

@mikkelfj
Copy link
Contributor

mikkelfj commented Jan 9, 2019

Actually, optimistic ACK's could also be defended against by ignoring the invalid ACK's and silently downgrade the connection to min CWND. That would provide even less feedback to an attacker. But of course an error is needed for debugging. Not sure I would recommend this - but it is an option.

@kazuho
Copy link
Member

kazuho commented Jan 9, 2019

@marten-seemann

introduce an error code for loss recovery-related errors

I like the way you frame the issue. It is my understanding that we have so far assigned error codes for each feature rather than every one of the specific errors. Calling it "loss recovery-related errors" certainly follows that practice.

The question is if there is a group of errors that fall into the category. Do you have something other than ACK containing an invalid packet number in mind?

@marten-seemann
Copy link
Contributor Author

The question is if there is a group of errors that fall into the category. Do you have something other than ACK containing an invalid packet number in mind?

I can think of three different cases why an ACK would contain an invalid packet number:

  • the obvious case: you ack a too high packet number, which wasn't sent yet
  • you ack in the wrong packet number space. This will most likely look like you acked a packet that the peer didn't send (in that packet number space)
  • I assume that most implementations will sooner or later implement skipping of packet numbers as a mitigation of the optimistic ack attack. This error would occur when you ack a packet number that was intentionally skipped.

@martinthomson martinthomson added design An issue that affects the design of the protocol; resolution requires consensus. -transport -recovery labels Jan 9, 2019
@mnot mnot added this to Recovery in Transport / Recovery / TLS Jan 10, 2019
@kazuho
Copy link
Member

kazuho commented Jan 10, 2019

@marten-seemann Thank you for elaborating. To me it sounds that they would look like one symptom to the reciever; i.e. an invalid PN in ACK.

I was wondering if there is a intent to use the error code for other symptoms, because it would imply that there are group of errors that might be categorized as a group (based on my understanding that we assign error codes to each feature or category).

Though I am not strongly opposed to having an error code for this case, I fear if having an error code for this opens can of worms (i.e. "we have error code for this, so why not have another error code for X"). Therefore am trying to see what the line we would be trying to hold is.

@ianswett ianswett moved this from Recovery to Close in Transport / Recovery / TLS Jan 10, 2019
@kazuho
Copy link
Member

kazuho commented Jan 12, 2019

I must have been dumb, but I now realize that the proposed error code would be equivalent of a connection error of (error_code=PROTOCOL_VIOLATION, frame_type=ACK)? Am I correct?

Assuming that is correct, I wonder if we achieve anything by having a new error code.

@ianswett
Copy link
Contributor

@kazuho you make a good point. I think it'd be preferable to remove the frame type and have more error codes, but I think we'll get to have that larger discussion in Tokyo.

@martinthomson martinthomson changed the title introduce an error code for loss recovery-related errors Error code for acknowledgment of a packet that was never sent Jan 30, 2019
@huitema
Copy link
Contributor

huitema commented Jan 30, 2019

This error condition is a symptom of the "optimistic ack" attack. If I am under attack, I don't want to give more info to the attacker. Generic connection close is fine.

@mnot
Copy link
Member

mnot commented Feb 8, 2019

In Tokyo: close with no action.

@mnot mnot removed the yaec! label Feb 12, 2019
@mnot mnot added the has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list. label Feb 24, 2019
@mnot mnot closed this as completed Feb 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-transport design An issue that affects the design of the protocol; resolution requires consensus. has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list.
Projects
No open projects
Development

No branches or pull requests

8 participants