Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify header protection sampling #4435

Closed
larseggert opened this issue Dec 10, 2020 · 7 comments · Fixed by #4436
Closed

Clarify header protection sampling #4435

larseggert opened this issue Dec 10, 2020 · 7 comments · Fixed by #4436
Labels
-tls ietf-lc An issue that was raised during IETF Last Call.

Comments

@larseggert
Copy link
Member

Tracking issue for #4404

@larseggert larseggert added -tls ietf-lc An issue that was raised during IETF Last Call. labels Dec 10, 2020
@larseggert larseggert added this to Triage in Late Stage Processing via automation Dec 10, 2020
@larseggert larseggert linked a pull request Dec 10, 2020 that will close this issue
@larseggert
Copy link
Member Author

@PaulDance wrote in #4404:

The header protection sampling (HPS) is quite simple, as it is just selecting 16 bytes out of the encrypted payload, but we think the current description is really confusing.

Indeed, section 5.4.2 explains the process with pseudocodes computing an offset relative to the beginning of an entire packet which payload is the encrypted one. However, when applying the AEAD function, one cannot do it in-place, so additional allocation must be reserved in any case. Therefore, one most probably has access to the encrypted payload separately from the unprotected packet, headers and payload.

Thus, this PR proposes two things, split in different commits:

Clearly and explicitly specify the two already-existing pseudocodes sample from an entire packet.
Add another pseudocode explaining how the sampling can be achieved for both short-header and long-header packets.
We have not changed or removed any existing content however, as we believe this draft could use some more explanations and clarifications here and there, rather than less with confusing elements, even if it means a bit of duplication.

Please tell us if there is any form of typo, wrong wording, grammatical issue or just general mistake, especially concerning the new pseudocode.

Let's please keep the discussion here (on this issue) and then we'll dispatch #4404 based on the outcome.

@janaiyengar
Copy link
Contributor

This is a fine addition in theory -- it's purely additive -- but I'm not certain that it really helps implementers. The PR will need some editorializing if we do want to take it, but I've not seen a ton of interest in taking this PR in from others. If anyone else wants this in, please speak up.

@PaulDance
Copy link
Contributor

Well I understand your concern @janaiyengar, but in what way does it not help implementers exactly?

When we worked on these functions, we spent several hours being completely wrong in our understanding of the specification, while both parts are actually quite simple. Therefore, we believe it should be just as simple to understand from the first reading. That's why I proposed such a clarification, but also because if it happened to us, then there's a good chance some will get confused as well.

For the first part, although pseudocodes used packet as a designation of the sampled array, the first sentence of the section states that the sampling operates on the ciphertext. This confused us quite a bit, hence the first commit in order to avoid being cryptic.

For the second one, the idea of pseudocodes in the first place is to help implementers, right? From our understanding, the proposed new one should provide a simple and more pragmatic way to achieve the sampling.

Obviously I might be wrong in saying this, but then let me understand why I am so.

@kazuho
Copy link
Member

kazuho commented Dec 11, 2020

I have a mixed feeling about the proposal.

From implementation standpoint, I agree that it helps people understand the design if we clarify the sampling offset relative to encrypted data. In fact, the first paragraphs of that section already states that. OTOH, only the sender can implemented by using that property. The receiver has to determine the starting offset of the sample relative to the starting offset of the packet number, because the starting offset of encrypted data cannot be determined until the length of packet number is determined as part of header unprotection. Therefore, I think it is beneficial to retain current definition of packet protection which is talks about offsets relative to packets.

However, when applying the AEAD function, one cannot do it in-place, so additional allocation must be reserved in any case. Therefore, one most probably has access to the encrypted payload separately from the unprotected packet, headers and payload.

I do not agree with this statement for two reasons. The first reason is that, as stated above, receivers would never have access to encrypted payload before unprotecting the header. The second reason is that I tend to assume senders to do AEAD encryption in-place. They prepare a MTU-sized buffer, writes unprotected packet header, then fills the payload bytes, leaving at least 16 bytes of extra space at the end to hold the AEAD tag. Then, it mutates the buffer using AEAD, then using header protection. That's how quicly works, and I would be surprised if we are the only one implementation taking that approach.

To summarize, the proposed description of the alternative approach only works on the sender-side, and it is unclear to me if it helps many implementers.

@PaulDance
Copy link
Contributor

PaulDance commented Dec 11, 2020

[...] receivers would never have access to encrypted payload before unprotecting the header.

Yes you are right @kazuho, when receiving, one cannot determine the limit between the protected packet number and the payload.

However, it seems to me that if one has the ability to parse a series of bytes into a form of packet structure with a sort of best-effort method, then they could start by doing so and it would result in a partially parsed packet where the packet number length would be completely wrong. But if that wrong packet number length (the header field + 1) is made to be consistent with the parsed packet number's bytes length (len(packet_number)) during that parsing, then it would not interfere with the header protection sampling because 4 - len(packet_number) is the offset relative to the (wrong) start of the encrypted payload required in order to consider the packet number length to be the maximum value 4, which is exactly what needs to be achieved for the sampling. This is the strategy we have adopted in our implementation and from our tests, it works for both the sender's and the receiver's sides.

Of course that strategy does not work for the rest of operations required to unprotect the packer number field by considering the packet_number + payload array of bytes and then decrypt the payload from its deduced correct start.

Correct me if I'm still wrong here.

[...] senders do AEAD encryption in-place.

I was not aware it was possible, sorry to have been wrongly assertive. It's just that the AEAD implementation we used does not work in-place, so I just assumed it would probably be the case elsewhere too.

[...] I would be surprised if we are the only one implementation taking that approach.

I suppose you wouldn't be, especially for implementations having performance as a critical objective. But then, I would also be surprised if we were the only implementation not doing it in-place.

[...] it is unclear to me if it helps many implementers.

Is "helping some implementers" not enough for such a specification or would it add too much noise? I just assumed RFCs were meant to be as unambiguous, clear and easy-to-implement as possible, hence this proposal being additive.

@kazuho
Copy link
Member

kazuho commented Dec 11, 2020

@PaulDance I'm not sure if I follow, but I assume that your point can be generalized as that the receiver will know where the packet number field begins, and use that to figure out where the sample is. I agree with that.

Is "helping some implementers" not enough for such a specification or would it add too much noise?

Yes, this issue can be viewed as finding the right balance between helpful to some vs. adding noise to others.

But taking a step back, I tend to think that the current example is confusing to everybody, because it replicates how an endpoint should parse entire packet header, where people would already have parsed all the plaintext part of the packet header before trying to unprotect the header.

Hence filed #4436. I'm not sure if the PR fully resolves @PaulDance's concern, but anyways.

Late Stage Processing automation moved this from Triage to Issue Handled Dec 11, 2020
@PaulDance
Copy link
Contributor

it replicates how an endpoint should parse entire packet header

You're right, because it assumes the implementation to have a form of access to where the payload could start, but probably does not. I still think both approaches could be presented without too much confusion, but oh well ¯\_(ツ)_/¯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-tls ietf-lc An issue that was raised during IETF Last Call.
Projects
Late Stage Processing
  
Issue Handled
4 participants