Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This proposal replaces all integer encodings in frames with a unified variable-length encoding format. It simplifies the encoding considerably, most noticeably in the ACK frame, which I have largely rewritten. This change increases the number of streams to 2^62, but reduces the number of packets and octets per stream to that same number. Both are very large values anyway. I kept the 2^10 scaling on the connection-level flow control window, so that goes to 2^72 now. The cost for ACK is a negligible decrease in efficiency for larger gaps or ACK blocks (mainly between 64 and 255 in length). This is counteracted by an increase in efficiency if there are occasional ranges or gaps larger than 255, which were previously very difficult to encode correctly. As previously discussed, the timestamp format is removed. In its place, we use the same integer encoding, but with a scaling value, or exponent, that is signaled in transport parameters, that should allow common ACK delays to be expressed in fewer octets (an exponent of 0 only allows for 16ms to be encoded in two octets, but the default exponent is 8, allowing up to 128ms of ACK delay to be encoded on two octets with an 8us resolution). I haven't made any changes for HTTP yet, I plan to do that in a separate PR.
- Loading branch information
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also https://github.com/martinthomson/quic-int
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to this proposal (IIRC it's pretty much what was discussed at the Paris interim).
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also my review comment on integer encoding format. I should probably have written those comments here instead.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally like this proposal, though I do have one comment related to the ACK frame changes. WinQuic tends to parse and build these frames from smallest to largest packet number. It used to just require a minor equation to calculate the end based on the number of blocks/gaps. That would no longer be possible with this design.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the ACK block encoding, I find it hard to read, and have always had problems with interpreting the text since I first encountered QUIC.
I propose removing the largest acked since it isn't really guaranteed to be the largest the peer has seen.
Instead have the following structure:
"There is a least one packet acknownledged by the ACK frame and it is identified by the First Ack Packet. The First Ack Block Length indicates the number of consequtive packets being acknowledged, starting with the First Ack Packet. Additional ranges can be ackowledged by skipping Gap length packet numbers to reach the start of the next Ack Block and the length of this block is given the Ack Block length field. Pairs of (Gap Length, Ack Block Length) can be repeated zero or more times until the Gap length is encoded as a zero."
Note: there is special case of only acknowleding one packet where the First Ack Block Length could be encoded in a more compact format. If this is relevant, The First Ack Block could be the only value present and have a dedicacated flag to encode this.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Github went berserk with the comments, but I would say I think this is pretty good. I added some comments.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a more dramatic change to the design that would allow in-order construction, avoid negative-valued overflows, and would save an occasional octet:
ACK = Length *(Gap ACK_Block) Delay
Assuming that gaps and blocks are all positive values rather than negatives. It makes acquiring the largest acknowledged more difficult (it would be the sum of the sequence of gaps and blocks), but it would avoid some of the stranger parts of the current design (and address @nibanks concern about ordering).
I wanted to do that separately though. I'm trying to make only one change at a time.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this proposal is basically good.
That said, I'll ask again that it's a lot easier to discuss issues if they aren't all conflated in one PR. I understand that they seem related, but then the entire protocol is related. This PR purports to change the integer encoding, but it's revamping the ACK frame structurally and changing the ack delay time format in addition. I would ask that you please separate them into separate PRs.
If you want to change the design of the ACK frame format, please make that a separate issue, and let's agree on what's useful there.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the time delay format, would it be too simple to just have a bit in the ACK flags that specifies milliseconds or microseconds for that frame? Does 1 us - 64 sec (or 16 sec assuming Martin's other change) give us enough dynamic range for the vast majority of applications in a 2B encoding?
This would also allow the scaling to be per-ack-frame rather than something negotiated, which isn't a big deal, but meh.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the 2 bits to encode the length, I see significant pros and cons for the simplicity of implementation. It's good to get rid of the thicket of different encodings to figure out how big the fields are. That's probably worth losing the 0-byte offset option, etc.
I also agree that this handles ACK corner cases better.
On the other hand, this makes it hard to figure out how big anything is supposed to be before parsing the entire thing. This is somewhat annoying when you're doing this in kernel space and dealing with skbuff-like structures, and means I have to check for the end of the datagram (a malformed frame) on essentially every field in an ACK, rather than once per frame.
I would also like to understand how this affects long form headers. The fixed size of fields in long form headers is an important innovation that makes things much easier to handle between versions, etc.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikkelfj wrote:
There are strong incentives for it to be so. If I always send an accurate largest_acked, then my PCB only need to store the receive time of the highest packet number. If l choose largest_acked to be any other packet, I have to essentially store the receive time of that packet as well.
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinduke, this only affects frame encodings. The packet headers don't change at all.
I have retained the 0-byte offset (and the remainder-of-the-packet length flag) so that this is a minimal change. We could consider its value separately.
Are you arguing for length-prefix rather than count-prefix? I think that we'll probably want to poll implementers about that point. The cost of this sort of encoding is that you have to read the frame to learn how long it is. The trade-off is, I think, between different decoder costs (length allows you to skip the frame and parse at leisure, count allows you to know how many ranges to allocate), vs. encoder costs (length seems to be strictly harder because it requires that you encode or calculate first and then backfill the length, this in part is why the variable integer encoding permits longer encodings to include small values).
f2f8113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinthomson I might have been confused by the changes at the very top regarding packet numbers. Perhaps this only applies to packet numbers in ack frames? Perhaps some clarifying language is in order.
I apologize for not reading the PR more carefully to see the limited scope of this change. To be honest, if this only applies to select fields in certain frames, while other fields (encoding the same thing!) use the old style encoding, I'm less in love with the idea. If we're going to do this, I'd just as soon make this standard for all integers except for the ones that aren't really numbers (just Conn ID and Version, I think). We can make a special rule that Packet Number in Long Headers MUST have the 10 encoding. Then we free up tons of bits in flags fields all the way through QUIC.
Alternatively, make the ACK frame a special snowflake with different encodings and put all the notation language there. ACK already has odd encodings, and this seems to be the only straightforward way to solve certain problems in the ACK..
I am not sure what I am arguing for. This problem seems limited to the ACK frame, so perhaps the answer is to do both. I agree that there are significant drawbacks to either approach. I'll mull it over a bit more.