Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplication can use different indexing #1128

Closed
martinthomson opened this issue Feb 21, 2018 · 15 comments
Closed

Duplication can use different indexing #1128

martinthomson opened this issue Feb 21, 2018 · 15 comments
Labels
-qpack design An issue that affects the design of the protocol; resolution requires consensus. has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list.

Comments

@martinthomson
Copy link
Member

A duplication instruction never references the static table. Currently however, this is possible, and it's not a great idea.

Aside from it being pointless to duplicate entries in the static table (it costs space and time to do this, and references to these new entries will have longer encodings than static table references), in the current design the encoding of the duplication instruction is such that virtually every duplication instruction has to use a two octet encoding because exactly one entry in the dynamic table can be duplicated in one octet. Duplication of this entry is completely useless because duplicating it does nothing to improve its position in the table.

@martinthomson martinthomson added design An issue that affects the design of the protocol; resolution requires consensus. -qpack labels Feb 21, 2018
@mikkelfj
Copy link
Contributor

mikkelfj commented Feb 21, 2018 via email

@afrind
Copy link
Contributor

afrind commented Feb 21, 2018

One proposed solution comes from the QPACK draft, which is to use a bit in the instruction to indicate if a referenced index is static or dynamic. I found that this actually simplified parts of the implementation compared to HPACK/QCRAM's unified index space. If we separate the instruction space, then we can define duplication to always operate on the dynamic table index space. It also may compress slightly better, since more dynamic indexes can be encoded by a single byte.

@afrind
Copy link
Contributor

afrind commented Feb 21, 2018

@mikkelfj : is the -qcram tag insufficient to differentiate? Would you prefer something in the issue title?

@LPardue
Copy link
Member

LPardue commented Feb 21, 2018 via email

@mikkelfj
Copy link
Contributor

@afrind yes QCRAM in the title is fine, or qcram:, just something. The issue is with email as @LPardue says.

@martinthomson
Copy link
Member Author

@afrind, the extra bit is less efficient overall (see also why arithmetic coding is superior to Huffman). Especially given that the static table is a fixed size. I don't find the indexing that onerous, so I don't think that the extra bit would help. That is, unless we get a proposal for a new, bigger static table. Having references to the dynamic table start at (for example) 200, would make the bit a good investment.

@afrind
Copy link
Contributor

afrind commented Feb 22, 2018

If we leave the on-the-wire index space unified, is the proposal to use a different indexing scheme for this instruction only (eg: skip the + 62)? I'm not crazy about having two different indexing schemes in the design, but otherwise it would solve the issues you raise above.

@martinthomson
Copy link
Member Author

Well, let's decide the other thing first, because that will drive any design here. (-62 is the obviously choice assuming we change nothing else).

@MikeBishop
Copy link
Contributor

Actually, the optimal change for length's sake would be to use a totally different reference point. You typically want to duplicate the things that are about to fall off the end, so count from the oldest item that hasn't been dropped yet. Since you can only send this on the control stream, both sides have a synchronized view about what that is.

That's not great for comprehension, necessarily, but....

@martinthomson
Copy link
Member Author

Right. That was in my original write-up. The reverse indexing is the most efficient.

@afrind
Copy link
Contributor

afrind commented Feb 26, 2018

There's already of 3 different ways indexes are written on the wire: Absolute Index, Hybrid (relative an encoded base), and Relative (HPACK style, relative to head). An implementation may also have an internal indexing scheme (eg: indexes into an array of headers). How many bytes are we going to save introducing a Relative-to-end style? A few every time user-agent and Accept get near the end of the table? I'd prefer to burn the bytes and keep the draft a bit simpler.

@MikeBishop
Copy link
Contributor

MikeBishop commented Feb 26, 2018

Depends on how the instruction space lands, but one byte per duplicate instruction seems a likely outcome.

@afrind
Copy link
Contributor

afrind commented Feb 26, 2018

My question was how often to we wrap around and need to duplicate, and how many will need duplication? I assume it takes several or dozens of requests of normal usage to wrap. How many headers will implementations want to duplicate when a wrap occurs? I can see a browser re-adding the handful that are sent on every request. Also note that this is all gravy over HPACK which would just evict and re-insert.

@afrind
Copy link
Contributor

afrind commented Mar 9, 2018

In my simulation, duplication is a very rare instruction. With a 4k table, my 250 request HAR issued 4 Duplicate instructions. With an 8kb table, it never happened.

If we forgo splitting the instruction space, but only change the control instructions so that Duplicate is reuses Indexed (and keep Table Size Update the same), and we subtract 62, then we could duplicate 63 entries in a single byte, and I don't think there's a big advantage for reversing the index.

@MikeBishop
Copy link
Contributor

Discussed with Alan; absolute indices grow without bound, so they're not a good option. Most tables have less than 1k entries, which can be represented in at most three bytes. You could index from the end of the dynamic table and make it always one byte, but that introduces a fourth (!!!) way to reference entries in QPACK, which seems less than ideal.

Inclination is to leave this alone unless we have data that says Duplicate instructions are getting intractably large.

@mnot mnot removed this from Headers in HTTP May 23, 2018
@mnot mnot added the has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list. label Mar 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-qpack design An issue that affects the design of the protocol; resolution requires consensus. has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list.
Projects
None yet
Development

No branches or pull requests

6 participants