Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace HPACK integer encoding #1148

Closed
krasic opened this issue Mar 1, 2018 · 6 comments
Closed

Replace HPACK integer encoding #1148

krasic opened this issue Mar 1, 2018 · 6 comments
Labels
-qpack design An issue that affects the design of the protocol; resolution requires consensus. has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list.
Projects

Comments

@krasic
Copy link

krasic commented Mar 1, 2018

Using the same data as an in earlier thread, I noticed that a decent percent of bytes go to indexed representations, about 10% the example I looked at. In this particular example, every indexed representation encoded in a single byte.

If instead of byte-aligning every instruction as HPACK requires, only the entire header block were byte aligned, I could imagine saving two or more bits per indexed instruction (minus some bits of padding per header block), an average of two bits reduced would be around 2.5% overall savings. I happen to think this will be a bigger compression win than a lot of the other optimizations in flight*.

Part of my premise here is that for many connections the number of active dynamic table entries is relatively small, on the order of 64 or less. I haven't sought out the best varint scheme. A simple starting point for thought, one might construct a static huffman table, where probability is index value, i.e. smallest indices encode with fewest bits. ( Byte alignment of the whole block might use a few extra bits of padding )

  • As in the PR for separate instruction sets in 1141.
@MikeBishop
Copy link
Contributor

Interesting idea. I like the potential, though I'm having trouble visualizing exactly what that would look like.

Perhaps a less drastic form, particularly if we wind up with an algorithm that tends to encourage two passes: We could make the indexed representation carry a list of indices, rather than a single index, and byte-pack that. So you send your list of indexed representations as a few long instructions, but everything else separately.

@martinthomson martinthomson added design An issue that affects the design of the protocol; resolution requires consensus. -qpack labels Mar 2, 2018
@martinthomson
Copy link
Member

Buck, do you think that you can find enough data to do a frequency analysis on the indices to inform this?

@mnot mnot added this to Headers in HTTP Mar 6, 2018
@krasic
Copy link
Author

krasic commented Mar 9, 2018

I'll try to take a stab next week.

@krasic
Copy link
Author

krasic commented Mar 14, 2018

Using the simulator, below is frequency data of HPACK indices for the har I've been using.

i pdf cdf
2 7.9 7.9
3 0.4 8.3
6 8.3 16.6
32 1.2 17.8
62 1.6 19.4
63 0.9 20.3
64 0.4 20.7
65 0.7 21.3
66 0.6 21.9
67 0.3 22.2
68 0.4 22.6
69 1.3 23.9
70 2.2 26.1
71 1.3 27.5
72 2.3 29.8
73 2.1 31.9
74 1.4 33.2
75 2.3 35.5
76 1.9 37.4
77 1.9 39.4
78 2.3 41.7
79 2.7 44.3
80 3.7 48.0
81 2.1 50.1
82 2.8 52.8
83 3.8 56.6
84 2.6 59.2
85 2.8 62.1
86 2.8 64.8
87 3.0 67.8
88 3.4 71.2
89 4.1 75.3
90 4.2 79.5
91 1.6 81.1
92 4.1 85.2
93 0.5 85.7
94 3.8 89.5
95 1.6 91.2
96 3.7 94.9
97 2.6 97.5
98 2.3 99.7
99 0.3 100.0

@MikeBishop
Copy link
Contributor

Interesting. So my immediate take-away is that (for this one .har file) you touch only a few static table entries in the first place, but those you touch, you touch fairly often. However, if we were to do a Huffman-based integer encoding, we'd need to do this across a lot of data, both successful and unsuccessful.

@afrind
Copy link
Contributor

afrind commented Jun 4, 2018

QPACK seems to have met its goal of being close to HPACK without this additional complexity.

@afrind afrind closed this as completed Jun 4, 2018
@mnot mnot added the has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list. label Mar 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-qpack design An issue that affects the design of the protocol; resolution requires consensus. has-consensus An issue that the Chairs have determined has consensus, by canvassing the mailing list.
Projects
No open projects
HTTP
Headers
Development

No branches or pull requests

5 participants