Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Reorganize QPACK #1852

Merged
merged 4 commits into from Nov 21, 2018
Merged

WIP: Reorganize QPACK #1852

merged 4 commits into from Nov 21, 2018

Conversation

afrind
Copy link
Contributor

@afrind afrind commented Oct 11, 2018

Added a "Compression Process Overview" section near the top, with a high level description of how to compress/decompress. This section now encompasses a lot of what used to be "Encoding Strategies".

In the places where I added new text, I temporarily added HTML comments to indicicate so it can get a bit more detailed review.

I also deleted a few sentences/paragraphs/sections that I found to be redundant:

  • Preventing Eviction Races
  • "An encoder also respects..."
  • "For header blocks encoded in..."
  • Single Pass Encoding
  • "All table updates occur on..."

@afrind afrind added -qpack editorial An issue that does not affect the design of the protocol; does not require consensus. labels Oct 11, 2018
Copy link
Member

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an improvement (feel free to defer some of the comments, I got a little carried away).


Base Index:

: An absolute index in a header block from which relative indices are made
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Period.


<!-- new -->
An encoder encodes a header list by emitting either an indexed or a literal
representation of each header field in the list. Index references to the static
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/of/for

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with the existing wording; these are representations of the headers. But emitting one for one is also a valid way to construct it.

An encoder encodes a header list by emitting either an indexed or a literal
representation of each header field in the list. Index references to the static
table and literals do not require any dynamic state and never risk head-of-line
blocking. References to the dynamic table do risk head-of-line blocking if the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/do//


To ensure that the encoder is not prevented from adding new entries, the encoder
can avoid referencing entries that will be evicted soonest. Rather than
reference such an entry, the encoder SHOULD emit a Duplicate instruction (see
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/SHOULD/can - it's not a normative requiement

reference. Draining entries - entries with an absolute index lower than the
draining index - will not accumulate new references. The number of
unacknowledged references to draining entries will eventually become zero,
making the entry available for eviction.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the last two sentences, suggest: "If the encoder does not create new references to entries with an absolute index lower than the draining index, the number of unacknowledged references to those entries will eventually become zero, allowing the entry to be evicted." That loses the "draining entries" thing, but I would then change the picture to say "Unreferenceable Entries".

|
V
+---+-----+-----+-----+-----+
| n | n-1 | n-2 | ... | d+1 | Absolute Index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be clearer, I think.

The 2 here (in n-2) is confusing. What you really want is a diagram that shows the different values in a block.

Largest reference will be the largest value, that's your baseline.
Base offset is taken relative to the largest reference. Here it's 2, which is confusing. We need to show examples of where base allows for post-base references and one where it doesn't.

Including entries that have been evicted in this diagram confuses the matter. It would be better to rely on text or to show that new entries appear on the left and old entries are evicted on the right.

Does this help?

Indexing Overview

    <- Absolute Indices (increase to left)
          L = Largest Reference
          B = Base Reference

       Newer                     Older
    <- Entries                 Entries ->
       Added                   Evicted

Indexing (simple case, B == L):
       ---+---+---+---+---+---+---+---+---
       ...| L |L-1|L-2|L-3|L-4|L-5|L-6|...
       ---+---+---+---+---+---+---+---+---
Index:      0   1   2   3   4   5   6  ...

Indexing (B > L):
       ---+---+---+---+---+---+---+---+---
       ...| B |B-1|...| L |L-1|L-2|L-3|...
       ---+---+---+---+---+---+---+---+---
Index:      x   x   x   D  D+1 D+2 D+3 ...
            x = invalid index
                        D = B - L

Post-Base Indexing (where B < L)
       ---+---+---+---+---+---+---+---+---
       ...| L |...|B+2|B+1| B |B-1|B-2|...
       ---+---+---+---+---+---+---+---+---
Index:                      0   1   2  ...
Post-Base:  P  ...  1   0
            P = L - B - 1

- Finally, the contents of HEADERS and PUSH_PROMISE frames on request streams
and push streams reference the QPACK table state.

<!-- s/exactly/no more than/ ? -->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's leave that question for the moment.

@@ -618,6 +715,7 @@ An encoder that receives an Insert Count equal to zero or one that increases
Largest Known Received beyond what the encoder has sent MUST treat this as a
connection error of type `HTTP_QPACK_DECODER_STREAM_ERROR`.

<!-- move? -->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that this belongs in the decoder section.

@@ -633,7 +731,8 @@ After processing a header block whose declared Largest Reference is not zero,
the decoder emits a Header Acknowledgement instruction on the decoder stream.
The instruction begins with the '1' one-bit pattern and includes the request
stream's stream ID, encoded as a 7-bit prefix integer. It is used by the
peer's QPACK encoder to know when it is safe to evict an entry.
peer's QPACK encoder to know when it is safe to evict an entry, and possibly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/QPACK// ?

@@ -633,7 +731,8 @@ After processing a header block whose declared Largest Reference is not zero,
the decoder emits a Header Acknowledgement instruction on the decoder stream.
The instruction begins with the '1' one-bit pattern and includes the request
stream's stream ID, encoded as a 7-bit prefix integer. It is used by the
peer's QPACK encoder to know when it is safe to evict an entry.
peer's QPACK encoder to know when it is safe to evict an entry, and possibly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and remove the possessive.


Header list:

: The ordered collection of header fields associated with an HTTP message. A
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a name-value pair, but the ordered collection? For comparison RFC7541 has "an ordered collection."

x ...
: Indicates that x is variable-length and extends to the end of the region.

# Compression Process Overview

Like HPACK, QPACK uses two tables for associating header fields to indices. The
static table (see {{table-static}}) is predefined and contains common header
fields (some of them with an empty value). The dynamic table (see
{{table-dynamic}}) is built up over the course of the connection and can be used
by the encoder to index header fields repeated in the encoded header lists.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to index header fields repeated in the encoded header lists

A header field does not need to be repeated in order to be indexed and referenced.


Like HPACK, QPACK uses two tables for associating header fields to indices. The
static table (see {{table-static}}) is predefined and contains common header
fields (some of them with an empty value). The dynamic table (see
{{table-dynamic}}) is built up over the course of the connection and can be used
by the encoder to index header fields repeated in the encoded header lists.

- The encoder uses a unidirectional stream to modify the state of the dynamic
table without generating header fields to any particular request.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean to generate header fields to a request?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps "without emitting header fields associated with any particular request"?

table state without modifying it.

- The decoder sends feedback to the encoder on a unidirectional stream that
enables the encoder to manage dynamic table state.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which enables the encoder to manage state: decoder feedback or the unidirectional stream?

## Encoder

<!-- new -->
An encoder encodes a header list by emitting either an indexed or a literal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An encoder encodes ...

A better wording could probably be found. For example:

  • An encoder compresses a header list... ; or
  • An encoder creates a compressed representation of a header list.

index, which is the smallest absolute index in the dynamic table that it will
emit a reference for. As new entries are inserted, the encoder increases the
draining index to maintain the section of the table that it will not
reference. Draining entries - entries with an absolute index lower than the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use two dashes for the long dash.

reference. Draining entries - entries with an absolute index lower than the
draining index - will not accumulate new references. The number of
unacknowledged references to draining entries will eventually become zero,
making the entry available for eviction.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Number agreement: "draining entries" vs "the entry."

decoder's Largest Known Received entry.

When blocking references are permitted, the encoder uses acknowledgement of
header blocks to identify the Largest Known Received index, as described in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/acknowledgement of header blocks/header block acknowledgements/

header lists. It also processes dynamic table modifications from instructions on
the encoder stream.

A decoder MUST must emit header fields in the order their representations appear
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a decoder here, but the decoder in the previous paragraph? The draft uses both "a decoder" and "the decoder;" I cannot detect a method to it.

@@ -123,6 +342,10 @@ addressed.
The static table consists of a predefined static list of header fields, each of
which has a fixed index over time. Its entries are defined in {{static-table}}.

<!--new -->
Note the QPACK static table is indexed from 0, whereas the HPACK static table
was indexed from 1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use present tense: HPACK static table is indexed.


: A unique index for each entry in the dynamic table.

Base Index:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we list the Base Index here, we should also have the Largest Reference.

header block might reference an entry in the dynamic table that has not yet been
received.

Each header block contains a Largest Reference {{header-prefix}} which
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add parentheses around {{header-prefix}}

Copy link
Contributor

@MikeBishop MikeBishop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments so far; I find myself adding +1s to a lot of existing comments. :-)


: An absolute index in a header block from which relative indices are made

QPACK is a name, not an acronym.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣


Like HPACK, QPACK uses two tables for associating header fields to indices. The
static table (see {{table-static}}) is predefined and contains common header
fields (some of them with an empty value). The dynamic table (see
{{table-dynamic}}) is built up over the course of the connection and can be used
by the encoder to index header fields repeated in the encoded header lists.

- The encoder uses a unidirectional stream to modify the state of the dynamic
table without generating header fields to any particular request.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps "without emitting header fields associated with any particular request"?


Like HPACK, QPACK uses two tables for associating header fields to indices. The
static table (see {{table-static}}) is predefined and contains common header
fields (some of them with an empty value). The dynamic table (see
{{table-dynamic}}) is built up over the course of the connection and can be used
by the encoder to index header fields repeated in the encoded header lists.

- The encoder uses a unidirectional stream to modify the state of the dynamic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bulleted list feels like it comes out of nowhere. Perhaps this could be turned into something more narrative, or at least introduced?


<!-- new -->
An encoder encodes a header list by emitting either an indexed or a literal
representation of each header field in the list. Index references to the static
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with the existing wording; these are representations of the headers. But emitting one for one is also a valid way to construct it.


An encoder can decide whether to risk having a stream become blocked. If
permitted by the value of SETTINGS_QPACK_BLOCKED_STREAMS, compression efficiency
can be improved by referencing dynamic table entries that are still in transit,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this doesn't always result in an improvement, perhaps "can often" or "can sometimes"?

can be improved by referencing dynamic table entries that are still in transit,
but if there is loss or reordering the stream can become blocked at the decoder.
An encoder avoids the risk of blocking by only referencing dynamic table entries
which have been acknowledged, but this means using literals. Since literals make
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, "this could mean"


### Largest Known Received

For the encoder to identify which dynamic table entries can be safely used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"In order to identify...."

{{table-state-synchronize}}).


### Speculative table updates {#speculative-updates}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this section doesn't qualify to be a separate section. This is inherent in the fact that table updates and encoding a header set are separable operations. I'd perhaps condense this to a single sentence and mention it in 2.1?

Updates to the dynamic table can be performed independent of any particular header set being compressed. Encoders MAY insert entries speculatively.

Title needs to be capitalized if retained.


<!-- new -->

Like in HPACK, the decoder processes header blocks and emits the corresponding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More colloquial than I'd choose; "As in HPACK..."?


Regardless of whether a header block contained blocking references, the
knowledge that it has been processed permits the encoder to evict
entries to which no unacknowledged references remain; see {{blocked-insertion}}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we only acknowledge blocks with dynamic references, perhaps this should read "Knowledge that a header block which contains references to the dynamic table has been processed permits [...], regardless of whether those references were potentially blocking."

Copy link
Contributor

@MikeBishop MikeBishop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments so far; I find myself adding +1s to a lot of existing comments. :-)

Added a "Compression Process Overview" section near the top, with a high level description of how to compress/decompress.  This section now encompasses a lot of what used to be "Encoding Strategies".

In the places where I added new text, I temporarily added HTML comments to indicicate so it can get a bit more detailed review.

I also deleted a few sentences/paragraphs/sections that I found to be redundant:

- Preventing Eviction Races
- "An encoder also respects..."
- "For header blocks encoded in..."
- Single Pass Encoding
- "All table updates occur on..."
Not addressed:

1. I left the drawing with 'Draining Entries' without a specific definition in the text.  I think it can be interpreted OK, and 'Unreferencable' isn't actually true.

2. With respect to leaving blocked data in flow control, I changed should to SHOULD instead of can.  My understanding is that you don't have to do a SHOULD, if you have a good reason?  It's important enough that we want to convey more than ability.

3. I skipped redoing the indexing diagram for its own commit

4. I haven't moved the TSS guidance out of the TSS instruction section yet.  I can't lift the whole paragraph without refactoring some of the other text.

5. I think peer's encoder/decoder is correctly possessive.  Doesn't the coder belong to the peer?

6.  I removed all the instances of 'A decoder', but there are still a bunch of references to 'An encoder'.  I'm not sure if the consistency is an improvement in readability yet, so delaying changing more pending feedback.
@martinthomson martinthomson changed the base branch from reorganize to master October 30, 2018 03:24
@martinthomson martinthomson reopened this Oct 30, 2018
@martinthomson
Copy link
Member

Hmm, I wondered why I couldn't clean up the reorganize branch, did it manually, then this got closed as a result. You learn new things every day with GitHub.

I missed some of the comments in the first pass.
@martinthomson
Copy link
Member

So, reviewing this is now hard for some reason. It shows changes that were landed on master. Maybe this will be addressed by creating a new PR from the branch.

Mostly wordsmithing
Moved one section from TSS wire to decoder
@afrind afrind merged commit b93d821 into master Nov 21, 2018
@MikeBishop MikeBishop deleted the reorg-qpack branch February 6, 2019 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-qpack editorial An issue that does not affect the design of the protocol; does not require consensus.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants