
rough in server scheduling guidance #1266

Merged 13 commits into master from priority-server-scheduling on Sep 30, 2020

Conversation

LPardue
Contributor

@LPardue commented Sep 17, 2020

This tries to address two sets of comments on "what signals are subsumed by priority hints" and "how should a server implement things".

One aspect of this is setting client expectations straight; they should have a vague idea what will happen if a server plays ball but also expect that servers can and will do whatever they want.

The other aspect of this is describing what signals servers have at their disposal, and presenting some of the gotchas or tradeoffs that might arise if the extensible priority scheme is implemented too literally. Some people would like to see more explicit guidance for servers, and that's a fine request. However, I don't see how any single scheduling algorithm would work for the range of vendors and deployments that have shown an interest in Priorities, so I've focused on the common criteria.

While adding this section, the thought did cross my mind to move the sections on scheduling. We can always do that as a followup.

Closes #1216 and #1232.

cc @ekinnear, @guoye-zhang, @martinthomson

given urgency level can align well with clients usage of HTTP; such as user
agents that load document trees where ordering is important.

For non-incremental resources the total download time (time to first byte - time
Contributor

isn't it time from request to last byte delivered?

Contributor Author

This is probably a little open to interpretation, so it would be good to hammer it down. Looking at this again, I don't like how I presented it and will tweak it. I was basing it on my understanding of curl and Chrome as shown here https://blog.cloudflare.com/a-question-of-timing/. Do other clients have alternative views?

My thinking, based on other discussions, is that a non-incremental payload can only be used when the whole payload is received. Therefore, factoring in the delta between request and TTFB is not super helpful in this context. However, non-incremental objects probably do benefit from a shorter TTFB and "time to significant delineator", such as a progressive jpeg header.

Contributor

For things like JS, we often can't continue processing until we have all the bytes. That's what I meant by atomic below: the whole thing is needed, it can't be divided into smaller parts and be useful. But atomic probably has connotations that aren't helpful.

Comment on lines 550 to 551
to last byte) is important. For incremental resources chunk download times are
important, especially the first. A server that receives a mix of incremental and
Contributor

For incremental resources, the time to deliver every byte is important.

I wonder if the split here is between atomic and incremental.

Contributor Author

For some types of content not all bytes are important due to internal boundaries. I'd like to avoid falling into the trap of describing this in too much detail if possible. I tried to use "chunk" as the sub-unit that could be as small as 1 byte.

What do you mean by atomic vs incremental?

Contributor

See above.

This is a simplification: absent more specific information, the server must assume that for i=false the entire resource needs to be delivered to be of use, and that for i=true the more of the resource that is delivered, the more utility is obtained. Obviously this isn't smoothly linear and there are some bytes that don't allow any more value to be extracted than those preceding, but that is the model we are operating under.

Extensions can define cut points or chunking. For example, a stepped incremental extension that lists a number of byte offsets that contain significant value (which likely needs to come from the server...), or something that says "don't bother delivering anything less than the following chunk size, because MICE doesn't allow data to be used".
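As a rough sketch of the model just described (nothing below comes from the draft; the function and parameter names are hypothetical), utility could be modelled as zero for i=false until the whole resource arrives, as growing with each delivered byte for i=true, and as stepped if an extension defines cut points:

```python
def utility(incremental, delivered, total, cut_points=None):
    """Toy estimate, in [0, 1], of how useful 'delivered' bytes of a
    'total'-byte response are to the client under the model above.

    incremental=False: the response is only useful once it is complete.
    incremental=True:  utility grows with every byte delivered.
    cut_points: optional sorted byte offsets (a hypothetical extension)
                below which partial delivery yields no extra value.
    """
    if not incremental:
        return 1.0 if delivered >= total else 0.0
    if cut_points:
        reached = [c for c in cut_points if c <= delivered]
        return reached[-1] / total if reached else 0.0
    return delivered / total
```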

Comment on lines 553 to 555
factors. An unbalanced scheduler might prefer one type over another, leading to
sub-optimal loading and in the worst case starvation of one type. Servers are
RECOMMENDED to avoid starvation but no specific method of doing so is prescribed.
Contributor

I don't get this point about unbalanced scheduling. It seems to imply that there might be some other reason to balance between atomic and incremental resources, but I don't think that is the intent.

Contributor Author

From #1232 (comment)

Let's imagine two cases:

  1. At the same urgency level, a huge non-incremental file download has started, then a small incremental resource is requested.
  2. At the same urgency level, an incremental hanging GET is waiting for response, while a non-incremental file download is requested.

An unbalanced scheduler might be designed to completely flush one type of resource before moving on to the other. That is especially likely if resources are large in comparison to the BDP. This could starve the other type from ever getting a share. The text is attempting to say: don't do that. To avoid it, an implementation could somehow yield sending one type in a given time period. Guoye's suggestion on #1232 (comment) provides an example. I'd like to avoid recommending any specific solution.
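As a rough sketch of that kind of yielding (illustrative only; the response objects, their read(n) method, and the send_chunk callback are hypothetical and not part of the draft), a server could give the head-of-line non-incremental response one quantum per pass and round-robin a quantum across the incremental responses, so neither type locks the other out:

```python
from collections import deque

def serve_urgency_level(non_incremental, incremental, send_chunk, quantum=16 * 1024):
    """Interleave two deques of same-urgency responses so neither type starves.

    Each response object is assumed to expose read(n) -> bytes, returning
    b"" when it is finished; send_chunk(resp, data) writes to the connection.
    """
    while non_incremental or incremental:
        # One quantum for the head-of-line non-incremental response,
        # served in request order.
        if non_incremental:
            resp = non_incremental[0]
            data = resp.read(quantum)
            if data:
                send_chunk(resp, data)
            else:
                non_incremental.popleft()  # finished; move to the next one
        # One quantum for each incremental response, round-robin.
        for _ in range(len(incremental)):
            resp = incremental.popleft()
            data = resp.read(quantum)
            if data:
                send_chunk(resp, data)
                incremental.append(resp)  # rotate unfinished responses
```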

Contributor

I don't think that you want to use the word unbalanced because it carries negative connotations.

Those examples suggest a few questions:

Do you want to permit small responses to jump ahead of in-progress responses?
What if something that is nominally higher priority (by order, not urgency) can't start until a lower priority response has begun? Can it pre-empt? Can it pre-empt only based on size?
(There's another one I thought of: Do you permit non-incremental responses to start when incremental responses are in progress? Maybe you have an answer for that already.)

These imply judgment being exercised or the presence of "other inputs". It is probably sensible to allow that, but it is clearly not the intent conveyed by the signal. Maybe size-based discrimination is appropriate, but it's not something that this scheme supports, so you need to be careful.

What I would do then is to enumerate corner cases where we know that strict adherence to the scheme could end up with suboptimal results. These cases are exactly the ones that I'd include there.

This scheme can't address these cases. So you need a clear delineation between what following the scheme gets you and where you are using "special sauce". This paragraph went from describing how this scheme operates and the consequences of that straight to special sauce stuff with no pause for breath.

Contributor Author

I've reworked this paragraph to address the points here and in the other comment. See a215d40

(My force push seems to have broken GitHub's tracking; I apologize.)

Comment on lines 558 to 559
An HTTP/2 server that sends SETTINGS_DEPRECATE_HTTP2_PRIORITIES ({{disabling}})
SHOULD NOT act on HTTP/2 priority signals.
Contributor

What if that is all it gets? I think that you need to provide more exposition for this recommendation.

Contributor Author

@LPardue Sep 23, 2020

I added some more exposition in 3c7f8e6. PTAL.

Comment on lines 531 to 533
Clients can expect servers will make prioritization decisions, including
ignoring all signals. And they should expect that decisions might be based on
metadata or information beyond the scope of extensible priorities.
Contributor

This is a little odd. It says "Server's gonna do what server wanna do." in a somewhat roundabout way. How about:

Clients cannot depend on particular treatment based on priority signals. Servers can use other information to prioritize responses.

Comment on lines 544 to 545
receives concurrent requests at the same urgency level might serve the responses
one-by-one but it needs to pick an order. Serving the lowest Stream ID in a
Contributor

I think that the "might serve the responses one-by-one" is distracting, and the phrasing here implies far more discretion for servers. Maybe just

Prioritizing concurrent requests at the same urgency level based on the stream ID, which corresponds to the order in which clients make requests, ensures that clients can use request ordering to influence response order.
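A sketch of the ordering that wording describes (the response attributes below are illustrative, not from the draft): sort by urgency first, then break ties with the stream ID so that request order decides.

```python
def schedule_order(responses):
    """Order responses by urgency (lower value = more urgent), breaking
    ties by the stream ID the request arrived on; lower stream IDs
    correspond to earlier requests, so request order influences
    response order within an urgency level."""
    return sorted(responses, key=lambda r: (r.urgency, r.stream_id))
```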


Contributor

@kazuho left a comment

Thank you for all the work. Left some comments. PTAL.

to prioritization. Prioritizing concurrent requests at the same urgency level
based on the Stream ID, which corresponds to the order in which clients make
requests, ensures that clients can use request ordering to influence response
order.
Contributor

How about moving the contents of this paragraph to the one above that talks about urgency, and going like: When there are multiple responses with same urgency, a server SHOULD ...

Contributor Author

I tried this and didn't like what I came up with. The progression from urgency, to incremental, to request order feels more natural to me. Combining the text as you suggest also produced a strange implication that request order is only important for requests at the same urgency, which I don't agree with.

I'd happily review a suggestion that avoids these problems. Perhaps we can make a separate editorial PR?

Contributor

👍 I think I can work on a separate PR. That PR can point to this PR or master, depending on how we proceed.


An HTTP/2 server implementing the Extensible Priorities scheme instead of the
HTTP/2 priority sends SETTINGS_DEPRECATE_HTTP2_PRIORITIES; see {{disabling}}. It
SHOULD NOT act on priority signals belonging to the HTTP/2 scheme. The absence
Contributor

This sentence sounds like a server cannot respect the PRIORITY frames sent by a legacy HTTP/2 client.

I think that the intent is to state something like: When a client sends SETTINGS_DEPRECATE_HTTP2_PRIORITIES, a server SHOULD NOT act ...

Contributor Author

Ok I think we come to this from different viewpoints. I was trying to address the statement in {{disabling}}

The SETTINGS frame precedes any priority signal sent from a client in HTTP/2, so a server can determine if it should respect the HTTP/2 scheme before building state.

I've reworked the paragraph to accommodate either the client or the server sending the setting; PTAL.

instead of the HTTP/2 priority scheme by sending
SETTINGS_DEPRECATE_HTTP2_PRIORITIES; see {{disabling}}. A server that sends or
receives this setting SHOULD NOT act on priority signals belonging to the HTTP/2
scheme. The absence of a client Extensible Priority signal SHOULD be treated
Contributor

I'm not sure if this sentence is correct. I think that a server is expected to respect the H2 prioritization scheme unless the client sends SETTINGS_DEPRECATE_HTTP2_PRIORITIES.

I also think that we might want to move the suggestion to {{disabling}}, as it talks about how the client handles the existence (or absence) of the settings parameter? Doing so would be fine, as we refer to that section in the previous sentence.

Contributor Author

I agree this would fit in {{disabling}}. That section includes the sentence

The SETTINGS frame precedes any priority signal sent from a client in HTTP/2, so a server can determine if it should respect the HTTP/2 scheme before building state.

So I intended the new text to build on that. My mental model was that a server might just want to cut out a lot of H2 priorities code, leaving only the bits necessary for parsing. Such a server cannot act on the signal and declares so using the deprecate setting. It's unfortunate if the client wants to continue using the old scheme, but there would not be an interop failure.

I could live with downgrading things to only provide this guidance when a server receives the setting. Would that work for you?

Contributor

I could live with downgrading things to only provide this guidance when a server receives the setting. Would that work for you?

Thanks. I think my preference goes there. IMO, we do not need to recommend servers degrade performance of legacy HTTP/2 clients. So maybe something like: When receiving SETTINGS_DEPRECATE_HTTP2_PRIORITIES, a server MUST ignore the HTTP/2 PRIORITY frames received on that connection.

Contributor Author

done in f29bd50
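A sketch of how a server might apply that recommendation (a hypothetical connection-state object; none of these names come from an actual HTTP/2 library): after the client's SETTINGS_DEPRECATE_HTTP2_PRIORITIES is seen, PRIORITY frames are still parsed for framing correctness but no longer influence scheduling.

```python
class ConnectionState:
    """Hypothetical per-connection state in an HTTP/2 server."""

    def __init__(self):
        self.peer_deprecates_h2_priorities = False

    def on_settings(self, settings):
        # 'settings' is assumed to be a dict of received SETTINGS values.
        if settings.get("SETTINGS_DEPRECATE_HTTP2_PRIORITIES") == 1:
            self.peer_deprecates_h2_priorities = True

    def on_priority_frame(self, frame):
        if self.peer_deprecates_h2_priorities:
            return  # ignore the HTTP/2 priority signal entirely
        self.apply_h2_priority(frame)  # legacy client: honor the old scheme

    def apply_h2_priority(self, frame):
        ...  # existing HTTP/2 dependency-tree handling would go here
```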


@LPardue
Contributor Author

LPardue commented Sep 30, 2020

Thanks for the contribution @kazuho. I'm going to squash and merge this.

@LPardue merged commit 741e80c into master Sep 30, 2020
@LPardue deleted the priority-server-scheduling branch September 30, 2020 12:02
richanna pushed a commit to richanna/http-extensions that referenced this pull request Oct 20, 2020
* rough in server scheduling guidance

Co-authored-by: Martin Thomson <martin.thomson@gmail.com>
Co-authored-by: Kazuho Oku <kazuhooku@gmail.com>