Memory commitment for header parsing #238

Closed
martinthomson opened this issue Jan 26, 2017 · 4 comments
Labels
-http, design (An issue that affects the design of the protocol; resolution requires consensus.)

Comments

@martinthomson
Member

Parsing all possible header blocks could require a significant number of header parsers operating concurrently. Since headers tend to be delivered to applications as a single package, the parsers need sufficient space to hold all of the partially complete data.

This can be addressed by reading only from the streams that an endpoint has the capacity to parse, but that increases complexity and exposure to head-of-line blocking.

This cleaves off the second part of #204.

@martinthomson martinthomson added design An issue that affects the design of the protocol; resolution requires consensus. -http labels Jan 26, 2017
@mnot mnot added this to Headers in HTTP Apr 28, 2017
@martinthomson
Member Author

@marten-seemann, I'm trying to understand our options here:

  1. Do nothing
  2. Provide some advice
  3. Provide a mechanism

Despite the complications, I'm inclined toward tackling the advice angle on this one. Header parsers are likely a resource shared across multiple connections, and there are advantages to solving the atomicity problem more generally (see below for more on this).

Advice

This might be along the lines of "wait for the end of a header block before you parse it lest you commit to having many concurrently active parsers".

The problem here is that you want back pressure. If you read from the transport (and thereby open up the flow control window), you end up buffering multiple concurrent header blocks as bytes instead of headers, and you haven't solved the problem.

So you also need transport APIs to give an application protocol implementation the ability to peek at stream content without releasing it from the transport. A peek API has downsides, and it wouldn't be necessary if we only allowed a single HEADERS frame: you could read the frame header to get the length, then wait until the transport has that many bytes available.

Either way, this ensures that flow control is used to limit resource exhaustion.
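
For concreteness, here's a minimal Go sketch of the peek-based variant. The `PeekableStream` interface, the 1-byte-type/4-byte-length frame layout, and all names here are invented for illustration; this isn't a proposed API, it just shows that the receiver can learn the frame length without consuming bytes (and therefore without extending flow control credit):

```go
package sketch

import (
	"errors"
	"io"
)

// PeekableStream is an assumed transport API; real transports may not
// expose anything like this.
type PeekableStream interface {
	// Peek returns up to n buffered bytes without consuming them,
	// so no flow-control credit is released to the sender.
	Peek(n int) ([]byte, error)
	// Read consumes bytes and releases flow-control credit.
	Read(p []byte) (int, error)
	// Buffered reports how many bytes are currently buffered.
	Buffered() int
}

var errIncomplete = errors.New("header block not yet complete")

// maybeReadHeaderBlock consumes a HEADERS frame only once the whole
// frame has been buffered. The frame layout is simplified here to a
// 1-byte type followed by a 4-byte big-endian payload length.
func maybeReadHeaderBlock(s PeekableStream) ([]byte, error) {
	hdr, err := s.Peek(5)
	if err != nil || len(hdr) < 5 {
		return nil, errIncomplete
	}
	payloadLen := int(hdr[1])<<24 | int(hdr[2])<<16 | int(hdr[3])<<8 | int(hdr[4])
	if s.Buffered() < 5+payloadLen {
		// Leave everything in the transport: withheld flow-control
		// credit is what applies back pressure to the sender.
		return nil, errIncomplete
	}
	buf := make([]byte, 5+payloadLen)
	if _, err := io.ReadFull(s, buf); err != nil {
		return nil, err
	}
	return buf[5:], nil // a complete header block, safe to parse
}
```

The single-HEADERS-frame variant is the same loop without `Peek`: read the small frame header eagerly, then wait for `Buffered()` to reach the declared length before parsing.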

The downside of using flow control this way is that you also need sender involvement so that you don't deadlock. Too many partially completed header blocks in progress will mean that the sender can't send any more, but it needs to send more in order to complete any one of those header blocks. That forces some fairly complicated logic onto the sender to manage completion of a header block alongside other prioritization concerns. In a sense, the need to avoid deadlock makes completing a header block more important than just about anything else.

Maybe we need this functionality more generally: the ability to identify a slice of bytes on a stream that represent an atomic unit.

The Mechanism: Limit Header Parsers

We could instead apply a strict limit to the number of header parsers. This implies adding a measure of HoLB to the protocol.

The sequence numbers that Mike added to serialize HPACK had this effect, though they reduced the limit to just one; you can imagine a design with multiple "threads" and some way to limit their number.
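
As a rough sketch of the mechanism (continuing the Go examples; the limit of 4 and the decoder signature are invented for illustration), a counting semaphore is enough to bound the state commitment, at the cost of exactly the head-of-line blocking described above:

```go
package sketch

// Header is a decoded name/value pair.
type Header struct{ Name, Value string }

// maxHeaderParsers is an illustrative limit, not a proposed value.
const maxHeaderParsers = 4

// parserSlots is a counting semaphore bounding concurrent parsers.
var parserSlots = make(chan struct{}, maxHeaderParsers)

// parseHeaderBlock waits for a free parser slot, then decodes the
// block. decode stands in for a real HPACK-style decoder.
func parseHeaderBlock(block []byte, decode func([]byte) []Header) []Header {
	parserSlots <- struct{}{}        // acquire a slot, or block (HoLB)
	defer func() { <-parserSlots }() // release the slot when done
	return decode(block)
}
```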

(Maybe we need a new tag for this sort of problem: "problems that h2 didn't have, but we discover that QUIC inadvertently introduced")

@RyanTheOptimist
Contributor

Now that we're back to a single stream per request, I'm not sure that I understand the crux of the issue here. Since headers are delivered on the request stream, are the implications of flow control any different than reading body data?

@martinthomson
Member Author

@RyanatGoogle, the problem is unchanged in that regard (the only impact of that change being that bodies are blocked by headers). The problem is that a server that allows N concurrent streams potentially has to run N concurrent header parsers. That's a state commitment that the protocol doesn't account for at all. Or, more correctly, it's a state commitment that the server cannot constrain other than by limiting concurrent requests. It's a problem that could lead to new and interesting variations on the Slowloris attack.

@MikeBishop
Contributor

OBE (overtaken by events): We now allow only a single HEADERS frame, thanks to #905. This means that an implementation can read the length field, determine whether it has the full HEADERS frame, and stop reading until it does. (However, this might not be advisable if the headers are larger than the flow control window. That footgun remains unchanged.)
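
A sketch of the check that implies, with the footgun made explicit (the window constant and names are invented for illustration):

```go
package sketch

import "errors"

// maxStreamWindow is an illustrative per-stream flow-control limit.
const maxStreamWindow = 64 * 1024

// shouldWaitForFullFrame reports whether to keep waiting for the rest
// of a HEADERS frame whose length field has already been read. If the
// declared length can never fit within the flow-control window,
// waiting would deadlock: the sender is never allowed to deliver the
// rest of the frame.
func shouldWaitForFullFrame(declaredLen uint64, buffered int) (bool, error) {
	if declaredLen > maxStreamWindow {
		return false, errors.New("header block exceeds flow-control window")
	}
	return uint64(buffered) < declaredLen, nil
}
```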

@mnot mnot removed this from Headers in HTTP Mar 6, 2018