Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce rack.protocol header for handling version-agnostic protocol upgrades. #1954

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ioquatix
Copy link
Member

@ioquatix ioquatix commented Aug 28, 2022

#1946 for discussion about the semantics/problem.

Copy link
Contributor

@jeremyevans jeremyevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments for details.

Can you add a discussion of how connection upgrades in a rack app are currently handled without this? I assume they are handled differently in HTTP/1 and HTTP/2, but before trying to standardize something to unify behavior between HTTP versions, I would like to know what a rack app currently needs to return to initiate a procotol upgrade. It sounds like HTTP/1 uses the Upgrade request and response headers, but I'm not sure how it is handled currently for HTTP/2, or whether it is impossible in HTTP/2 without this change (or similar change to handle HTTP/2 pseudo headers).

@@ -179,6 +180,11 @@ def check_environment(env)
## to +call+ that is used to perform a full
## hijack.

## <tt>rack.protocol</tt>:: If the request is an HTTP/1 upgrade or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not in favor of the rack.protocol naming. It seems like it would be which HTTP version is in use (SERVER_PROTOCOL) or http or https (rack.url_scheme).

I assume the HTTP/1 upgrade header is already available in HTTP_UPGRADE. Is the HTTP/2 protocol pseudo header exposed anywhere yet?

Absent a better idea, I recommend switching to rack.upgrade_protocol, since it looks this this is used for upgrading the protocol used for the connection.

Regardless of what we select, I think we want to explicitly specify that this in an array (even if it can only contain one value in certain HTTP versions), since it looks like we specify that the env key should be an array later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP/2 literally called this the :protocol pseudo header. I actually don't have a strong opinion about the name but I've already used this name for my current implementation of websocket simply because I had to choose something and it seemed like the logical approach.

I also checked https://datatracker.ietf.org/doc/html/rfc3875 and could find no reference about how to handle connection: upgrade requests. It doesn't seem like the CGI specification which we follow for our env keys has any guidance on the matter.

rack.upgrade_protocol - I would suggest we avoid this, or the "upgrade" terminology in general, as HTTP/2 explicitly calls this out in a number of places as being unsupported and I think the word "upgrade" is deprecated in favour of "connect" and "protocol".

The specific introduction of the protocol pseudo header is here: https://www.rfc-editor.org/rfc/rfc8441.html#section-4

I'm fine with specifying that it should be an Array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rack.connect_protocol maybe? rack.protocol is too confusing as a name for this, IMO.

I would still like to know how changing the connection protocol is currently handled for HTTP/2 in rack. Is it impossible with the current design, does it require server-specific code, or is it possible and just implemented a different way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rack.protocol is the intersection of all the different semantics. connect_protocol is too specific to HTTP/2 that it makes the HTTP/1 use case clumsy. It doesn't make a lot of sense to have rack.connect_protocol in the response headers map to an HTTP/1 connection upgrade IMHO. I understand the value of being more specific, but this isn't all or nothing, the intersection between HTTP/1 and HTTP/2 semantics is very wide and doesn't invite more specific naming IMHO. In other words, being too specific in the naming is equally confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably should have been more specific. There is nothing that ties the :protocol pseudo header to the CONNECT method, so linking them as rack.connect_protocol would be a bit clumsy if later on another method was introduced that also used the :protocol pseudo header with a different method.

lib/rack/lint.rb Outdated Show resolved Hide resolved
## environment.
protocol = headers['rack.protocol']
if protocol
request_protocols = Array(@env['rack.protocol'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the env entry is required to be an array if present, we shouldn't need to use Array to turn it into an array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with that, however in HTTP/2, the semantics are that it should only contain a single value, i.e. a string, while in HTTP/1, it can have multiple values. I'm not sure if we want to just specify that it can be a string (single value HTTP/2 semantics) or Array (multiple values HTTP/1 semantics) or an Array with a single value (HTTP/2 wrapped in HTTP/1 semantics). It's probably easier to use if we always assume it's an Array, but 99.9999999999% of the time, this will always be a single value, websocket. I've never seen anyone with an upgrade header like upgrade: websocket, smtp, redis. I think that's why HTTP/2 standardised it as a single-value pseudo header - realistically no one is asking to upgrade to several distinct protocols, and/or protocol versions. I don't have a strong opinion, but I consider the HTTP/1 semantic to be largely "deprecated" based on the directions the standards appear to be going in. I don't have a strong opinion here either way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is going to 99.9999999999% be a string in HTTP/1, and 100% in HTTP/2, we probably should just specify it must be a string. If a client submits upgrade: websocket, smtp, and the app uses the response header upgrade: websocket, it wouldn't pass Lint since it wouldn't match, and I think that's OK. We shouldn't design for the 00.0000000001% use case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be okay with this, but async-http will internally follow the HTTP semantics and has it as an Array, which is how servers should be checking it. Therefore, I think we should encourage servers to implement the behaviour correctly, essentially:

if env['rack.protocol'].find?{|protocol| protocol.casecmp?('websocket')}

@ioquatix
Copy link
Member Author

Can you add a discussion of how connection upgrades in a rack app are currently handled without this?

Essentially, they aren't.

I tried to do protocol upgrade with Puma, thinking at least it should work with a hijacked body, but apparently it's unsupported: puma/puma#2914

I don't know about other servers, but I assume they are equally complicated as it's never been fully specified or tested.

I assume they are handled differently in HTTP/1 and HTTP/2, but before trying to standardize something to unify behavior between HTTP versions, I would like to know what a rack app currently needs to return to initiate a procotol upgrade.

Yes, it's totally different between HTTP/1 and HTTP/2 - HTTP/1 supports connection: upgrade while HTTP/2 uses the CONNECT method. There is very little similarity between the two methods except that the result is a bi-directional stream of some specific wire format.

The closest thing we can unify is that "I have a list of protocols I'd like you to use for this connection".

  • HTTP/1 has multiple protocols and the response must indicate which one was selected and return status 101 and upgrade: $protocol to accept.
  • HTTP/2 has a single protocol and the response must respond with status 200 to accept.

It sounds like HTTP/1 uses the Upgrade request and response headers, but I'm not sure how it is handled currently for HTTP/2, or whether it is impossible in HTTP/2 without this change (or similar change to handle HTTP/2 pseudo headers).

Yes, HTTP/1 uses the upgrade mechanism, and no such mechanism exists for HTTP/2.

This might beg the question, why did HTTP/1 use an upgrade mechanism for WebSocket? My guess is because of poorly conforming browsers and servers. Upgrade is like an escape hatch. But it's not needed - HTTP/1.1 supports bi-directional streaming and in theory the same "CONNECT" style could work in HTTP/1.1 just fine. In my opinion, connection upgrade is a poorly designed feature of HTTP/1 and we should not seek to standardise on it, rather following the direction of HTTP/2+ with the protocol pseudo header. Unfortunately, the vast majority of web socket connections use HTTP/1 upgrade mechanism, so it's unavoidable, but I imagine in 5-10 years it will be much less relevant in practice.

The reason for this, is because HTTP/1 upgrade connections consume an entire TCP connection, which is the very reason HTTP/2+ was designed to avoid with stream multiplexing.

My guess is, eventually we won't need WebSockets - full bi-directional streaming will eventually be supported making this entire piece irrelevant, but that's still along way off: whatwg/fetch#1438

@ioquatix ioquatix self-assigned this Aug 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants