How to handle (legal vis-à-vis HTTP) `OPTIONS *` requests #2117

doriantaylor · 2023-08-28T01:57:57Z

On the basis that PATH_INFO can already have a value of * if Rack::Lint is not used, we can accept this PR. However, I still think we should have a follow up discussion about the right behavior here.

Originally posted by @ioquatix in #2114 (comment)

This is a thorny issue because while RFC9110 asserts (along with its predecessors dating all the way back to RFC2068) that an OPTIONS * request is a perfectly valid mechanism for obtaining global information about a Web server's capabilities. However, the CGI spec, RFC3875, upon which the Rack specification is based, does not seem to acknowledge this validity, and thus does not specify how it should be implemented. An informal poll of Web server implementations suggests that the prevailing behaviour is to put the * of OPTIONS * into the PATH_INFO variable, while leaving SCRIPT_NAME untouched. What makes this issue thorny is that Rack applications (that are running without Rack::Lint) that do not check the request method could be in for a rude surprise if they try to use the PATH_INFO of an OPTIONS * under the assumption that it it is either empty or starts with a /. A particularly obvious hazard is if the * character gets treated as a glob(3) pattern.

As it stands for Rack, the method Rack::Request#path in particular does not account for the possibility that PATH_INFO may contain anything other than an empty string or one that starts with /, per RFC3875 (again, it is my opinion that RFC3875 itself is in error). This situation propagates to other methods in Rack::Request, such as #fullpath, #url, and everything downstream from there.

One solution (within Rack::Request at least) would be, in the call to #path, to check if the request method was OPTIONS and the PATH_INFO was * and faking up a result in that case (e.g. using / in lieu). That would mitigate the potential for harm at least for those applications using Rack::Request. Those who do not would have to be warned to test the request method before dispatching any URI behaviour (a sensible practice in any case).

I should also footnote that the OPTIONS method is not the only one where the request-URI is something other than that which starts with a /; the CONNECT method for one (which one could conceivably do something useful with in Rack using connection hijacking) and the PRI * request for upgrading to HTTP/2 are other specimens, off the top of my head. The point here is it's not just the special case of OPTIONS *.

The text was updated successfully, but these errors were encountered:

ioquatix · 2023-08-28T05:31:39Z

My initial feeling is, at least Request#path should never contain *. I think in this case, OPTIONS *, Request#path should return nil. At worst, this will blow up in places where it would always do the wrong thing.

cc @tenderlove do you have any opinion?

ioquatix · 2024-05-30T02:21:17Z

In HTTP/1, the RFC sometimes refers to the request "target". However in HTTP/2, this is specified using the :path pseudo header.

Therefore, I think we can assume the correct name is "path".

Upon thinking about this, PATH_INFO was probably a hack to work around the PATH environment variable already being specified, since the CGI specification was largely based on executing CGI binaries, providing inputs as environment variables, along with stdin and stdout for request body and response body respectively.

Therefore, based on the fact that HTTP/2 uses :path => * for OPTIONS * requests, I think we have to accept that in Rack, it's equally okay. In other words, I think Rack::Lint should be updated:

On methods with well defined path-based semantics, e.g. GET, POST and so on, we can enforce that PATH_INFO starts with /,
Otherwise we shouldn't attempt to interpret the value of PATH_INFO.
Alternatively, we can simply avoid making any assertion at all.

@jeremyevans what do you think of updating the spec in that regard?

doriantaylor · 2024-05-30T02:32:14Z

Upon thinking about this, PATH_INFO was probably a hack to work around the PATH environment variable already being specified, since the CGI specification was largely based on executing CGI binaries, providing inputs as environment variables, along with stdin and stdout for request body and response body respectively.

IIRC PATH_INFO is part of the Apache request record struct dating back who knows how long and it's whatever is left in the request-URI path after matching a handler.

ioquatix · 2024-05-30T02:45:05Z

In the HTTP RFCs, it's usually referred to as either "path" or "target". And in HTTP/2, it's always referred to as "path".

Regarding historical context, the first usage can be traced back to the CERN httpd: https://github.com/hackervera/cern-httpd/blob/a709a028c11b486c58cad53a09658d30ba327694/Daemon/Implementation/HTScript.c#L584

It simply would have been impossible to call it PATH as it would clash with the shell environment: https://github.com/hackervera/cern-httpd/blob/a709a028c11b486c58cad53a09658d30ba327694/Daemon/Implementation/HTScript.c#L712

The only other thing I'd add to this, is just because CERN's httpd server did it this way (canonicalisation), doesn't mean it's actually a good name, which is why we see the CGI RFC naming conventions diverging from the HTTP RFCs IMHO, and rightly so.

ioquatix · 2024-05-30T02:53:04Z

RFC9112 specifies the behaviour we should follow.

  request-target = origin-form
                 / absolute-form
                 / authority-form
                 / asterisk-form

Specifically:

When making a request directly to an origin server, other than a CONNECT or server-wide OPTIONS request (as detailed below), a client MUST send only the absolute path and query components of the target URI as the request-target. If the target URI's path component is empty, the client MUST send "/" as the path within the origin-form of request-target. A Host header field is also sent, as defined in Section 7.2 of [HTTP].

I'll try to cut a PR so that Rack::Lint follows this specification and we can discuss it further.

ioquatix · 2024-05-31T00:23:22Z

There is one limitation I didn't consider - HTTP/2 does not support the authority form and instead uses the :authority pseudo-header. We may like to add extra validation for this later, but Rack (loosely) follows the CGI specification, explicit validation of HTTP/2 concerns seemed out of scope, at least for the first pass.

ioquatix mentioned this issue May 30, 2024

Fine tune PATH_INFO validation in Rack::Lint. #2181

Merged

ioquatix closed this as completed in #2181 May 31, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to handle (legal vis-à-vis HTTP) `OPTIONS *` requests #2117

How to handle (legal vis-à-vis HTTP) `OPTIONS *` requests #2117

doriantaylor commented Aug 28, 2023 •

edited

ioquatix commented Aug 28, 2023 •

edited

ioquatix commented May 30, 2024 •

edited

doriantaylor commented May 30, 2024

ioquatix commented May 30, 2024 •

edited

ioquatix commented May 30, 2024

ioquatix commented May 31, 2024

How to handle (legal vis-à-vis HTTP) OPTIONS * requests #2117

How to handle (legal vis-à-vis HTTP) OPTIONS * requests #2117

Comments

doriantaylor commented Aug 28, 2023 • edited

ioquatix commented Aug 28, 2023 • edited

ioquatix commented May 30, 2024 • edited

doriantaylor commented May 30, 2024

ioquatix commented May 30, 2024 • edited

ioquatix commented May 30, 2024

ioquatix commented May 31, 2024

How to handle (legal vis-à-vis HTTP) `OPTIONS *` requests #2117

How to handle (legal vis-à-vis HTTP) `OPTIONS *` requests #2117

doriantaylor commented Aug 28, 2023 •

edited

ioquatix commented Aug 28, 2023 •

edited

ioquatix commented May 30, 2024 •

edited

ioquatix commented May 30, 2024 •

edited