Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support arbitrary status codes #177

Closed
Manishearth opened this issue Dec 5, 2014 · 44 comments
Closed

Support arbitrary status codes #177

Manishearth opened this issue Dec 5, 2014 · 44 comments

Comments

@Manishearth
Copy link
Contributor

While the spec goes up till status code 599, we should probably support arbitrary three-digit codes.

This test fails now because of it (I don't mind allowing that failure for now, but I'd like a solution where hyper doesn't outright reject the code).

@Manishearth
Copy link
Contributor Author

(We're using RawStatus initialized to 0, and we're getting that it stays 0 instead of becoming 699)

@reem
Copy link
Contributor

reem commented Dec 5, 2014

This is rather unfortunate, since adding an Extension(u16) variant wouldn't allow the compiler to optimize Status down to two bytes, but instead it would have to have two bytes for the discriminant and two bytes for all values. If we want Extension(uint) it gets even worse, with Status being six bytes.

I'm not sure how big a deal this is (it at all) but it's unfortunate.

@Manishearth
Copy link
Contributor Author

Since Servo uses RawStatus, perhaps some way to pass it down to RawStatus would work?

I agree that we should try to optimize Status as much as possible.

Another option is to add variants till 999 (with a macro?); which is still 2 bytes.

@seanmonstar
Copy link
Member

I'd venture to say that that test itself is broken. The XHR spec itself says that if a status is not within 200-599, then to throw a RangeError: https://fetch.spec.whatwg.org/#concept-response-response

@Manishearth
Copy link
Contributor Author

That's the fetch standard, not XHR. XHR references fetch, but that specific portion is for fetch(), not XHR::send()

This is the relevant bit of the fetch spec for handlign responses (XHR defines the actual handling of the task here )

@dpc
Copy link
Contributor

dpc commented Dec 13, 2014

Why is StatusCode an enum? When I see around 400 lines of:

  /// 132 (unregistered)
  Code132 = 132,

It seems to me it should be struct StatusCode ( u16 ) with some helpers / static constants for common codes.

@seanmonstar
Copy link
Member

Because by being an enum, it limits the values you might pass.

On Sat, Dec 13, 2014, 2:20 PM Dawid Ciężarkiewicz notifications@github.com
wrote:

Why is StatusCode an enum? When I see around 400 lines of:

/// 132 (unregistered)
Code132 = 132,

It seems to me it should be struct StatusCode { u16 } with some helpers /
static constants for common codes.


Reply to this email directly or view it on GitHub
#177 (comment).

@dpc
Copy link
Contributor

dpc commented Dec 13, 2014

I'm just playing so don't bash me too much if you don't like it, but would that work?

dpc@9c2b06b

The only way to create a custom StatusCode is StatusCode::from_code and checks can be added there, and for common codes there are constants.

I don't like the fact that consts are defined in hyper::status namespace instead of hyper::status::StatusCode though. Is there a way to move this consts in right namespace somehow?

Also, I've removed all the the unregistered codes. Their constants could be restored though, and if the problem with the namespace is solved, this could been a drop-in replacement.

@seanmonstar
Copy link
Member

@dpc hm, i'm coming around to that design. I'd likely change from_code to unregistered(), and also add Deref<u16> to it...

@reem
Copy link
Contributor

reem commented Dec 18, 2014

@seanmonstar I wouldn't want Deref<u16> since that would get us a bunch of u16 methods that are undesirable or incorrect to use.

@kud1ing
Copy link
Contributor

kud1ing commented Dec 18, 2014

Maybe another option: adding an enum value Unhandled(uint) that contain values beyond the spec.
This statically isolates specified and non-specified status codes.

@Manishearth
Copy link
Contributor Author

That doubles the size of the struct though. Perhaps with a smaller int type it would get squashed by alignment.

@seanmonstar
Copy link
Member

Actually we can't. It's currently a c-enum, and they can't also have variants containing values.

@dpc
Copy link
Contributor

dpc commented Dec 18, 2014

@seanmonstar : What's the purpose of Deref? To get the raw code? Wouldn't explicit fn to_raw(&self) -> u16 be OK?

@seanmonstar
Copy link
Member

Yea, the suggestion for Deref was because the current enum allows you to do StatusCode::Ok as u16.

@reem
Copy link
Contributor

reem commented Dec 19, 2014

We could implement [From|To]Primitive

@seanmonstar
Copy link
Member

Thery currently do. They just use Options, which is a minor speedbump.

@pyfisch
Copy link
Contributor

pyfisch commented Jan 3, 2015

Maybe we should just ask if they can fix their test? Just my 2 cents.
As I understand it a StatusCode higher thang 599 is just forbidden by all relevant specs.

@Manishearth
Copy link
Contributor Author

Not exactly. The web platform specs don't forbid it explicitly; and this means that there will be apps out there using it. Well, it's the other way around probably -- webapps were using higher status codes and the specs have to be written such that the web isn't broken just because the specs are too strict.

So the test is correct; not something to be "fixed".

@pyfisch
Copy link
Contributor

pyfisch commented Jan 25, 2015

Ok. Does anybody think that there are status codes out there greater 999 or smaller 100? In this case I would add the variants to the enum.

@Manishearth
Copy link
Contributor Author

Not that I know of. Perhaps ask in #whatwg on freenode

@pyfisch
Copy link
Contributor

pyfisch commented Jan 25, 2015

I asked on #whatwg and @Ms2ger answered:

  • XHR defines status codes as 16-bit attributes (unsigned 😀 )
  • HTTP/1.1 RFC7231 defines status codes as "three-digit integer code"
    • We were not sure if 042 is a "valid" status code
  • HTTP/2 simply refers to HTTP/1.1 for a definition.

@Manishearth
Copy link
Contributor Author

Welcome to the world of web specs. We hope you stay, but you probably should get out while you can 😹

@pyfisch
Copy link
Contributor

pyfisch commented Jan 25, 2015

I think it is hopeless 😹

Two 4 more interesting facts:

  • Both Chromium and Firefox understand 42 and 1024
  • Only Chromium understands 2147483647, firefox thinks it is 200
  • Firefox supports negative status codes with integer underflow 👍 ❤️
  • 0 makes problems (Not displayed or understood as 200)

Who wants to test IE, Safari and Opera?

Summary: Status codes are crazy.

Proposal: support u16 (This is what everyone supports. We could use an Option or the value 0 for anything we do not understand. Who uses such crazy codes will not care if they are understood.) Any ideas?

@seanmonstar
Copy link
Member

Ps, my experiment with using a newtype around u16 stalled. I wanted to keep
similar usage to the enum now, but can't have a mod and a struct with the
same name. Associated constants would be needed to finish that.

On Sun, Jan 25, 2015, 12:05 PM Pyfisch notifications@github.com wrote:

I think it is hopeless [image: 😹]

Two more interesting facts:

  • Both Chromium and Firefox understand 42 and 1024
  • Only Chromium understands 2147483647, firefox thinks it is 200

Who wants to test IE, Safari and Opera?


Reply to this email directly or view it on GitHub
#177 (comment).

@pyfisch
Copy link
Contributor

pyfisch commented Jan 26, 2015

Would it be a serious performance bump, if we implemented StatusCodes like Methods? An enum for registered or well known status codes with an extension for unknown codes.

@Manishearth
Copy link
Contributor Author

Not a serious one I guess, but something to be avoided.

@pyfisch
Copy link
Contributor

pyfisch commented Jan 27, 2015

I modified teepee status code handling to support arbitary status code like header methods: https://gist.github.com/pyfisch/cceae250981243c33c5e (I understand that this should be avoided, I just tried to understand teepee and the problem better).

As @seanmonstar noted "associated constants" are needed to not change the interface much if we use an u16. I do not think that they are even planned for the near future (Is this RFC somewhat related? https://github.com/rust-lang/rfcs/blob/091e5fabbbbd0419b8d291a6d1cd8e6e59c56deb/text/0195-associated-items.md).

So the only option left is to use a newtype StatusCode(u16) and to define aliases for common status codes like static INFORMATIONAL: StatusCode = StatusCode(100);. It would change the API but once after there are associated constants we can update the code.

Or is there another possibility?

@Manishearth
Copy link
Contributor Author

We can actually "pretend" to be an enum, like so:

struct StatusCode(pub u16);

mod StatusCode {
  use super::StatusCode;
  pub static Ok: StatusCode = StatusCode(200);
  ....
}

Demo

@reem
Copy link
Contributor

reem commented Jan 27, 2015

I actually quite like @pyfisch's linked modification to the teepee status. I think it makes a lot of sense to lump all non-registered status codes together, rather than having a specific enum variant for all the unregistered codes.

@SimonSapin
Copy link
Contributor

I had a name conflict the last time I tried this, but maybe the language changed.

@Manishearth
Copy link
Contributor Author

@reem Yeah, but it still makes the enum larger than it needs to be -- with my solution above you can have enum-like behavior as well as having a u16 status code, though the matches won't be forced to be exhaustive on normal status codes anymore.

@SimonSapin
Copy link
Contributor

("This" is @Manishearth's pretend-enum.)

@Manishearth
Copy link
Contributor Author

I was surprised too. But as you can see in the linked demo, it magically works! :)

@reem
Copy link
Contributor

reem commented Jan 27, 2015

I think we've blown the additional byte that @pyfisch's implementation requires way out of proportion :P. It's not going to make any difference at all in reality.

@Manishearth
Copy link
Contributor Author

I know, but I still don't like it

(I have a tendency to overoptimize things that don't need it 😛)

@reem
Copy link
Contributor

reem commented Jan 27, 2015

I take responsibility as the first one to bring it up, but I've since realized I was being ridiculous.

@seanmonstar
Copy link
Member

@Manishearth try to impl StatusCode and everything goes to hell.

@seanmonstar
Copy link
Member

@pyfisch here's the associated constants issue.

With the constants, I'd implement like so:

pub struct StatusCode(u16);
impl StatusCode {
    pub const Ok = StatusCode(200);
    // ...
}

Without that, I'm against the newtype, because it causes these issues:

  • as a newtype, the registered codes must be in ALL_CAPS (to make the lint happy)

  • if not all caps, the lint is trigger everywhere they are used

  • mistyping one of the registered constants screws up matches

    match code {
        Ok => (), // triggers lint about wrong-capped constant
        MovedTemporaraly => (), // becomes catch-all, like _
    }
  • except if you use them like enums currently

    match code {
        StatusCode::Ok => (), // lint is all happy
        StatusCode::MovedTemporaraly => (), // error as undefined value
    }

So, until associated constants are implemented, the enum with an Unregistered(u16) seems best.

@pyfisch
Copy link
Contributor

pyfisch commented Jan 28, 2015

@seanmonstar thanks for linking me the issue.

Since you both @reem and @seanmonstar support my attempt to improve status codes, I published the code that is based on teepee at https://github.com/pyfisch/hyperprotocol. I created it as a separate repo since we wanted to split hyper anyway, and it looks for me like a good time to begin with it. Otherwise I can also integrate it into current hyper.

Any suggestions for my header code?

@seanmonstar You proposed Unregistered(u16) as an extension variant name. I admit UnregisteredStatusCode is quite long, but it is like the name for methods https://github.com/pyfisch/hyperprotocol/blob/master/src/method.rs#L56. I think we may should change them both to Unregistered.

@chris-morgan
Copy link
Contributor

I just noticed this and would like to comment on it: I disagree with the premise and will fight for the status quo.

I am operating on RFC 7231’s definition of status-code which is more restrictive than the simple 3DIGIT that the grammar definition in RFC 7230 gives:

The first digit of the status-code defines the class of response.
The last two digits do not have any categorization role. There are
five values for the first digit:

o 1xx (Informational): …

Frankly, I’m not sure why it wasn’t defined in the grammar as %x31-35 2DIGIT, but the text provides further normative restrictions which make 0xx and 6xx–9xx illegal. This state, the grammar permitting something that the text then restricts as illegal, is not at all uncommon.

Similarly, when considering the registry at http://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml, it is worth while bearing in mind that this is an exhaustive registry, unlike the HTTP method, where extension methods are explicitly supported. (This is, incidentally, a change from RFC 2616 which hardcoded the list of known status codes and added an extension-code rule, but that list of its has been obsolete for ages.)

I say it clearly and in direct disagreement with @Manishearth: a test using status code 699 is illegal and is invoking undefined behaviour. Those tests should be removed. There is clear evidence that while many (probably most, historically) things support values outside the range 100–599, some do not (I get the impression this is happening more and more for new things as security is considered more and more important, e.g. in Fetch), so it’s not as though we’re introducing an extra restriction that no one has done before by actually disallowing values outside the range permitted by the specifications.

If the alternative were as elegant or efficient as the current implementation, I would not mind too much about extending it to support illegal states, though I would still fight quietly against it. But as it is, with the language as it is there is no solution even half as good as the current one. I submit as a refresher the article about status-line design that I wrote initially; there are a couple of notable changes in the world since then (RFC 723X superseding RFC 2616, and 1xx being made legal again in HTTP/2), and the Rust syntax has changed in places, but the contents of the article are still sound. Most notably for this discussion, using an enum with an extension code variant has severe backwards-compatibility problems.

@Manishearth
Copy link
Contributor Author

When it comes to the web, one must permit more behavior than the spec constrains us to. The restrictions imposed on people designing webapps by the spec (namely, don't use weird status codes) are not the same as the restrictions for platforms implementing the spec. This is a crucial difference when approaching specs -- are you a platform designer, or a platform user?

For example, while the HTML spec is rather specific as to how tags should be nested, the HTML parsing spec allows for all kinds of edge cases. It's in this spirit that this test was written (at least, it seems like it) -- a browser or other HTTP client should not choke on strange status codes.

If you look at the WPT tests, many of them are doing just that -- using illegal code to invoke edge cases and whatnot. It is unfortunate that they have to do that, but the web has evolved that way, and we have to live with it if we want to support random websites.

There is a macro-based option which should work if we want to support status codes >599 without losing efficiency, or the constants method (without the fake-enum).

Also, as platform implementors we must support behavior defined in the less stricter RFC anyway, so the 3 digit rule seems like a better one to implement.

@reem
Copy link
Contributor

reem commented Jan 29, 2015

Of note for the backwards compatibility problems is the extensible enums
RFC, which might be relevant here.
On Wed, Jan 28, 2015 at 8:48 PM Manish Goregaokar notifications@github.com
wrote:

When it comes to the web, one must permit more behavior than the spec
constrains us to. The restrictions imposed on people designing webapps by
the spec (namely, don't use weird status codes) are not the same as the
restrictions for platforms implementing the spec. For example, while the
HTML spec is rather specific as to how tags should be nested, the HTML
parsing spec allows for all kinds of edge cases. It's in this spirit that
this test was written (at least, it seems like it) -- a browser or other
HTTP client should not choke on strange status codes.

If you look at the WPT tests, many of them are doing just that -- using
illegal code to invoke edge cases and whatnot. It is unfortunate that they
have to do that, but the web has evolved that way, and we have to live with
it if we want to support random websites.

There is a macro-based option which should work if we want to support
status codes >599 without losing efficiency, or the constants method
(without the fake-enum).

Also, as platform implementors we must support behavior defined in the
less stricter RFC anyway, so the 3 digit rule seems like a better one to
implement.


Reply to this email directly or view it on GitHub
#177 (comment).

@pyfisch
Copy link
Contributor

pyfisch commented Feb 7, 2015

Should I create a PR for this? I think it is better to add support for all status codes now and not to wait.

seanmonstar pushed a commit that referenced this issue Feb 14, 2015
As discussed in #177 hyper must support status code outside the
standard range for compatibility reasons.

BREAKING CHANGE: This removes unregistered status codes from the enum. Use
`FromPrimitive` methods to create them now. StatusCode and StatusClass can no
longer be casted to `u16`, use `ToPrimitive` methods now.
For example `status.to_u16().unwrap()` to get the status code number.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants