Host and :authority must agree #968

Merged on Sep 24, 2021 (7 commits)

martinthomson
Collaborator

This makes a few changes, restricting things further than before. For
the most part, this removes an allowance in the original specification
that had Host and :authority potentially differing. The goal of that
was - from memory - to preserve some of the inherent quirks in HTTP/1.1.
That turns out to be more of a liability than an asset and far less
important now that we have a more formal understanding of the structure
of requests.

Closes #905.

Member

@mnot mnot left a comment

Some things to think about.

Also, it seems like something should be listed in 'Changes from...'?

draft-ietf-httpbis-http2bis.xml (outdated)
@@ -2934,16 +2934,29 @@ cookie: e=f
pseudo-header field to convey authority information, unless there is no authority
information to convey (in which case it MUST NOT generate :authority).
</t>
<t>
Clients MUST NOT generate a request with a <tt>Host</tt> header field that differs
from the <tt>:authority</tt> pseudo-header field. A server MAY treat a request as
Member

I'm not sure I get your reasoning for MAY here instead of something stronger. It seems that this is a security issue, so we should have stronger reasoning for optionality here. Personally I think this should be at least SHOULD, if not MUST.

Collaborator Author

The reasoning is that old implementations won't apply this rule (so it can't be MUST) and some might choose not to (as it requires an extra comparison). I'm OK with "SHOULD", but "MAY" seemed safer.

Member

IMO if there's significant security impact, it can be a MUST - especially since this is about recipient behaviour, not sender behaviour.

Collaborator

I agree with @mnot. I also think that for senders we can safely say that the fields MUST be byte-for-byte identical.

<t>
Clients MUST NOT generate a request with a <tt>Host</tt> header field that differs
from the <tt>:authority</tt> pseudo-header field. A server MAY treat a request as
malformed if it contains a <tt>Host</tt> header field that is different from the
Member

What is 'different' here? If one is present and one isn't, is that different? What if they have case differences? Etc.

Collaborator Author

I deliberately left that unspecified, which allows someone to treat "example.com", "example.com:443", and "Example.com" as three different values.

Would it be best to point that out?

Servers are not required to normalize these fields before comparing them, which means that clients that produce both fields need to make the values byte-for-byte identical.

Member

or just add 'when their values are compared, byte-for-byte'

Contributor

They should not be compared byte-for-byte; we made this mistake recently and broke a few clients :-)
RFC 3986, Section 6.2.3 explains scheme-based normalization, which works particularly well for H2 (in short, drop ":80" on "http" and ":443" on "https").

I think I'm fine with a SHOULD though. As Stefan (I think) mentioned, most implementations currently do not perform the comparison and a MUST would instantly make them non-compliant.

HTTP/1.1 messaging proceeds differently (in 3.2). It enumerates what a client must send, then says that a recipient MUST reject any request with an invalid or missing Host as a bad request. This leaves a bit of a gray area about what counts as invalid, but it does mandate that some checks are performed. I'm not sure I like this approach better, but I wanted to note it in case it helps the discussion.
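
To illustrate the scheme-based normalization mentioned in this comment (lowercase the host, drop ":80" on "http" and ":443" on "https"), here is a minimal Python sketch. The names `normalize_authority` and `same_authority` are hypothetical and are not taken from any implementation discussed in this thread. The sketch assumes the value has no userinfo and does not handle percent-encoding or leading zeros in the port.

```python
# Default ports per scheme, as in the "drop :80 / :443" rule of thumb above.
DEFAULT_PORTS = {"http": "80", "https": "443"}


def normalize_authority(value: str, scheme: str) -> str:
    """Lowercase the host and drop an empty or scheme-default port."""
    value = value.strip()
    host, sep, port = value.rpartition(":")
    port_ok = port == "" or (port.isascii() and port.isdigit())
    if not sep or not port_ok:
        # No port component: a plain host, or a bracketed IPv6 literal
        # such as "[::1]" whose last colon is inside the brackets.
        host, port = value, ""
    host = host.lower()
    if port == "" or port == DEFAULT_PORTS.get(scheme.lower()):
        return host
    return f"{host}:{port}"


def same_authority(host_header: str, authority: str, scheme: str) -> bool:
    """Compare Host and :authority after normalization (hypothetical helper)."""
    return normalize_authority(host_header, scheme) == normalize_authority(
        authority, scheme
    )
```

With this sketch, "Example.com" and "example.com:443" compare equal for "https" but not for "http", which is the behavior a plain byte-for-byte comparison would not give.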

Collaborator

I definitely think we should leave a breadcrumb to, say, scheme-based normalization, as well as to the relevant bit of the -semantics draft (§ 7.2 Host and Authority).

Collaborator Author

The problem with citing RFC 3986 is that it doesn't actually specify anything in Section 6.2. It's all examples.

Section 7.2 of -semantics is similarly unhelpful.

However, as the server is authoritative, perhaps we can rely on its own definitions of authority and say if the fields identify different authorities - by its own definition - it can reject the request. Would that work?

Collaborator

I guess we'd apply the same guidance to intermediaries? I think that's probably acceptable from where I'm sitting.

Collaborator Author

OK, I've worked something out. It's pretty gross though. Let me know what you think.

source of this information; see <xref target="HTTP" section="7.2"/>.
</t>
<t>
An intermediary that forwards a request received in HTTP/2 via HTTP/1.1 MUST set the
Contributor

@wtarreau wtarreau Sep 14, 2021

I'm still uneasy about the normative language involving H1 here. What about this variant:

An intermediary that needs to produce a Host header field (e.g. to translate
an HTTP/2 request to HTTP/1.1) MUST set the ...
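
A rough Python sketch of what such an intermediary might do when building HTTP/1.1 fields from an HTTP/2 request. The suggested sentence above is truncated in this thread; the sketch assumes the rule is the one that ended up in RFC 9113: use the :authority value as the Host value, replacing any existing Host field. The function name `build_http1_fields` and the list-of-tuples representation (with lowercase field names) are hypothetical, and the sketch assumes the intermediary does not change the request target.

```python
from typing import List, Optional, Tuple

Field = Tuple[str, str]


def build_http1_fields(h2_fields: List[Field]) -> List[Field]:
    """Derive HTTP/1.1 header fields from HTTP/2 fields (illustrative only)."""
    authority: Optional[str] = next(
        (value for name, value in h2_fields if name == ":authority"), None
    )
    out: List[Field] = []
    for name, value in h2_fields:
        if name.startswith(":"):
            continue  # pseudo-header fields are not sent in HTTP/1.1
        if name == "host" and authority is not None:
            continue  # :authority wins; drop the client-supplied Host
        out.append((name, value))
    if authority is not None:
        out.insert(0, ("host", authority))
    return out
```

Replacing rather than merging the Host field is the point of the design: the HTTP/1.1 hop then routes on the same authority that the HTTP/2 hop saw.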

Collaborator Author

Thanks, that's a good suggestion. I've taken it.

Collaborator

@Lukasa Lukasa left a comment

I think this is well-constructed.

@wtarreau
Contributor

I think that's good as well. Let's have @gregw and Stefan have a look in order to avoid further round trips.

Contributor

@gregw gregw left a comment

The changes are good, but I have a couple of niggles about cumbersome wording, not that I have better suggestions.

from the <tt>:authority</tt> pseudo-header field, when compared byte-for-byte. A
server SHOULD treat a request as malformed if it contains a <tt>Host</tt> header
field that identifies a different entity to the <tt>:authority</tt> pseudo-header
field. The values of fields need to be normalized to compare them (see <xref
Contributor

Does "of fields" apply to all fields, all pseudo-header fields, or just :authority fields?

Does the normalization "need" apply to the byte-for-byte comparison above, i.e., MUST the values be byte-for-byte identical only after normalization?

Is a "need" a MUST or a SHOULD?

Contributor

In this instance, I think "need" is a statement of fact -- these are hostnames, therefore not case-sensitive, therefore they need to be normalized to perform a useful comparison. No need for RFC2119 language here. However, that implies that "when compared byte-for-byte" above is wrong.

Collaborator Author

We're placing a stronger requirement on the sender than the receiver. The receiver requirements are more lax because we want to allow for folks like @icing who might want to avoid extra work to do so (because they are happy that their h2 implementation is otherwise OK and don't want to open up the patient to add extra checks that will mostly just slow them down). We don't need to see every server doing this rejection, just a few. Clients will learn not to put different values in.

That said, I think we can say "identical" in the first part without resorting to "byte-for-byte". "byte-for-byte" is just the most obvious and easiest way to ensure that the values are the same.

...we want to allow for folks like @icing who might want to avoid extra work to do so...

That's very kind and heart-warming.😌

Co-authored-by: Greg Wilkins <gregw@webtide.com>
@wtarreau
Contributor

Yeah much better now. A coworker also told me he was bothered by the "byte-for-byte" which is now gone. I've reread all of it from https://github.com/martinthomson/http2v2/blob/af9a40279c465e0b6d7994710affe94ef5228109/draft-ietf-httpbis-http2bis.xml#L2900 to see it in context and I find that it flows nicely, details the protocol particularities and traps to care about without stepping over other protocol versions. So that's a definite +1 from me!

I'm finding something confusing two paragraphs later, however. It says "Note that request targets for CONNECT or asterisk-form OPTIONS requests never include authority information", which contradicts the CONNECT section, which says "A CONNECT header section is constructed ... The :authority pseudo-header field contains the host and port to connect to."

And I'm sure I participated in that but don't remember how we ended up on this one :-/

@MikeBishop
Contributor

@wtarreau, that's because the :authority pseudo-header in a CONNECT request is the target of the connection, not the proxy you're asking to handle the request. It's not that the field is absent, but that it means something different in that context.
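
As a small illustration of this point, the pseudo-header fields of an HTTP/2 CONNECT request might be built as in the Python sketch below: :authority carries the host and port of the tunnel target rather than identifying the proxy, and :scheme and :path are omitted, as the HTTP/2 CONNECT rules require. The helper name `connect_fields` is hypothetical.

```python
from typing import List, Tuple


def connect_fields(target_host: str, target_port: int) -> List[Tuple[str, str]]:
    """Pseudo-header fields for an HTTP/2 CONNECT request (illustrative only)."""
    # Bracket a bare IPv6 literal so the port separator stays unambiguous.
    if ":" in target_host and not target_host.startswith("["):
        target_host = f"[{target_host}]"
    return [
        (":method", "CONNECT"),
        # Host and port to connect to; no :scheme or :path is sent.
        (":authority", f"{target_host}:{target_port}"),
    ]
```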

@wtarreau
Contributor

@MikeBishop OK but regardless it still looks like recent clients are trying hard to send the exact same Host as the :authority in CONNECT (typically including :443). It's unclear to me why the target of a proxied CONNECT is not an authority while the target of a proxied GET is an authority then.

Co-authored-by: Mark Nottingham <mnot@mnot.net>
@martinthomson martinthomson merged commit 78d4e21 into httpwg:main Sep 24, 2021

Successfully merging this pull request may close these issues.

Remaining corner cases between Host and :authority
7 participants