Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not require Trailer header to parse trailers in HTTP2 #92

Merged
merged 1 commit into from Apr 30, 2022

Conversation

Maaarcocr
Copy link
Contributor

Fixes #91

@ioquatix
Copy link
Member

ioquatix commented Apr 16, 2022

How should we handle HTTP/2 -> HTTP/1 proxy. Should this header be added?

@Maaarcocr
Copy link
Contributor Author

Maaarcocr commented Apr 16, 2022

This may be wrong, but given that you have to wait all trailers before being able to decide what to put in the Trailers header, you may as well just add all trailers in the headers, which may be easier, as you do not have to think about chunk encoding and so on.

@ioquatix
Copy link
Member

That would break streaming?

@Maaarcocr
Copy link
Contributor Author

I looked at other proxies, like envoy grpc proxy, and they don't really support streaming all together (I guess partially because of trailers).

@ioquatix
Copy link
Member

I'm fine with something like your proposed change provided it doesn't go against any RFCs, I agree we would like to support gRPC. I need to review this in detail to make sure this is the right behaviour change. I think it's probably acceptable.

@ianks
Copy link
Contributor

ianks commented Apr 22, 2022

This is great @Maaarcocr, and should fix some issues with GRPC. Might be worthwhile adding a test to document the behavior.

Also, I can't really tell what's up with the CI failures but it seems like other branches are failing as well, so it might not be related to this change in particular.

@ioquatix
Copy link
Member

I'm trying to find a basis or reason for allowing this.

https://datatracker.ietf.org/doc/html/rfc7540 page 58 shows an example of an HTTP/1 request upgraded to an HTTP/2 request which includes a direct 1:1 usage of trailer header as per HTTP/1.

     HTTP/1.1 100 Continue            HEADERS
     Extension-Field: bar       ==>     - END_STREAM
                                        + END_HEADERS
                                          :status = 100
                                          extension-field = bar

     HTTP/1.1 200 OK                  HEADERS
     Content-Type: image/jpeg   ==>     - END_STREAM
     Transfer-Encoding: chunked         + END_HEADERS
     Trailer: Foo                         :status = 200
                                          content-length = 123
     123                                  content-type = image/jpeg
     {binary data}                        trailer = Foo
     0
     Foo: bar                         DATA
                                        - END_STREAM
                                      {binary data}

                                      HEADERS
                                        + END_STREAM
                                        + END_HEADERS
                                          foo = bar

This seems to imply at least it's compatible, but not that it's required.

On page 51, it explicitly states:

4.  optionally, one HEADERS frame, followed by zero or more
       CONTINUATION frames containing the trailer-part, if present (see
       [[RFC7230], Section 4.1.2](https://datatracker.ietf.org/doc/html/rfc7230#section-4.1.2)).

The only advice I can find proximal to that section is:

[4.4](https://datatracker.ietf.org/doc/html/rfc7230#section-4.4).  Trailer

   When a message includes a message body encoded with the chunked
   transfer coding and the sender desires to send metadata in the form
   of trailer fields at the end of the message, the sender SHOULD
   generate a Trailer header field before the message body to indicate
   which fields will be present in the trailers.  This allows the
   recipient to prepare for receipt of that metadata before it starts
   processing the body, which is useful if the message is being streamed
   and the recipient wishes to confirm an integrity check on the fly.

     Trailer = 1#field-name

I could not find anywhere that states that trailer in the header MUST be present, only SHOULD.

Does this mean semantically we should accept the same logic in HTTP/1?

@Maaarcocr
Copy link
Contributor Author

It seems like it's a SHOULD which is what I've also seen around. I was unsure about HTTP1.1 but I guess even if chunk encoded you need the trailer header only to do some integrity checks, not really for any other reason.

A better solution, for both HTTP2 and HTTP1 would be to let the user decide what to do when the trailer header is present and the trailers don't match it (also maybe what to do if the header is missing all together).

I do think that not requiring it is the right thing to do. We could approach it this way:

  1. first, just don't require it
  2. possibly add levers and whistles that the user of the library can use to perform integrity checks if the header is present
  3. possibly allow users to make it such the library keeps the current behaviour if the trailer header is missing

@ioquatix
Copy link
Member

Trailers as a semantic are quite tricky to fit within the current model for HTTP request-response and need explicit support in the headers semantics. So I need to make sure it's going to work as expected.

@Maaarcocr
Copy link
Contributor Author

Ok, that makes sense. Do you reckon we could merge the http2 changes sooner than the whole overall changes?

Mostly asking because, internally, we do rely on async-http and this change would unlock us. I know it's not really something that concerns you, but it would help us plan a bit if we knew when we could expect the http2 changes to get merged.

(The work we're doing internally may also get open sourced and it's related to gRPC, but all of that is TBD)

@ioquatix
Copy link
Member

Does gRPC optionally use trailers as a header or not at all?

@Maaarcocr
Copy link
Contributor Author

afaik not at all. Maybe some implementations may, but the ones that us the shared C implementation (Ruby, Python, C#) don't. I'm pretty sure the go one doesn't.

@ioquatix
Copy link
Member

Does gRPC use any of the same semantics for headers? like, accept, content-type etc. Do they have the same parsing rules as HTTP?

@ioquatix
Copy link
Member

I will try to get this merged by the weekend. There are a couple of other changes that need to be made to Protocol::HTTP::Headers.

@Maaarcocr
Copy link
Contributor Author

yes grpc is built on top of HTTP2, so when it comes to it, in order to talk to it you need a fairly basic HTTP2 client that sends the right headers and packs the body message the right way.

So yes, it uses fairly standard http headers. It also has its own trailers (like grpc-status), which are still just normal headers but only have their own semantics within grpc connections.

@ioquatix
Copy link
Member

While HTTP requires:

Unless the request includes a [TE](https://httpwg.org/specs/rfc7230.html#header.te) header field indicating "trailers" is acceptable, as described in [Section 4.3](https://httpwg.org/specs/rfc7230.html#header.te), a server SHOULD NOT generate trailer fields that it believes are necessary for the user agent to receive. Without a TE containing "trailers", the server ought to assume that the trailer fields might be silently discarded along the path to the user agent. This requirement allows intermediaries to forward a de-chunked message to an HTTP/1.0 recipient without buffering the entire response.

It seems like gRPC does not require any such negotiations. The semantics of headers and trailers are very similar but different enough that it's tricky to support both semantics in one package.

HTTP semantics "SHOULD":

  1. Client sends "te: trailers"
  2. Server sends "trailers: name1, name2, name..."
  3. Server chooses format appropriate for trailers (i.e. chunked encoding) where other formats might not work (fixed content length).

While the lower level protocols support this, it's hard to know what the design trade offs are, i.e. sacrifice strict HTTP semantics to allow gRPC to live in the same code base. async-http is for HTTP semantics.

My understanding is gRPC uses the same protocol as HTTP/2 but it's not sharing the same semantics. Because it has trailers with none of the semantics required for HTTP to work correctly and this includes proxies and the ability to correctly proxy requests with trailers, from HTTP/2 to HTTP/1.

I'm hesitant to adopt a change here which violates "SHOULD" recommendations in the RFCs. Of course, async-http is not perfect and doesn't follow te:trailers but we should add that, and it will be another problem for gRPC.

As such, I think I'll cut a new gem, async-grpc which implements the grpc client and server semantics. Unless you have some other idea.

Basically it boils down to, is gRPC really compatible with HTTP semantics? Or is it just using the same protocol/wire format?

@ioquatix
Copy link
Member

I've asked this question here: https://groups.google.com/g/grpc-io/c/Mw-HeAV_ZZc

@ioquatix
Copy link
Member

Another angle is, do we relax HTTP/2 protocol checks because it's acceptable in the RFCs? I need to know if HTTP/2 actually relaxes the semantic requirements around trailer header. From my initial read of RFC7540, it's not really clear.

@ioquatix
Copy link
Member

@mnot if you have any input on this it would be most appreciated.

@ioquatix
Copy link
Member

https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md

This document serves as a detailed description for an implementation of gRPC carried over HTTP2 framing. It assumes familiarity with the HTTP2 specification.

"over HTTP2 framing"

@mnot
Copy link

mnot commented Apr 29, 2022

Trailer handling is one of the things that's changed in the newest HTTP specs (about to be published Any Day Now) -- see here. If you have any questions, glad to try to help.

@Maaarcocr
Copy link
Contributor Author

Maaarcocr commented Apr 29, 2022

I don't fully agree that grpc is changing HTTP2 semantics, as grpc is pretty much built on top of it and it just sees it as a transport layer.

Some gRPC server implementations error back to you if you do not send the TE=trailers header. I've learnt this the hard way while trying to reimplement a grpc client, precisely the one that share the C implementation, so they are compliant with the HTTP spec. I think the go implementation actually allows you to either set or not set TE=trailers and it will do the right thing.

There was a medium article (that now got deleted for some reason) where the author described how to send grpc requests using cURL and its http2 client.

All in all, I think grpc doesn't violate the HTTP2 spec and it's actually one of the few cases where trailers are used.

@ioquatix
Copy link
Member

Okay so the expectation is to use te: trailers but not the trailer header itself which is “SHOULD” in the relevant RFCs?

@ioquatix
Copy link
Member

ioquatix commented Apr 29, 2022

Does:

A sender that intends to generate one or more trailer fields in a message should generate a Trailer header field in the header section of that message to indicate which fields might be present in the trailers.

Apply to gRPC or not?

A well formed HTTP response "SHOULD" generate a "trailer" header according to RFCs. I guess it won't hurt gRPC implementations and makes it compatible with HTTP semantics.

The question is:

How should the client behave when receiving an HTTP/2 response which includes trailers, but does not include a trailer header?

@ioquatix
Copy link
Member

ioquatix commented Apr 30, 2022

Currently, an HTTP/2 response which:

  1. Doesn't have a trailer header, and
  2. Has trailers, will generate a stream error.

By this PR only, a HTTP/2 response which:

  1. Doesn't have a trailer header, and
  2. Has trailers, will add those as headers,
  3. They will not be proxied correctly by async-http because it streams responses and was not aware that there would be trailers at the time the stream started, and thus may not have made the correct choice about how to proxy, e.g. HTTP/1.1 might use fixed content-length but with trailers it should use chunked encoding. In other words, trailers are treated like headers, and won't be sent because by the time they are received, the headers are already sent.

The solution would be to assume that HTTP/2 can always generate trailers, but there is some extra overhead involved in this. It will force HTTP/2 -> HTTP/1 to always use chunked encoding even if it was otherwise unnecessary.

@ioquatix
Copy link
Member

Basically, to be completely compatible with HTTP semantics, gRPC should specify:

trailer: grpc-status, grpc-message

Looking at https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#responses I wonder if that was implied by

Trailers → Status [Status-Message] *Custom-Metadata

But I'm assuming servers don't implement this?

@ejona86 do you have any feedback/insight into this spec/design?

@ioquatix
Copy link
Member

This was an interesting interpretation of the relevant docs: httpwg/http-core#18 (comment)

@Maaarcocr
Copy link
Contributor Author

I understand your concerns about grpc possibly skewing away from proper HTTP semantics and just adhering to its framing.

At the same time, would it make sense to maintain 2 http implementations whose only difference is that one errors out when the "Trailer" header is not sent and the other doesn't?

I have a working demo of using async-http as a grpc client and the only change I had to make for it to work was to allow trailers without the "Trailer" header.

@ioquatix
Copy link
Member

After sleeping on this, and reflecting on the relevant RFCs, I believe that the te: trailers and trailer header is completely useless in practice given the state of how things are being implemented. The word "SHOULD" is clearly not strong enough and means that any logic depending on this header cannot be relied on in general for the correct functioning of HTTP client/server.

It's hard for HTTP/1 since there are clear trade offs (fixed content-length body vs chunked encoding).

Therefore, I think we can accept this PR and we just have to accept the semantic complexities that come from it.

@ioquatix ioquatix merged commit 602d8a6 into socketry:main Apr 30, 2022
@mnot
Copy link

mnot commented May 1, 2022

Yes, this PR is obviously the right thing to do. A requirement on a sender (especially a SHOULD) does not place any onus on a recipient to check that the sender is doing the right thing -- it's encouraging good behaviour (hence, SHOULD). If it were necessary for the recipient to check it, that would have been said explicitly in the spec.

@mnot
Copy link

mnot commented May 1, 2022

See also this, under 'interpreting requirements'.

@ioquatix
Copy link
Member

ioquatix commented May 1, 2022

@mnot thanks for your feedback.

this PR is obviously the right thing to do

Sorry to nitpick, but it really wasn't obvious to me hahah.

How does a client/server in this case meaningfully do anything w.r.t. a SHOULD requirement? Emit warnings? How about gRPC in this case where it's essentially ignoring the SHOULD on a systematic basis. If standard X says SHOULD and standard Y building on X ignores that, it feels like a mistake to me.

@mnot
Copy link

mnot commented May 2, 2022

How does a client/server in this case meaningfully do anything w.r.t. a SHOULD requirement? Emit warnings?

A recipient doesn't need to do anything. This requirement is there to encourage senders to make the information available, nothing more.

How about gRPC in this case where it's essentially ignoring the SHOULD on a systematic basis. If standard X says SHOULD and standard Y building on X ignores that, it feels like a mistake to me.

gRPC isn't a standard, and doesn't use HTTP particularly well. That's on them.

@ioquatix
Copy link
Member

ioquatix commented May 2, 2022

Okay, I made updated releases of all the relevant gems with this new behaviour.

@ioquatix ioquatix added this to the v0.56.6 milestone May 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP/2 does not require the Trailer header to be there
4 participants