Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add option to skip supplementation of a Date header #1495

Merged
merged 13 commits into from Nov 24, 2017

Conversation

deweerdt
Copy link
Member

@deweerdt deweerdt commented Nov 9, 2017

Current H2O always sets the date: header, both in HTTP/1 and HTTP/2. This might be problematic, for example when a mruby handler or when an upstream sets it.

This PR addresses the issue by not setting the header in the protocol layer, but instead leaving it to the individual handlers to do it. As a result:

  • proxy never sets it, leaves it as-is if upstream sets it
  • file always sets it
  • mruby and fastcgi only set it if isn't set already.

@deweerdt
Copy link
Member Author

/cc @i110 this is about the date: header in the response

@@ -601,20 +601,16 @@ static size_t flatten_headers_estimate_size(h2o_req_t *req, size_t server_name_a
static size_t flatten_headers(char *buf, h2o_req_t *req, const char *connection)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should tweak flatten_headers_estimate_size too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in ed0786d

@i110
Copy link
Contributor

i110 commented Nov 10, 2017

Thank you for the bug fix. One thing I don't understand is, are there any serious reasons that we mustn't append Date header if the upstream doesn't send it?

https://tools.ietf.org/html/rfc7231#section-7.1.1.2

A recipient with a clock that receives a response message without a
   Date header field MUST record the time it was received and append a
   corresponding Date header field to the message's header section if it
   is cached or forwarded downstream.

So in my understanding, the proxy handler also should do set it if isn't set already. If so, the protocol handlers can do the following and other handlers doesn't need to do anything.

    if (h2o_find_header(&req->res.headers, H2O_TOKEN_DATE, 0) < 0)
        h2o_resp_add_date_header(req);

But if there are some use cases that we don't want to fill the Date header, I think it might be good to add a configuration knob to switch the behavior. What do you think?

@deweerdt
Copy link
Member Author

deweerdt commented Nov 10, 2017

One thing I don't understand is, are there any serious reasons that we mustn't append Date header if the upstream doesn't send it?

Some servers give you full control over the headers that you send, and it surprises some customers if they see it appear when we switch to H2O.

Similarly, if you set a specific date in mruby, i would expect the header to take precedence over the one set by the stack?

We could definitely add a knob to handle the proxy case (where the date: header is set unless it's already set by upstream)

@i110
Copy link
Contributor

i110 commented Nov 10, 2017

Similarly, if you set a specific date in mruby, i would expect the header to take precedence over the one set by the stack?

Yes, I understand that the header value which is explicitly set by the upstream server or mruby handler should take precedence over auto-filled value. So I agree that the current implementation has the bug. OTOH, if the date header is missing, I'm afraid the default behavior of h2o should conform HTTP spec, i.e. fill the Date header automatically. So adding a knob is better way to resolve your issue?

BTW I tried nginx and apache proxy modules, they fill the date header if it's missing, and there seems to be no configuration options to suppress that.

@deweerdt
Copy link
Member Author

So adding a knob is better way to resolve your issue?

Yes, i think i agree, it would also have the advantage of not changing the current behavior (except when the header is set by upstream, but that's clearly a bug). @kazuho any thoughts?

@@ -548,6 +548,10 @@ static int fill_headers(h2o_req_t *req, struct phr_header *headers, size_t num_h
}
}

/* add date: if it's missing from the response */
if (h2o_find_header(&req->res.headers, H2O_TOKEN_DATE, 0) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use != -1 as a check? It means the same, but it might be a good idea to be consistent with other parts of the code that invoke the same function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 79e5186

@kazuho
Copy link
Member

kazuho commented Nov 10, 2017

So adding a knob is better way to resolve your issue?

Yes, i think i agree, it would also have the advantage of not changing the current behavior (except when the header is set by upstream, but that's clearly a bug). @kazuho any thoughts?

Actually, I prefer having less knobs. Having less makes the code concise and less prone to bugs.

I would argue that we do not need to fix the behavior of a broken HTTP server by the fact that we are forwarding the response.

I would also be a bit worried about the change of the meaning when we add a Date header. Consider the case in which origin sends an Expires header without Date. The client is expected to calculate the difference between it's clock time and the value of the Expires header, and use that as the period for the returned response to be considered fresh. If we add Date header, the meaning changes, since the client would then use the value of the Date header field supplied by H2O to calculate the difference.

@i110
Copy link
Contributor

i110 commented Nov 10, 2017

Hmm, I still feel that it's better to have default behaviors conform to RFC as strict as possible, and changing(?) the meaning of Date header is slight problem for age calculation, because many clients like browsers also use their local time to check the freshness in such case (I believe this approximation works well in most cases) and, above all, it's the spec. Also as an additional bonus, if we can handle Date header in protocol handlers, we can remove Date-header code in each handlers and the fear of forgetting to write that when we create new handlers

But actually I don't have strong motivation to argue it because I think the chance of missing Date header is not so high, so sorry for bothering you!

@deweerdt deweerdt force-pushed the only-set-date-header-when-needed branch from cfa38ec to c9991fb Compare November 11, 2017 05:32
@kazuho
Copy link
Member

kazuho commented Nov 15, 2017

@i110 I actually might agree with you about making this an opt-in feature. RFC 7231 section 7.1.1.2 states:

An origin server MUST NOT send a Date header field if it does not
have a clock capable of providing a reasonable approximation of the
current instance in Coordinated Universal Time.
(snip)
A recipient with a clock that receives a response message without a
Date header field MUST record the time it was received and append a
corresponding Date header field to the message's header section if it
is cached or forwarded downstream.

So when acting as an intermediary, H2O is required to add a Date header (unless the user wants to opt-out from such behavior).

Also as an additional bonus, if we can handle Date header in protocol handlers, we can remove Date-header code in each handlers and the fear of forgetting to write that when we create new handlers

I am not convinced with this.

In the proxy handler, we need to preserve the Date header sent by upstream (the fact that we do not not do that now is a bug). The fact means that the logic that adds the Date header needs to traverse through the HTTP headers list. I do not want such traversal to happen for every HTTP request H2O handles (e.g. when serving a static file).

That leads to the conclusion that every handler is required to add a Date header.

This is ON by default (which has beem H2O's behavior historically). If
set to OFF, H2O won't be setting the `date:` header if it's missing from
the upstream response
@deweerdt
Copy link
Member Author

245bcb4 adds a proxy.emit-missing-date-header knob, and it defaults to ON, since that's been H2O's behavior historically.

@kazuho kazuho merged commit 2886d79 into h2o:master Nov 24, 2017
@kazuho
Copy link
Member

kazuho commented Nov 24, 2017

Thank you for working on the issue!

@i110 i110 mentioned this pull request Dec 22, 2017
@kazuho kazuho changed the title Only set date header when needed add option to skip supplementation of a Date header Jan 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants