New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP/1 proxying: preserve header cases #1194

Merged
merged 18 commits into from Feb 22, 2017

Conversation

Projects
None yet
2 participants
@deweerdt
Member

deweerdt commented Feb 9, 2017

This patch adds a new parameter: proxy.preserve-original-case. When
this parameter is set, we record the casing of the received headers
(both in the reponse and request), and when forwarding the request (or
the response), we apply the original casing.

@deweerdt

This comment has been minimized.

Show comment
Hide comment
@deweerdt

deweerdt Feb 9, 2017

Member

@kazuho This PR isn't ready for merging: it's missing more testing, and it doesn't handle special headers like Connection: or Content-Length:. However I thought close enough to start to discuss the implementation.
My first intention was to keep the original buffer around and keep pointers to it, but on a second thought, it seemed it would be less memory consuming to keep the original casing in a bitfield as done here.
I don't see a good way to avoid the allocation in http1client.c, even if the feature is not in use, but other than that there should be no change of behavior if the option is off.
Any thoughts?

Member

deweerdt commented Feb 9, 2017

@kazuho This PR isn't ready for merging: it's missing more testing, and it doesn't handle special headers like Connection: or Content-Length:. However I thought close enough to start to discuss the implementation.
My first intention was to keep the original buffer around and keep pointers to it, but on a second thought, it seemed it would be less memory consuming to keep the original casing in a bitfield as done here.
I don't see a good way to avoid the allocation in http1client.c, even if the feature is not in use, but other than that there should be no change of behavior if the option is off.
Any thoughts?

deweerdt added some commits Feb 8, 2017

HTTP/1 proxying: preserve header cases
This patch adds a new parameter: `proxy.preserve-original-case`. When
this parameter is set, we record the casing of the received headers
(both in the reponse and request), and when forwarding the request (or
the response), we apply the original casing.
- have ucase be at least one to fix a c++/c issue
- fix h2o_init_request to copy the casing information correctly
- simplify init_headers
@kazuho

Thank you for the PR. I think this is a good work that improves interoperability.

I've left my comments in-line. Please let me know what you think.

Show outdated Hide outdated include/h2o/http1client.h
@@ -46,7 +46,7 @@ typedef struct st_h2o_http1client_header_t {
typedef int (*h2o_http1client_body_cb)(h2o_http1client_t *client, const char *errstr);
typedef h2o_http1client_body_cb (*h2o_http1client_head_cb)(h2o_http1client_t *client, const char *errstr, int minor_version,
int status, h2o_iovec_t msg, h2o_http1client_header_t *headers,
size_t num_headers);
h2o_str_case_t **orig_hname_case, size_t num_headers);

This comment has been minimized.

@kazuho

kazuho Feb 13, 2017

Member

How about changing the argument: h2o_http1client_header_t *headers to h2o_header_t *?

Doing so, and tokenizing the names right after calling h2o_strtolower seems like the way we should go.

By doing so, costly comparisons of iterating through the headers in the later moments in the on_head function of lib/common/http1client.c can be optimized to use token comparisons (e.g. change h2o_memis(headers[i].name, headers[i].name_len, H2O_STRLIT("connection")) to headers[i].name == H2O_TOKEN_CONNECTION).

I do not think the additional amount of stack used by the list of h2o_header_t would be a issue, if we could leave the scope that allocates struct phr_header headers[100] before calling on_head.

@kazuho

kazuho Feb 13, 2017

Member

How about changing the argument: h2o_http1client_header_t *headers to h2o_header_t *?

Doing so, and tokenizing the names right after calling h2o_strtolower seems like the way we should go.

By doing so, costly comparisons of iterating through the headers in the later moments in the on_head function of lib/common/http1client.c can be optimized to use token comparisons (e.g. change h2o_memis(headers[i].name, headers[i].name_len, H2O_STRLIT("connection")) to headers[i].name == H2O_TOKEN_CONNECTION).

I do not think the additional amount of stack used by the list of h2o_header_t would be a issue, if we could leave the scope that allocates struct phr_header headers[100] before calling on_head.

Show outdated Hide outdated include/h2o.h
/**
* If non-NULL, contains a bitfield indicating whether a character in @name was upper case
*/
h2o_str_case_t *orig_hname_case;

This comment has been minimized.

@kazuho

kazuho Feb 13, 2017

Member

I am unsure if we want to save memory at the cost of spending extra cycles for building the header in the original case. Unless you have a need, lets just store a char * that points to the characters in the original case.

@kazuho

kazuho Feb 13, 2017

Member

I am unsure if we want to save memory at the cost of spending extra cycles for building the header in the original case. Unless you have a need, lets just store a char * that points to the characters in the original case.

Show outdated Hide outdated include/h2o.h
* a boolean flag if set to true, instructs the proxy to forward headers in the same case they were received
* (only active in the HTTP/1 case)
*/
int preserve_original_case;

This comment has been minimized.

@kazuho

kazuho Feb 13, 2017

Member

Let's forget about making this a configuration directive.

I think always preserving the case of headers for requests originating from HTTP/1 protocol would be fine.

@kazuho

kazuho Feb 13, 2017

Member

Let's forget about making this a configuration directive.

I think always preserving the case of headers for requests originating from HTTP/1 protocol would be fine.

deweerdt added some commits Feb 13, 2017

- Store the original value in h2o_header_t instead of a bit field
  storing the original case
- Pass h2o_header_t instead of h2o_http1client_header_t to on_head and
  informational_cb. This allows to get rid of h2o_http1client_header_t
  entirely, and perform the tokenization in http1client
- Remove configuration option: we always preserve the original case
- Free the original case in http1client.c:on_head
- Fix test labels
@deweerdt

This comment has been minimized.

Show comment
Hide comment
@deweerdt

deweerdt Feb 13, 2017

Member

@kazuho thank you for your comments. I've addressed them in 486ec19, I believe. It seemed natural to remove h2o_http1client_header_t entirely, after doing the tokenization in http1client.c, let me know if that works for you.

Member

deweerdt commented Feb 13, 2017

@kazuho thank you for your comments. I've addressed them in 486ec19, I believe. It seemed natural to remove h2o_http1client_header_t entirely, after doing the tokenization in http1client.c, let me know if that works for you.

@kazuho

Thank you for the changes! Adding a field to a struct that we use at so many places is a non-trivial amount of work, thank you very much.

I think we are on the right direction. I have left comments in-line.

Show outdated Hide outdated include/h2o.h
@@ -1088,12 +1092,13 @@ ssize_t h2o_find_header_by_str(const h2o_headers_t *headers, const char *name, s
/**
* adds a header to list
*/
void h2o_add_header(h2o_mem_pool_t *pool, h2o_headers_t *headers, const h2o_token_t *token, const char *value, size_t value_len);
h2o_header_t *h2o_add_header(h2o_mem_pool_t *pool, h2o_headers_t *headers, const h2o_token_t *token, const char *value,
size_t value_len);

This comment has been minimized.

@kazuho

kazuho Feb 14, 2017

Member

How about adding orig_hname as an argument instead of returning h2o_header_t * so that you could set the value in the caller?

The same goes for h2o_add_header_by_str as well.

@kazuho

kazuho Feb 14, 2017

Member

How about adding orig_hname as an argument instead of returning h2o_header_t * so that you could set the value in the caller?

The same goes for h2o_add_header_by_str as well.

Show outdated Hide outdated include/h2o.h
/**
* The name of the header as originally received from the client
*/
const char *orig_hname;

This comment has been minimized.

@kazuho

kazuho Feb 14, 2017

Member

Maybe the order of the fields should be name -> orig_hname -> value?

I think we should consider orig_hname as a first-class member of h2o_header_t (we need to always take care of the value!). In this respect, having fields for the name first makes more sense to me.

Also, it might be better to rename the field to orig_name. We do not need h since this struct represents a header.

@kazuho

kazuho Feb 14, 2017

Member

Maybe the order of the fields should be name -> orig_hname -> value?

I think we should consider orig_hname as a first-class member of h2o_header_t (we need to always take care of the value!). In this respect, having fields for the name first makes more sense to me.

Also, it might be better to rename the field to orig_name. We do not need h since this struct represents a header.

@deweerdt

This comment has been minimized.

Show comment
Hide comment
@deweerdt

deweerdt Feb 14, 2017

Member

@kazuho thank you for your comments. I believe i've addressed all your comments so far in 130b457

Member

deweerdt commented Feb 14, 2017

@kazuho thank you for your comments. I believe i've addressed all your comments so far in 130b457

@kazuho

Thank you for the modification! I think we have become more error-prone thanks to the changes.

May I ask one more change before merging this to master?

Show outdated Hide outdated lib/common/http1client.c
const h2o_token_t *token;
char *orig_name;
orig_name = h2o_strdup(NULL, src_headers[i].name, src_headers[i].name_len).base;

This comment has been minimized.

@kazuho

kazuho Feb 15, 2017

Member

How about creating a memory pool as a local variable of the function (i.e. h2o_mem_pool_t pool; h2o_mem_init_pool(&pool) at the top scope of the function), and allocate memory from that pool instead of calling h2o_strdup (and free) for every header name?

I believe that that would be a good optimization considering the fact that h2o_mem_alloc_pool is super-fast for small objects.

We could also allocate headers of type h2o_headers_t using the pool (instead of h2o_headers_t *headers[100]).

@kazuho

kazuho Feb 15, 2017

Member

How about creating a memory pool as a local variable of the function (i.e. h2o_mem_pool_t pool; h2o_mem_init_pool(&pool) at the top scope of the function), and allocate memory from that pool instead of calling h2o_strdup (and free) for every header name?

I believe that that would be a good optimization considering the fact that h2o_mem_alloc_pool is super-fast for small objects.

We could also allocate headers of type h2o_headers_t using the pool (instead of h2o_headers_t *headers[100]).

This comment has been minimized.

@deweerdt

deweerdt Feb 15, 2017

Member

Good idea, will do. Thanks!

@deweerdt

deweerdt Feb 15, 2017

Member

Good idea, will do. Thanks!

@deweerdt

This comment has been minimized.

Show comment
Hide comment
@deweerdt

deweerdt Feb 16, 2017

Member

@kazuho e843a04 now uses a pool allocator in http1client.c::on_head. Thank you.

Member

deweerdt commented Feb 16, 2017

@kazuho e843a04 now uses a pool allocator in http1client.c::on_head. Thank you.

@kazuho kazuho merged commit 7912487 into h2o:master Feb 22, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@kazuho

This comment has been minimized.

Show comment
Hide comment
@kazuho

kazuho Feb 22, 2017

Member

Thank you for the changes. Merged to master.

Member

kazuho commented Feb 22, 2017

Thank you for the changes. Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment