I stepped through this codepath to see what's going on. The operative branch for this scenario is right here:

    if ((content_type_index = h2o_find_header(&req->res.headers, H2O_TOKEN_CONTENT_TYPE, -1)) != -1 &&
        (mime = h2o_mimemap_get_type_by_mimetype(req->pathconf->mimemap, req->res.headers.entries[content_type_index].value, 0)) !=
            NULL)
        req->res.mime_attr = &mime->data.attr;
    else
        req->res.mime_attr = &h2o_mime_attributes_as_is;

Notably, the lookup compares the header name by pointer rather than by value:

    ssize_t h2o_find_header(const h2o_headers_t *headers, const h2o_token_t *token, ssize_t cursor)
    {
        for (++cursor; cursor < headers->size; ++cursor) {
            if (headers->entries[cursor].name == &token->buf) {
                return cursor;
            }
        }
        return -1;
    }

So even though we have the right header for h2o by value, it completely ignores it here and therefore disables all compression. There's another function called

But wait! The library will do the string interning for us if we let it. All it takes is a one character change right here:

    h2o_add_header_by_str(&rec_u->pool, &rec_u->res.headers,
                          hed_u->nam_c, hed_u->nam_w, 1, // <- THIS ONE RIGHT HERE
                          0, hed_u->val_c, hed_u->val_w);
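To make the pointer-versus-value point concrete, here is a minimal standalone sketch. It does not use h2o at all: `name_buf`, `header_list`, `find_header`, `add_header`, and the `try_intern` flag are hypothetical stand-ins, written on the assumption that interning amounts to swapping a byte-identical name for the address of a shared token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; these are NOT h2o's types. */
typedef struct { const char *base; size_t len; } name_buf;
typedef struct { const name_buf *name; const char *value; } header;
typedef struct { header entries[8]; size_t size; } header_list;

/* A single shared ("interned") header name, playing the role of a token. */
static const name_buf TOKEN_CONTENT_TYPE = { "content-type", 12 };

/* Lookup by pointer identity, mirroring the loop quoted above. */
static long find_header(const header_list *h, const name_buf *token)
{
    for (size_t i = 0; i < h->size; ++i)
        if (h->entries[i].name == token)
            return (long)i;
    return -1;
}

/* Append a header. With try_intern set, a name whose bytes match the
 * shared token is replaced by the token's address, so pointer lookups hit. */
static void add_header(header_list *h, const name_buf *name, int try_intern,
                       const char *value)
{
    assert(h->size < sizeof(h->entries) / sizeof(h->entries[0]));
    if (try_intern && name->len == TOKEN_CONTENT_TYPE.len &&
        memcmp(name->base, TOKEN_CONTENT_TYPE.base, name->len) == 0)
        name = &TOKEN_CONTENT_TYPE;
    h->entries[h->size].name = name;
    h->entries[h->size].value = value;
    h->size++;
}
```

With a caller-owned `name_buf` holding the bytes `content-type`, adding it with `try_intern` set to 0 leaves `find_header(&list, &TOKEN_CONTENT_TYPE)` returning -1, while adding it with `try_intern` set to 1 makes the same lookup return the entry's index. That is the same shape as the one-character change above.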
Brilliant, thank you @pkova. I took the liberty of force-pushing a cleaner diff so I don't dirty the blame needlessly. Passes smoke tests. Haven't tried anything crazy, but this should now restrict itself to h2o's text type defaults.
Adds a compression handler, configured to support gzip and brotli for responses of non-trivial size.
The library takes care of applying compression only on content types where this makes sense. For now, we're deferring to h2o's defaults, which is essentially a whitelist containing most common text-based mime types.
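As a rough illustration of the two gates described here (a minimum body size and a content-type allowlist), the sketch below shows the general shape of such a check. It is not h2o's implementation: `should_compress`, the `COMPRESSIBLE` list, and the `min_size` parameter are all hypothetical, and the real defaults are whatever h2o's mimemap marks as compressible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative allowlist only (an assumption); not h2o's actual defaults. */
static const char *const COMPRESSIBLE[] = {
    "text/", "application/json", "application/javascript"
};

/* Returns 1 when a response is worth compressing: the body is at least
 * min_size bytes and the content type starts with an allowlisted prefix.
 * Prefix matching lets "application/json; charset=utf-8" pass as well. */
static int should_compress(const char *mime, size_t body_len, size_t min_size)
{
    assert(mime != NULL);
    if (body_len < min_size)
        return 0; /* too small for compression to pay off */
    for (size_t i = 0; i < sizeof(COMPRESSIBLE) / sizeof(COMPRESSIBLE[0]); ++i)
        if (strncmp(mime, COMPRESSIBLE[i], strlen(COMPRESSIBLE[i])) == 0)
            return 1;
    return 0; /* e.g. images and other already-compressed formats */
}
```

Under these assumptions, a large `text/html` or `application/json` body passes, while a tiny body or an `image/png` response is sent as-is.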
We lightly adjust the `h2o_add_header_by_str` call so that h2o tokenizes our headers appropriately, letting it detect the content type.

With this change, we see bytes-over-the-wire reduced as expected.
A cache refresh of the groups web client on a ship with very little content (so, primarily .js file downloads) goes down from ~10 MB to ~3 MB transferred. (~30% of the original size)
Filling a channel with small posts by the same author gives a "recent posts" JSON scry result of ~23 KB, but only uses ~3 KB over the wire. (~13%!)
Happy to bikeshed the compression configuration here. I just picked values that seemed fairly middle-of-the-road.