Add ability to compress request body #989

mstoykov · 2019-04-08T10:10:04Z

This adds compression param to all http methods that will compress the body
set the correct Content-Encoding and Content-Length and will send the request
with the compressed body. The currently supported algorithms are 'gzip' and 'deflate'.
Combinations of the two separated by a comma(",") are also supported.

This fixes #988

This adds `compression` param to all http methods with only current possible value `gzip` that will compress the body set the correct `Content-Encoding` and `Content-Length` and will send the request with the compressed body.

robingustafsson

LGTM, just don't forget to update release notes 🙂

mstoykov · 2019-04-08T10:15:18Z

LGTM, just don't forget to update release notes slightly_smiling_face

I am even going to update the docs :P
after all the checks pass ... and figure out what appveyor's problem is >.>

codecov · 2019-04-08T10:31:58Z

Codecov Report

Merging #989 into master will decrease coverage by 0.02%.
The diff coverage is 53.33%.

@@            Coverage Diff             @@
##           master     grafana/k6#989      +/-   ##
==========================================
- Coverage   72.12%   72.09%   -0.03%     
==========================================
  Files         131      131              
  Lines        9621     9636      +15     
==========================================
+ Hits         6939     6947       +8     
- Misses       2268     2272       +4     
- Partials      414      417       +3

Impacted Files	Coverage Δ
js/modules/k6/http/request.go	`78.42% <53.33%> (-1.36%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3506ee1...9efecb2. Read the comment docs.

codecov · 2019-04-08T10:31:59Z

Codecov Report

Merging #989 into master will increase coverage by 0.16%.
The diff coverage is 90.69%.

@@            Coverage Diff             @@
##           master     grafana/k6#989      +/-   ##
==========================================
+ Coverage   72.12%   72.28%   +0.16%     
==========================================
  Files         131      132       +1     
  Lines        9621     9699      +78     
==========================================
+ Hits         6939     7011      +72     
- Misses       2268     2273       +5     
- Partials      414      415       +1

Impacted Files	Coverage Δ
js/modules/k6/http/request.go	`80.48% <100%> (+0.7%)`	⬆️
lib/netext/httpext/compression_type_gen.go	`86.66% <86.66%> (ø)`
lib/netext/httpext/request.go	`88.28% <89.65%> (+0.11%)`	⬆️
lib/netext/httpext/tracer.go	`98.21% <0%> (+1.78%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3506ee1...5ae8d5c. Read the comment docs.

js/modules/k6/http/request.go

release notes/upcoming.md

na-- · 2019-04-09T08:32:39Z

Moving the discussion about just using the Content-Encoding header to specify any compression here, instead of in the inline code comments. In my mind, this would make both for better UX, and make the implementation cleaner.

It seems fine to me if we should throw errors when users try to specify unsupported Content-Encoding values (e.g. br for Brotli, or multiple encodings like deflate, gzip). Until we support those things, of course. For example, there's already an open issue for brotli support, and we can easily add other ones for zstd, compress and any others or multiple layered encodings like gzip, deflate

A somewhat connected issue would be the Accept-Encoding and Content-Encoding headers we generate in the HAR file converter, as mentioned here - we need to make sure we don't advertise things we don't support there.

robingustafsson · 2019-04-09T08:52:37Z

I'm against using user specified headers as an implicit signal that we should do additional processing of the body contents (if you specify application/json or any other Content-Type we don't do any special processing of the body data afaik?).
Specifying headers like Content-Encoding to me means you're doing low-level modifications of the request and I'd thus expect you to also do the compression of the body yourself.
I think any situation where we want to do further processing in Go, it should be a "param" as it's more explicit that there will be some processing done in Go that will (potentially) modify the HTTP headers and/or body.

na-- · 2019-04-09T10:09:07Z

Well... I'm not totally convinced that we need separate options. I get that it might be a bit strange to modify k6's behavior by setting the headers map of a request object, but if the alternative is having multiple different and separate settings that magically modify the headers for every little thing? This will only exacerbate the config mess we're currently in...

Even ignoring that, how do we handle things if those settings and the user-specified header values conflict? Do we silently overwrite them or do we throw an error? Or what if the user specified a gzip content-type but didn't compress the body themselves? And in any case, such compression will be super inefficient in the JS runtime, so should we expose the gzip/deflate/brotly/etc. primitives in the K6 JS API as well? The whole issue becomes much more complicated for no apparent UX benefit...

Regarding Content-Type: there are a bunch of open issues for application/json and multipart/form-data content types that would be much better with this approach: #843, #747, #878, #988 (yes, this one 😁 ). The actual k6 code will also be more readable. This seems true to me also for any converter output, i.e. the internal HAR converter (https://github.com/loadimpact/k6/issues/762) and the other jmeter/postman/etc. converters.

And for super-advanced users that want to manually compress their request bodies or mess around with other internals, we could add a single boolean ignoreRequestHeaders option in http.Params that would disable any extra k6 behavior, though I doubt many users would use it.

Also, we kind of already read the HTTP headers, to get any Host value the user manually specified. And to merge cookies between the request and the cookie jar... I think I actually found a bug with the current k6 behavior when I was checking this. What do you think this should produce?

import http from "k6/http";

export let options = {
  iterations: 2,
};

export default function () {
  // Set cookies
  if (__ITER === 0) {
    let resp1 = http.get("https://httpbingo.org/cookies/set?key=value1");
    console.log(resp1.body);
  }

  // Check cookies
  let resp2 = http.get("https://httpbingo.org/cookies");
  console.log(resp2.body);

  // Check custom cookies...
  let resp3 = http.get("https://httpbingo.org/cookies", {
    headers: { "Cookie": "key=value2;anotherkey=anothervalue" }
  });
  console.log(resp3.body);
}

And similarly, we probably need to validate any Accept-encoding headers that contain encoding that k6 doesn't support.

mstoykov · 2019-04-09T10:29:16Z

Us reading the headers is not a problem, us doing "magical" interactions with them is.
My opinion on how params vs headers should work (not how it works now but how it will be least confusing):

headers take precedence
If we would modify a header because of param we don't
2.1. Maybe as a better UX print a warning when params are specifically specified that will not overwrite headers, but not if they are set for the whole group()/script after Ideas for improving the group() built-in function #884 .

The bug above is strange ... I guess it has something to do with the way cookies are used and cookiejars and such.

As an additional question how do you think #970 should be implemented ? because I have always thought it will be through params supposedly or do you propose they add the headers themself

na-- · 2019-04-09T11:02:49Z

Sorry, I'm not sure I understand what you mean by "headers take precedence". Let's try with examples.... Here's this script, what should happen?

http.get("https://httpbin.org/anything", {
  compression: "gzip",
  headers: {
    "Content-Encoding": "br",
  }
});

Here's a similar script, with Content-Type this time:

http.post("https://httpbin.org/anything", {foo: "bar", "baz": "zaz"}, {
  contentType: "application/json",
  headers: {
    "Content-Type": "multipart/form-data; boundary=--myBoundary",
  }
});

If "headers take precedence" means we do nothing, since the user explicitly set the header, then this seems like a poor UX to me, since the user also explicitly set our parameter. Or are those errors, warnings, or do we silently overwrite the headers? And how would you deal with more complicated cases, like setting that custom multipart boundary with params? Like this:

http.post("https://httpbin.org/anything", {foo: "bar", "baz": "zaz"}, {
  contentType: "multipart/form-data; boundary=--myBoundary",
});

or by separate parameters, using an object? Or multiple encodings? In a header it's simply Content-Encoding: deflate, gzip, bit do you use contentEncoding: "deflate, gzip" or contentEncoding: ["deflate", "gzip"] in the JS code? I'm struggling to understand why we're trying to re-invent the wheel when there's already a standardized way all of these options are configured that would be in a needless conflict with any custom options we introduce.

na-- · 2019-04-09T11:06:09Z

Regarding #970 - in the same way I'm not suggesting we make users implement brotly/gzip/zstd compression themselves, or NTLM/digest/basic authentication, I'm also not suggesting that they should implement the h2c headers themselves. There are also other protocol-level issues that are way beyond the level of headers...

mstoykov · 2019-04-09T12:44:06Z

p.s. While writing this I decided that this discussion should probably be moved outside of this PR as this is getting out of scope

By "precedence" I mean - we do nothing if they've setted a header that our param would've set. I think it means that they have mistaken something and we either error out (seem excessive but maybe the better option) or print a warning (probably going to get noisy).

first example - we don't encode anything send the header as it is and print a warning/error out without doing anything. Will their server get uncompressed body when they expected compressed one -yes . Is that bad - yes. Do we have to read minds in this situation - I don't think so.

second example - what does contentType does ? and leave it to what the Header is.

third example - they are not doing it correctly ... it is not our fault the HTTP spec for multipart/form-data requires a lot of hard to write by hand text ...
Which after I read some more leaves me to believe that

K6 Does not respect the boundary value placed in a Content-Type header #843 is invalid - they've set the boundary wrong in the header. (even if they are using the map/json object body)
The whole thing with the file seems to me like it was a try to make it work in all cases but tried to be magical by not actually not letting the user set the Content-Type through some useful manner. I would propose that we add some params option that sets it instead.

All my arguments have been about: lets not overwrite what the user has said or change things without it being extremely obvious that this is what the user wants is, because I HATE MAGIC (not magic the gathering, that one is great). Everything that does something without it being obvious is potential issue.

If you set param that says compression:gzip you would expect that we will compress something as gzip ... for example your body as there is nothing else that is gzipable in that particular case. Should we also require them to set the header ? I would say "no" because we don't make them set Content-Length* as well. (I suppose if they Content-Length we should respect that ... although this could be problematic)
If you set your header to something you will expect that your header will have the value you have set it ... no matter what.
If you have given us some json object(map) instead of string/array of integers you would ... I suppose expect that we will send it as formdata and here the magic (given that we want to support this case) is inevitable because if we also support setting files there are 2 Content-Types we can choose from and this depends on what the server expects. IMO we should keep the current behaviour and add an option for through params for users to say that this request even though it can be application/x-www-form-urlencoded you want it to be mutlipart/form-data. I guess the name should multipart: true. I would need to go look at all possible/likely Content-Types to see if there are other funky corner cases.
But this whole issue comes from magic that we have decided to add for user convenience.

And the thing while writing this that I found out is that there doesn't appear to be cases where k6 reads headers and than change them- it just change them to not make users set them but we require them to set other things in order for things like auth/multipart/formdata and so on to work, without the user setting the. We obviously will need to continue to set some headers on behalf of the user as thing such as Cookies and Content-Length and Agent and so on are hard to do. But IMO we should always leave what they have set. Otherwise there is no way to check how an API behaves if you send it crappy headers/data. To that end we should probably add another params flag such as don't touch anything or at least don't touch headers.
This of course all goes toward what exactly do we want our users to be able to do ... easy and how easy is easy for particular use case. I think less magic is always better. And in my opinion the route with "a lot" of params for different things leaves it possible for users to do testing for things that are not "usual"/"standard" or downright broken and see how their tests behave. Going down the route of us deciding based on headers what the user actually wanted seems to me to tie our and our users hand in the future to only ways of working that we have considered useful.

na-- · 2019-04-09T16:09:58Z

I think it might be a little too late to move the discussion to the issue, since a lot of the context is already here...

#843 is invalid - they've set the boundary wrong in the header. (even if they are using the map/json object body)

They haven't though - boundary=--myBoundary is perfectly valid, unless it's contained somewhere in the body.

second example - what does contentType does ? and leave it to what the Header is.

You've kind of answered yourself below in your post. Say that we "ignore" (i.e. not use for configuration but warn/error/? if there's mismatch?) the user-specified headers. Then we'd need another new http.Params option that could be used to specify how k6 should encode the body users send in http requests, if it's an object. I'll use it as an example again, since it may be the most tricky use case out of all of them...

The current Content-Type logic is roughly this:

if there's no body or if the body is string or []byte - k6 won't mess with the Content-Type header and whatever the user has specified will be sent, and if nothing is specified, nothing will be sent.
if we pass an object as body:
- if that object contains an http.FileData element, we'll send Content-Type: multipart/form-data; boundary=some-random-boundary
- otherwise, we'll urlencode the body and send Content-Type: application/x-www-form-urlencoded

This is mostly reasonable, and yes - slightly "magical" ✨ 😄 . If I was starting from scratch, maybe I would have had a much more explicit system, like how you're forced to build a multipart request in Go. But, given that k6 is a load testing tool, this small amount of magic is probably justified and results in better UX, especially if it actually worked better. It currently has a lot of shortcomings, and it breaks in some corner cases:

it's very inflexible, if users want to force Content-Type: multipart/form-data for non-file-upload requests, there's no way to do it (Better handling of multipart/form-data requests #747)
there's no way to specify a custom boundary (K6 Does not respect the boundary value placed in a Content-Type header #843)
even more general than the above issue, it breaks when there's a mismatch between a custom Content-Type header and the body we're sending. For example, sending Content-Type: application/x-www-form-urlencoded and an http.file() body produces something completely broken, since the headers have precedence
there are no warnings or errors since k6 doesn't check for custom Content-Type values

All of that can be easily fixed if we check for custom Content-Type headers values that users are specifying and adjust k6's behavior accordingly. For another minor UX win, we'll even be able to handle Content-Type: application/json requests accordingly (#878). I don't think this would substantially increase the amount of magic, it will just fix the currently broken magic 😉

But instead of making the two options (body and Content-Type header) work in sync, like users expect them to, you're instead proposing that we should add a third option multipart: true in the http.Params object so we avoid more magic 🙌 😄. Not only do I think that it won't fix any of the issues above, it will simply make things more complicated and unclear, and add more validation code that we'll have to add so users don't shoot themselves in the foot.

I feel your dislike for magic, I usually don't like it as well, but mostly the "magic at a distance" phenomenon. In this case, k6 reading the Content-Type and Content-Encoding headers of a single request and acting accordingly seems like an OK compromise to me... And even though I feel like testing for things that are not "usual"/"standard" or downright broken is a bit of a straw-man use case for a load-testing tool, it can easily be accommodated! I mentioned it above, but we can simply add single, simple, boolean option in http.Params that would cause k6 to disregard any request headers. We can call it disableMagic, or disregardHeaders or ignoreCustomHeaders or whatever - it will turn off any special handling for the following headers:

Host - we already do that... though probably in the wrong place 😄
Cookie - we already do that as well, as demonstrated in the example above
Content-Encoding - what started the argument
Content-Type - what I described above as the most complicated use case IMO
Accept-Encoding - we probably should verify that we support the encodings, otherwise servers will return garbage
any other "special" headers we're currently not validating but probably should: TE, Trailer, Upgrade, Connection, Keep-Alive, ...? there are a lot of corner cases we've likely glossed over and ignore when we should instead return a warning or an error...

na-- · 2019-04-10T04:56:34Z

After sleeping on this and thinking about it some more, I've mostly come around to your opinion, @mstoykov and @robingustafsson 😄, at least with regards to this specific feature (compression / contentEncoding). Probably also any other similar features we'll want in the future as well, with the only big exception being Content-Type.

Because we're already using some "magic" to determine the request Content-Type based on the body contents, and the complexity of introducing a third variable in that whole mess (see above message), I still feel like, in that specific case, it would make more sense for us to actually fix the magic so it works like people expect it to work, i.e. read any custom Content-Type header values that users are specifying and adjust k6's behavior accordingly. But only for this specific header, which we'll properly document.

I also propose that instead of a disableMagic (or equivalent 😄) option in http.Params, in the future we probably should add a global validateHeaders option, probably defaulting to true. If that option is enabled, any mismatches or wrong or unsupported headers will cause k6 to emit a warning, or if throw is also enabled - throw an error.

That will be a totally separate PR of course, and if you agree, I can add an issue about it.

js/modules/k6/http/request.go

release notes/upcoming.md

js/modules/k6/http/request_test.go

- Move compression to httpext - support deflate and gzip - support combination of the two

js/modules/k6/http/request_test.go

js/modules/k6/http/request.go

lib/netext/httpext/request.go

na--

Mostly LGTM now, besides that minor nitpick about more informative error messages and having this code not use the new constants.

lib/netext/httpext/request.go

js/modules/k6/http/request.go

… body

…from ParseRequest

lib/netext/httpext/request.go

na--

LGTM

artych · 2019-04-12T15:24:16Z

Thanks, @mstoykov and @na-- , will try this tomorrow :)

This would restore the behavior pre-#989, so that any content-length header values manually specified by users would be overwritten by k6. Only, this time it won't be done silently, a warning would be shown every time there was a mismatch. The HAR converter is also modified to not include the content-length header in the generated sources. While that header isn't going to be a problem if its values matches the actual body content length, some HAR files are broken and include the header without any body data, and other times users may modify the body but forget the header. So, since k6 would automatically send it, we shouldn't include it in the script.

This adds `compression` param to all http methods with that will compress the body set the correct `Content-Encoding` and `Content-Length` and will send the request with the compressed body. The current compressions supported are `gzip` and `deflate` and their combinations separated by a comma (`,`). If Content-Length of Content-Encoding has been set manually by the user, they won't be reset but instead a warning will be printed.

This would restore the behavior pre-grafana#989, so that any content-length header values manually specified by users would be overwritten by k6. Only, this time it won't be done silently, a warning would be shown every time there was a mismatch. The HAR converter is also modified to not include the content-length header in the generated sources. While that header isn't going to be a problem if its values matches the actual body content length, some HAR files are broken and include the header without any body data, and other times users may modify the body but forget the header. So, since k6 would automatically send it, we shouldn't include it in the script.

Add ability to gzip compress request body fix #988

9efecb2

This adds `compression` param to all http methods with only current possible value `gzip` that will compress the body set the correct `Content-Encoding` and `Content-Length` and will send the request with the compressed body.

mstoykov requested review from robingustafsson and na-- April 8, 2019 10:10

mstoykov modified the milestone: v1.0.0 Apr 8, 2019

robingustafsson approved these changes Apr 8, 2019

View reviewed changes

mstoykov added 3 commits April 8, 2019 14:24

Add Release note for #988

9e43517

Add test for unsupported body compression

d85f528

Move closing of gzip in defer

3993938

na-- requested changes Apr 9, 2019

View reviewed changes

js/modules/k6/http/request.go Outdated Show resolved Hide resolved

js/modules/k6/http/request.go Outdated Show resolved Hide resolved

release notes/upcoming.md Outdated Show resolved Hide resolved

na-- mentioned this pull request Apr 9, 2019

Wrong cookie merging #996

Closed

na-- requested changes Apr 10, 2019

View reviewed changes

js/modules/k6/http/request.go Outdated Show resolved Hide resolved

release notes/upcoming.md Outdated Show resolved Hide resolved

js/modules/k6/http/request_test.go Outdated Show resolved Hide resolved

na-- changed the title ~~Add ability to gzip compress request body fix #988~~ Add ability to gzip compress request body Apr 10, 2019

mstoykov added 2 commits April 10, 2019 17:05

Refactor body compression per PR comments

679b66d

- Move compression to httpext - support deflate and gzip - support combination of the two

Update the release notes for #988

8479b0c

mstoykov changed the title ~~Add ability to gzip compress request body~~ Add ability to compress request body Apr 10, 2019

WIP on not overwriting user set Content-(Length|Encoding) headers

816ace7

golangcibot reviewed Apr 10, 2019

View reviewed changes

mstoykov added 2 commits April 11, 2019 10:56

Update release notes

8561e39

refactor the compression of body loggic to a function

eae1eae

na-- approved these changes Apr 11, 2019

View reviewed changes

lib/netext/httpext/request.go Outdated Show resolved Hide resolved

js/modules/k6/http/request.go Outdated Show resolved Hide resolved

mstoykov added 4 commits April 11, 2019 16:41

Add more tests for handling rare/imposible body read problems

ab32def

Better error message when not overwriting headers and compressing the…

a8388a0

… body

Add test for bad/unknown compression algorithm

cca9c55

Refactor the whole setting of body and content length to MakeRequest …

5257fa0

…from ParseRequest

golangcibot reviewed Apr 11, 2019

View reviewed changes

lib/netext/httpext/request.go Outdated Show resolved Hide resolved

mstoykov added 2 commits April 11, 2019 18:06

Test that badCompressionType is bad

a1195f6

switch to switch

4fd808c

na-- reviewed Apr 12, 2019

View reviewed changes

lib/netext/httpext/request.go Outdated Show resolved Hide resolved

lib/netext/httpext/request.go Outdated Show resolved Hide resolved

mstoykov added 3 commits April 12, 2019 10:26

small type

e4bca50

Remove warning when Content-Length is set for now

727a31c

Use the new CompressionType for uncompression matching as well

1f2604b

golangcibot reviewed Apr 12, 2019

View reviewed changes

lib/netext/httpext/request.go Outdated Show resolved Hide resolved

lib/netext/httpext/request.go Show resolved Hide resolved

error strings should not be capitalized

5ae8d5c

na-- approved these changes Apr 12, 2019

View reviewed changes

mstoykov merged commit 3fc300c into master Apr 12, 2019

mstoykov deleted the feature/AddRequestBodyCompression branch April 12, 2019 08:50

na-- mentioned this pull request May 3, 2019

Better handling of multipart/form-data requests #747

Closed

na-- mentioned this pull request Jul 12, 2019

Support for zstd as response content-encoding and request body compression #1080

Closed

This was referenced Jul 30, 2019

Improve UX of HAR converter's output scripts grafana/har-to-k6#117

Open

Wrong content-length header behaves differently for HTTP 1 and 2 requests #1094

Closed

na-- mentioned this pull request Aug 7, 2019

Always set the correct content-length for requests #1106

Merged

na-- mentioned this pull request Mar 15, 2021

Automatically marshal JS-object HTTP request bodies when content-type is application/json #878

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ability to compress request body #989

Add ability to compress request body #989

mstoykov commented Apr 8, 2019 •

edited

robingustafsson left a comment

mstoykov commented Apr 8, 2019

codecov bot commented Apr 8, 2019

codecov bot commented Apr 8, 2019 •

edited

na-- commented Apr 9, 2019

robingustafsson commented Apr 9, 2019

na-- commented Apr 9, 2019 •

edited

mstoykov commented Apr 9, 2019

na-- commented Apr 9, 2019

na-- commented Apr 9, 2019

mstoykov commented Apr 9, 2019

na-- commented Apr 9, 2019 •

edited

na-- commented Apr 10, 2019 •

edited

na-- left a comment

na-- left a comment

artych commented Apr 12, 2019

Add ability to compress request body #989

Add ability to compress request body #989

Conversation

mstoykov commented Apr 8, 2019 • edited

robingustafsson left a comment

Choose a reason for hiding this comment

mstoykov commented Apr 8, 2019

codecov bot commented Apr 8, 2019

Codecov Report

codecov bot commented Apr 8, 2019 • edited

Codecov Report

na-- commented Apr 9, 2019

robingustafsson commented Apr 9, 2019

na-- commented Apr 9, 2019 • edited

mstoykov commented Apr 9, 2019

na-- commented Apr 9, 2019

na-- commented Apr 9, 2019

mstoykov commented Apr 9, 2019

na-- commented Apr 9, 2019 • edited

na-- commented Apr 10, 2019 • edited

na-- left a comment

Choose a reason for hiding this comment

na-- left a comment

Choose a reason for hiding this comment

artych commented Apr 12, 2019

mstoykov commented Apr 8, 2019 •

edited

codecov bot commented Apr 8, 2019 •

edited

na-- commented Apr 9, 2019 •

edited

na-- commented Apr 9, 2019 •

edited

na-- commented Apr 10, 2019 •

edited