benchmark: support for multiple http benchmarkers #8140

Closed · wants to merge 9 commits into base: master

Conversation

@bzoz
Contributor

bzoz commented Aug 17, 2016

Checklist
  • make -j4 test (UNIX), or vcbuild test nosign (Windows) passes
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

benchmark

Description of change

Continued from #7180

Add support for multiple HTTP benchmarkers. Adds autocannon as the secondary benchmarker.

This allows all HTTP benchmarks to be executed on Windows. All available tools (wrk and autocannon) will be used to run HTTP benchmarks. The run will fail if neither tool is installed.

cc @nodejs/benchmarking

benchmark: support for multiple http benchmarkers
This adds support for multiple HTTP benchmarkers. Adds autocannon
as the secondary benchmarker.

@bzoz bzoz referenced this pull request Aug 17, 2016

Closed

benchmark: use autocannon instead of wrk #7180

return false;
else
return true;
};

@jasnell

jasnell Aug 17, 2016

Member

Perhaps simplify this to just:

  return !(result.error && result.error.code === 'ENOENT');
}
}
if (!any_available) {
console.error('Couldn\'t locate any of the required http benchmarkers ' +

@jasnell

jasnell Aug 17, 2016

Member

nit: s/Couldn't/Could not

const elapsed = process.hrtime(child_start);
if (code) {
if (stdout === '') {
console.error(benchmarker + ' failed with ' + code);

@jasnell

jasnell Aug 17, 2016

Member

nit:

`${benchmarker} failed with ${code}`

(that is, using template strings here and elsewhere throughout)

@jasnell

Member

jasnell commented Aug 17, 2016

Good stuff. Left a few comments. @mscdex @nodejs/benchmarking

* When running the benchmarks, set the `NODE_HTTP_BENCHMARKER` environment
variable to the desired benchmarker.
* To select the default benchmarker for a particular benchmark, specify it as
the `benchmarker` key (e.g. `benchmarker: 'wrk'`) in the configuration passed to

@AndreasMadsen

AndreasMadsen Aug 17, 2016

Member

I don't think this is a good idea. The options to createBenchmark should just be the benchmark parameters. If you want this feature, I think it would be much better to add an option object to bench.http.

var bench = common.createBenchmark(main, {
  num: [1, 4, 8, 16],
  size: [1, 64, 256],
  c: [100],
  benchmarker: ['wrk']
});

function main(conf) {
  bench.http({
    url: '/', 
    duration: 10,
    connections: conf.c,
    benchmarker: conf.benchmarker
  }, function () { ... });
}
'benchmarks. Check benchmark/README.md for further instructions.');
process.exit(1);
}
function AutocannonBenchmarker() {

@AndreasMadsen

AndreasMadsen Aug 17, 2016

Member

Perhaps these abstractions should be moved to a separate file. I find this hard to read.
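
A minimal sketch of what that separation could look like (the file name, option names, and module shape here are assumptions; the PR later settled on http-benchmarkers.js):

'use strict';
// benchmark/http-benchmarkers.js (hypothetical sketch)
const child_process = require('child_process');

function WrkBenchmarker() {
  this.name = 'wrk';
}

// Spawn wrk against the locally started benchmark server.
WrkBenchmarker.prototype.create = function(options) {
  const args = ['-d', String(options.duration),
                '-c', String(options.connections),
                'http://127.0.0.1:' + options.port + options.path];
  return child_process.spawn('wrk', args);
};

module.exports = { WrkBenchmarker };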

const self = this;
duration = 1;
const picked_benchmarker = process.env.NODE_HTTP_BENCHMARKER ||

@AndreasMadsen

AndreasMadsen Aug 17, 2016

Member

What is the reason to run all benchmarks by default? It takes quite a long time to run the http benchmarks, I don't think we need to add more to that by default.

@bzoz

bzoz Aug 18, 2016

Contributor

If you have both tools installed, then I assume you want to use both. As for the time: each HTTP benchmark run by those tools takes 10s. On my box all of the HTTP benchmarks take 14 minutes with 1 tool, and only 3 minutes more with both of them.

@AndreasMadsen

AndreasMadsen Aug 18, 2016

Member

It sounds like something is wrong then. If each benchmarker takes 10 sec and it takes 14 min with one benchmarker, then it should take 28 min with two benchmarkers?

I also disagree with the premise. The benchmarkers should be functionally equivalent, and should thus give similar results. If they give very different results (in a non-linearly proportional way) it sounds like something is wrong.

@mcollina

mcollina Aug 18, 2016

Member

@AndreasMadsen:

I also disagree with the premise. The benchmarkers should be functionally equivalent, and should thus give similar results. If they give very different results (in a non-linearly proportional way) it sounds like something is wrong.

Just to clarify the issue here (summing up from #7180):

a) we want to be able to run the http benchmarks on Windows too, and it seems extremely hard to get wrk on Windows
b) @bzoz proposed to use ab, but ab is significantly different from wrk
c) I proposed to use autocannon, which is based on Node, and so it works on all platforms equally
d) @mscdex and @jbergstroem argued that we are introducing a dependency on an installed version of Node, which might influence benchmarks (currently it is not; see #7180 (comment))
e) @Fishrock123 proposes to support both runners

@AndreasMadsen would you mind reviewing if autocannon and wrk can be functionally equivalent?

@bzoz

bzoz Aug 18, 2016

Contributor

To clarify: not all benchmarks in benchmark/http/ use an external tool. There are things like check_invalid_header etc., which take most of the time. In any case, running node benchmark/run.js http with 2 tools does not take significantly more time than with 1.

@AndreasMadsen

AndreasMadsen Aug 18, 2016

Member

@bzoz please see #8139; the issue is that http/simple.js takes too long, and using two benchmarkers would make it take twice as long.

@mcollina

a) we want to be able to run the http benchmarks on Windows too, and it seems extremely hard to get wrk on Windows

I'm all for adding autocannon for getting Windows support.

d) @mscdex and @jbergstroem argued that we are introducing a dependency on an installed version of Node, which might influence benchmarks (currently it is not; see #7180 (comment))

That is mostly an issue when continuously monitoring performance. When just comparing a benchmark between master and a PR, that shouldn't be a problem.

@AndreasMadsen would you mind reviewing if autocannon and wrk can be functionally equivalent?

It's actually impossible, from a philosophy-of-science perspective, to show this; however, we can validate it.

First I ran http/simple.js using wrk and then using autocannon (raw data: https://gist.github.com/AndreasMadsen/619e4a447b8df7043f32771e64b7693f). To compare wrk with autocannon we need to compensate for the parameters; as there are few of them and we have many observations, we can do that by treating the settings as factors (each setting becomes a set of binary indicators) and then using a linear regression. Simultaneously we can also check how much the benchmarker affects the results by making it a factor as well.
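
Spelled out, the model fitted below is:

rate = b0 + b1*c500 + b2*chunks1 + b3*chunks4 + b4*length1024 + b5*length102400 + b6*typebytes + b7*benchmarkerwrk + error

where each regressor is a 0/1 indicator for the corresponding setting, and the first level of each parameter (c=50, chunks=0, length=4, type=buffer, benchmarker=autocannon) is absorbed into the intercept b0.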

Call:
lm(formula = rate ~ ., data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-3749.9 -1446.7   110.9  1376.7  5350.7 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     14003.64     121.29 115.456  < 2e-16 ***
c500               60.51      85.76   0.706    0.481    
chunks1          -823.67     105.04  -7.841 6.95e-15 ***
chunks4         -2715.50     105.04 -25.852  < 2e-16 ***
length1024      -1689.93     105.04 -16.088  < 2e-16 ***
length102400    -7251.31     105.04 -69.034  < 2e-16 ***
type bytes      -2110.44      85.76 -24.607  < 2e-16 ***
benchmarker wrk    46.03      85.76   0.537    0.592    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1993 on 2152 degrees of freedom
Multiple R-squared:  0.7521,    Adjusted R-squared:  0.7513 
F-statistic: 932.5 on 7 and 2152 DF,  p-value: < 2.2e-16

script: https://gist.github.com/AndreasMadsen/619e4a447b8df7043f32771e64b7693f

From this result we can see that the benchmarker does not have a statistically significant effect on the output. We can see this because there are no stars next to `benchmarker wrk`. Where is `benchmarker autocannon`, some may ask. What happens is that autocannon is set as the baseline, and we are simply measuring the difference in performance caused by wrk. From the results we see that wrk is 46.03 ops/sec faster, but the difference is not significant.

I guess one could do a more detailed analysis of the interactions between the benchmarker and the individual parameters, but those results are tricky to interpret because of paired correlation. I would say that since we can't observe a statistically significant effect, we should assume that the benchmarker doesn't matter.

@mcollina

mcollina Aug 19, 2016

Member

Maybe we should just pick the first that is available on the system, if it is not specified.

@bzoz

bzoz Aug 19, 2016

Contributor

I'll do that: pick the first one available as the default.

@AndreasMadsen

AndreasMadsen Aug 19, 2016

Member

I think we should prefer wrk and fall back to autocannon, just because autocannon depends on Node. Yes, it shouldn't really matter, but it's better to be safe.
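
A sketch of how that preference could be encoded inside http-benchmarkers.js, where both constructors and the present() method already exist (the ordering logic is the suggestion here, not the PR's verbatim code):

// Order encodes preference: wrk first, autocannon as the fallback.
const http_benchmarkers = [ new WrkBenchmarker(),
                            new AutocannonBenchmarker() ];

var default_http_benchmarker;
http_benchmarkers.forEach(function(benchmarker) {
  // The first installed benchmarker in the list becomes the default.
  if (benchmarker.present() && !default_http_benchmarker) {
    default_http_benchmarker = benchmarker;
  }
});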

@AndreasMadsen

AndreasMadsen Aug 19, 2016

Member

For the record: I have done a crude interaction analysis, and it turns out that the benchmarker does affect performance when looking at a specific set of parameters:

  c chunks length    type improvement significant      p.value
  50      0      4  buffer     -7.91 %         *** 1.322680e-09
  50      0      4   bytes     17.84 %         *** 4.045045e-21
  50      0   1024  buffer    -20.69 %         *** 9.271728e-27
  50      0   1024   bytes    -22.11 %         *** 1.515380e-20
  50      0 102400  buffer    -30.34 %         *** 5.798797e-49
  50      0 102400   bytes      3.83 %         *** 2.321257e-11
  50      1      4  buffer    -12.34 %         *** 3.224916e-16
  50      1      4   bytes    -10.43 %         *** 5.706192e-13
  50      1   1024  buffer    -15.65 %         *** 2.850410e-07
  50      1   1024   bytes     22.22 %         *** 1.248468e-39
  50      1 102400  buffer    -31.47 %         *** 4.988408e-46
  50      1 102400   bytes     -0.63 %          ** 6.046574e-03
  50      4      4  buffer     38.50 %         *** 2.366325e-52
  50      4      4   bytes     29.70 %         *** 2.854201e-27
  50      4   1024  buffer     27.75 %         *** 6.241389e-36
  50      4   1024   bytes     64.73 %         *** 4.793425e-22
  50      4 102400  buffer     -8.18 %         *** 7.693781e-26
  50      4 102400   bytes      3.94 %         *** 4.965380e-06
 500      0      4  buffer      9.04 %         *** 7.695347e-18
 500      0      4   bytes     17.01 %         *** 3.182603e-34
 500      0   1024  buffer    -14.02 %         *** 2.095410e-33
 500      0   1024   bytes    -19.36 %         *** 1.450384e-32
 500      0 102400  buffer    -38.56 %         *** 2.547833e-67
 500      0 102400   bytes      4.35 %         *** 7.185082e-20
 500      1      4  buffer      8.00 %         *** 1.383958e-21
 500      1      4   bytes      5.21 %         *** 7.024325e-16
 500      1   1024  buffer     -9.36 %         *** 2.184297e-20
 500      1   1024   bytes      6.08 %         *** 7.288844e-14
 500      1 102400  buffer    -36.23 %         *** 1.578285e-65
 500      1 102400   bytes      1.00 %         *** 2.875669e-04
 500      4      4  buffer     19.72 %         *** 7.854005e-49
 500      4      4   bytes     44.20 %         *** 1.691548e-35
 500      4   1024  buffer     19.78 %         *** 1.023951e-38
 500      4   1024   bytes     34.85 %         *** 3.279581e-39
 500      4 102400  buffer     -9.27 %         *** 1.401027e-28
 500      4 102400   bytes      6.78 %         *** 1.708493e-32

table: relative performance improvement of using autocannon.

However, this doesn't say anything about the benchmarkers' ability to benchmark an optimization proposal. It just means that the benchmarkers aren't equally performant in all aspects. Really it just suggests optimizations for the benchmarker implementers :)

Anyway it was just for the record.

edit: perhaps it is possible to make an equal-variances test instead of an equal-means test; that should express the benchmarking ability of the benchmarkers. It is not something I normally do, so I will have to look it up.

if (code) {
console.error('wrk failed with ' + code);
process.exit(code);
function runHttpBenchmarker(index, collected_code) {

@AndreasMadsen

AndreasMadsen Aug 17, 2016

Member

V8 optimizations are going to affect the next results if you don't restart Node.js. I think it would be much better not to support this at the bench.http level at all; just add an option to bench.http and let the configuration queue handle it. That should also reduce the code quite a bit.

var bench = common.createBenchmark(main, {
  num: [1, 4, 8, 16],
  size: [1, 64, 256],
  c: [100],
  benchmarker: ['wrk', 'autocannon'] /* will run both */
});

function main(conf) {
  bench.http({
    url: '/', 
    duration: 10,
    connections: conf.c,
    benchmarker: conf.benchmarker
  }, function () { ... });
}

@bzoz

bzoz Aug 18, 2016

Contributor

Good point. Just tested it, and the second run is up to 10% faster.

@bzoz

Contributor

bzoz commented Aug 18, 2016

Updated, PTAL.

I've moved the benchmarkers code to http-benchmarkers.js.

@bzoz

Contributor

bzoz commented Aug 23, 2016

Updated the PR again, @AndreasMadsen PTAL

Benchmark options are now passed as an object. Node is restarted for each benchmarker. By default only one benchmarker will be used.

process.exit(1);
self.report(result, elapsed);
if (cb) {
cb(0);

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

This is a little odd. I think you should make the .run signature

http_benchmarkers.run(options, function(error, results) { ... })

that way you can check the error and call the callback appropriately. Actually I don't care so much if the behaviour is the same. It is just that we should avoid calling process.exit() from more than one file, as that makes the program difficult to reason about. This way the process.exit() logic can be in common.js.
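
A sketch of that shape (the results-parsing method name below is an assumption):

// http-benchmarkers.js: report errors through the callback.
exports.run = function(options, callback) {
  const benchmarker = benchmarkers[options.benchmarker];
  const child = benchmarker.create(options);
  var stdout = '';
  child.stdout.on('data', function(chunk) { stdout += chunk; });
  child.once('close', function(code) {
    if (code) {
      callback(new Error(`${options.benchmarker} failed with ${code}.`));
      return;
    }
    callback(null, benchmarker.processResults(stdout));
  });
};

// common.js then becomes the only file that calls process.exit():
// http_benchmarkers.run(options, function(error, result) {
//   if (error) {
//     console.error(error.message);
//     process.exit(1);
//   }
//   /* report result */
// });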

const autocannon_exe = process.platform === 'win32'
? 'autocannon.cmd'
: 'autocannon';
this.present = function() {

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

You should move these to the prototype; that is how all the other classes work.
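
For instance (a sketch; the module-scope placement of the executable name and the ENOENT probe follow suggestions made earlier in this review):

const child_process = require('child_process');
const autocannon_exe = process.platform === 'win32'
  ? 'autocannon.cmd'
  : 'autocannon';

function AutocannonBenchmarker() {
  this.name = 'autocannon';
}

// One shared function on the prototype instead of a per-instance closure.
AutocannonBenchmarker.prototype.present = function() {
  // Any invocation works for presence detection; ENOENT means not installed.
  const result = child_process.spawnSync(autocannon_exe, ['-h']);
  return !(result.error && result.error.code === 'ENOENT');
};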

};
}
const http_benchmarkers = [ new AutocannonBenchmarker(),

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

I would like to see wrk be the default benchmarker, since that doesn't depend on node in any way.

supported_http_benchmarkers.push(name);
if (present) {
if (!default_http_benchmarker) {

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

I think this is easier to read if moved out of the forEach, so that it is:

if (process.env.NODE_HTTP_BENCHMARKER) {
  default_http_benchmarker = installed_http_benchmarkers[
    process.env.NODE_HTTP_BENCHMARKER
  ];
} else {
  default_http_benchmarker = installed_http_benchmarkers[
    Object.keys(installed_http_benchmarkers)[0]
  ];
}
if (default_http_benchmarker) {
  default_http_benchmarker.default = true;
}
```js
'use strict';
var common = require('../common.js');

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

should be const.

easily build it [from source][wrk] via `make`.
By default, the first benchmark tool found will be used to run the HTTP benchmarks.
You can override this by setting the `NODE_HTTP_BENCHMARKER` environment variable to

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

I can see its usefulness, but do we really need it? I would like to avoid adding unnecessary environment flags. This could also be accomplished by using --set benchmarker=autocannon, which wouldn't require any extra code.

I would appreciate other opinions.
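
For reference, the --set route needs no new code; the runner already accepts it:

node benchmark/run.js --set benchmarker=autocannon http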

@mcollina

mcollina Aug 24, 2016

Member

I agree on the --set behavior, as it is already there.

I don't expect this to be changed that much anyway, so the env variable is probably ok as well.

@@ -1,6 +1,7 @@
'use strict';
const child_process = require('child_process');
const http_benchmarkers = require('./http-benchmarkers.js');

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

Perhaps it should be called _http-benchmarkers.js; that is how the other utility files are named.

@AndreasMadsen

Member

AndreasMadsen commented Aug 23, 2016

@bzoz Great. I think this is much better.

new WrkBenchmarker() ];
var default_http_benchmarker;
var supported_http_benchmarkers = [];

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

Use const

@bzoz

Contributor

bzoz commented Aug 23, 2016

@AndreasMadsen updated with your suggestions

this.name = 'autocannon';
}
AutocannonBenchmarker.prototype.autocannon_exe = process.platform === 'win32'

@AndreasMadsen

AndreasMadsen Aug 23, 2016

Member

I don't think strings should be put on the prototype. I would just evaluate it in the constructor.
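
That is, something like this (a sketch, using the field name from the diff above):

function AutocannonBenchmarker() {
  this.name = 'autocannon';
  // Evaluated once per instance in the constructor, rather than
  // storing the string on the prototype.
  this.autocannon_exe = process.platform === 'win32'
    ? 'autocannon.cmd'
    : 'autocannon';
}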

@bzoz

Contributor

bzoz commented Aug 24, 2016

Updated, PTAL

As for the ENV variable: the --set thing will not work. If the overridden key is not in the configuration already, there will be a TypeError in common.js here. Not all http benchmarks use a benchmarking tool (e.g. bench-parser.js), so it won't work there.

@AndreasMadsen

Member

AndreasMadsen commented Aug 24, 2016

If the overridden key is not in the configuration already, there will be a TypeError in common.js here.

That we could fix, either so that it defaults to a string or by simply skipping the property. I think the latter would be best.
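
A sketch of the skip variant (the function and variable names here are hypothetical, not the actual common.js code):

'use strict';
// Apply --set key=value pairs, ignoring keys the benchmark does not
// declare instead of hitting a TypeError while inferring the type.
function applyCliSettings(settings, options) {
  settings.forEach(function(setting) {
    const index = setting.indexOf('=');
    const key = setting.slice(0, index);
    const value = setting.slice(index + 1);
    if (!(key in options)) return; // skip unknown keys
    options[key] = [value];
  });
  return options;
}

// applyCliSettings(['benchmarker=autocannon'], { benchmarker: ['wrk'] });
// => { benchmarker: ['autocannon'] }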

@bzoz

Contributor

bzoz commented Aug 26, 2016

I think it would be better to just add this as a string. Otherwise I think it would be confusing for users: it would seem that sometimes it does not work. Also, there would be no feedback when one misspells a config option.

Anyhow, I'll change common.js so that it assumes string type, and drop the ENV thing.

@bzoz

Contributor

bzoz commented Aug 26, 2016

Updated, PTAL

@bzoz

Contributor

bzoz commented Aug 26, 2016

@AndreasMadsen BTW, why the "dont-land-on-v*.x" labels?

'instructions.'));
return;
}
var benchmarker = benchmarkers[options.benchmarker];

@AndreasMadsen

AndreasMadsen Aug 26, 2016

Member

use const

const benchmarker_start = process.hrtime();
var child = benchmarker.create(options);

@AndreasMadsen

AndreasMadsen Aug 26, 2016

Member

use const

child.once('close', function(code) {
const elapsed = process.hrtime(benchmarker_start);
if (code) {
var error_message = `${options.benchmarker} failed with ${code}.`;

this.options = this._parseArgs(process.argv.slice(2), options);
const parsed_args = this._parseArgs(process.argv.slice(2), options);
this.options = parsed_args.cli;
this.extra_options = parsed_args.extra;

@AndreasMadsen

AndreasMadsen Aug 26, 2016

Member

I don't see any reason to introduce extra_options. The problem was just a bug in how the type was inferred, not how we handle options as a whole.

@bzoz

bzoz Aug 26, 2016

Contributor

If we just add those to options, they will be displayed when running other benchmarks, even unrelated ones. It does not look good and could be confusing, so I store those extra options elsewhere and apply them only to related benchmarks.

@AndreasMadsen

Member

AndreasMadsen commented Aug 26, 2016

BTW, why the "dont-land-on-v*.x" labels?

Some major changes to the benchmark suite have been made; this PR depends on those changes and thus it can't land on v6 or earlier. See #7890.

@AndreasMadsen

Member

AndreasMadsen commented Aug 26, 2016

I don't want to hold this up any more, and I don't have that much time to review code. So except for the minor const and let issues, it LGTM.

The current version is much better than the original, however I still think there are some very implicit things that I don't like (but can tolerate). What is implicit:

  • adding benchmarker as a config option when bench.http is executed.
    • this is added so that the used benchmarker is shown in the output.
  • --set benchmarker= also has an effect when there is no explicit benchmarker option.
    • this is needed to support --set benchmarker for benchmarks that don't declare the option.
  • misspelled options are not shown in the output or the options object, just the extra_options object.
    • I know I suggested they should just be ignored; I was wrong.

I think all this can be solved by just enforcing explicitly setting the benchmarker parameter in the createBenchmark options object.

var bench = common.createBenchmark(main, {
  num: [1, 4, 8, 16],
  size: [1, 64, 256],
  c: [100],
  benchmarker: bench.default_http_benchmarker
});

function main(conf) {
  bench.http({
    url: '/', 
    duration: 10,
    connections: conf.c,
    benchmarker: conf.benchmarker
  }, function () { ... });
}

This way the benchmarker is added to the output and no special logic is needed for --set benchmarker.

Yes, if --set benchmarker is used, benchmarker= will also be added to the output of benchmarks that don't use it. I don't think this is a big issue. If it turns out to be an issue, we can filter the output to only show options passed explicitly to createBenchmark. But this is a tradeoff between readability and transparency.

@bzoz

Contributor

bzoz commented Aug 26, 2016

I would like to keep it the way it is now: without explicitly adding default_http_benchmarker, and with proper output for other benchmarks. I've updated that var/let/const thingy.

Thanks for all the suggestions!

@AndreasMadsen

Member

AndreasMadsen commented Aug 26, 2016

/cc @jasnell @mscdex @mcollina, please review

@AndreasMadsen

Member

AndreasMadsen commented Aug 26, 2016

Let's also cc @nodejs/collaborators as this is a fairly big addition and someone might have stronger opinions.

@mcollina

Member

mcollina commented Aug 26, 2016

LGTM

@bzoz

Contributor

bzoz commented Aug 29, 2016

Any more opinions?

If not, I would like to land this tomorrow.

@AndreasMadsen

Member

AndreasMadsen commented Aug 29, 2016

Maybe you shouldn't put me under Reviewed-By since I'm not 100% onboard. I don't know what the policy is.

@mcollina

Member

mcollina commented Aug 29, 2016

@AndreasMadsen can you please recap why you are not onboard? Just to understand for someone jumping into this later on.

@AndreasMadsen

Member

AndreasMadsen commented Aug 29, 2016

@mcollina I've covered that in #8140 (comment). tl;dr: there are a few implicit things which I think are confusing/surprising, and they could become explicit with minimal effort.

@mcollina

Member

mcollina commented Aug 29, 2016

Can we get you to agree, or get some other opinion?

@AndreasMadsen

Member

AndreasMadsen commented Aug 29, 2016

@mcollina I do these reviews in my spare time. I'm sure we could agree if given enough time, but my time is quite limited these days (exams, assignments, etc.).

edit: I hope you don't take it the wrong way. I can understand that someone suddenly deciding not to spend more spare time on someone else can be offensive. I had reasonable time two weeks ago, but very little these weeks. I'm sorry.

bzoz added a commit that referenced this pull request Aug 31, 2016

benchmark: support for multiple http benchmarkers
This adds support for multiple HTTP benchmarkers. Adds autocannon
as the secondary benchmarker.

PR-URL: #8140
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
@bzoz

Contributor

bzoz commented Aug 31, 2016

CI: https://ci.nodejs.org/job/node-test-pull-request/3903/ (failures unrelated)

Landed in b1bbc68

@AndreasMadsen I did as you asked. Anyhow, thanks for all your input! It was very helpful in improving the quality of this PR.

@bzoz bzoz closed this Aug 31, 2016

@gibfahn gibfahn referenced this pull request Jun 15, 2017

Closed

Auditing for 6.11.1 #230
