
benchmark: check end() argument to be > 0 #12030

Closed
wants to merge 1 commit into from

Conversation

@vsemozhetbyt (Contributor) commented Mar 24, 2017

Affected core subsystem(s): benchmark

Refs: #11972

@nodejs-github-bot nodejs-github-bot added the benchmark Issues and PRs related to the benchmark subsystem. label Mar 24, 2017

@vsemozhetbyt (Contributor Author) commented Mar 24, 2017

test/sequential/test-benchmark-net.js starts failing. It seems this can be landed only after #11972 is fixed (see #12030 (comment)).

@jasnell jasnell added the blocked PRs that are blocked by other issues or PRs. label Mar 24, 2017
@joyeecheung (Member)

I think the CI failure is not related to #11972. #11979 set dur to 0, which means the benchmark will end on the next timeout(0) and won't have time to do much work, so getting rate=0 is expected. To fix this error, test/sequential/test-benchmark-net.js will need to set dur to something above 0 (e.g. 0.1, so the timeout will be 100 ms).
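
For concreteness, a minimal sketch of what bumping dur in the test might look like (the fork arguments mirror the snippet quoted later in this thread; the path handling and the exit check are assumptions, not the actual test diff):

```js
// test/sequential/test-benchmark-net.js (sketch, not the actual diff)
const { fork } = require('child_process');
const path = require('path');

const runjs = path.join(__dirname, '..', '..', 'benchmark', 'run.js');

// dur=0 ends the benchmark on the next timeout(0), so the rate is 0;
// a small positive duration gives it time to do real work.
const child = fork(runjs, ['--set', 'dur=0.1', 'net']);
child.on('exit', (code) => {
  if (code !== 0) throw new Error(`benchmark exited with code ${code}`);
});
```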

@vsemozhetbyt (Contributor Author)

cc @Trott

@joyeecheung (Member) commented Mar 25, 2017

@vsemozhetbyt Sorry, I was not being clear. rate is the name of the column displayed when you run the benchmark with compare.js, and it is calculated from what you pass to bench.end(): bench.end() takes the operation count, and rate is operations/second, so a 0 operation count results in a 0 rate. In this case, rate being 0 means bytes in this benchmark is 0 (bench.end() is called with 0, which is why the CI fails). Since dur is set to 0 in the test, the benchmark won't have enough time to do anything, so 0 bytes is expected. If you turn dur in test/sequential/test-benchmark-net.js into something above 0, some platforms in the CI should be green in this PR (I think 0.1 is probably enough).
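
To make the relationship concrete, a tiny sketch of the derivation (simplified; the real computation lives in benchmark/common.js and compare.js, and the function name here is illustrative):

```js
// bench.end(operations) receives the operation count (bytes in this case);
// the "rate" column reported by compare.js is operations per second.
function rateFrom(operations, elapsed /* [sec, nsec] from process.hrtime() */) {
  const seconds = elapsed[0] + elapsed[1] / 1e9;
  return operations / seconds; // 0 operations => rate of 0
}
```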

@vsemozhetbyt (Contributor Author)

@joyeecheung Yes, thank you. Sorry, it was my misunderstanding.

@vsemozhetbyt (Contributor Author) commented Mar 25, 2017

@joyeecheung I've set it to dur=0.2 because net-s2c.js also gives 0 operations with dur=0.1. But it can be flaky now.

I will launch new CI.


@vsemozhetbyt (Contributor Author) commented Mar 25, 2017

@joyeecheung I am not sure how to view the needed output properly, but I can see this here now (the same for the failed test/ppc-linux run):

```
net/net-c2s-cork.js
net/net-c2s-cork.js dur=0.2 type="buf" len=4: 0.028310918358748348
net/net-c2s-cork.js dur=0.2 type="buf" len=8: 0.06079112774146017
/home/iojs/build/workspace/node-test-commit-plinux/nodes/ppcle-ubuntu1404/benchmark/common.js:197
    throw new Error('called end() with operation count <= 0');
    ^

Error: called end() with operation count <= 0
    at Benchmark.end (/home/iojs/build/workspace/node-test-commit-plinux/nodes/ppcle-ubuntu1404/benchmark/common.js:197:11)
    at Timeout._onTimeout (/home/iojs/build/workspace/node-test-commit-plinux/nodes/ppcle-ubuntu1404/benchmark/net/net-c2s-cork.js:86:15)
    at ontimeout (timers.js:407:14)
    at tryOnTimeout (timers.js:271:5)
    at Timer.listOnTimeout (timers.js:235:5)

assert.js:81
  throw new assert.AssertionError({
  ^
AssertionError: 1 === 0
    at ChildProcess.child.on (/home/iojs/build/workspace/node-test-commit-plinux/nodes/ppcle-ubuntu1404/test/sequential/test-benchmark-net.js:20:10)
    at emitTwo (events.js:125:13)
    at ChildProcess.emit (events.js:213:7)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:208:12)
```

It seems it will be rewritten later by other CI results.

@Trott (Member) commented Mar 25, 2017

Perhaps instead of setting the duration to a small positive value, we can add a CLI option to turn off the zero-check? It could default to "use the check", but the tests could turn the check off. The option would be used only for the tests, I suppose, where we're not really interested in the benchmark numbers but just in confirming that the benchmark code can run.
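
Purely for illustration, one hypothetical shape such an opt-out could take in the runner's argument handling (the flag name and parsing are invented; the thread settles on an environment variable further below):

```js
// Hypothetical flag handling in benchmark/run.js (not actual code)
const zeroCheckDisabled = process.argv.includes('--no-zero-check');
// This would be passed down to benchmark/common.js, where end() would
// skip the "operation count <= 0" error when the check is disabled.
```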

@vsemozhetbyt (Contributor Author)

@Trott But how can we catch errors like #11972 then?

@Trott (Member) commented Mar 25, 2017

> @Trott But how can we catch errors like #11972 then?

@vsemozhetbyt You mean catch the errors in tests? We wouldn't. Or we could write a separate test perhaps.

I'm concerned about test flakiness with the dur=0 to dur=0.2 change, especially on slower hardware like the Raspberry Pi devices.

@vsemozhetbyt (Contributor Author)

@Trott I am not sure where this CLI option should be added — there are many different files in the various benchmark call chains.

@vsemozhetbyt (Contributor Author)

BTW, are there any environment variables set while the tests or CI run?

@vsemozhetbyt (Contributor Author)

I've reverted the change for test/sequential/test-benchmark-net.js to avoid flakiness.

@vsemozhetbyt (Contributor Author)

Is it safe to check elapsed[0] > 0 (from here) to ensure the benchmark has really run, or is this approach flaky too?
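
For reference, a sketch of what such a check might look like; elapsed here is assumed to be the [seconds, nanoseconds] pair from process.hrtime(), which means elapsed[0] alone stays 0 for any sub-second run:

```js
// Hypothetical guard (illustrative only)
const elapsed = process.hrtime(startTime); // [seconds, nanoseconds]
if (elapsed[0] === 0 && elapsed[1] === 0) {
  throw new Error('end() called before the benchmark actually ran');
}
// Note: checking only elapsed[0] > 0 would reject any run shorter
// than one second, so it would indeed be flaky for dur < 1.
```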

@Trott (Member) commented Mar 25, 2017

Maybe set an environment variable (say NODEJS_BENCHMARK_ZERO_ALLOWED) and pass it to options.env in the fork() call in test-benchmark-net.js?

```js
const child = fork(runjs, ['--set', 'dur=0.2', 'net'],
                   {env: {NODEJS_BENCHMARK_ZERO_ALLOWED: 1}});
```

Then check for process.env.NODEJS_BENCHMARK_ZERO_ALLOWED in benchmark/common.js?

Then you can write a test for this change that invokes a benchmark in a way that's guaranteed to get 0 results (--set n=0 should do it on many benchmarks) and check that you get an error.
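
Putting those pieces together, a rough sketch of how the guard in benchmark/common.js could look (the property name this._time and the report() call are assumptions for illustration, not the exact landed code):

```js
// benchmark/common.js (sketch)
Benchmark.prototype.end = function(operations) {
  const elapsed = process.hrtime(this._time);
  if (typeof operations !== 'number')
    throw new Error('called end() without specifying operation count');
  if (!process.env.NODEJS_BENCHMARK_ZERO_ALLOWED && operations <= 0)
    throw new Error('called end() with operation count <= 0');
  const time = elapsed[0] + elapsed[1] / 1e9;
  this.report(operations / time, elapsed); // rate in operations/second
};
```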

@vsemozhetbyt (Contributor Author)

@Trott Done. New CI: https://ci.nodejs.org/job/node-test-pull-request/7038/

@vsemozhetbyt vsemozhetbyt removed the blocked PRs that are blocked by other issues or PRs. label Mar 25, 2017
@Trott (Member) left a comment

LGTM.

@vsemozhetbyt (Contributor Author)

CI is OK and complete, but two checks have hung here in the interactive table. Is there a way to forcibly update it?

@Trott (Member) commented Mar 26, 2017

> CI is OK and complete, but two checks have hung here in the interactive table. Is there a way to forcibly update it?

Not that I know of. This is a longstanding bug. :-(

@joyeecheung (Member)

> I'm concerned about test flakiness with the dur=0 to dur=0.2 change, especially on slower hardware like the Raspberry Pi devices.

Maybe we can make a separate test for #11972 and only test Windows with it?

@AndreasMadsen (Member) commented Mar 26, 2017

Why do we need the NODEJS_BENCHMARK_ZERO_ALLOWED option? When is .end(0) sensible?

@AndreasMadsen (Member)

@vsemozhetbyt I see. Well, I think it would be better to fix either the benchmark or the test rather than add a new option. But I don't hold a strong opinion.

@vsemozhetbyt (Contributor Author)

I do hope that somebody with experience in Node.js internal APIs on Windows will fix tcp-raw-pipe.js.

@vsemozhetbyt (Contributor Author)

@Trott Should NODEJS_BENCHMARK_ZERO_ALLOWED be documented? If so, could you suggest the proper place (test/README.md, benchmark/README.md, or guides/writing-and-running-benchmarks.md?) and the wording, as I am not so good at English.

@joyeecheung (Member)

@vsemozhetbyt Since we are starting to write tests for benchmarks, I think we can put a section about how to write a test for a benchmark in guides/writing-and-running-benchmarks.md (probably also suggesting writing a test when adding a new benchmark). NODEJS_BENCHMARK_ZERO_ALLOWED can be documented there.

@vsemozhetbyt (Contributor Author)

@joyeecheung So I suppose this is a subject for a different PR. I hope this variable will not be forgotten (at least I've left a link to this discussion in the issue that may become the tracking issue for benchmark tests).

So if nobody has objections, I will land this 72 hours after creation.

vsemozhetbyt added a commit that referenced this pull request Mar 28, 2017
PR-URL: #12030
Ref: #11972
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
@vsemozhetbyt (Contributor Author)

Landed in 642baf4

@vsemozhetbyt vsemozhetbyt deleted the benchmark-common-end branch March 28, 2017 00:22
@mscdex (Contributor) commented Mar 28, 2017

I just saw this PR and I'm not convinced that not allowing zero is a good idea, especially if you're running a comparison for a benchmark script that has a lot of different configurations. All it takes is for one configuration to fail and now you have to restart it all over again.... not fun times.

I understand not allowing negative numbers, because that almost certainly would never happen, especially since counters are initialized to zero in every benchmark script that I have seen.

@vsemozhetbyt (Contributor Author)

@mscdex compare.R seems to fail too if one of the benchmarks calls .end(0). Should we tolerate such a case (i.e. a benchmarker would have to manually edit the .csv file)?

@mscdex (Contributor) commented Mar 28, 2017

@vsemozhetbyt I haven't verified that, but at least you'd still (hopefully) have some usable CSV data, provided you didn't pipe directly to RScript. IMHO we should be supporting 0 ops/sec results in both the benchmark runner and the results processor.

@vsemozhetbyt (Contributor Author)

Doesn't NODEJS_BENCHMARK_ZERO_ALLOWED suffice for this tolerance?

@mscdex (Contributor) commented Mar 28, 2017

@vsemozhetbyt The problem is that's not the default... and if it were, there'd be no point in having it (unless you change it to check only for negative rates and change the environment variable name accordingly).

@vsemozhetbyt (Contributor Author)

So should I revert the whole commit?

@mscdex (Contributor) commented Mar 28, 2017

@vsemozhetbyt At the very least we could just check for < 0, which is fine by me. Let's see what other @nodejs/collaborators think.
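
A minimal sketch of that relaxed check (unconditional, so the environment variable would no longer be needed):

```js
// Hypothetical alternative guard in benchmark/common.js
if (operations < 0)
  throw new Error('called end() with a negative operation count');
// operations === 0 would be tolerated and simply reported as rate 0
```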

@seishun (Contributor) commented Mar 28, 2017

I agree with @mscdex

@AndreasMadsen (Member)

> All it takes is for one configuration to fail and now you have to restart it all over again.... not fun times.

If we test the benchmarks, then this is very unlikely to happen. In my mind, it is not wrong to fail on .end(0); it is just wrong to ignore the failure in the tests (using NODEJS_BENCHMARK_ZERO_ALLOWED).

MylesBorins pushed a commit that referenced this pull request Mar 28, 2017
PR-URL: #12030
Ref: #11972
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Mar 28, 2017
@italoacasas italoacasas mentioned this pull request Apr 10, 2017
@MylesBorins (Contributor)

should this be backported?

@gibfahn gibfahn mentioned this pull request Jun 15, 2017
@gibfahn (Member) commented Jun 17, 2017

> should this be backported?

ping @vsemozhetbyt

@vsemozhetbyt (Contributor Author)

@gibfahn I am not sure. It was frowned upon, and it is connected with the benchmark tests, which are not backported to v6 if I understand correctly. So maybe we should not backport it until we absolutely need to.

Labels: benchmark (Issues and PRs related to the benchmark subsystem.)

10 participants