tools: fix flakiness in test-tick-processor #2694

matthewloring · 2015-09-04T19:00:13Z

Per the discussion on #2471, the JS symbols checked for by this test
were occasionally too deep in the stack and were being ignored by the
tick processor.

I have addressed this by increasing the stack depth inspected by the
tick processor and by looking for the eval symbol, which is more likely
to be present.

I also sneaked in a polyfill that was missing from the original PR.
Its absence is only a problem if incorrect command line arguments are
passed to the script, so it was missed in initial testing.

/cc @ofrobots @bnoordhuis @Fishrock123 @joaocgreis @Trott

Fishrock123 · 2015-09-04T19:35:19Z

CI: https://ci.nodejs.org/job/node-test-pull-request/252/

bnoordhuis · 2015-09-04T20:00:35Z

tools/v8-prof/linux-tick-processor

@@ -18,6 +18,6 @@ if [ ! -x "$NODE" ] && [ -x "$(dirname "$0")/../../node" ]; then
  NODE="$(dirname "$0")/../../node"
 fi

-"$NODE" "$TEMP_SCRIPT_FILE" $@
+"$NODE" "$TEMP_SCRIPT_FILE" "--call-graph-size=10" $@


Can I suggest passing this as an argument in the test?

Definitely! Done.
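
A minimal sketch of what passing the option from the test side might look like, instead of hard-coding it in the wrapper script. The helper name `tickProcessorArgs` and the log file name are hypothetical; only the `--call-graph-size=10` flag comes from the diff above, and the argument order is an assumption.

```javascript
'use strict';

// Hypothetical helper: build the argument list a test could pass to the
// tick processor wrapper script, so the script itself stays unmodified.
function tickProcessorArgs(logFile, callGraphSize) {
  const args = [];
  if (callGraphSize !== undefined) {
    // Flag taken from the diff in this review thread.
    args.push(`--call-graph-size=${callGraphSize}`);
  }
  args.push(logFile);
  return args;
}

// The log file name here is illustrative only.
const exampleArgs = tickProcessorArgs('example-v8.log', 10);
console.log(exampleArgs.join(' '));
```

An array like this could then be handed to `child_process.spawn()` together with the path of the wrapper script, which keeps the larger call graph size local to the test.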

bnoordhuis · 2015-09-04T20:02:00Z

The second commit needs some rewording to conform to the guidelines, e.g. tools: add missing tick processor polyfill.

matthewloring · 2015-09-04T20:20:57Z

Sorry about that, fixed.

bnoordhuis · 2015-09-04T20:41:46Z

LGTM

Trott · 2015-09-04T21:13:44Z

Works for me. I'd still prefer something deterministic (execute the function recursively 1200 times, then check the stack) over the current approach of running for two seconds and then checking the stack, but there may be nuances to the test that I'm missing that preclude that approach.

matthewloring · 2015-09-04T21:43:49Z

@Trott It is definitely possible but would require tuning that number of iterations for the different processor speeds to make sure we have sufficient time to get enough ticks. Writing OS specific code seems fine but I hesitate to write code that is dependent on processor speed. If this is something that is important I can investigate it further.

ofrobots · 2015-09-04T21:47:26Z

@Trott Since we are doing time-based sampling, you want the test to run at least for ~2 seconds on a fast machine so that we get a reasonable sample. At the same time you don't want the test to run for 'a very long time' on slow machines such as a heavily loaded RPi.

IMO, the current solution of killing the test after 2 seconds is pretty reasonable as it runs for at least 2 seconds on the fastest machine of the future, while still working on the slowest machine from today.
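
To illustrate the trade-off discussed above: with a fixed wall-clock budget, the running time is the same on every machine and only the amount of work done varies, whereas a fixed iteration count makes the running time depend on processor speed. The sketch below is a synchronous simplification with hypothetical names (`runFor`, `spin`); the test in this PR uses `setImmediate` and `setTimeout` instead.

```javascript
'use strict';

// Run `work` repeatedly until at least `durationMs` milliseconds have
// elapsed, then report how many iterations fit in that window.
function runFor(durationMs, work) {
  const start = Date.now();
  let iterations = 0;
  do {
    work();
    iterations++;
  } while (Date.now() - start < durationMs);
  return { iterations, elapsedMs: Date.now() - start };
}

// A small CPU-bound unit of work; the loop bound is arbitrary.
function spin() {
  let x = 0;
  for (let i = 0; i < 100000; i++) x += i;
  return x;
}

const result = runFor(50, spin);
console.log(`${result.iterations} iterations in ${result.elapsedMs} ms`);
```

A fast machine reports many iterations and a slow one reports few, but both stop after roughly the same time, which is the property a time-based sampler needs.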

joaocgreis · 2015-09-04T23:41:31Z

@matthewloring As part of the "fix flakiness" commit, you can remove the test from the list of flaky tests. Or add a separate commit, as you prefer. Just remove this line: https://github.com/matthewloring/io.js/blob/bcc1bb71bae06ef8ec746d6800416c0c53476aa7/test/parallel/parallel.status#L15

matthewloring · 2015-09-04T23:54:22Z

Unfortunately, I just discovered a new way that this test can flake. It is less frequent (this PR still improves things) but I don't think it should be removed from the list of flaky tests just yet. I need to dig in further and try to figure out whether this is something that can be fixed in Node or whether I have to continue this in V8.

matthewloring · 2015-09-14T20:04:52Z

I now have separate code snippets to test for JS and C++ symbols so that each can be tailored to produce a higher percentage of the desired tick type. The remaining flakiness in the JS symbol check seemed to be caused by missed code-creation events in the raw log output. To correct for this, the test now passes if either the desired JS symbols are present or a percentage of UNKNOWN ticks is reported, which is what gets registered when the code-creation event is missed. In testing on my local machine, this event was missed in roughly 1 of every 600 iterations. This is still considered a success since the scripts inside node are processing the log as desired. These missed code-creation events need to be investigated further inside V8.

The new test passed 1000/1000 times so hopefully the flakiness is gone for good now.

Could we run a CI to make sure the changes to the test don't break on other platforms? (I tested on Mac and Ubuntu locally.)

@joaocgreis I have removed this test from the list of flaky ones.
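
The JS symbol check described in this comment (accept the expected symbol, or else a percentage of UNKNOWN ticks) could be sketched roughly as follows. The function name `outputIsAcceptable`, the symbol pattern, and the sample output lines are all assumptions made for illustration; they are not taken from the actual test or from real tick processor output.

```javascript
'use strict';

// Hypothetical check: succeed if the expected symbol appears in the processed
// output, or if some percentage of ticks was attributed to UNKNOWN.
function outputIsAcceptable(output, symbolPattern) {
  if (symbolPattern.test(output)) return true;
  // Assumed line shape: a percentage followed by the word UNKNOWN.
  return /\d+(\.\d+)?%\s+UNKNOWN/.test(output);
}

// Made-up output fragments for illustration; not real tick processor logs.
const withSymbol = '  120   60.0%  LazyCompile: f [eval]:1';
const withUnknown = '  120   60.0%  UNKNOWN';
const withNeither = '  120   60.0%  something else';

const evalSymbol = /LazyCompile.*\[eval\]/;
console.log(outputIsAcceptable(withSymbol, evalSymbol));
console.log(outputIsAcceptable(withUnknown, evalSymbol));
console.log(outputIsAcceptable(withNeither, evalSymbol));
```

Treating the UNKNOWN case as a pass means the test still verifies that the processing scripts ran and attributed ticks, even when the log lacks the code-creation event needed to name the function.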

evanlucas · 2015-09-14T20:10:22Z

CI: https://ci.nodejs.org/job/node-test-pull-request/299/

Trott · 2015-09-14T20:23:09Z

matthewloring · 2015-09-14T21:14:20Z

Thanks for the CI! The only failure, test-child-process-emfile.js on freebsd101-64, appears unrelated.

@ofrobots @bnoordhuis Could you please take another look?

bnoordhuis · 2015-09-14T21:21:38Z

test/parallel/test-tick-processor.js

+  '  setImmediate(function() { f(); });' +
+  '};' +
+  'setTimeout(function() { process.exit(0); }, 2000);' +
+  'f();');


Suggestion: use template strings for legibility here and below.
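
As a rough illustration of this suggestion, the same snippet can be written once with string concatenation and once as a template string. The hunk only shows the tail of the snippet, so the opening `function f() {` line is assumed and the rest of the function body is omitted here.

```javascript
'use strict';

// Concatenated style, as in the diff. The opening line of `f` is an
// assumption because the hunk does not show it.
const concatenated =
  'function f() {' +
  '  setImmediate(function() { f(); });' +
  '};' +
  'setTimeout(function() { process.exit(0); }, 2000);' +
  'f();';

// The same program as a template string: no quotes or plus signs per line.
const templated = `
function f() {
  setImmediate(function() { f(); });
};
setTimeout(function() { process.exit(0); }, 2000);
f();`;

// Both strings parse as valid scripts; nothing is executed here.
new Function(concatenated);
new Function(templated);
console.log(templated);
```

Apart from whitespace, the two strings are identical, so the change affects only readability of the test source.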

bnoordhuis · 2015-09-14T21:23:18Z

LGTM with a nit and a suggestion.

Per the discussion on #2471, the JS symbols checked for by this test were occasionally too deep in the stack and were being ignored by the tick processor. I have addressed this by increasing the stack depth inspected by the tick processor and looking for the eval symbol, which is more likely to be present.

Additional flakiness was caused by occasional misses of the code creation event for the JS function being executed. I now have separate code snippets to test for JS and C++ symbols, and if the code creation event is missed for the JS symbol test then I check for a percentage of UNKNOWN symbols in processed output. This is considered a success as the processing scripts in the node repository are still correctly processing the ticks received from the v8 scripts. Further investigation is needed into the v8 profiling scripts to determine why code creation events are being missed.

PR-URL: #2694
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>

The polyfill is only needed if incorrect command line arguments are passed to the script, so it was missed in initial testing.

PR-URL: #2694
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>

bnoordhuis · 2015-09-14T22:14:29Z

Thanks Matt, landed in a85f4b5...40ec84d.

mscdex added the tools Issues and PRs related to the tools directory. label Sep 4, 2015

bnoordhuis reviewed Sep 4, 2015

rvagg force-pushed the master branch from 11c25c2 to ba02bd0 September 6, 2015 11:55

silverwind mentioned this pull request Sep 6, 2015

A weird commit from 1970 #2713

Closed

bnoordhuis reviewed Sep 14, 2015

Matt Loring added 2 commits September 14, 2015 14:56

tools: add missing tick processor polyfill

8bebecc

The polyfill is only needed if incorrect command line arguments are passed to the script so it was missed in initial testing.

bnoordhuis closed this Sep 14, 2015

matthewloring mentioned this pull request Sep 14, 2015

Investigate flaky test test-tick-processor #2471

Closed

Fishrock123 mentioned this pull request Sep 15, 2015

Release proposal: v4.1.0 #2844

Closed

rvagg mentioned this pull request Sep 15, 2015

Release proposal: v3.3.1 #2698

Merged

rvagg mentioned this pull request Sep 22, 2015

Release proposal: v4.1.1 #2995

Merged

matthewloring deleted the tick-flaky-fix branch February 25, 2016 19:54
