Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix flaky doctool and test #29979

Merged
merged 0 commits into from Oct 15, 2019

Conversation

@Trott
Copy link
Member

Trott commented Oct 15, 2019

Doctool tests have been failing a lot in CI on Win2008 R2. It appears
async functions and callback-based functions are being used in
combination such that the callback-based function cannot guarantee that
it will invoke its callback. Convert the callback-based functions to
async functions so we have one paradigm and reliable results.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
@nodejs-github-bot

This comment has been minimized.

@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

Relevant Windows CI is green. Taking this out of Draft mode. Collaborators please 👍 here to fast-track to unbreak CI.

@Trott Trott added the fast-track label Oct 15, 2019
@Trott Trott marked this pull request as ready for review Oct 15, 2019
@nodejs-github-bot

This comment has been minimized.

@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

Stress test won't work for this because doctool isn't built in that job, but in addition to the green Windows run in the regular CI, here's four more Windows CI runs with this code to confirm it fixes the doctool test issue there:

= Windows job was fully green
✔️ = Windows job was yellow or red, but the doctool tests on win2008r2 passed and that's what matters here

https://ci.nodejs.org/job/node-test-binary-windows-2/3531/
https://ci.nodejs.org/job/node-test-binary-windows-2/3533/ ✔️
https://ci.nodejs.org/job/node-test-binary-windows-2/3534/ ✔️
https://ci.nodejs.org/job/node-test-binary-windows-2/3535/ ✔️
https://ci.nodejs.org/job/node-test-binary-windows-2/3536/ ✔️

@Trott

This comment has been minimized.

@richardlau

This comment has been minimized.

Copy link
Member

richardlau commented Oct 15, 2019

And to show it failing on master, here are five from master:

= doctool test failed on win2008r2

https://ci.nodejs.org/job/node-test-binary-windows-2/3532/
https://ci.nodejs.org/job/node-test-binary-windows-2/3537/
https://ci.nodejs.org/job/node-test-binary-windows-2/3538/
https://ci.nodejs.org/job/node-test-binary-windows-2/3539/
https://ci.nodejs.org/job/node-test-binary-windows-2/3540/

Given the failure is so prominent on master are we sure it's not a real recent regression? The test has only failed in the two most recent node-daily-master runs (out of the twelve we have history for):
https://ci.nodejs.org/job/node-test-binary-windows-2/3495/
https://ci.nodejs.org/job/node-test-binary-windows-2/3532/

The change LGTM btw but I just want to make sure we're not papering over something that has recently changed to suddenly make this test fail consistently.

@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

And to show it failing on master, here are five from master:
= doctool test failed on win2008r2
https://ci.nodejs.org/job/node-test-binary-windows-2/3532/
https://ci.nodejs.org/job/node-test-binary-windows-2/3537/
https://ci.nodejs.org/job/node-test-binary-windows-2/3538/
https://ci.nodejs.org/job/node-test-binary-windows-2/3539/
https://ci.nodejs.org/job/node-test-binary-windows-2/3540/

Given the failure is so prominent on master are we sure it's not a real recent regression? The test has only failed in the two most recent node-daily-master runs (out of the twelve we have history for):
https://ci.nodejs.org/job/node-test-binary-windows-2/3495/
https://ci.nodejs.org/job/node-test-binary-windows-2/3532/

The change LGTM btw but I just want to make sure we're not papering over something that has recently changed to suddenly make this test fail consistently.

Pretty sure it's not a regression. I bisected (by running tests on CI) and the supposed problematic change was 5f80df8 which seems rather unlikely to me given that there's no http in this test.

@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

(One more 👍 to approve fast-tracking over at #29979 (comment) please?)

@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

Pretty sure it's not a regression. I bisected (by running tests on CI) and the supposed problematic change was 5f80df8 which seems rather unlikely to me given that there's no http in this test.

The likely scenarios to me seem to be that either that change (or some other nearby change) caused a timing change on win2008r2 such that it triggered this issue a lot more, or that something changed on the Windows host/configuration itself to make this problem occur far more often. (But yes, I'm speculating.)

@richardlau

This comment has been minimized.

Copy link
Member

richardlau commented Oct 15, 2019

And to show it failing on master, here are five from master:
= doctool test failed on win2008r2
https://ci.nodejs.org/job/node-test-binary-windows-2/3532/
https://ci.nodejs.org/job/node-test-binary-windows-2/3537/
https://ci.nodejs.org/job/node-test-binary-windows-2/3538/
https://ci.nodejs.org/job/node-test-binary-windows-2/3539/
https://ci.nodejs.org/job/node-test-binary-windows-2/3540/

Given the failure is so prominent on master are we sure it's not a real recent regression? The test has only failed in the two most recent node-daily-master runs (out of the twelve we have history for):
https://ci.nodejs.org/job/node-test-binary-windows-2/3495/
https://ci.nodejs.org/job/node-test-binary-windows-2/3532/
The change LGTM btw but I just want to make sure we're not papering over something that has recently changed to suddenly make this test fail consistently.

Pretty sure it's not a regression. I bisected (by running tests on CI) and the supposed problematic change was 5f80df8 which seems rather unlikely to me given that there's no http in this test.

Not in the tests themselves but the doctool itself does a https request (hence #29918).

@Trott Trott added the author ready label Oct 15, 2019
@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

I'm going to bed because it's far too late here, but if this gets another fast track approval and someone wants to land it, please do! Bonus points for going to the CI jobs associated with #29976, #29969, and #15735 and using "Rebuild" to start them over again with this change.

@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

Pretty sure it's not a regression. I bisected (by running tests on CI) and the supposed problematic change was 5f80df8 which seems rather unlikely to me given that there's no http in this test.

The likely scenarios to me seem to be that either that change (or some other nearby change) caused a timing change on win2008r2 such that it triggered this issue a lot more, or that something changed on the Windows host/configuration itself to make this problem occur far more often. (But yes, I'm speculating.)

Old PRs are still passing when run on CI, so that suggests that the second suggestion by me isn't correct. The first one seems likely: A code change caused a timing change on win2008r2 such that it triggered this issue a lot more.

const target = path.join(outputDir, `${basename}.json`);
fs.writeFileSync(target, JSON.stringify(content.json, null, 2));
const jsonTarget = path.join(outputDir, `${basename}.json`);
fs.writeFileSync(jsonTarget, JSON.stringify(content.json, null, 2));

This comment has been minimized.

Copy link
@Fishrock123

Fishrock123 Oct 15, 2019

Member

These could now be fsPromises.writeFile(...), although I can do that in a follow up if you prefer.

@Trott Trott closed this Oct 15, 2019
@Trott Trott force-pushed the Trott:async-all-the-way-down branch from b545682 to f4f856b Oct 15, 2019
@Trott Trott merged commit f4f856b into nodejs:master Oct 15, 2019
@Trott

This comment has been minimized.

Copy link
Member Author

Trott commented Oct 15, 2019

Landed in f4f856b

targos added a commit that referenced this pull request Nov 8, 2019
Doctool tests have been failing a lot in CI on Win2008 R2. It appears
async functions and callback-based functions are being used in
combination such that the callback-based function cannot guarantee that
it will invoke its callback. Convert the callback-based functions to
async functions so we have one paradigm and reliable results.

PR-URL: #29979
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
targos added a commit that referenced this pull request Nov 10, 2019
Doctool tests have been failing a lot in CI on Win2008 R2. It appears
async functions and callback-based functions are being used in
combination such that the callback-based function cannot guarantee that
it will invoke its callback. Convert the callback-based functions to
async functions so we have one paradigm and reliable results.

PR-URL: #29979
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.