Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround a node >=12.5.0 bug that causes the process not to exit after tests have completed and cancerous memory growth #8787

Merged
merged 3 commits into from Aug 16, 2019

Conversation

@niieani
Copy link
Contributor

niieani commented Aug 6, 2019

This is a workaround for an issue that occurs in node 12.5.0 and newer, described in detail here: nodejs/node#29001

The behavior of the code in question is the same, however the change is enough to prevent locking the process into a "cancerous" state of memory growth.

This is one of the weirdest bugs I've ever debugged.

Under certain circumstances, Node process (starting from version 12.5.0, previous versions not affected), would not exit (even if we call process.exit() at the end of the file), and instead start to consume memory until it fills the whole available memory, using 100% CPU while at it.

Here's a reproduction based on the HasteMap:

https://gist.github.com/niieani/a527e9b2e6caf28e3021a50b938e4a91

Standard memory or CPU profiling doesn't show anything (literally, there are no function calls shown in the profiler). Dumping the heap snapshot results in a 36MB file, even when the memory consumption of the process is 10 GB.

Summary

The backstory is that in our case jest would not end AFTER all the tests have completed, and instead would start consuming huge amounts of memory (on CI, multiple processes from separate PRs would deplete the 128GB of RAM on the machine) so I started going down the rabbit hole to try and pinpoint what is the cause of the issue, which lead me to this minimal reproduction scenario, but also to discover all the quirks that helped to avoid triggering the scenario.

Test plan

Testing on node 12 is not enabled in CI due to #8397, so it's not possible to make a test for this case at this time.

Locally, it's possible to test it using the data I've provided in the gist above, after replacing the devDependency 'weak' with:

    "weak": "github:lxe/node-weak#node-12"

Feedback

Looking for feedback before adding the CHANGELOG for this. Perhaps we should fix running jest in node 12 first, so this issue could be tested for?

This is a workaround for an issue that occurs in node 12.5.0 and newer, described in detail here:  nodejs/node#29001

The behavior of the code in question is the same, however the change is enough to prevent locking the process into a "cancerous" state of memory growth.

This is one of the weirdest bugs I've ever debugged.

Under certain circumstances, Node process (starting from version 12.5.0, previous versions not affected), would not exit (even if we call `process.exit()` at the end of the file), and instead start to consume memory until it fills the whole available memory, using 100% CPU while at it.

Here's a reproduction based on the `HasteMap`:

https://gist.github.com/niieani/a527e9b2e6caf28e3021a50b938e4a91

Standard memory or CPU profiling doesn't show anything (literally, there are no function calls shown in the profiler). Dumping the heap snapshot results in a 36MB file, even when the memory consumption of the process is 10 GB.

The backstory is that in our case [`jest`](https://github.com/facebook/jest) would not end AFTER all the tests have completed, and instead would start consuming huge amounts of memory (on CI, multiple processes from separate PRs would deplete the 128GB of RAM on the machine) so I started going down the rabbit hole to try and pinpoint what is the cause of the issue, which lead me to this minimal reproduction scenario, but also to discover all the quirks that helped to avoid triggering the scenario.
Copy link
Collaborator

thymikee left a comment

Oh my, this is indeed weird. Guess that will gone once we move to Node 8 as a compilation target, so we don't need transpiling async/await. But until then, it would be cool to release a patch with this fix. It's also worth noting that this affects React Native projects as well. Glad you found a workaround other than locking Node version

@mikehardy

This comment has been minimized.

Copy link

mikehardy commented Aug 9, 2019

Confirmed this works for me on node v12.8.0 on linux where previously I had consistent terrible results since v12.5.0 and on macOS where I had inconsistent but frequently terrible results.

Here's a patch-package usable patch vs jest-haste-map 24.8.0 since that's what you get by default with react-native right now (my use case, and likely the majority). '.txt' suffix added for github-compatibility, just strip that and drop it in your patches folder after integrating patch-package.

The patch also contains the fix needed for #8558 (fixed in 24.8.1 but as mentioned 24.8.0 is what you get by default in react-native 0.60.x I believe)

THANK YOU :-)

jest-haste-map+24.8.0.patch.txt

@mikehardy

This comment has been minimized.

Copy link

mikehardy commented Aug 13, 2019

This is one of those rare patches that has saves me like 15 minutes a day because of the startup speed improvement - over the last 4 days since I integrated it locally it's at least an hour. One a smaller-ram designer co-workers machine it saved me from explaining how to downgrade node etc as it makes our current process work without crashing him.

@niieani

This comment has been minimized.

Copy link
Contributor Author

niieani commented Aug 13, 2019

@thymikee should I add a CHANGELOG or you wanna do it to the PR?

@thymikee

This comment has been minimized.

Copy link
Collaborator

thymikee commented Aug 13, 2019

Please add a changelog entry, thanks :)

@SimenB

This comment has been minimized.

Copy link
Collaborator

SimenB commented Aug 14, 2019

It'll be interesting to see what sort of memory leaks will be fixed and performance gained by dropping node 6 and avoiding the transpilation of async-await. #8806 confirmed the issue is fixed by just changing the target in the babel config without a code change. 25 can't come soon enough 😀

@scotthovestadt

This comment has been minimized.

Copy link
Contributor

scotthovestadt commented Aug 14, 2019

Looking forward to dropping transpilation of await / async, but let's get this into the release.

I have a fix for the bug blocking release almost finished, so if we can get this in shortly we can publish it too.

@scotthovestadt scotthovestadt merged commit 9ad0f4b into facebook:master Aug 16, 2019
1 of 11 checks passed
1 of 11 checks passed
ci/circleci: lint-and-typecheck CircleCI is running your tests
Details
ci/circleci: test-browser CircleCI is running your tests
Details
ci/circleci: test-jest-circus CircleCI is running your tests
Details
ci/circleci: test-node-10 CircleCI is running your tests
Details
ci/circleci: test-node-11 CircleCI is running your tests
Details
ci/circleci: test-node-6 CircleCI is running your tests
Details
ci/circleci: test-node-8 CircleCI is running your tests
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
deploy/netlify Deploy preview processing.
Details
facebook.jest in progress
Details
ci/circleci: test-or-deploy-website Your tests passed on CircleCI!
Details
@mikehardy

This comment has been minimized.

Copy link

mikehardy commented Aug 22, 2019

node v12.9 fixes the underlying cause that needed the workaround, for what it's worth - but this patch was well loved while it was needed. Thanks guys

@SimenB

This comment has been minimized.

Copy link
Collaborator

SimenB commented Aug 22, 2019

Great news 👍

@niieani

This comment has been minimized.

Copy link
Contributor Author

niieani commented Aug 22, 2019

Thanks for moving this forward @scotthovestadt, I never made it to update the changelog.
Happy to hear that 12.9 fixes the underlying issue!

This was referenced Oct 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.