
Codecov uploader OOMs while searching for file network on large repositories #704

Closed
gabrielrussoc opened this issue Apr 11, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@gabrielrussoc
Contributor

Describe the bug
Running the uploader on large repositories causes it to OOM while discovering the file network.

[2022-04-11T12:39:42.234Z] ['info'] 
     _____          _
    / ____|        | |
   | |     ___   __| | ___  ___ _____   __
   | |    / _ \ / _` |/ _ \/ __/ _ \ \ / /
   | |___| (_) | (_| |  __/ (_| (_) \ V /
    \_____\___/ \__,_|\___|\___\___/ \_/

  Codecov report uploader 0.1.20
[2022-04-11T12:39:42.240Z] ['info'] => Project root located at: /ephemeral/home/ubuntu/worker_workspace
[2022-04-11T12:39:42.242Z] ['info'] ->  Token found by environment variables
[2022-04-11T12:39:42.242Z] ['info'] ->  Token set by environment variables

<--- Last few GCs --->

[22463:0x7fe065602300]   168398 ms: Mark-sweep 4052.5 (4135.6) -> 4042.5 (4141.9) MB, 2926.0 / 0.0 ms  (average mu = 0.584, current mu = 0.255) allocation failure scavenge might not succeed
[22463:0x7fe065602300]   172823 ms: Mark-sweep 4060.3 (4143.6) -> 4049.4 (4148.6) MB, 4104.4 / 0.0 ms  (average mu = 0.367, current mu = 0.072) allocation failure scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

I'm forced to use -X network to unblock.
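
For reference, that workaround is just the feature-toggle flag on the normal invocation:

    ./codecov -X network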

To Reproduce

  1. On a very large repository (I couldn't find any public one)
  2. Run the uploader binary without any args or environment variables (I'm on Ubuntu 18):
    ./codecov

Expected behavior
Uploader should not OOM

Screenshots
n/a

Additional context
I did some digging and the problem seems to be exactly here:

const {
  stdout,
  status,
  error,
} = spawnSync('git', ['-C', dirPath, 'ls-files'], { encoding: 'utf8' })
let files = []
if (error instanceof Error || status !== 0) {
  // Fallback when `git ls-files` fails: walk the entire tree with glob,
  // skipping only the hardcoded blocklist.
  files = glob
    .sync(['**/*', '**/.[!.]*'], {
      cwd: dirPath,
      ignore: manualBlocklist().map(globstar),
    })
} else {
  files = stdout.split(/[\r\n]+/)
}
if (args.networkFilter) {
  files = files.filter(file => file.startsWith(String(args.networkFilter)))
}
if (args.networkPrefix) {
  files = files.map(file => String(args.networkPrefix) + file)
}
return files

child_process.spawnSync has a default maxBuffer of about 1 MiB (docs), so on a large repository the call to git ls-files fails because its stdout exceeds that limit. Execution then falls into the glob.sync fallback, which consumes an enormous amount of memory and ultimately causes the OOM. I'm not sure why glob.sync uses so much memory.
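
One plausible mitigation (a sketch only, not a tested patch; a real fix might prefer a bounded value over Infinity) would be to raise maxBuffer so git ls-files can return the full listing and the glob fallback is never taken:

// Sketch: spawnSync terminates the child and reports an error once stdout
// exceeds maxBuffer (~1 MiB by default), which is what pushes us into the
// glob.sync fallback. Raising the limit keeps us on the git fast path.
const { stdout, status, error } = spawnSync(
  'git', ['-C', dirPath, 'ls-files'],
  { encoding: 'utf8', maxBuffer: Infinity },
)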

@gabrielrussoc
Contributor Author

I guess a quick hack would be to increase the heap size, but it looks like one can't set NODE_OPTIONS on the packaged binary: #475
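
(For context, the standard knob would be something like the line below, but per #475 it doesn't reach the pkg-packaged binary:)

    NODE_OPTIONS="--max-old-space-size=8192" ./codecov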

@gabrielrussoc
Contributor Author

OK, I did a bit more digging and it seems what's destroying glob is the bazel-* directories (Bazel's convenience symlinks, which point into the output tree). They should certainly be ignored, but the ignore list is currently hardcoded.
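
To illustrate (a rough sketch against the snippet above; not claiming this is what the eventual fix does), the fallback branch could fold these directories into the ignore list, ideally alongside user-supplied entries:

// Sketch: extend the hardcoded blocklist with Bazel's convenience-symlink
// directories (bazel-bin, bazel-out, bazel-testlogs, bazel-<workspace>)
// before handing it to glob. A CLI flag feeding extra entries into this
// list would generalize the fix.
files = glob.sync(['**/*', '**/.[!.]*'], {
  cwd: dirPath,
  ignore: [...manualBlocklist(), 'bazel-*'].map(globstar),
})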

@drazisil-codecov
Contributor

Every time I think we have a way to add to the ignore list from the command line... 🤦‍♀️

@gabrielrussoc
Contributor Author

I took a stab at a partial fix here: #729
