Creating a lot of worker threads at once doesn't fully release their memory once they're terminated #51868

Closed
aamiaa opened this issue Feb 25, 2024 · 6 comments

Comments

@aamiaa

aamiaa commented Feb 25, 2024

Version

v20.11.1

Platform

Linux ovps2 5.15.0-1051-oracle #57-Ubuntu SMP Wed Jan 24 18:31:24 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

Subsystem

No response

What steps will reproduce the bug?

The following code creates 50 worker threads, each of which loads 20 MB of data into memory, and then terminates them. This is repeated 4 times, and the RSS either grows or only slightly decreases with each iteration.

Main file:

const { Worker } = require("worker_threads")
const path = require("path")

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

async function doLeak() {
	const jobs = []

	for(let i=0;i<50;i++) {
		const worker = new Worker(path.join(__dirname, "worker.js"))
		jobs.push(new Promise(resolve => {
			worker.once("message", () => {
				worker.terminate()
				resolve()
			})
		}))
	}

	await Promise.all(jobs)
}

async function main() {
	console.log("RSS on start:", Math.floor(process.memoryUsage().rss/1024/1024), "MB")
	
	for(let i=1;i<=4;i++) {
		console.log("Starting...", i)

		await doLeak()
		await sleep(5000) // This is not required, but helps show that the memory isn't released even after waiting

		console.log("Finished! RSS:", Math.floor(process.memoryUsage().rss/1024/1024), "MB")
	}
}

main()

worker.js:

const { parentPort } = require("worker_threads")
const fs = require("fs")
const path = require("path")

// Create some data in memory
const data = fs.readFileSync(path.join(__dirname, "20mb.txt"))

parentPort.postMessage("hi")

For convenience, here is a text file with 20 MB of random data: 20mb.txt
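
If you don't have the attachment handy, an equivalent file can be generated with a short script (a hypothetical helper, not part of the original report; any ~20 MB file should do, since the workers only need to hold some data in memory):

// generate-20mb.js: writes ~20 MB of random bytes to 20mb.txt next to this script
const crypto = require("crypto")
const fs = require("fs")
const path = require("path")

fs.writeFileSync(path.join(__dirname, "20mb.txt"), crypto.randomBytes(20 * 1024 * 1024))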

How often does it reproduce? Is there a required condition?

Every time

What is the expected behavior? Why is that the expected behavior?

RSS is released back to the OS after the workers are terminated, rather than continuing to grow.

What do you see instead?

RSS grows drastically (from ~36 MB up to ~700 MB after running the provided example code)

ubuntu@ovps2:~/memleak$ node index.js
RSS on start: 36 MB
Starting... 1
Finished! RSS: 469 MB
Starting... 2
Finished! RSS: 576 MB
Starting... 3
Finished! RSS: 676 MB
Starting... 4
Finished! RSS: 617 MB
ubuntu@ovps2:~/memleak$

Additional information

The memory growth only occurs if the worker threads are created at the same time; waiting 100 ms between each worker's creation prevents it (see the staggered sketch below).

It also requires the worker threads to hold some data in memory, which leads me to believe the excessive RSS is the result of the worker threads' memory segments not being de-allocated (I could be wrong).

Note that it doesn't seem to matter when the worker threads are terminated. Only their creation matters.
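
For illustration, this is the kind of staggered variant of doLeak that avoids the growth (a sketch reusing Worker, path, and the sleep helper from the main file above; the 100 ms delay is the value mentioned earlier):

// Staggered variant: same 50 workers, but spawned ~100 ms apart
async function doLeakStaggered() {
	const jobs = []

	for(let i=0;i<50;i++) {
		const worker = new Worker(path.join(__dirname, "worker.js"))
		jobs.push(new Promise(resolve => {
			worker.once("message", () => {
				worker.terminate()
				resolve()
			})
		}))
		await sleep(100) // pause before spawning the next worker
	}

	await Promise.all(jobs)
}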

I've reproduced this on

  • Linux ovps2 5.15.0-1051-oracle #57-Ubuntu SMP Wed Jan 24 18:31:24 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
  • Linux kvps 5.4.0-62-generic #70-Ubuntu SMP Tue Jan 12 12:45:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

I've been unable to reproduce this on Microsoft Windows NT 10.0.19045.0 x64

@User9684

User9684 commented Feb 25, 2024

Was able to reproduce on Linux user 6.5.0-10-generic #10-Ubuntu SMP PREEMPT_DYNAMIC Fri Oct 13 13:49:38 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux using the code provided


@joyeecheung
Member

joyeecheung commented Feb 25, 2024

Does it crash out of memory if you set a memory limit for the process (e.g. to 300 MB, using ulimit)? If you monitor the process, is the VSZ growing as well?
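
For anyone who wants to watch both numbers from inside the script rather than via top/ps, something like this works on Linux (a minimal sketch; it assumes the /proc filesystem, and logMem is a hypothetical helper, not an existing API):

const fs = require("fs")

// Linux only: parse VmRSS and VmSize (VSZ) out of /proc/self/status
function logMem(label) {
	const status = fs.readFileSync("/proc/self/status", "utf8")
	const grab = key => {
		const m = status.match(new RegExp(`^${key}:\\s+(\\d+) kB`, "m"))
		return m ? Math.floor(Number(m[1]) / 1024) : NaN
	}
	console.log(label, "RSS:", grab("VmRSS"), "MB", "VSZ:", grab("VmSize"), "MB")
}

Calling it after each doLeak() iteration would show whether VSZ keeps climbing along with RSS.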

@aamiaa
Author

aamiaa commented Feb 25, 2024

Yes, it does crash:

ubuntu@ovps2:~/memleak$ ulimit -v 500000
ubuntu@ovps2:~/memleak$ node index.js
RSS on start: 37 MB
Starting... 1

#
# Fatal process OOM in Failed to reserve virtual memory for CodeRange
#

Trace/breakpoint trap (core dumped)
ubuntu@ovps2:~/memleak$

but I believe that's to be expected, since the worker threads use a lot more memory at peak (up to 1 GB RES and 15.8 GB VIRT in my tests) before dropping to the printed values, due to each of them holding the 20 MB of data in memory.

The VIRT memory seems to grow from an initial 450 MB to 2430 MB after the 1st iteration, and then always drops back to that value after each of the remaining iterations.

@joyeecheung
Member

joyeecheung commented Feb 26, 2024

If you limit the memory to slightly above what's needed for 50 workers (ulimit -m should be the right command to limit RSS) and run it many, many times (maybe 1000), does it grow indefinitely and crash? Or what happens if you have another process on the side that takes a lot of memory, forcing the OS to swap out pages?

I tried to reproduce locally and it just goes on forever, with the RSS and VSZ fluctuating around a constant. I think that indicates it's not actually leaking. RSS is just the amount of physical memory that the OS has assigned to the process. If the process shows a pattern of high memory usage and your OS is not under memory pressure, it's likely that the OS/allocator will let the process keep the memory in case it needs to allocate it again (and in this case, it does), and/or because it's too expensive to defragment. If there were a real leak, the memory growth would be unbounded and the process would eventually crash, instead of staying constant after some point.
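
A rough way to check for unbounded growth along those lines, reusing doLeak and sleep from the reproduction script (a sketch; the 1000 iterations and 1 s pause are arbitrary):

// A genuine leak would grow without bound and eventually crash;
// allocator retention instead shows RSS fluctuating around a constant.
async function stress() {
	for(let i=1;i<=1000;i++) {
		await doLeak()
		await sleep(1000)
		console.log("Iteration", i, "RSS:", Math.floor(process.memoryUsage().rss/1024/1024), "MB")
	}
}

stress()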

@joyeecheung
Member

And if you do want the unused memory to be returned to the OS as soon as possible, you can try switching to a different memory allocator that releases memory to the OS more promptly, like jemalloc. Locally, after swapping the allocator to jemalloc, the RSS stays around 80 MB (with glibc it stays around 1100 MB).
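
For reference, the usual way to swap the allocator without rebuilding Node is to preload jemalloc; the library path below is the one Ubuntu/Debian's libjemalloc2 package installs and is an assumption here (it differs per distro and architecture):

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 node index.js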

@joyeecheung
Member

joyeecheung commented Feb 26, 2024

Closing, as I don't think this is a leak. It's more of an issue with how the glibc memory allocator manages freed memory. If the high RSS matters to you, try switching to a different memory allocator that prioritizes returning freed memory to the OS, like jemalloc.
