Allow for adjustment of stdout buffer size (aka highWaterMark?) of child_process.spawn() #41611
Comments
Related to Node.js PR #39097 and issue #39092. I am currently trying to push through libuv/libuv#3429 and libuv/libuv#3422, which will greatly reduce the performance impact of having smaller buffers for blocking I/O. The problem is that Node's I/O must remain fair and be able to advance a large number of streams with a limited number of threads, so those streams need to be interruptible. Currently the performance impact of this is very significant because of the latency of the wake-up mechanism between the worker thread and the event loop, which schedules the next reads sequentially. IMO this is the real problem that needs solving. Reading 64KB chunks in a tight loop is only marginally slower than reading big chunks (I clocked it at 75ns per additional syscall); the scheduling of the next chunk is the only problem. |
@AndrewJDR if you want to test the impact of the buffer size, you must rebuild Node after changing this line: node/deps/uv/src/unix/stream.c, line 1151 in 12608d3 (it is in fact in libuv). |
@mmomtchev Thanks for all your help on this. Just to confirm that I understand correctly: you're saying that if I increased the 64K setting in the libuv code you pointed out and ran some tests with large data from child_process.spawn()'s stdout, I would find minimal performance improvement, since many 64K chunks are actually being fed to my code by an (interruptible) tight loop inside the event loop, rather than my code being fed only one 64K chunk per event loop iteration? Thanks again, Andrew |
I just don't know - try it - for |
There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment. For more information on how the project manages feature requests, please consult the feature request management document. |
There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment. For more information on how the project manages feature requests, please consult the feature request management document. |
Don't close this issue, it's a needed feature. |
Can this issue be reopened? It doesn't look like it ever got addressed, and it's a real issue. |
+1 to reopen this one. my use case is a streaming CSV parser using fs.createReadStream(). you'll typically have a malformed (partially parsed) row at the end of each chunk, and it's easier to throw that partial parse away, prepend the partial line to the next chunk, and run the parser on that. with 64KB chunks, you're throwing away a lot more partial parses than you would with 2MB chunks. when i manually accumulate (string concat) the 64KB chunks in memory until my 2MB threshold and flush that to my parser, i see the throughput increase from 94MiB/s to 104MiB/s. it feels rather silly to have to do this in userland simply due to lack of control over this 64KB chunk size. i'm certain the memory usage would also be reduced if this was done in libuv instead of in JS. |
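The userland accumulation described in this comment could look roughly like the sketch below. It is not the commenter's actual code: `createBatcher`, `splitAtLastNewline` and `parseInBatches` are hypothetical names, the threshold is counted in string characters rather than bytes, and rows are assumed to end in `\n` with no newlines embedded in quoted fields.

```javascript
'use strict';

// Hypothetical helper (not a Node.js API): collects string chunks and calls
// onBatch once at least `threshold` characters have accumulated.
function createBatcher(threshold, onBatch) {
  let parts = [];
  let size = 0;

  function flush() {
    if (size === 0) return;
    const batch = parts.join('');
    parts = [];
    size = 0;
    onBatch(batch);
  }

  function push(chunk) {
    parts.push(chunk);
    size += chunk.length;
    if (size >= threshold) flush();
  }

  return { push, flush };
}

// Splits text into [complete lines, trailing partial line].
// Assumption: rows end with '\n' and fields contain no embedded newlines.
function splitAtLastNewline(text) {
  const i = text.lastIndexOf('\n');
  return i === -1 ? ['', text] : [text.slice(0, i + 1), text.slice(i + 1)];
}

// Feeds a readable stream to `parseRows` in large batches, carrying the
// trailing partial row over to the next batch instead of parsing it twice.
function parseInBatches(stream, threshold, parseRows) {
  let carry = '';
  const batcher = createBatcher(threshold, (batch) => {
    const [complete, partial] = splitAtLastNewline(carry + batch);
    carry = partial;
    if (complete) parseRows(complete);
  });

  stream.setEncoding('utf8'); // deliver strings instead of Buffers
  stream.on('data', batcher.push);
  stream.on('end', () => {
    batcher.flush();
    if (carry) parseRows(carry); // final row without a trailing newline
  });
}
```

Something like `parseInBatches(child.stdout, 2 * 1024 * 1024, parseRows)` would then hand the parser roughly 2MB of complete rows at a time, at the cost of the extra string concatenation the comment complains about.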
What is the problem this feature will solve?
It does not appear to be possible [1] to adjust the buffer size (highWaterMark) for the stdout from a child_process.spawn() call. This very likely [2] leads to lower throughput when dealing with large amounts of output from a child process, since more event loop iterations are required to consume it. Increasing the buffer size involves trading away event loop latency, but the latency vs. throughput tradeoff decision should ideally be in the hands of the JavaScript developer, as it is with most other read / write streams in Node.js.
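For comparison, here is a minimal sketch of how that tradeoff is exposed on other Node.js readable streams. `highWaterMark` is a documented option of `fs.createReadStream()` and of the `stream.Readable` constructor, and `readableHighWaterMark` is a documented read-only property; the helper names and the 2 MiB value are only illustrative.

```javascript
'use strict';
const fs = require('fs');
const { Readable } = require('stream');

const TWO_MIB = 2 * 1024 * 1024; // illustrative value, not a recommendation

// File streams take the buffer size directly as an option, so each read
// requests up to TWO_MIB bytes.
function openWithLargeChunks(path) {
  return fs.createReadStream(path, { highWaterMark: TWO_MIB });
}

// Custom readable streams accept the same option in their constructor.
function makeLargeBufferReadable() {
  return new Readable({ highWaterMark: TWO_MIB, read() {} });
}

// Any Readable reports the value it was created with.
function describeBuffering(stream) {
  return stream.readableHighWaterMark;
}
```

The same property can be read on the `stdout` of a spawned child to inspect its buffering, but the issue's point is that `child_process.spawn()` offers no corresponding option to set it.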
What is the feature you are proposing to solve the problem?
Some way of adjusting the highWaterMark (or equivalent) of the stdout stream of a child_process.spawn() call.
What alternatives have you considered?
As far as I know, there isn't one. You just have to accept the default of 64KB.