'data' event not emitted for every flush #33465
Comments
There is no way for Node to know if/how the data was flushed. The stdio pipe is just a stream of bytes.
@bzoz when the flush happens, bytes are written to the pipe, in which case Node should know, right? I suppose I'm not understanding stdio properly. Can you clarify when Node will emit a 'data' event, please?
TBH, the best answer would be "when it feels like it". Deep down, Node checks how many bytes are available in the pipe and then reads them. If for whatever reason the Node event loop gets delayed, you will get one 'data' event regardless of how many writes there were. You should not depend on getting one event per write; you should be prepared to handle all of the child process output arriving at once. BTW, this has some info on how Node behaves when stdio is a pipe (like in this Python example) or a TTY on different systems: https://nodejs.org/api/process.html#process_a_note_on_process_i_o
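To make "handle all of the output coming at once" concrete, here is a minimal sketch of a collector that ignores chunk boundaries entirely. `createOutputCollector` is a hypothetical helper name, not part of Node's API, and the usage comment assumes `child` was created with `child_process.spawn`.

```javascript
// Hypothetical helper (not a Node API): accumulates chunks of any size
// and only decodes them once the caller asks for the full text.
function createOutputCollector() {
  const chunks = [];
  return {
    // Call from a 'data' handler; how the bytes were split does not matter.
    push(chunk) {
      chunks.push(Buffer.from(chunk));
    },
    // Call after the stream has ended to get the complete output.
    text() {
      return Buffer.concat(chunks).toString('utf8');
    },
  };
}

// Usage sketch, assuming `child` came from child_process.spawn(...):
// const out = createOutputCollector();
// child.stdout.on('data', (chunk) => out.push(chunk));
// child.stdout.on('end', () => console.log(out.text().length));
```

Concatenating the raw buffers before decoding also means a multi-byte UTF-8 character that happens to be split across two chunks is still decoded correctly.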
I don't think there is anything actionable here for Node, as it behaves as expected. Feel free to reopen this issue or create a new one if you find other problems with Node.
Impossible to predict how much data arrives in stdout So I need to split by newline to avoid JSON error nodejs/node#33465
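The newline-splitting workaround mentioned in that reference could look roughly like the sketch below. It assumes the child process writes exactly one JSON document per line and that chunks arrive as strings (for example after `child.stdout.setEncoding('utf8')`); `createJsonLineParser` is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper: buffers partial lines across chunks and calls
// onMessage once for each complete line of JSON.
function createJsonLineParser(onMessage) {
  let pending = '';
  return function feed(chunk) {
    pending += chunk.toString();
    const lines = pending.split('\n');
    // The last element is an incomplete line (or '' if the chunk ended on '\n').
    pending = lines.pop();
    for (const line of lines) {
      if (line.trim() !== '') {
        onMessage(JSON.parse(line));
      }
    }
  };
}

// Usage sketch, assuming `child` came from child_process.spawn(...):
// child.stdout.setEncoding('utf8');
// child.stdout.on('data', createJsonLineParser((msg) => console.log(msg)));
```

Because the parser keeps the unfinished tail in `pending`, it gives the same result whether a message arrives in one chunk, is split across several, or is merged with its neighbours.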
What steps will reproduce the bug?
Python on Windows (or at least on my machine) has a buffer size of 8192 bytes, so the above write results in two flushes. You can check the buffer size by running:
python -c "import io;print(io.DEFAULT_BUFFER_SIZE)"
Also note that the above is a simplified one-liner that reproduces the issue; you can find a full Python script with several examples here. I'm not sure if it's related, because I have no idea when Node flushes stdout, but I ran into the same issue when spawning a Node child.
How often does it reproduce? Is there a required condition?
This only happens on Windows. On Linux it works as expected, as evidenced by https://repl.it/@almenon/testBufferSize
The other required condition is for the flushes to happen in extremely quick succession, as in one flush right after another. Sleeping in between flushes results in the 'data' event coming through as expected.
Practically speaking, this has resulted in unexpected behavior in AREPL. Presumably other projects with a Python subprocess that writes a lot of data in one write run into this problem as well. Python on Windows 10 has a small buffer size of 8192 bytes, so it doesn't take much data to exceed the buffer and cause a flush.
What is the expected behavior?
The Node docs for the 'data' event of net.Socket say that 'data' is "Emitted when data is received," in which case I would expect a 'data' event to be emitted whenever a flush happens in the child process.
What do you see instead?
A single 'data' event is emitted only after the data from both flushes has been received, rather than one event per flush.
Additional information