Skip to content
This repository was archived by the owner on Apr 22, 2023. It is now read-only.

child_process: fix sending utf-8 to child process#5016

Closed
bnoordhuis wants to merge 3 commits intonodejs:v0.10from
bnoordhuis:fix-child-process-send
Closed

child_process: fix sending utf-8 to child process#5016
bnoordhuis wants to merge 3 commits intonodejs:v0.10from
bnoordhuis:fix-child-process-send

Conversation

@bnoordhuis
Copy link
Copy Markdown
Member

In process#send() and child_process.ChildProcess#send(), use 'utf8' as
the encoding instead of 'ascii' because 'ascii' mutilates non-ASCII
input.

It worked by accident in v0.8 but not in v0.10 because the high bits
are now stripped when converting Buffers to ASCII strings. See commit
96a314b for details.

Fixes #4999 and #5011.

Reviewer: @isaacs or @TooTallNate

@Nodejs-Jenkins
Copy link
Copy Markdown

Merged build triggered.

@Nodejs-Jenkins
Copy link
Copy Markdown

Merged build started.

@TooTallNate
Copy link
Copy Markdown

Would there be a good reason to use a StringDecoder? Otherwise LGTM.

@bnoordhuis
Copy link
Copy Markdown
Member Author

Would there be a good reason to use a StringDecoder?

I'm not sure. The reason it doesn't use one now is simplicity and performance.

Using a StringDecoder sounds reasonable because you'd expect partial character sequences to happen sooner or later, when the pipe fills up. But that must have been possible before 96a314b too, when ASCII was passed through as is, and I don't remember any bug reports about that.

I'll see if I can create a test case where that's an issue before I land this PR.

@isaacs
Copy link
Copy Markdown

isaacs commented Mar 14, 2013

Yes, it needs to use a StringDecoder. In theory, it would not be an issue with ascii, because they're always a single byte.

It's very unlikely that a pipe to a separate process on the same machine would ever get its chunks split, but it can conceivably happen.

@bnoordhuis
Copy link
Copy Markdown
Member Author

it would not be an issue with ascii, because they're always a single byte.

Well, what I mean is that if the issue exists, v0.8 is affected too because the 'ascii' encoding isn't actually ASCII, it's 8 bits. It lets UTF-8 through unmolested so, in theory, a character sequence could get split over read() / write() syscalls.

IOW, not using a StringDecoder is arguably wrong but it's not a regression. I guess the thing to do here is to benchmark the impact of a StringDecoder and decide if that's something we can live with.

@isaacs
Copy link
Copy Markdown

isaacs commented Mar 15, 2013

No, it's not a regression, but it is certainly a bug.

@koichik
Copy link
Copy Markdown

koichik commented Mar 15, 2013

if the issue exists, v0.8 is affected too

Ben is right, this bug is also in v0.8 (gist).

In process#send() and child_process.ChildProcess#send(), use 'utf8' as
the encoding instead of 'ascii' because 'ascii' mutilates non-ASCII
input.

It worked by accident in v0.8 but not in v0.10 because the high bits
are now stripped when converting Buffers to ASCII strings. See commit
96a314b for details.

Fixes nodejs#4999 and nodejs#5011.
Handle partial character sequences correctly, use a StringDecoder.
@bnoordhuis
Copy link
Copy Markdown
Member Author

@isaacs @TooTallNate Re-review please. I've added a string decoder and a benchmark and - much to my surprise - there seems to be no appreciable performance impact.

That's nice for a change unless it means child process I/O was dog slow to start with. :-/

@isaacs
Copy link
Copy Markdown

isaacs commented Mar 24, 2013

Yeah, StringDecoder is pretty efficient, and no new test failures as a result of this. LGTM.

@bnoordhuis
Copy link
Copy Markdown
Member Author

Thanks. Landed in 44843a6 and back-ported to v0.8 in 84bb0ec.

@bnoordhuis bnoordhuis closed this Mar 25, 2013
@bnoordhuis bnoordhuis deleted the fix-child-process-send branch March 25, 2013 12:40
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants