child_process: fix sending utf-8 to child process by bnoordhuis · Pull Request #5016 · nodejs/node-v0.x-archive

bnoordhuis · 2013-03-14T16:11:20Z

In process#send() and child_process.ChildProcess#send(), use 'utf8' as
the encoding instead of 'ascii' because 'ascii' mutilates non-ASCII
input.

It worked by accident in v0.8 but not in v0.10 because the high bits
are now stripped when converting Buffers to ASCII strings. See commit
96a314b for details.
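As a rough illustration of the mutilation described above (a sketch against a current Node.js `Buffer`, not the code touched by this PR), decoding UTF-8 bytes as `'ascii'` clears the high bit of every byte, so a two-byte character such as `é` comes back as two unrelated ASCII characters. The sample string is an assumption chosen for the demo.

```javascript
// Illustrative only: decode the UTF-8 bytes of a non-ASCII string two ways.
const original = 'héllo';
const bytes = Buffer.from(original, 'utf8'); // 'é' is encoded as 0xC3 0xA9

// 'ascii' decoding unsets the high bit of each byte: 0xC3 -> 0x43 ('C'), 0xA9 -> 0x29 (')').
const asAscii = bytes.toString('ascii');

// 'utf8' decoding reassembles the multi-byte sequence into the original character.
const asUtf8 = bytes.toString('utf8');

console.log(JSON.stringify(asAscii)); // "hC)llo"
console.log(JSON.stringify(asUtf8));  // "héllo"
```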

Fixes #4999 and #5011.

Reviewer: @isaacs or @TooTallNate

Nodejs-Jenkins · 2013-03-14T16:15:56Z

Merged build triggered.

Nodejs-Jenkins · 2013-03-14T16:15:57Z

Merged build started.

TooTallNate · 2013-03-14T19:45:22Z

Would there be a good reason to use a StringDecoder? Otherwise LGTM.

bnoordhuis · 2013-03-14T20:26:24Z

> Would there be a good reason to use a StringDecoder?

I'm not sure. The reason it doesn't use one now is simplicity and performance.

Using a StringDecoder sounds reasonable because you'd expect partial character sequences to happen sooner or later, when the pipe fills up. But that must have been possible before 96a314b too, when ASCII was passed through as is, and I don't remember any bug reports about that.

I'll see if I can create a test case where that's an issue before I land this PR.

isaacs · 2013-03-14T23:58:45Z

Yes, it needs to use a StringDecoder. In theory, it would not be an issue with ascii, because they're always a single byte.

It's very unlikely that a pipe to a separate process on the same machine would ever get its chunks split, but it can conceivably happen.
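The split-chunk concern can be reproduced in isolation. The following is a minimal sketch, assuming the pipe happens to deliver the two bytes of `é` in separate reads; the sample string and the split point are made up for the demonstration. Decoding each chunk on its own yields replacement characters, while a `StringDecoder` holds the incomplete sequence back until the rest arrives.

```javascript
const { StringDecoder } = require('string_decoder');

const message = 'café';
const bytes = Buffer.from(message, 'utf8'); // 63 61 66 c3 a9

// Simulate a read boundary that falls between the two bytes of 'é'.
const chunks = [bytes.subarray(0, 4), bytes.subarray(4)];

// Naive approach: decode every chunk independently.
const naive = chunks.map((chunk) => chunk.toString('utf8')).join('');

// StringDecoder buffers the trailing partial sequence between writes.
const decoder = new StringDecoder('utf8');
const decoded = chunks.map((chunk) => decoder.write(chunk)).join('') + decoder.end();

console.log(naive === message);   // false
console.log(decoded === message); // true
```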

bnoordhuis · 2013-03-15T00:27:42Z

> it would not be an issue with ascii, because they're always a single byte.

Well, what I mean is that if the issue exists, v0.8 is affected too because the 'ascii' encoding isn't actually ASCII, it's 8 bits. It lets UTF-8 through unmolested so, in theory, a character sequence could get split over read() / write() syscalls.

IOW, not using a StringDecoder is arguably wrong but it's not a regression. I guess the thing to do here is to benchmark the impact of a StringDecoder and decide if that's something we can live with.

isaacs · 2013-03-15T00:43:05Z

No, it's not a regression, but it is certainly a bug.

koichik · 2013-03-15T12:58:17Z

> if the issue exists, v0.8 is affected too

Ben is right, this bug is also in v0.8 (gist).


bnoordhuis · 2013-03-21T12:31:12Z

@isaacs @TooTallNate Re-review please. I've added a string decoder and a benchmark and - much to my surprise - there seems to be no appreciable performance impact.

That's nice for a change unless it means child process I/O was dog slow to start with. :-/
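For readers who want to see the shape of the approach, here is a hedged sketch of reading messages through a `StringDecoder`. It assumes newline-delimited JSON framing on the channel and is not the code that landed in this PR; `createMessageReader` and the sample payloads are hypothetical names invented for the example. Feeding the bytes one at a time is the worst case for split character sequences.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper (not Node's internal implementation): turns raw chunks
// into parsed messages, assuming one JSON document per '\n'-terminated line.
function createMessageReader(onMessage) {
  const decoder = new StringDecoder('utf8');
  let pending = '';

  return function onChunk(chunk) {
    // Only complete characters come out of write(); partial ones stay buffered.
    pending += decoder.write(chunk);

    let newline;
    while ((newline = pending.indexOf('\n')) !== -1) {
      const line = pending.slice(0, newline);
      pending = pending.slice(newline + 1);
      if (line.length > 0) onMessage(JSON.parse(line));
    }
  };
}

const received = [];
const read = createMessageReader((msg) => received.push(msg));

const wire = Buffer.from(
  JSON.stringify({ text: 'héllo' }) + '\n' + JSON.stringify({ text: '日本語' }) + '\n',
  'utf8'
);

// Deliver a single byte per "read" so every multi-byte character is split.
for (let i = 0; i < wire.length; i += 1) {
  read(wire.subarray(i, i + 1));
}

console.log(received);
```

Keeping the decoder and the pending string per channel means a partially received character or line from one read is carried over to the next.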

isaacs · 2013-03-24T21:56:26Z

Yeah, StringDecoder is pretty efficient, and no new test failures as a result of this. LGTM.

bnoordhuis · 2013-03-25T12:39:49Z

Thanks. Landed in 44843a6 and back-ported to v0.8 in 84bb0ec.

bnoordhuis added 3 commits March 21, 2013 12:38

bench: add child process read perf benchmark

6d50ecf

child_process: fix sending utf-8 to child process, v2

9f7d0f8

Handle partial character sequences correctly, use a StringDecoder.

bnoordhuis closed this Mar 25, 2013

bnoordhuis deleted the fix-child-process-send branch March 25, 2013 12:40

bnoordhuis wants to merge 3 commits into nodejs:v0.10 from bnoordhuis:fix-child-process-send
