Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: convert string to Buffer when calling `unshift(<string>)` #27194

Merged
merged 1 commit into from Jun 2, 2019

Conversation

Projects
None yet
7 participants
@marcosc90
Copy link
Contributor

commented Apr 11, 2019

readable.unshift can take a string as an argument, but that
string wasn't being converted to a Buffer, which caused a
TypeError: Argument must be a buffer in some cases.
A second optional argument encoding was added to unshift to
prevent all strings being coerced to utf8.

Fixes: #27192

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

A documentation change will be needed, for the encoding argument, but wanted to submit the PR first to see if it will get approved. I will also make a few tests for this if everything is okay.

@Trott

This comment has been minimized.

Copy link
Member

commented Apr 11, 2019

Can you add a test for this?

@marcosc90 marcosc90 force-pushed the marcosc90:fix-27192-stream-unshift branch from 972fdda to 89a9c34 Apr 12, 2019

@marcosc90

This comment has been minimized.

Copy link
Contributor Author

commented Apr 12, 2019

@Trott added multiple cases, if something is missing or needs to be changed let me know.

@Trott

This comment has been minimized.

Copy link
Member

commented Apr 12, 2019

Thanks! Only other thing to add that I can see is the encoding argument to the documentation in doc/api/stream.md.

R: @nodejs/streams

@mcollina

This comment has been minimized.

Copy link
Member

commented Apr 12, 2019

I think this is semver-minor as it's adding a new parameter.

CI: https://ci.nodejs.org/job/node-test-pull-request/22398/
CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1821/

@nodejs-github-bot

This comment has been minimized.

@addaleax
Copy link
Member

left a comment

Can you also add a test for how this interacts with .setEncoding()?

@marcosc90

This comment has been minimized.

Copy link
Contributor Author

commented Apr 12, 2019

Can you also add a test for how this interacts with .setEncoding()?

I'll submit one in a few. Forgot about that scenario, thanks!

@marcosc90

This comment has been minimized.

Copy link
Contributor Author

commented Apr 13, 2019

@addaleax When doing that test, I remember that when state.decoder is set, the chunks are not converted to Buffer. The bug reported in #27192 won't happen, because when state.decoder is set, buffer.concat won't be used.

I don't think this PR should change that, I'm not even sure if it should be saved as a Buffer in that case, even though it's called BufferList, but I'm not very familiar with string decoder.

So I can test that the behaviour is unchanged, meaning that they're saved as string, and that data is emitted with the encoding set in .setEncoding.

Is that ok?

@marcosc90 marcosc90 force-pushed the marcosc90:fix-27192-stream-unshift branch from fa9b80e to cf28144 Apr 13, 2019

@marcosc90

This comment has been minimized.

Copy link
Contributor Author

commented Apr 13, 2019

There is one change left, but would like some feedback.

if (!state.objectMode) {
if (typeof chunk === 'string') {
encoding = encoding || state.defaultEncoding;
if (encoding !== state.encoding) {
chunk = Buffer.from(chunk, encoding);
encoding = '';
}
skipChunkCheck = true;
}

When .setEncoding is set and a chunk is .pushed, the convertion to that encoding is handled by the state.decoder, of course I can't use that same state.decoder for .unshifting otherwise it will mess with the internal buffer.

But in order to keep the same behaviour, I have to convert the pushed string, to the encoding defined by setEncoding, so the solution that I have right now working is the following:

if (addToFront && state.encoding && state.encoding !== encoding) {
  chunk = Buffer.from(chunk, encoding).toString(state.encoding);
} else if (encoding !== state.encoding) {
  chunk = Buffer.from(chunk, encoding);
  encoding = '';
}

Because the string must be saved in BufferList in the same encoding as setEncoding, so then is emitted to data that way.

Thoughts? Maybe someone has another alternative.

@marcosc90 marcosc90 force-pushed the marcosc90:fix-27192-stream-unshift branch from c25d1e0 to 22e814b Apr 13, 2019

@marcosc90

This comment has been minimized.

Copy link
Contributor Author

commented Apr 24, 2019

Is there something else that should be added/changed?

@mcollina
Copy link
Member

left a comment

I’ve left some notes related to the tests. I think we should remove the usage of private API as much as possible - the majority of this feature should be testable without it.

assert.strictEqual(
readable._readableState.buffer.head.data,
chunk
);

This comment has been minimized.

Copy link
@mcollina

mcollina Apr 24, 2019

Member

This test can be rewritten without using internals, it should consume the stream and check that the chunks comes out in order.

Show resolved Hide resolved test/parallel/test-stream-readable-unshift.js Outdated
assert.strictEqual(
readable._readableState.buffer.head.data.toString('utf8'),
string
);

This comment has been minimized.

Copy link
@mcollina

mcollina Apr 24, 2019

Member

This can be tested by consuming the stream and verifying that the data comes out in order.

@mcollina
Copy link
Member

left a comment

LGTM

@nodejs-github-bot

This comment was marked as outdated.

@Trott Trott added the author ready label Jun 2, 2019

@nodejs-github-bot

This comment has been minimized.

@Trott

Trott approved these changes Jun 2, 2019

stream: convert string to Buffer when calling `unshift(<string>)`
`readable.unshift` can take a string as an argument, but that
string wasn't being converted to a Buffer, which caused a
<TypeError: Argument must be a buffer> in some cases. Also if a
string was passed, that string was coerced to utf8 encoding.

A second optional argument `encoding` was added to `unshift` to
fix the encoding issue.

Fixes: #27192
PR-URL: #27194
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>

@Trott Trott force-pushed the marcosc90:fix-27192-stream-unshift branch from 4bcd0cf to df339bc Jun 2, 2019

@Trott

This comment has been minimized.

Copy link
Member

commented Jun 2, 2019

Landed in df339bc

@Trott Trott merged commit df339bc into nodejs:master Jun 2, 2019

1 of 2 checks passed

Travis CI - Pull Request Build Errored
Details
Travis CI - Branch Build Passed
Details

targos added a commit that referenced this pull request Jun 3, 2019

stream: convert string to Buffer when calling `unshift(<string>)`
`readable.unshift` can take a string as an argument, but that
string wasn't being converted to a Buffer, which caused a
<TypeError: Argument must be a buffer> in some cases. Also if a
string was passed, that string was coerced to utf8 encoding.

A second optional argument `encoding` was added to `unshift` to
fix the encoding issue.

Fixes: #27192
PR-URL: #27194
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>

targos added a commit that referenced this pull request Jun 3, 2019

2019-06-04, Version 12.4.0 (Current)
Notable changes:

* doc:
  * The JSON variant of the API documentation is no longer experimental
    (Rich Trott) #27842.
* esm:
  * JSON module support is always enabled under
    `--experimental-modules`. The `--experimental-json-modules` flag
    has been removed (Myles Borins)
    #27752.
* http,http2:
  * A new flag has been added for overriding the default HTTP server
    socket timeout (which is two minutes). Pass
    `--http-server-default-timeout=milliseconds`
    or `--http-server-default-timeout=0` to respectively change or
    disable the timeout. Starting with Node.js 13.0.0, the timeout will
    be disabled by default
    (Ali Ijaz Sheikh) #27704.
* inspector:
  * Added an experimental `--heap-prof` flag to start the V8 heap
    profiler on startup and write the heap profile to disk before exit
    (Joyee Cheung) #27596.
* stream:
  * The `readable.unshift()` method now correctly converts strings to
    buffers. Additionally, a new optional argument is accepted to
    specify the string's encoding, such as `'utf8'` or `'ascii'`
    (Marcos Casagrande) #27194.
* v8:
  * The object returned by `v8.getHeapStatistics()` has two new
    properties: `number_of_native_contexts` and
    `number_of_detached_contexts` (Yuriy Vasiyarov)
    #27933.

PR-URL: TODO

@targos targos referenced this pull request Jun 3, 2019

Merged

v12.4.0 release proposal #28040

targos added a commit that referenced this pull request Jun 3, 2019

2019-06-04, Version 12.4.0 (Current)
Notable changes:

* doc:
  * The JSON variant of the API documentation is no longer experimental
    (Rich Trott) #27842.
* esm:
  * JSON module support is always enabled under
    `--experimental-modules`. The `--experimental-json-modules` flag
    has been removed (Myles Borins)
    #27752.
* http,http2:
  * A new flag has been added for overriding the default HTTP server
    socket timeout (which is two minutes). Pass
    `--http-server-default-timeout=milliseconds`
    or `--http-server-default-timeout=0` to respectively change or
    disable the timeout. Starting with Node.js 13.0.0, the timeout will
    be disabled by default
    (Ali Ijaz Sheikh) #27704.
* inspector:
  * Added an experimental `--heap-prof` flag to start the V8 heap
    profiler on startup and write the heap profile to disk before exit
    (Joyee Cheung) #27596.
* stream:
  * The `readable.unshift()` method now correctly converts strings to
    buffers. Additionally, a new optional argument is accepted to
    specify the string's encoding, such as `'utf8'` or `'ascii'`
    (Marcos Casagrande) #27194.
* v8:
  * The object returned by `v8.getHeapStatistics()` has two new
    properties: `number_of_native_contexts` and
    `number_of_detached_contexts` (Yuriy Vasiyarov)
    #27933.

PR-URL: #28040

pull bot pushed a commit to Rachelmorrell/node that referenced this pull request Jun 4, 2019

2019-06-04, Version 12.4.0 (Current)
Notable changes:

* doc:
  * The JSON variant of the API documentation is no longer experimental
    (Rich Trott) nodejs#27842.
* esm:
  * JSON module support is always enabled under
    `--experimental-modules`. The `--experimental-json-modules` flag
    has been removed (Myles Borins)
    nodejs#27752.
* http,http2:
  * A new flag has been added for overriding the default HTTP server
    socket timeout (which is two minutes). Pass
    `--http-server-default-timeout=milliseconds`
    or `--http-server-default-timeout=0` to respectively change or
    disable the timeout. Starting with Node.js 13.0.0, the timeout will
    be disabled by default
    (Ali Ijaz Sheikh) nodejs#27704.
* inspector:
  * Added an experimental `--heap-prof` flag to start the V8 heap
    profiler on startup and write the heap profile to disk before exit
    (Joyee Cheung) nodejs#27596.
* stream:
  * The `readable.unshift()` method now correctly converts strings to
    buffers. Additionally, a new optional argument is accepted to
    specify the string's encoding, such as `'utf8'` or `'ascii'`
    (Marcos Casagrande) nodejs#27194.
* v8:
  * The object returned by `v8.getHeapStatistics()` has two new
    properties: `number_of_native_contexts` and
    `number_of_detached_contexts` (Yuriy Vasiyarov)
    nodejs#27933.

PR-URL: nodejs#28040
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.