stream: improve performance for sync write finishes #30710

Closed
addaleax wants to merge 2 commits from addaleax:writable-nocb

Member

addaleax commented Nov 29, 2019

Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                     confidence improvement accuracy (*)    (**)   (***)
streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: #18013
Refs: #18367

@nodejs/streams @ronag

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
  state.afterWriteTickInfo.count++;
} else {
  state.afterWriteTickInfo = { count: 1, cb, stream, state };
  process.nextTick(afterWriteTick, state.afterWriteTickInfo);

mscdex Nov 29, 2019

Contributor

Is there any difference in just using afterWrite directly here (process.nextTick(afterWrite, stream, ...))?

addaleax Nov 29, 2019

Author Member

@mscdex We need to allocate an object anyway so that we can modify count later, so that’s why it’s not just spreading the arguments right now
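
To illustrate the point about needing an object: arguments handed to process.nextTick() are fixed when the tick is scheduled, whereas a shared object can still have its count bumped before the tick runs. The following is only a minimal sketch under that reading, not the code in this PR; createCoalescer, onSyncWriteDone and the injectable schedule parameter are hypothetical names used for illustration.

```javascript
'use strict';

// Hypothetical helper for illustration; not the implementation from this PR.
// `schedule` defaults to process.nextTick and is injectable so the sketch can
// be run deterministically.
function createCoalescer(schedule = process.nextTick) {
  let pending = null; // info object for the tick that is currently scheduled

  function flush(info) {
    pending = null; // the next completion schedules a fresh tick
    for (let i = 0; i < info.count; i++) info.cb();
  }

  return function onSyncWriteDone(cb) {
    if (pending !== null && pending.cb === cb) {
      // Same callback as the scheduled tick: mutate the shared object
      // instead of queueing another tick.
      pending.count++;
    } else {
      pending = { count: 1, cb };
      schedule(flush, pending);
    }
  };
}

// Usage with a fake scheduler that records ticks instead of running them.
const queue = [];
const onSyncWriteDone = createCoalescer((fn, arg) => queue.push(() => fn(arg)));

let calls = 0;
const cb = () => { calls++; };
for (let i = 0; i < 1000; i++) onSyncWriteDone(cb);

const ticksQueued = queue.length; // a single tick covers all 1000 completions
queue.shift()();                  // run that tick
console.log({ ticksQueued, calls });
```

Calling createCoalescer() without an argument would use process.nextTick() instead of the recording scheduler.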

Member

mcollina left a comment

lgtm

how did you find out about this problem? Was this tied to a specific use case? I almost never pass a callback to write.

Member Author

addaleax commented Nov 29, 2019

> how did you find out about this problem? Was this tied to a specific use case? I almost never pass a callback to write.

Well … Somebody asked for help privately because they were running a Node.js program, which consisted of a single synchronous loop and printed output using console.log() on each iteration, which led to memory exhaustion after a few hours because the process.nextTick() queue was filling up 🙃
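
A rough reproduction of the shape of that program, as a sketch only: the sink stream and the onWrite counter are made up for illustration, and the loop is kept short. The write callbacks are deferred via process.nextTick(), so none of them can run while the synchronous loop is still going; without batching, each finished write would presumably leave its own entry in that queue.

```javascript
'use strict';
const { Writable } = require('stream');

// Illustrative sink whose write() completes synchronously, as a number of
// built-in streams can do.
const sink = new Writable({
  write(chunk, encoding, callback) {
    callback(); // finishes within the same tick
  },
});

let completed = 0;
const onWrite = () => { completed++; };

// A synchronous loop that never yields to the event loop.
for (let i = 0; i < 1e5; i++) {
  sink.write('x', onWrite);
}

// Write callbacks are invoked asynchronously, so none has run at this point.
const completedInsideLoop = completed;

setImmediate(() => {
  // By now the nextTick queue has drained and the callbacks have run.
  console.log({ completedInsideLoop, completedLater: completed });
});
```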

Member

ronag commented Nov 29, 2019

LGTM

Using the callback is very uncommon in my experience. I'm not sure the extra complexity is worth it? I'm a little worried about the maintenance cost (in general) of streams.

I would maybe instead consider looking into why console.log uses a callback and whether it's strictly necessary.

ronag approved these changes Nov 29, 2019

Member

ronag commented Nov 29, 2019

Actually, looking into this further, this is not an optimization just for the callback case; it applies in general when doing multiple write calls in the same tick.

lib/_stream_writable.js (outdated)

Member Author

addaleax commented Nov 29, 2019

> Using the callback is very uncommon in my experience. I'm not sure the extra complexity is worth it? I'm a little worried about the maintenance cost (in general) of streams.

Yeah, this also applies when no callback is passed -- that being said, I would understand if people were concerned about this working only for streams that potentially call write callbacks synchronously (although a number of builtin streams do that).

nodejs-github-bot commented Nov 29, 2019

}
}
}

- function afterWrite(stream, state, cb) {
+ function afterWriteTick({ stream, state, count, cb }) {

ronag Nov 29, 2019

Member

This might clear the wrong object. I think clearing the count and cb of the passed object is safer than modifying state?

function afterWriteTick(info) {
  const { stream, state, count, cb } = info;
  info.cb = null;
  return afterWrite(stream, state, count, cb);
}

This would also allow reusing the object and avoiding allocations:

if (!state.afterWriteTickInfo || state.afterWriteTickInfo.cb) {
  state.afterWriteTickInfo = { stream, state, cb, count: 1 };
} else {
  state.afterWriteTickInfo.cb = cb;
  state.afterWriteTickInfo.count = 1;
}

ronag Nov 29, 2019

Member

Not sure if it matters though

ronag Nov 29, 2019

Member

This comment is for the row below.

addaleax Nov 29, 2019

Author Member

@ronag So … the effect of setting afterWriteTickInfo to null is that the next time the code above is reached, a new process.nextTick() call with a new afterWriteTickInfo object is made. That’s always safe, right?

I think setting .cb to null would have the same effect, and .count is cleared anyway. I can do that instead, if you prefer, although it might screw with the map/hidden class of afterWriteTickInfo, as .cb is always a function right now.

ronag Nov 30, 2019

Member

I was more thinking of the case where you have two different cbs, e.g.

write('a', cba) // schedule tick a
write('b', cbb) // clear info a, schedule tick b

// ...

// tick a
// clear info b

// tick b
// clear nothing

The a tick will actually clear the info for the b tick.

Probably not a problem, but maybe a little weird... I don't have a strong opinion if you think it's fine.

ronag Nov 30, 2019

Member

> it might screw with the map/hidden class

Oh, I didn't know that null could cause problems with that once it's been a function type.

addaleax Nov 30, 2019

Author Member

> The a tick will actually clear the info for the b tick.
>
> Probably not a problem, but maybe a little weird... I don't have a strong opinion if you think it's fine.

Yeah, I think that’s fine, because it would only make a difference if there’s a write('b', cbb) inside cba(), and that seems like a somewhat unlikely scenario, and even then it would only affect performance, not behaviour.
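
The scenario can be traced with a small model. This is a sketch with hypothetical names (createModel, ticks, scheduled), where an array stands in for the process.nextTick() queue so the run is deterministic; it is not the stream implementation itself. With the unconditional clear, a write('b', cbb) issued from inside cba() costs one extra tick, while the callbacks still run in the same order and the same number of times.

```javascript
'use strict';

// Simplified model; `ticks` stands in for the process.nextTick() queue.
function createModel() {
  const ticks = [];
  const state = { afterWriteTickInfo: null, scheduled: 0 };

  function afterWriteTick(info) {
    state.afterWriteTickInfo = null; // unconditional clear, as in the thread
    for (let i = 0; i < info.count; i++) info.cb();
  }

  function write(cb) {
    const info = state.afterWriteTickInfo;
    if (info !== null && info.cb === cb) {
      info.count++; // batched into the tick that is already scheduled
    } else {
      state.afterWriteTickInfo = { count: 1, cb };
      state.scheduled++;
      ticks.push(state.afterWriteTickInfo);
    }
  }

  function drain() {
    while (ticks.length > 0) afterWriteTick(ticks.shift());
  }

  return { state, write, drain };
}

const order = [];
const model = createModel();
const cbb = () => { order.push('b'); };
const cba = () => {
  order.push('a');
  model.write(cbb); // a write issued from inside cba()
};

model.write(cba); // schedules tick a
model.write(cbb); // different callback: schedules tick b
model.drain();    // tick a clears the info for tick b, so the inner write
                  // schedules a third tick instead of joining tick b

console.log(order.join(','), model.state.scheduled);
```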

ronag mentioned this pull request Nov 30, 2019
4 of 4 tasks complete
addaleax added a commit that referenced this pull request Dec 1, 2019
Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                         confidence improvement accuracy (*)    (**)   (***)
    streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
    streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: #18013
Refs: #18367

PR-URL: #30710
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Member Author

addaleax commented Dec 1, 2019

Landed in 2205f85

addaleax closed this Dec 1, 2019
addaleax deleted the addaleax:writable-nocb branch Dec 1, 2019
targos added a commit that referenced this pull request Dec 1, 2019
BridgeAR mentioned this pull request Dec 3, 2019
targos added a commit that referenced this pull request Dec 5, 2019
BethGriggs mentioned this pull request Dec 9, 2019
Sebastien-Ahkrin added a commit to Sebastien-Ahkrin/node that referenced this pull request Dec 11, 2019
MylesBorins added a commit that referenced this pull request Dec 17, 2019
BethGriggs mentioned this pull request Dec 23, 2019
Hakerh400 added a commit to Hakerh400/node that referenced this pull request Jan 9, 2020
Pull request nodejs#30710 has
introduced afterWriteTickInfo property for optimizing
synchronous write completions and it is backported
to v12.x as a semver-minor release, but it breaks
use case of JSON stringifying writable stream object.
This PR makes that property non-enumerable.
Hakerh400 mentioned this pull request Jan 9, 2020
3 of 3 tasks complete
Hakerh400 added a commit to Hakerh400/node that referenced this pull request Jan 10, 2020
Pull request nodejs#30710 has
introduced afterWriteTickInfo property for optimizing
synchronous write completions and it is backported
to v12.x as a semver-minor release, but it breaks
use case of JSON stringifying writable stream object.
This PR hides that property behind a symbol.
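
One plausible reading of the breakage, stated as an assumption: the info object created in this PR holds a reference back to the state that owns it, so JSON.stringify() runs into a cycle once the property is set. The sketch below uses plain stand-in objects rather than real stream state, and the symbol name kAfterWriteTickInfo is made up for illustration. It shows that both follow-up approaches (a non-enumerable property, or a symbol key) keep the property out of the JSON output.

```javascript
'use strict';

// Stand-in objects for illustration; these are not real WritableState objects.
const kAfterWriteTickInfo = Symbol('kAfterWriteTickInfo'); // assumed name

// 1. Plain enumerable property: the info object points back at its owner.
const plain = { length: 0 };
plain.afterWriteTickInfo = { count: 1, state: plain };
let plainError = null;
try {
  JSON.stringify(plain);
} catch (err) {
  plainError = err; // circular structure
}

// 2. Non-enumerable property: skipped by JSON.stringify().
const nonEnumerable = { length: 0 };
Object.defineProperty(nonEnumerable, 'afterWriteTickInfo', {
  value: { count: 1, state: nonEnumerable },
  writable: true,
  enumerable: false,
});
const nonEnumerableJson = JSON.stringify(nonEnumerable);

// 3. Symbol-keyed property: also skipped by JSON.stringify().
const hidden = { length: 0 };
hidden[kAfterWriteTickInfo] = { count: 1, state: hidden };
const hiddenJson = JSON.stringify(hidden);

console.log(plainError instanceof TypeError, nonEnumerableJson, hiddenJson);
```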
Hakerh400 added a commit to Hakerh400/node that referenced this pull request Jan 12, 2020
Hakerh400 added a commit to Hakerh400/node that referenced this pull request Jan 13, 2020
Hakerh400 added a commit to Hakerh400/node that referenced this pull request Jan 13, 2020