
scale the socket receive buffer on the fly #39092 (#39097)

Open · wants to merge 4 commits into main

Conversation

@mmomtchev (Contributor)

As an experiment, I tried scaling the receive buffer on the fly depending on how full it was on the last iteration.
It has the advantage of using less memory for slow streams and being more efficient for faster streams.

It does not allow for manual control (but this can probably be added), and it effectively short-circuits the backpressure mechanism, since the first read always gets through, which is not very well suited for streams originating from the network.
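
As an illustration only (the PR itself changes the C++ read-buffer allocation in src/stream_wrap.cc), the heuristic could be sketched in JavaScript roughly like this; the constants are hypothetical:

```js
// Grow the next allocation whenever the previous read filled the buffer
// completely: slow streams keep a small buffer, fast streams ramp up.
const INITIAL_SIZE = 64 * 1024;  // the current fixed read size
const MAX_SIZE = 1024 * 1024;    // hypothetical upper bound

let readSize = INITIAL_SIZE;

function allocReadBuffer() {
  return Buffer.allocUnsafe(readSize);
}

function afterRead(bytesRead) {
  // A completely full buffer suggests the kernel had more data queued,
  // so double the next allocation (up to the cap).
  if (bytesRead === readSize) {
    readSize = Math.min(readSize * 2, MAX_SIZE);
  }
}
```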

github-actions bot added the c++ (Issues and PRs that require attention from people who are familiar with C++.) and needs-ci (PRs that need a full CI run.) labels on Jun 20, 2021
mmomtchev changed the title from "scale the socket receive buffer on the fly #39062" to "scale the socket receive buffer on the fly #39092" on Jun 20, 2021
@mmomtchev (Contributor, Author)

Sorry, I meant to refer to #39092

@mscdex (Contributor) commented Jun 20, 2021

Can't the onread feature of net.Socket already achieve the same goal?

@mmomtchev (Contributor, Author)

> Can't the onread feature of net.Socket already achieve the same goal?

Not if you are piping (unless this is implemented in the piping too).

src/stream_wrap.cc: two review threads (outdated, resolved)
@mscdex (Contributor) commented Jun 20, 2021

> Not if you are piping (unless this is implemented in the piping too).

I don't follow. If you had socket.push(data) inside the onread callback and had the buffer as a function returning Buffer instances of varying sizes to provide the scaling, would that not allow the socket to be piped like normal?
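
A hedged sketch of what this suggestion might look like in user code; the port and the size constants are made up for illustration, and the growth rule mirrors the one from this PR:

```js
'use strict';
const net = require('net');

let size = 4 * 1024;           // start small, grow on demand
const MAX = 256 * 1024;        // illustrative cap

const socket = net.connect({
  port: 8080,                  // hypothetical server
  onread: {
    // Passing a function means every read can get a freshly sized buffer.
    buffer: () => Buffer.allocUnsafe(size),
    callback(nread, buf) {
      // A full buffer on the last read: double the next allocation.
      if (nread === size && size < MAX) size *= 2;
      // Re-inject the chunk into the readable side so pipe() still works.
      // Buffer.from() copies, since buf must not be retained past this call.
      // Returning push()'s result pauses reads when the consumer lags.
      return socket.push(Buffer.from(buf.subarray(0, nread)));
    },
  },
});

socket.pipe(process.stdout);   // normal piping on top of onread
```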

@mmomtchev (Contributor, Author)

> > Not if you are piping (unless this is implemented in the piping too).
>
> I don't follow. If you had socket.push(data) inside the onread callback and had the buffer as a function returning Buffer instances of varying sizes to provide the scaling, would that not allow the socket to be piped like normal?

The way I understand this part of the code, this buffer is static and it is reused for every write? That is a very good idea performance-wise, but I don't think you can make it work with the current piping implementation?

src/stream_wrap.h: review thread (outdated, resolved)
@mscdex (Contributor) commented Jun 21, 2021

> The way I understand this part of the code, this buffer is static and it is reused for every write?

Not necessarily; the buffer property can also be a function that returns any Buffer that should be used, so it could return a new Buffer every time.

@mmomtchev (Contributor, Author)

> > The way I understand this part of the code, this buffer is static and it is reused for every write?
>
> Not necessarily; the buffer property can also be a function that returns any Buffer that should be used, so it could return a new Buffer every time.

Yes, I agree; I had missed the part in stream_base_commons.js which reallocates the buffer on every read. This would leave it to the user to scale the buffer? Or do you propose that the same mechanism be implemented in the JS part of the library?

@mscdex (Contributor) commented Jun 21, 2021

> This would leave it to the user to scale the buffer? Or do you propose that the same mechanism be implemented in the JS part of the library?

I would lean more towards the former, unless there is strong evidence, with data to back it up, that this scaling behavior should exist in Node core by default.

src/stream_wrap.h: review thread (outdated, resolved)
Co-authored-by: Darshan Sen <raisinten@gmail.com>
@mmomtchev (Contributor, Author)

> > This would leave it to the user to scale the buffer? Or do you propose that the same mechanism be implemented in the JS part of the library?
>
> I would lean more towards the former, unless there is strong evidence, with data to back it up, that this scaling behavior should exist in Node core by default.

This is a subject of discussion, of course.
My rationale is that currently a 20 MB/s input stream (my example) requires about 320 event loop iterations per second to process, since reads come in fixed 64 KB chunks (20 MB / 64 KB = 320). An application might have only 10% overall CPU load, but if that load comes in chunks of 5 ms of processing time each, it won't be able to keep up. And the end result won't be a 10 MB/s input stream; it will be a constantly stalling input stream, one that sets its TCP window to zero 10 times per second. In my case the 20 MB/s stream was going at about 10 KB/s.
It is an absolutely horrible behavior.
But in fact you don't need to change anything in Node to avoid it: setting the highWaterMark of the socket solves the problem (a sketch follows the list below). It is just that this is not a very well documented parameter; in fact, if you are not aware of Node's internals, there is no way you can find it yourself.

  • Making an explicit note in the documentation is one solution.
  • Exposing this parameter in http.request is another.
  • Asking the user to manually scale the buffer works too.
  • But this solution is automatic, and it also has better performance for all high-bandwidth incoming streams, even if the receiving application is doing nothing but waiting for the download to complete.
  • And there is the simple fact that the backpressure mechanics work very well for pipes where Node controls both ends, but are simply ill-suited for streams coming from the network, where Node has no control over the sender.
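
For reference, a minimal sketch of the highWaterMark workaround mentioned above. The host, port, and the 1 MB value are illustrative; the option appears to be forwarded by net.Socket to its stream.Duplex base rather than being documented on the socket itself, which is exactly why it is so easy to miss:

```js
'use strict';
const net = require('net');

// With a larger readable high-water mark the socket can buffer much more
// data before backpressure pauses reading, so a fast sender is no longer
// throttled after every single 64 KB chunk.
const socket = net.connect({
  host: 'example.com',               // hypothetical peer
  port: 80,
  readableHighWaterMark: 1024 * 1024,
});

socket.pipe(process.stdout);
```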

@mscdex (Contributor) commented Jun 21, 2021

The net benchmarks look fine to me. I wonder if this will have a noticeable impact on memory usage one way or the other in practice?

@mmomtchev (Contributor, Author)

> The net benchmarks look fine to me. I wonder if this will have a noticeable impact on memory usage one way or the other in practice?

It uses less memory for slow connections; and when it comes to fast connections, don't forget that the kernel too allocates up to 256 KB per successful TCP connection (i.e. after the SYN exchange), so normally, if it worked before, it should continue to work.

I will be on solar power for some time, so I can't build Node from here to test it.

I made a draft PR because there are a few points that need discussing:

  • Is scaling back really needed?
    • I had horrible yoyo effects when I tried scaling back at 1/2 (one common way to damp that is sketched after this list).
  • Should we break bad applications that are used to receiving 64 KB per read?
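
Purely as an illustration of one way to avoid such yoyo oscillation, not the code in this PR: grow immediately on a full read, but shrink only after a sustained run of mostly-empty reads (all thresholds here are made up):

```js
// Grow aggressively, shrink conservatively (hysteresis), so the size
// does not bounce between two values on a bursty stream.
const MIN = 64 * 1024;
const MAX = 1024 * 1024;
const SHRINK_AFTER = 8;        // consecutive mostly-empty reads

let size = MIN;
let smallReads = 0;

function nextSize(bytesRead) {
  if (bytesRead === size) {
    smallReads = 0;
    size = Math.min(size * 2, MAX);       // full read: double
  } else if (bytesRead < size / 4) {
    if (++smallReads >= SHRINK_AFTER) {   // sustained low usage: halve
      smallReads = 0;
      size = Math.max(size / 2, MIN);
    }
  } else {
    smallReads = 0;                       // moderate usage: hold steady
  }
  return size;
}
```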

@mscdex (Contributor) commented Jun 24, 2021

> • Should we break bad applications that are used to receiving 64 KB per read?

Seeing as how we're dealing with streams here, nobody should be relying on chunk sizes.

mmomtchev marked this pull request as ready for review on July 8, 2021 13:29
@aduh95 (Contributor) commented Sep 20, 2023

What's the status here? If we still want to land this, it should be rebased on top of main.
