pack backpressure #114
PR makes sense. Will merge when I’m at my office. Curious, why aren’t you using the streaming API instead of passing the full buffer?
I'm doing this at the end of a series of object streams: object_stream() -> object_stream() -> tar_stream() -> shell_stream()
When it comes time to pack files, I am extracting some fields from the object to generate the header and then calling
At this point I already have the data as a string so it seemed appropriate to use the buffer interface.
I think also the requirement to specify a byte size up-front makes the streaming interface less attractive compared to the buffer interface. I'm guessing that this is due to a requirement of the TAR format, presumably a leading byte which indicates the content length?
Yes, tar requires you to declare the size upfront. Use case makes sense.
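To make the size requirement concrete, here is a minimal sketch of how the ustar header layout stores an entry's length: the size is a 12-byte field at offset 124 of the 512-byte header block, written as zero-padded octal ASCII, and the header precedes the entry's data. The helper names `encodeSizeField` and `sketchHeader` are hypothetical and not part of tar-stream; the sketch also leaves out the checksum and the other fields a valid header needs.

```javascript
// Hypothetical helper: encodes a byte length the way a ustar header stores it.
function encodeSizeField(size) {
  if (!Number.isInteger(size) || size < 0 || size > 0o77777777777) {
    throw new RangeError('size must be an integer that fits in 11 octal digits');
  }
  // 11 zero-padded octal digits followed by a NUL terminator = 12 bytes.
  return Buffer.from(size.toString(8).padStart(11, '0') + '\0', 'ascii');
}

// Hypothetical helper: fills in only the name and size of a 512-byte header block.
// A real header also needs mode, mtime, a checksum and more (omitted here).
function sketchHeader(name, size) {
  const header = Buffer.alloc(512);
  header.write(name, 0, 100, 'utf8');      // name field: offset 0, 100 bytes
  encodeSizeField(size).copy(header, 124); // size field: offset 124, 12 bytes
  return header;
}

const header = sketchHeader('hello.txt', 11);
const sizeDigits = header.subarray(124, 135).toString('ascii');
console.log(sizeDigits); // 00000000013 (11 in octal)
```

Because the header has to be written before the first byte of content, a packer needs the length either from the caller or from buffering the whole entry first.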
Out in 2.1.2, thanks!
What about making it optional, and trying to find the actual length if it's not defined? Or would that lead to problems like memory usage?
@piranna that would require buffering the entire stream, so in that case just use the API this PR improves.
Yes, I suppose, but in that case, is it possible to do it automatically? :-)
You could magically do it on the stream, yes, but I think that'd lead to serious bugs for people not realising it buffers. Better docs for this is prob the way to go if we wanna improve it.
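The hidden cost discussed above can be sketched as follows. This is only an illustration of what "find the length automatically" would involve, not anything tar-stream does: `bufferToKnowLength` is a hypothetical helper, and it has to hold every chunk in memory before the size is known.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of tar-stream): drains a readable stream
// into memory so that its total byte length becomes known.
async function bufferToKnowLength(source) {
  const chunks = [];
  let size = 0;
  for await (const chunk of source) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    chunks.push(buf);
    size += buf.length; // memory use grows with the full length of the stream
  }
  return { size, data: Buffer.concat(chunks, size) };
}

async function demo() {
  const source = Readable.from(['hello', ' ', 'world']);
  const { size, data } = await bufferToKnowLength(source);
  console.log(size, data.toString());
  return size;
}

const demoResult = demo();
```

Once the data has been collected this way, it is already a single buffer, so passing it to the buffer interface is the natural next step.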
I like the options that developers have, they cater to different use cases:
- I have a stream-like thing and I know the length up-front (eg. a file)
- I have some bytes in memory already
- I have a stream-like thing and I don't know the length up-front (eg. an object stream)
  - ...and I am constrained by memory but am willing to spend CPU
  - ...I am not worried about memory usage
My idea was to do under the hood something like:
But you are right, explaining the use cases in the docs seems a better alternative, probably the developer knows it beforehand :-)
Heya,
I'm using this amazing lib to `pack` GBs of geographic data and then pipe it on to `bzip2` for compression.
Unfortunately `bzip2` is sloooow, so those GBs I packed end up being buffered in nodejs memory waiting for `bzip2` to ask for more.
The source of my memory woes seems to be that the return value from `this.push()` isn't being checked, which results in `pack._readableState.buffer.length` growing uncontrolled until I run out of RAM 😭
Looking at the source code there is already a `this._drain` variable which is perfect for implementing backpressure:
I've added a simple test case which I'm happy to clean up if this PR is acceptable?
The differences you'll notice in how the test displays are:
Please let me know if you think this is something you would consider including 🙇
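For readers unfamiliar with the mechanism, the backpressure idea described above can be sketched in isolation. This is not the PR's diff and not tar-stream's `Pack` class: `QueueingSource` and its `send()` method are hypothetical, and the `_drain` name is borrowed only to mirror the description. The sketch assumes the usual Node.js `Readable` contract, where `push()` returns `false` once the internal buffer reaches `highWaterMark` and `_read()` is called when the consumer wants more.

```javascript
const { Readable } = require('stream');

// Hypothetical class for illustration; it is not tar-stream's Pack.
class QueueingSource extends Readable {
  constructor(options) {
    super(options);
    this._drain = null; // callback held back while the consumer is behind
  }

  // Producer-facing method: cb fires once it is OK to send more data.
  send(chunk, cb) {
    if (this.push(chunk)) {
      process.nextTick(cb); // buffer is below highWaterMark, continue right away
    } else {
      this._drain = cb;     // buffer is full, wait for the next _read()
    }
  }

  // Called by the stream machinery when the consumer wants more data.
  _read() {
    const drain = this._drain;
    this._drain = null;
    if (drain) drain();
  }
}

const events = [];
const src = new QueueingSource({ highWaterMark: 4 });

// 4 bytes fill the buffer, so the callback is held until something reads.
src.send('aaaa', () => events.push('may send more'));

setImmediate(() => {
  events.push('consumer reads');
  src.read(); // empties the buffer, which triggers _read() and the held callback
  console.log(events);
});
```

A producer that only sends its next chunk from inside the callback will stall while a slow consumer such as a compressor catches up, instead of piling data up in the readable buffer.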