Pull based streams: ability to read $num items #74
Comments
This is good data, thank you! I would be especially interested in @isaacs's experiences with this, as I believe he has mentioned this a few times. I've also heard him say that, toward the backpressure strategy point, high water marks are probably unnecessary (see #13). The latter would, to me, indicate we'd probably want to expose even more low-level buffer/backpressure-management primitives, and force most of the work into userspace (see #24), but without the hazards you mention which cause waste by maintaining a separate buffer. In this case the idea might be to allow building such higher-level things in userspace.

On the other hand, the idea of mapping to low-level OS APIs is a very compelling one, and probably what drove Node in that direction in the first place. I am hesitant to do anything that takes us away from that. I am just unsure whether such APIs might be an attractive nuisance, or something that is implemented inconsistently among streams, or only works for binary streams, or similar. What has been your experience for these binary-file-format use cases when working with Node's APIs? Are they sufficient? Are they ergonomic?
I am unfamiliar with using read(n) in "object mode". Generally, though, abstractions have a type that you are concerned with (similar to how typed arrays use views), and it may be handy to swap between them. In nodejs/readable-stream#30 @TooTallNate suggests a peek instead.
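A peek of the kind suggested in nodejs/readable-stream#30 can be sketched in user land by caching one chunk from a pull-based source. This is only an illustration; `PeekableSource` and `pull` are invented names, not part of any proposed API:

```javascript
// Sketch: peek() built on a pull-based async source by caching one chunk.
// `pull` is any async function returning the next chunk, or null at EOF.
class PeekableSource {
  constructor(pull) {
    this.pull = pull;
    this.cached = undefined;
    this.hasCached = false;
  }
  // Look at the next chunk without consuming it.
  async peek() {
    if (!this.hasCached) {
      this.cached = await this.pull();
      this.hasCached = true;
    }
    return this.cached;
  }
  // Consume the next chunk (the cached one, if peek() ran first).
  async read() {
    if (this.hasCached) {
      this.hasCached = false;
      return this.cached;
    }
    return this.pull();
  }
}
```

The appeal of peek over read(n) is that it never has to split or rejoin chunks, so it works identically for byte streams and object streams.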
A good start to working with binary data structures is …

As for the developer experience: overall I am not against read(n).
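On the binary-data-structures point above: DataView is the standard JavaScript primitive for fixed-layout binary parsing, and shows the kind of work a read(n)-style API would feed into. The 8-byte header format here (a uint32 magic followed by a little-endian uint32 length) is invented purely for illustration:

```javascript
// Sketch: parse a fixed 8-byte binary header with DataView.
// The layout is hypothetical: bytes 0-3 are a big-endian magic number,
// bytes 4-7 are a little-endian payload length.
function parseHeader(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const magic = view.getUint32(0);         // big-endian (DataView default)
  const length = view.getUint32(4, true);  // little-endian
  return { magic, length };
}
```

A header like this is exactly the case where read(8) is convenient: you want precisely the header bytes, with the payload left queued in the stream.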
Before joining here, I was editing another streams spec at W3C [1] with the goal of defining a single primitive set together with Domenic. That spec focused mainly on binary handling; please take a look. I'm planning to provide a read(n)-style interface as an extension for binary (or similar) streams in a separate doc. Since the core primitive is going to support arbitrary objects, which are in general neither divisible nor joinable, introducing read(n) into the core complicates it considerably. I actually tried to do so in the W3C version, and it was very hard. The only feasible way to introduce read(n) into the core is to define the "size" of every object as 1.

[1] https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm
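The "size of every object is 1" observation can be made concrete with a small sketch: for an object stream, read(n) can only mean "dequeue up to n whole objects", because arbitrary objects cannot be split or concatenated the way byte chunks can. `ObjectQueue` is an illustrative name, not a proposed API:

```javascript
// Sketch: the only consistent meaning of read(n) for arbitrary objects
// is "remove up to n whole items", i.e. every object has size 1.
class ObjectQueue {
  constructor() {
    this.items = [];
  }
  enqueue(item) {
    this.items.push(item);
  }
  // Returns up to n items; fewer if the queue runs out.
  read(n) {
    return this.items.splice(0, n);
  }
}
```

Contrast this with bytes, where read(n) can also split a queued chunk in two; that asymmetry is why the comment argues for keeping read(n) out of the object-stream core.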
My idea is briefly:
|
This could be covered by https://github.com/whatwg/streams/blob/master/BinaryExtension.md
I believe almost all of the features discussed here have been implemented. See e601d69
The ability to read a specific number of things at a time, leaving the remaining data queued, is an important feature of streams. It is present in the low-level OS APIs as well:
Unix read: http://man7.org/linux/man-pages/man2/read.2.html
AIO: http://man7.org/linux/man-pages/man7/aio.7.html , http://man7.org/linux/man-pages/man3/aio_read.3.html
Windows: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365467(v=vs.85).aspx
IOCP: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365198(v=vs.85).aspx (uses ReadFile)
In particular, use cases around reading binary file formats are greatly simplified if we do not have to maintain a buffer in addition to the OS buffer for a stream. Even if the resulting `read` does not contain all of the data we need, we can maintain a single buffer for the segment of the file we are processing, rather than one for that segment plus one in case we need to "unshift" (in user land) data that falls outside it.

While we could just add user-land wrappers to do this, the situation gets blurry around memory consumption and takes us a step away from back pressure. If we naively queue up chunks of memory in a user-land buffer, we can negate back pressure altogether, so a user-land buffering stream must itself take care to preserve back pressure. Beyond the back-pressure concern when a stream is blocked by a long-standing operation (such as waiting on an HTTP response) before wanting more data, the user-land solution also wastes memory and some CPU (for re-forming a pull-based stream).
Currently, there is no way to work with the low-level buffer that backs a stream without also maintaining a separate buffer for pull-based workflows. I propose we include the ability to limit the maximum number of items that can be pulled off of a stream while using `read`.