This repository has been archived by the owner on Jul 31, 2018. It is now read-only.

EP-003: stream.Readable#readInto #27

Closed
wants to merge 3 commits

Conversation

@indutny (Member Author) commented Jun 1, 2016

cc @nodejs/collaborators @nodejs/streams

// Asynchronously read data into `buf.slice(offset, offset + length)`.
// Invoke the continuation with either an error (`err`) or the number of
// bytes read (`length`).
stream.readInto(buf, offset, length, (err, length) => {
  // ...
});
@jasnell (Member) Jun 1, 2016:

It may be worthwhile going ahead and passing buf, offset, and length to the callback as additional arguments (or buf.slice(offset, offset + length)).

@indutny (Member Author):

Agreed, though passing a slice would probably run contrary to the performance goals of this feature.

@jasnell (Member):

Yeah, agreed. Simply passing the additional buf and offset arguments on should be good enough.

@indutny (Member Author):

Ack.
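For concreteness, here is a hypothetical sketch (not part of the EP itself) of the callback shape agreed on above, with `buf` and `offset` forwarded as additional arguments. `makeReadable` is a mock standing in for a real stream:

```javascript
// Hypothetical sketch of the discussed callback signature; `makeReadable`
// is a stand-in for a stream, not the actual EP implementation.
function makeReadable(chunks) {
  let pending = Buffer.concat(chunks);
  return {
    // Internal hook: copy up to `length` bytes into buf starting at `offset`.
    _readInto(buf, offset, length, cb) {
      const n = Math.min(length, pending.length);
      pending.copy(buf, offset, 0, n);
      pending = pending.slice(n);
      process.nextTick(cb, null, n);
    },
    // Public API: invoke the continuation with the extra arguments
    // (err, bytesRead, buf, offset), avoiding any slice allocation.
    readInto(buf, offset, length, cb) {
      this._readInto(buf, offset, length, (err, bytesRead) => {
        cb(err, bytesRead, buf, offset);
      });
    },
  };
}

const stream = makeReadable([Buffer.from('hello')]);
const buf = Buffer.alloc(16);
stream.readInto(buf, 0, buf.length, (err, bytesRead, b, offset) => {
  console.log(b.slice(offset, offset + bytesRead).toString()); // prints 'hello'
});
```

Forwarding `buf` and `offset` keeps the callback allocation-free, matching the performance goal discussed above.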

000-index.md Outdated
@@ -2,3 +2,4 @@

* [Public C++ Streams](001-public-stream-base.md)
* [ES6 Module Interoperability](002-es6-modules.md)
* [stream.Readable#readInto()](003-readable-read-into.md)
Contributor:

Could you remove this line and rename 003-* to XXX-*? Whoever merges the PR changes those, since another EP might land before this one. :)

@indutny (Member Author):

Ack. 👍

@trevnorris (Contributor):

Totally dig the idea.

@indutny (Member Author) commented Jun 2, 2016

So, does this mean I need to change "DRAFT" to "ACCEPTED"? What is required for that?

@trevnorris (Contributor):

@indutny Well, if it were possibly considered controversial then it would need to be voted on by the CTC. I would at least like to see more of the CTC give their thoughts before it's ACCEPTED. Meaning, if an EP reaches ACCEPTED then the PR accompanying it will definitely be accepted, and there shouldn't be much if any discussion between CTC members about the PR landing (this doesn't include other devs who feel the need to chime in).

@indutny (Member Author) commented Jun 2, 2016

@trevnorris makes sense. Thank you!

@indutny (Member Author) commented Jun 2, 2016

cc @nodejs/ctc then

@jasnell (Member) commented Jun 2, 2016

Fwiw, I'm +1 on this.

@chrisdickinson:

Who is expected to call ._readInto? .read(), or a new public .readInto()? Does this interact with .pipe?

@chrisdickinson:

(Also: is it possible that we define this for a subset of streams — maybe just net.Socket?)

@indutny (Member Author) commented Jun 2, 2016

@chrisdickinson

Who is expected to call ._readInto? .read(), or a new public .readInto()? Does this interact with .pipe?

readInto() calls ._readInto if present; .read() calls it when ._read is not present. It interacts with .pipe in much the same way readable streams work right now.
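A rough model of that dispatch (a hypothetical sketch only; the names and the callback-style `_read` here are illustrative, not the real streams internals):

```javascript
// Hypothetical dispatch sketch: readInto() prefers _readInto, while
// read() falls back to _readInto when no _read is implemented.
function readInto(stream, buf, offset, length, cb) {
  if (typeof stream._readInto === 'function')
    return stream._readInto(buf, offset, length, cb);
  cb(new Error('_readInto is not implemented'));
}

function read(stream, size, cb) {
  if (typeof stream._read === 'function')
    return stream._read(size, cb); // usual Readable path (callback-style here)
  // No _read: allocate a scratch buffer and route through _readInto.
  const buf = Buffer.alloc(size);
  readInto(stream, buf, 0, size, (err, n) => {
    if (err) return cb(err);
    cb(null, buf.slice(0, n));
  });
}
```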

@indutny (Member Author) commented Jun 2, 2016

@chrisdickinson

(Also: is it possible that we define this for a subset of streams — maybe just net.Socket?)

Personally, I don't have much preference, but it seems better to have wider support for it.

@benjamingr (Member):

This is something I would have used plenty of times (readInto). I'm +1 on this and it would make my life easier.

The proposal itself sounds reasonable to me. I'd make some stylistic changes to the code samples in the proposal but that's really not important.

@mcollina (Member) commented Jun 2, 2016

@indutny just have a section clarifying the relationship with pipe.

@bnoordhuis (Member):

This proposal still has the issue that you need at least as many buffers as you have streams, doesn't it?

Ideally, it would be possible to have a single buffer and read into that on demand. E.g.:

const streams = [ /* ... */ ];
const buffer = Buffer.alloc(256);

for (const stream of streams) {
  stream.on('readable', () => {
    const n = stream.readInto(buffer, 0, buffer.length);
    if (n < 0)
      handleError(stream, n);
    else
      handleData(stream, buffer, n);
  });
}

### 2.2 Core Perspective

TLS and HTTP are now performed mostly within the C++ layer, raising the complexity
and making things less observable for JavaScript user code. `readInto` may
Member:

"may be" is not very confidence building here

@rvagg (Member) commented Jun 2, 2016

It's such a leaky abstraction, though; I'm not really keen on increasing the complexity of streams, and this certainly does that. I'd prefer to keep this private in core for its own use until we're comfortable that the wins are big enough to share with users.

@mcollina (Member) commented Jun 3, 2016

@rvagg that will be hard, mainly because it will probably leak into readable-stream at the next release. My idea is to flag it as experimental and, if it proves too complex to maintain, kill it in the next major.

My concern is mainly that this does not currently apply to pipe(), which limits the "global" benefits.

I'll be more than happy to port some of my MQTT stuff to readInto to check whether I get benefits.

@trevnorris (Contributor):

@rvagg What's unfortunate is that the idea is conceptually simple. Streams are what make everything else more complicated. I'd like to see this in core. Creating many buffers for incoming packets is the limiting throughput factor for TCP I/O.

@bnoordhuis That's sort of the API I was hoping for. Problem is the 'readable' event only fires once data has already been read in. So the Buffer that will hold the data needs to be supplied before the 'readable' event fires.

I see the flow going something like so:

  • TCP server created, without running readStart()
  • Once user supplies a Buffer, that pointer is set as the next return value for the alloc_cb
  • readStart() is now executed
  • After data is read in, hand Buffer back to user
  • At the end of synchronous execution, if no Buffer has been handed back run readStop()

Well, that's a rough outline of what I was thinking.
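That outline could be modeled as a toy state machine (all names hypothetical; `deliver` stands in for the platform reading data off the wire):

```javascript
// Toy model of the lifecycle sketched above: reading only happens while
// the user has supplied a buffer, mirroring readStart()/readStop() gating.
class TcpLikeSource {
  constructor() {
    this.reading = false;   // models readStart()/readStop()
    this.nextBuffer = null; // buffer supplied by the user (next alloc_cb value)
  }
  supplyBuffer(buf, onData) {
    this.nextBuffer = buf;
    this.onData = onData;
    if (!this.reading) this.reading = true; // readStart()
  }
  // Simulates the platform reading `data` off the wire.
  deliver(data) {
    if (!this.reading || !this.nextBuffer) return false;
    const n = data.copy(this.nextBuffer, 0);
    const buf = this.nextBuffer;
    this.nextBuffer = null; // Buffer is handed back to the user
    this.onData(buf, n);
    // If no buffer was re-supplied synchronously, stop reading: readStop().
    if (!this.nextBuffer) this.reading = false;
    return true;
  }
}
```

The key property is that the kernel-facing side never reads unless a user-owned buffer is already available, so no intermediate allocation is needed.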

@jorangreef:

Any progress on this? This would be really awesome.

@mscdex commented Aug 21, 2017

I'm still very interested in this as well.

FWIW I just wrote my own implementation that is strictly limited to sockets and bypasses streams altogether to keep things simple and straightforward (although the 'end' event should still be emitted during socket close). This means the socket is either always using a single Buffer during its lifetime or it's always in "malloc mode." A Buffer and a callback are both needed to activate the new behavior. I see almost 2x throughput in benchmark/net-s2c.js with this new mode.

It works like this:

const socket = net.connect({
  port: 12345,
  buffer: Buffer.alloc(65535),
  onread: (nread, buffer) => {
    // buffer argument === buffer property above

    // return `false` explicitly to stop reading, `socket.read(0)`
    // or similar should restart it
  }
});

Example "malloc mode" result:

net/net-s2c.js dur=5 recvbuf=0 type="buf" len=102400: 15.772152211480213

Example new mode result:

net/net-s2c.js dur=5 recvbuf=65535 type="buf" len=102400: 28.297843930773467
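The contract described above can be exercised against a mock. Here, `fakeConnect` is a hypothetical harness (not net.connect): a single reused buffer, an `(nread, buffer)` callback, and an explicit `false` return to stop reading:

```javascript
// Hypothetical harness for the onread contract sketched above.
// `fakeConnect` is a stand-in for net.connect; `incoming` simulates
// chunks arriving off the wire.
function fakeConnect({ buffer, onread }, incoming) {
  let paused = false;
  for (const chunk of incoming) {
    if (paused) break;
    // Each read lands in the same user-supplied buffer (no per-chunk alloc).
    const nread = chunk.copy(buffer, 0);
    // Returning `false` explicitly stops reading, per the contract above.
    if (onread(nread, buffer) === false) paused = true;
  }
  return { paused };
}
```

Note that the callback must consume (or copy out) the bytes synchronously, since the next read overwrites the same buffer. A similar `onread` option was later added to net.connect in Node core.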

@mcollina (Member):

I would love to have something like this implemented. I just do not see how we can achieve that with the current stream API. I'm open to suggestions.

This might be relevant to the discussion: https://github.com/mcollina/syncthrough. It's like Transform but sync (it respects the Streams 3 contract, apart from the buffering). Because it's sync, we can guarantee that the chunk has been processed afterwards.

@mscdex commented Aug 21, 2017

Perhaps we will have to just skip streams to make it simple and/or doable. I think starting with net.Socket will go a long way, as there are lots of existing modules that could benefit just from that (e.g. database drivers).

@jorangreef:

Thanks @mscdex, would you be able to provide me with a copy of that code (even very raw)? I don't mind doing more work to open source it.

@mscdex commented Aug 22, 2017

@jorangreef I just pushed what I currently have to https://github.com/mscdex/io.js/tree/net-socket-static-buffer. I'm sure some of the C++ changes could be designed better but meh...

@Trott (Member) commented Jun 12, 2018

Closing as this seems inactive. By all means, comment or re-open if this is still an ongoing concern, although I suspect it should be moved to a more active repository (probably the main node repo) if that's the case.
