Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

can streams be transferred via postMessage()? #276

Closed
wanderview opened this issue Feb 4, 2015 · 49 comments
Closed

can streams be transferred via postMessage()? #276

wanderview opened this issue Feb 4, 2015 · 49 comments

Comments

@wanderview
Copy link
Member

@wanderview wanderview commented Feb 4, 2015

Are there plans to support transferring streams between workers or iframes using postMessage?

This question came up in https://bugzilla.mozilla.org/show_bug.cgi?id=1128959#c2

Does this allow receiving a ReadableStream, from for example a filesystem
API or a network API, and then transferring that ReadableStream to a Worker
thread using postMessage({ readThis: myStream }, [myStream]);

Likewise, can a page instantiate a ReadableStream (through a ReadableStream
constructor or through some other means) that can then be transferred to a
Worker thread?

This doesn't mean that all ReadableStreams have to be transferrable. For example a stream of plain JS objects can obviously not be transferred. So maybe we need a notion of transferrable and non-transferrable streams.

@wanderview
Copy link
Member Author

@wanderview wanderview commented Feb 5, 2015

I don't know if there are plans for this, but one thought I had:

  1. Transferred stream is locked and drained by the UA. The values are effectively piped to the new environment.
  2. If a stream value is not a transferrable DOM object or structured clonable JS object, then the stream aborts.
  3. If content wants to use the stream in two environments then it can call clone() before transferring.
@domenic
Copy link
Member

@domenic domenic commented Feb 5, 2015

Yeah, that sounds about right. I was hoping to work out the details, and at some point I had an ambitious plan to first figure out how we clone/transfer promises and then do the same for streams (see dslomov/ecmascript-structured-clone#5). But maybe we can just jump to the end if this is important for a particular use case that an implementer wants to implement.

@wanderview
Copy link
Member Author

@wanderview wanderview commented Mar 24, 2015

I was talking with @sicking today and he raised an interesting point. What do we do in this situation:

  1. Script gets a ReadableStream for a native resource like a file or something.
  2. Script overrides ReadableStream.read() with stream.read = function() {...}
  3. Script tries to transfer the stream to a worker with postMessage()

Ideally the worker would have access to the stream data without all the buffers going through the original thread, but that seems impossible if .read() is JS implemented in the original context.

Can we make the .read() and .writer() methods unforgeable? Or should postMessage() just throw in this situation?

@domenic
Copy link
Member

@domenic domenic commented Mar 24, 2015

I would imagine we just do what we do for all other platform objects (including e.g. promises), and define the behavior of how other code interacts with them not in terms of their public API but in terms of abstract operations that apply to them. So e.g. the postMessage algorithm references TeeReadableStream(s) instead of s.tee().

@domenic
Copy link
Member

@domenic domenic commented Mar 24, 2015

(Sometimes this is also stated in specs as "using the original value of X.prototype.method," if the appropriate abstract-operaton factoring hasn't happened to the spec in question.)

@wanderview
Copy link
Member Author

@wanderview wanderview commented Mar 24, 2015

And because the tee concept operates on the source it effectively by-passes any overrides on the outer object?

@domenic
Copy link
Member

@domenic domenic commented Mar 24, 2015

I mean, I'd say that it operates on the stream, but does so by reaching into its innards instead of through the public API, in order to be more tamper-proof.

@sicking
Copy link

@sicking sicking commented Mar 25, 2015

So that's true even for piping? I.e. if I do

var writable = nativeFileSystemAPI.appendToFile(filename);
writable.write = function(buffer) {
  console.log(buffer);
  WritableStream.prototype.write.call(this, buffer);
}
readable.pipeTo(writable);

Then nothing will get logged since pipeTo is reaching into the innards of the argument that's passed to it, rather than using the public API?

@domenic
Copy link
Member

@domenic domenic commented Mar 25, 2015

@sicking That's a good question. My initial answer was no: pipeTo operates polymorphically on its argument, which must take the structural type of a writable stream, but could be any implementation of the writable stream contract. (E.g., queue-backed WritableStream, or specialized WritableByteStream, or any author-created writable stream types.) So necessarily it operates through the public interface of its arguments, and thus the override will be consulted. The motivation behind this kind of operation is discussed in some length in a new section of the spec; feedback welcome on that. In the end I think this is key to a thriving stream ecosystem with both specialization and generality built in.

However I of course recognize the impulse behind it; you want to have fast off-main-thread piping be the norm for UA-created streams talking to each other. I see a few perspectives on your question in that light:

  • Modifying a UA-created writable stream in such a way "deopts" you to the normal pipe algorithm, since you have made it observable. This honestly seems fine to me; it's similar to how most JS objects work. (E.g. if you start adding numeric properties to Array.prototype/string-valued properties to an array, all arrays/that array will now be slow.)
  • Alternately, we could say that it depends on the implementation of readable. If readable is a type of stream that knows how to recognize NativeFileSystemAPIFileAppendWritableStreams (or whatever), then its pipeTo implementation might be different than ReadableStream.prototype.pipeTo. E.g., as its first line it could do a brand-check to see if its argument is a NativeFileSystemAPIFileAppendWritableStream, and if so reach into its internals. This would of course mean readable is not a ReadableStream, but instead a more specialized type that has such additional logic. But that seems expected; if it has a different implementation, then it needs to be a different type.
  • Going even further, we could try to make ReadableStream.prototype.pipeTo itself extensible in this fashion, thus avoiding the proliferation of types. That is, in the same way you want to have specialized piping algorithms for your UA-created streams, we should be able to explain that specialization mechanism and give authors access to it, since of course it's not only UA streams which could benefit from this specialization. There's a few approaches to this, including a double-dispatch one I spent some time typing up before realizing it was probably too complicated for this bug at this stage. But the simplest model is probably just a registry of (rsBrandCheck, wsBrandCheck) -> pipeAlgorithm which pipeTo can consult.

I'd be quite interested in an implementer perspective on which of these seems more attractive (from everyone, @tyoshino and @yutakahirano included!).

@wanderview
Copy link
Member Author

@wanderview wanderview commented Mar 26, 2015

  • Modifying a UA-created writable stream in such a way "deopts" you to the normal pipe algorithm, since you have made it observable. This honestly seems fine to me; it's similar to how most JS objects work. (E.g. if you start adding numeric properties to Array.prototype/string-valued properties to an array, all arrays/that array will now be slow.)

I asked @bzbarsky and this option is possible, but it would probably not be our first choice. It would require adding back features that were removed from SpiderMonkey for being a bit hacky. It seems we should go with other options if possible.

@bzbarsky
Copy link

@bzbarsky bzbarsky commented Mar 26, 2015

I should note that option 1 requires carefully defining exactly what happens if the deopt happens in the middle of the copy operation somewhere; you have to synchronize somehow so that the writes that need to be observable after the deopt are actually observable or something.

@tyoshino
Copy link
Member

@tyoshino tyoshino commented Mar 26, 2015

We've redesigned the readable stream to have the reader+stream structure and are going to also introduce the writer+stream structure to the writable stream. Given that change, the example by Jonas (#276 (comment)) doesn't work since pipeTo() obtains a writer by itself. There's no point where we can substitute writer.write which the pipeTo() uses.

@tyoshino
Copy link
Member

@tyoshino tyoshino commented Mar 26, 2015

So, to implement the 1st option in Domenic's comment (#276 (comment)), for example, we should make overriding getReader() deopt the special piping algorithm of the readable stream writable stream pair, I guess.

@tyoshino
Copy link
Member

@tyoshino tyoshino commented Mar 26, 2015

But getReader() is very special. It requires interaction with internal states of streams. Some API support is needed.

@yutakahirano
Copy link
Member

@yutakahirano yutakahirano commented Mar 26, 2015

@tyoshino, sorry, I don't understand what you meant at #276 (comment). Can you explain?

@yutakahirano
Copy link
Member

@yutakahirano yutakahirano commented Mar 26, 2015

Not streams but the reader and the writer are important for pipeTo. If the reader and the writer gotten via getReader() and getWriter() are authentic, we can enable the optimization.

var rs = ...; // rs is a UA-created ReadableByteStream.
var ws = ...; // ws is a US-created WritableByteStream.

var reader = rs.getReader();
var writer = ws.getWriter();

var rsx = {
  getReader: () => reader,
  pipeTo: rs.pipeTo
};

var wsx = {
  getWriter: () => writer
};

rsx.pipeTo(wsx); // This works with the optimization.
@bzbarsky
Copy link

@bzbarsky bzbarsky commented Mar 26, 2015

OK, that seems to push the issue off to how pipeTo interacts with the reader and writer. That is, all of #276 (comment) applies but now to the reader and writer, right?

@domenic
Copy link
Member

@domenic domenic commented Mar 26, 2015

I think it helps reduce the surface area a little bit. In particular one strategy would be that you only have to check if dest.getWriter is the same as the original getWriter, and as long as that can be done unobservably, you're then good to go. (I think it can, too: you insert the equality check in between the "get" and the "call" steps of the invoke-a-method algorithm.)

Except... what if someone overwrites WritableStreamWriter.prototype.write or similar -_-.


Another strategy would be that the first thing the algorithm does is grab all the methods off the writer, and then use them forever more after. E.g.

const [write, close] = [writer.write, writer.close];

// we can unobservably insert a check here that write and close are the expected ones...

// later:
write.call(writer) // instead of writer.write()

Except ... this doesn't work for our ready getter, without hacks. And, seems pretty JS-unnatural anyway...


I'm starting to feel that we need to program what we want more directly into the spec, instead of trying to make it an unobservable optimization. That implies one of the other two strategies. (Or perhaps less-extensible versions of them, at least in the short term...)

I hope to think on this more productively next Monday, when my brain has had a chance to un-fry itself after a TC39 week. I feel like I need to take a step back, and say what the high-level goals and constraints are here. (I've left a lot implicit, I think.) But the above is kind of where I'm at right now; hope it helps people understand that we're taking this seriously at least :)

@bzbarsky
Copy link

@bzbarsky bzbarsky commented Mar 26, 2015

You don't have to check dest.getWriter, right? You have to check something about the object it returns, as you note. I assume getWriter is only called once, up front.

But yeah, my point was that we have the same problem with the writer. :(

@domenic
Copy link
Member

@domenic domenic commented Mar 26, 2015

Well, you could do either, but yeah, both fail.

@wanderview
Copy link
Member Author

@wanderview wanderview commented Apr 6, 2015

It seems we can avoid most the issues if we just make it so you can postMessage() the stream, but not the reader. When you postMessage() the stream then it locks the stream and transfers the underlying source. This doesn't give 3rd party js any surface area to override as far as I can tell.

@domenic
Copy link
Member

@domenic domenic commented Apr 6, 2015

Right, that does address the OP, although we did get into the question of how to make pipeTo unobservable (independent of postMessage).

@domenic domenic added the question label Apr 7, 2015
@tyoshino
Copy link
Member

@tyoshino tyoshino commented Apr 8, 2015

Maybe we should first discuss the overwrite detection issue at #321.

@isonmad
Copy link
Contributor

@isonmad isonmad commented Jun 2, 2016

This issue is really old now, but what happened to the alternative from #97 (comment) that just uses an explicit 'socketpair' connecting to the other page? Issue #244 never had any discussion about it, and it sidesteps any transfer-of-state issues.

@ricea
Copy link
Collaborator

@ricea ricea commented Apr 15, 2019

I think @MattiasBuelens analysis is correct. I see 4 major concerns:

  1. Can we implement it?
  2. Can we write the standard text for it?
  3. Is the behaviour reasonable in edge cases?
  4. Is it worth the extra complexity?

@MattiasBuelens has convinced me we can implement it. If we can implement it then in principle we can standardise it.

It will take a lot of work (specifically writing tests) to convince me that the edge cases are okay.

I think the complexity is probably worth it, because optimised implementations of transferable streams will naturally want to bypass intermediate steps. The fact that this issue has been discovered before we've even shipped it also indicates that it may be an important use case.

@ricea
Copy link
Collaborator

@ricea ricea commented Jun 23, 2020

I've thought about special-casing double-transfer a bit more and now understand why it's hard. The problem is setting up the initial state of the transferred stream.

A normal transfer involves setting up a special kind of identity transform that communicates over a message pipe. We pipe the original stream into the local end, and the remote end becomes the transferred stream.

There are two key features of this

  1. The only thing we actually need to transfer is one end of a message pipe, and
  2. The initial state of the original stream is copied to the transferred stream implicitly via the use of pipeTo().

Now in the double-transfer case we'd like to reuse the message pipe embedded in the transferred stream, transfer it again, and use it to create the doubly-transferred stream in the target realm. Which works fine if the transferred stream happens to be in its default state, but if it's in some other state that won't be correctly reflected in the doubly-transferred stream.

So this second transfer involves transferring not just a message pipe, but the complete current state of the stream. So we have two choices,

  1. Have two very different ways of setting up the remote proxy stream, or
  2. Remove the current elegant method and only have an ugly method, possibly removing the "special identity transform" and pipeTo() and instead proxying the underlying source methods directly.

Neither of these options are particularly attractive. I would prefer to ship the first version of transferable streams without special-casing double-transfer at all, but if we are going to change the whole approach then we should probably decide it now.

ricea added a commit to ricea/streams that referenced this issue Jul 8, 2020
Make readable, writable, and transform streams transferable via
postMessage(stream, [stream]).

The streams themselves must be transferred, but the chunks written or
read from the streams are cloned, not transferred. Support for
transferring chunks will require API changes and is expected to be added
in a future update.

There is no reference implementation of this functionality as jsdom does
not support transferrable objects, and so it wouldn't be testable.

Closes whatwg#276.
@ricea ricea mentioned this issue Jul 8, 2020
3 of 3 tasks complete
@domenic
Copy link
Member

@domenic domenic commented Jul 9, 2020

Could we build something on top of the current draft, by reestablishing the state with a special message whose contents are [[state]], [[storedError]], and maybe [[disturbed]]? I guess it gets a decent bit uglier if we also have to do [[queue]] and [[queueTotalSize]], but I think that'd be all of it...

@ricea
Copy link
Collaborator

@ricea ricea commented Jul 9, 2020

we build something on top of the current draft, by reestablishing the state with a special message whose contents are [[state]], [[storedError]], and maybe [[disturbed]]? I guess it gets a decent bit uglier if we also have to do [[queue]] and [[queueTotalSize]], but I think that'd be all of it...

We don't have the other end of the message port. It's still in the original realm. So our only communication channel is the |dataHolder|. Then add in the fact that there may be a message in transit from the original realm that's trying to change our state at the same time we're trying to clone it and the whole thing becomes very painful.

@domenic
Copy link
Member

@domenic domenic commented Jul 9, 2020

Sorry, I'm pretty behind on the discussion. Please allow me to ask some stupid questions to try and get caught up...

From what I understand we're considering a situation like this:

// in window
const s = new ReadableStream({ start(c) { c.error(); } });
frames[0].postMessage(s, [s]);

// in frames[0]
worker.postMessage(s, [s]);

I guess what I'm envisioning is that in frames[0], the transfer steps for s would send the frames[0] MessagePort to worker, alongside s's [[state]] and [[storedError]]. Then those values make their way to the transfer-recieving steps, at which point SetUpCrossRealmTransformReadable would go poke around the internals of the newly-created stream and set its [[state]] and [[storedError]] to the values.

(I was initially thinking that the transfer-recieving steps would dispatch a synthetic message event on the MessagePort that was something like { type: "reestablish state", value: { state, storedError } }, but I think that's messier...)

Then add in the fact that there may be a message in transit from the original realm that's trying to change our state at the same time we're trying to clone it and the whole thing becomes very painful.

I feel like this should all be handled by HTML's (and implementations') existing infrastructure. As long as we reestablish the state that was last seen before any such state-changing messages arrive, everything would be consistent, and HTML gives us the ordering guarantees needed to make that possible.

@MattiasBuelens
Copy link
Contributor

@MattiasBuelens MattiasBuelens commented Jul 9, 2020

I think you might be on to something. Something like:

ReadableStream transfer steps, given value and dataHolder:

  1. If ! IsReadableStreamLocked(value) is true, throw a TypeError exception.
  2. Let port2 be value.[[transferMessagePort]].
  3. If port2 is undefined, then:
    1. Let port1 be the result of creating a new MessagePort object whose owner is the incumbent settings object.
    2. Set port2 to the result of creating a new MessagePort object whose owner is the incumbent settings object.
    3. Entangle the port1 and port2 objects.
    4. Let writable be a new WritableStream.
    5. Perform ! SetUpCrossRealmTransformWritable(writable, port1).
    6. Let promise be ! ReadableStreamPipeTo(value, writable, false, false, false, undefined).
    7. Set promise.[[PromiseIsHandled]] to true.
  4. Let messagePortDataHolder be { [[Type]]: "MessagePort" }.
  5. Perform MessagePort's transfer steps with port2 and messagePortDataHolder.
  6. Set dataHolder.[[port]] to messagePortDataHolder.
  7. Set dataHolder.[[disturbed]] to value.[[disturbed]].
  8. Set dataHolder.[[state]] to value.[[state]].
  9. Set dataHolder.[[storedError]] to value.[[storedError]].

ReadableStream transfer-receiving steps, given dataHolder and value:

  1. Let port be a new MessagePort.
  2. Perform MessagePort's transfer-receiving steps with dataHolder.[[port]] and port.
  3. Perform ! SetUpCrossRealmTransformReadable(value, port).
  4. Set value.[[transferMessagePort]] to port.
  5. Set value.[[disturbed]] to dataHolder.[[disturbed]].
  6. Set value.[[state]] to dataHolder.[[state]].
  7. Set value.[[storedError]] to dataHolder.[[storedError]].

We only ever construct and pipe to a WritableStream inside the realm where the ReadableStream was originally created. In all other realms, we simply construct a cross-realm transform readable using the transferred port. When we transfer the stream again to another realm, that cross-realm transform readable will effectively become "neutered": its event handlers can no longer fire, since the message queue is transferred to the other realm. So we can pass the same message port that we received to the next realm.

Of course, in order for the next realm to know how to continue, we have to tell it where we left off. That's why we save and restore the state of the internal slots before and after transferring. 🙂

@ricea
Copy link
Collaborator

@ricea ricea commented Jul 9, 2020

I thought this wouldn't work because both PipeTo() and transfer-receiving steps would both try to set the state on the transferred stream. But on further reflection, it's probably okay:

state=readable: PipeTo won't try to change the state, so no problem
state=closed: PipeTo will try to close it, but it's a no-op to close a closed stream, so no problem
state=errored: PipeTo will try to error it, but it's a no-op to error and errored stream, so again no problem

The real difficult case is WritableStream, because the queue can be non-empty, and you shouldn't clone the chunks if you're going to pipe them.

@MattiasBuelens
Copy link
Contributor

@MattiasBuelens MattiasBuelens commented Jul 9, 2020

Hmm, okay, this might be a bit trickier than I thought. 😅

If we're going to transfer the port that we were using for the cross-realm transform readable/writable, we have to be absolutely sure that it no longer needs it inside the current realm. I can see this being a problem with the current writeAlgorithm for SetUpCrossRealmTransformWritable: it receives a chunk, waits for backpressure to be relieved and only then sends the chunk over the message port. If we neuter the port while it's waiting on the backpressure promise, then we will lose the chunk.

Also, the mere fact that we were waiting for backpressure is actually part of the stream's state, so we need to somehow transfer that as well. We could keep a boolean flag to indicate whether there is backpressure or not, send that flag when transferring the stream, and use that when we transfer-receive the stream to initialize backpressurePromise to either a pending or a resolved promise.

I still think we can do this with the current "elegant" approach. We just have to be very careful to re-construct the cross-real transform readable/writable in exactly the same state as it was inside the realm from which we transferred it.

@ricea
Copy link
Collaborator

@ricea ricea commented Jul 9, 2020

I feel like this should all be handled by HTML's (and implementations') existing infrastructure. As long as we reestablish the state that was last seen before any such state-changing messages arrive, everything would be consistent, and HTML gives us the ordering guarantees needed to make that possible.

Yes, I see what you mean. I think implementations are a lot more complex than the standard's platonic ideal, but they must be providing the same ordering guarantees for interoperability.

@ricea
Copy link
Collaborator

@ricea ricea commented Jul 9, 2020

Also, the mere fact that we were waiting for backpressure is actually part of the stream's state, so we need to somehow transfer that as well. We could keep a boolean flag to indicate whether there is backpressure or not, send that flag when transferring the stream, and use that when we transfer-receive the stream to initialize backpressurePromise to either a pending or a resolved promise.

I didn't think of that. It seems doable, but ugly.

@MattiasBuelens
Copy link
Contributor

@MattiasBuelens MattiasBuelens commented Jul 9, 2020

I didn't think of that. It seems doable, but ugly.

Yeah, it only occurred to me while I was looking at how we use the message port inside the cross-realm transform streams. I'm gonna have to think about this some more, perhaps there is more "hidden state" in there somewhere.

@domenic
Copy link
Member

@domenic domenic commented Jul 14, 2020

It occurs to me that it might be a bit strange that for author-created duplex streams, the syntax is

destination.postMessage(duplex, { transfer: [duplex.readable, duplex.writable] });

whereas for transform streams it is

destination.postMessage(transform, { transfer: [transform] });

I can't think of any nice solution for this though. Unfortunately the structured-serialize algorithm will throw for any platform object not explicitly marked as serializable (like a TransformStream), so to make the former syntax work for them we'd need make them [Serializable]. But that doesn't seem doable in general. It'd require new infrastructure in the structured clone algorithm I guess, which is probably not worth it.

@domenic
Copy link
Member

@domenic domenic commented Jul 14, 2020

So #1053 is nearing completion. The remaining open issue, which I think we could use community and implementer input on, is whether to support double-transfers (e.g. window -> window -> worker) now or later.

I'm a bit hesitant about putting double-transfers off until later, given that we had developers try this out behind a flag and become sad about them not working. (See upthread.) If we do put them off, I'd like us to add an explicit transfer-time error to avoid the confusing developer experience. E.g. each transferred stream gets a bit saying "was created by transferring" that prevents further transfers by throwing an "NotImplemented" DOMException.

EDIT: see the next couple of comments; the double-transfer issue is a bit more subtle than this comment alone indicates.

However, it also sounds like there are developers (who haven't found their way to this thread, but have showed up in e.g. the origin trial feedback) which would benefit from this feature with only single-transfers supported.

My impression is that recent discussions have started to discover how double-transfers could be supported, without too much effort. And it sounds like supporting double-transfers would not cause any backward-incompatible changes. But, the actual mechanics are still up in the air and uncertain, and we'd need a good amount of time for formalizing them, testing them, and implementing them to work out the kinks caused by the increased complexity.

Given this, I'd recommend that we merge #1053 and then work on double-transfers in the Chromium implementation and spec ASAP afterward. We could open a new issue distilling the various discussions that have happened here so far.

If folks disagree, either web developers planning to use this feature or implementers with concerns about merging #1053, I'd love to hear about it.

@ricea
Copy link
Collaborator

@ricea ricea commented Jul 21, 2020

I'm not in favour of making double-transfers fail. They're useful in scenarios where you don't tear down the middle context. For example, I use them in the tests 😄

Synchronously transferring the state of a WritableStream is looking increasingly difficult to me. For example, writeAlgorithm returns a promise. We can't clone or transfer the promise, and we can't synchronously observe its state. With code like the following:

writer.write(chunk);
writer.releaseLock();
worker.postMessage(writable, [writable]);

postMessage would need to synchronously observe the state of the stream. But the state of the stream depends on whether or not chunk could be serialized, information which is encoded in the promise returned by writeAlgorithm.

PipeTo doesn't have a problem with this because it operates asynchronously.

@domenic
Copy link
Member

@domenic domenic commented Jul 21, 2020

I hadn't fully appreciated the distinction between "double transfers are broken" and "double transfers are broken if you tear down the middle context". I see now that we are in the latter situation, not the former.

To properly warn developers in that case we'd need to do something complicated, like, on middle-context teardown, error all streams that are currently being passed through the context. Do you think that'd be reasonably implementable? I can think of how to spec it, but the spec isn't concerned with efficiency.

In spec-land (and implementation-land?) you can synchronously observe the state of a promise. It's something we try to avoid but this seems like a special case.

@ricea
Copy link
Collaborator

@ricea ricea commented Jul 21, 2020

To properly warn developers in that case we'd need to do something complicated, like, on middle-context teardown, error all streams that are currently being passed through the context. Do you think that'd be reasonably implementable? I can think of how to spec it, but the spec isn't concerned with efficiency.

I agree that it would be the best behaviour, but I think it falls under the category of "things we could implement, but probably wouldn't". It would involve messing around in the internals of MessagePort, which is unlikely to be pleasant and quite likely to break other things.

I can imagine other browsers might have implemented MessagePort in a way that makes it impossible to efficiently observe disentanglement.

In practice people will usually send data through their streams, so maybe they'll find out soon enough?

@MattiasBuelens
Copy link
Contributor

@MattiasBuelens MattiasBuelens commented Jul 21, 2020

Okay, the following is more of a brain-dump, sorry in advance. 😛

With code like the following:

writer.write(chunk);
writer.releaseLock();
worker.postMessage(writable, [writable]);

It gets even worse. You can hold on to a promise returned by writer.write(), and await it after you've released the lock and transferred the stream.

let writer = writable.getWriter();
let promise = writer.write(chunk);
writer.releaseLock();
worker.postMessage(writable, [writable]);
await promise; // can happen immediately, or much later

However, after we've transferred the writable to another context, there's no way for us to get the result of any of our queued writes.

So... will we need to resolve any pending write() promises immediately upon transferring the stream? Or reject them with some sort of "detached" error? Or keep them pending forever? 😕

For example, writeAlgorithm returns a promise. We can't clone or transfer the promise, and we can't synchronously observe its state.

We need to maintain enough bookkeeping as part of creating a cross-realm transform readable/writable, such that we can transfer that bookkeeping to another realm and have it re-create the cross-realm transform readable/writable in exactly the same state.

This state needs to include the state of the backpressurePromise. We could manually keep track of it with a boolean backpressure flag as part of the cross-realm transform writable, or we could just synchronously observe the promise's state as Domenic suggested. It's a matter of preference.

One tricky bit is the chunk argument inside writeAlgorithm. If we start writing the chunk but we're waiting for backpressure to be relieved, then we have not yet sent a "chunk" message. However, I think we can get around that if we simply transfer the entire queue (i.e. writable.[[writableStreamController]].[[queue]])? We only remove a chunk from the queue after the write has completed, so if we transfer the stream before that happens, the transferred stream will still have that chunk at the start of its queue.

I guess this does have an edge case: we might have just sent the "chunk" message, but then the stream gets transferred before we can run the "Upon fulfillment of sinkWritePromise" steps in WritableStreamDefaultControllerProcessWrite to remove the chunk from the queue. In that case, we must not re-enqueue the chunk in the new realm. It's probably easiest if we add another "bookkeeping flag" to track if the cross-realm transform writable has actually managed to send the last chunk, so that we can remove it from the queue before we transfer the queue.

One problem with transferring the queue in its entirety though is that it is closely tied to writable.[[writeRequests]]. Each entry in [[queue]] must correspond to an entry in [[writeRequests]] (or to the [[inFlightWriteRequest]]). If we add a bunch of chunks to the transferred stream's queue, we could "fix" things by adding an equal number of "dummy" promises to [[writeRequests]]. That way, if the user calls writer.write(), the new promise from that call will be linked to the correct chunk in the queue.

...I don't know if I'm explaining this properly in prose. Maybe I should just make a draft PR to try out a few ideas? 😅

ricea added a commit to ricea/streams that referenced this issue Aug 13, 2020
Make readable, writable, and transform streams transferable via
postMessage(stream, [stream]).

The streams themselves must be transferred, but the chunks written or
read from the streams are cloned, not transferred. Support for
transferring chunks will require API changes and is expected to be added
in a future update.

There is no reference implementation of this functionality as jsdom does
not support transferrable objects, and so it wouldn't be testable.

Closes whatwg#276.
ricea added a commit to ricea/streams that referenced this issue Aug 14, 2020
Make readable, writable, and transform streams transferable via
postMessage(stream, [stream]).

The streams themselves must be transferred, but the chunks written or
read from the streams are cloned, not transferred. Support for
transferring chunks will require API changes and is expected to be added
in a future update.

There is no reference implementation of this functionality as jsdom does
not support transferrable objects, and so it wouldn't be testable.

Closes whatwg#276.
@ricea ricea closed this in #1053 Aug 14, 2020
ricea added a commit that referenced this issue Aug 14, 2020
Make readable, writable, and transform streams transferable via
postMessage(stream, [stream]).

The streams themselves must be transferred, but the chunks written or
read from the streams are cloned, not transferred. Support for
transferring chunks will require API changes and is expected to be added
in a future update.

There is no reference implementation of this functionality as jsdom does
not support transferrable objects, and so it wouldn't be testable.

Closes #276.
@ricea
Copy link
Collaborator

@ricea ricea commented Aug 14, 2020

I've closed this issue since transferable streams have landed.

I've created a new issue for discussion of the double-transfer problem: #1063.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Linked pull requests

Successfully merging a pull request may close this issue.

10 participants