New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronous clone = global.structuredClone(value, transfer = []) API #793

Open
annevk opened this Issue Mar 3, 2016 · 36 comments

Comments

@annevk
Member

annevk commented Mar 3, 2016

As proposed in https://lists.w3.org/Archives/Public/public-webapps/2015AprJun/thread.html#msg251 at some point there seems to be some interest in doing this and it would expose a primitive without having to go through postMessage()/onmessage.

Is this still a good idea in 2016?

@banksJeremy

This comment has been minimized.

banksJeremy commented Dec 19, 2016

I think there's a lot of demand for a structuredClone() function like this. Though it may not be the best practice, realistically it is very common to have a lot of application state in ad-hoc graphs of standard data structures. I've had to implement similar cloning functions to help support that on a few different projects, and I've seen bugs from people using JSON.parse(JSON.stringify(...)) instead without considering cyclic data structures, leading to unexpected crashes later.

A built-in feature or interface for deep cloning standard data structures is common in other dynamic languages, and is something I've seen many novices look for when starting to use JavaScript. It would be nice to have a standard that includes making this fully extensible, but as you discussed on the mailing list that is quite complicated and doesn't seem to be happening soon. Extensibility is somewhat orthogonal to exposing this existing functionality to users.

As an outsider this appears to be relatively low-hanging fruit that many users would benefit from.

@annevk

This comment has been minimized.

Member

annevk commented Jan 23, 2018

As pointed out in https://twitter.com/DasSurma/status/955484341358022657 by @surma you can already do this synchronously today by (ab)using the history API. Seems like another reason to expose this.

@surma

This comment has been minimized.

surma commented Jan 23, 2018

I think there is a use-case for wanting to deep-copy objects, and the structured clone algorithm comes very close to that — it would solve the vast majority of use-cases.

The hack with the History API can be slow as there’s some cross-process communication going on.

I also wrote an asynchronous version using MessageChannel that turned out to be faster than the History API or JSON.parse(), even for big objects:

function structuredClone(obj) {
  return new Promise(resolve => {
    const {port1, port2} = new MessageChannel();
    port2.onmessage = ev => resolve(ev.data);
    port1.postMessage(obj);
  });
}

But sometime a synchronous version is very desirable.

As an outsider this appears to be relatively low-hanging fruit that many users would benefit from.

I agree with this.

@annevk

This comment has been minimized.

Member

annevk commented Jan 23, 2018

The main thing blocking this is getting interest from implementers (judging by that 2015 thread there is interest from Mozilla) and then finding someone who wants to write the specification and someone who wants to write the tests (can be the same someone).

@ajklein @othermaciej @dstorey thoughts?

@surma

This comment has been minimized.

surma commented Jan 23, 2018

Maybe I'm naïve, but shouldn't specification be rather trivial, considering that the structured clone algorithm is already specified? If that is the case, I'm happy to take this as my opportunity to write my first spec bits and tests :D (Provided we get the interest bit sorted, of course)

@jeremyroman

This comment has been minimized.

jeremyroman commented Jan 23, 2018

I wonder if this structured clone is actually what the developer wants. For instance, structured clone does some replication of the object graph, but makes no attempt to replicate the original prototype chain, so if the author has Point objects and expects to get Point objects out, they will be disappointed. (There's not an obvious reasonable way to do this cross-realm, but perhaps that is what authors want within the realm.)

I guess it's at least as close a match as JSON.parse(JSON.stringify(o)), though.

@RamIdeas

This comment has been minimized.

RamIdeas commented Jan 23, 2018

@jeremyroman Not entirely sure it would be a full/strict clone if you still referenced the old Point constructor in the prototype chain so this would be kind of expected, right?

@jeremyroman

This comment has been minimized.

jeremyroman commented Jan 23, 2018

It depends what the application is trying to do. Naively, I wouldn't blame an author for thinking this is reasonable:

let p = new Point(3, 4);
let p2 = clone(p);
console.assert(p2 instanceof Point);

Structured cloning necessarily clones only the things that you can kinda reasonably do across realms. I'm not sure it's a general-purpose deep clone (though I admit I'm not familiar with the ones apparently present in other dynamic languages), though it's possible that it is suitable for some author use cases.

@surma

This comment has been minimized.

surma commented Jan 25, 2018

though it's possible that it is suitable for some author use cases

I’d argue it’s enough for the majority of cases, but we’d have to look into that. Types by the author have never been cloned (unless the author also wrote a custom cloning function), so I think we should expose structured clone first, before thinking about how to handle the prototype chain.

@othermaciej

This comment has been minimized.

othermaciej commented Jan 25, 2018

Tagging @cdumez and @rniwa to give WebKit thoughts on this.

@samal84

This comment has been minimized.

samal84 commented Jan 25, 2018

Maybe we want to have two different clone variants, one that just does structural cloning and one that also clones the prototype hierarchy properly as well. Call em structualClone() and cloneWithPrototypeHierarchy() or whatever. In any case the former is basically already done, as surma mentioned, so why not? Let's go already.

@jeremyroman

This comment has been minimized.

jeremyroman commented Jan 25, 2018

If what's wanted is a generic way to deeply clone ECMAScript objects, maybe that's something that belongs in the ECMAScript spec (or maybe just a third-party library) rather than the HTML spec.

On the other hand, if it's useful for authors to have semantics that match postMessage, IndexedDB, etc. (for which it's not really clear what dealing with the prototype chain would even mean), then perhaps HTML should expose the existing primitive as suggested.

@surma

This comment has been minimized.

surma commented Jan 25, 2018

I think for now this issue is about exposing the already existing structured cloning algorithm. I totally see that there’s a need for a proper copy (including prototype), but that would have to be a new algorithm and, as you said, is probably a better fit for ECMA262.

@jeremyroman

This comment has been minimized.

jeremyroman commented Jan 26, 2018

Another thought here: assuming authors want these semantics, would it be more useful to expose a combined "structured clone" primitive (that serializes and deserializes immediately), or separate structured-serialize and structured-deserialize functions as some opaque SerializedValue object (which would allow deserialize to happen at a separate time, and if there is no transfer, even multiple times)?

@Dan503

This comment has been minimized.

Dan503 commented Jan 27, 2018

I'm very much in favour of having an easy way to create a deep clone of an object :)

I'd prefer a syntax like this though:
Object.clone({key: "value"})

@annevk

This comment has been minimized.

Member

annevk commented Jan 27, 2018

FWIW, I think there'll be the most chance of success if we start very simple, even simpler than OP suggests, with just global.structuredClone(value) which does StructuredDeserialize and StructuredSerialize internally. That'll be fairly straightforward to implement as well.

Supporting transferables, exposing StructuredDeserialize/StructuredSerialize separately as well as an intermediate value you can copy/message, making StructuredDeserialize/StructuredSerialize extensible for arbitrary JavaScript objects, etc. are definitely interesting, but seem less necessary for a v0 and flushing them out and gathering support would take a lot of time. None of them are blocked by this simple API v0 API either.

@domenic

This comment has been minimized.

Member

domenic commented Jan 27, 2018

Although I agree with the tendency toward simplicity, I would argue that adding a transfer list is potentially valuable and shouldn't add much complexity given how it builds on spec primitives that are already there.

@surma

This comment has been minimized.

surma commented Jan 27, 2018

I started a PR for the spec change with #3414. I haven’t exposed the transfer list yet, but I can add that once we get the technicalities right :)

@annevk

This comment has been minimized.

Member

annevk commented Jan 28, 2018

Note that the primitives for transferables might be wrong:

onmessage = e => w(e.ports[0])
postMessage(null, "*", [new MessageChannel().port1]);

The above ends up logging a MessagePort object, which I don't think works at the moment as the specification describes things. That's also why I cautioned against exposing transferables, as you need a more complex API; it's not just adding a second argument, it's also figuring out a new return value (or accepting you're not 1:1 with postMessage(), which gives room for arguments).

@surma

This comment has been minimized.

surma commented Jan 28, 2018

I think it’s okay to diverge from the behavior of postMessage() here. Strictly speaking, it wouldn’t even be diverging behavior because the structured clone in that scenario would be in e.data, which would still be null.

@annevk

This comment has been minimized.

Member

annevk commented Jan 28, 2018

@surma it's diverging if you want to include transferables.

@domenic

This comment has been minimized.

Member

domenic commented Jan 28, 2018

which I don't think works at the moment as the specification describes things

Why do you think that? We fixed all that a while back, from what I understand.

@annevk

This comment has been minimized.

Member

annevk commented Jan 28, 2018

@domenic can you explain how the MessagePort object gets transfered (including allocation of a new object)?

@domenic

This comment has been minimized.

@annevk

This comment has been minimized.

Member

annevk commented Jan 28, 2018

@domenic how would serialized contain a [[TransferConsumed]] field?

@domenic

This comment has been minimized.

Member

domenic commented Jan 28, 2018

@annevk

This comment has been minimized.

Member

annevk commented Jan 28, 2018

But that does not end up affecting serialized in my example, as far as I can tell.

@domenic

This comment has been minimized.

Member

domenic commented Jan 28, 2018

I see, yeah.

annevk added a commit that referenced this issue Feb 1, 2018

Fix StructuredDeserializeWithTransfer()
97d644c was more breaking than planned as, e.g.,

postMessage(null, "*", [new ArrayBuffer(2)])
postMessage(null, "*", [new MessageChannel().port1])

still need to detach (and visibly transfer in case of MessagePort) the values given in the third argument.

See #793 (comment) for additional context.
@annevk

This comment has been minimized.

Member

annevk commented Feb 1, 2018

I posted a fix for that issue. I'm not sure to what extent it should affect the API. I find it a little weird if we don't expose the full primitive for transferables since unless we add a second method it would be hard to do so going forward. And if we have two methods, I'd rather have the simpler variant not support transferables. Just take one value and return one.

@samal84

This comment has been minimized.

samal84 commented Feb 1, 2018

"And if we have two methods, I'd rather have the simpler variant not support transferables. Just take one value and return one."

A good example of "worse is better". This would make it even easier, and therefore maybe also even quicker, to get implemented. And it would cover the 99% use case right? Has anyone even thought of the use cases for transferrables when doing a simple direct structuralClone() invocation? Why do we care?

@domenic

This comment has been minimized.

Member

domenic commented Feb 1, 2018

I don't think omitting transferables would have any impact on implementability. Structured clone with transfer already exists in all browsers.

Transferring is quite useful for many use cases, e.g. when you want to take ownership of memory.

domenic added a commit that referenced this issue Feb 5, 2018

Fix StructuredDeserializeWithTransfer()
97d644c was more breaking than planned as, e.g.,

postMessage(null, "*", [new ArrayBuffer(2)])
postMessage(null, "*", [new MessageChannel().port1])

still need to detach (and visibly transfer in case of MessagePort) the values given in the third argument.

See #793 (comment) for additional context.
@wanderview

This comment has been minimized.

Member

wanderview commented May 15, 2018

It seems synchronous structured clone is already exposed via history state, but I wonder if we will regret not making it async. It would be really nice to store async consumable things like Request objects with stream bodies in IDB. These can only be consumed or copied async. Hopefully synchronous structured clone will not prevent this use case.

@jeremyBanks

This comment has been minimized.

jeremyBanks commented May 15, 2018

@wanderview That is interesting, but would be much more complicated and fraught than the structured clone primitive we have. It seems unlikely that structured clone would ever be evolved in that direction, as it would impact almost every existing use and require significant changes across every browser and spec.

I think that would need to be something new and different, and shouldn't affect the decision here.

@surma

This comment has been minimized.

surma commented May 23, 2018

I wonder if we will regret not making it async.

We have a fairly performant async structured clone exposed via MessageChannel. I think for any async cases that API is actually good enough for now. It’s still somewhat of an API abuse, but much less than History API ^^ Having an explicit synchronous clone seems useful to me, especially to keep certain APIs in tact but make them more performant. Or am I misunderstanding your point?

@jeremyroman

This comment has been minimized.

jeremyroman commented May 23, 2018

It's not clear to me what it would mean to make it asynchronous. Nothing prevents you from synchronously cloning an object that you consume asynchronously. The basic problem is that at least the serialization half must be done ~synchronously if you want to preserve the existing semantics (otherwise the visible property access etc. can occur at some later time). Anything asynchronous will probably have less predictable behavior and greater overhead on reasonably small object graphs (bearing in mind it is impossible to tell in advance, in general, whether the object graph will be small, because we can trigger getters, proxies, etc, while traversing).

The serialization and deserialization steps could be fairly easily decoupled, which more or less splits the work in two.

@Loilo

This comment has been minimized.

Loilo commented Sep 2, 2018

To even open up another flank on this (sorry in advance if this is inappropriate/out of scope):

It may be reasonable to think about a hook for modifying an object's structural clone from its inside — similar to what toJSON() enables to do:

JSON.parse(JSON.stringify({
  toJSON () {
    return 'foo'
  }
})) === 'foo'

I know this goes beyond just exposing existing functionality, but it may at least be a thought to consider (or reject) since it could not be added as a follow-up without a breaking change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment