Synchronous clone = global.structuredClone(value, transfer = []) API #793

Closed
annevk opened this issue Mar 3, 2016 · 50 comments
Labels
addition/proposal: New features or enhancements
needs implementer interest: Moving the issue forward requires implementers to express interest
topic: serialize and transfer

Comments

@annevk
Member

annevk commented Mar 3, 2016

As proposed in https://lists.w3.org/Archives/Public/public-webapps/2015AprJun/thread.html#msg251 at some point there seems to be some interest in doing this and it would expose a primitive without having to go through postMessage()/onmessage.

Is this still a good idea in 2016?

@annevk added the "addition/proposal" and "needs implementer interest" labels Mar 3, 2016
@banksJeremy

banksJeremy commented Dec 19, 2016

I think there's a lot of demand for a structuredClone() function like this. Though it may not be the best practice, realistically it is very common to have a lot of application state in ad-hoc graphs of standard data structures. I've had to implement similar cloning functions to help support that on a few different projects, and I've seen bugs from people using JSON.parse(JSON.stringify(...)) instead without considering cyclic data structures, leading to unexpected crashes later.
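A minimal sketch of the cyclic-structure failure described above, assuming a runtime that provides a global structuredClone() (the function this issue proposes). The `state` object and its field names are made up for illustration.

```javascript
// Hypothetical application state containing a cycle: leaf -> root -> leaf.
const state = { name: "root", children: [] };
state.children.push({ name: "leaf", parent: state });

let jsonError = null;
try {
  JSON.parse(JSON.stringify(state));
} catch (err) {
  jsonError = err; // JSON.stringify rejects cyclic structures with a TypeError
}

// Assumption: a global structuredClone() is available.
const copy = structuredClone(state);

// The cycle is rebuilt inside the copy instead of pointing back at the original.
const cyclePreserved = copy.children[0].parent === copy;
const isNewGraph = copy !== state && copy.children[0] !== state.children[0];
```

The JSON round trip fails as soon as the cycle is reached, while the structured clone reproduces the same graph shape in a fresh set of objects.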

A built-in feature or interface for deep cloning standard data structures is common in other dynamic languages, and is something I've seen many novices look for when starting to use JavaScript. It would be nice to have a standard that includes making this fully extensible, but as you discussed on the mailing list that is quite complicated and doesn't seem to be happening soon. Extensibility is somewhat orthogonal to exposing this existing functionality to users.

As an outsider this appears to be relatively low-hanging fruit that many users would benefit from.

@annevk
Member Author

annevk commented Jan 23, 2018

As pointed out in https://twitter.com/DasSurma/status/955484341358022657 by @surma you can already do this synchronously today by (ab)using the history API. Seems like another reason to expose this.

@surma
Contributor

surma commented Jan 23, 2018

I think there is a use-case for wanting to deep-copy objects, and the structured clone algorithm comes very close to that — it would solve the vast majority of use-cases.
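As a rough illustration of "very close" to a deep copy, the sketch below compares a JSON round trip with a structured clone for a few built-in types. It assumes a global structuredClone() exists; the `original` object is invented for the example.

```javascript
const original = {
  when: new Date(0),
  tags: new Set(["a", "b"]),
  index: new Map([["k", { n: 1 }]]),
};

const viaJson = JSON.parse(JSON.stringify(original));
const viaClone = structuredClone(original); // assumption: global is available

// JSON turns the Date into a string and loses the Set and Map contents.
const jsonKeepsDate = viaJson.when instanceof Date;
// Structured clone keeps these built-in types and copies nested values.
const cloneKeepsDate = viaClone.when instanceof Date;
const cloneKeepsMap = viaClone.index instanceof Map && viaClone.index.get("k").n === 1;
const deepCopied = viaClone.index.get("k") !== original.index.get("k");

// One of the gaps: functions are not cloneable and cause an exception.
let functionError = null;
try {
  structuredClone({ callback() {} });
} catch (err) {
  functionError = err;
}
```

The function case is one reason structured clone is not a fully general deep copy.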

The hack with the History API can be slow as there’s some cross-process communication going on.

I also wrote an asynchronous version using MessageChannel that turned out to be faster than the History API or JSON.parse(), even for big objects:

function structuredClone(obj) {
  return new Promise(resolve => {
    const {port1, port2} = new MessageChannel();
    port2.onmessage = ev => resolve(ev.data);
    port1.postMessage(obj);
  });
}

But sometimes a synchronous version is very desirable.

> As an outsider this appears to be relatively low-hanging fruit that many users would benefit from.

I agree with this.

@annevk
Member Author

annevk commented Jan 23, 2018

The main thing blocking this is getting interest from implementers (judging by that 2015 thread there is interest from Mozilla) and then finding someone who wants to write the specification and someone who wants to write the tests (can be the same someone).

@ajklein @othermaciej @dstorey thoughts?

@surma
Contributor

surma commented Jan 23, 2018

Maybe I'm naïve, but shouldn't specification be rather trivial, considering that the structured clone algorithm is already specified? If that is the case, I'm happy to take this as my opportunity to write my first spec bits and tests :D (Provided we get the interest bit sorted, of course)

@jeremyroman
Collaborator

I wonder if this structured clone is actually what the developer wants. For instance, structured clone does some replication of the object graph, but makes no attempt to replicate the original prototype chain, so if the author has Point objects and expects to get Point objects out, they will be disappointed. (There's not an obvious reasonable way to do this cross-realm, but perhaps that is what authors want within the realm.)

I guess it's at least as close a match as JSON.parse(JSON.stringify(o)), though.

@RamIdeas

@jeremyroman Not entirely sure it would be a full/strict clone if you still referenced the old Point constructor in the prototype chain so this would be kind of expected, right?

@jeremyroman
Collaborator

It depends what the application is trying to do. Naively, I wouldn't blame an author for thinking this is reasonable:

let p = new Point(3, 4);
let p2 = clone(p);
console.assert(p2 instanceof Point);

Structured cloning necessarily clones only the things that you can kinda reasonably do across realms. I'm not sure it's a general-purpose deep clone (though I admit I'm not familiar with the ones apparently present in other dynamic languages), though it's possible that it is suitable for some author use cases.
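To make the prototype-chain concern concrete, here is a small sketch of what happens with a class instance, assuming a global structuredClone() is available. The `Point` class is a stand-in for any author-defined type.

```javascript
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  length() {
    return Math.hypot(this.x, this.y);
  }
}

const p = new Point(3, 4);
const p2 = structuredClone(p); // assumption: global is available

// Own data properties survive the clone...
const keepsData = p2.x === 3 && p2.y === 4;
// ...but the result is a plain object, so the class identity and methods are gone.
const keepsPrototype = p2 instanceof Point;
const keepsMethod = typeof p2.length === "function";
```

Here `keepsData` is true, while `keepsPrototype` and `keepsMethod` are both false, which is the disappointment described above.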

@surma
Contributor

surma commented Jan 25, 2018

> though it's possible that it is suitable for some author use cases

I’d argue it’s enough for the majority of cases, but we’d have to look into that. Types by the author have never been cloned (unless the author also wrote a custom cloning function), so I think we should expose structured clone first, before thinking about how to handle the prototype chain.
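One way an author might write such a custom cloning function on top of structured clone is sketched below. `cloneWithPrototype` and `Point2D` are hypothetical names, a global structuredClone() is assumed, and only the top-level prototype is restored (nested class instances would still come back as plain objects).

```javascript
class Point2D {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  length() {
    return Math.hypot(this.x, this.y);
  }
}

// Hypothetical helper: clone the data, then re-attach the source's prototype.
// The constructor is not re-run, so this only suits simple data-carrying classes.
function cloneWithPrototype(value) {
  const data = structuredClone(value); // assumption: global is available
  return Object.assign(Object.create(Object.getPrototypeOf(value)), data);
}

const source = new Point2D(3, 4);
const restored = cloneWithPrototype(source);
```

With this helper, `restored instanceof Point2D` holds and `restored.length()` works, while `restored` is still a distinct object from `source`.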

@othermaciej

Tagging @cdumez and @rniwa to give WebKit thoughts on this.

@samal-rasmussen

Maybe we want to have two different clone variants, one that just does structural cloning and one that also clones the prototype hierarchy properly. Call em structuralClone() and cloneWithPrototypeHierarchy() or whatever. In any case the former is basically already done, as surma mentioned, so why not? Let's go already.

@jeremyroman
Collaborator

If what's wanted is a generic way to deeply clone ECMAScript objects, maybe that's something that belongs in the ECMAScript spec (or maybe just a third-party library) rather than the HTML spec.

On the other hand, if it's useful for authors to have semantics that match postMessage, IndexedDB, etc. (for which it's not really clear what dealing with the prototype chain would even mean), then perhaps HTML should expose the existing primitive as suggested.

@surma
Contributor

surma commented Jan 25, 2018

I think for now this issue is about exposing the already existing structured cloning algorithm. I totally see that there’s a need for a proper copy (including prototype), but that would have to be a new algorithm and, as you said, is probably a better fit for ECMA262.

@jeremyroman
Collaborator

Another thought here: assuming authors want these semantics, would it be more useful to expose a combined "structured clone" primitive (that serializes and deserializes immediately), or separate structured-serialize and structured-deserialize functions as some opaque SerializedValue object (which would allow deserialize to happen at a separate time, and if there is no transfer, even multiple times)?

@Dan503

Dan503 commented Jan 27, 2018

I'm very much in favour of having an easy way to create a deep clone of an object :)

I'd prefer a syntax like this though:
Object.clone({key: "value"})

@annevk
Member Author

annevk commented Jan 27, 2018

FWIW, I think there'll be the most chance of success if we start very simple, even simpler than OP suggests, with just global.structuredClone(value) which does StructuredDeserialize and StructuredSerialize internally. That'll be fairly straightforward to implement as well.

Supporting transferables, exposing StructuredDeserialize/StructuredSerialize separately as well as an intermediate value you can copy/message, making StructuredDeserialize/StructuredSerialize extensible for arbitrary JavaScript objects, etc. are definitely interesting, but seem less necessary for a v0, and fleshing them out and gathering support would take a lot of time. None of them are blocked by this simple v0 API either.

@domenic
Member

domenic commented Jan 27, 2018

Although I agree with the tendency toward simplicity, I would argue that adding a transfer list is potentially valuable and shouldn't add much complexity given how it builds on spec primitives that are already there.
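A sketch of what a transfer list buys, under one stated assumption: the call shape used here is the options-bag form, structuredClone(value, { transfer }), which is how the function eventually shipped and which differs from the positional `transfer = []` in this issue's title.

```javascript
const buffer = new ArrayBuffer(8);
new Uint8Array(buffer)[0] = 42;

// Assumption: options-bag signature; the buffer is moved rather than copied.
const moved = structuredClone(buffer, { transfer: [buffer] });

// The source buffer is detached (length 0) and its bytes now live in `moved`.
const sourceDetached = buffer.byteLength === 0;
const dataMoved = moved.byteLength === 8 && new Uint8Array(moved)[0] === 42;
```

Transferring avoids copying the underlying memory, at the cost of leaving the original buffer unusable.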

@surma surma mentioned this issue Jan 27, 2018
3 tasks
@surma
Contributor

surma commented Jan 27, 2018

I started a PR for the spec change with #3414. I haven’t exposed the transfer list yet, but I can add that once we get the technicalities right :)

@annevk
Member Author

annevk commented Jan 28, 2018

Note that the primitives for transferables might be wrong:

onmessage = e => console.log(e.ports[0]);
postMessage(null, "*", [new MessageChannel().port1]);

The above ends up logging a MessagePort object, which I don't think works at the moment as the specification describes things. That's also why I cautioned against exposing transferables, as you need a more complex API; it's not just adding a second argument, it's also figuring out a new return value (or accepting you're not 1:1 with postMessage(), which gives room for arguments).
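The return-value question can be illustrated with a synchronous call, assuming the options-bag form structuredClone(value, { transfer }) that later shipped. The name `scratch` is made up for the example.

```javascript
// The transferred buffer is not reachable from the cloned value (null).
const scratch = new ArrayBuffer(2);
const result = structuredClone(null, { transfer: [scratch] });

// The clone of the value is just null...
const valueIsNull = result === null;
// ...yet the buffer in the transfer list has still been detached.
const scratchDetached = scratch.byteLength === 0;
```

With postMessage() the receiver can still reach a transferred MessagePort through e.ports; a function that only returns the cloned value has no equivalent channel, so anything transferred but not referenced from the value is simply lost.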

@surma
Contributor

surma commented Jan 28, 2018

I think it’s okay to diverge from the behavior of postMessage() here. Strictly speaking, it wouldn’t even be diverging behavior because the structured clone in that scenario would be in e.data, which would still be null.

@annevk
Member Author

annevk commented Jan 28, 2018

@surma it's diverging if you want to include transferables.

@domenic
Member

domenic commented Jan 28, 2018

> which I don't think works at the moment as the specification describes things

Why do you think that? We fixed all that a while back, from what I understand.

@annevk
Member Author

annevk commented Jan 28, 2018

@domenic can you explain how the MessagePort object gets transferred (including allocation of a new object)?

@domenic
Member

domenic commented Jan 28, 2018

@annevk
Member Author

annevk commented Jan 28, 2018

@domenic how would serialized contain a [[TransferConsumed]] field?

@domenic
Member

domenic commented Jan 28, 2018

@loilo

loilo commented Sep 2, 2018

To even open up another flank on this (sorry in advance if this is inappropriate/out of scope):

It may be reasonable to think about a hook for modifying an object's structured clone from the inside, similar to what toJSON() allows for JSON.stringify():

JSON.parse(JSON.stringify({
  toJSON () {
    return 'foo'
  }
})) === 'foo'

I know this goes beyond just exposing existing functionality, but it may at least be a thought to consider (or reject) since it could not be added as a follow-up without a breaking change.

alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
97d644c was more breaking than planned as, e.g.,

postMessage(null, "*", [new ArrayBuffer(2)])
postMessage(null, "*", [new MessageChannel().port1])

still need to detach (and visibly transfer in case of MessagePort) the values given in the third argument.

See whatwg#793 (comment) for additional context.
@jeremyBanks

jeremyBanks commented Feb 5, 2019

Potentially relevant: elsewhere in the JavaScript ecosystem, Node exposes its structured cloning/serialization implementation directly through the built-in v8 module, although it is still marked as "experimental".

const v8 = require('v8');
// ...
let clone = v8.deserialize(v8.serialize(original));
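Because Node splits the operation into two calls, the snippet above also hints at the separate serialize/deserialize design raised earlier in the thread. A minimal sketch, assuming a Node environment with CommonJS require(); the `snapshot` data is invented for the example.

```javascript
const v8 = require('v8');

// Serialize once into a Buffer (an intermediate, opaque-ish value).
const snapshot = v8.serialize({ scores: new Map([["a", 1]]) });

// With no transfer involved, the same bytes can be deserialized repeatedly.
const first = v8.deserialize(snapshot);
const second = v8.deserialize(snapshot);

// Each deserialization yields an independent object graph.
first.scores.set("a", 99);
const independent = second.scores.get("a") === 1 && first !== second;
```

This is Node-specific and says nothing about how a browser API would expose the intermediate value, if at all.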

@annevk
Member Author

annevk commented Feb 5, 2019

@jeremyBanks thanks for posting that!

> The format is backward-compatible (i.e. safe to store to disk).

Is quite interesting. If browsers could agree on this format we'd have a new kind of JSON...

@jeremyroman
Collaborator

Node is exposing V8's structured serialization implementation, which uses an evolution of Blink's (and before that, WebKit's) wire format, which is what Chromium stores in IndexedDB on disk, etc. (I imagine Mozilla has some similar format?)

Ours is missing some traits that might be desirable if it were to be used in places where JSON is. For instance, it is not forward-compatible: we assume you only ever read data in equal-or-greater versions of Chromium, which is probably not acceptable for use over a network or otherwise passed between different implementations.

@annevk
Member Author

annevk commented Feb 5, 2019

Ah yeah, that would indeed not work. If we want to go there it's probably best discussed in its own issue, sorry for distracting this one.

@jakearchibald
Contributor

@wanderview

> but I wonder if we will regret not making it async

Which parts can be async? Since it's crawling a JS object, and creating new JS objects, I thought the bulk of the work would be main thread anyway.

@jeremyroman
Collaborator

Conceivably deserialization of a very large object could be done in small pieces, yielding to the scheduler. It's unclear whether we would ever do this, but I think it would in principle be possible.

@jakearchibald
Contributor

Yeah, fair enough. Reading the JS object would need to be sync, but creating the new ones could be spread over tasks.

Maybe we should have an async API too (but we can already do it with message ports), but we should definitely have a sync version.

@wanderview
Member

I now think my original concern of storing consumables in IDB, etc, can be handled in a way separate from structured cloning. We would instead make these consumables use a "transfer" instead of a "copy". So you would transfer a Response and its body ReadableStream into IDB.

waves hands

@GrosSacASac

I found a tc39 proposal in the stage 0 list. It looks like it is not active anymore; could someone present it?

Should it be a tc39 proposal or a whatwg one?

What are the next steps necessary to have it implemented in browsers?

@domenic
Member

domenic commented Aug 20, 2019

This works best as a WHATWG proposal as the WHATWG is the body that specifies structured cloning.

The next steps necessary to have it implemented in browsers are for browsers to determine that it's a high priority on their product roadmap (compared to other things they could spend engineering effort on). That is usually helped by evidence such as web developers advocating for it or showing what they're using instead. However, I think we've already reached a pretty good amount of evidence that this would be useful, so I'm not sure how to make progress on increasing the priority in browser teams' backlogs :(.

@rniwa

rniwa commented Aug 20, 2019

It looks like two use cases being discussed are:

Both of these tweets are about cloning JS objects, not structured cloning, and some of the discussions explicitly mention a "proper" way of cloning JS objects.

I'd be curious to know more concrete use cases, and whether the v0 API being proposed here would satisfy any of them.

@surma
Contributor

surma commented Aug 20, 2019

Yes, I was looking at deep-cloning at the time. Standardizing/exposing structured clone seemed like a low-risk and low-friction first step. It behaves correctly for the majority of use cases (as far as I can tell) and is already specified and implemented in most engines.

The use-cases are mostly related to architectures relying on immutable data structures and chaining-style APIs.

@agm1984

agm1984 commented Sep 14, 2019

I am encountering an issue currently in Vue JS whereby I pass an Object prop from a parent to child component like this:

Also I apologize for showing framework code, but my intent is to help establish context for a need.

export default {
    props: {
        initialValue: {
            type: Object,
            required: false,
            default: () => ({}),
        },
    },

    data() {
        return {
            value: this.initialValue,
        };
    },
};

The initialValue reference is copied to value, which unfortunately copies the Vue getter/setter functions that exist on initialValue.

This means if a person mutates the local state of value, Vue fires the hidden setter functions on the upstream reference and therefore breaks the component encapsulation. In this way, the data flow is not unidirectional, but it is expected behaviour simply because the reference is copied in a non-immutable fashion. Part of this is Vue's problem in my opinion. I think the framework itself should deep clone props as they enter into a component.

This example means you can pass a reference through 100 dimensions of child components and mutate the root component's state in a way that is extremely difficult to trace by visual code analysis. I suspect countless application developers will experience some form of this issue moving forward.

Currently, shallow cloning is not adequate as a solution because nested Objects do not have their child references broken. So in my opinion, the only viable solution is to use Lodash's cloneDeep function or equivalent.
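A framework-free sketch of why a shallow copy does not break nested references, assuming a global structuredClone() is available. The `initialValue` shape and the names used are illustrative only.

```javascript
const initialValue = { user: { name: "Ada" }, tags: ["a"] };

// Shallow copy: the top-level object is new, but `user` is still shared.
const shallow = { ...initialValue };
shallow.user.name = "Grace";
const leakedUpstream = initialValue.user.name === "Grace";

// Deep copy via structured clone: nested objects are copied too.
const deep = structuredClone(initialValue); // assumption: global is available
deep.user.name = "Linus";
const upstreamUntouched = initialValue.user.name === "Grace";
```

The write through `shallow` reaches the original object, while the write through `deep` does not.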

My described issue would be solved using structuredClone like this:

    data() {
        return {
            value: structuredClone(this.initialValue),
        };
    },

I think browsers should natively support deep cloning ASAP, so that a third party cloning dependency can be avoided. This will save bandwidth for all parties by reducing bundle size in every library that packages some form of deep cloning in order to operate immutably.

Personally, I like the idea of following the implementation details from node.js because it will make it easier for both node.js and browsers to benefit from continued innovations from either node.js or browsers with respect to deep cloning and immutable paradigms.

@krisdages

Does anyone have any suggestions on how to advocate for something like this to the browser vendors?
What if one was to just implement it in Chromium or Firefox and try to submit it to the project?

Or maybe standing outside headquarters chanting with a protest sign? :)

@jeremyroman
Collaborator

You'd have to follow their respective launch processes to submit it. For Chromium, that's this process.

domenic pushed a commit to surma-dump/html that referenced this issue Jul 27, 2021
domenic pushed a commit that referenced this issue Jul 27, 2021
domenic pushed a commit that referenced this issue Jul 27, 2021
@whatwg whatwg deleted a comment from shijiexiao Nov 2, 2021
@whatwg whatwg temporarily blocked shijiexiao Nov 2, 2021
@whatwg whatwg deleted a comment from shijiexiao Nov 2, 2021
mfreed7 pushed a commit to mfreed7/html that referenced this issue Jun 3, 2022