[WIP] Refactor structured clone into serialize/deserialize steps #2421
This needs a lot of polish but I wanted to get it up there before @annevk woke up in case he was interested. Some review of the overall strategy would be good too.
Points I am actively thinking about:
Points I haven't thought hard about yet but am really hoping that this new framework makes it easy:
In this last list, all points but the last one don't necessarily need to be resolved before merge, but we should feel relatively confident in the path toward solving them, so that we don't have to throw this all away and start over yet again.
First off, thanks so much for tackling this. Hopefully this third rewrite of the algorithm is the one that puts us on solid ground.
I don't really understand the need for recursive transfers. A transfer is basically moving a pointer. You can't really divide that further.
Now that we're more comfortable with IDL extensions, should we add one to make it easier to define the platform object bits? E.g., it seems the serialization type could just be the interface name.
I think the main problems you hint at with deserialization can be solved if we put objects in charge of the full deserialization dance, including allocating themselves.
I think IDB needs a new primitive for storing that copies any shared data (or IDB should simply disallow shared memory, so you'd have to convert to an ArrayBuffer first).
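For concreteness, the second option amounts to a userland copy before storage. A minimal sketch (the helper name is invented, not spec text):

```javascript
// Hypothetical helper: copy a SharedArrayBuffer's bytes into a plain
// ArrayBuffer, so the result can go through the existing non-shared
// serialization path (e.g. an IndexedDB put()).
function copyToArrayBuffer(sab) {
  const copy = new ArrayBuffer(sab.byteLength);
  new Uint8Array(copy).set(new Uint8Array(sab));
  return copy;
}

const shared = new SharedArrayBuffer(4);
new Uint8Array(shared).set([1, 2, 3, 4]);
const plain = copyToArrayBuffer(shared); // no shared memory; safe to store
```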
The way MessagePort needs to work is that the tasks associated with it end up on an event loop, and each task then needs to create an event and deserialize data as appropriate for that event loop. So we need to put much more logic in the actual task.
I hope this and the many nits and suggestions help. It might help to discuss the more difficult points in a corresponding issue, so we only deal with nits here? Up to you.
Can you explain why you think this is helpful or necessary?
We're already being very vague around allocation of new objects. E.g. technically the correct way to allocate a Boolean wrapper is:
but this is pretty ridiculous, so we just lean on the reader knowing that
is equivalent to the first two steps and
is equivalent to the third step.
I see the step
as analogous: it's a bit hand-wavy, but everyone knows what it means, and it should work fine. It should also be equivalent to the first step in most platform objects' constructors, actually.
I don't see why this is a reason for requiring custom deserialization steps. It does imply I should add a note to the above stating that the object creation might fail for such a reason, though.
So, I'm not sure yet, but I think byte streams would be an example of this. You transfer the stream, which primarily means transferring any chunks already in the stream's queue recursively. (And then setting up a task to continually read from the stream and transfer more in the future.)
What do you mean exactly? I was thinking maybe we'll need SABs to serialize to normal ArrayBuffer stuff + source realm, so that when deserializing we can compare target and source realm for share-within-able-ness. And that is indeed weird from the sense of "realm independent", but technically still meets the definition.
Was that what you were thinking, or something else?
That would definitely help make the current setup less problematic.
I think what this means in practice is cloning the stream and transferring its contents. The main issue here is that it's no longer a synchronous operation; it becomes a sequence of intermittent operations. I would expect this to require a novel setup or some layering on top of what we have today.
I would expect that you can either find the Shared Data Block in the agent cluster or not. (Of course, with the current fallback of copying this isn't quite what implementations do, but this would seem the quickest approach.)
I don't think that's true. Platform objects are all allocated the same way, by Web IDL.
I don't understand this point. The allocation can fail I guess (if we are OOMing), and in that case the "Let value be a new instance of the platform object type identified by serialized.[[Type]]" step will fail. Changing to allow custom allocation doesn't allow failure any better.
I think that is the future I see, yeah. I'm not sure it's necessary to define in detail, but the idea is that like in implementations, all platform objects are just allocated the same way. The only important thing is getting the realm right, but that's what "in targetRealm" is about.
This ties into heycam/webidl#135 which is about defining "a new X object", i.e. allocation, in a tiny bit more detail.
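To illustrate the uniform-allocation idea in rough code (the registry and every name below are hypothetical scaffolding, not Web IDL's actual machinery): deserialization dispatches on the serialized [[Type]], allocates the instance generically, and then runs the type's custom steps on it.

```javascript
// Hypothetical registry mapping a serialization type (the interface
// name) to its allocation and custom deserialization steps.
const platformTypes = new Map();

function deserializePlatformObject(serialized, targetRealm) {
  const type = platformTypes.get(serialized.type);
  if (!type) throw new Error("DataCloneError: unknown serialization type");
  // "Let value be a new instance of the platform object type identified
  // by serialized.[[Type]], created in targetRealm."
  const value = type.allocate(targetRealm);
  type.deserializeSteps(serialized, value); // per-type steps fill it in
  return value;
}

// Example registration for a toy type:
platformTypes.set("Point", {
  allocate: () => ({ x: 0, y: 0 }),
  deserializeSteps: (serialized, value) => {
    value.x = serialized.x;
    value.y = serialized.y;
  },
});
```

The point of the split is that allocation is the same uniform step for every platform object; only the steps that run afterwards are per-type.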
Well, the question is, does it go in the transfer list or not? I think it needs to go in the transfer list, because the stream becomes "detached" via the locking mechanism. So these will be transfer steps, not serialization steps.
I think this will be slightly novel, but I think it fits into the setup reasonably well. You end up doing some kind of recursive promise loop inside the transfer steps. E.g. something like
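Roughly along these lines (every name here is invented; the real transfer steps would hook into the streams machinery rather than a public reader):

```javascript
// Hedged sketch of a recursive promise loop inside the transfer steps:
// read one chunk, hand it to a hypothetical per-chunk transfer hook,
// then recurse until the stream reports done.
function transferLoop(reader, transferChunkToTarget) {
  return reader.read().then(({ value, done }) => {
    if (done) return transferChunkToTarget({ done: true });
    transferChunkToTarget({ done: false, chunk: value });
    return transferLoop(reader, transferChunkToTarget);
  });
}
```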
This is clearly not complete, and might not work; maybe I should try to work it out completely, before we settle on this design. But I hope it demonstrates how recursive transfers might be used.
(Another issue here is that streams are technically not platform objects, but let's ignore that for now...)
That's interesting. Currently the ES spec doesn't tie Shared Data Blocks to agent clusters, but it seems like a reasonable thing for it to do...
Although, that brings up a better idea for speccing than storing the source realm: store the source agent cluster. At the spec level at least it's then just a simple check for identity, and is arguably more obviously realm-independent.
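As a sketch of that identity check (the record shape and the cluster tokens are made up for illustration; the real serialization record would live in spec prose):

```javascript
// Serialize: keep the Shared Data Block and remember the source agent
// cluster's identity, represented here as an opaque token.
function serializeShared(sab, sourceAgentCluster) {
  return { dataBlock: sab, agentCluster: sourceAgentCluster };
}

// Deserialize: a simple identity comparison; sharing is only allowed
// within the same agent cluster.
function deserializeShared(record, targetAgentCluster) {
  if (record.agentCluster !== targetAgentCluster) {
    throw new Error("DataCloneError: cannot share memory across agent clusters");
  }
  return record.dataBlock; // spec-wise: a new SAB backed by the same block
}
```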
Made more progress and pushed more commits today. Still left:
It also would be really good to come up with a more concrete idea of how this will allow transferring streams.
My thinking for streams and promises and any other temporal API is that they effectively need a private MessageChannel to function if the initial clone/transfer happens through the existing
Another thing we need to do is collect downstream users that need updating:
It never needed to, partly because "agent cluster" is an emergent property from agents sharing memory and partly because "agent cluster" does not have a representation (in ecma262). In reality of course, "agent cluster" == Unit of Same-origin Browsing Context, the agents that can legally receive a SAB from each other by any means.
Indeed, that's what the code I'm working on for Firefox does: it stores the identity of the sender's Unit of Same-origin Browsing Contexts with the SAB representation, and the receiver checks that its own identity matches the one sent.
OK, I think this is ready for review and in theory landing. Note that deserialization error handling is divided into two cases:
I think that is OK for now and we'll fix the latter as part of the SAB work. If we care about handling the former differently we should file a separate issue and not hold up this work. I could be persuaded to leave it vague though instead of censoring to null.
As for dependent specifications:
Note that the cases that currently use StructuredClone will not break since we still define it here. However, it does seem like nobody actually wants to use StructuredClone directly. Everyone wants serialize/deserialize, including this spec. Maybe we should remove it?
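For illustration, the composed serialize-then-deserialize operation is what later shipped to the web as the structuredClone() global: a deep copy that preserves the types the structured clone algorithm knows about, unlike a JSON round-trip.

```javascript
// structuredClone() composes the two halves: serialize, then
// immediately deserialize into the same realm.
const original = { list: [1, 2], when: new Date(0) };
const copy = structuredClone(original);
// copy.list is a distinct array with equal contents;
// copy.when is a new Date carrying the same timestamp.
```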
Could you maybe post a copy online for review? That would make such a large change a little easier to judge. Meanwhile I spotted a few errors while skimming through.
https://dl.dropboxusercontent.com/u/20140634/structured-clone-rewrite/index.html#safe-passing-of-structured-data is a rendered version of this revision
Found two more nits and one somewhat larger concern. Really great work!
What do you think the right plan is for merging this? I'd like to do it sooner rather than later so I can resume work on the SAB stuff. But I can understand if we want to have PRs lined up for as many other specs as possible first. I can do most PRs but will need your help around the blob URL store.