[WIP] Refactor structured clone into serialize/deserialize steps #2421

Merged
merged 40 commits on Mar 20, 2017

Conversation

4 participants
@domenic
Member

domenic commented Mar 8, 2017

This needs a lot of polish but I wanted to get it up there before @annevk woke up in case he was interested. Some review of the overall strategy would be good too.

Points I am actively thinking about:

  • There are a lot of abstract ops: StructuredSerialize, StructuredDeserialize, StructuredTransfer, StructuredReceiveTransfer, StructuredSerializeWithTransfer, StructuredDeserializeWithTransfer, and StructuredCloneWithTransfer. I think they all end up being necessary, but we're going to need some serious guidance on how to use them. My breakdown:
    • StructuredSerialize/StructuredDeserialize for cases where there is no transfer involved, like IDB
    • StructuredTransfer/StructuredReceiveTransfer are only used by the transfer steps/receive transfer steps when they need to do something recursively.
      • We don't appear to have any of these on the platform, though---and maybe we shouldn't, i.e., you should always transfer the innermost thing, like you do with ArrayBuffer vs. TypedArray? Hmm, not sure.
      • If we don't allow such recursive transfer cases then we could in theory inline these into StructuredSerializeWithTransfer/StructuredDeserializeWithTransfer.
    • StructuredSerializeWithTransfer/StructuredDeserializeWithTransfer are what are used by MessagePort
    • StructuredCloneWithTransfer is a convenience for when the target realm is known (which is most cases)
  • StructuredClone needs to get deleted, as does the cloneable objects section
  • Lots of IDs need to be preserved
  • All the [[Clone]] and [[Transfer]] internal methods in the spec need to get changed to appropriate serialization/deserialization and transfer/receive transfer steps.
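The serialize/deserialize split can be pictured with a toy JavaScript model (illustrative only — the record shape and function names below are invented for this sketch, not the spec's abstract operations), handling just the Boolean-wrapper case:

```javascript
// Toy model of the StructuredSerialize/StructuredDeserialize split for one
// easy case (Boolean wrapper objects). The record shape is invented here.
function structuredSerialize(value) {
  if (value instanceof Boolean) {
    // Recover [[BooleanData]] regardless of any own properties on value.
    return { Type: "Boolean", BooleanData: Boolean.prototype.valueOf.call(value) };
  }
  throw new Error("DataCloneError (sketch)");
}

function structuredDeserialize(serialized) {
  if (serialized.Type === "Boolean") {
    // Object(primitive) allocates a fresh wrapper in the current realm.
    return Object(serialized.BooleanData);
  }
  throw new Error("unrecognized record (sketch)");
}
```

The point of the split is that the intermediate record can be stored (IDB) or shipped to another event loop long before deserialization ever runs.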

Points I haven't thought hard about yet but am really hoping that this new framework makes it easy:

  • How to define things so that SharedArrayBuffers can be serialized, but not then deserialized outside the agent cluster
  • How to make it obvious that when IDB writes a serialized record containing a shared data block to the disk, it doesn't somehow keep track of modifications to the original memory and then also update the version on the disk.
  • How to use the fact that we can get separate serialization/deserialization errors to get the desired error handling for the SAB case and ArrayBuffer allocation case
  • How to use this new framework to allow "entangling" style transfers for streams
  • How exactly to write the spec for MessagePort using the primitives I wrote up so that it makes sense and resolves #2277.
    • This needs to be resolved before merge

In this last list, all points but the last one don't necessarily need to be resolved before merge, but we should feel relatively confident in the path toward solving them, so that we don't have to throw this all away and start over yet again.

@annevk

First off, thanks so much for tackling this. Hopefully this third rewrite of the algorithm is the one that puts us on solid ground.

I don't really understand the need for recursive transfers. A transfer is basically moving a pointer. You can't really divide that further.

Now we're more comfortable with IDL extension, should we add IDL extension to make it easier to define the platform object bits? E.g., it seems serialization type could just be the interface name.

I think the main problems you hint at with deserialization can be solved if we put objects in charge of the full deserialization dance, including allocating themselves.

IDB I think needs a new primitive for storing that ends up copying any shared data (or IDB should simply disallow shared memory and you just need to convert to AB first).
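The "ends up copying any shared data" idea can be sketched as a snapshot into a plain ArrayBuffer (the helper name is invented; this is not a proposed spec primitive, just the byte-level effect):

```javascript
// Invented helper: snapshot a SharedArrayBuffer's bytes into a plain
// ArrayBuffer, so the stored copy is decoupled from the live shared memory.
function snapshotToArrayBuffer(sab) {
  const copy = new ArrayBuffer(sab.byteLength);
  new Uint8Array(copy).set(new Uint8Array(sab));
  return copy;
}
```

Later writes to the shared memory then cannot reach the stored record, which is exactly the property IDB needs.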

The way MessagePort needs to work is that the set of tasks associated with it ends up on an event loop, and each task then needs to create an event and deserialize data as appropriate for that event loop. So we need to put much more logic in the actual task.

I hope this and the many nits and suggestions help. It might help to discuss more difficult points in a corresponding issue perhaps so we only deal with nits here? Up to you.

source
+ <dt>A <dfn data-export="">serialization type string</dfn></dt>
+ <dd><p>A string that uniquely identifies this class of <span>serializable objects</span>, so that
+ the <span>StructuredSerialize</span> and <span>StructuredDeserialize</span> algorithms can
+ recognize instances of serialized data to which this definition applies.</dd>

@annevk

annevk Mar 8, 2017

Member

Nit: I would call this the "serialization type". We generally don't put the type in names.

+ <dd>
+ <p>A set of steps that serializes the data in <var>input</var> into fields of
+ <var>serialized</var>. The resulting data serialized into <var>serialized</var> must be
+ independent of any <span>JavaScript Realm</span>.</p>

@annevk

annevk Mar 8, 2017

Member

On initial reading I thought this requirement could not be met by SAB. I understand what you're going for though and I don't really know a better way of phrasing it.

+ </dd>
+
+ <dt><dfn data-export="">deserialization steps</dfn>, taking a <span>Record</span>
+ <var>serialized</var> and a <span>platform object</span> <var>value</var></dt>

@annevk

annevk Mar 8, 2017

Member

Don't these steps have to define how the platform object is allocated?

@annevk

annevk Mar 8, 2017

Member

That would also make it easier to make failure happen in case the platform object cannot even appear in that realm: the constructor would simply fail.

source
+ <span>"<code>DataCloneError</code>"</span> <code>DOMException</code>.</p>
+
+ <p class="example">For instance, a proxy object.</p>
+ </li>

@annevk

annevk Mar 8, 2017

Member

Nit: indentation.

source
+ <span>"<code>DataCloneError</code>"</span> <code>DOMException</code>.</p></li>
+
+ <li><p>If <var>value</var> has a [[BooleanData]] internal slot, then let <var>serialized</var> be
+ { [[Type]]: "Boolean", [[BooleanData]]: <var>value</var>.[[BooleanData]] }.</p></li>

@annevk

annevk Mar 8, 2017

Member

Should we also try to change the "let serialized" pattern? Using let for the same variable so many times makes me a little anxious.

@domenic

domenic Mar 8, 2017

Member

I guess we should, as we seem to be moving in that direction, but it just feels a bit silly when you write out the "right" way...

source
+
+ <div class="example">
+ <p>It's important to realize that the <span data-x="Record">Records</span>
+ produced by <span>StructuredSerialize</span> may contain "pointers" to other records that create

@annevk

annevk Mar 8, 2017

Member

Nit: may -> can.

@zcorpan

zcorpan Mar 8, 2017

Member

Fixed linter to catch this whatwg/html-build#105

source
+ <span>platform object</span>.</p></li>
+
+ <li><p>Let <var>value</var> be a new instance of the <span>platform object</span> type
+ identified by <var>serialized</var>.[[Type]].</p></li>

@annevk

annevk Mar 8, 2017

Member

I think this allocation should be defined by the deserialization steps of the platform object. In particular we might want to support a situation where you can have a platform object that doesn't exist in all contexts. E.g., it can only be passed around between [SecureContext] environments. But it also seems nice to allow platform objects the same conveniences when it comes to construction as we use for JavaScript objects. That is, reusing existing algorithms taking as arguments other bits from the record.

@domenic

domenic Mar 8, 2017

Member

I think the main problems you hint at with deserialization can be solved if we put objects in charge of the full deserialization dance, including allocating themselves.

Can you explain why you think this is helpful or necessary?

We're already being very vague around allocation of new objects. E.g. technically the correct way to allocate a Boolean wrapper is:

  1. Let constructor be targetRealm.[[intrinsics]].[[%Boolean%]].
  2. Let O be ? OrdinaryCreateFromConstructor(constructor, "%BooleanPrototype%", « [[BooleanData]] »).
  3. Set O.[[BooleanData]] to serialized.[[BooleanData]].

but this is pretty ridiculous, so we just lean on the reader knowing that

allocate a new Boolean object in targetRealm

is equivalent to the first two steps and

with its [[BooleanData]] internal slot value set to serialized.[[BooleanData]]

is equivalent to the third step.

I see the step

Let value be a new instance of the platform object type identified by serialized.[[Type]]

as analogous: it's a bit hand-wavy, but everyone knows what it means, and it should work fine. It should also be equivalent to the first step in most platform objects' constructors, actually.

I think this allocation should be defined by the deserialization steps of the platform object. In particular we might want to support a situation where you can have a platform object that doesn't exist in all contexts. E.g., it can only be passed around between [SecureContext] environments.

I don't see why this is a reason for requiring custom deserialization steps. It does imply I should add a note to the above stating that the object creation might fail for such a reason, though.

I don't really understand the need for recursive transfers. A transfer is basically moving a pointer. You can't really divide that further.

So, I'm not sure yet, but I think byte streams would be an example of this. You transfer the stream, which primarily means transferring any chunks already in the stream's queue recursively. (And then setting up a task to continually read from the stream and transfer more in the future.)

Now we're more comfortable with IDL extension, should we add IDL extension to make it easier to define the platform object bits? E.g., it seems serialization type could just be the interface name.

Great idea.

On initial reading I thought this requirement could not be met by SAB. I understand what you're going for though and I don't really know a better way of phrasing it.

What do you mean exactly? I was thinking maybe we'll need SABs to serialize to normal ArrayBuffer stuff + source realm, so that when deserializing we can compare target and source realm for share-within-able-ness. And that is indeed weird from the sense of "realm independent", but technically still meets the definition.

Was that what you were thinking, or something else?

@annevk

annevk Mar 8, 2017

Member

Can you explain why you think this is helpful or necessary?

  1. Platform objects might want to use something like ArrayCreate as well.
  2. We know from ArrayBuffer that allocating an object can fail earlier on. It seems less clean to have something allocated already and later fail.
  3. Defining how allocation happens in detail is something we need to get better at over time I think. (Main question that comes up here is about globals and whether any associated objects are in the same global. Not applicable here I think.) Perhaps the answer is that all platform objects should be able to be allocated in this way and filled in later, but deciding that here feels a little wrong if IDL doesn't require it (and doesn't define the actual allocation process). (Now, if that's the future you see, I guess I'm okay with that, but please file a follow-up issue on IDL for later.)

It does imply I should add a note to the above stating that the object creation might fail for such a reason, though.

That would definitely help make the current setup less problematic.

You transfer the stream, which primarily means transferring any chunks already in the stream's queue recursively.

I think what this means in practice is cloning the stream and transferring its contents. The main issue here is that it's no longer a synchronous operation; it becomes a sequence of intermittent operations. I would expect this to require a novel setup or some layering on top of what we have today.

I was thinking maybe we'll need SABs to serialize to normal ArrayBuffer stuff + source realm, so that when deserializing we can compare target and source realm for share-within-able-ness.

I would expect that you can either find the Shared Data Block or not in the agent cluster. (Of course, with the current fallback of copying this isn't quite what implementations do, but this would seem the quickest.)

@domenic

domenic Mar 8, 2017

Member

Platform objects might want to use something like ArrayCreate as well.

I don't think that's true. Platform objects are all allocated the same way, by Web IDL.

We know from ArrayBuffer that allocating an object can fail earlier on. It seems less clean to have something allocated already and later fail.

I don't understand this point. The allocation can fail I guess (if we are OOMing), and in that case the "Let value be a new instance of the platform object type identified by serialized.[[Type]]" step will fail. Changing to allow custom allocation doesn't handle failure any better.

Defining how allocation happens in detail is something we need to get better at over time I think. (Main question that comes up here is about globals and whether any associated objects are in the same global. Not applicable here I think.) Perhaps the answer is that all platform objects should be able to be allocated in this way and filled in later, but deciding that here feels a little wrong if IDL doesn't require it (and doesn't define the actual allocation process). (Now, if that's the future you see, I guess I'm okay with that, but please file a follow-up issue on IDL for later.)

I think that is the future I see, yeah. I'm not sure it's necessary to define in detail, but the idea is that like in implementations, all platform objects are just allocated the same way. The only important thing is getting the realm right, but that's what "in targetRealm" is about.

This ties into heycam/webidl#135 which is about defining "a new X object", i.e. allocation, in a tiny bit more detail.

I think what this means in practice is cloning the stream and transferring its contents. The main issue here is that it's no longer a synchronous operation; it becomes a sequence of intermittent operations. I would expect this to require a novel setup or some layering on top of what we have today.

Well, the question is, does it go in the transfer list or not? I think it needs to go in the transfer list, because the stream becomes "detached" via the locking mechanism. So these will be transfer steps, not serialization steps.

I think this will be slightly novel, but I think it fits into the setup reasonably well. You end up doing some kind of recursive promise loop inside the transfer steps. E.g. something like

  1. Get a reader, thus locking/detaching the stream.
  2. Let dataHolder be { [[Type]]: "ReadableStream", [[Queue]]: an empty List }.
  3. Let promise be reader.read().
  4. Set up a listener for when promise settles to { value, done }:
    1. Let transferredValue be StructuredTransfer(value).
    2. Add transferredValue to dataHolder's [[Queue]].
    3. Post a task (hmm, on what event loop though) to do the deserialization logic for anyone receiving the transfer?
  5. Return dataHolder.

This is clearly not complete, and might not work; maybe I should try to work it out completely, before we settle on this design. But I hope it demonstrates how recursive transfers might be used.

(Another issue here is that streams are technically not platform objects, but let's ignore that for now...)
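A runnable approximation of that sketch (structuredTransfer here is a trivial stand-in for the real StructuredTransfer, and the loop is async, which is exactly the awkward part being discussed):

```javascript
// Rough model of the sketched transfer steps. structuredTransfer is a
// placeholder; the task-posting question from step 4.3 is elided entirely.
function structuredTransfer(chunk) {
  return chunk; // stand-in: real transfer steps would detach/transfer the chunk
}

async function transferSteps(stream) {
  const reader = stream.getReader(); // locks ("detaches") the stream
  const dataHolder = { Type: "ReadableStream", Queue: [] };
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    dataHolder.Queue.push(structuredTransfer(value));
  }
  return dataHolder;
}
```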

I would expect that you can either find the Shared Data Block or not in the agent cluster. (Of course, with the current fallback of copying this isn't quite what implementations do, but this would seem the most quick.)

That's interesting. Currently the ES spec doesn't tie Shared Data Blocks to agent clusters, but it seems like a reasonable thing for it to do...

Although, that brings up a better idea for speccing than storing the source realm: store the source agent cluster. At the spec level at least it's then just a simple check for identity, and is arguably more obviously realm-independent.
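The invariant at stake is easy to demonstrate: two views over one Shared Data Block observe each other's writes, so any within-cluster serialization has to preserve block identity rather than copy bytes:

```javascript
// Two views over the same Shared Data Block: a write through one is
// immediately visible through the other. Serializing within an agent
// cluster must keep pointing at this block, not duplicate it.
const sab = new SharedArrayBuffer(8);
const viewA = new Int32Array(sab);
const viewB = new Int32Array(sab);
Atomics.store(viewA, 0, 42);
console.log(Atomics.load(viewB, 0)); // 42
```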

@domenic domenic referenced this pull request in w3c/FileAPI Mar 9, 2017

Closed

Define cloning of Blob and FileList objects inline #32

@domenic

domenic Mar 9, 2017

Member

Made more progress and pushed more commits today. Still left:

  • Decide whether StructuredTransfer, StructuredReceiveTransfer, and IsTransferable should stay as separate abstract ops, or get inlined into their single call sites. It sounds like @annevk is leaning toward inlining, at least until we have a clear need for them; I guess I'm OK with that.
  • Update call sites of StructuredCloneWithTransfer to use the correct tuple names (TODO is left in the source for this)
  • Update MessagePort's postMessage to actually make use of the split operations
  • Stop the double-StructuredClone()ing in BroadcastChannel and History since we can just serialize/deserialize instead.

It also would be really good to come up with a more concrete idea of how this will allow transferring streams.

@annevk

annevk Mar 9, 2017

Member

My thinking for streams and promises and any other temporal API is that they effectively need a private MessageChannel to function if the initial clone/transfer happens through the existing postMessage() API (be it on Window or MessagePort). That would also not require the introduction of any new primitives.

Another thing we need to do is collect downstream users that need updating:

source
</ol>
</li>
<li>
- <p>Otherwise, if <var>input</var> has [[SetData]] internal slot, then:</p>
+ <p>Otherwise, if <var>value</var> is an Array exotic object, then:</p> <!-- IsArray supports
+ proxies too, which we cannot -->

@annevk

annevk Mar 9, 2017

Member

Nit: I prefer keeping the comment on its own line. Makes them easier to spot/read.

source
+
+ <li><p>Let <var>value</var> be an uninitialized value.</p></li>
+
+ <li><p>If <var>serialized</var>.[[Transfer]] is true, then let <var>value</var> be ?

@annevk

annevk Mar 9, 2017

Member

Nit: s/let/set/, s/be/to/.

source
+ <var>interfaceName</var>, created in <var>targetRealm</var>.</p>
+
+ <p class="note">This step can potentially fail if for some reason allocating the object is
+ impossible, such as low-memory situations.</p>

@annevk

annevk Mar 9, 2017

Member

We should probably state that we rethrow exceptions here. However, given that we don't do that for Array objects, we should maybe not account for it here either (and only account for it in the deserialize steps).

@domenic

domenic Mar 14, 2017

Member

I don't quite understand what exceptions we'd be rethrowing. We don't do that for array objects because it's impossible for it to throw an exception. I think the same is true here, at least excepting OOMs.

@domenic (Member) — Mar 14, 2017:

Upon reviewing, though, this note does seem rather out of place, so I'll just remove it.

+ <span>StructuredDeserialize</span>(<var>entry</var>.[[Value]], <var>targetRealm</var>,
+ <var>memory</var>).</p></li>
+
+ <li><p>Perform ? <span>CreateDataProperty</span>(<var>value</var>,

@annevk (Member) — Mar 9, 2017:

This should always succeed at this point, I think?

+ previously-serialized <span>Record</span> <var>subSerialized</var>, and returns
+ <span>StructuredDeserialize</span>(<var>subSerialized</var>, <var>targetRealm</var>,
+ <var>memory</var>). (In other words, a <span>sub-deserialization</span> is a specialization
+ of <span>StructuredDeserialize</span> to be consistent within this invocation.)</p>

@annevk (Member) — Mar 9, 2017:

Why don't we simply pass the deserialization steps the memory argument from the start?

@domenic (Member) — Mar 14, 2017:

Making consumers remember to pass through memory and targetRealm, and/or fill memory at the appropriate times, seems very error-prone.

@annevk (Member) — Mar 15, 2017:

And with sub-deserialization they end up as global variables or some such? I guess I should study that again.

@annevk (Member) — Mar 15, 2017:

Never mind, I see how it works now. I guess that's fine, though it seems a little weird. I don't understand why they'd have to fill memory at the appropriate times though. That's not something that would change if we just require them to invoke the actual operation directly.
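The memory map under discussion has an observable consequence that is easy to check from script: duplicate references in the input graph are serialized once and deserialize to a single shared object. A minimal sketch, assuming the global `structuredClone()` (available in browsers and Node.js 17+), which is specified in terms of StructuredSerialize/StructuredDeserialize:

```javascript
// Duplicate references in the input graph are recorded once in `memory`
// during serialization, so deserialization produces one shared copy.
const shared = { count: 1 };
const graph = { a: shared, b: shared };

const copy = structuredClone(graph);

const identityPreserved = copy.a === copy.b; // true: one object, two references
const isNewObject = copy.a !== shared;       // true: allocated in the target realm
```

Threading `memory` through every recursive call (rather than making callers pass it) is what guarantees this identity sharing holds across serialization/deserialization steps defined by other specifications as well.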

+
+ <p class="note">Unlike the corresponding step in <span>StructuredDeserialize</span>, this step
+ is unlikely to throw an exception, as no new memory needs to be allocated: the memory occupied
+ by [[ArrayBufferData]] is instead just getting transferred into the new ArrayBuffer.</p>

@annevk (Member) — Mar 9, 2017:

This is not true when you cross an agent cluster / process.

@lars-t-hansen — Mar 10, 2017:

That's interesting. Currently the ES spec doesn't tie Shared Data Blocks to agent clusters, but it seems like a reasonable thing for it to do...

It never needed to, partly because "agent cluster" is an emergent property from agents sharing memory and partly because "agent cluster" does not have a representation (in ecma262). In reality of course, "agent cluster" == Unit of Same-origin Browsing Context, the agents that can legally receive a SAB from each other by any means.

Although, that brings up a better idea for speccing than storing the source realm: store the source agent cluster. At the spec level at least it's then just a simple check for identity, and is arguably more obviously realm-independent.

Indeed that's what code I'm working on for Firefox is doing, it stores the identity of the sender's Unit of Same-origin Browsing Contexts with the SAB representation; the receiver checks that its ditto identity matches the identity sent.
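The identity check described above can be sketched in a few lines. All names here (`agentCluster`, `sharedDataBlock`, `deserializeSharedBuffer`) are hypothetical and do not appear in the spec; the point is only that the comparison is a simple identity test rather than anything realm-dependent:

```javascript
// Hypothetical receiver-side check: the serialized SharedArrayBuffer record
// carries the sender's agent-cluster identity, and deserialization compares
// it against the receiver's before handing out the same Shared Data Block.
function deserializeSharedBuffer(serialized, receiverAgentCluster) {
  if (serialized.agentCluster !== receiverAgentCluster) {
    // Different unit of same-origin browsing contexts: refuse to share.
    throw new Error("DataCloneError: SharedArrayBuffer crossed agent clusters");
  }
  return serialized.sharedDataBlock; // same block handed back, no copying
}
```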

domenic referenced this pull request in w3c/webcrypto on Mar 14, 2017: "Will this spec accept a pull request?" #181 (Closed)

@domenic (Member) commented Mar 14, 2017:

OK, I think this is ready for review and in theory landing. Note that deserialization error handling is divided into two cases:

  • Explicitly ignored for now in history.state (the resulting history.state is null).
  • Left undefined with a TODO in the source for messages where we plan to introduce a "messageerror" event.

I think that is OK for now and we'll fix the latter as part of the SAB work. If we care about handling the former differently we should file a separate issue and not hold up this work. I could be persuaded to leave it vague though instead of censoring to null.

As for dependent specifications: note that the cases that currently use StructuredClone will not break, since we still define it here. However, it does seem like nobody actually wants to use StructuredClone directly. Everyone wants serialize/deserialize, including this spec. Maybe we should remove it?

@annevk (Member) commented Mar 15, 2017:

Removing StructuredClone seems a little better, since it'll force dependencies to rethink their strategy (and fix it).

@annevk (Member):

Could you maybe post a copy online for review? That would make such a large change a little easier to judge. Meanwhile I spotted a few errors while skimming through.

- queue</span> the event <var>e</var> now finds itself.</p></li>
+ <li>
+ <p>Let <var>finalTargetPort</var> be the <code>MessagePort</code> in whose <span>port message
+ queue</span> the event <var>e</var> now finds itself.</p>

@annevk (Member) — Mar 15, 2017:

This doesn't work, there's no e at this point.

@annevk (Member) — Mar 15, 2017:

You can refer back to the task itself though, I think?

+ <li><p>If <var>transferable</var> does not have a <span>[[Detached]]</span> internal slot
+ (i.e., <var>transferable</var> is not a <span data-x="transferable objects">transferable</span>
+ <span>platform object</span>), then throw a <span>"<code>DataCloneError</code>"</span>
+ <code>DOMException</code>.</p></li>

@annevk (Member) — Mar 15, 2017:

This would end up throwing if you pass in an ArrayBuffer whose ArrayBufferData is not detached.
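The behavior this check needs to preserve is observable via `structuredClone()` (browsers and Node.js 17+): an undetached ArrayBuffer in the transfer list must be accepted even though it lacks a [[Detached]] internal slot, while a genuinely non-transferable value must be rejected. A minimal sketch:

```javascript
// An ArrayBuffer is transferable, so this must not throw, even though its
// detached state lives in [[ArrayBufferData]] rather than a [[Detached]] slot.
const buf = new ArrayBuffer(8);
const clone = structuredClone(buf, { transfer: [buf] });

// A plain object is not transferable; per the spec this throws a
// "DataCloneError" DOMException.
let rejected = false;
try {
  structuredClone({}, { transfer: [{}] });
} catch (e) {
  rejected = true;
}
```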

@annevk (Member):

Found two more nits and one somewhat larger concern. Really great work 😊

+ </ol>
+ </li>
+
+ <li><p>Set <var>serialized</var> to ? <span>StructuredSerialize</span>(<var>input</var>,

@annevk (Member) — Mar 16, 2017:

s/Set/Let/, s/to/be/.

<li>
- <p>If <var>O</var> has an [[ArrayBufferData]] internal slot, then:</p>
+ <p>If <var>serialized</var> contains a [[TransferConsumed]] field, then:</p>

@annevk (Member) — Mar 16, 2017:

It seems the value of the slot is rather meaningless. It also seems weird for StructuredDeserialize to contain transfer-specific steps. Can't we just keep those in the transfer algorithms?

@domenic (Member) — Mar 16, 2017:

The value of the slot is used in an assert. It replaces the previous [[Transfer]] field whose value was indeed meaningless. But something is needed as a marker anyway, I believe.

I don't think there's a reasonable way to do this in StructuredDeserializeWithTransfer only. We'd need to recursively crawl the serialization to find all the things marked as [[TransferConsumed]]: false, then pull them out for a separate processing step. Such a processing step would then involve that weird "replace all references within a JS object graph" operation again. It's much better to just do this as we are deserializing the graph, in a single pass.

@annevk (Member) — Mar 16, 2017:

But we already do the "weird" thing when serializing, no? Seems better to be consistent.

@domenic (Member) — Mar 16, 2017:

It's different there, because it's web-observable what order the effects happen in (e.g. if there is something unserializable in the structure, you don't want to detach your array buffers).

Also, that doesn't require the "replace all references within a JS object graph" operation, which raises questions about e.g. whether we use Set() or DefineProperty() (answer: neither, some weird undefined-by-ES thing), but instead the much more reasonable "replace all references within a Record structure".

@annevk (Member) — Mar 16, 2017:

Thanks, I'm happy to let this one go.
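The ordering argument in this thread is observable from script. On success, transfer moves the backing memory and leaves the source detached; on failure partway through serialization, per the spec no transferable has been detached yet. A sketch using `structuredClone()` (browsers and Node.js 17+); the post-failure state of `buf2` is noted in a comment rather than asserted, since it depends on the implementation following the spec's ordering:

```javascript
// On success, transfer moves the memory: the clone sees the bytes and the
// source buffer ends up detached (byteLength 0).
const buf = new ArrayBuffer(8);
new Uint8Array(buf)[0] = 42;

const clone = structuredClone(buf, { transfer: [buf] });
const movedByte = new Uint8Array(clone)[0]; // 42: same bytes, new owner
const sourceDetached = buf.byteLength === 0;

// On failure, the order of effects matters: the function below makes
// serialization throw, and per the spec that happens before any transferable
// is detached, so buf2 should remain usable afterwards.
const buf2 = new ArrayBuffer(8);
let failed = false;
try {
  structuredClone({ buf2, fn: () => {} }, { transfer: [buf2] });
} catch (e) {
  failed = true;
}
```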

@@ -63755,6 +64325,34 @@ v6DVT (also check for '- -' bits in the part above) -->
<li><p>Return <var>imageData</var>.</p></li>
</ol>
+ <p><code>ImageData</code> objects are <span>serializable objects</span>. Their <span>serialization

@annevk (Member) — Mar 16, 2017:

The ImageData IDL needs to have the new Serializable keyword.

@domenic (Member) commented Mar 16, 2017:

Updated!

What do you think the right plan is for merging this? I'd like to do it sooner rather than later so I can resume work on the SAB stuff. But I can understand if we want to have PRs lined up for as many other specs as possible first. I can do most PRs but will need your help around the blob URL store.

@annevk (Member) commented Mar 17, 2017:

I would be okay with landing this with the "Breaking" prefix, coupled with downstream issues and us or others fixing those over the next couple of months.

@annevk (Member) commented Mar 17, 2017:

Note that I pushed a further fixup since you applied the annotation to the wrong interface.

annevk approved these changes on Mar 17, 2017.

domenic referenced this pull request in whatwg/infra on Mar 20, 2017: "Define numbers (waiting on Number / BigInt)" #87 (Open)

domenic merged commit 97d644c into master on Mar 20, 2017.

2 checks passed:

continuous-integration/travis-ci/pr — The Travis CI build passed
continuous-integration/travis-ci/push — The Travis CI build passed

domenic deleted the structureclone-refactor branch on Mar 20, 2017.

@annevk (Member) commented Apr 5, 2017:

Added WebAudio/web-audio-api to your list.
