StructuredClone serialize / deserialize #935

Closed
annevk opened this issue Mar 24, 2016 · 22 comments · Fixed by #2530

Comments

@annevk
Member

annevk commented Mar 24, 2016

See tc39/proposal-ecmascript-sharedmem#39 (comment). The way we define structured cloning right now is not correct.

What we need is some kind of "Agent Message Record" which holds a serialization of an object graph (probably also defined as a record). Those records are then passed around between agents/realms (using postMessage()) and can sit still for a while in a MessagePort's queue until it's started for a particular agent/realm. (Although maybe it needs to be bigger if we actually want to carefully define how those tasks get transferred between event loops too. Or we pretend tasks are magic.)

tc39/proposal-ecmascript-sharedmem#39 (comment) has thoughts on how to approach this.

@lars-t-hansen, let me know when this becomes higher priority.

@jungkees @jakearchibald this is probably also important for service workers for when we'll start specifying the messages going to and from clients in more detail.

@domenic
Member

domenic commented Sep 13, 2016

We discussed this a bit more in #jslang, from a slightly different perspective. If we want to be able to structured clone promises and streams between workers and the main thread, we need some distinction which doesn't allow those to be serialized to IndexedDB, since that doesn't work or make sense. My proposed framework is then:

  • Objects can have three possible operations:
    • Serialize(object): object -> lists-and-records
    • Deserialize(serialization, targetRealm): lists-and-records -> new object created in targetRealm
    • Entangle(object, targetRealm): object -> new object created in targetRealm which is "entangled" with the original object
  • An object that has Serialize must have Deserialize; an object that has Entangle must not have Serialize or Deserialize.
  • Moving stuff between realms is done via structured cloning, which does either Serialize + Deserialize, or Entangle, depending on the object being cloned.
  • IDB and other storage-type things no longer use structured clone. Instead, they directly use Serialize and Deserialize.
    • Serialize does not work on things that must be transferred, like MessagePorts. This is already how IDB works.
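
As a rough illustration of the Serialize / Deserialize half of this framework, here is a minimal sketch. It assumes a tiny hypothetical record format (`Serialized`) and handles only primitives, dates, and arrays; the names `serialize`, `deserialize`, and `structuredCloneSketch` are made up for this example and do not come from any spec. Entangle is left out, and the target realm is implicit because everything runs in one realm.

```typescript
// Hypothetical "lists-and-records" form; not the format of any specification.
type Serialized =
  | { type: "primitive"; value: string | number | boolean | null }
  | { type: "Date"; epochMs: number }
  | { type: "Array"; items: Serialized[] };

// Serialize(object): object -> lists-and-records.
function serialize(value: unknown): Serialized {
  if (
    value === null ||
    typeof value === "string" ||
    typeof value === "number" ||
    typeof value === "boolean"
  ) {
    return { type: "primitive", value };
  }
  if (value instanceof Date) {
    return { type: "Date", epochMs: value.getTime() };
  }
  if (Array.isArray(value)) {
    return { type: "Array", items: value.map((item) => serialize(item)) };
  }
  // Anything else (functions, promises, ports, ...) has no Serialize here.
  throw new Error("value has no Serialize operation in this sketch");
}

// Deserialize(serialization, targetRealm): lists-and-records -> new object.
function deserialize(record: Serialized): unknown {
  switch (record.type) {
    case "primitive":
      return record.value;
    case "Date":
      return new Date(record.epochMs);
    case "Array":
      return record.items.map((item) => deserialize(item));
  }
}

// Moving a value between realms: Serialize followed by Deserialize.
function structuredCloneSketch(value: unknown): unknown {
  return deserialize(serialize(value));
}
```

Under this model a storage API would call `serialize` on write and keep the record, then call `deserialize` on read, without going through `structuredCloneSketch` at all.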

An example of Entangle for promises is creating a new promise p in targetRealm, then basically doing object.then(resolveP, rejectP), where resolveP and rejectP resolve/reject p with a structured clone of their argument.
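
The promise case could look roughly like the sketch below. It is only an illustration: `entanglePromise` and `cloneForTarget` are hypothetical names, the target realm is implicit, and the structured clone is approximated with a JSON round trip, so it assumes JSON-compatible values.

```typescript
// Hypothetical stand-in for "a structured clone of their argument".
// Assumption: values are JSON-compatible (or undefined, which is passed through).
function cloneForTarget<T>(value: T): T {
  if (value === undefined) return value;
  return JSON.parse(JSON.stringify(value)) as T;
}

// Entangle(object, targetRealm) for promises: create a new promise p and
// settle it with a clone of whatever the source promise settles with.
function entanglePromise<T>(source: Promise<T>): Promise<T> {
  return new Promise<T>((resolveP, rejectP) => {
    source.then(
      (value) => resolveP(cloneForTarget(value)),
      (reason) => rejectP(cloneForTarget(reason)),
    );
  });
}
```

The returned promise is a different object from `source`, and the value it fulfills with is a copy, so mutating the original value afterwards does not affect what the other side sees.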

An example of Entangle for streams is basically doing what we are doing in fetch/service workers today to pass things over the service worker boundary to the main thread. If this work is done carefully we could probably replace all that spec text.

@annevk
Member Author

annevk commented Sep 13, 2016

Two complications: 1) Serialize does not mean persist to IDB, e.g., SharedArrayBuffer. 2) Serialize and Deserialize happen in different tasks and can each fail. We currently do not handle failure for the latter.
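
To make the second complication concrete, here is a small sketch of the two steps as separate functions that can fail independently. Everything in it is an assumption for illustration: the record is just a JSON string, and a corrupted payload stands in for a real deserialization failure. The `messageerror` name is taken from the commits referenced on this issue.

```typescript
// What the receiving side ends up observing.
type Delivery =
  | { event: "message"; data: unknown }
  | { event: "messageerror" };

// Hypothetical record: the payload is held as a JSON string while it waits.
interface MessageRecord {
  payload: string;
}

// Step 1, in the sending agent. A failure here can be thrown to the caller.
function serializeInSender(value: unknown): MessageRecord {
  const payload = JSON.stringify(value); // throws on cycles
  if (payload === undefined) {
    throw new Error("value cannot be serialized");
  }
  return { payload };
}

// Step 2, later, in the receiving agent. The sender has already returned, so
// a failure has to be reported to the receiver rather than thrown to the sender.
function deliverInReceiver(record: MessageRecord): Delivery {
  try {
    return { event: "message", data: JSON.parse(record.payload) };
  } catch (_err) {
    return { event: "messageerror" };
  }
}
```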

@domenic
Member

domenic commented Sep 13, 2016

Can you expand on

Serialize does not mean persist to IDB, e.g., SharedArrayBuffer

? @inexorabletash was talking about something similar, but my understanding is that it hasn't been decided exactly how SAB and IDB should interact.

My thought for something like SharedArrayBuffer or Blob is that the result of serialization would be a record like { [[Type]]: "SharedArrayBuffer", [[BackingData]]: the [[ArrayBufferData]] of the source } or { [[Type]]: "Blob", [[BackingData]]: a pointer to the data }. The actual process of serializing these to disk (i.e. translating them into bytes) would be implementation-specific.

@annevk
Member Author

annevk commented Sep 13, 2016

@lars-t-hansen told me that was desired.

@domenic
Member

domenic commented Sep 13, 2016

Right. But why does that mean "Serialize does not mean persist to IDB"?

@annevk
Member Author

annevk commented Sep 13, 2016

It means "storage/persist" and Serialize are distinct. Supporting the latter does not imply the former.

@domenic
Member

domenic commented Sep 13, 2016

OK. I don't understand your point then or how it relates to SAB. SharedArrayBuffer would support Serialize and would be storable/persistable.

@lars-t-hansen

lars-t-hansen commented Sep 13, 2016

Briefly, for me the question is, which operation does postMessage use? It can't be Serialize if that makes a copy of shared memory as it would have to for IDB.

@domenic
Member

domenic commented Sep 13, 2016

Serialize does not make a copy of shared memory, neither for postMessage nor for IDB. It just serializes to the record of the form { [[Type]]: "SharedArrayBuffer", [[BackingData]]: the [[ArrayBufferData]] of the source }. When this is Deserialized by postMessage, it just creates a new SAB with the pointer to [[BackingData]]. When IDB performs its implementation-specific records-to-bytes translation, it makes a copy of the data.
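
A minimal sketch of that distinction, under stated assumptions: `SharedBufferSketch` is a stand-in class rather than the real SharedArrayBuffer, a plain `Uint8Array` plays the role of the shared data block, and `recordToBytes` is a hypothetical name for the implementation-specific records-to-bytes step.

```typescript
// Hypothetical record, following the { [[Type]], [[BackingData]] } form above.
interface SharedBufferRecord {
  type: "SharedArrayBuffer";
  backingData: Uint8Array; // stand-in for the shared data block
}

// Stand-in for a SharedArrayBuffer object; not the real class.
class SharedBufferSketch {
  readonly data: Uint8Array;
  constructor(data: Uint8Array) {
    this.data = data;
  }
}

// Serialize: the record points at the same memory, nothing is copied.
function serializeShared(buffer: SharedBufferSketch): SharedBufferRecord {
  return { type: "SharedArrayBuffer", backingData: buffer.data };
}

// Deserialize (postMessage-style): a new object over the same memory.
function deserializeShared(record: SharedBufferRecord): SharedBufferSketch {
  return new SharedBufferSketch(record.backingData);
}

// Records-to-bytes step as a storage backend might do it: copies the data.
function recordToBytes(record: SharedBufferRecord): Uint8Array {
  return record.backingData.slice();
}
```

Writing through the original buffer after serialization is visible through the deserialized object, but not in the bytes produced by `recordToBytes`, which is the behavioral difference being discussed here.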

@annevk
Member Author

annevk commented Sep 14, 2016

@domenic all I'm saying is that we wouldn't allow SharedArrayBuffer to be stored. So the theoretical Store/Persist abstract operation that takes the result of the Serialize operation would fail on SharedArrayBuffer records. (I suspect this is because the underlying data is much more mutable than with other kinds of objects. Only File objects have somewhat similar characteristics, but those would require the user to go in and modify the underlying resource.)
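
For illustration only, such a Store/Persist step might be sketched as follows. The `RecordSketch` type and the `store` function are hypothetical, and the choice to throw is just one way to model "would fail".

```typescript
// Hypothetical serialized records; only two kinds are modeled here.
type RecordSketch =
  | { type: "ArrayBuffer"; bytes: Uint8Array }
  | { type: "SharedArrayBuffer"; backingData: Uint8Array };

// Theoretical Store/Persist: takes the output of Serialize and rejects
// record kinds that are serializable but not storable.
function store(record: RecordSketch): Uint8Array {
  if (record.type === "SharedArrayBuffer") {
    throw new Error("SharedArrayBuffer records cannot be stored");
  }
  return record.bytes.slice(); // copy of the bytes to be persisted
}
```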

@annevk
Member Author

annevk commented Sep 14, 2016

And IDB doesn't make a copy of the data necessarily. For File objects it just stores a pointer, as far as I know. If it made a copy of the data it would do more than structured cloning offers and I don't think it does that (which is the main reason why we don't want SharedArrayBuffer there I think, presumably this also applies to the Notifications API).

@domenic
Member

domenic commented Sep 14, 2016

Why wouldn't we allow SAB to be stored??

@annevk
Member Author

annevk commented Sep 14, 2016

What would storing it mean? Would it mean copying the underlying data? It's not really a primitive that's defined nor necessarily desired for v1.

@domenic
Member

domenic commented Sep 14, 2016

Yes, copying the underlying data, exactly the same as for ABs. It is indeed not defined how that happens, since it's implementation specific.

@lars-t-hansen

Clearly the SAB's data can be stored (though of course there's no guarantee that the data are stable while the storing is going on :) The SAB itself can't be stored in the same way that a closure can't usefully be stored, though.

I probably don't care very much how this is resolved, so long as we're clear on what the various operations do. I think it's weird that IDB has an implementation-specific mechanism for persisting data that could accept a type in one embedding and not in another (IIUC), but I don't think it matters for my purposes.

@domenic
Member

domenic commented Sep 14, 2016

To be clear what I mean by implementation-specific: I mean that the actual byte pattern on disk/in memory used is implementation specific. The "records-and-lists" structure is meant to be an implementation-agnostic (and, more importantly, realm-agnostic) representation of the data, but how that gets translated into a byte pattern is necessarily implementation specific. This implementation-specific serialization is not just for IDB; it's also used in IPC for postMessage and friends.

@annevk
Member Author

annevk commented Sep 14, 2016

It's not the same as regular AB since the underlying data isn't shared for regular AB.

@domenic
Member

domenic commented Sep 14, 2016

That doesn't impact anything relevant here.

@annevk
Member Author

annevk commented Sep 14, 2016

It does, because the SAB you get back may or may not point to a different buffer depending on how storage is defined.

@inexorabletash
Member

And IDB doesn't make a copy of the data necessarily. For File objects it just stores a pointer, as far as I know.

Tangent, but IDB implementations do copy the File contents - basically the same work as for Blobs. Even outside IDB the behavior of a File when the file on disk is modified is inconsistent between browsers. (At some point the spec effectively required doing a full data copy on file selection, but no browser does that.)

@annevk
Member Author

annevk commented Sep 14, 2016

We should get that defined.

@inexorabletash
Member

w3c/FileAPI#47 and possibly others.

annevk added a commit that referenced this issue Apr 13, 2017
And remove StructuredCloneWithTransfer as deserializing errors need to
be handled on their own, to be consistent with MessageChannel.

This helps with #2260 and fixes #935.
annevk added a commit that referenced this issue Apr 16, 2017
The messageerror event is used when deserialization fails. E.g., when
an ArrayBuffer object cannot be allocated.

This also removes StructuredCloneWithTransfer as deserializing errors
now need to be handled on their own.

Fixes part of #2260 and fixes #935.
annevk added a commit that referenced this issue Apr 24, 2017
The messageerror event is used when deserialization fails. E.g., when
an ArrayBuffer object cannot be allocated.

This also removes StructuredCloneWithTransfer as deserializing errors
now need to be handled on their own.

Tests: web-platform-tests/wpt#5567.

Service workers follow-up:
w3c/ServiceWorker#1116.

Fixes part of #2260 and fixes #935.
domenic pushed a commit that referenced this issue Apr 24, 2017
The messageerror event is used when deserialization fails. E.g., when
an ArrayBuffer object cannot be allocated.

This also removes StructuredCloneWithTransfer as deserializing errors
now need to be handled on their own.

Tests: web-platform-tests/wpt#5567.

Service workers follow-up:
w3c/ServiceWorker#1116.

Fixes part of #2260 and fixes #935.
inikulin pushed a commit to HTMLParseErrorWG/html that referenced this issue May 9, 2017
The messageerror event is used when deserialization fails. E.g., when
an ArrayBuffer object cannot be allocated.

This also removes StructuredCloneWithTransfer as deserializing errors
now need to be handled on their own.

Tests: web-platform-tests/wpt#5567.

Service workers follow-up:
w3c/ServiceWorker#1116.

Fixes part of whatwg#2260 and fixes whatwg#935.
alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
This rewrites most of the cloneable and transferable object
infrastructure to better reflect the reality that structured cloning
requires separate serialization and deserialization steps, instead of a
single operation that creates a new object in the target Realm. This is
most evident in the case of MessagePorts, as noted in whatwg#2277. It also
allows us to avoid awkward double-cloning with an intermediate
"user-agent defined Realm", as seen in e.g. history.state or IndexedDB;
instead we can simply store the serialized form and later deserialize.

Concretely, this:

* Replaces the concept of cloneable objects with serializable objects.
  For platform objects, instead of defining a [[Clone]]() internal
  method, serializable platform objects are annotated with the new
  [Serializable] IDL attribute, and include serialization and
  deserialization steps in their definition.
* Updates the concept of transferable objects. For platform objects,
  instead of defining a [[Transfer]]() internal method, transferable
  platform objects are annotated with the new [Transferable] IDL
  attribute, and include transfer and transfer-receiving steps.
  Additionally, the [[Detached]] internal slot for such objects is now
  managed more automatically.
* Removes the StructuredClone() abstract operation in favor of separate
  StructuredSerialize() and StructuredDeserialize() abstract operations.
  In practice we found that performing a structured clone alone is never
  necessary in specs. It is always either coupled with a transfer list,
  for which StructuredCloneWithTransfer() can be used, or it is best
  expressed as separate serialization and deserialization steps.
* Removes IsTransferable() and Transfer() abstract operations. When
  defined more properly, these became less useful by themselves, so they
  were inlined into the rest of the machinery.
* Introduces StructuredSerializeWithTransfer() and
  StructuredDeserializeWithTransfer(), which can be used by other
  specifications which need to define their own postMessage()-style
  algorithm but for which StructuredCloneWithTransfer() is not
  sufficient.

Closes whatwg#785. Closes whatwg#935. Closes whatwg#2277. Closes whatwg#1162. Sets the stage for
whatwg#936 and whatwg#2260/whatwg#2361.
alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
The messageerror event is used when deserialization fails. E.g., when
an ArrayBuffer object cannot be allocated.

This also removes StructuredCloneWithTransfer as deserializing errors
now need to be handled on their own.

Tests: web-platform-tests/wpt#5567.

Service workers follow-up:
w3c/ServiceWorker#1116.

Fixes part of whatwg#2260 and fixes whatwg#935.

4 participants