New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revamp of SharedArrayBuffer PR #2518

Merged
merged 1 commit into from Apr 28, 2017

Conversation

3 participants
@annevk
Member

annevk commented Apr 10, 2017

This is a new take on #2361 based upon StructuredSerialize and StructuredDeserialize.

I rebased @domenic’s work and then did a fixup in a new commit.

For failure StructuredDeserialize throws. (Still need to update some callers. The previous PR is also a good source for tests that are still lacking.)

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 10, 2017

Member

Todo:

  • Rename directly share-to-able event loops to "directly-shared event loops"
  • Rename unit of share-within-able event loops to "shared event loops"
  • Revise definition of directly share-to-able event loops: "responsible browsing context" is never null for dedicated and shared workers. This is probably a big problem and requires introduction of some kind of hierarchy. There's no way you can distinguish Window -> Dedicated -> Shared -> Dedicated from Window -> Dedicated -> Dedicated, Window -> Shared for instance, somewhat by design. (Not sure if implementations still restrict creating a shared worker inside a dedicated worker.)
  • [ ] Make sure all the right places expose onmessageerror

@domenic do you want to preserve the first two separate commits? Might have to do some trickery to apply the above changes then.

Member

annevk commented Apr 10, 2017

Todo:

  • Rename directly share-to-able event loops to "directly-shared event loops"
  • Rename unit of share-within-able event loops to "shared event loops"
  • Revise definition of directly share-to-able event loops: "responsible browsing context" is never null for dedicated and shared workers. This is probably a big problem and requires introduction of some kind of hierarchy. There's no way you can distinguish Window -> Dedicated -> Shared -> Dedicated from Window -> Dedicated -> Dedicated, Window -> Shared for instance, somewhat by design. (Not sure if implementations still restrict creating a shared worker inside a dedicated worker.)
  • [ ] Make sure all the right places expose onmessageerror

@domenic do you want to preserve the first two separate commits? Might have to do some trickery to apply the above changes then.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 10, 2017

Member

Details of workers in Firefox:

  • Each worker has a parent, which is either a window object or another worker.
  • The parent of a shared or service worker can only be a window object.
  • Workers also have children and type, which are concepts the standard already has.
Member

annevk commented Apr 10, 2017

Details of workers in Firefox:

  • Each worker has a parent, which is either a window object or another worker.
  • The parent of a shared or service worker can only be a window object.
  • Workers also have children and type, which are concepts the standard already has.
@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 10, 2017

Member

No need to preserve the first two separate commits. It'd be vaguely nice if things could land separately, e.g. remove StructuredCloneWithTransfer, define agents and agent clusters, define SAB structured cloning, but it's not necessary. I've certainly tried for separate commits and failed on similarly-large projects.

Member

domenic commented Apr 10, 2017

No need to preserve the first two separate commits. It'd be vaguely nice if things could land separately, e.g. remove StructuredCloneWithTransfer, define agents and agent clusters, define SAB structured cloning, but it's not necessary. I've certainly tried for separate commits and failed on similarly-large projects.

@annevk annevk referenced this pull request Apr 11, 2017

Open

What is an Agent? #882

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 11, 2017

Member

I've split some of this out into #2520 and #2521.

To do StructuredCloneWithTransfer separately I think we'd have to tightly couple it with messageerror. Otherwise you end up changing the same algorithms everywhere (and in service workers) twice. I also think we'd have to do it before we land the SharedArrayBuffer bits. The alternative is coupling all these together. If you can make a suggestion I can work on it tomorrow, provided the other two PRs are in reasonable shape.

Member

annevk commented Apr 11, 2017

I've split some of this out into #2520 and #2521.

To do StructuredCloneWithTransfer separately I think we'd have to tightly couple it with messageerror. Otherwise you end up changing the same algorithms everywhere (and in service workers) twice. I also think we'd have to do it before we land the SharedArrayBuffer bits. The alternative is coupling all these together. If you can make a suggestion I can work on it tomorrow, provided the other two PRs are in reasonable shape.

@inexorabletash

This comment has been minimized.

Show comment
Hide comment
@inexorabletash

inexorabletash Apr 11, 2017

Member

@binji pointed me here - just wondering if throwing when attempting to store a SAB to IndexedDB falls out of this PR or if we need to tackle that separately.

Member

inexorabletash commented Apr 11, 2017

@binji pointed me here - just wondering if throwing when attempting to store a SAB to IndexedDB falls out of this PR or if we need to tackle that separately.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 12, 2017

Member

@inexorabletash no, we lost that property when we stopped requiring it to be a transferable. I suspect we might need StructuredSerializeWithoutSharedArrayBuffer, also for the Notifications API, which has similar constraints as Indexed DB (also keeps items for storage). @domenic what do you think? Requiring StructuredSerialize callers to traverse the record seems unnecessarily complicated.

Member

annevk commented Apr 12, 2017

@inexorabletash no, we lost that property when we stopped requiring it to be a transferable. I suspect we might need StructuredSerializeWithoutSharedArrayBuffer, also for the Notifications API, which has similar constraints as Indexed DB (also keeps items for storage). @domenic what do you think? Requiring StructuredSerialize callers to traverse the record seems unnecessarily complicated.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 12, 2017

Member

The above is assuming btw that [AllowShared] would be implied for any. An alternative might be to require it even then, since it's a bit of a footgun.

Member

annevk commented Apr 12, 2017

The above is assuming btw that [AllowShared] would be implied for any. An alternative might be to require it even then, since it's a bit of a footgun.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 12, 2017

Member

So IMO attempting to send a SharedArrayBuffer to IDB should work fine. It should snapshot it at that time, as part of the IDB-specific records-in-memory-to-bytes-on-disk process. As a web developer I think that would be more expected than requiring me to create a copy of my object graph with each SAB replaced by an AB whenever I want to store something on disk.

But let's say we want to prohibit it, for the purpose of this discussion. In that case I think it's part of a more general problem, where certain things aren't storable on disk. The obvious upcoming example is a stream.

Again to me the best layering for this sort of thing is for it to be part of IDB's records-in-memory-to-bytes-on-disk process. That just makes the most sense from my perspective, as it's IDB-specific. Maybe we can help IDB out by defining some abstract op like "CanSerializationBeWrittenToDisk" that crawls the record graph for you?


The normative question at hand, BTW, is whether serializing the JS object: { get x() { ... }, y: unDiskable, get z() { ... } } invokes zs getter. If we do the validation as a second pass, z's getter will be invoked. If we do it interleaved with the StructuredSerialize operation, then z's getter will not be invoked.

@inexorabletash has previously expressed off-thread that it fits better with Blink's architecture to not invoke z's getter, i.e. to do the validation while the object graph is being crawled and converted into its serialized form. I'm OK with that if other implementers also agree. I think it's less elegant layering, as it requires creating some IDB-specific variant of StructuredSerialize and intertwingling the concerns of IDB's disk serialization with StructuredSerialize's realm-independent serialization. But I don't object if that's what the implementers want; the theoretical purity isn't worth much.

Member

domenic commented Apr 12, 2017

So IMO attempting to send a SharedArrayBuffer to IDB should work fine. It should snapshot it at that time, as part of the IDB-specific records-in-memory-to-bytes-on-disk process. As a web developer I think that would be more expected than requiring me to create a copy of my object graph with each SAB replaced by an AB whenever I want to store something on disk.

But let's say we want to prohibit it, for the purpose of this discussion. In that case I think it's part of a more general problem, where certain things aren't storable on disk. The obvious upcoming example is a stream.

Again to me the best layering for this sort of thing is for it to be part of IDB's records-in-memory-to-bytes-on-disk process. That just makes the most sense from my perspective, as it's IDB-specific. Maybe we can help IDB out by defining some abstract op like "CanSerializationBeWrittenToDisk" that crawls the record graph for you?


The normative question at hand, BTW, is whether serializing the JS object: { get x() { ... }, y: unDiskable, get z() { ... } } invokes zs getter. If we do the validation as a second pass, z's getter will be invoked. If we do it interleaved with the StructuredSerialize operation, then z's getter will not be invoked.

@inexorabletash has previously expressed off-thread that it fits better with Blink's architecture to not invoke z's getter, i.e. to do the validation while the object graph is being crawled and converted into its serialized form. I'm OK with that if other implementers also agree. I think it's less elegant layering, as it requires creating some IDB-specific variant of StructuredSerialize and intertwingling the concerns of IDB's disk serialization with StructuredSerialize's realm-independent serialization. But I don't object if that's what the implementers want; the theoretical purity isn't worth much.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 13, 2017

Member

Storage question also affects the Notifications API at least, which does some limited form of storage.

Member

annevk commented Apr 13, 2017

Storage question also affects the Notifications API at least, which does some limited form of storage.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 13, 2017

Member

I guess I am coming around to StructuredSerializeWithoutSharedArrayBuffer, as long as it is generic (e.g. has a name like StructuredSerializeForStorage) and ideally is used by at least two consumers.

I don't quite understand why Notifications needs this as I thought it was all passing data around in memory. It certainly doesn't seem like Notifications are stored on disk. In particular the Notifications case seems pretty much identical to the pushState case. (Which is a good test to write for SABs... let me file a web platform tests)

Member

domenic commented Apr 13, 2017

I guess I am coming around to StructuredSerializeWithoutSharedArrayBuffer, as long as it is generic (e.g. has a name like StructuredSerializeForStorage) and ideally is used by at least two consumers.

I don't quite understand why Notifications needs this as I thought it was all passing data around in memory. It certainly doesn't seem like Notifications are stored on disk. In particular the Notifications case seems pretty much identical to the pushState case. (Which is a good test to write for SABs... let me file a web platform tests)

@domenic domenic referenced this pull request Apr 13, 2017

Merged

WIP: SharedArrayBuffer structured cloning tests #5003

19 of 21 tasks complete
@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 13, 2017

Member

You can create a notification from a service worker with serializable data, then that agent cluster can be collected, and the browser could even be closed (and store the data to disk) and then when the user clicks the notification you'd need to deserialize it, maybe from disk. And even if the disk isn't involved it wouldn't have to be the same instance of the service worker, meaning it's a new agent cluster.

Member

annevk commented Apr 13, 2017

You can create a notification from a service worker with serializable data, then that agent cluster can be collected, and the browser could even be closed (and store the data to disk) and then when the user clicks the notification you'd need to deserialize it, maybe from disk. And even if the disk isn't involved it wouldn't have to be the same instance of the service worker, meaning it's a new agent cluster.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 13, 2017

Member

Hmm, I see! Thanks, I didn't understand how data really worked.

Member

domenic commented Apr 13, 2017

Hmm, I see! Thanks, I didn't understand how data really worked.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 13, 2017

Member

So I still think we need to decide, for both Notifications and IDB, whether it snapshots the data or errors at serialization time. Snapshotting makes a lot more sense to me, but nobody seems sure... I guess we could always be conservative and let it fail for now, and see if users get upset about having to crawl their object graph and create a copy with all SABs replaced by AB snapshots...

Member

domenic commented Apr 13, 2017

So I still think we need to decide, for both Notifications and IDB, whether it snapshots the data or errors at serialization time. Snapshotting makes a lot more sense to me, but nobody seems sure... I guess we could always be conservative and let it fail for now, and see if users get upset about having to crawl their object graph and create a copy with all SABs replaced by AB snapshots...

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 13, 2017

Member

The main reason no serialization error feels icky is because writing to disk is effectively crossing the Agent Cluster boundary (same if you pass shared memory to the network). And whenever you normally cross the Agent Cluster boundary we decided on failure. We could also do what some implementations do and hand out an AB when you try to cross a Agent Cluster boundary with an SAB, but we kinda already decided against that.

Member

annevk commented Apr 13, 2017

The main reason no serialization error feels icky is because writing to disk is effectively crossing the Agent Cluster boundary (same if you pass shared memory to the network). And whenever you normally cross the Agent Cluster boundary we decided on failure. We could also do what some implementations do and hand out an AB when you try to cross a Agent Cluster boundary with an SAB, but we kinda already decided against that.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 13, 2017

Member

To me the disk is qualitatively different from the in-memory agent cluster boundaries. Asking to serialize something to IDB is pretty clearly asking for you to put the bytes on the disk. That's is qualitatively different from sending the memory to another window/worker. I doubt developers even know that they are both backed by similar spec/implementation infrastructure.

Putting it another way, when a developer says "put these bytes on disk" it's pretty surprising to be told "no, these are special shared bytes, we don't put those on the disk". For putting-on-disk purposes, they're just bytes, IMO!

The notifications case is less straightforward, as it certainly wasn't obvious to me that it was more like IDB than like postMessage. So my intuition doesn't help there.

Member

domenic commented Apr 13, 2017

To me the disk is qualitatively different from the in-memory agent cluster boundaries. Asking to serialize something to IDB is pretty clearly asking for you to put the bytes on the disk. That's is qualitatively different from sending the memory to another window/worker. I doubt developers even know that they are both backed by similar spec/implementation infrastructure.

Putting it another way, when a developer says "put these bytes on disk" it's pretty surprising to be told "no, these are special shared bytes, we don't put those on the disk". For putting-on-disk purposes, they're just bytes, IMO!

The notifications case is less straightforward, as it certainly wasn't obvious to me that it was more like IDB than like postMessage. So my intuition doesn't help there.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 13, 2017

Member

It seems very natural to me, but either way they will need a different code path. These are the options we have I think:

  1. Throw.
  2. Copy the internals into an AB.
  3. Copy the internals into a new SAB (and by this I mostly mean a new Shared Data Block).
Member

annevk commented Apr 13, 2017

It seems very natural to me, but either way they will need a different code path. These are the options we have I think:

  1. Throw.
  2. Copy the internals into an AB.
  3. Copy the internals into a new SAB (and by this I mostly mean a new Shared Data Block).
@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 13, 2017

Member

OK, trying to pick between those made me side with (1). In addition to the argument-from-conservativism, the idea that Notifications and IDB should be consistent clinched it. Because neither (2) nor (3) is very good for Notifications. Both are surprising for developers, I would think.

FWIW @binji and @inexorabletash were leaning toward (1) from what I understand, and Firefox already implements (1), so I think we should be good to move forward with it.

Thanks again for walking me through all this.

Member

domenic commented Apr 13, 2017

OK, trying to pick between those made me side with (1). In addition to the argument-from-conservativism, the idea that Notifications and IDB should be consistent clinched it. Because neither (2) nor (3) is very good for Notifications. Both are surprising for developers, I would think.

FWIW @binji and @inexorabletash were leaning toward (1) from what I understand, and Firefox already implements (1), so I think we should be good to move forward with it.

Thanks again for walking me through all this.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 13, 2017

Member

I decided to split out messageerror into its own PR: #2530. We can still land it together with this or shortly after each other, but it seems far better to get it reviewed separately.

Member

annevk commented Apr 13, 2017

I decided to split out messageerror into its own PR: #2530. We can still land it together with this or shortly after each other, but it seems far better to get it reviewed separately.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 13, 2017

Member

I will rebase this PR once we've made some progress on the other PRs and maybe landed one or two. In particular this probably needs to be on top of the messageerror PR due to the addition of the new StructuredSerializeForStorage abstract operation, but it also needs to be on top of the other PRs and it doesn't quite seem worth the effort to start cherry picking to make all that possible as the foundations are not at the nit stage just yet.

Member

annevk commented Apr 13, 2017

I will rebase this PR once we've made some progress on the other PRs and maybe landed one or two. In particular this probably needs to be on top of the messageerror PR due to the addition of the new StructuredSerializeForStorage abstract operation, but it also needs to be on top of the other PRs and it doesn't quite seem worth the effort to start cherry picking to make all that possible as the foundations are not at the nit stage just yet.

@inexorabletash

This comment has been minimized.

Show comment
Hide comment
@inexorabletash

inexorabletash Apr 13, 2017

Member

Thank you both for your care and attention on this.

Member

inexorabletash commented Apr 13, 2017

Thank you both for your care and attention on this.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 16, 2017

Member

Open issues:

  1. How should we define StructuredSerializeForStorage; should we give StructuredSerialize another parameter that we need to pass on throughout? Before or after memory? Or should we make StructuredSerialize and StructuredSerializeForStorage an abstraction on top of a shared internal primitive that takes three parameters, whereas they only take value? The latter seems somewhat nice.
  2. We probably need to account for the fact that by the time a SharedArrayBuffer reaches the other side it might have been GC'd or is there something that keeps a Shared Data Block alive if it's part of a message? (Imagine you just wait a minute before invoking start() on your MessagePort.)
Member

annevk commented Apr 16, 2017

Open issues:

  1. How should we define StructuredSerializeForStorage; should we give StructuredSerialize another parameter that we need to pass on throughout? Before or after memory? Or should we make StructuredSerialize and StructuredSerializeForStorage an abstraction on top of a shared internal primitive that takes three parameters, whereas they only take value? The latter seems somewhat nice.
  2. We probably need to account for the fact that by the time a SharedArrayBuffer reaches the other side it might have been GC'd or is there something that keeps a Shared Data Block alive if it's part of a message? (Imagine you just wait a minute before invoking start() on your MessagePort.)
@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 17, 2017

Member

I agree a separate StructuredSerializeInternal with a boolean parameter makes sense.

I assumed that there was a GC root (UA) -> in-flight messages link, i.e. the UA was't allowed to GC anything in the port message queue. Otherwise that'd be a problem for e.g. non-shared data blocks too; you'd go to deserialize and suddenly the data block is null instead of having all the right data.

Member

domenic commented Apr 17, 2017

I agree a separate StructuredSerializeInternal with a boolean parameter makes sense.

I assumed that there was a GC root (UA) -> in-flight messages link, i.e. the UA was't allowed to GC anything in the port message queue. Otherwise that'd be a problem for e.g. non-shared data blocks too; you'd go to deserialize and suddenly the data block is null instead of having all the right data.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 17, 2017

Member

It's a little different from a Data Block as this is just a reference to a Shared Data Block wheres the normal Data Block would just be inlined so to speak, but if we assume it's a strong reference that makes sense. Still, it seems like something implementations would have to explicitly account for somehow and would also be good to test.

Member

annevk commented Apr 17, 2017

It's a little different from a Data Block as this is just a reference to a Shared Data Block wheres the normal Data Block would just be inlined so to speak, but if we assume it's a strong reference that makes sense. Still, it seems like something implementations would have to explicitly account for somehow and would also be good to test.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 17, 2017

Member

Yeah, I think the semantics of UA-keeps-a-reference-to-Record plus Record-contains-a-Data-Block assume strong references in both cases. It would be ideal to test; maybe we can use https://github.com/w3c/web-platform-tests/blob/55e9a48c3b1218b64998e8a3f30ab3d274353f5f/streams/resources/test-utils.js#L27

Member

domenic commented Apr 17, 2017

Yeah, I think the semantics of UA-keeps-a-reference-to-Record plus Record-contains-a-Data-Block assume strong references in both cases. It would be ideal to test; maybe we can use https://github.com/w3c/web-platform-tests/blob/55e9a48c3b1218b64998e8a3f30ab3d274353f5f/streams/resources/test-utils.js#L27

@inexorabletash

This comment has been minimized.

Show comment
Hide comment
@inexorabletash

inexorabletash Apr 18, 2017

Member

Should forStorage be passed to the serialization steps so that platform objects can implement semantics equivalent to SAB if desired?

Member

inexorabletash commented Apr 18, 2017

Should forStorage be passed to the serialization steps so that platform objects can implement semantics equivalent to SAB if desired?

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 18, 2017

Member

Maybe once there's a use case. I don't think we need it for Blob/File. Those will continue to store a pointer to some "on disk" resource.

Member

annevk commented Apr 18, 2017

Maybe once there's a use case. I don't think we need it for Blob/File. Those will continue to store a pointer to some "on disk" resource.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 18, 2017

Member

Streams will need it pretty immediately; I'd pass it in.

Member

domenic commented Apr 18, 2017

Streams will need it pretty immediately; I'd pass it in.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 18, 2017

Member

Should we still abstract away the boolean parameter then or just embrace it and have a single abstract operation with forStorage defaulting to false?

Member

annevk commented Apr 18, 2017

Should we still abstract away the boolean parameter then or just embrace it and have a single abstract operation with forStorage defaulting to false?

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 18, 2017

Member

I guess it's somewhat more flexible if we don't expose the boolean parameter to all callers.

Member

annevk commented Apr 18, 2017

I guess it's somewhat more flexible if we don't expose the boolean parameter to all callers.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 18, 2017

Member

I think we should still abstract away the boolean parameter, yeah.

I was also thinking about whether we should maybe have separate 'serialization steps' and 'serialization for storage steps' with the latter defaulting to the former. But that's not a big deal I guess.

We do need to make sure sub-serialization passes through the forStorage boolean.

Member

domenic commented Apr 18, 2017

I think we should still abstract away the boolean parameter, yeah.

I was also thinking about whether we should maybe have separate 'serialization steps' and 'serialization for storage steps' with the latter defaulting to the former. But that's not a big deal I guess.

We do need to make sure sub-serialization passes through the forStorage boolean.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 18, 2017

Member

On today's WebAssembly call we briefly discussed the pushState/replaceState case and thought that should probably use StructuredSerializeForStorage as well, both to be conservative and because it seems at least feasible implementations could serialize history state to the disk during navigation.

Member

domenic commented Apr 18, 2017

On today's WebAssembly call we briefly discussed the pushState/replaceState case and thought that should probably use StructuredSerializeForStorage as well, both to be conservative and because it seems at least feasible implementations could serialize history state to the disk during navigation.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 18, 2017

Member

That sounds like a good plan.

Member

annevk commented Apr 18, 2017

That sounds like a good plan.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 18, 2017

Member

We do need to make sure sub-serialization passes through the forStorage boolean.

Wait, this already happens, but I guess you mean to the "serialization steps" as stated earlier.

Member

annevk commented Apr 18, 2017

We do need to make sure sub-serialization passes through the forStorage boolean.

Wait, this already happens, but I guess you mean to the "serialization steps" as stated earlier.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 18, 2017

Member

But also, if we change "serialization steps", everyone that has integrated with the current calling pattern will have to change again. Perhaps introducing "serialization for storage steps" isn't so bad?

Member

annevk commented Apr 18, 2017

But also, if we change "serialization steps", everyone that has integrated with the current calling pattern will have to change again. Perhaps introducing "serialization for storage steps" isn't so bad?

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 18, 2017

Member

Oh sorry, I hadn't read through the actual changes yet :-/.

I agree it's a bit confusing whether people who have integrated with the current pattern for declaring serialization steps need to declare the third argument if they don't use it. E.g. in JavaScript they would not, but in most strongly-typed languages they would.

Maybe "serialization for storage steps" is the easiest way out of that confusion.

Member

domenic commented Apr 18, 2017

Oh sorry, I hadn't read through the actual changes yet :-/.

I agree it's a bit confusing whether people who have integrated with the current pattern for declaring serialization steps need to declare the third argument if they don't use it. E.g. in JavaScript they would not, but in most strongly-typed languages they would.

Maybe "serialization for storage steps" is the easiest way out of that confusion.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 18, 2017

Member

It also makes more sense as the second argument I thought. Hmm.

Member

annevk commented Apr 18, 2017

It also makes more sense as the second argument I thought. Hmm.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 19, 2017

Member

I decided to make forStorage a third argument to "serialization steps" after all and recommend it be omitted if it's not relevant to the algorithm at hand. This seemed better than requiring platform objects to define a set of steps labeled "for storage" while the dominant usage of such an algorithm would not in fact be "for storage".

Member

annevk commented Apr 19, 2017

I decided to make forStorage a third argument to "serialization steps" after all and recommend it be omitted if it's not relevant to the algorithm at hand. This seemed better than requiring platform objects to define a set of steps labeled "for storage" while the dominant usage of such an algorithm would not in fact be "for storage".

@domenic

Sorry it took me a while to get around to reviewing this

@@ -7643,6 +7667,9 @@ interface <dfn>DOMStringList</dfn> {
<p>These steps may perform a <span>sub-serialization</span> to serialize nested data
structures. They should not call <span>StructuredSerialize</span> directly, as doing so will
omit the important <var>memory</var> argument.</p>
<p>The introduction of these steps should omit mention of the <var>forStorage</var> argument if
it is not relevant to the algorithm.</p>
</dd>

This comment has been minimized.

@domenic

domenic Apr 19, 2017

Member

I had a new idea. Why don't we say that platform objects may define themselves as "not serializable for storage", in addition to their serialization/deserialization steps? And explain when you might want to do that. Then we just throw a DataCloneError if that is set for a platform object.

That would help enforce consistent errors, for example.

@domenic

domenic Apr 19, 2017

Member

I had a new idea. Why don't we say that platform objects may define themselves as "not serializable for storage", in addition to their serialization/deserialization steps? And explain when you might want to do that. Then we just throw a DataCloneError if that is set for a platform object.

That would help enforce consistent errors, for example.

This comment has been minimized.

@inexorabletash

inexorabletash Apr 19, 2017

Member

As a parameter, it would let WebAssembly.Module define the "throw if serializing for storage if not in a secure context" behavior we've discussed elsewhere (rather than doing that in IDB as a post-serialize pass). Although maybe we can do that with prose just as easily. "A module is not serializable for storage if the context is not a secure context."

@inexorabletash

inexorabletash Apr 19, 2017

Member

As a parameter, it would let WebAssembly.Module define the "throw if serializing for storage if not in a secure context" behavior we've discussed elsewhere (rather than doing that in IDB as a post-serialize pass). Although maybe we can do that with prose just as easily. "A module is not serializable for storage if the context is not a secure context."

This comment has been minimized.

@domenic

domenic Apr 19, 2017

Member

Interesting; I am now leaning back toward a parameter. The prose version isn't great, because of the question "what is the context"? Which indeed is a good point we'll need to be sure to write tests on. Consider the case

// inside window1, so object literals/functions have a relevant realm of window1
window2Store.put({
  get x() { return new window3.WebAssembly.Module(...); }
});

which context(s) here need to be secure?

  • The current settings object during serialization is that of window2
  • The relevant settings object of the wasm module itself is window3. (Note: this kind of assumes WASM modules are platform objects, since normal JS objects don't necessarily have relevant settings objects. See tc39/ecma262#573.)

The only way "not serializable for storage" would work is if we decide window3 is the one that needs to be secure; then we could say "A module is not serializable for storage if its relevant settings object is not a secure context". For window2 we'll need an algorithm so we can correctly refer to "the current settings object".

(window1 isn't easy to get at in specs, and probably not in implementation either, so we can forget that.)

We could also require some combination of these, e.g. both window2 and window3. Again this is easier with algorithm steps.


Given this complexity, for now I am happy with either:

  • Abandoning revisions to serialization steps alone in this PR and working on a follow-up
  • Going with the PR as-is, with an optional forStorage parameter.
@domenic

domenic Apr 19, 2017

Member

Interesting; I am now leaning back toward a parameter. The prose version isn't great, because of the question "what is the context"? Which indeed is a good point we'll need to be sure to write tests on. Consider the case

// inside window1, so object literals/functions have a relevant realm of window1
window2Store.put({
  get x() { return new window3.WebAssembly.Module(...); }
});

which context(s) here need to be secure?

  • The current settings object during serialization is that of window2
  • The relevant settings object of the wasm module itself is window3. (Note: this kind of assumes WASM modules are platform objects, since normal JS objects don't necessarily have relevant settings objects. See tc39/ecma262#573.)

The only way "not serializable for storage" would work is if we decide window3 is the one that needs to be secure; then we could say "A module is not serializable for storage if its relevant settings object is not a secure context". For window2 we'll need an algorithm so we can correctly refer to "the current settings object".

(window1 isn't easy to get at in specs, and probably not in implementation either, so we can forget that.)

We could also require some combination of these, e.g. both window2 and window3. Again this is easier with algorithm steps.


Given this complexity, for now I am happy with either:

  • Abandoning revisions to serialization steps alone in this PR and working on a follow-up
  • Going with the PR as-is, with an optional forStorage parameter.

This comment has been minimized.

@annevk

annevk Apr 20, 2017

Member

I think the current PR is fine. It's flexible which is what we need until we have patterns.

@annevk

annevk Apr 20, 2017

Member

I think the current PR is fine. It's flexible which is what we need until we have patterns.

Show outdated Hide outdated source
Show outdated Hide outdated source
Show outdated Hide outdated source
data-dfn-type="abstract-op"><dfn>StructuredSerialize</dfn> ( <var>value</var> [ ,
<var>memory</var> ] )</h4>
<h4 id="structuredserializeinternal" data-noexport="" data-lt="StructuredSerializeInternal"
data-dfn-type="abstract-op"><dfn>StructuredSerializeInternal</dfn> ( <var>value</var>,

This comment has been minimized.

@domenic

domenic Apr 19, 2017

Member

We could omit all the Bikeshed stuff here since we're not exporting.

@domenic

domenic Apr 19, 2017

Member

We could omit all the Bikeshed stuff here since we're not exporting.

This comment has been minimized.

@annevk

annevk Apr 20, 2017

Member

I thought I'd leave it in for when we get that public/private thing fixed.

@annevk

annevk Apr 20, 2017

Member

I thought I'd leave it in for when we get that public/private thing fixed.

<li><p>Let <var>serialized</var> be ? <span>StructuredSerialize</span>(<var>input</var>,
<var>memory</var>).</p></li>
<li><p>Let <var>serialized</var> be ? <span>StructuredSerializeInternal</span>(<var>input</var>,
false, <var>memory</var>).</p></li>

This comment has been minimized.

@domenic

domenic Apr 19, 2017

Member

You could just keep calling StructuredSerialize here; seems slightly nicer?

@domenic

domenic Apr 19, 2017

Member

You could just keep calling StructuredSerialize here; seems slightly nicer?

This comment has been minimized.

@annevk

annevk Apr 20, 2017

Member

How? StructuredSerialize no longer accepts memory.

@annevk

annevk Apr 20, 2017

Member

How? StructuredSerialize no longer accepts memory.

@@ -8611,6 +8680,7 @@ o.myself = o;</pre>
</dd>
<dt><span>StructuredSerialize</span></dt>
<dt><span>StructuredSerializeForStorage</span></dt>
<dt><span>StructuredDeserialize</span></dt>
<dd>
<p>Creating a <span>JavaScript Realm</span>-independent snapshot of a given value which can be

This comment has been minimized.

@domenic

domenic Apr 19, 2017

Member

I'd add a paragraph after this explaining the distinction. Something like

StructuredSerializeForStorage can be used for situations where the serialization is anticipated to be stored in a persistent manner, instead of passed between Realms. It throws when attempting to serialize SharedArrayBuffer objects, since storing shared memory doesn't make sense. Similarly, it throws when given any platform objects that are marked as [not serializable for storage]."

@domenic

domenic Apr 19, 2017

Member

I'd add a paragraph after this explaining the distinction. Something like

StructuredSerializeForStorage can be used for situations where the serialization is anticipated to be stored in a persistent manner, instead of passed between Realms. It throws when attempting to serialize SharedArrayBuffer objects, since storing shared memory doesn't make sense. Similarly, it throws when given any platform objects that are marked as [not serializable for storage]."

Show outdated Hide outdated source

annevk added a commit to whatwg/notifications that referenced this pull request Apr 24, 2017

Use StructuredSerializeForStorage
As defined by and discussed in whatwg/html#2518.
Serializing and deserializing SharedArrayBuffer
Additionally, define StructuredSerializeForStorage when for
serializable objects need to be stored more persistently.

Tests: web-platform-tests/wpt#5003.

Fixes #2260 (by being the last in a set of fixes). Closes
tc39/ecmascript_sharedmem#144 and closes
tc39/ecmascript_sharedmem#65.
@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 27, 2017

Member

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1360190. Maybe we should have a generic bug for each browser though to go through the new tests. Safari fails quite a few due to not supporting shared or service workers, or BroadcastChannel, and also seems to have some bugs with window-to-window sharing.

Member

annevk commented Apr 27, 2017

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1360190. Maybe we should have a generic bug for each browser though to go through the new tests. Safari fails quite a few due to not supporting shared or service workers, or BroadcastChannel, and also seems to have some bugs with window-to-window sharing.

@annevk

This comment has been minimized.

Show comment
Hide comment
@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Apr 27, 2017

Member

This LGTM and I'm ready to merge/publish the blog post when you are. I agree with a single "update SAB to pass the tests" bug for each browser.

Member

domenic commented Apr 27, 2017

This LGTM and I'm ready to merge/publish the blog post when you are. I agree with a single "update SAB to pass the tests" bug for each browser.

annevk added a commit to web-platform-tests/wpt that referenced this pull request Apr 28, 2017

@annevk annevk merged commit b3f8779 into master Apr 28, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@annevk annevk deleted the annevk/sab branch Apr 28, 2017

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Apr 28, 2017

Member

Something weird happened. Because I did squash & merge so I could change the commit message, GitHub ended up changing the author of the commit. I haven't seen that happen before.

Member

annevk commented Apr 28, 2017

Something weird happened. Because I did squash & merge so I could change the commit message, GitHub ended up changing the author of the commit. I haven't seen that happen before.

annevk added a commit to whatwg/notifications that referenced this pull request May 14, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment