Update README.md
littledan committed May 27, 2020
1 parent faadee2 commit c28c866
Showing 1 changed file with 83 additions and 19 deletions.
102 changes: 83 additions & 19 deletions README.md
@@ -1,12 +1,10 @@
# ECMAScript Proposal for Boxing Objects into Primitives
# Symbols as WeakMap keys

Stage 0

**Champions**:
On the agenda for the June 2020 TC39 meeting for Stage 1

- Robin Ricard (@rricard)
- Rick Button (@rickbutton)

**Authors**:
**Coauthors/champions**:

- Robin Ricard (@rricard)
- Rick Button (@rickbutton)
@@ -16,6 +14,12 @@

This proposal aims to solve a problem space introduced by the [Record & Tuple Proposal][rtp]: how can we reference and access non-primitive values from a primitive?

tl;dr We see Symbols, dereferenced through WeakMaps, as the most reasonable way forward to reference Objects from Records and Tuples, given all the constraints raised in the discussion so far.

There are some open questions as to how this should work exactly, as well as valid ergonomics/ecosystem coordination issues, which we hope to resolve/validate in the course of the TC39 stage process. We'll start with an understanding of the problem space, including why Records and Tuples are a good first step without this feature. Then, we'll examine various possible solutions, with their pros and cons.

----

[Records & Tuples][rtp] can't contain objects, functions, or methods and will throw a `TypeError` when someone attempts to do it:

```js
Expand All @@ -25,7 +29,7 @@ const server = #{
};
```

This limitation exists because one of the **key goals** of the [Record & Tuple Proposal][rtp] is to have deep immutability guarantees _by default_.
This limitation exists because one of the **key goals** of the [Record & Tuple Proposal][rtp] is to have deep immutability guarantees and structural equality _by default_.

The userland solutions mentioned below provide multiple methods of side-stepping this limitation, and `Record and Tuple` is viable and useful without additional language support for boxing objects. This proposal attempts to describe solutions that complement the usage of these userland solutions with `Record and Tuple`, but is not a prerequisite to landing `Record and Tuple` in the language.

@@ -110,6 +114,7 @@ class RefBookkeeper {
deref(sym) { return this.#references[sym]; }
}

// Usage
const server = (() => {
const references = new RefBookkeeper();
const structure = #{
@@ -128,13 +133,17 @@ You might want to reuse the same `RefBookkeeper` object across multiple server i

```js
globalThis.refs = new RefBookkeeper();

// Usage
const server = #{
port: 8080,
handler: refs.ref(function handler(req) { /* ... */ }),
};
refs.deref(server.handler)({ /* ... */ });
```

See also [a larger worked example](https://github.com/rricard/proposal-refcollection/issues/5#issuecomment-619104173) demonstrating the kinds of code patterns where you'd want to be able to use a RefBookkeeper across several functions.

However, this pattern leads to memory management issues: as the program runs, more and more entries get added to `refs`, the Array grows without bound, and every entry has to be held in memory in case someone calls `deref` on it, even if no one ever will. Ideally, we would stop referencing things that are no longer needed, so the garbage collector can reclaim their memory. There are two ways to do this:
- If we reuse one global `RefBookkeeper` everywhere, then we need to manually delete unused entries when we're done with them, with an extra `delete` method added to the class, allowing the index to be reused. (We can't use FinalizationRegistry for this, since Numbers are never really dead.)
- Otherwise, we could avoid using a global RefBookkeeper, and instead use smaller local ones, which we pass around in parallel to the Record/Tuple, as in the previous examples.
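
The first option can be sketched roughly as follows. This is only an illustration, not code from the proposal: the class name `ManualRefBookkeeper`, the free list, and the use of Number indexes as references are all assumptions made for this example.

```js
// Hypothetical variant of RefBookkeeper that hands out Number indexes
// and relies on callers to release them manually.
class ManualRefBookkeeper {
  #references = [];  // index -> referenced object (undefined once freed)
  #freeIndexes = []; // indexes released by delete(), available for reuse

  ref(obj) {
    // Reuse a freed slot if there is one; otherwise grow the Array.
    const index = this.#freeIndexes.length > 0
      ? this.#freeIndexes.pop()
      : this.#references.length;
    this.#references[index] = obj;
    return index;
  }

  deref(index) {
    return this.#references[index];
  }

  delete(index) {
    if (this.#references[index] === undefined) return false;
    this.#references[index] = undefined; // drop the strong reference
    this.#freeIndexes.push(index);
    return true;
  }
}

const manualRefs = new ManualRefBookkeeper();
const first = manualRefs.ref(function handler(req) { /* ... */ });
manualRefs.delete(first);
const second = manualRefs.ref({ replaced: true });
console.log(first === second); // the freed index was handed out again
```

Note the hazard this sketch makes visible: once an index is reused, a stale copy of the old index silently dereferences to a different object, which is part of why manual deletion is unattractive.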
@@ -191,20 +200,72 @@ When working in the context of multiple JS global objects (e.g., with a same-ori

It's possible to get around this by passing `Box.prototype.deref()` to that other global object. This method "grants the right" to dereference boxes created in the context of that global object. In this way, different global objects are a little bit isolated from each other, when it comes to `Box`.

###### Membrane/Ocap implications
###### Membrane/Ocap analysis

Our hope was that, because `Box.prototype.deref` is accessed and called through ordinary object operations, which can be membrane-wrapped, the Box system would work well for membrane-based security: just run each membrane-enclosed piece of code in a separate Realm, and you can safely share Boxes, granting or denying access based on access to the `Box` constructor or `deref` method.

However, in systems which want to use membrane-based isolation *within* the same Realm, the whole feature would have to be turned off. This is to prevent boxes from "piercing" through membranes: a box primitive could be passed from one side of the membrane to the other (the membrane has no chance to intervene since boxes are not objects), and then it can simply be dereferenced. This would constitute an unmediated cross-membrane communication channel, going against the goals of the membrane system.

All access to the contents of boxes goes through `Box.prototype.deref`. A Realm's `Box.prototype` is ambiently available within that Realm, since it's referenced from ToObject, as long as you have a box. Although it is possible to turn off this feature from JavaScript by removing that function, this seems unfortunate for two reasons:
- If the Realm is already deeply frozen, it's no longer possible to mutate `Box.prototype`
- It'd be unfortunate to lose the functionality!

(Note that, if this kind of isolation is taken to be a requirement, it would have other implications for other kinds of potential TC39 proposals, such as Record classes/value types.)

### BoxMakers to avoid Realm-wide communication channels

> Boxmaker, boxmaker, plan me no plans.
>
> I’m in no rush. Maybe I’ve learned
>
> Playing with boxes a girl can get burned.
>
> -- [A musical](http://www.exelana.com/lyrics/MatchmakerMatchmaker.html), probably

Ultimately, what we want is a way to construct and dereference boxes without being attached to the Realm. It could be used something like this:

```js
const Box = new BoxMaker();

const obj = { hello: "world" };
const box = Box(obj);
assert(typeof box === "box", "boxes are a new primitive type");
assert(obj !== box, "boxes are not their boxed object");
assert(obj.hello === Box.deref(box).hello, "boxes can deref props");
assert(obj === Box.deref(box), "boxes can deref the full object");
```

```js
const server = #{
port: 8080,
handler: Box(function handler(req) { /* ... */ }),
};
Box.deref(server.handler)({ /* ... */ });
```

This is definitely a bit uglier--it's annoying to have to instantiate your own `Box` constructor, call out to it explicitly to call `deref` as opposed to using method chaining, and to coordinate all your code to use the same `Box`. Still, if you reuse the same `Box` constructor across all your functions, they interact "for free" without any manual bookkeeping or passing around of auxiliary structures, as described with `RefBookkeeper` above.

The need for a `Box`/`BoxMaker` separation is a direct consequence of the goal of letting membrane-based isolation within a single Realm coexist with this feature, rather than requiring the feature to be turned off.

This isolation enables membrane-based security in the context of Boxes. Since Boxes are primitives, they cannot be membrane-wrapped. But, the `Box.prototype.deref()` method exists separately--this *method* can be membrane-wrapped, in order to mediate before giving out the information contained in the box.
#### Another way to look at it: RefCollection

Conceptually, in object-capability (ocap) security terms: Box primitives don't directly point to the object. Instead, they are treated as keys in a mapping, owned by the `Box` constructor, which `Box.prototype.deref` uses. Note that a Realm may deny access to this built-in mapping by monkey-patching `Box.prototype.deref`, which is the only point of access.
Our initial attempt to provide a solution for this problem space was [RefCollection](https://github.com/rricard/proposal-refcollection), which is basically the same as `BoxMaker`, except that it uses Symbols instead of Box primitives.

When you call the Symbol constructor, you get a new Symbol; ultimately it can serve just as well to reference the Object as a Box primitive does. The RefCollection API allocates the Symbol for you, just like `Box.ref` creates a box. They're quite similar, when it comes down to it!

RefCollection is a bit simpler since it doesn't create a new primitive type, just reusing what we have. However, it's not possible to polyfill RefCollection without something new in the language: we run into the exact problem described with RefBookkeeper above, where it can fill up with junk that's not relevant anymore, if the same RefCollection keeps getting reused. (Refer to [this example](https://github.com/rricard/proposal-refcollection/issues/5#issuecomment-619104173) to see how you might want to continue using a RefCollection over time.)

There was a bit of debate about the details of the API surface of RefCollection, e.g., as it would relate to reusable templates. This seems to point towards exposing a low-level primitive that could address these memory management issues, leaving JavaScript libraries and frameworks to develop the best API surface.

### `symbol` as `WeakMap` keys

Another solution is to make `WeakMap` accept `symbol` as keys. This could enable improvements to the userland solution seen before and let us do global automatic bookkeeping of references:
We propose to make `WeakMap` accept `symbol` as keys. This would allow JavaScript libraries to implement their own RefCollection-like abstractions that are reusable (a single global mapping can be used, so only Records and Tuples need to be passed around, rather than the mapping itself) while not leaking memory over time.

```js
class RefBookkeeper {
#references = new WeakMap();
ref(obj) {
ref(obj) {
// (Simplified; we may want to return an existing symbol if it's already there)
const sym = Symbol();
this.#references.set(sym, obj);
return sym;
Expand All @@ -213,26 +274,29 @@ class RefBookkeeper {
}
globalThis.refs = new RefBookkeeper();

// Usage
const server = #{
port: 8080,
handler: refs.ref(function handler(req) { /* ... */ }),
};
refs.deref(server.handler)({ /* ... */ });
```
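
The example above depends on Record syntax and on engine support for Symbol keys in `WeakMap`. As a rough, runnable approximation, the sketch below feature-detects Symbol keys and falls back to a plain `Map` (which keeps every entry alive, so the fallback leaks). The names `symbolsAllowedAsWeakMapKeys` and `PortableRefBookkeeper` are invented for this illustration, and a frozen object stands in for the Record.

```js
// Returns true if this engine accepts a Symbol as a WeakMap key.
function symbolsAllowedAsWeakMapKeys() {
  try {
    new WeakMap().set(Symbol("probe"), {});
    return true;
  } catch {
    return false;
  }
}

// Hypothetical bookkeeper: weak when the engine allows it, strong otherwise.
class PortableRefBookkeeper {
  #references = symbolsAllowedAsWeakMapKeys() ? new WeakMap() : new Map();

  ref(obj) {
    const sym = Symbol();
    this.#references.set(sym, obj);
    return sym;
  }

  deref(sym) {
    return this.#references.get(sym);
  }
}

const portableRefs = new PortableRefBookkeeper();

// A frozen plain object stands in for the Record here, so the sketch
// runs without Record & Tuple syntax.
const serverConfig = Object.freeze({
  port: 8080,
  handler: portableRefs.ref(function handler(req) {
    return `handled ${req.path}`;
  }),
});

console.log(portableRefs.deref(serverConfig.handler)({ path: "/" }));
```

With the `WeakMap` branch, an entry can be collected once its Symbol is unreachable; with the `Map` fallback, entries live as long as the bookkeeper does, which is exactly the leak the proposal aims to avoid.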

See [earlier discussion](https://github.com/tc39/ecma262/issues/1194) on Symbols as WeakMap keys--it remains controversial and is not currently supported.
See [earlier discussion](https://github.com/tc39/ecma262/issues/1194) on Symbols as WeakMap keys--it remains controversial and is not currently supported.

### RefCollection
Some questions to discuss about Symbols as WeakMap keys:
- Can registered symbols be used as WeakMap keys? Some TC39 delegates have argued strongly in either direction. We see both "allowing" and "disallowing" as acceptable options.
- Allowing registered symbols doesn't seem so bad, since registered Symbols are analogous to Objects that are held alive for the lifetime of the Realm. In the context of a Realm that stays alive as long as there's JS running (e.g., on the Web, the Realm of a Worker), things like `Symbol.iterator` are analogous to primordials like `Object.prototype` and `Array.prototype`, and registered `Symbol.for()` symbols are analogous to properties of the global object, in terms of lifetime. Just because these will stay alive doesn't mean we disallow them as WeakMap keys.
- Prohibiting registered symbols doesn't seem so bad, since it's already readily observable whether a Symbol is registered, and it's not very useful to include these as WeakMap keys. Therefore, it's hard to see what practical or consistency problems the prohibition would create, or why it would be surprising (if there's a meaningful error message).
- We could support Symbols in WeakRefs and FinalizationRegistry, or not. It's not clear what the use cases are, but it would seem consistent with adding them as WeakMap keys.
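
To illustrate the point that registration is already observable, here is a small sketch built on the existing `Symbol.keyFor` function. The helper names `isRegisteredSymbol` and `allowedAsWeakKeyIfRegisteredProhibited` are hypothetical; the second one only models what a "prohibit registered symbols" rule might look like and is not proposed API.

```js
// Symbol.keyFor returns the registry key for symbols created with
// Symbol.for(), and undefined for every other symbol.
function isRegisteredSymbol(sym) {
  return typeof sym === "symbol" && Symbol.keyFor(sym) !== undefined;
}

// Hypothetical model of the "prohibit registered symbols" option:
// objects, functions, and non-registered symbols would be accepted.
function allowedAsWeakKeyIfRegisteredProhibited(value) {
  if (typeof value === "symbol") return !isRegisteredSymbol(value);
  return (typeof value === "object" && value !== null) ||
    typeof value === "function";
}

console.log(isRegisteredSymbol(Symbol.for("app.key"))); // registered
console.log(isRegisteredSymbol(Symbol("local")));       // not registered
console.log(isRegisteredSymbol(Symbol.iterator));       // not in the registry
```

Under the "allow everything" option, the second helper would simply accept all symbols.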

Our initial attempt to provide a solution for this problem space was [RefCollection](https://github.com/rricard/proposal-refcollection) which sits in between the two aforementioned solutions.
- A RefCollection could be built with WeakMaps with Symbol keys, but it's a somewhat higher-level mechanism.
- On the other hand, you could think of Box as a built-in per-Realm RefCollection, that just happens to use a new primitive type which is parallel to Symbol.
As a starting point, we propose that all Symbols be allowed as WeakMap keys, WeakSet entries, and in WeakRefs and FinalizationRegistry.

Since you could think of these each as implemented in terms of the other, more or less, they're equivalent at some level: they all attempt to leverage a primitive's identity in order to let an object be accessed, mediated through a mapping that only holds that object alive if the unforgeable primitive exists.
We plan to propose "Symbols as WeakMap keys" for Stage 1 at the June 2020 TC39 meeting. By going through the stage process, we hope we can develop consensus on answers to these questions little by little.

## Summing up

We think that `Box` has the most ergonomic surface area of these three options, which is why this document puts that solution front and center, ahead of Symbols in WeakMaps and RefCollection. At the same time, the userspace solutions seem sufficient for many/most use cases; we believe that Records and Tuples are very useful without Box or any other built-in mechanism for referencing objects from primitives, and therefore it makes sense to proceed with Records and Tuples independently.
We think that adding Symbols as WeakMap keys is a useful, minimal primitive enabling Records and Tuples to reference Objects while respecting the constraints imposed by the goal to support membrane-based isolation within a single Realm. At the same time, the userspace solutions seem sufficient for many/most use cases; we believe that Records and Tuples are very useful without any additional mechanism for referencing objects from primitives, and therefore it makes sense to proceed with Records and Tuples independently of this proposal.

[rtp]: https://github.com/tc39/proposal-record-tuple
[rcp]: https://github.com/rricard/proposal-refcollection
