Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new approach to array-ish classes #796

Open
domenic opened this issue Sep 12, 2019 · 19 comments
Open

A new approach to array-ish classes #796

domenic opened this issue Sep 12, 2019 · 19 comments

Comments

@domenic
Copy link
Member

domenic commented Sep 12, 2019

For a long time we've had some issues with arrays/indexed getter types on the platform. I am beginning to think we should solve this in Web IDL. The below long write-up walks through my thoughts.

/cc @tabatkins @annevk @syg @rakina as folks I've discussed this with previously.

Background

The problem

We've historically had trouble representing array-ish types on the platform. Specifically, there are two cases that are very common in web specs, but JavaScript provides no good solution for:

  • Readonly arrays: i.e., arrays whose contents can be changed by the creator of the array, but not by the consumer of the array. Examples where this semantic is useful are navigator.languages, document.styleSheets, or node.childNodes.

  • React-to-able arrays: i.e., arrays which are mutable by the consumer, but any mutations need to be validated (usually for type-checking), and potentially trigger some side effect. xamples where this semantic is useful are element.ariaDescribedByElements or document.adoptedStyleSheets.

JavaScript does provide for two cases. Those are: mutable arrays that you don't need to react to (just Array), and completely immutable arrays (frozen Arrays, i.e. Arrays which have had Object.freeze() called on them). But the above two cases are not in reach.

Historical approaches

We have tried two main solutions to these use cases:

  • Array-ish classes, i.e. Web IDL interfaces with indexed getters and possibly setters. Examples include StyleSheetList and NodeList for the readonly case, or HTMLOptionsCollection for the react-to-able case.

    These suck in various ways, in roughly decreasing order of terribleness:

    • They do not have any useful array methods like map(), filter(), find(), ...
    • They often have weird methods that duplicate array behavior, like add(), remove(), or item()
    • They are not instanceof Array
    • Array.isArray() returns false on them
    • someArray.concat(oneOfThose) does not concatenate, but instead adds oneOfThose as a single new item
  • The FrozenArray type. Examples include navigator.languages and document.adoptedStyleSheets. This is generally seen as the preferred modern pattern. In particular, all of the cons of the array-ish classes are avoided.

    However, frozen arrays are suboptimal in the following ways:

    • For the readonly case: references you obtain from the getter can get stale. That is, if you do const langs = navigator.languages, langs will not get updated if the UA's languages change. Instead, navigator.languages will start returning a new frozen array instance, such that navigator.languages !== langs. This can be somewhat of a footgun.

    • For the react-to-able case: mutating the arrays must be done via the setter. E.g. to insert, you'd do something like document.adoptedStyleSheets = [...document.adoptedStyleSheets, newSheet]. This is less pleasant than being able to use normal array mutation methods, e.g. document.adoptedStyleSheets.push(newSheet). (It's also less efficient, although in most cases the small JS-object-related cost here is outweighed by the cost of the actual reaction, e.g. style recalc or DOM mutation.)

Solutions

Ideal, TC39 solution

I think the best outcome here would be if TC39 enabled array creators to install set hooks on the array. Then you'd implement readonly arrays via something like

const arr = Array.withSetHook(() => {
  if (!specialBooleanOnlyIControl) {
    throw new TypeError("This array is readonly");
  }
});

// In use code:
arr.push(1); // throws

// In array creator code:
specialBooleanOnlyIControl = true;
arr.push(1);
specialBooleanOnlyIControl = false;

and you'd implement react-to-able arrays via something like

const arr = Array.withSetHook((value, i) => {
  if (isWrongType(value)) {
    throw new TypeError("Can only accept CSSStyleSheet instances or whatever");
  }
  
  reactToTheNewCSSStyleSheet(value);
});

arr.push(1);                  // throws
arr.push(new CSStyleSheet()); // works and gets reacted to

I mentioned this to @syg and he was hesitant from an implementation complexity perspective in the JS engine. He suggested we just use proxies instead. Which brings us to...

Solving this in Web IDL

The essence of this solution is to basically do the "array-ish classes" thing, but better. If we just design two specific array-ish classes, designed from the very beginning to not be terrible, we can get pretty close to optimal. And then instead of spec authors continuing to re-invent new bad array-ish classes using indexed getters/setters, we would just tell them to always use these two, and get the benefits of all our design work.

We would define two types in Web IDL: ReadonlyArray and ReactToAbleArray. (OK, we probably need a better name for the last one.) One version of this would be something like

interface ReadonlyArrayController {
  void splice(unsigned long long start, unsigned long long deleteCount, any... items);
  // we could have more mutating methods here if we wanted
};
callback ReadonlyArrayExecutor = void (ReadonlyArrayController controller);

[LegacyArrayClass]
interface ReadonlyArray {
  constructor(ReadonlyArrayExecutor executor);
  getter any (unsigned long long index);
};
callback ReactToAbleArrayReactor = void (any item, unsigned long long index);

[LegacyArrayClass]
interface ReactToAbleArray {
  constructor(ReactToAbleArrayReactor reactor);
  getter any (unsigned long long index);
  setter void (unsigned long long index, any value);
};

We probably want to set the @@species of these to Array, so that oneOfThese.map(...) and oneOfThese.filter(...) produce Arrays instead of ReadonlyArrays or ReactToAbleArrays.

We probably also to set the @@isConcatSpreadable of these to true, so that array.concat(oneOfThese) properly concatenates.

Note that the above designs will still fail Array.isArray(). There's no way to pass that check without getting changes to the 262 spec.

Once we defined these types, we could evangelize their usage everywhere. We could likely also upgrade many of the existing cases on the platform to use them, which would transparently give web developers more useful array types, e.g. allowing document.adoptedStyleSheets.push() or maybe even nodeList.filter().

Variants and tweaks

We could consolidate these into a single type, the ReactToAbleArray. It'd be more awkward to use (you need the specialBooleanOnlyIControl technique), but at least for spec authors we could paper over that with some Web IDL-provided boilerplate phrases. The web developer-facing differences when consuming the array would be relatively minimal; worse error messages or debugging experience would be the only thing I can think of.

We probably should add a type parameter to these, e.g. ReactToAbleArray<T>, since type-filtering is so prevalent on the web platform.

Instead of using [LegacyArrayClass], i.e. instead of inheriting from Array.prototype and using all its methods directly, we could install appropriate methods directly on the two types. (Including omitting mutation methods from ReadonlyArray.) This would increase our maintenance burden in the Web IDL spec a bit, requiring us to stay in sync with 262. And it would make these classes fail instanceof Array checks. But it could be a bit nicer for implementations in some ways, I suspect.

@bzbarsky
Copy link
Collaborator

Thank you for thinking about this!

It's not quite clear to me how the executor bits fit into the proposal. Is there an explanation somewhere?

Presumably there should be some length getters (and setters; see below) on these interfaces.

For ReadonlyArray, it seems like the main things we want are:

  1. Have Array.prototype on the proto chain.
  2. Set @@isConcatSpreadable

@@species thing is not an issue; the array methods in ES only look at @@species if the object tests true for isArray; otherwise they create a vanilla Array, which is the behavior we want.

In theory we could do this for everything with an indexed getter already, modulo possible web compat issues for existing types, right? So the main benefits (and they're useful benefits!) of ReadonlyArray are that we know there is no web compat issue and we can avoid having a silly item method on it. Is that a correct summary of the ReadonlyArray situation?

For ReactToAbleArray, same things apply plus we want a writable length that allows setting length to anything <= current length (e.g. Array.prototype.push sets the length after doing the index-based assignment, so we need to support at least setting to current value to support push if we use the Array methods).

At least on the Gecko side, I suspect implementation complexity is much lower if we just do [LegacyArrayClass] instead of reimplementing all of that stuff. But maybe I'm missing something; I'd like to know what implementation benefits would come from the reimplementation approach.

Fundamentally, if we just reintroduce [LegacyArrayClass] (which is funny, given Gecko had support for it and was using it, and people told us they would never implement it so we should remove it!) and define that it sets @@isConcatSpreadable, this is all something specs could do themselves; the annoying part is the proliferation of interfaces involved and the fact that they might not do it quite right, yes?

@domenic
Copy link
Member Author

domenic commented Sep 13, 2019

It's not quite clear to me how the executor bits fit into the proposal. Is there an explanation somewhere?

No explanation, so here's a quick one. The intention is basically to allow these to be usable by web developers in the same way as they would be for spec authors. The executor in particular is an instance of the revealing constructor pattern. The usage would be something like

let controller;
const arr = new ReadonlyArray(c => {
  c.splice(0, 0, "a", "b");
  controller = c;
});

// later
controller.splice(0, 1);

and

const arr = new ReactToAbleArray((item, index) => {
  if (isWrongType(value)) {
    throw new TypeError("Can only accept CSSStyleSheet instances or whatever");
  }
  
  reactToTheNewCSSStyleSheet(value);
});

In theory we could do this for everything with an indexed getter already, modulo possible web compat issues for existing types, right? So the main benefits (and they're useful benefits!) of ReadonlyArray are that we know there is no web compat issue and we can avoid having a silly item method on it. Is that a correct summary of the ReadonlyArray situation?

I think the main other benefit is providing a clear path forward for future cases in the platform. Without something in the Web IDL spec, people would continue to need to invent them per-spec. Also we could wrap up some of the boilerplate for folks and provide sanctioned patterns for manipulating the contents of the collection.

Indeed, my OP was mostly thinking about what class to use for future properties. Retrofitting existing classes is probably also a reasonable thing to do, and the strategy there could look like one of:

  • Switch specs to use these new classes, with potential web-compat pains (e.g. .constructor.name would change, item() methods would disappear, isConcatSpreadable behavior would change)
  • Define some way of "mixing in" the appropriate behavior, e.g. legacyarraylike<T>, which would minimize the number of potential web-compat problems
  • Just go through the spec systematically and upgrade them on a case-by-case bases

However, I'm unsure how much appetite there is for upgrading existing APIs, especially if there are compat risks. (I haven't run any of this by Chrome bindings team yet; instead it's motivated by the folks working on adopedStyleSheets.) So that's why I was focused on future cases for now.

Also, there is the potential benefit of providing these capabilities to web developers (via the constructor()s). That is not as urgent but it seems like the right thing to do.

At least on the Gecko side, I suspect implementation complexity is much lower if we just do [LegacyArrayClass] instead of reimplementing all of that stuff. But maybe I'm missing something; I'd like to know what implementation benefits would come from the reimplementation approach.

I was mostly speculating, so it's probably not worth getting in to. I can't even remember very well what my speculation was :(.

@bzbarsky
Copy link
Collaborator

The intention is basically to allow these to be usable by web developers in the same way as they would be for spec authors.

Ah, ok. The details of what those callbacks get we should think about, but the general idea makes sense.

I think the main other benefit is providing a clear path forward for future cases in the platform.

Right, makes sense.

If we wanted typed readonly arrays, by the way, we could roll all this up into a parametrized declaration (a la iterable) that creates an anonymous indexed getter, changes the proto chain, adds a constructor, adds @@isConcatSpreadable, etc. One difference there is that we would also allow adding other methods on the array type; whether this is a benefit or a drawback is an interesting question.

I guess this is basically the legacyarraylike<T> possibility you mention above?

However, I'm unsure how much appetite there is for upgrading existing APIs,

I, personally, would really like it if the return value of querySelectorAll did not suck to use... I agree that there are compat risks there, though, and it would have to be done carefully. But I think the benefit would be quite worthwhile if we can manage it.

Unfortunately, for legacy APIs we may be able to do things like change the proto chain but not add @@isConcatSpreadable or the other way around or whatnot... That sort of argues for having separate primitives for these pieces if we want to address legacy APIs, but we can create the new thing first as a monolith and then think about refactoring it too as use cases come up.

@tabatkins
Copy link
Contributor

As you know, I'm quite excited about this!

So, it looks like TypedOM would change like:

interface CSSMathSum : CSSMathValue {
-     readonly attribute CSSNumericArray values;
+     readonly attribute ReadonlyArray<CSSNumericValue> values;
};

right? CSSNumericArray is just an interface with indexed getters/setters that typechecks its entries as CSSNumericValue.

@bzbarsky
Copy link
Collaborator

That would presumable be a ReactToAbleArray, not a ReadonlyArray. But yes, that would be the idea.

@domenic
Copy link
Member Author

domenic commented Sep 17, 2019

Does anyone have a non-terrible name for ReactToAbleArray?

@heycam
Copy link
Collaborator

heycam commented Sep 17, 2019

ObservableArray?

@bzbarsky
Copy link
Collaborator

I guess I actually have one other question about the proposed setup. Would each typed instantiation of these array types have its own proto, or would they all just use Array.prototype? The proposal sounds like the former, more or less, right?

@domenic
Copy link
Member Author

domenic commented Sep 17, 2019

Good question. I think my intention was that they would all use Array.prototype? Or maybe there would be single ReadonlyArray.prototype and ObservableArray.prototype.

Edit: yes, there would need to be a ReadonlyArray.prototypeand ObservableArray.prototype to support the constructor thing I proposed.

@tabatkins
Copy link
Contributor

That would presumable be a ReactToAbleArray, not a ReadonlyArray. But yes, that would be the idea.

Actually TypedOM needs ReadonlyArray for that case specifically; numeric operations form a tree (or actually a graph if you're doing weird stuff with reusing values) and typechecking (in the CSS type sense, not the JS/IDL type sense) is both strict and kinda expensive.

But yes, the other places where I'm doing arraylikes by hand (CSSTransformValue and CSSUnparsedValue) are Observable.


The reason I was asking about NumericArray first, tho, is that those other two cases actually are quite different - they're CSSStyleValue subclasses. Is that sort of thing going to be supported? Or must these arraylikes definitely be subclasses of ReadonlyArray/ObservableArray?

@domenic
Copy link
Member Author

domenic commented Sep 17, 2019

I think the next steps here are for me to write up a Web IDL pull request, plus some pull requests to other specs (e.g. CSS Typed OM, Constructible Style Sheets) that show this in action. I'll focus on "new" arraylikes now, i.e. ones with minimal compat concerns and no extra methods like item(). If we can get that working smoothly, then we can work on porting it back to trickier cases, which might require the legacyarraylike<T> type thing.

Any thoughts or concerns about those steps? E.g. a worry from folks that we should work on the legacy-retrofit-friendly thing at the same time as the new classes, since they might share a good amount of spec infra?

@tabatkins
Copy link
Contributor

We should probably work on both the extends ReadonlyArray and arraylike<>; approaches at the same time. But they don't need to block each other, I don't think; I'm happy for the extends approach to go faster.

@tabatkins
Copy link
Contributor

(Also, I'm happy to investigate switching CSSTransformValue and CSSUnparsedValue away from arraylike<>; and towards having an ObservableArray member instead. When I had to do everything manually anyway, their current spec was just easier to do, as it meant I didn't have to design two more single-use interfaces. It's likely that they're still changeable at this point.)

@syg
Copy link
Contributor

syg commented Oct 16, 2019

@erights presented a now-stage 1 proposal on read-only collections at the October TC39. It'd be nice to align here, so that the semantics of the WebIDL ReadOnlyArray can eventually be switched out for the in-language version if the proposal happens.

ObservableArray will probably not have any language equivalents for the foreseeable future.

Edit: Got the meeting date wrong.

@domenic
Copy link
Member Author

domenic commented Jan 23, 2020

Note that the above designs will still fail Array.isArray(). There's no way to pass that check without getting changes to the 262 spec.

For completeness, I think this could be hacked around by making ReadonlyArray and ObservableArrays exotic proxy objects around inner Arrays, since Array.isArray(new Proxy([], {})) is true (for some reason). However, I have a very hard time seeing this play well with Web IDL binding infrastructure in implementations, which currently generates platform objects with their own brands, that would not have the "is a JS-spec Proxy" brand and a [[ProxyTarget]] hooked up.

I guess we could define them as exotic proxy objects in the spec, with a [[ProxyTarget]] and everything, and have implementations do some kind of behind-the-scenes hack on top of their platform objects to get the same observable behavior. But, this seems pretty hacky...

@domenic
Copy link
Member Author

domenic commented Jan 24, 2020

I've put up a pull request for observable arrays at #836. You can see a summary including examples at https://pr-preview.s3.amazonaws.com/heycam/webidl/pull/836.html#idl-observable-array

@domenic
Copy link
Member Author

domenic commented Jan 24, 2020

In #836 I've started a discussion on some of the details of that particular proposal. In this thread I'll step back a bit and note some features of the current proposal that we might want to rethink.

The main one is that we could do more work to make observable arrays transparently array-like. In particular, right now ObservableArray is defined as a class that inherits from Array, so:

someObservableArray.constructor !== Array; // it's ObservableArray
someObservableArray.prototype !== Array.prototype; // it's ObservableArray.prototype

Additionally, as discussed previously, since this does not meet the definition of "Array exotic object":

Array.isArray(someObservableArray) === false;
JSON.stringify(someObservableArray) === "[object ObservableArray]";

It's possible to fix all of these, by instead specifying observable arrays a legit JS Proxy around an inner Array instance. Note that this is different than the current proposal, which uses proxy-like techniques of overriding the internal methods [[GetOwnProperty]], [[Delete]], [[DefineOwnProperty]], etc. We would need to instead literally use the Proxy definitions from the 262 spec, including those specific internal methods, [[ProxyTarget]], etc.

The reason for the current approach is that the ObservableArray-as-a-distinct-class approach reuses a lot of existing Web IDL machinery. The only things the bindings team would need to do are handling the setters/getters (in a fashion similar to what we envision for FrozenArray anyway, per #810 (comment)), and a few minor tweaks on top of platform object allocation. Whereas, the Proxy-based approach would require writing some fairly custom and intricate bindings code, I believe.

Nevertheless, I'm leaning toward the Proxy-like approach anyway, for priority of constituencies reasons. That would also allow us to remove the constructor (it's just a proxied array; no constructor needed!), and thus defer some of the questions in #836 (comment).

@annevk
Copy link
Member

annevk commented Dec 18, 2020

What's remaining here now that #840 is merged?

@domenic
Copy link
Member Author

domenic commented Dec 18, 2020

Although ObservableArrays are usable for the ReadonlyArray case, they are a bit awkward. I believe at the very least you need to specify "set the indexed value" and "delete the indexed value" as throwing exceptions. So IMO it'd be good to add a ReadonlyArray sugar which does that for you, plus any other steps which I'm missing in this morning's quick skim. (Also we could then restrict ReadonlyArrays to only work with readonly attributes.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

6 participants