Proposal: DOMChangeList #270

wycats opened this Issue Jun 20, 2016 · 62 comments

wycats commented Jun 20, 2016

DOMChangeList and DOMTreeConstruction

Motivation

There are three high-level motivations for this API:

  1. To make it clearer and less error-prone to apply a sequence of DOM operations at once, supporting the full gamut of mutations to Elements and Nodes.
  2. For efficiency, to provide an API for constructing a sequence of DOM operations that can be:
    1. constructed in a worker and transferred to the UI thread
    2. constructed with a minimum of allocations
    3. applied in one shot without interleaved user code
  3. To support, in one API, the superset of trees that can be produced using the HTML parser and using the DOM API (the HTML parser supports more tag names, while the DOM API supports more trees, such as custom elements nested inside a table).

This is an initial, minimal version of this API, which could be expanded over time with more capabilities. It should always maintain its goals of predictable, allocation-light performance, focusing on giving user-space abstractions the capabilities they need for maximum performance.

Concepts

NodeToken

A NodeToken is an opaque value that represents a Node. It can be efficiently transferred from one worker to another.

It serves two purposes:

  • To represent a Node that already exists in another worker, and can serve as the starting point of a series of mutations expressed in a DOMChangeList.
  • To represent an intermediate Node that was produced as part of the process of building a DOMChangeList.
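Although the proposal leaves NodeToken opaque, one way to picture it (purely an illustration; `TokenTable` and its method names are invented, not part of the proposal) is as a small integer indexing into an engine-side table. Because the token is just a number, it can be written into a transferable byte buffer and moved between workers cheaply:

```typescript
// Hypothetical sketch: NodeTokens as plain integers. An engine would keep a
// side table mapping token -> node; no per-token JavaScript object is needed.
type NodeToken = number;

class TokenTable<T> {
  private nodes: (T | undefined)[] = [];

  // Hand out the next integer token; nothing is allocated per token
  // beyond one array slot.
  allocate(): NodeToken {
    this.nodes.push(undefined);
    return this.nodes.length - 1;
  }

  // At application time, bind the token to the real node it produced.
  bind(token: NodeToken, node: T): void {
    this.nodes[token] = node;
  }

  // AppliedChanges.getNode() would reify a token back into a node like this.
  resolve(token: NodeToken): T | undefined {
    return this.nodes[token];
  }
}
```

This also suggests why the proposal ties token lifetime to AppliedChanges: once that object is GCed, the engine can drop the whole side table at once.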

Applying a Change

The intent of this API is to have these performance characteristics:

  • Creating a change list (DOMTreeConstruction or DOMChangeList) significantly reduces GC-managed allocations compared to the current DOM APIs.
  • Applying a change list should not create intermediate JavaScript wrappers for the DOM objects it creates, which should reduce costs.
  • It is possible to run the JavaScript code necessary to construct a change list in a worker, and apply it on the UI thread with minimal additional work or allocations in JavaScript.
  • The API creates a single immutable blob of instructions to pass to the engine, which is intended to avoid JavaScript side effects while the application proceeds. (as Boris Zbarsky said in his comment about proposals in this space, "that would simplify both specification and implementation (in the sense of not having to spell out a specific processing algorithm, allowing parallelized implementation, etc).")
  • It is reasonable to assume that the builder APIs could be exposed to WebAssembly.

If there is some reason that a concrete implementation of this design might not be able to accomplish these goals, it's almost certainly something we should discuss.
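To make the "single immutable blob of instructions" idea concrete, a change list's backing store could be nothing more than a flat array of opcodes plus a string table. The encoding below is entirely hypothetical (`Op`, `InstructionBuffer`, and `finalize` are invented names), but it shows how a builder can avoid GC-managed allocations and produce something transferable:

```typescript
// Hypothetical encoding: operations are appended as integers into a growable
// array, with strings interned into a side table. finalize() freezes the
// result into a typed array whose underlying ArrayBuffer could be listed in
// postMessage's transfer list, moving it to the UI thread without a copy.
const Op = { OpenElement: 1, SetAttribute: 2, CloseElement: 3 } as const;

class InstructionBuffer {
  private ops: number[] = [];
  private strings: string[] = [];

  // Reuse the index of a previously seen string rather than storing it twice.
  private intern(s: string): number {
    const i = this.strings.indexOf(s);
    if (i !== -1) return i;
    this.strings.push(s);
    return this.strings.length - 1;
  }

  openElement(name: string): void {
    this.ops.push(Op.OpenElement, this.intern(name));
  }

  setAttribute(name: string, value: string): void {
    this.ops.push(Op.SetAttribute, this.intern(name), this.intern(value));
  }

  closeElement(): void {
    this.ops.push(Op.CloseElement);
  }

  finalize(): { ops: Uint32Array; strings: string[] } {
    return { ops: Uint32Array.from(this.ops), strings: this.strings };
  }
}
```

Once finalized, the instruction stream is immutable from the script's point of view, which is what lets the engine apply it without re-entering user code.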

Details

DOMTreeConstruction

Note: The use of TypeScript generics notation in the details of this proposal is not intended to imply any additional type checking. It is simply a way to annotate, for clarity, which kind of node a particular NodeToken is intended to produce. This proposal expects checking at the point of application (e.g. appending a node to a non-element node doesn't make sense and would be an error at the time of application). At present, this proposal does not assume that errors in the change list cause previously applied operations to be rolled back; that is listed as an open question below.

// you can create a NodeToken in the UI thread and transfer it into
// a worker that constructs the DOMChangeList or DOMTreeConstruction.
partial interface Node {
  getToken(): NodeToken<this>;  
}

interface Bounds {
  firstNode: NodeToken<Node>;
  lastNode: NodeToken<Node>;
}

partial interface Element {
  // this has the same semantics as insertBefore, but takes a `DOMTreeConstruction`
  insertTreeBefore(element: Element, tree: DOMTreeConstruction, reference: Node): Promise<AppliedChanges>;
}

// The intent of this object is to allow engines to clean up any bookkeeping maps they
// need to track `NodeToken`s once the `AppliedChanges` is GCed.
interface AppliedChanges {
  // this allows code that created a ChangeList to get the actual
  // node they were creating so they can add listeners or change
  // the node later.
  getNode<T extends Node>(t: NodeToken<T>): T;
}

// DOMTreeConstruction is transferrable, and is backed by raw bytes. This
// allows trees to be constructed in workers.
interface DOMTreeConstruction {
  openElement(name: string, ns: string = "html"): NodeToken<Element>;
  closeElement(): void;

  setAttribute(name: string, value: string = "", ns: string = ""): void;
  appendText(text: string): NodeToken<Text>;
  appendComment(text: string): NodeToken<Comment>;
  appendHTML(html: string): Bounds;
}
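To show how the builder surface above might be driven, here is a toy, user-space stand-in that merely records the tree it is asked to build. `ToyTreeConstruction` and its `log` format are invented for illustration; the real interface would be engine-backed and would record into an opaque buffer rather than strings:

```typescript
// Toy stand-in for DOMTreeConstruction: tokens are plain integers and the
// "tree" is a recorded nesting of open/close calls, so the shape of the API
// can be exercised without a browser engine.
class ToyTreeConstruction {
  readonly log: string[] = [];
  private nextToken = 0;
  private depth = 0;

  openElement(name: string, ns: string = "html"): number {
    this.log.push("  ".repeat(this.depth) + `<${name}>`);
    this.depth++;
    return this.nextToken++;
  }

  closeElement(): void {
    this.depth--;
    this.log.push("  ".repeat(this.depth) + "</>");
  }

  // In the proposal, setAttribute applies to the currently open element.
  setAttribute(name: string, value: string = ""): void {
    this.log.push("  ".repeat(this.depth) + `@${name}="${value}"`);
  }

  appendText(text: string): number {
    this.log.push("  ".repeat(this.depth) + JSON.stringify(text));
    return this.nextToken++;
  }
}

// Building <p class="intro">hello</p>:
const tree = new ToyTreeConstruction();
const p = tree.openElement("p");
tree.setAttribute("class", "intro");
tree.appendText("hello");
tree.closeElement();
```

Note how setAttribute takes no target: the builder is a cursor-style API where attributes implicitly apply to the most recently opened element, which is part of what keeps construction allocation-light.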

DOMChangeList

partial interface Document {
  applyChanges(changes: DOMChangeList): Promise<AppliedChanges>;
}

// DOMChangeList is transferrable, and is backed by raw bytes. This
// allows change lists to be constructed in workers.
interface DOMChangeList {
  /// traverse
  nextSibling(n: NodeToken<Node>): NodeToken<Node | null>;
  previousSibling(n: NodeToken<Node>): NodeToken<Node | null>;
  firstChild(n: NodeToken<Node>): NodeToken<Node | null>;
  lastChild(n: NodeToken<Node>): NodeToken<Node | null>;
  parentNode(n: NodeToken<Node>): NodeToken<Element | null>;

  /// create
  createElement(name: string, ns: string = 'html'): NodeToken<Element>;
  createText(text: string): NodeToken<Text>;
  createComment(text: string): NodeToken<Comment>;

  /// update
  setAttribute(el: NodeToken<Element>, name: string, value: string = '', ns: string = ''): void;
  removeAttribute(el: NodeToken<Element>, name: string, ns: string = ''): void;
  updateTextData(n: NodeToken<Text>, text: string): void;
  updateCommentData(n: NodeToken<Comment>, text: string): void;
  remove(n: NodeToken<Node>): void;

  /// append
  appendChild(el: NodeToken<Element>, node: NodeToken<Node>): void;
  appendTree(el: NodeToken<Element>, tree: DOMTreeConstruction): void;

  // this unifies the various HTML-based APIs in the DOM; the return value is
  // a token corresponding to the start and a token corresponding to the end of
  // the inserted tree which can be reified after application.
  //
  // IMPORTANT NOTE: The HTML is parsed during application, not when
  // it is added to the change list.
  insertHTMLBefore(node: NodeToken<Node>, html: string): [NodeToken, NodeToken];
}

Example

Using the usual DOM API:

let article = document.getElementById('main-article');
let text = article.firstChild;
text.data = "Hello world";
article.setAttribute("data-active", "yes");
article.removeAttribute("data-pending");

Using the ChangeList API:

let articleToken = document.getElementById('main-article').getToken();

let changes = new DOMChangeList();
let textToken = changes.firstChild(articleToken);
changes.updateTextData(textToken, "Hello world");
changes.setAttribute(articleToken, "data-active", "yes");
changes.removeAttribute(articleToken, "data-pending");

document.applyChanges(changes);
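The queue-then-commit semantics of the example above can be sketched in user space: operations only record themselves when queued, and the tree mutates in one uninterruptible pass at apply time. `ToyChangeList` and `ToyNode` are invented names; the real API would be engine-backed, asynchronous, and would not expose nodes as JavaScript objects during application:

```typescript
// Toy queue-then-commit: each method queues a closure over integer tokens;
// nothing touches the tree until apply() runs the whole queue in one pass,
// with no opportunity for user script to interleave between operations.
type Token = number;

interface ToyNode {
  attrs: Map<string, string>;
  text: string;
}

class ToyChangeList {
  private queue: ((nodes: ToyNode[]) => void)[] = [];

  setAttribute(el: Token, name: string, value: string = ""): void {
    this.queue.push(nodes => nodes[el].attrs.set(name, value));
  }

  removeAttribute(el: Token, name: string): void {
    this.queue.push(nodes => nodes[el].attrs.delete(name));
  }

  updateTextData(n: Token, text: string): void {
    this.queue.push(nodes => { nodes[n].text = text; });
  }

  // Stand-in for document.applyChanges().
  apply(nodes: ToyNode[]): void {
    for (const op of this.queue) op(nodes);
  }
}

const article: ToyNode = { attrs: new Map([["data-pending", "yes"]]), text: "" };
const changes = new ToyChangeList();
changes.updateTextData(0, "Hello world");
changes.setAttribute(0, "data-active", "yes");
changes.removeAttribute(0, "data-pending");

// Nothing has happened yet; the tree mutates only at apply time.
changes.apply([article]);
```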

FAQ

Is this actually faster?

  1. Don't you still have to create all of the DOM nodes anyway? If so, why is it cheaper?
  2. Isn't DOM pretty fast in modern browsers already?

The intent of this API is to create a low-level interface that is as close as possible to the underlying implementations. It attempts to avoid introducing new costs while reducing a number of paper-cuts that exist in today's usage.

  1. This API creates DOM nodes in the engine, but it does not need to create JavaScript wrappers. Experiments with deep cloneNode show that skipping those wrappers provides a performance benefit, but cloneNode() can't satisfy as many use-cases as this API.
  2. It allows the construction of the set of mutations to occur separately from their application (or even in a worker), keeping the 60fps-sensitive work on the UI thread to a minimum.
  3. Since applying changes is asynchronous, the full change list can be applied in batches that avoid blocking interaction (especially scroll). If the browser reaches its budget, it can interleave some work to keep the UI interactive and pick up the mutation process afterward. In short, the API should allow browsers to experiment with more scheduling strategies.
  4. It encourages good staging practices, eliminating some of the major causes of layout thrash (see the next section).

Isn't the real issue that people are interleaving DOM manipulation and layout?

That is certainly a major issue, and this API puts developers on the path to success by encouraging them to stage DOM manipulation work separately from APIs that can trigger painting or layout.

Because the API guarantees that no user script can interleave during the application of changes, there is no way to "mess up" and trigger an immediate flush of any deferred work.

Unresolved Questions

  1. The current state of this API allows failures to occur during the processing of a change list, and does not require engines to roll back earlier changes (rolling back changes like "remove an iframe" may not be trivial to implement). Would engines prefer to roll back changes?
  2. Should we support APIs like classList and the style property through this API? It may be difficult to represent these kinds of changes with the operations already proposed (since this API does not allow direct imperative access to the DOM), and a few additional APIs probably wouldn't do damage to the constraints.
  3. Are there other optimizations that could be performed by engines? For example, Luke Wagner suggested that a parameterized version of this API could work like "prepared statements" in SQL, allowing engines to do up-front work to optimize the access and mutation patterns for application. What can we do to make this API more hospitable to hypothetical optimizations like those?
KrisSiegel commented Jun 20, 2016

This introduces quite a few concepts for manipulating the DOM for something that's ultimately an optimization. It seems like a slightly improved DOM API could accomplish the exact same thing with only minor additions (perhaps ~~a suspend and resume layout plus~~ a way to mix elements so you can generate them external to the current thread and have the rendering engine apply them at once, similar to your proposal but significantly less complex).


wycats commented Jun 20, 2016

> This introduces quite a few concepts for manipulating the DOM for something that's ultimately an optimization.

> ... perhaps a suspend and resume layout ...

What would the existing APIs that trigger a layout flush do while layout was paused? Would pausing layout force an immediate layout flush to update all layout caches? If not, how would they compute the "old" layout once mutations have begun?

One of my concerns with proposals to augment the existing DOM APIs with some sort of transaction is that you have to decide what happens to all of the DOM's APIs while the transaction is in flight. In contrast, the proposed API is a closed set of operations, disallowing any other interleaved operations, making it possible to understand the complexity of the entire API.


domenic commented Jun 20, 2016

Great to see this! I'm coming at this in comparison to Dru's proposal at https://github.com/drufball/async-append/blob/master/EXPLAINER.md, so let me share some thoughts:

As far as I can tell this proposal focuses mostly on trying to minimize allocations (but see below) and on providing a proxy API that can be used in a worker, instead of focusing on allowing chunking and scheduling of parsing/style/layout/paint as Dru's proposal does. I can't tell at first glance whether it can serve both purposes, or whether its design prevents that. There's a brief paragraph under "Is this actually faster?" number 3 that indicates it is intended to be compatible, but I'll defer to the Blink engineers.

I think the proposal might have a misapprehension about how DOM nodes and wrappers are related. Remember that wrappers are not created until they are accessed. With this in mind it seems like the proposal just substitutes one set of wrapper allocations (NodeTokens and DOMTreeConstructions) for another (Nodes and DocumentFragments). Looking at the example code seems to bear this out; counting wrapper allocations it seems to have the same amount as normal DOM manipulation code. (And remember a wrapper is just a pointer; a wrapper to a NodeToken and a wrapper to a Node are both the same cost.) If you assume the DOM nodes will eventually exist anyway, as Yehuda discusses, Yehuda's proposal actually seems to have more allocations: NodeToken backing + NodeToken wrapper + Node backing, vs. Dru's proposal which has Node wrapper + Node backing.

One concern I have about the idea of running these things in a worker is that the IPC involved in getting back to the main thread (in the case of hundreds or thousands of nodes/nodetokens) would probably be a significant cost. I'm not sure whether it would be greater than or less than the cost of just doing the work on the main thread; similarly I am not sure whether the IPC would be significant enough to cause jank. But maybe all you have to IPC is a single memory location of the queue in which case it is presumably fine.

So, going through the three high-level motivations Yehuda lists:

  1. Both proposals make it clear and less error prone to apply a sequence of DOM operations at once, through similar queue-then-commit systems. Both support the full gamut of mutations to Elements and Nodes, with Dru's doing so by just allowing you to use those normal DOM APIs and Yehuda's by creating new versions of (most of) them that operate on analogous data structures.
  2. This bullet is weird since it doesn't list actual use cases, but instead specific ideas as to how good performance would be achieved, despite lack of data showing that these proposed solutions actually help. But anyway, going through each in turn:
    1. Yehuda's proposal allows the sequence of operations to be constructed in a worker and transferred; Dru's does not.
    2. Yehuda's proposal has slightly more allocations, as discussed above.
    3. Both have the queue-then-commit structure.
  3. Both support the union of the trees that can be produced using the HTML parser and using the DOM API, Dru's by using the HTML parser directly (innerHTML/outerHTML) and Yehuda's presumably by e.g. allowing openElement to take more element names than createElement does (although I couldn't find where this is specified). We could of course try to expand createElement again; nobody's had the motivation so far, but maybe this will be the last straw.

Also going through the "Applying a Change" bullets:

  • Neither proposal reduces GC-managed allocations compared to current DOM APIs, as discussed above; Yehuda's slightly increases it.
  • Both proposals avoid creating intermediate JavaScript wrappers in most cases (e.g. using innerHTML, textContent, setAttribute). Both proposals do not avoid creating wrappers for other cases; e.g. creating an element requires creating a NodeToken wrapper in Yehuda's and a Node wrapper in Dru's.
  • As above, Yehuda's proposal works in a worker with minimal extra work and a slight amount of extra allocations in JavaScript at commit time (but perhaps quite a lot in the browser; I am not sure). Dru's proposal does not run in a worker.
  • Both proposals create a single immutable (?) blob of instructions to pass to the engine (the queue), and do not allow JavaScript side effects to interleave during a commit of the queue.
  • Both proposals could equally be exposed to WebAssembly since both would be specified using Web IDL.

So it seems to me like the proposals are very similar with the main divergence being the desire to allow building the queue of mutations in a worker in Yehuda's proposal, which necessitates creating a parallel object hierarchy in order to build the queue over in the worker, before then transferring the queue to the main thread before (or during?) a commit.

To me it feels like an empirical question whether the gains from building the parallel DOM-like tree in the worker instead of building a DOM tree in the main thread outweigh the costs of transferring it. There's also the empirical question of whether building a DOM tree in the main thread actually takes long enough to cause jank; if it takes e.g. 1-2 ms for thousands of nodes, then it'd probably be best to just eat the cost and avoid the conceptual burden of introducing a parallel object hierarchy, or at least save it for a future extension once we have gotten rid of all the more low-hanging perf fruit and are grubbing for a few milliseconds.


KrisSiegel commented Jun 20, 2016

> What would the existing APIs that trigger a layout flush do while layout was paused? Would pausing layout force an immediate layout flush to update all layout caches? If not, how would they compute the "old" layout once mutations have begun?

No, a pause wouldn't flush; only a resume would. This is how other frameworks operate. Regardless, I struck the suspend/resume idea from my comment as it unnecessarily complicated my position.

> One of my concerns with proposals to augment the existing DOM APIs with some sort of transaction is that you have to decide what happens to all of the DOM's APIs while the transaction is in flight. In contrast, the proposed API is a closed set of operations, disallowing any other interleaved operations, making it possible to understand the complexity of the entire API.

Transaction capability typically takes on a begin, then rollback or commit, pattern. This proposal doesn't do that. This proposal creates multiple new APIs to do what the existing DOM APIs can already do today, except that it applies them in bulk when you're finished.

This proposal, in my opinion, could be significantly simplified by making it document.applyChanges(source, destination);. Obviously you'd need to update how web workers work if you want to build a DOM tree in one, but you'll have to do something similar anyway in your proposal for the Token pseudo-elements.

The destination is the HTML node to apply all your changes to, and the source is your separately built DOM. There's no reason the underlying engine couldn't handle all the complexity you introduced.

The DOM API is inelegant; I don't think it's a good idea to incorporate increased complexity for a theoretical optimization. If possible, re-use or work towards a better-designed API that lends itself better to transactions. Transactions would be pretty cool.


annevk commented Jun 20, 2016

@domenic most of the "allocations" in this proposal are not actually allocations. Integers are created, which should be much more lightweight than actual objects and don't impact GC and such. (And also, I think it might be a good idea to define DOMChangeList and DOMTreeConstruction as ECMAScript-style objects. They really shouldn't perform much validation, and the more browsers can inline the better.)

@KrisSiegel the API is not meant to be elegant. It's meant to be fast. Having discussed this with implementers it seems unlikely we'll get proper transactions. Doing that efficiently would be very hard and unlikely to materialize anytime soon.


KrisSiegel commented Jun 20, 2016

> the API is not meant to be elegant

@annevk I don't think the WHATWG should consider an API only for speed. The DOM APIs are public for web developers to use, so usability shouldn't be forgotten or tossed away (this is how we ended up with jQuery and a new front-end framework every two weeks). The DOM API is inelegant enough as is. I would like to see more re-use of existing APIs where possible (and I see no reason why you couldn't do so while still accomplishing the goals of the proposal). I'm curious how efficient the passing of node objects to and from web workers would be; I could see that being useful in more scenarios than just this one.

Yeah, I wouldn't expect transactions for any type of GUI state. It would be cool, but implementing it efficiently while maintaining generic capabilities is very difficult.


annevk commented Jun 20, 2016

Transactions are pretty much impossible due to how <iframe> and such work. While we should make the DOM more elegant, we should also offer fast paths for advanced applications. Passing node objects around would not be efficient, and getting node objects in workers is also something that's too hard (no implementation is anywhere near ready for that, if ever). This proposal, which avoids all that, is really our best bet to get something done in that area.


bfgeek commented Jun 20, 2016

cc/ @drufball @esprehn @n8schloss

@domenic has summed up some of my thoughts pretty well, but I'd just like to add my 2c. :)

This proposal has three parts that are potential performance benefits.

(1) Worker Construction of Tree.
Having something that is transferable sounds great. Moving more script logic off the critical path is something that authors have typically struggled with.

It may be worth thinking about this part of the problem as a separate API which integrates nicely with this proposal and with the regular DOM APIs?

(2) Tree Construction
In the traces we've performed, we haven't seen that the bindings cost is the bottleneck here. For this part of the proposal we'd really love to see some data to be convinced that this is an issue worth solving.

@wycats , you mentioned some experiments with cloneNode you did? Do you have data on this?

It looks like this is a pretty simple API to actually implement (if you are just worried about the bindings-type cost); getting numbers from someone wouldn't be too hard?

If there is a significant performance difference, it should then be compared to a polyfill of the current DOM API on top of this API, i.e. when a sync DOM API is called, perform the mutation list synchronously.

Blink's implementation has the ability to create IDL APIs that live in JavaScript instead of C++. If this is a huge performance win, then theoretically engines could change their implementations to mitigate this.

Again, we really need numbers to be convinced here.

(3) applyChanges
It will soon be possible in Blink for us to run styleRecalc separately from layout and to "smear" the work for a particular DOM mutation over multiple frames.
This was the initial motivation behind asyncAppend: the fact that when you append DOM, it will typically jank that frame, as more work needs to be done than is possible in one frame.

We think it's really valuable for engines to be able to resolve applyChanges multiple frames after the initial request.

One thing that I think is missing in this API is the "DoItNow" button, i.e. calling this API would perform all the changes synchronously like the current DOM APIs.

As an example, say your UI can tolerate 100ms (~6 frames) of the async DOM not appearing, but after that it would prefer to jank the main thread rather than delay further. A sync "commit" function would do this.

One way to do this with the current API would be:

const changes = new DOMChangeList();
/* build up changes here */

changes.execute().then(() => {
  // Changes aren't actually performed until you call "commit" here?
  // E.g. all the style/layout/etc. work is performed; "commit" just makes it appear in the DOM tree.
  changes.commit();
});

requestAnimationFrame(() => {
  changes.commit(); // If this fires first, the changes are committed synchronously and the commit() above becomes redundant.
});

We see really large benefits from this part. ^-^.

wycats commented Jun 20, 2016

It looks like that this is a pretty simple API to actually implement

That was my intent. Getting many of the benefits people are after with a simple-to-implement API should be valuable.

We think it's really valuable for engines to be able to resolve applyChanges multiple frames after the initial request.

That is definitely the intent of this proposal. By making applyChanges async, engines can choose to smear the work across multiple frames, without janking user interactions like scroll. (making it async gives the engine's scheduler control over the timing.)

One thing that I think is missing in this API is the "DoItNow" button. I.e. calling this api would perform all the changes synchronously like the current DOM APIs.

Good request! I'd have no problem adding it if engines thought it wouldn't introduce new problems 😄

We see really large benefits from this part. ^-^.

Great!

domenic commented Jun 20, 2016

@annevk

most of the "allocations" in this proposal are not actually allocations. Integers are created which should be much more light-weight than actual objects and don't impact GC and such. (And also, I think it might be a good idea to define DOMChangeList and DOMTreeConstruction as ECMAScript-style objects. They really shouldn't perform much validation and the more browsers can inline the better.)

Hmm, can you expand on this? I didn't see any integers in this proposal. Going through the sample code:

// Usual DOM API:
let article = document.getElementById('main-article'); // Node wrapper (1)
let text = article.firstChild; // Node wrapper (2)
text.data = "Hello world"; // no allocations
article.setAttribute("data-active", "yes"); // no allocations
article.removeAttribute("data-pending"); // no allocations

// ChangeList API:
let articleToken = document.getToken(document.getElementById('main-article')); // Node wrapper + Token backing + Token wrapper (3)

let changes = new DOMChangeList(); // ChangeList backing + ChangeList wrapper (5)
let textToken = changes.firstChild(articleToken); // Token backing + Token wrapper (7)
changes.updateTextData(textToken, "Hello world"); // no allocations
changes.setAttribute(articleToken, "data-active", "yes"); // no allocations
changes.removeAttribute(articleToken, "data-pending"); // no allocations

document.applyChanges(changes); // promise (8)

I see 2 allocations for normal DOM APIs (both simple wrapper allocations, which are "an integer" (pointer) size each) and 8 allocations for the ChangeList example. And that isn't including the transfer cost.

wycats commented Jun 20, 2016

Great to see this! I'm coming at this in comparison to Dru's proposal at https://github.com/drufball/async-append/blob/master/EXPLAINER.md

I also reviewed that in detail before I began work on this proposal.

As far as I can tell this proposal focuses mostly on trying to minimize allocations

That is certainly an important characteristic of this proposal. It also has several other goals, related both to efficiency and to expanding capabilities. (One example is supporting the superset of the HTML syntax and the DOM API, which I talk about in the proposal; another is allowing the end developer to directly control the staging of work across threads by using workers.)

(but see below)

I'll reply inline.

and on providing a proxy API that can be used in a worker

I don't think you should see ChangeList as a "proxy", but rather as a collection of instructions that can be freely transferred until it's applied.

instead of focusing on allowing chunking and scheduling of parsing/style/layout/paint as Dru's proposal does

Since the application of a ChangeList is asynchronous, the benefits to the browser's scheduling algorithm are more or less equivalent to Dru's proposal. ChangeList supports many more operations, and allows them to be performed in a batch, which should further increase the scope of the benefit to authors.

I can't tell at first glance whether it can serve both purposes, or whether its design prevents that.

This is something I spent a lot of time talking to @lukewagner, @smaug--- and @annevk about, and our intent was for this API to serve both purposes. As I said in the proposal: "If there is some reason that a concrete implementation of this design might not be able to accomplish these goals, it's almost certainly something we should discuss."

There's a brief paragraph under "Is this actually faster?" number 3 that indicates it is intended to be compatible, but I'll defer to the Blink engineers.

I'm very eager to hear from them about this. I'd be very happy to have a call with anyone interested in this area as well.

I think the proposal might have a misapprehension about how DOM nodes and wrappers are related. Remember that wrappers are not created until they are accessed.

This particular constraint was one of the most important aspects of this design, and I spent a lot of time thinking about it and discussing the details with others.

The intent of NodeToken is that it can be represented as a simple internal value (e.g. an integer representing the nth object to be allocated in this transaction), but that the nature of that representation is not directly exposed to users. The only way in which these values can be reified into Nodes is through AppliedChanges. I've spoken with some engineers at Mozilla about how to structure the API to allow it to work as a simple value, and would love to discuss it further with engineers who work on Blink.
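To illustrate the kind of representation described here, below is a minimal user-space sketch. The names (`SketchChangeList`) and the opcode format are hypothetical, purely for illustration; the proposal leaves the internal representation unspecified. Tokens are plain integers that index a side table which is only populated when the list is applied:

```javascript
// Hypothetical user-space sketch: a token is just an integer handle.
// Building the list allocates no per-token wrapper objects; tokens are
// only resolved to real nodes in one pass, when the list is applied.
class SketchChangeList {
  constructor() {
    this.ops = [];      // recorded instructions (a real impl could use a flat numeric buffer)
    this.nextToken = 1; // token 0 is reserved for the root passed to applyTo()
  }
  firstChild(parentToken) {
    const token = this.nextToken++;
    this.ops.push(["firstChild", parentToken, token]);
    return token; // a plain integer, not an object
  }
  updateTextData(token, text) {
    this.ops.push(["updateTextData", token, text]);
  }
  setAttribute(token, name, value) {
    this.ops.push(["setAttribute", token, name, value]);
  }
  applyTo(root) {
    const nodes = new Map([[0, root]]); // token -> node, built during application
    for (const [op, a, b, c] of this.ops) {
      if (op === "firstChild") nodes.set(b, nodes.get(a).firstChild);
      else if (op === "updateTextData") nodes.get(a).data = b;
      else if (op === "setAttribute") nodes.get(a).attributes[b] = c;
    }
  }
}

// Plain objects stand in for DOM nodes so the sketch runs anywhere.
const text = { data: "old" };
const article = { firstChild: text, attributes: {} };

const changes = new SketchChangeList();
const textToken = changes.firstChild(0);          // 0 = the eventual root
changes.updateTextData(textToken, "Hello world");
changes.setAttribute(0, "data-active", "yes");
changes.applyTo(article);
```

Note that this sketch still allocates a small array per operation; a native implementation (or a tighter user-space one) could encode the same instructions into a single growable buffer.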

Looking at the example code seems to bear this out; counting wrapper allocations it seems to have the same amount as normal DOM manipulation code. (And remember a wrapper is just a pointer; a wrapper to a NodeToken and a wrapper to a Node are both the same cost.) If you assume the DOM nodes will eventually exist anyway, as Yehuda discusses, Yehuda's proposal actually seems to have more allocations: NodeToken backing + NodeToken wrapper + Node backing, vs. Dru's proposal which has Node wrapper + Node backing.

Both proposals make it clear and less error prone to apply a sequence of DOM operations at once, through similar queue-then-commit systems.

Both support the full gamut of mutations to Elements and Nodes, with Dru's doing so by just allowing you to use those normal DOM APIs and Yehuda's by creating new versions of (most of) them that operate on analogous data structures.

It looks like Dru's proposal creates several new async APIs for existing operations. It also postulates a "wrapper" that could be used with the regular APIs. I understood Dru's proposal as an early exploration with several different possible directions; can you flesh out which of those directions your comments are talking about?

Both support the union of the trees that can be produced using the HTML parser and using the DOM API, Dru's by using the HTML parser directly (innerHTML/outerHTML) and Yehuda's presumably by e.g. allowing openElement to take more element names than createElement does (although I couldn't find where this is specified). We could of course try to expand createElement again; nobody's had the motivation so far, but maybe this will be the last straw.

From my original statement about the union of trees: "the HTML parser supports more tag names, while the DOM API supports more trees, such as custom elements nested inside a table)."

DOMChangeList also supports using the HTML parser, but the HTML parser limits which nodes can be made children of certain elements. We could perhaps fix createElement to support the full set of element names supported by the HTML parser, and that would address this benefit of the ChangeList proposal.

To me it feels like an empirical question whether the gains from building the parallel DOM-like tree in the worker instead of building a DOM tree in the main thread outweigh the costs of transferring it. There's also the empirical question of whether building a DOM tree in the main thread actually takes long enough to cause jank;

I have heard this thinking before, and I think it might misapprehend the reasons people want to build DOM trees and calculate DOM mutations in a worker.

It's not so much about shrinking the cost of the DOM work itself, but rather about moving off the main thread all of the JavaScript work that is necessary to calculate the needed mutations. In practice, a lot of application work is done on the UI thread simply because of the close proximity of the DOM APIs.

It is of course possible to implement some of this strategy as a user-space library, but not all of it. In user-space, applying a change set is inherently single-threaded, for example. The process of deserializing a transferred ChangeList would have to choose between interleaving the DOM operations or deserializing them into an intermediate data structure, both of which would introduce new costs that the native implementations would not need.
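To make those costs concrete, here is a sketch of the kind of round-trip a user-space library would need today (the wire format here is hypothetical, not the proposal's): pack the instruction queue into one numeric buffer plus a string table, which could then be handed to `postMessage` (with the buffer as a transferable), and decode it back on the receiving side:

```javascript
// Hypothetical wire format: one opcode/operand buffer plus one string table.
const OPS = { updateTextData: 1, setAttribute: 2 };

function encode(queue) {
  const strings = [];
  const intern = (s) => {
    let i = strings.indexOf(s);
    if (i === -1) { i = strings.length; strings.push(s); }
    return i;
  };
  const words = [];
  for (const [op, token, ...args] of queue) {
    words.push(OPS[op], token, ...args.map(intern));
  }
  // In a worker: postMessage({ buffer, strings }, [buffer.buffer]);
  return { buffer: Uint32Array.from(words), strings };
}

function decode({ buffer, strings }) {
  const queue = [];
  let i = 0;
  while (i < buffer.length) {
    const op = buffer[i++];
    const token = buffer[i++];
    if (op === OPS.updateTextData) {
      queue.push(["updateTextData", token, strings[buffer[i++]]]);
    } else if (op === OPS.setAttribute) {
      queue.push(["setAttribute", token, strings[buffer[i++]], strings[buffer[i++]]]);
    }
  }
  return queue;
}

const queue = [
  ["updateTextData", 1, "Hello world"],
  ["setAttribute", 0, "data-active", "yes"],
];
const decoded = decode(encode(queue));
```

A native implementation could avoid even the decode step: the receiving side can walk the buffer directly while applying the changes, which is the interleaving-versus-intermediate-structure trade-off described above.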

And that's all assuming that a user-space library has enough primitives to do an efficient job at the limit. The basic idea behind this API is to provide a low-level primitive that libraries can use to build up different useful libraries with different tradeoffs, and start lower level than the current set of primitives.


There are many small paper-cuts that the ChangeList proposal addresses in one, fairly contained proposal. It attempts to create an API that, when used, has noticeably fewer common gotchas for web developers, and does not require library, framework or application developers to reverse engineer which APIs produce hazards, and keep that knowledge up to date as implementations change.

I am especially interested in feedback from implementors that there is something about this proposal that is difficult to implement, or that would make reasonable initial implementations slower than the equivalent use of DOM APIs. Any suggested changes to this proposal that would improve it are welcome.

bfgeek commented Jun 20, 2016

@wycats We really need numbers for point (2) (whether bindings-related costs are really the problem) before going too deeply into API design, however.

Without numbers showing that this will actually get us a significant performance benefit, it's impossible to tell if this is the right design or not.

I.e. we could get a simpler API if this wasn't a significant problem. For example, just the asyncAppend APIs for (3) (performing style/layout work async), plus a separate "WorkerNode" API.

Are you able to get numbers for (2)? Or work with someone to get numbers for this?

domenic commented Jun 20, 2016

The intent of NodeToken is that it can be represented as a simple internal value (e.g. an integer representing the nth object to be allocated in this transaction), but that the nature of that representation is not directly exposed to users. The only way in which these values can be reified into Nodes is through AppliedChanges. I've spoken with some engineers at Mozilla about how to structure the API to allow it to work as a simple value, and would love to discuss it further with engineers who work on Blink.

Well, one issue with the above proposal is that it uses a type notation (TypeScript?) that implies type checking. This would necessitate creating Web IDL interfaces in order to get type checking so that you couldn't pass arbitrary values. It in fact uses a generics notation that implies we'd need (when converted to Web IDL) TextNodeToken, CommentNodeToken, ElementNodeToken, etc. So even if it's conceptually a simple internal value, it would need to be implemented as several wrappers + backing classes where the backing classes have those simple internal values.

It looks like Dru's proposal creates several new async APIs for existing operations. It also postulates a "wrapper" that could be used with the regular APIs. I understood Dru's proposal as an early exploration with several different possible directions; can you flesh out which of those directions your comments are talking about?

What I meant is that both of Dru's proposed APIs have minimal extra surface. For example they allow you to create Nodes and e.g. set their text data or attributes using the usual DOM APIs, whereas yours involves creating NodeTokens and using the parallel interface for such operations. In other words, instead of creating a parallel DOM API with a full parallel object hierarchy of NodeTokens and APIs that operate on them, Dru's proposal only duplicates the few APIs necessary for queue-building.

It's not so much shrinking the cost of the DOM work, but rather all of the JavaScript work that is necessary to calculate the needed mutations. In practice, a lot of application work is done on the UI thread simply because of the close proximity of the DOM APIs.

This is a great point; it's good to bring it up. It certainly makes the empirical question harder, but still worth investigating.

or that would make reasonable initial implementations slower than the equivalent use of DOM APIs

I've given some fairly concrete feedback about the increased allocation cost, which hopefully can make this clear. I'll defer to those more involved on the implementation side to say more.

wycats commented Jun 20, 2016

Well, one issue with the above proposal is that it uses a type notation (TypeScript?) that implies type checking. This would necessitate creating Web IDL interfaces in order to get type checking so that you couldn't pass arbitrary values. It in fact uses a generics notation that implies we'd need (when converted to Web IDL) TextNodeToken, CommentNodeToken, ElementNodeToken, etc. So even if it's conceptually a simple internal value, it would need to be implemented as several wrappers + backing classes where the backing classes have those simple internal values.

The intent of the TypeScript notation was just to show clearly what kind of NodeToken I expect each part of the API to produce. The only time I expect any kind of checking to be mandatory is when applying the ChangeList, because for example it would be a bug to appendChild a Node to another Node or no node at all.

What I meant is that both of Dru's proposed APIs have minimal extra surface. For example they allow you to create Nodes and e.g. set their text data or attributes using the usual DOM APIs, whereas yours involves creating NodeTokens and using the parallel interface for such operations. In other words, instead of creating a parallel DOM API with a full parallel object hierarchy of NodeTokens and APIs that operate on them, Dru's proposal only duplicates the few APIs necessary for queue-building.

Thanks for the clarification. One of the problems I'm hoping to address is that not all operations are safe or reasonable to use inside a transaction, and certainly many operations are not safe if you want the resulting change list to be transferrable. By starting with a minimal API that nonetheless covers mutations of the DOM as a data structure, we can build up an API that doesn't have surprising or problematic behavior.

I don't expect to need a much larger set of lowest-level APIs; much of DOM1-4 is described in terms of a lower-level set of primitive DOM operations, and my intent is to flesh out the base layer of those APIs in a way that works across workers, supports asynchronous application of the change list and smart scheduling by the engines, and avoids user-space interleaving of dangerous operations.

wycats commented Jun 20, 2016

I.e. we could get a simpler API if this wasn't a significant problem. For example just the asyncAppend APIs for (3) - (performing style/layout work async), and a "WorkerNode" API separately.
Are you able to get numbers for (2)? Or work with someone to get numbers for this?

I've spoken with Bill McCloskey at Mozilla, who has an intern looking into building an initial implementation of this proposal. Since I don't work at Mozilla, I obviously can't hold them to this, but hopefully we'll see some useful results from their work. But still, it doesn't seem like it should be necessary to implement a feature to discuss general performance implications of the design. Generally when we find ourselves getting close to a broad-strokes design that people find promising, that's when implementation work starts and we start getting more fine-grained performance feedback.

@wycats , you mentioned some experiments with cloneNode you did? Do you have data on this?

I can give more information later, but the TL;DR is:

Instead of building up DOM and interleaving dynamic work, Ember has experimented with (and shipped, as part of Glimmer 1) an optimization that uses cloneNode for templates and fills in the dynamic content by doing an optimized walk to the dynamic nodes.

The entire benefit of this optimization is to avoid reifying static nodes in JavaScript, and we saw very significant improvements to templates with minimal dynamic content (and therefore shipped the optimization). We still need to walk through some static nodes to reach the dynamic ones, and the AppliedChanges API is intended to limit the number of reified nodes to precisely the dynamic ones that may need to be changed.
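The cloneNode-plus-walk optimization described above can be sketched without real DOM, using a plain object tree as a stand-in (so it runs anywhere). The template is cloned once per render, and only the nodes at precomputed dynamic paths are visited; the static nodes are never reified in JavaScript. The tree shape and names here are invented for illustration:

```typescript
// Stand-in for a DOM node so the sketch is self-contained.
interface FakeNode {
  tag: string;
  text?: string;
  children: FakeNode[];
}

// A "template" with one static slot and one dynamic slot.
const template: FakeNode = {
  tag: "div",
  children: [
    { tag: "span", children: [], text: "static" },
    { tag: "span", children: [], text: "PLACEHOLDER" }, // dynamic slot
  ],
};

// Paths (child indices) to the dynamic nodes, computed once per template.
const dynamicPaths: number[][] = [[1]];

function cloneTree(node: FakeNode): FakeNode {
  // Stands in for cloneNode(true).
  return { tag: node.tag, text: node.text, children: node.children.map(cloneTree) };
}

function render(values: string[]): FakeNode {
  const root = cloneTree(template);
  dynamicPaths.forEach((path, i) => {
    // Walk only to the dynamic node; static siblings are never touched.
    let node = root;
    for (const index of path) node = node.children[index];
    node.text = values[i];
  });
  return root;
}

const out = render(["hello"]);
```

The proposed AppliedChanges API would serve the same purpose as `dynamicPaths` here: limiting the JavaScript-visible nodes to exactly the dynamic ones.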


bfgeek (Member) commented Jun 20, 2016

But still, it doesn't seem like it should be necessary to implement a feature to discuss general performance implications of the design.

Sorry, I have to disagree with you here. For most APIs, proposing a broad-strokes design first is great. However, when explicitly proposing a performance API, it should come with the burden of proof (or be obvious) that it will solve the performance problem.

Again if bindings isn't really the problem, we could get away with a simpler asyncAppend + WorkerNode design potentially, which is a significantly different API surface.

I can give more information later

Yes please!


wycats commented Jun 20, 2016

Again if bindings isn't really the problem, we could get away with a simpler asyncAppend + WorkerNode design potentially, which is a significantly different API surface.

Something like this has been mentioned in very broad strokes a few times. I would be very interested to see what such an API would look like in more detail.


wycats commented Jun 20, 2016

However when explicitly proposing a performance API, it should come with the burden of proof (or be obvious) that it will solve the performance problem.

I generally agree with the principle that there should be some reason to believe that the API will improve performance. In this case, there are multiple good reasons to believe that, in addition to the reduced cost of bindings:

  • a less error-prone API from the perspective of developers on the platform, via a guarantee that none of the operations in the change list can interleave with JavaScript (making it impossible for code to rely on observable intermediate layouts or paints).
  • a (significant) reduction in JavaScript allocations when constructing and applying the change list (assuming that NodeToken indeed can be designed to avoid allocations)
  • the ability for the engine to apply the change list in a "smeared" way with much more control over the scheduling of the work.
  • the ability to move more of the JavaScript code that computes the updates off the main thread, with an efficient way to transfer the expected operations back to the UI thread.

These are all "paper cuts", but they're precisely the kind of paper-cuts that we see adding up in real applications, and have a lot of difficulty avoiding at scale in practice (indeed, I learned about the impact of many of them by watching great talks by Google engineers at Google I/O!).

For what it's worth, Ember applications (for example, LinkedIn's Mobile app, which is where I've been focusing much of my performance effort lately) can easily perform thousands or tens of thousands of discrete operations during initial render, so small costs (especially associated with garbage collection) can quickly add up.


esprehn commented Jun 20, 2016

Wouldn't custom element callbacks run inside the operation making all the intermediate states visible? I don't think we can skip running the constructors or attached callbacks and do a layout, most elements are mutating their state in there, we'd be doing a wasted layout. This API doesn't handle the creation of ShadowRoots either.

I've been profiling many JS applications recently and the wrapper creation is rarely the place where costs add up, certainly not enough to justify this style of API. Specifically today I looked at Amazon, Facebook, Polymer iron-list, and several AMP pages. None of them would see much of a performance improvement from this API.

Note that we're moving to much more wrapper-heavy APIs all over the platform as well; see, for example, the Typed OM in CSS, custom paint, layout, etc. I'd really hesitate to attempt this kind of optimization around Nodes when so much of the rest of the platform is moving to create more objects anyway. We need to just make wrapper creation faster if that's an issue you observe.

We certainly can think about what it means to construct nodes inside a worker, perhaps with a stream interface, or some kind of WorkerNode interface where you build a tree and then transfer it. Perhaps we need to start by untangling the various problems you're trying to solve here.


dherman commented Jun 20, 2016

It seems like there's general agreement that some kind of transactionality is desirable, but with a key unresolved question being whether we should strive to avoid exposing intermediate states. I find the idea of exposing those states pretty concerning, and I think it's worth trying to achieve an API that avoids it, which AIUI is what this proposal is aiming at.

I also find the idea of transactional change records an appealing primitive for modeling the DOM -- it seems like a good "explain the platform" approach: an execution trace is a sequence of DOM changes. So where other APIs @esprehn mentions might use wrappers to talk about existing nodes, it makes sense for transactional change records to operate at a lower conceptual layer.

Also, I'd be pretty worried about arguments saying that absent empirical data, we should have our primitives do more things and bet on our optimizations getting smarter. I think our default orientation should be the reverse: our primitives should be simpler and try to do less.


drufball commented Jun 21, 2016

I generally agree with the principle that there should be some reason to believe that the API will improve performance. In this case, there are multiple good reasons to believe that, in addition to the reduced cost of bindings:

I think Ian was saying that we need to see numbers supporting the idea that this specific incarnation of the API will lead to more performance benefits than a different incarnation. You're right that this proposal would have a lot of performance wins. But looking at the improvements you mentioned, it seems like they would also be received from any general WorkerNode + asyncAppend API. Can you point to any additional benefits this proposal might have?

  • a less error-prone API from the perspective of developers on the platform, via a guarantee that none of the operations in change list can interleave with JavaScript (making it impossible for code to rely on observable intermediate layouts or paints).

To make sure I am clear, the point here is that by proposing a separate API surface, we could exclude any operations that would be invalid coming from a Worker because they depend on a relayout on the UI thread? This reasoning is compelling to me, but is there a way we could introduce similar benefits without introducing an entirely new API surface?

  • a (significant) reduction in JavaScript allocations when constructing and applying the change list (assuming that NodeToken indeed can be designed to avoid allocations)

It sounds like this point is still a bit unclear. 1) this proposal may require more bindings? 2) it's unclear if reducing bindings would lead to significant performance improvements.

  • the ability for the engine to apply the change list in a "smeared" way with much more control over the scheduling of the work.

This benefit could be achieved through any asyncAppend style API.

  • the ability to locate more JavaScript code that handles computing the updates off the main thread with an efficient way to transfer the expected operations back to the UI thread.

This benefit could be achieved through any WorkerNode style API.

In short, I don't think there's any disagreement that we'll see performance wins by allowing DOM in a worker and asynchronous DOM appending. But we need to see compelling numbers that this version of the API leads to more wins than other versions before committing to it.


wycats commented Jun 21, 2016

But looking at the improvements you mentioned, it seems like they would also be received from any general WorkerNode + asyncAppend API. Can you point to any additional benefits this proposal might have?

Before I do that, I'd be very interested to read a detailed WorkerNode proposal. Can you give me a pointer so I can do a proper analysis?


wycats commented Jun 21, 2016

To make sure I am clear, the point here is that by proposing a separate API surface, we could exclude any operations that would be invalid coming from a Worker because they depend on a relayout on the UI thread?

More than that, the API gives developers a way to express a batch of DOM operations that by definition cannot accidentally interleave accesses that would force a layout flush. Because the batch is applied without interleaving JavaScript code, the API creates a natural staging between blocks of pure DOM mutation and blocks of CSS-dependent changes.

It sounds like this point is still a bit unclear. 1) this proposal may require more bindings? 2) it's unclear if reducing bindings would lead to significant performance improvements.

If it's impossible to implement the NodeToken type without introducing a new heap-allocated JavaScript object, I definitely want to know that. Is that what you're saying?

This benefit could be achieved through any asyncAppend style API.

As long as the only mutations we're talking about are various forms of append. Do you envision things like asyncNodeValue =, asyncSetAttribute, etc.? Or is there a reason that only async versions of append operations would be necessary?

This benefit could be achieved through any WorkerNode style API.

I really want to see a detailed version of a WorkerNode-style proposal. I must have missed previous discussions fleshing it out; it's very hard for me to compare the impact of such a proposal against this proposal. I would really love a pointer to more details that could help me do a careful cost accounting of the different proposals.


annevk (Member) commented Jun 21, 2016

The TypeScript might have been too confusing. To reiterate, it's just there to illustrate the conceptual model. In an implementation, all createElement() has to do is return a pointer and store that plus arguments in an underlying opaque buffer. That pointer can then be passed to setAttribute() that will similarly store an instruction in the opaque buffer. None of that is validated ahead of time. It's only "validated" once the buffer is run (i.e., applied).

So basically, we're just creating an ArrayBuffer consisting of a set of pointers and DOM mutation operations (agree that we should add shadow roots). That should make transfer between threads cheap, should be significantly better than object creation for larger number of objects, etc.

Also, the handwaving about WorkerNode while simultaneously saying this is adding too much API surface is not helping. Any kind of worker API would need at least this much API surface. And then folks likely want the same API in workers and documents so they can reuse library code at which point we're back to a new set of primitives, which is exactly what @wycats has delivered here.

(Furthermore, having new primitives helps address the mismatch between the HTML parser and DOM APIs in terms of name validation.)
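The opaque-buffer model described above can be sketched as follows: each call appends an opcode plus integer arguments to a flat Uint32Array (with strings interned in a side table), and the result could be handed to the UI thread as a single transferable ArrayBuffer. The opcodes, field layout, and class name here are all invented for illustration:

```typescript
// Hypothetical opcodes for a flat instruction buffer.
const OP_OPEN_ELEMENT = 1;
const OP_SET_ATTRIBUTE = 2;
const OP_CLOSE_ELEMENT = 3;

class TreeConstructionSketch {
  buf = new Uint32Array(1024); // the "opaque buffer" of instructions
  len = 0;
  strings: string[] = []; // interned string table, shipped alongside buf
  nextToken = 0;

  private intern(s: string): number {
    this.strings.push(s);
    return this.strings.length - 1;
  }

  openElement(name: string): number {
    const token = this.nextToken++;
    this.buf[this.len++] = OP_OPEN_ELEMENT;
    this.buf[this.len++] = this.intern(name);
    return token; // a plain integer, no wrapper object allocated
  }

  setAttribute(name: string, value: string): void {
    this.buf[this.len++] = OP_SET_ATTRIBUTE;
    this.buf[this.len++] = this.intern(name);
    this.buf[this.len++] = this.intern(value);
  }

  closeElement(): void {
    this.buf[this.len++] = OP_CLOSE_ELEMENT;
  }
}

// Nothing is validated while recording; the engine would decode and
// validate the buffer only when the tree is actually constructed.
const tc = new TreeConstructionSketch();
const el = tc.openElement("div");
tc.setAttribute("class", "box");
tc.closeElement();
```

Because `buf` is a typed array and tokens are indices, transferring the whole queue between threads is a pointer move rather than a per-node structured clone.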


domenic (Member) commented Jun 21, 2016

The TypeScript might have been too confusing. To reiterate, it's just there to illustrate the conceptual model. In an implementation, all createElement() has to do is return a pointer and store that plus arguments in an underlying opaque buffer. That pointer can then be passed to setAttribute() that will similarly store an instruction in the opaque buffer. None of that is validated ahead of time. It's only "validated" once the buffer is run (i.e., applied).

Can you give more detail, perhaps IDL? Are you actually using unsigned long? If so, how do you deal with spoofing where I pass arbitrary "pointers" by choosing a likely-seeming integer, e.g. most-recently-returned integer minus 1 or similar?

Remember that the in-JS allocation cost of a wrapper is the same as the allocation cost of an integer (on a 64-bit system); both are 64-bit integers.


@annevk (Member) commented Jun 21, 2016

interface DOMTreeConstruction {
  unsigned long openElement(DOMString name, optional NamespacePrefix ns = "html");
  void closeElement();

  void setAttribute(DOMString name, optional DOMString value = "", optional NamespacePrefix ns = "");
  unsigned long appendText(DOMString text);
  unsigned long appendComment(DOMString text);
  Bounds appendHTML(DOMString html);
};
enum NamespacePrefix { "", "html", ... };

(Maybe unsigned long long?)

Spoofing does not matter. That would simply result in an error when the "C++ side" uses the instruction set to create a tree. And since we cannot support rollback and proper transactions anyway due to <iframe>, that doesn't seem like a problem. Or at least not one we can solve.

I'm not entirely sure what you're saying. The cost to create a promise or a node is equal to creating a number?
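To make that call pattern concrete, here is a JS-land stub with the same shape as the sketched IDL (the class name and the instruction objects are illustrative only; it records instructions in an array instead of building a real tree):

```typescript
// One recorded instruction per method call; a real implementation would pack
// these into an opaque binary buffer rather than objects.
type Instruction =
  | { op: "open"; token: number; name: string; ns: string }
  | { op: "close" }
  | { op: "attr"; name: string; value: string; ns: string }
  | { op: "text"; token: number; text: string };

class DOMTreeConstructionStub {
  readonly instructions: Instruction[] = [];
  private nextToken = 0;

  openElement(name: string, ns = "html"): number {
    const token = this.nextToken++;
    this.instructions.push({ op: "open", token, name, ns });
    return token;
  }
  closeElement(): void {
    this.instructions.push({ op: "close" });
  }
  // Attributes apply to the most recently opened element, so no token is needed.
  setAttribute(name: string, value = "", ns = ""): void {
    this.instructions.push({ op: "attr", name, value, ns });
  }
  appendText(text: string): number {
    const token = this.nextToken++;
    this.instructions.push({ op: "text", token, text });
    return token;
  }
}

// Building <ul class="list"><li>hello</li></ul>:
const t = new DOMTreeConstructionStub();
const ul = t.openElement("ul");
t.setAttribute("class", "list");
t.openElement("li");
t.appendText("hello");
t.closeElement();
t.closeElement();
```

Note how open/close nesting replaces parent references entirely, which is what keeps each call allocation-light.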

@domenic (Member) commented Jun 21, 2016

Not a promise, but for a Node, the JavaScript-side cost is equal, yes. (It is just a pointer to the C++ backing object.)

@annevk (Member) commented Jun 21, 2016

Sure, but you have to allocate the backing object too. Are you not counting that? This approach doesn't require a backing object since it's just an instruction set.

@domenic (Member) commented Jun 21, 2016

You have to allocate the backing Node in both approaches, as Yehuda pointed out in his OP.

@annevk (Member) commented Jun 21, 2016

No you don't, and I don't think he did. All we're doing is building up an instruction set. Actual node creation happens during commit.

@drufball commented Jun 22, 2016

> Before I do that, I'd be very interested to read a detailed WorkerNode proposal. Can you give me a pointer so I can do a proper analysis?

Sorry! We've passed ideas in this area back and forth so many times I didn't realize that we've never actually written any of our thoughts down in one place. I threw a rough summary of our thoughts for a WorkerNode proposal into a repo so you can take a look:

https://github.com/drufball/worker-node

I think we all agree that there's value in allowing DOM mutation on workers and in allowing a "smeared" application of layout. One concern we have is that each of these actions is valuable independently of the other (and sometimes in unrelated ways). Because of this, we're hesitant to bundle both problems into a single, interleaved solution. Right now it feels like we're solving a specific use case (e.g. the virtual DOM diff + patch model), which may prove unwieldy when people aren't building that precise functionality. For example, when developers just want to smear the layout for a DOM tree they're working with.

Can we separate these into two distinct proposals - one for async application of layout and another for DOM mutations on worker threads? It would be nice to solve each of these problems with its own primitive, making sure to design them such that they can be used in tandem where it makes sense. For example, we can keep this concept of a DOMChangeList in mind while developing the async append proposal - which is poorly named, since it's about general async DOM mutations, not just append :)

Happy to continue the worker DOM discussion in a repo that's focused on just that problem. We could use the repo I linked above, or proceed somewhere else!

@drufball commented Jun 22, 2016

> I'll try to put together my best-guess describing my goals and how I would accomplish them in user-space with the available APIs so we can see precisely where the gaps are, and what APIs we would need to close the gaps.

That sounds awesome!

@wycats commented Jun 22, 2016

> Right now it feels like we're solving a specific use case (e.g. the Virtual DOM diff + patch model), which may prove unwieldy when people aren't building that precise functionality.

Just to make sure we're all on the same page: Glimmer does not use a virtual DOM diff + patch model. It doesn't maintain its own representation of an intermediate not-real DOM, and it does not diff two not-real DOMs in order to determine what operations it needs to perform.

I know of at least three models (Google's Incremental DOM, React's Virtual DOM, and Ember's Glimmer VM approach) that provide write-only abstractions that can identify a list of changes to apply, and then apply them in a batch.

That doesn't necessarily mean that the requirements of those systems are general purpose (and I strongly agree with your concern about unnecessarily coupling solutions together), but the solution is not tightly coupled to the diff/patch approach used by React and similar libraries.

@wycats commented Jun 22, 2016

> Can we separate these into two distinct proposals - one for async application of layout and another for DOM mutations on worker threads?

How would you feel about an effort to identify the lowest layer of DOM operations that is sufficient to describe the higher-level task of just creating and mutating the DOM data structure that is used in DOM1-4? (Some of the work that Chrome has done in its extensions' support for Isolated Worlds feels like it overlaps with the constraints here.)

That subset would then be a natural starting point for tree construction and mutation operations that could be transferred across a worker boundary (and could be decoupled from the more controversial effort to reduce the need to use garbage-collected JavaScript objects to represent the intermediate states).

@annevk (Member) commented Jun 22, 2016

> Can we separate these into two distinct proposals - one for async application of layout and another for DOM mutations on worker threads?

I think I mentioned earlier that there is value in having these combined, since it means that the API and programming model you end up with is shared, meaning abstractions can also be reused, etc.

I do think we need a way to spread the style, layout and paint steps inside the browser out over several frames.

But should it be up to the browser to decide which nodes get rendered when? I think developers would want control over what is included and what is not. The tight coupling of various disciplines is what worries me most about the "async append" proposal. It seems counter to the extensible web lessons.

> I'm not convinced that doing that requires building an integer based abstraction over the DOM, that certainly doesn't match any existing implementation and doesn't feel like a primitive.

I would argue it's more of a primitive than tightly coupling DOM mutations and layout, but also:

  • By having a way to express a patch, we encourage developers to think more carefully about DOM mutations and how to go about them.
  • It enables developers to send all UI input directly to workers and work on new views there.
  • We could allow these patches to be reused, and even have them accept parameters, to make tree building simpler and further encourage this kind of grouping.

Creating a patch is also a pattern that all major frameworks seem to have converged on - some as the result of diffing, others through slightly different techniques. It definitely feels like a primitive worth having to me.

(@drufball, you might want to open separate issues for the other proposals. That should at least inform everyone following the development of the DOM Standard where it might go next.)

@bmaurer commented Jun 23, 2016

Hey guys -- @drufball pointed us at this proposal and the async DOM proposal and asked for some thoughts from FB on the API. I think there's a lot of interest in both APIs. Facebook incrementally renders a large part of our site, and being able to do things like perform that rendering in a worker thread and send a transaction, or do async layout, would be really valuable.

With that in mind, I wanted to take a step back and talk a bit about how this might benefit a site like FB. At a very high level, the goal of the API is to keep the UI responsive in the face of manipulations to the UI. Ideally this API would make it possible to parallelize layout with work the application wants to do.

So to take a concrete example, let's say that you're using FB newsfeed and that the page is downloading a new feed story to display. At the same time the user is writing a comment.

  • Ideally we want to minimize the typing lag in the comment while allowing the rendering of the feed story.
  • The event handler for the comment's keyboard events needs to have information that reflects the current visual state of the DOM (for example, we want to know the position/size of the text box so that if we grow the text box due to the comment being too big, we can scroll the document in a reasonable way).
  • What choices does the API provide about the concurrency of the visual update of the new feed story vs. the comment? If the feed story is inserted but hasn't been laid out yet, does the API allow the user to let the simple update to the text box happen before the feed story is inserted?
  • Oftentimes a developer needs to do layout in JS and see the results, all before the item is painted. I.e., when it inserts the feed story at the top of the page, it might need to figure out how high the feed story is so it can scroll the viewport by that much when it inserts the story. This should all happen before the user sees anything.
@Jxck commented Feb 14, 2017

This proposal seems very nice to me.
I think that if a DOMChangeList could be serialized into a string, it would let us:

  • send one from a server via WebSocket and deserialize/apply it to the DOM, making SPAs easier, and
  • save one to localStorage/IDB; taking it and calculating the reverse change would enable DOM undo easily.
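As a sketch of that serialization idea (the operation shape and the invert function below are hypothetical; a real reverse-change computation would need to cover every operation type, not just attribute writes):

```typescript
// One serializable operation; recording the previous value makes it invertible.
type SetAttr = {
  op: "setAttribute";
  target: string;        // e.g. a selector or a NodeToken stand-in
  name: string;
  value: string;
  prev: string | null;   // null means the attribute was absent before
};

// Replay a stored change list backwards, swapping each value with its
// previous value, to get a crude undo list.
function invert(ops: SetAttr[]): SetAttr[] {
  return ops
    .map(o => ({ ...o, value: o.prev ?? "", prev: o.value }))
    .reverse();
}

const forward: SetAttr[] = [
  { op: "setAttribute", target: "#title", name: "class", value: "big", prev: "small" },
];

// The whole list survives a string round-trip, so it could travel over a
// WebSocket or sit in localStorage/IDB.
const wire = JSON.stringify(forward);
const undo: SetAttr[] = invert(JSON.parse(wire));
```

Mapping `prev: null` to an empty string loses "attribute was absent"; a full design would need a removeAttribute inverse for that case.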
@aronallen commented Feb 14, 2017

Hi,

I have been working on something similar. I created a format called vDOMSON that is a subset of JSON. My dream is to standardize how vDOM nodes are represented, so we can completely omit h and just use the Array constructor. Hopefully vDOM libs would start adopting vDOMSON as an input format, instead of creating their own JS objects.

My first attempt is to convince the snabbdom project to adopt vDOMSON:

snabbdom/snabbdom#186

What is cool about this approach is that Web Workers do not need to import whatever vDOM lib is used, since vDOMSON is just a subset of JSON. Furthermore, the vDOM representation is fully decoupled from the patching algorithm.

I am transmitting these vDOMSON trees through the postMessage API; depending on the browser and message size, stringify or structured copy is fastest.

To improve performance further, instead of messaging the entire vDOMSON tree I would like to run the patch in the web worker and only transmit the necessary changes to the main thread, where the actual DOM is touched.

I don't know if transferring objects from a sub-thread to the main thread could beat a really fast structured copy; the structured copy is a simpler approach, and I haven't hit any noticeable performance bottlenecks.
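For readers unfamiliar with the idea, a tree made of plain arrays might look like this (the exact [tag, attrs, ...children] shape is a guess at the format, shown only to illustrate why it transfers so cheaply):

```typescript
// A vDOM node as a plain array: no h() helper, no library-specific classes,
// so it survives JSON.stringify and structured clone unchanged.
type VNode = [string, Record<string, string>, ...(VNode | string)[]];

const tree: VNode = [
  "ul", { class: "list" },
  ["li", {}, "one"],
  ["li", {}, "two"],
];

// Round-tripping through a string (as postMessage or a WebSocket might)
// yields an identical, directly usable tree on the other side.
const received: VNode = JSON.parse(JSON.stringify(tree));
```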

@samccone commented Oct 2, 2017

@domenic

> One concern I have about the idea of running these things in a worker is that the IPC involved in getting back to the main thread (in the case of hundreds or thousands of nodes/nodetokens) would probably be a significant cost. I'm not sure whether it would be greater than or less than the cost of just doing the work on the main thread; similarly I am not sure whether the IPC would be significant enough to cause jank. But maybe all you have to IPC is a single memory location of the queue in which case it is presumably fine.

With the arrival of shared array buffers and atomics, this IPC overhead should be little to none now, since the instruction backing buffer can be transferred via a SAB.
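A minimal single-threaded sketch of that SAB hand-off (the slot layout and function names are invented; a real producer/consumer pair would live in different threads and coordinate with Atomics.wait/notify):

```typescript
// 64 int32 slots: slot 0 holds the published length, the rest hold the
// instruction stream. Both threads would share this same buffer.
const sab = new SharedArrayBuffer(64 * Int32Array.BYTES_PER_ELEMENT);
const view = new Int32Array(sab);
const LENGTH_SLOT = 0;

// Worker side: write the payload first, then publish the length last so a
// reader never observes a length covering unwritten slots.
function publish(instructions: number[]): void {
  instructions.forEach((ins, i) => Atomics.store(view, 1 + i, ins));
  Atomics.store(view, LENGTH_SLOT, instructions.length);
}

// UI-thread side: read the published length, then the payload.
function consume(): number[] {
  const n = Atomics.load(view, LENGTH_SLOT);
  return Array.from({ length: n }, (_, i) => Atomics.load(view, 1 + i));
}

publish([1, 0, 7, 2, 0]); // e.g. opcodes + operands from a change list
const out = consume();
```

No payload is copied across the thread boundary at all; only the shared memory reference crosses it.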

@domenic (Member) commented Oct 2, 2017

If they actually use numbers, though, we run into the forgeability problem pointed out in the first part of #270 (comment).

@wycats commented Oct 25, 2017

@domenic How feasible is it for existing implementations to create new value types internally? I think it's important that these values aren't real pointers, but any kind of unforgeable (but copyable) value would be fine.

@wycats commented Oct 25, 2017

For those following along, I've started an implementation of the tree construction part of this proposal for use in Glimmer: https://github.com/glimmerjs/glimmer-vm/tree/master/packages/%40glimmer/dom-change-list

If there's interest in helping to work on this outside of Glimmer, I'd be happy to extract it into its own repository. I'll also be publishing regular npm snapshots pretty soon.

@Gozala commented Oct 25, 2017

@wycats I have been implementing a library for encoding and applying change lists, https://github.com/gozala/dominion, which is largely inspired by this proposal. It might be worth meeting to discuss our efforts.

@Gozala commented Oct 25, 2017

@wycats To be specific, one thing that I'm really interested in, which this proposal does not cover, is event handling, and I could really use your input on that if you have any thoughts.

@wycats commented Oct 25, 2017

@Gozala what's the easiest way to get in touch?

@Gozala commented Oct 25, 2017

@wycats I've been noticing you at MozPDX fairly often; maybe next time you're there we can chat? Alternatively IRC: you can find me as @Gozala on both freenode and moz.

@annevk

This comment has been minimized.

Show comment
Hide comment
@annevk

annevk Nov 10, 2017

Member

I looked into the concerns around NodeToken being an integer (or number). For the DOMTreeConstruction idea it can be an integer. The integers get returned as you use DOMTreeConstruction. The moment you turn those integers into nodes using getNode(), there's a reference to DOMTreeConstruction available and any lookup only happens within that object. An implementation could store a map of integers to nodes on DOMTreeConstruction the moment nodes are created as part of insertTreeBefore(). As these integers are therefore entirely self-contained, this even paves the path toward DOMTreeConstruction being serializable and network-transferable. To ensure we don't have to keep that map alive forever we could deterministically empty it by .then()'ing the returned promise from insertTreeBefore(). Those details can be fiddled with, this just shows it's viable.

DOMChangeList is trickier as the integers can also represent nodes in an existing tree (via getToken(), which is listed under DOMTreeConstruction in OP, but is not part of it as no method takes a NodeToken there), in a different thread. That means the integers for DOMChangeList need to be unique within an agent cluster and are probably not suitable for network-transfer, ever. It also means they need to be long-lived as they need to be able to transfer across threads. The fogeability problem I would address by when looking up the node corresponding to an integer, also ensuring that its node document is identical to the document upon which applyChanges() was invoked. And as said before for any instructions that are incorrect we'd abort, leaving you with a potentially broken tree as we cannot do transactions (see upthread).

(Given the nature of integers in DOMChangeList I suspect there would need to be some requirement for them to be random to avoid fingerprinting.)

So in conclusion I think both designs are viable.

Member

annevk commented Nov 10, 2017

I looked into the concerns around NodeToken being an integer (or number). For the DOMTreeConstruction idea it can be an integer. The integers get returned as you use DOMTreeConstruction. The moment you turn those integers into nodes using getNode(), there's a reference to DOMTreeConstruction available and any lookup only happens within that object. An implementation could store a map of integers to nodes on DOMTreeConstruction the moment nodes are created as part of insertTreeBefore(). As these integers are therefore entirely self-contained, this even paves the path toward DOMTreeConstruction being serializable and network-transferable. To ensure we don't have to keep that map alive forever we could deterministically empty it by .then()'ing the returned promise from insertTreeBefore(). Those details can be fiddled with, this just shows it's viable.

DOMChangeList is trickier as the integers can also represent nodes in an existing tree (via getToken(), which is listed under DOMTreeConstruction in OP, but is not part of it as no method takes a NodeToken there), in a different thread. That means the integers for DOMChangeList need to be unique within an agent cluster and are probably not suitable for network-transfer, ever. It also means they need to be long-lived as they need to be able to transfer across threads. The fogeability problem I would address by when looking up the node corresponding to an integer, also ensuring that its node document is identical to the document upon which applyChanges() was invoked. And as said before for any instructions that are incorrect we'd abort, leaving you with a potentially broken tree as we cannot do transactions (see upthread).

(Given the nature of integers in DOMChangeList I suspect there would need to be some requirement for them to be random to avoid fingerprinting.)

So in conclusion I think both designs are viable.
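For illustration, the per-object integer map and the node-document check described above could be sketched in plain JavaScript. All names here are hypothetical (TokenTable, resolve, and the stand-in ownerDocument field are not part of the proposal), so this is only one way the lookup could behave:

```javascript
// Hypothetical sketch: integers map to nodes through a per-object table,
// and resolving a token checks that the node belongs to the document
// applyChanges() was invoked on; a mismatch aborts.
class TokenTable {
  constructor() {
    this.nextToken = 1;
    this.nodes = new Map(); // token (integer) -> node
  }
  // Hand out an opaque integer for a node (what getToken() would do).
  tokenFor(node) {
    const token = this.nextToken++;
    this.nodes.set(token, node);
    return token;
  }
  // Resolve a token, enforcing the node-document identity check.
  resolve(token, targetDocument) {
    const node = this.nodes.get(token);
    if (!node || node.ownerDocument !== targetDocument) {
      throw new Error("token does not resolve to a node in this document");
    }
    return node;
  }
}

// Plain objects stand in for documents and nodes so the sketch runs anywhere.
const docA = {};
const docB = {};
const nodeInA = { ownerDocument: docA };

const table = new TokenTable();
const t = table.tokenFor(nodeInA);
table.resolve(t, docA);    // ok: same document
// table.resolve(t, docB); // would throw: wrong document, so abort
```

A real implementation could additionally empty the table deterministically once the returned promise settles, as suggested above for DOMTreeConstruction.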

AshleyScirra commented Dec 19, 2017

I think it's worth pointing out this experimental library I just put together: https://github.com/AshleyScirra/via.js

This allows full access to all DOM APIs using Proxies. It effectively builds an internal list of DOM commands and posts them to the main thread to be carried out. It happens entirely in JS land. I think its main advantage over DOMChangeList is that it allows you to use all APIs the same way you do in the DOM, rather than a limited tree-construction subset with its own API. For example you can do div.style.fontWeight = "bold" from a worker, or make use of mainthread-only APIs like the Web Audio API from a worker. Also the performance is actually kind of reasonable (it's not as terrible as I thought it might be, at least).

The main problem is that it leaks memory. GC is not observable so it's not possible to identify when the placeholder Proxy objects on the worker get collected and drop their corresponding references on the main thread. I think this can be solved with a special WeakKey object which still doesn't make GC observable: https://discourse.wicg.io/t/a-way-to-the-use-dom-in-a-web-worker/2483/2?u=ashleyscirra

If I read this proposal right, it sounds like this WeakKey idea is like NodeToken but generalised to represent any DOM object at all. This significantly broadens the use cases it covers. For example we want to move our entire HTML5 game engine in to a worker, but the lack of APIs like Web Audio is a much bigger problem than whether or not we can create nodes. Via.js lets you do both: build up lists of DOM commands, and access arbitrary APIs only present on the main thread.
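As a rough illustration of the recording technique described above (this is a toy, not via.js's actual API or message format), a Proxy can capture property sets and method calls into a command list instead of executing them, so the whole list can be posted to the main thread in one message:

```javascript
// Toy command recorder: every get returns a deeper recorder, every set and
// call appends a command describing the operation and its path.
function makeRecorder(path, commands) {
  return new Proxy(function () {}, {
    get(_target, prop) {
      return makeRecorder(path.concat(String(prop)), commands);
    },
    set(_target, prop, value) {
      commands.push({ op: "set", path: path.concat(String(prop)), value });
      return true;
    },
    apply(_target, _thisArg, args) {
      commands.push({ op: "call", path, args });
      // Calls return a placeholder so chained access keeps recording.
      return makeRecorder(path.concat("<result>"), commands);
    },
  });
}

const commands = [];
const document = makeRecorder(["document"], commands);

// Worker-side code is written like ordinary DOM code...
const div = document.createElement("div");
div.style.fontWeight = "bold";

// ...and `commands` now holds the two recorded operations that would be
// postMessage()'d to the main thread for execution.
```

The memory-leak problem the comment mentions shows up here too: the main thread cannot know when `div` (the worker-side placeholder) is collected, which is exactly what the WeakKey idea is meant to address.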

isiahmeadows commented Feb 18, 2018

Could I get some sort of status update on this concept in general? (Evolutions, meeting notes, etc.?) Haven't really heard or seen anything about it anywhere, either here or on the WICG discourse. (I'm not active on any of the W3C/WHATWG mailing lists, though.)

annevk commented Feb 18, 2018

There's no active mailing list. There was a meeting at a Mozilla All Hands, but that wasn't really conclusive one way or another. The main problem here is lack of implementer interest.

AshleyScirra commented Feb 18, 2018

I wrote a blog post that covers some of the options to better support implementing this entirely in a library: https://www.scirra.com/blog/ashley/38/why-javascript-needs-cross-heap-collection

isiahmeadows commented Feb 18, 2018

@annevk Sorry, I meant any of the W3C/WHATWG mailing lists. (I edited my comment accordingly.)

isiahmeadows commented Feb 19, 2018

Also, I'm not a huge fan of the API proposed - it's a bit too verbose and too much of a shotgun for something that's really just performing transactions on nodes. I'd personally prefer there was a way to just force browsers to batch something without having changes exposed to the main thread until the next animation frame, and a way to run changes in the layout pass to reduce the impact of style recalcs (which account for 90% of the perf issues with DOM frameworks).


Edit: Hidden for easier scrolling

Here's my thought on such an API, one that's a bit less ambitious:

// `execute` runs after animation frames, after all previous transactions have been
// executed, but before intersection observers have been run. Any transactions started
// here are deferred to the *next* animation frame.
//
// When `execute` is called, an `InvalidStateError` should be thrown for any
// node for which all of these are true:
// 1. The node being mutated is in the calling document.
// 2. The node is live, or would have been made live in the transaction.
// 3. UI-visible attributes of that node are read/modified, including ones like
//    `parentNode` and even if indirectly (like in a transaction).
// 4. The node *is not* in `read` (if read) or `write` (if modified).
dictionary LayoutOptions {
    AbortSignal? signal;
    sequence<Node> read = [];
    sequence<Node> write = [];
    // I know this is invalid syntax, but `this` is supposed to be the layout options object
    Promise<any>? execute();
}

// Transactions assume the state right after the animation frames are run *are* the
// initial state - they don't actually save anything first, and they act as if the
// operations work on the state after animation frame callbacks run.
[Exposed=Window, Constructor]
interface DOMTransaction {
    void insertBefore(ParentNode parent, ChildNode? next, ChildNode node);
    void insertAfter(ParentNode parent, ChildNode? prev, ChildNode node);
    void replace(ChildNode node, ChildNode? other);
    void setAttribute(Element elem, DOMString key, DOMString? value, DOMString? ns);
    void toggleClass(Element elem, DOMString classes, boolean force);
    void setStyle(HTMLElement elem, DOMString key, DOMString? value);
    void setNodeValue(Node node, DOMString value);
    // The tree is locked between when animation frames and intersection observers
    // would be run. The fulfillment value is that of `options.execute()` or
    // `undefined` if the method wasn't passed.
    Promise<any> end(optional LayoutOptions? options);
}

There are a few specific omissions and design points with my proposal here that I thought I'd note:

  1. Most DOM interactions can be reduced to these.

    • Append: transaction.insertAfter(parent, null, node)
    • Prepend: transaction.insertBefore(parent, null, node)
    • Insert before: transaction.insertBefore(parent, next, node)
    • Insert after: transaction.insertAfter(parent, prev, node)
    • Replace: transaction.replace(node, other)
    • Remove: transaction.replace(node, null)
    • Set attribute: transaction.setAttribute(node, key, value, ns)
    • Remove attribute: transaction.setAttribute(node, key, null, ns)
    • Set class: transaction.toggleClass(node, classes, true)
    • Remove class: transaction.toggleClass(node, classes, false)
    • Set node value: transaction.setNodeValue(text, value)
    • Set style: transaction.setStyle(elem, key, value)
    • Remove style: transaction.setStyle(node, key, null)
    • Set id/className/etc.: set that attribute
  2. The few that can't be expressed this way could be handled in the layout callback (`execute` in `LayoutOptions`) without having to spend an extra animation frame. (It also exists for the sake of refs, like what most vdom frameworks have some variant of.)

  3. I chose to include both, and only, insertBefore and insertAfter as it's much simpler than the half dozen existing methods, and it works better with single-pass updates.

  4. I chose to exclude properties that aren't available via normal DOM attributes because there's generally nothing left to do.

  5. This is technically polyfillable without an explosion of code short of the throwing errors part. It would require wrapping requestAnimationFrame and cancelAnimationFrame to call the callbacks at the right time, but that's about it.

  6. The reason I require explicit lists is so browsers can lock the others in that document to update them in parallel and be able to compute their layout without having to assume the prospect of interference with other code. (In practice, they only need one read/write lock to rule them all per document, and after that, a quick flag check whose value is initially calculated after animation frame callbacks are run to assert that the node is destined to become live.)

  7. The choice of whether to fire an extra callback is deferred until the end because you might find while patching that you might not need to fire that layout callback (which would be potentially pretty expensive). If no layout callbacks are scheduled, the browser can compute layout immediately and exercise its normal magic, all off-thread.

  8. The DOMTreeConstruction API is about 90% solved by innerHTML with <template>s, and in general, it looks to be solving a problem orthogonal to transactions (so I'm more neutral on it). You could still get very nice gains from using them in place of document.createElement(elem) with large trees, while still adding them via elem.appendChild(tree). (I've witnessed this in action already with template.content.cloneNode(true) + a few elem.querySelector calls being faster than the equivalent document.createElement + elem.appendChild sequence, and in some cases, even when dynamically created with innerHTML.)
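To make the queue-then-flush idea in this proposal concrete, here is a toy, DOM-free sketch (everything here is hypothetical: the class name mirrors the proposed interface but only two methods are shown, plain objects stand in for elements, and a real version would flush around the animation-frame timing described above rather than on `end()` directly):

```javascript
// Toy batching sketch: operations are queued as closures and applied in one
// flush, so nothing is visible until end() runs (a browser could do the
// equivalent work off-thread before the next frame).
class DOMTransaction {
  constructor() {
    this.ops = [];
  }
  // null value means "remove the attribute", matching the reduction above.
  setAttribute(elem, key, value) {
    this.ops.push(() => {
      if (value === null) delete elem.attributes[key];
      else elem.attributes[key] = value;
    });
  }
  setNodeValue(node, value) {
    this.ops.push(() => { node.value = value; });
  }
  end() {
    for (const op of this.ops) op(); // apply everything in one shot
    this.ops.length = 0;
    return Promise.resolve();
  }
}

// Stub "nodes" so the sketch runs anywhere.
const elem = { attributes: {} };
const text = { value: "" };

const tx = new DOMTransaction();
tx.setAttribute(elem, "id", "main"); // set attribute
tx.setAttribute(elem, "id", null);   // remove attribute (value = null)
tx.setNodeValue(text, "hello");

console.log(elem.attributes.id); // undefined: nothing applied yet
tx.end().then(() => {
  console.log("id" in elem.attributes); // false: set, then removed, in order
  console.log(text.value);              // "hello"
});
```

Note how the set-then-remove pair collapses to "attribute absent" because the queued operations run in order against the same state, which is the transaction semantics the comment describes.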

annevk commented Feb 19, 2018

It's not transactions. We cannot do transactions given iframes and such. Also, your API doesn't work in workers, which is a rather big part of the value proposition. If you still wish to pursue it, please do so in a distinct issue.

isiahmeadows commented Feb 19, 2018

Okay. 👍
