Use cases where we need fast equality comparison? #199

marjakh · 2020-09-16T19:08:40Z

One of the discussion points of the proposal has been making equality comparison fast. But which use cases need it?

The thought process goes like this:

We can achieve fast equality comparisons by internalizing.
Maybe we shouldn't internalize upfront on creation, since not all records / tuples will be equality-compared.
So let's internalize when we do the first equality comparison.
But do we ever do equality comparisons where both objects are "old"?

E.g., if we have a big state object rec_a, and we compare it against rec_b (with structural sharing), to figure out whether the state has changed. Then we throw one of them away. Say we keep rec_b. Then later we compare it against a new one rec_c, etc.

By having internalized any of them, we can't make the equality comparison any faster. Either we traverse (*) the new one to internalize it eagerly, or we traverse (*) it to internalize it lazily, or we don't internalize at all but traverse (*) it when doing the equality comparison.

(*) We only need to traverse until we hit the structurally shared part of the record / tuple. This holds for all the options.

Are there some other use cases which would benefit from fast equality? Alternatively, what are we missing?

(Kudos to @camillobruni and @LeszekSwirski for coming up with this thought process.)

jridgewell · 2020-09-16T19:42:35Z

The main case I can see doesn't actually concern rec_c, it just concerns rec_a and rec_b.

When an initial equality rec_a === rec_b is done, it'd be fine for this to be linear (a second rec_a === rec_b should be constant). If they're equal, then everything should be good. However, if they're not equal, I'm going to have to descend into the records to see what's changed. So essentially there'll be a for loop doing:

for (const key in rec_a) {
  if (rec_a[key] !== rec_b[key]) alert(key);
}
for (const key in rec_b) {
  if (rec_a[key] !== rec_b[key]) alert(key);
}

We need to ensure that, because I've done a rec_a === rec_b equality check first, the rec_a[key] === rec_b[key] check is fast. Else we'll have created an O(n²) algorithm in the vdom diffs where it's currently O(n).

If you later do rec_b === rec_c, then it'd be fine to have another slow check. But then rec_b[key] === rec_c[key] must be fast after it.

marjakh · 2020-09-16T19:54:50Z

But the same argument holds for rec_a[key] and rec_b[key]. Either they are the same pointer (because of structural sharing), or one of them is a pointer to an "old" record and another one is a pointer to a "new" record.

I'm assuming that whenever the new states are created, they're created with maximal structural sharing with the "current state", i.e., these records don't just get recreated out of nowhere, without structural sharing with the existing records.

This argument formulated in another way is: looks like we can achieve the same by just structural sharing + using normal JS objects in an immutable way.

jridgewell · 2020-09-16T20:13:22Z

But the same argument holds for rec_a[key] and rec_b[key]. Either they are the same pointer (because of structural sharing), or one of them is a pointer to an "old" record and another one is a pointer to a "new" record.

What I would like to see is the rec_a === rec_b check forces the pointers to be internalized (am I using that correct?). Eg, rec_b[key] should now point to rec_a[key] if they were equal, or if they were unequal then I should be able to do rec_a[key] === rec_b[key] without another slow deep check.

I'm assuming that whenever the new states are created, they're created with maximal structural sharing with the "current state", i.e., these records don't just get recreated out of nowhere, without structural sharing with the existing records.

I don't think that holds for current vdom implementations. Creating a tree of nodes will always create brand new, completely unshared with anything before it, records. Certain constant nodes could be hoisted out and reused when creating a new tree, but that would be a very small minority of them.

syg · 2020-09-16T20:24:42Z

I'm kind of surprised that creating a new vdom tree always creates a completely new tree. That seems very inefficient, and how you're hoping Records and Tuples to help out here is for the JS engine to do another walk over that tree to deduplicate. There seems to be a better path for better performance by doing structural sharing directly in the user code.

Part of the point @marjakh is making here is that how/when the engine should internalize records will ultimately be a heuristic, and as a heuristic, the engine could guess wrong here. The vdom use is a specific use case, and your example points to one of the heuristics to be something like: for large Records, lazily and recursively internalize on ===. I can see that getting pretty expensive if we guessed wrong. Not excited about creating another IIFE-like heuristic that folks end up working around.

jridgewell · 2020-09-16T20:42:06Z

I'm kind of surprised that creating a new vdom tree always creates a completely new tree.

As a rough example:

// Input:
/** @jsx h */
function Component({ dynamic }) {
  return (
    <div>
      <span>I can be intelligently shared across all renders</span>
      <h1>{dynamic}</h1>
    </div>
  );
}

render(
  <Component
    dynamic={"I'm dynamic, and anything that contains me can't be state shared"}
  />,
  document.body
);

// Output
var _ref = h("span", null, "I can be intelligently shared across all renders");

/** @jsx h */
function Component({ dynamic }) {
  return h("div", null, _ref, h("h1", null, dynamic));
}

render(
  h(Component, {
    dynamic: "I'm dynamic, and anything that contains me can't be state shared"
  }),
  document.body
);

That <span> can be shared between renders, but anything that's dynamic (and anything that contains another node that is dynamic) cannot be shared. Notice the dynamic <h1> element is new each render, and the <div> that contains the <h1>. We end up generating a lot of unshared nodes.

marjakh · 2020-10-21T15:29:56Z

@jridgewell thanks for pointing out the wrong assumption (of maximal structural sharing) (sorry for the delay)

marjakh closed this as completed Oct 22, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use cases where we need fast equality comparison? #199

Use cases where we need fast equality comparison? #199

marjakh commented Sep 16, 2020 •

edited

jridgewell commented Sep 16, 2020

marjakh commented Sep 16, 2020 •

edited

jridgewell commented Sep 16, 2020

syg commented Sep 16, 2020

jridgewell commented Sep 16, 2020

marjakh commented Oct 21, 2020

Use cases where we need fast equality comparison? #199

Use cases where we need fast equality comparison? #199

Comments

marjakh commented Sep 16, 2020 • edited

jridgewell commented Sep 16, 2020

marjakh commented Sep 16, 2020 • edited

jridgewell commented Sep 16, 2020

syg commented Sep 16, 2020

jridgewell commented Sep 16, 2020

marjakh commented Oct 21, 2020

marjakh commented Sep 16, 2020 •

edited

marjakh commented Sep 16, 2020 •

edited