
feat(effect): refactorings to improve performance, reduce memory usage #2345

Closed
wants to merge 11 commits

Conversation

basvanmeurs (Contributor) commented Oct 9, 2020

This PR is a refactoring of the reactivity module.

  1. Make use of object prototypes for effects to reduce memory usage
  2. Ref-specific high-performance implementation for track and trigger
  3. Miscellaneous memory- and performance-related tweaks

Results

  • Big runtime performance improvement for ref, computed, watch and watchEffect (30%-80%, depending on the number of dependencies)
  • Memory usage decreased by about 30% when creating ref, computed, watch and watchEffect
  • Creation time performance improvement, most notably for watchers and watchEffects

Performance

You can compare my benchmarks on your own machine:
This PR
3.0.0

Or view my own results:
performance-comparison

Memory

In terms of memory, I only did a fairly rudimentary test:

const arr = [];
for (let i = 0; i < 1e5; i++) {
  const v = ref(100);
  const c = computed(() => v.value * 2);
  const w = watch(c, (v) => {});
  const we = watchEffect(() => {
    const v = c.value;
  });
  arr.push(v);
  arr.push(c);
  arr.push(w);
  arr.push(we);
}
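
One way to reproduce comparable numbers (my own assumption about methodology; the PR doesn't state how the figures were collected) is to wrap the same loop in a small Node script and compare heap usage around it:

// Hypothetical harness: run under Node with --expose-gc to get a rough
// heapUsed delta for the same loop as above.
import { ref, computed, watch, watchEffect } from 'vue'

const gc = (globalThis as any).gc as undefined | (() => void)

gc?.()
const before = process.memoryUsage().heapUsed

const arr: unknown[] = []
for (let i = 0; i < 1e5; i++) {
  const v = ref(100)
  const c = computed(() => v.value * 2)
  const w = watch(c, () => {})
  const we = watchEffect(() => { c.value })
  arr.push(v, c, w, we)
}

gc?.()
const after = process.memoryUsage().heapUsed
console.log(`retained: ~${((after - before) / 1e6).toFixed(1)} MB for 1e5 iterations`)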

memory-comparison

I managed to save some memory by using more properties on the prototypes, and I moved some parameters out of the options object, because those ReactiveEffectOptions objects have a memory overhead as well. Left is the PR in its current state, right is 3.0.0: a reduction of about 30%. I think we're approaching the limit of what we can do.
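
A minimal sketch of the defaults-on-prototype idea (the class and field names are illustrative, not the PR's actual code):

class FakeEffect {
  // Genuinely per-instance state:
  deps: unknown[] = []
  // Declared so TypeScript knows about them, but deliberately not initialized
  // here: initializing would create an own property on every single instance.
  declare active: boolean
  declare allowRecurse: boolean
  constructor(public fn: () => void) {}
}
// Shared defaults live on the prototype: one slot in total, not one per instance.
FakeEffect.prototype.active = true
FakeEffect.prototype.allowRecurse = false

const e = new FakeEffect(() => {})
console.log(e.active)   // true, read from the prototype
e.active = false        // only now does this instance get its own slot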

@basvanmeurs basvanmeurs marked this pull request as draft October 9, 2020 21:16
basvanmeurs (Contributor, Author) commented Oct 12, 2020

I think we need to keep track of the performance of the Vue core, especially the reactivity part. To verify my own work, as well as to support future development, I created this little repository: https://github.com/basvanmeurs/vue-next-benchmarks

You can run some reactivity benchmarks live here: https://basvanmeurs.github.io/vue-next-benchmarks/

You can either enter a specific version or a specific URL.

The work in this PR can be compared to the current release using the following URLs:
https://basvanmeurs.github.io/vue-next-benchmarks/?v=vue.global.js
https://basvanmeurs.github.io/vue-next-benchmarks/?v=3.0.0

basvanmeurs added a commit to basvanmeurs/vue-next-benchmarks that referenced this pull request Oct 13, 2020
@basvanmeurs basvanmeurs changed the title from "feat(effect): improve performance, reduce memory usage" to "feat(effect): refactorings to improve performance, reduce memory usage" Oct 13, 2020
@basvanmeurs basvanmeurs marked this pull request as ready for review October 13, 2020 10:46
@yyx990803 (Member)

This is great work! However, it seems it contains some non-trivial changes to the usage of effect, which is technically part of the public API of @vue/reactivity. Can you summarize a list of changed APIs (both internally and externally)?

Regarding the benchmarks - this is very helpful, it'd be awesome to maybe extend this to component instance creation and vdom updates as well. We've had the thought of running something similar on CI but never got time to actually work on it.

dsonet (Contributor) commented Oct 13, 2020

It would be great to implement #2195 after this.

@basvanmeurs (Contributor, Author)

Can you summarize a list of changed APIs (both internally and externally)?

Reactivity API changes

  1. ReactiveEffect is now an object instead of a function. It must now be invoked using the run method (effect.run() vs effect()).

  2. The effect function has changed. The properties lazy, allowRecurse and scheduler have been extracted out of the ReactiveEffectOptions and are now passed as function parameters.
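
A rough before/after sketch of the usage change (the exact parameter order of the new effect signature is my assumption; the description above only names the extracted options):

import { effect } from '@vue/reactivity'

// 3.0.0: the effect *is* a callable runner, and its behavior is configured
// through an allocated ReactiveEffectOptions object.
const jobs: Array<() => void> = []
const runner = effect(() => { /* reads reactive values */ }, {
  lazy: true,
  scheduler: job => jobs.push(job),
})
runner()

// This PR (approximate): the same call would become roughly
//   const eff = effect(fn, /* lazy */ true, /* scheduler */ job => jobs.push(job))
//   eff.run()
// i.e. an object invoked via effect.run(), with no options object allocated.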

Vue API changes

The external Vue API has not changed, as the effect function is not exported. The ReactiveEffect and ReactiveEffectOptions types are exported from the vue package, but they aren't mentioned in the official documentation.

@basvanmeurs basvanmeurs marked this pull request as draft October 14, 2020 14:03
@basvanmeurs (Contributor, Author)

Moving back to draft as I have some additional optimizations.

@basvanmeurs basvanmeurs marked this pull request as ready for review October 14, 2020 14:27
@basvanmeurs basvanmeurs marked this pull request as draft October 14, 2020 14:28
@basvanmeurs (Contributor, Author)

After the latest updates:

vue-perf-comparison

@basvanmeurs basvanmeurs marked this pull request as ready for review October 15, 2020 15:33
@dsonet dsonet mentioned this pull request Nov 14, 2020

ReactiveEffect.prototype.active = true
ReactiveEffect.prototype.allowRecurse = false
ReactiveEffect.prototype.options = EMPTY_OBJ

Contributor:

@basvanmeurs The code might be cleaner if you declare those three as getters on the class?
public get active() { return true } would end up on the prototype, and I don't think using a getter rather than a field would impact perf / memory in any noticeable way (we have your benchmarks to verify).

Other than coding style, it's also easier for tools to determine that a module is side-effect free if it doesn't contain statements, just class declarations.
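
For illustration, that suggestion would look roughly like this (hypothetical class, not the PR's code):

class FakeEffect {
  // Compiled onto FakeEffect.prototype, so the default values are still shared
  // by all instances, and the module body contains no top-level statements.
  public get active() { return true }
  public get allowRecurse() { return false }
}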

Contributor Author:

I was initially enthusiastic. But when I tried changing this, all property assignments started to cause TypeScript errors (Cannot assign to 'options' because it is a read-only property.). I could silence them for every assignment by casting to any: (this as any).options = {}. But personally I think the resulting code would be worse than what it is now.
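
Concretely, with a getter every write site stops compiling (illustrative snippet, not the actual PR code):

class FakeEffect {
  public get active() { return true }

  stop() {
    // this.active = false
    //   -> error TS2540: Cannot assign to 'active' because it is a read-only property.
    // The only way through is a cast at every write site:
    (this as any).active = false
  }
}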

Contributor:

Ah, I did not see that one coming.
I agree your original code is a lot better than using any in callers.
I see no way to "idiomatically" have TS infer that we have fields with default values stored on the prototype.

If we did things in a "less-JS" way, we would either mark those fields optional and provide the default value in the code that reads them, or always initialize the fields with the default value.
Not sure about the perf/memory impact of those approaches; it would need some benchmarking.

Always assigning a value to a field would increase memory usage but might lead to monomorphic code, which may execute faster. Not sure if it would be noticeable. You have a benchmark available, maybe you could try.
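
Sketches of those two alternatives (class and field names are illustrative):

// A: mark the field optional and apply the default wherever it is read.
class EffectA {
  allowRecurse?: boolean
  canRecurse() { return this.allowRecurse ?? false }
}

// B: always initialize the field, paying a little memory per instance in the
// hope that uniformly shaped instances give monomorphic (possibly faster) access.
class EffectB {
  allowRecurse = false
}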

Contributor Author:

I tried setting the properties in the constructor during development for the reason you mentioned. It did not affect performance at all, at least in Chromium. It might be different in other browsers, but probably not by much.

Because it really saved some memory, I decided to put them on the prototype.

jods4 (Contributor) commented Nov 14, 2020

@basvanmeurs Thank you very much for tackling this. ❤️
Reactive perf is at the heart of everything in Vue. Even small improvements will compound to larger benefits in big, intensive apps.

Like you, I noticed there was room for improvement in the reactivity package and wanted to create a PR. I couldn't manage to actually do it because I had too much work.
So I'll share one more idea that you may try out:

During a run, the dependency tracking logic simply throws out all previous dependencies and builds up new arrays and maps with the detected dependencies. I see you've optimized those, but it's still a non-trivial amount of allocations.
In busy systems with lots going on, that puts pressure on the GC.

A key observation is that most effects have a constant, or nearly constant, list of dependencies.

So prior art uses a different way to update those dependency lists:

  • Associate a run id with each registered effect dependency.
  • The run id is a simple global counter that is incremented every time an effect runs.
  • Dependencies are not reset on each run; instead:
      • Track with existing dependencies simply assigns the current run id but does nothing more.
      • Track on new dependencies adds them as usual, with the current run id.
      • At the end of the effect run, a clean-up phase iterates all dependencies and removes those whose run id is less than the current run (i.e. those that did not call track during this run).
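
A minimal sketch of how this run-id scheme could look (Source, Effect and activeEffect are illustrative names, not Vue's actual internals):

// Global counter, bumped once per effect run, and the currently running effect.
let runId = 0
let activeEffect: Effect | null = null

class Effect {
  // source -> run id at which this effect last tracked that source
  tracked = new Map<Source, number>()
  constructor(public fn: () => void) {}

  run() {
    const currentRun = ++runId
    const prev = activeEffect
    activeEffect = this
    try {
      this.fn()
    } finally {
      activeEffect = prev
      // Cleanup: anything that did not call track() during this run is stale.
      for (const [source, lastRun] of this.tracked) {
        if (lastRun < currentRun) {
          source.subscribers.delete(this)
          this.tracked.delete(source)
        }
      }
    }
  }
}

class Source {
  subscribers = new Set<Effect>()

  track() {
    const eff = activeEffect
    if (!eff) return
    if (!eff.tracked.has(this)) {
      this.subscribers.add(eff)       // new dependency: register as usual
    }
    eff.tracked.set(this, runId)      // existing or new: just stamp the current run id
  }

  trigger() {
    for (const eff of [...this.subscribers]) eff.run()
  }
}

// Usage: an effect whose dependency set changes between runs.
const a = new Source()
const b = new Source()
let useA = true
const e = new Effect(() => (useA ? a : b).track())
e.run()          // tracks a
useA = false
e.run()          // tracks b; the cleanup phase drops the stale link to a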

jods4 (Contributor) commented Nov 14, 2020

Looked at your code and there's another thing that you could do on top of my previous comment.

trigger currently snapshots the dependencies (i.e. copies them into an array), so that it's not impacted by their unlikely mutation.
With some additional cleverness the copy is not required.

When a triggered effect runs, other effects may be triggered in cascade and run on top of it, possibly adding or removing themselves from the dependency list currently being iterated.

I'll explain why the "concept" of adding or removing dependencies doesn't matter:

  • Newly added dependencies can be skipped based on having a higher run id than the value when trigger started.
  • Notice that "updated" dependencies, which would have run because of a nested trigger, also have an updated run id and will be skipped. And that's actually great and better than the snapshot option: since they already ran, there's no need to run them a second time, as they already saw this trigger's updated value.
  • By the same logic, removed dependencies can be ignored: since they just ran, there's no need to run them a second time, especially now that the trigger source is not a dependency of those effects anymore.

Removed dependencies can create a mutation problem in the array, though. If we're at index i and a dependency before i is spliced out, that would cause us to skip one element in the array, which is a bug.

This can be solved by not splicing removed dependencies, but rather setting them to null. That way, the array never shrinks and iteration won't skip any element.

One short-term benefit here is that setting an array item to null is more efficient than splicing, which must copy all subsequent items and is O(n).

One problem is that now we may have holes in the array, and over time that may become a memory leak if a dependency is added/removed in a cyclic fashion.
That problem can be solved in two ways:

  1. Create a global Set of arrays that need compacting. Use requestIdleCallback to schedule compaction of all those arrays at a time when nothing happens (neither our code nor user activity). That's great because it moves the array-shrinking work out of the immediate feedback loop into a time where we don't care about CPU.
    Not all browsers have requestIdleCallback (https://caniuse.com/?search=requestIdleCallback); we can fall back to end-of-JS-turn for the old browsers. Same behavior, at a slightly less convenient time.

  2. I like option 1 better, but another design is to keep the index of the first free slot in each dependencies array. Instead of pushing new deps, those slots are reused by doing deps[freeSlot] = newDep, which also works past the end of the array. Once added, the next freeSlot must be found by iterating down the array.

All of this might sound a bit tricky, but if you abstract dependency-list management into a single class, module, or derived array, then it's self-contained and well encapsulated, and the calling code stays clean.
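
A sketch of option 1 above, the null-slot plus idle-compaction idea (requestIdleCallback and the setTimeout fallback are real browser APIs; everything else is illustrative):

// Dependency lists that never splice: removed entries become null "holes",
// and hole-y arrays are compacted later, off the hot path.
const needsCompaction = new Set<Array<object | null>>()
let compactionScheduled = false

function removeDep(deps: Array<object | null>, dep: object) {
  const i = deps.indexOf(dep)
  if (i >= 0) {
    deps[i] = null               // O(1); indices of ongoing iterations stay valid
    needsCompaction.add(deps)
    scheduleCompaction()
  }
}

function scheduleCompaction() {
  if (compactionScheduled) return
  compactionScheduled = true
  const run = () => {
    compactionScheduled = false
    for (const deps of needsCompaction) {
      let write = 0
      for (let read = 0; read < deps.length; read++) {
        if (deps[read] !== null) deps[write++] = deps[read]
      }
      deps.length = write        // shrink in place, dropping the holes
    }
    needsCompaction.clear()
  }
  // Compact when the browser is idle; fall back to the end of the current turn.
  if (typeof requestIdleCallback === 'function') requestIdleCallback(run)
  else setTimeout(run, 0)
}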

basvanmeurs (Contributor, Author) commented Nov 16, 2020

Hi @jods4

Thanks for your review and feedback!

I also realized the importance of improving the performance on this, as in my own app I found that too much time was spent in triggering and tracking stuff.

Let me first respond to your first idea (effect ids).

During a run, the dependency tracking logic is simply throwing out all previous dependencies, and builds up new arrays, maps, with the detected dependencies. I see you've optimized those, but it's still a non-trivial amount of allocations.
In busy systems with lots going on, that puts pressure on the GC.

That's a claim that can be easily investigated with my vue-next-benchmarks ;-) I've used my own PR for benchmarking.

Benchmarks

The 'computed: write ref, read 1000 computeds' benchmark is a good test to see the impact of this tracking/triggering in practice. The code looks like this:

() => {
    const v = ref(100);
    const computeds = [];
    for (let i = 0, n = 1000; i < n; i++) {
      const c = computed(() => {
        return v.value * 2
      });
      const cv = c.value;
      computeds.push(c);
    }
    let i = 0;
    return suite.add("write ref, read 1000 computeds", () => {
      v.value = i++;
      computeds.forEach(c => c.value);
    });
  }

Of the total time, about 30% is spent in the 'cleanup' of effects, 18% (surprisingly, less than the cleanup) is spent in trackRefValue, and 3% is spent in triggerRefValue. More than half of the total time is spent on the process that you described. GC only takes about 1 to 2%, by the way. Chrome is very good at reusing those many small temporary arrays and sets efficiently.

What if we reverse the test, having a single computed with many (1000) dependencies?

() => {
    const refs = [];
    for (let i = 0, n = 1000; i < n; i++) {
      refs.push(ref(i));
    }
    const c = computed(() => {
      let total = 0;
      refs.forEach(ref => total += ref.value);
      return total;
    });
    let i = 0;
    const n = refs.length;
    return suite.add("1000 refs, 1 computed", () => {
      refs[i++ % n].value++;
      const v = c.value;
    });
  }

This test is twice as fast as the first one, but cleanup now takes 47% of all time! trackEffects now takes 18%, while triggerRefValue only amounts to 0.5%.

Looking at reactive objects (reactiveObject: write reactive obj, read 1000 computeds):

() => {
    const r = reactive({a: 1});
    const computeds = [];
    for (let i = 0, n = 1000; i < n; i++) {
      const c = computed(() => {
        return r.a * 2
      });
      computeds.push(c);
    }
    let i = 0;
    return suite.add("write reactive obj, read 1000 computeds", () => {
      r.a = i++;
      computeds.forEach(c => c.value);
    });
  }

There are 1000 computeds referencing a single reactive property.

The figures here are 18% cleanup, 17% tracking, 6% triggering. Tracking/triggering is relatively more expensive here because of the dependency sub-set selection, and the relative time spent in tracking/triggering is lower because the proxy itself causes a lot of overhead as well. Still, it amounts to 41%.

Profiling results for the reactiveObject:1000 reactive objs, 1 computed test are:
21% cleanup, 17% tracking and only 0.2% triggering. GC is again below 2%.

Reactivity benchmark conclusions

So I think we can roughly gain about 50% performance by avoiding the deps cleanup/retrack process. Very significant, and way too much to ignore if you ask me, especially as it is, as you mentioned, at the heart of everything.

As GC time is not significant, I think we shouldn't focus on preventing the allocation of new Sets and arrays. We should focus on preventing the tracking and cleanup work, or at least making it smarter, and then re-run the performance tests to see where we're at :-)

Cleanup performance

You mention some interesting ideas. To tell you the truth, I had an idea of my own, but it relied on the order of tracked effects, which complicated things.

Will your ideas help performance? I believe that even checking whether an effect already exists in a set takes about as much time as actually adding it, so your idea will not help us much in terms of tracking performance. But not having to clean up beforehand will really help a lot, based on the profiling information: up to 20 to 30%.

Another risk is introducing (hard-to-find) bugs. Your algorithm is simple, but we have to take into account that effects are triggered recursively. So during the run of an effect, other effects will be run as well, and the run id is incremented and decremented multiple times before the cleanup takes place.

Let's say we have effect A. It's run, so we record the current run id and invoke the effect. If one of the original dependencies of effect A (let's call it B) is no longer used by A, but is still used by one of the recursively triggered effects of A, then B will still be registered as a dependency of A because its run id is higher than A's run id. The end result is that A 'falsely' retains its dependency on B.

We could fix this by maintaining an array of run ids, but that has performance and complexity downsides. So let's take a step back: is it a problem that A retains the dependency on B? First off, this is really an edge case. And as far as I can tell, it's not really a problem, because when that dependency is triggered, it would recursively trigger the subject as well anyway. It won't lead to memory leaks, as it will be removed once it is no longer a dependency of a dependency of A.

This idea is quite easy to implement and test. It will cost only a couple of lines of code and gain us 20 to 30% in performance. I think that's worth it.

Having said all that, I must also share my disappointment. Since the release of vue-next, it seems that all progress on the core has come to a halt. I'm reluctant to spend any time on implementing this or other improvements right now, as long as this PR is not merged; I would only be making the merge more difficult if I did. And there's no guarantee that it will be merged at all.

This project is great, but now is the time to streamline the core. If we don't do it now, it will only become more and more difficult.

jods4 (Contributor) commented Nov 16, 2020

Your algorithm is simple, but we have to take into account that effects are triggered recursively. So during the run of an effect, other effects will be run as well, and the run id is incremented and decremented multiple times before the cleanup takes place.

To clarify: the run id is a monotonic sequence, always increasing.
At the beginning of each run you increment by one and take note of the current value. You never decrease its value.

Effects are not re-entrant. The code checks if the effect is already on the stack and doesn't run it a second time.
Different effects can stack, but each effect has its own distinct dependencies (internally, of course; two effects may depend on the same source), so there's no overlap between two effects.
The technique described in my first comment (run ids to remove unused deps) was not invented by me. I stole it from Steven Sanderson, who used it in Knockout, where it's been working for years. My second comment is more uncharted territory.

Reduces memory usage for (computed) refs by 25% and improves creation performance
Move out of the ReactiveEffectOptions in an effort to remove the necessity of creating this object for most use cases, saving memory and improving performance.
Move several options to params, so that an object is not required for watchers or computeds. This reduces memory. On top of that, place some optional params on the prototype unless non-default.
There's no point in underscoring the effect property of the runner.
…body

1. One variable less on the stack
2. Smaller functions mean less time to optimize
Out-of-bounds access should always be avoided, for performance reasons. In practice, some benchmarks (especially 'write ref, read 1000 computeds') showed a significant (30%) improvement after this change.
Explain that refs store their deps in a local property.
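
For illustration, a sketch of what "refs store their deps in a local property" means (names approximated, not the PR's exact code):

// Instead of looking the dep set up in a global WeakMap keyed by the ref,
// the ref object carries its subscribers directly in an own property.
class SketchRef<T> {
  dep?: Set<object>                  // created lazily on first track
  constructor(private _value: T) {}

  get value() {
    // trackRefValue(this): register the running effect straight into this.dep,
    // e.g. (this.dep ??= new Set()).add(activeEffect) in the real implementation
    return this._value
  }
  set value(v: T) {
    this._value = v
    // triggerRefValue(this): iterate this.dep and run/schedule each effect
  }
}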
@yyx990803 yyx990803 added the ❗ p4-important Priority 4: this fixes bugs that violate documented behavior, or significantly improves perf. label Dec 4, 2020
@LinusBorg LinusBorg added this to Planned in Next Patch Feb 1, 2021
@LinusBorg LinusBorg moved this from Planned to Guidance needed in Next Patch Feb 1, 2021
@LinusBorg LinusBorg removed this from Guidance needed in Next Patch Feb 5, 2021
@LinusBorg LinusBorg added the need guidance The approach/solution in the PR is unclear and requires guidance from maintainer to proceed further. label Feb 6, 2021
@yyx990803 yyx990803 added this to In progress in 3.2.0 via automation Feb 25, 2021
@basvanmeurs (Contributor, Author)

@yyx990803 would you prefer a new PR which only includes the performance improvements that don't affect the API?

@yyx990803 yyx990803 changed the base branch from master to 3.2 June 22, 2021 19:24
@yyx990803 (Member)

@basvanmeurs yes, I'm actually reviewing the changes for 3.2 right now and it would be great to have a new PR with a perf-only changeset (and rebased).

Also, I apologize for letting this sit for so long - and really appreciate your patience. Thanks again!

@basvanmeurs (Contributor, Author)

Thanks Evan. See #3995

3.2.0 automation moved this from In progress to Done Jun 23, 2021
yyx990803 added a commit that referenced this pull request Jun 24, 2021
Based on #2345 , but with smaller API change

- Use class implementation for `ReactiveEffect`
- Switch internal creation of effects to use the class constructor
- Avoid options object allocation
- Avoid creating bound effect runner function (used in schedulers) when not necessary.
- Consumes ~17% less memory compared to last commit
- Introduces a very minor breaking change: the `scheduler` option passed to `effect` no longer receives the runner function.
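
Roughly, the class-based creation pattern this commit describes looks like this (the constructor signature is approximated from the bullet points, not copied from the commit):

import { ReactiveEffect, ref } from '@vue/reactivity'

const count = ref(0)
const queue: ReactiveEffect[] = []

// Construct the class directly and pass the scheduler as a constructor argument
// instead of an options object, then invoke the effect via .run().
// Per the breaking change above, the scheduler no longer receives the runner.
const eff: ReactiveEffect = new ReactiveEffect(
  () => console.log(count.value),
  () => queue.push(eff),
)
eff.run()        // runs the effect once and tracks count
count.value++    // the scheduler fires: eff is queued instead of re-run immediately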

yyx990803 added a commit that referenced this pull request Jul 1, 2021
yyx990803 added a commit that referenced this pull request Jul 5, 2021
yyx990803 added a commit that referenced this pull request Jul 6, 2021
yyx990803 added a commit that referenced this pull request Jul 15, 2021
yyx990803 added a commit that referenced this pull request Jul 16, 2021