Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

surplus: faster clear via rAF, O(1) select row, update to latest S-array #168

Merged
merged 2 commits into from
May 14, 2017

Conversation

adamhaile
Copy link
Contributor

A few perf tweaks to Surplus-v0.4.0

  • faster clear via rAF
  • index rows by id to provide for O(1) select behavior
  • bump S-array version, as new version has a fast path to/from empty arrays

keyedresults
nonkeyedresults

@leeoniya
Copy link
Contributor

leeoniya commented Apr 28, 2017

i wonder if the rAF optimization for bulk clearing is done at the expense of predictable "liveness" guarantees and lifecycle hooks expectations (which doesnt matter for this bench). i think mainly libs that do not need to adhere to some of these rules can afford to implement this level of async/laziness. that or libs that have enough analysis to know when these guarantees aren't needed :)

just thinking out loud...

@localvoid
Copy link
Contributor

@leeoniya this rAF trick(inconsistent data) is even worse than implicit keyed recycling(mem leaks) :)

@adamhaile
Copy link
Contributor Author

Heh, I thought this one might draw some discussion.

Does anybody feel like they have a grasp on why the bulk clear is faster after an rAF? Maybe something about splitting the browser's work into two hunks, one for the button click animation and one for the clear, results in less work overall? Dunno, and haven't profiled it enough yet to understand.

It's definitely not a universal optimization, which is why I added it in app code rather than framework code.

@leeoniya yeah, good questions. For surplus, the whole idea of a "lifecycle" doesn't apply. DOM nodes are real, and they're just values like any other. That opens opportunities for tuning.

@localvoid are you talking about surplus having inconsistent data and mem leaks? Not sure I follow. Guaranteeing consistency and being leak-free are pretty much the main points of S.

BTW, we were wondering earlier about glimmer's fast clear result. Looks like it also its DOM mutations after an rAF. Same for vidom, which was where I first noticed it.

@localvoid
Copy link
Contributor

@adamhaile Inferno has implicit keyed recycling that can cause memory leaks. And now this rAF trick, can you at least implement it so that operations are serializable, because right now if you trigger "clear", and then "add" ops before rAF, you'll obviously get "add" and then "clear".

@adamhaile
Copy link
Contributor Author

Hey, I'm not willing to fall on my sword for rAF clears. My general approach with benchmarks like this is that they let us implementers compare approaches and share knowledge so as to make all the libraries better. The fact that bulk clears (or at least this bulk clear) seem to be faster after an rAF is, as far as I know, a new bit of info. I'm mostly curious to understand why and when it's faster and whether or not it can be exploited reasonably.

I'm not sure I agree that the controller should serialize clear and add operations. I intentionally put the rAF in the controller (not the store), b/c the controller serves the view, and it's a reasonable assumption that a user couldn't click both the clear and add buttons in a single frame.

If this were production code I was tasked with optimizing, and this "trick" bought me a 15-20% perf gain, heck yeah I'd do it. With a big freakin' comment in the code to explain why, no doubt, but I wouldn't think twice. That opens the whole debate about whether these should be optimized implementations or "naive" ones. I'd actually love to enter a "naive" surplus implementation, but I figured with two horses in the race already (keyed and non), I'd be pushing Stefan's patience to create a third.

@localvoid
Copy link
Contributor

localvoid commented Apr 30, 2017

I guess next step with this logic would be to implement occlusion culling, because that is how you'd do it in production code.

There are libraries in this benchmark that can correctly schedule DOM updates with rAF without any weird tricks, and that is how it should be done in production code, but they don't use this feature in benchmarks to make sure that they are working like everyone else. It would be easier to fix implementations that use async updates, instead of trying to add one-line rAF hacks to everyone else to get 15-20% perf gain.

@leeoniya
Copy link
Contributor

i tend to agree w/@localvoid.

i didn't look into the actual impl too deeply before but since it's basically an impl-level hack that can be done for all existing libs by wrapping the bulk clears, it's probably not worth doing. i would advise that @krausest roll back this optimization from the vanilla impl as well.

glimmer's clear doesnt seem particularly noteworthy, though vidom's is.

FWIW, domvm's default redraw mode is rAF-based, but it specifically does sync redraws (.redaw(true)) [1] so the timers come out accurate in the console. i wonder if i just change the impl to plain .redraw(), whether there'd be similar gains.

it's an interesting peculiarity and cool that you found it, though :)

[1] https://github.com/krausest/js-framework-benchmark/blob/master/domvm-v2.1.4-keyed/src/main.es6.js#L49

@thysultan
Copy link
Contributor

thysultan commented Apr 30, 2017

This doesn't look like it's measuring the operation since .clear() is executed after stopMeasure; stopMeasure() should go into the function passed to requestAnimationFrame.

How it looks:

startMeasure("clear"); // sync
requestAnimationFrame(() => this.store.clear()); // async
stopMeasure(); // sync

How it's executed (unless stopMeasure is also async):

startMeasure("clear"); // sync
stopMeasure(); // sync

this.store.clear()

How it should be for a correct measure

startMeasure("clear"); // sync
requestAnimationFrame(() => {
    this.store.clear() // sync
    stopMeasure(); // sync
}); // async

@leeoniya
Copy link
Contributor

leeoniya commented Apr 30, 2017

@thysultan this bench takes measurements from chrome's timeline. start/stopmeasure are just for console logging; where they're called should have no effect on the bench output.

@thysultan
Copy link
Contributor

@leeoniya Interesting, thanks for clearing that up.

@adamhaile
Copy link
Contributor Author

In case it's not clear, I'm OK with either way. I'll push back a bit against the insinuation that this was a one-line hack to "win" a benchmark, because that was not my intent. It's pretty obvious that all the other implementations could do the same change I did for the surplus version, so the relative rankings would be unchanged. I knew that when I submitted.

However, there is a real question here. The question most libs have pursued so far is "what's the fastest set of API interactions to update the DOM." This rAF issue opens a little door into a slightly different question, "what's the fastest way to schedule those updates." I don't know how much there is to gain on that avenue, but it's a legit area of exploration. I think we should hesitate before closing that off.

It's also a line of exploration that favors the vdom libraries over surplus. You have far more global information for doing analyses than I do in surplus. Surplus's approach is basically to slice the DOM operations into their smallest atomic pieces, then let S's dependency network figure out which to (re)run. That's good for doing the minimal amount of work, but bad for doing holistic analyses that cross units of work.

@leeoniya
Copy link
Contributor

I'll push back a bit against the insinuation that this was a one-line hack to "win" a benchmark, because that was not my intent.

I certainly didn't mean to imply this; sorry if it came off that way. I don't think you have anything to prove given surplus's current metrics. Generally i'm simply opposed to convoluted userland optimizations all of which:

  1. no user would ever know to use
  2. non-idiomatic for the framework / bloats readability for the sake of shaving a few ms
  3. could easily be replicated in all impls - improving them on one axis (perf) while degrading them on other axes.

since doing the domvm impl of this bench, i have moved more and more userland optimizations into the core and simplifying the impl, leaving it as obvious as possible while maintaining/improving perf. this rAF thing is basically the opposite of this IMO, but that's my personal $0.02. i have no final word in these matters here.

@adamhaile
Copy link
Contributor Author

Leon, no offence taken.

Sure, I can resubmit without the rAF clear. I'm away from my laptop right now, so might be a day or two.

@leeoniya
Copy link
Contributor

leeoniya commented May 1, 2017

for the record, i benched swapping redraw(true) for redraw() which uses rAF debouncing in the domvm impl and it had the same positive effect on bulk clearing plus some other bulk ops, but slightly worse perf on the cheaper ops. overall it won an extra ~1% from enabling rAF across the board. technically this is a legit 1% since redraw() is the typical use case and not an optimization; redraw(true) is a de-opt to force sync redraws which aren't even necessary here.

@krausest
Copy link
Owner

krausest commented May 2, 2017

I suggest banning rAF from the client side code, but I can't mandate that framework code must not use rAF.

I'm not sure about vanillajs. I might remove it there and cap all faster results to vanillajs' duration (similar to sub 16 msecs duration). But I'm neither sure if it's really worth nor that it's fair doing so - no matter if we like it or not, the rAF approach is simply faster in that case. Last, but not least, there are enough frameworks that handle clear rows still very badly and this is what I consider to be the real value of this benchmark, the gain by using rAF is in comparison quite low.

@leeoniya
Copy link
Contributor

leeoniya commented May 2, 2017

I suggest banning rAF from the client side code

👍 no reservations from me. though it leaves me in the awkward position of intentionally not simplifying my client code just to avoid benefiting from domvm's default rAF behavior, heh.

cap all faster results to vanillajs' duration

👎 dunno about this one.

every time vanillajs is anointed as the baseline, some clever lib finds a legitimately faster way of doing something.

@solkimicreb
Copy link
Contributor

Does anyone know if the rAF hack is a chrome specific 'improvement'?

@adamhaile
Copy link
Contributor Author

OK, pushed a version with rAF clear reverted. Also came across a try-finally in a hot loop that I refactored.

I'm on a new laptop -- MBP died :( -- and I'm not getting very reliable benchmark numbers right now. I think it's somewhere in the 1.028-1.045 range.

Thanks, again, Stefan, for the benchmark and for incorporating this pull.

@adamhaile
Copy link
Contributor Author

Stefan --
If you haven't looked at this PR yet, feel free to ignore it.

In respect of your time, I don't want to pepper you with small updates. Without the rAF clear, the cumulative effect of these changes is small. I'm working on some changes in S that will hopefully lower its memory usage considerably and which will probably have a bigger effect than these. So feel free either to kill this or leave it hanging until I can get the other changes finished.

Best
-- Adam

@krausest krausest merged commit 82ab318 into krausest:master May 14, 2017
@krausest
Copy link
Owner

@adamhaile You're welcome - thanks for the interesting discussion. Results are here

@leeoniya leeoniya mentioned this pull request May 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants