Showing JS-only, Browser-only, and JS+Browser measurements #1233
I've been championing this for a while, along with reducing the 16x slowdown to maybe 4x and reducing the DOM size. So far there's been little enthusiasm. Even very heavy apps like Slack consistently stay well below 10k DOM nodes, while this bench generates 88k for the "create 10k rows + append 1k" metric.
If the purpose of the slowdown is to make some tests weigh more than they otherwise would in the average, the slowdown could just be removed and the result multiplied by some constant. That should achieve the same effect, and perhaps also lower the time it takes to run the benchmark. I'm not exactly sure how the slowdown is implemented in Chrome either; it may be more reliable to just multiply the normal result by a constant. 80k+ nodes is kind of absurd, though I think the point of that test is more to see how the framework scales, and perhaps to magnify small problems that might exist at lower numbers but go unnoticed. We could probably just switch to a 100-row base case and a 1,000-row "worst" case without losing too much information. That'd be good if it makes the benchmark significantly cheaper to run.
For ember I'd actually expect ~28 msecs. Hmmm. Even if I add a large sleep after running the benchmark, both puppeteer and playwright report something like ScriptDuration = 0.027602 before runBenchmark and 0.028274 after, which yields a duration of 0.672 msecs. The timestamps show that the values are about 1 sec apart (due to the wait). Not good.
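For what it's worth, the delta computation described above can be sketched as follows. This is a minimal sketch assuming the shape of puppeteer's `page.metrics()` result, where `ScriptDuration` is reported in fractional seconds; the numbers are the ones quoted above:

```javascript
// Minimal sketch: compute scripting time as the difference between two
// snapshots of ScriptDuration (as returned by puppeteer's page.metrics(),
// which wraps the Performance.getMetrics CDP call), converted to msecs.
function scriptDurationDeltaMs(before, after) {
  // ScriptDuration is in fractional seconds; convert to milliseconds.
  return (after.ScriptDuration - before.ScriptDuration) * 1000;
}

// The values quoted in the comment above:
const delta = scriptDurationDeltaMs(
  { ScriptDuration: 0.027602 },
  { ScriptDuration: 0.028274 }
);
console.log(delta.toFixed(3)); // "0.672"
```

Which illustrates the problem: the delta accounts for well under 1 msec of scripting against an expected ~28 msecs, so getMetrics alone apparently doesn't capture the benchmark's script cost.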
Great to see some progress! I would also make sure to include any GC cost in this if it's not already part of "script". I've seen it under "system" in Chrome's profiler summary (especially the forced GC at the end of a run).
One opinion: how the script manipulates the DOM still counts. One framework may take much less time scripting but perform many more duplicated DOM operations, while another may take somewhat more time scripting but spend significantly less on DOM operations. These values can be measured separately, but they must be evaluated together.
To give a higher cost to inefficient DOM operations, one could just add 10 mutation observers to the page or something like that; those things are slow. The problem is that nearly everybody uses .appendChild because it's faster, for no good reason, when measured in isolation, while .append is faster if there are any mutation observers on the page, potentially by a huge amount.
That's still not a very good way to simulate slow repaints/reflows. Moving empty text nodes around costs almost nothing when there are no mutation observers, but with multiple observers added it incurs unrealistic costs for these originally cheap operations.
Agreed. This whole JS/GC-only measurement exercise only makes sense for frameworks that already do near-optimal DOM ops with identical repaint/reflow costs. However, there could be a different approach here. Instead of trying to measure script execution directly, why don't we measure the fastest possible restyle+reflow+paint and simply subtract it from the totals? That should hopefully cover all cases, including frameworks that do duplicate/inefficient DOM ops.
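As a sketch of that subtraction (with made-up numbers and framework names; the assumed input maps each implementation to its total duration and its measured restyle+reflow+paint time for one metric, both in msecs):

```javascript
// Sketch of the proposed baseline subtraction: for one metric, take the
// fastest restyle+reflow+paint time seen across all implementations as
// the "ideal" render cost and subtract it from each total, leaving an
// estimate of script-side cost. All data here is made up for illustration.
function subtractRenderBaseline(results) {
  const baseline = Math.min(...Object.values(results).map(r => r.render));
  const adjusted = {};
  for (const [name, r] of Object.entries(results)) {
    adjusted[name] = r.total - baseline;
  }
  return adjusted;
}

const createRows = {
  vanillajs: { total: 39, render: 33 },
  miso: { total: 56, render: 33 },
  slowfw: { total: 80, render: 40 },
};
console.log(subtractRenderBaseline(createRows));
// { vanillajs: 6, miso: 23, slowfw: 47 }
```

Note that an implementation doing duplicate DOM ops (like `slowfw` here, with a higher render time) still pays for the excess render work, since only the common fastest baseline is subtracted.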
Here's another interesting project: https://github.com/yamiteru/isitfast
That's an interesting idea, but it's hard to determine what the fastest possible time is. We have multiple implementations with different pros and cons, and different approaches may have different limitations. It's very hard to find common ground for these measurements.
This benchmark does not attempt to pick the fastest framework in all categories; there will likely never be an implementation that scores 1.00 across the board. Each metric is ranked in isolation, which makes the proposed approach consistent with how the benchmark already works.
I implemented a first version that tries to compute script duration from the trace files, since I couldn't get reasonable values from performance.getMetrics. It seems to work for some nasty cases like ui5-webcomponents. Looking forward to your feedback. I haven't checked enough values yet to be confident that they're all correct.
As the author of ef.js, I lost my first place in swap rows under JS-only; that's not good 😈 Kidding aside, I agree that GC time should be taken into account in scripting. Still, what baseline should we take for reflow+restyle+repaint?
Whatever implementation is fastest in reflow+restyle+repaint for each metric.
I came back to this issue. We have an established way to compute total duration (end of paint minus start of click). I created a way to measure JS duration: the sum of the durations of all events named "EventDispatch", "EvaluateScript", "v8.evaluateModule", "FunctionCall", "TimerFire", "FireIdleCallback", "FireAnimationFrame", "RunMicrotasks", or "V8.Execute". This gives the table above, and it seems to be close to what Chrome displays as scripting.

If Browser-only meant total duration minus JS duration, the results would get odd. Miso is fastest for create 1k with 56 msecs total duration and 23 msecs JS duration (giving 33 msecs browser-only), whilst vanillajs has 39 msecs total duration and 2 msecs JS duration, which gives 37 msecs browser-only. Sorted by create 1k, the table looks like this: It seems to make little sense for create 1,000 rows. Some benchmarks are a little more interesting, like remove row: Maybe it makes more sense to compute the lengths of all painting and layout intervals as a third factor? The current results table lets one play with the three options.
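A naive version of that summation might look like this (a sketch assuming the standard Chrome trace event format, where each event has a `name` and a `dur` in microseconds; the event list is the one given above):

```javascript
// Sketch of summing "scripting" time from Chrome trace events, assuming
// the usual trace event shape { name, ph, ts, dur } with dur in
// microseconds. The event names are those listed in the comment above.
const SCRIPT_EVENTS = new Set([
  "EventDispatch", "EvaluateScript", "v8.evaluateModule", "FunctionCall",
  "TimerFire", "FireIdleCallback", "FireAnimationFrame", "RunMicrotasks",
  "V8.Execute",
]);

function sumScriptEventsMs(traceEvents) {
  let totalUs = 0;
  for (const e of traceEvents) {
    if (SCRIPT_EVENTS.has(e.name) && typeof e.dur === "number") {
      totalUs += e.dur;
    }
  }
  return totalUs / 1000; // microseconds -> milliseconds
}

// Made-up events for illustration:
const events = [
  { name: "FunctionCall", ts: 0, dur: 1500 },
  { name: "Paint", ts: 2000, dur: 700 },
  { name: "RunMicrotasks", ts: 4000, dur: 500 },
];
console.log(sumScriptEventsMs(events)); // 2
```

One caveat: in a real trace these events can nest (e.g. a FunctionCall inside an EvaluateScript), so a plain sum double-counts; merging overlapping intervals first would avoid that.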
I added an "only render duration" selection. It computes the duration as the sum of all intervals for the "UpdateLayoutTree", "Layout", "Commit", "Paint", "Layerize", and "PrePaint" events. I haven't had time to do a big quality check yet, so please take the results with a grain of salt.
I ran a check that total time >= script time + paint time. The assertion holds for most runs, except for 21 traces (openui causes 15 of those, for replace rows; all other cases differ by less than 1 msec, so they don't matter much). The trace looks like this:
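The check described above amounts to something like this (a sketch with made-up run data; the 1 msec tolerance matches the differences mentioned above):

```javascript
// Sketch of the sanity check: for each run, total duration should be at
// least script time plus paint time, allowing a small tolerance (1 msec)
// for measurement noise. All run data here is made up for illustration.
function checkRuns(runs, toleranceMs = 1) {
  return runs.filter(r => r.script + r.paint > r.total + toleranceMs);
}

const runs = [
  { framework: "vanillajs", total: 39, script: 2, paint: 33 },
  { framework: "odd-case", total: 20, script: 15, paint: 10 },
];
console.log(checkRuns(runs)); // flags only the "odd-case" run
```

Runs that the filter returns are the suspicious traces worth inspecting by hand, like the openui ones mentioned above.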
The results table allows choosing between total duration, JS-only duration, and render duration.
As I understand it, tests like "create many rows" measure the entire change: not just how long the JavaScript took to execute, but also how long style recalculation, layout, and painting took.
If that's the case, it'd be interesting to see results under three different filters: browser-only measurements (to check whether the implementation has abnormal layout recalculations, for example), JavaScript-only measurements (to check how much work the framework is actually doing, which, assuming everything else is normal, is largely what actually matters in the benchmark), and a combined view where everything is taken into account.
The more practical motivation is that I think I might know how to halve the amount of JavaScript work needed in some unkeyed tests (and maybe some keyed tests, depending on the exact definition one is using), but if the displayed number is dwarfed by style/layout/paint calculations it will seem like not much, even though it's actually pretty significant.