Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deterministic memory metrics #916

Closed
leeoniya opened this issue Jul 29, 2021 · 25 comments
Closed

deterministic memory metrics #916

leeoniya opened this issue Jul 29, 2021 · 25 comments

Comments

@leeoniya
Copy link
Contributor

i've been investigating whether domvm has a real memory leak (in domvm/domvm#229) or if this was simply a recent change in chrome's GC behavior. right now it's looking like the latter (see profiles in that issue).

wonder if chromedriver or lighthouse has an explicit api to collect garbage, like puppeteer [1] which can be invoked prior to taking mem measurements. would be curious to see if this makes a difference for some libs.

cc @krausest @ryansolid

[1] puppeteer/puppeteer#1394 (comment)

@krausest
Copy link
Owner

krausest commented Jul 29, 2021

The memory benchmark (tries to) run the GC after initBenchmark and each runBenchmark loop: https://github.com/krausest/js-framework-benchmark/blob/master/webdriver-ts/src/forkedBenchmarkRunner.ts#L271
chrome has to be started with a special flag to accept window.gc calls:
https://github.com/krausest/js-framework-benchmark/blob/master/webdriver-ts/src/webdriverAccess.ts#L239
So there should be a garbage collection between all runs (unless something changed in recent chrome versions).

@leeoniya
Copy link
Contributor Author

that mem metric was in line with other adjacent libs for a long time and changed suddenly without any lib or bench impl updates, so idk 🤷

@leeoniya
Copy link
Contributor Author

here are 3 cycles of "create 1k" x 5, with a few seconds in between, then a forced GC from devtools at about 17500ms:

image

you can see the GC tends to collect under pressure (e.g. on the first or middle run of the next 5x set); simply being idle after each 5x set is not always sufficient, even though the memory can technically be reclaimed.

same story for create/clear:

image

@leeoniya
Copy link
Contributor Author

below are similar runs for ivi. (certainly, domvm is doing something that makes the GC quite a bit lazier during idle periods, i havent had much luck with tracking this down)

image

image

@leeoniya
Copy link
Contributor Author

finally, react-hooks:

image

image

@krausest
Copy link
Owner

Memory consumption went up with chrome 91. Here are the result table for chrome 90, 91 and 92:

@krausest
Copy link
Owner

krausest commented Jul 30, 2021

For the most frameworks the memory results were within 0.1 MB. Unless I missed a framework those are the exception (chrome 90 to the left chrome 91 to the right):

bdc is an exception: It used less memory on chrome 91
ember and glimmer differ, but maybe they should be ignored here due to their high variance.
The other frameworks (alpine, callbag, dojo, domvm, ls-element, marko, preact) report a much higher memory usage with chrome 91. It's interesting that there are all combinations: No change (all except the above), only in replace 1k (ls-element), only in create/clear (callbag) and in both (domvm, marko, preact).

@krausest
Copy link
Owner

I disabled the forceGC method. Looks like a potential for optimzation, it doesn't change the results.

So where are we now?

  • Memory consumption changed only for some frameworks
  • It's not due to broken forced GC calls

@krausest
Copy link
Owner

Over time there were two memory metric implemented. The one currently used extracts MajorGC events from the timeline and uses their reported memory value (usedHeapSizeAfter).
The other metric executes a :takeHeapSnapshot command and sums the size of all returned objects (maybe that's the cause the forced gc isn't necessary since this command might force a gc itself).
Both metrics are close, but of course not equal. For domvm both report high values:

Benchmark MajorGC HeapSnapshot
24 replace 1k 8.3246 8.4451
25 create/clear 8.5558 8.6764

@leeoniya
Copy link
Contributor Author

leeoniya commented Aug 1, 2021

do you get similar profiles to mine in devtools on your machine when manually/continuously running these metrics?

the GC strategy likely changed between 90 and 91, but it's unlikely that the memory used is now 3x for the same code. so either the original results were wonky, the new ones are wonky, or both.

@krausest
Copy link
Owner

krausest commented Aug 1, 2021

Here's a screenshot of domvm 3 cycles of 5 x create 1,000.
Screenshot from 2021-08-01 21-50-33
After GC memory is down to ~ 3,1 MB JS heap.

@leeoniya
Copy link
Contributor Author

leeoniya commented Aug 1, 2021

that looks consistent with mine. there's no leak happening AFAICT (forced gc at any time reclaims everything), and the peak usage should be pretty much identical to pre-91. hence this issue. i don't have a solution, but just wanted to get some more eyeballs on the situation.

@krausest
Copy link
Owner

krausest commented Aug 2, 2021

A third method would be to use
await driver.executeScript("return window.performance.memory.usedJSHeapSize");
and divide by 1024^2.
The results correlate to the reported values, but differ (all other measurement removed):

Benchmark vanillajs domvm
24 replace 1k 2.984 9.151
25 create/clear 3.185 9.510

I don't understand what's going on here. Seems like results in chrome seem to vary (often after reloading the page). Here's a memory profile of chrome after clicking 5x create 1,000 rows with memory similar to webdriver's results.
Shapshot 1 is before GC and snapshot 2 after pressing the GC button:

And here's a much lower result:

Currently I think that:

  • The following results in low memory results: Load domvm, open devtools and open the memory tab. Click creaate 1,000 rows five times. Just looking at the "Select profiling type" page displays on the bottom the used memory. domvm shows values between 3.1 and 9.1 MB on my machine, and running GC reduces it to 3.1 MB. Reload the page and repeat the test and it stays in that range and repeatedly shows that behaviour.
  • If you start chrome, load domvm switch to console and enter window.performance.memory.usedJSHeapSize and switch to memory, click "create 1,000 rows", evaluate usedJSHeapSize in the console again and look at the memory tab. The first run looks like above, but later runs start with high memory useage just when loading the page and running gc doesn't reduce memory (like in the first screenshot above).

So much for today, maybe I find some time tomorrow to look further.

@krausest
Copy link
Owner

krausest commented Aug 6, 2021

Here's where I stand:

  • All three ways to measure memory consumption in chromedriver report about 9 MB memory usage. This is true for all metrics used in islolation (usedJSHeapSize, :takeHeapSnapshot and GC timeline info usedHeapSizeAfter, also when adding driver.sleep for various seconds.
  • I played with chrome flags used for launching, but it doesn't change anything
  • OSX on M1 looks pretty much the same for domvm

All in all I'm pretty sure this test really consumes 9 MB memory in chromedriver.

But when running in a normal chrome instance most of the time I see that memory consumption drops from about 9 MB to about 3,1 MB.

I'm really close to giving up on this one which means temporarily disabling most memory benchmarks.

Does anyone have an idea what else to try?

@krausest
Copy link
Owner

krausest commented Aug 7, 2021

Last resort was to try puppeteer. Here's a crude and small hack to emulate the memory benchmark with puppeteer:

import * as puppeteer from 'puppeteer-core';

const countObjects = async (page: puppeteer.Page) => {
    const prototypeHandle = await page.evaluateHandle(() => Object.prototype);
    const objectsHandle = await page.queryObjects(prototypeHandle);
    const numberOfObjects = await page.evaluate((instances) => instances.length, objectsHandle);

    await Promise.all([
        prototypeHandle.dispose(),
        objectsHandle.dispose()
    ]);
    
    return numberOfObjects;
};

export async function runPuppeteer22(url: string, name: string)
{
    const browser = await puppeteer.launch({ headless: false, executablePath: '/usr/bin/google-chrome' });
    const page = await browser.newPage();
    // await (page as any)._client.send('HeapProfiler.enable');
    await page.goto(url);
    await page.click('#run');
    let elem = await page.waitForXPath('//tbody/tr[1]');
    
    await page.waitForTimeout(1000);    
    let count = await countObjects(page);
    // await (page as any)._client.send('HeapProfiler.collectGarbage');
    await page.waitForTimeout(1000);    

    let metrics = await page.metrics();
    let memory22 = metrics.JSHeapUsedSize;
    await browser.close();
    console.log("Benchmark 22: Memory usage after adding 1000 rows:",memory22);
}

export async function runPuppeteer24(url: string, name: string)
{
    const browser = await puppeteer.launch({ headless: false, executablePath: '/usr/bin/google-chrome' });
    const page = await browser.newPage();
    // await (page as any)._client.send('HeapProfiler.enable');
    await page.goto(url);

    for (let i = 0; i < 5; i++) {
        await page.click('#run');
        await page.waitForTimeout(100);
        let element = await page.waitForXPath('//tbody/tr[1000]/td[1]');
        let value = await page.evaluate(el => el.innerText, element)
        let expected = (1000 * (i + 1)).toFixed();
        if (expected != value) {
            console.log("**** WRONG! Value ", value, "expected");
        }
    }

    await page.waitForTimeout(1000);    
    let count2 = await countObjects(page);
    // await (page as any)._client.send('HeapProfiler.collectGarbage');
    await page.waitForTimeout(1000);    

    let metrics = await page.metrics();
    let memory24_after_GC = metrics.JSHeapUsedSize;

    console.log("Benchmark 24: memory Memory usage after clicking create 1000 rows 5 times:",memory24_after_GC);
    await browser.close();
}


async function main() {
    const urls = [
        {name:'domvm', url:'http://localhost:8080/frameworks/keyed/domvm/index.html'},
        {name:'callbag-jsx-list', url:'http://localhost:8080/frameworks/keyed/callbag-jsx-list/index.html'},
        {name:'dojo', url:'http://localhost:8080/frameworks/keyed/dojo/output/dist/index.html'},
        {name:'marko', url:'http://localhost:8080/frameworks/keyed/marko/index.html'},
        {name:'preact', url:'http://localhost:8080/frameworks/keyed/preact/index.html'},
        {name:'ls-element', url:'http://localhost:8080/frameworks/keyed/ls-element/dist/index.html'},
        {name:'solid', url:'http://localhost:8080/frameworks/keyed/solid/index.html'},
        {name:'vanilla', url:'http://localhost:8080/frameworks/keyed/vanillajs/index.html'},
        {name:'react', url:'http://localhost:8080/frameworks/keyed/react/index.html'},
        {name:'vue', url:'http://localhost:8080/frameworks/keyed/vue/index.html'},
        {name:'svelte', url:'http://localhost:8080/frameworks/keyed/svelte/index.html'},
        {name:'angular', url:'http://localhost:8080/frameworks/keyed/angular//dist/angular/index.html'},
    ];
    for (let u of urls) {
        console.log(`results for ${u.name}`);
        await runPuppeteer22(u.url, u.name);
        await runPuppeteer24(u.url, u.name);
    };
}

main().then(() => {
    console.log("finished");
})

It prints:

./node_modules/.bin/ts-node src/puppeteerMemoryBench_main.ts
Results for vanilla
Benchmark 21: Memory usage after adding 1000 rows: 1567992
Benchmark 24: memory Memory usage after clicking create 1000 rows 5 times: 1465332
Results for domvm
Benchmark 21: Memory usage after adding 1000 rows: 1632960
Benchmark 24: memory Memory usage after clicking create 1000 rows 5 times: 8256952
finished

I added some sleep and page.queryObjects, which should cause a GC.
When I add

await (page as any)._client.send('HeapProfiler.enable');
await (page as any)._client.send('HeapProfiler.collectGarbage');

right before metrics = await page.metrics(); let memory24_after_GC = metrics.JSHeapUsedSize; I get the following:

Results for vanilla
Benchmark 21: Memory usage after adding 1000 rows: 1567992
Benchmark 24: memory Memory usage after clicking create 1000 rows 5 times: 1262544
Results for domvm
Benchmark 21: Memory usage after adding 1000 rows: 1632960
Benchmark 24: memory Memory usage after clicking create 1000 rows 5 times: 7629164
finished

So puppeteer also reports the high memory results for domvm.

@leeoniya
Copy link
Contributor Author

leeoniya commented Aug 8, 2021

similar results for the other libs with increased mem? (e.g. preact)

sorry i dont have much time to help with testing this ATM :(

@krausest
Copy link
Owner

krausest commented Aug 8, 2021

A picture says more...
Screenshot from 2021-08-08 09-59-20
First and third result row are from chromedriver, the line below both are puppeteer numbers. While there are difference where it might be interesting to discuss which one is close to the real value, the correlation holds.
Preact, dojo, domvm, marko, ls-element report bad values. vanilla, solid, svelte, vue, react are much lower.

I updated to code above. It's now closer to the chromedriver tests.

@krausest
Copy link
Owner

krausest commented Sep 1, 2021

Asking the puppeteer guys was a great help. It turned out that for puppeteer it's important to dispose any element you get via waitForXPath (which I didn't). Adding the dispose call caused the memory increase between runs to disappear.
And this seems to be the issue for chromedriver, too!
If you comment out the line await testTextContains(driver, "//tbody/tr[1000]/td[1]", (1000 * (i + 1)).toFixed(), config.TIMEOUT, false); in benchReplace5Memory.run then domvm reports 2.7 MB and preact about 3.7 MB.
(It's a bit strange that this impacts just a few framework)

Now to the next important question: Is there any way to dispose an element in selenium / chromedriver? I've never read about that...

@leeoniya
Copy link
Contributor Author

leeoniya commented Sep 1, 2021

nice find! 💯

img

as to why this only affects some frameworks but not others -- and i kind of suspected this previously -- is perhaps because they store back-references into the vdom as ad-hoc properties on the dom elements (el._node in domvm's case), which creates a difficult GC barrier. this explains maybe why the GC is lazier than i would expect.

i can try switching to an internal Map or WeakMap for this now that both are widely supported and it will likely fix the GC pressure but add overhead for maintaining the extra Map & doing lookups, might be a net win tho. 🤷

@krausest
Copy link
Owner

krausest commented Sep 1, 2021

@leeoniya I'll check how to work around this issue.

Has anyone read about a way to dispose the element handles from findElement in webdriver? It sounds like such a generic issue (and there's even an API method in puppeteer for it) that I'd really be surprised if there was no solution in webdriver / selenium.

If nothing else works I'll consider moving to puppeteer.

@leeoniya
Copy link
Contributor Author

leeoniya commented Sep 2, 2021

@leeoniya I'll check how to work around this issue.

no worries. just glad it's been root-caused.

@krausest
Copy link
Owner

I really thought I had a solution in the branch mentioned above. It looks much better for all frameworks except for alpine:
Screenshot from 2021-09-21 20-47-30

I'm not going to merge this PR at this time.

@krausest
Copy link
Owner

krausest commented Oct 21, 2021

And now I migrated the memory benchmark to puppeteer. Guess what it looks like now:
Screenshot from 2021-10-21 20-24-23
What's going on with alpine? The other frameworks look quite ok.
But actually chrome's memory tab tells the same:

I'll check what results I get for alpine 2.8.1 and puppeteer.
Here are the results - looks really like Alpine 3.2.3 has a memory issue.

So in summary this means that the PR mentioned above was actually okay and I didn't realize and porting to puppeteer wouldn't have been necessary.
But anyways I really like the way puppeteer handles a few things (most of all the pierce selector makes shadow dom bearable), so I'll clean up the code and then we'll use (at least currently) webdriver for the CPU benchmarks and puppeteer for the memory benchmarks (and hopefully someday we can move the CPU benchmarks to puppeteer too).
@leeoniya Does the benchmark screenshot above look plausible to you?

@krausest
Copy link
Owner

I pushed the code that uses puppeteer for the memory benchmarks.

@leeoniya
Copy link
Contributor Author

leeoniya commented Oct 23, 2021

sorry for the slow reply. i think if it also fixes preact (which is almost certainly leak-free) and some others, then this is 👍 .

might also be interesting to track peak mem as well, where template/block-based frameworks should shine vs traditional vdoms.

i'll add my usual criticism that when you have 80k dom nodes, gap between slow and fast frameworks shrinks since most of the time is dominated by browser layout and dom related tasks that are not framework specific.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants