Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: x1.5 CPU usage increase using Puppeteer between 20.8.3 to 20.9.0+ #12524

Closed
2 tasks
mostafa-hisham opened this issue Jun 3, 2024 · 17 comments
Closed
2 tasks

Comments

@mostafa-hisham
Copy link

mostafa-hisham commented Jun 3, 2024

Minimal, reproducible example

import puppeteer from 'puppeteer';
import { performance } from 'perf_hooks';
import { version } from 'process';
import { readFileSync } from 'fs';
const minimal_args = [
    '--autoplay-policy=user-gesture-required',
    '--disable-background-networking',
    '--disable-background-timer-throttling',
    '--disable-backgrounding-occluded-windows',
    '--disable-breakpad',
    '--disable-client-side-phishing-detection',
    '--disable-component-update',
    '--disable-default-apps',
    '--disable-dev-shm-usage',
    '--disable-domain-reliability',
    '--disable-extensions',
    '--disable-features=AudioServiceOutOfProcess',
    '--disable-hang-monitor',
    '--disable-ipc-flooding-protection',
    '--disable-notifications',
    '--disable-offer-store-unmasked-wallet-cards',
    '--disable-popup-blocking',
    '--disable-print-preview',
    '--disable-prompt-on-repost',
    '--disable-renderer-backgrounding',
    '--disable-setuid-sandbox',
    '--disable-speech-api',
    '--disable-sync',
    '--hide-scrollbars',
    '--ignore-gpu-blacklist',
    '--metrics-recording-only',
    '--mute-audio',
    '--no-default-browser-check',
    '--no-first-run',
    '--no-pings',
    '--no-sandbox',
    '--no-zygote',
    '--password-store=basic',
    '--use-gl=swiftshader',
    '--use-mock-keychain',
    '--disable-gpu',
    '--disable-canvas-aa',
    '--disable-2d-canvas-clip-aa',
    '--disable-gl-drawing-for-tests',
    '--disable-default-apps',
    '--disable-sync',
    '--safebrowsing-disable-auto-update',
    '--disable-accelerated-2d-canvas',
];

const getPackageVersion = (name) => {
    const fileName = `node_modules/${name}/package.json`;
    const content = readFileSync(fileName);
    const json = JSON.parse(content);
    return json.version;
};

export const task = async () => {
    const URL = 'https://google.com';
    const pptrVersion = getPackageVersion('puppeteer');

    const launchOptions = {
        headless: 'shell', // or 'old' depend on the puppeteer version
        args: minimal_args,
        pipe: true
    };

    const browser = await puppeteer.launch(launchOptions);
    const browserVersion = await browser.version();
    const allPages = await browser.pages();
    const page = (allPages.length) ? allPages[0] : await browser.newPage();
    await page.setRequestInterception(true);
    page.on('request', (req) => {
        if ('wpr-saas-no-intercept' in req.headers()){
            req.continue(); // Exception for LCP Beacon script.
        } else if ((req.resourceType() !== 'script' && req.resourceType() !== 'stylesheet' && req.resourceType() !== 'document')
            ||
            (
                req.resourceType() === 'script'
                && req.initiator().type === 'script'
            )
        ) {
            req.abort();
        } else {
            req.continue();
        }
    });
    const startTime = performance.now();
    await Promise.all([
        page.coverage.startCSSCoverage(),
        page.coverage.startJSCoverage()
    ]);
    await page.goto(URL, { waitUntil: 'load' });
    
    const elements = await page.$$('*');
    const tasks = elements.map((elem) => elem.evaluate((el) => el.className));
    await Promise.all(tasks);
    const [cssCoverage, jsCoverage] = await Promise.all([
        page.coverage.stopCSSCoverage(),
        page.coverage.stopJSCoverage(),
    ]);
    const res = {
        node: version,
        puppeteer: pptrVersion,
        browser: browserVersion,
        duration: performance.now() - startTime,
        cssCoverage,
        jsCoverage
    };

    await browser.close();

    return res;
}

Error string

No error

Bug behavior

  • Flaky
  • PDF

Background

Puppeteer is used to do a task involving query elements on the page, run CSS and JS coverage, and request interception similar to the code example above.
We use one browser instance and open multiple pages then close the page after the task is done, we don't create a new browser for each task.

Problem
After updating Puppeteer to v22.10.0 we noticed CPU usage has around x1.5 increase
The version we originally used was pretty old so we tried to track back when the CPU usage issue started

Test
We changed the puppeteer version and tested it on two servers with similar specs and did the same task on the same page URLS

The following screenshots have a graph for CPU usage on both servers, the only difference between the servers during the test is the Puppeteer version

- Green (mirror) -> used to test different Puppeteer versions
- Yellow -> is always using Puppeteer v15.5.0

if we used the same puppeteer version we always have them with CPU usage equal (up to 1 : 2%)

  • both 15.5.0 (both around 40% CPU usage)
    image

  • Green (mirror) -> v22.10.0 (mirror is higher 60% CPU usage)
    image

  • Green (mirror) -> v22.10.0 with more pages opened in the browser, (mirror is higher around 70% CPU usage)
    image

  • Green (mirror) -> v20.9.0 (mirror is higher 60% CPU usage)
    image

  • Green (mirror) -> v20.8.3 (both around 40% CPU usage)
    image

Conclusion
The x1/.5 CPU usage increase started since Puppeteer v20.9.0 with browser version HeadlessChrome/115.0.5790.98 and we still have this CPU usage increase in v22.10.0 with browser version HeadlessChrome/125.0.6422.78

similar issue
#11944

Expectation

no CPU usage increase

Reality

CPU usage has around x1.5 increase

Puppeteer configuration file (if used)

const {join} = require('path');
/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
    // Changes the cache location for Puppeteer.
    cacheDirectory: join(__dirname,'.cache', 'puppeteer'),
};

Puppeteer version

22.10.0

Node version

18.0.0

Package manager

npm

Package manager version

8.6.0

Operating system

Linux

Copy link

github-actions bot commented Jun 3, 2024

The issue has been labeled as confirmed by the automatic analyser.
Someone from the Puppeteer team will take a look soon!


Analyzer run

@github-actions github-actions bot added confirmed and removed invalid labels Jun 3, 2024
@OrKoN
Copy link
Collaborator

OrKoN commented Jun 3, 2024

@mostafa-hisham could you check if you see the difference with 21.1.1 to 21.2.0? is it the same as #11944 ?

@OrKoN
Copy link
Collaborator

OrKoN commented Jun 3, 2024

@mostafa-hisham could you please also clarify if the increased CPU usage is for the node process running Puppeteer or for the browser process(es)?

@mostafa-hisham
Copy link
Author

mostafa-hisham commented Jun 3, 2024

@OrKoN

could you please also clarify if the increased CPU usage is for the node process running Puppeteer or for the browser process(es)?

CPU usage in the graphs is for container CPU usage at this time.
it should include both node running Puppeteer and browser process(es)

  • Green (mirror) -> v21.1.1 (mirror at 30% CPU usage)
    image

CPU is lower but many pages crash with this error TargetCloseError Protocol error on stoping CSSpagecoverage
that means that pages crash before any processes are done on puppeteer like coverage and evaluation so it is not very representative.

TargetCloseError: Protocol error (CSS.stopRuleUsageTracking): Session closed. Most likely the page has been closed.
2024-06-03T12:52:35.147148385Z     at CDPSessionImpl.send (/app/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:461:35)
2024-06-03T12:52:35.147151791Z     at CSSCoverage.stop (/app/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Coverage.js:293:57)
2024-06-03T12:52:35.147153755Z     at Coverage.stopCSSCoverage (/app/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Coverage.js:116:40)

@MathieuLamiot
Copy link

Hello @OrKoN I work with @mostafa-hisham.

could you check if you see the difference with 21.1.1 to 21.2.0? is it the same as #11944 ?

We though about that initially, but in #11944, it seems you found precisely the version update that introduce the issue. In our case, we see the issue when going from 20.8.3 to 20.9 so it seems different.
As @mostafa-hisham explained above, our code crashes with 21.1.1 so the test is not very useful, but since 20.9 is already impacted, I guess we have a different issue than #11944?

@OrKoN OrKoN added the P3 label Jun 5, 2024
@OrKoN
Copy link
Collaborator

OrKoN commented Jun 5, 2024

Thanks for the details, could you reproduce the issue outside of the server? Because I won't be able to accurately measure cpu usage increase without the actual workload and the server configuration (which CPUs, how many cores, anything else running there). Could you please update the example to measure the time spent similar to #11944? Also, it would be great to understand if you see the cpu increase in the node process or in the chrome process.

@OrKoN OrKoN changed the title [Bug]: x1.5 CPU usage increase using Puppeteer v22.10.0 [Bug]: x1.5 CPU usage increase using Puppeteer between 20.8.3 to 20.9.0+ Jun 5, 2024
@OrKoN
Copy link
Collaborator

OrKoN commented Jun 5, 2024

So I finally got a chance to run it. I have reduced the script a bit:

import puppeteer from "puppeteer";

async function main() {
  const browser = await puppeteer.launch({
    headless: true, 
    pipe: true,
  });
  const allPages = await browser.pages();
  const page = allPages.length ? allPages[0] : await browser.newPage();
  await page.setRequestInterception(true);
  page.on("request", (req) => {
    if ("wpr-saas-no-intercept" in req.headers()) {
      req.continue(); // Exception for LCP Beacon script.
    } else if (
      (req.resourceType() !== "script" &&
        req.resourceType() !== "stylesheet" &&
        req.resourceType() !== "document") ||
      (req.resourceType() === "script" && req.initiator().type === "script")
    ) {
      req.abort();
    } else {
      req.continue();
    }
  });
  const startTime = performance.now();
  await Promise.all([
    page.coverage.startCSSCoverage(),
    page.coverage.startJSCoverage(),
  ]);
  await page.goto("https://google.com", { waitUntil: "load" });

  const elements = await page.$$("*");
  await Promise.all(
    elements.map((elem) => elem.evaluate((el) => el.className)),
  );

  await Promise.all([
    page.coverage.stopCSSCoverage(),
    page.coverage.stopJSCoverage(),
  ]);
  const time = performance.now() - startTime;
  await browser.close();
  return time;
}

let n = 10;
let sum = 0;

for (let i = 0; i < n; i++) {
    sum += await main();
}

console.log(sum / n);

// 20.8.3 = 717.2703088 
// 20.9.0 = 707.1119905999997

I do not see a difference in time spent. Can you try the reduced script, do you still see the issue with? If yes, do you see a different time spent on the task or only the CPU usage?

Note that I removed custom args as the support level for them is varying and to rule out that they are not the cause of the issue.

UPD: same results with the same args

@OrKoN
Copy link
Collaborator

OrKoN commented Jun 5, 2024

In the latest version, my average time is 1014 but it is likely the other issue.

@MathieuLamiot
Copy link

MathieuLamiot commented Jun 7, 2024

Thanks for the update and looking into it.
We are working on providing more insights:

  • We will check time measurements on our servers that report higher CPU, to see if time measurment would also spot the issue.
  • Then we will rework the script above to reproduce locally. We think that we need to browse different pages, we have lists of what our servers work on so we can build from it.

We'll get back to you with that.

In the meantime, is there a way for us to run Puppeteer v20.9 with a browser version from Puppeteer v20.8? It would help us see if the issue comes from the puppeteer library or the browser itself?

@OrKoN
Copy link
Collaborator

OrKoN commented Jun 7, 2024

@MathieuLamiot yes, you can specify the path to the browser you want to run via executablePath in the launch command. That should generally works unless there were some changes in the browser that required Puppeteer changes that are not backward compatible.

@OrKoN
Copy link
Collaborator

OrKoN commented Jun 11, 2024

Did you have a chance to investigate where the CPU usage coming from and create a locally reproducible script?

@MathieuLamiot
Copy link

Hello, sorry not yet. The last few days have been crazy here ; but that's for sure still on the todo. Beginning with trying a different browser version

@MathieuLamiot
Copy link

@OrKoN We tested with Puppeteer v20.9 with chrome/linux-114.0.5735.133 (which is used in 20.8.3) and the issue does not appear. It would mean the browser is at the origin of the issue I suppose?
What would you recommend as the next steps to investigate and reach out for a resolution? Thanks

@OrKoN
Copy link
Collaborator

OrKoN commented Jun 17, 2024

@MathieuLamiot it would be helpful to figure out what process of the browser causes high usage or have a minimal example that reproduces the increased cpu usage. In the notes, I see that behavior of coverage has changed in 115 as well as the lazy frames behavior #10584 For coverage, I see that the browser produces more accurate ranges so if you do a lot of coverage reporting, maybe more detailed coverage results in more cpu used.

@mostafa-hisham
Copy link
Author

@OrKoN
I created an example that you can use to reproduce the issue here

  • it opens a couple of Chrome pages with different URLs
  • do coverage
  • evaluate elements on the page
  • measuring the time of evaluation and coverage

using the history command
it runs the test on the following versions

20.8.3
20.9.0
20.9.0-114 (puppeteer@20.9.0-chrome@114.0.5735.133)
22.12.0

here are the results I currently have
test with coverage

{"values":[
{"puppeteer":"20.8.3","duration":1497.0737290000543,"browser":"HeadlessChrome/114.0.5735.133"},

{"puppeteer":"20.9.0","duration":2788.871739999857,"browser":"HeadlessChrome/115.0.5790.98"},

{"puppeteer":"20.9.0-114-chrome","duration":1468.5795260001905,"browser":"HeadlessChrome/114.0.5735.133"},

{"puppeteer":"22.12.0","duration":2801.1059679999016,"browser":"HeadlessChrome/126.0.6478.63"}
]}

without coverage

{"values":[

{"puppeteer":"20.8.3","duration":267.59014699980617,"browser":"HeadlessChrome/114.0.5735.133"},

{"puppeteer":"20.9.0","duration":274.9572280002758,"browser":"HeadlessChrome/115.0.5790.98"},

{"puppeteer":"20.9.0-114-chrome","duration":187.95026499964297,"browser":"HeadlessChrome/114.0.5735.133"},

{"puppeteer":"22.12.0","duration":409.4865039996803,"browser":"HeadlessChrome/126.0.6478.63"}
]}

@OrKoN
Copy link
Collaborator

OrKoN commented Jun 28, 2024

@mostafa-hisham so it looks like coverage in Chrome started to take more time? it is probably related to it being more detailed. Also, for the latest versions we should use page.$$('selector', {isolate: false}) to compare performance with the older versions.

@MathieuLamiot
Copy link

Indeed, it looks like the root cause could be the more detailed coverage. It's too bad there is not a way to keep the "simple" version for performance when the advanced stuff is not needed. Anyway, I guess there is nothing puppeteer can do about it. We are looking into getting rid of uses of coverage for our application. Thanks

@OrKoN OrKoN closed this as not planned Won't fix, can't repro, duplicate, stale Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants