Puppeteer with headless:true is extremely slow #1718

Closed
foresightyj opened this issue Jan 4, 2018 · 34 comments
Comments

@foresightyj

foresightyj commented Jan 4, 2018

Although this issue was raised in #1550, it was closed without the problem being addressed.

My environment:

  • Puppeteer Version: "puppeteer": "^0.13.0"
  • Windows Server 2008 R2 Enterprise with Service Pack 1 & command prompt

With the following minimal test script:

const puppeteer = require('puppeteer');

// Pass "headless" as the first CLI argument to run without a visible browser.
const headless = process.argv[2] === "headless";

(async () => {
    const timeout = 30000;
    const browser = await puppeteer.launch({
        headless: headless,
        timeout,
    });
    const page = await browser.newPage();
    const label = "Go to Bing " + (headless ? "without" : "with") + " head.";
    // Time only the navigation and title lookup, not the browser launch.
    console.time(label);
    await page.goto("https://www.bing.com/", { timeout });
    const title = await page.title();
    console.log("Title: " + title);
    console.timeEnd(label);
    await browser.close();
})();

image

headless.log
head.log

To aid debugging, I enabled debug logging and collected two logs: headless.log and head.log.

@aslushnikov
Contributor

aslushnikov commented Jan 4, 2018

@foresightyj These are horrible numbers.

I can't reproduce this on OS X; I'll try tomorrow on win.
Meanwhile, could you please try installing puppeteer@next and see if anything's different? Puppeteer@next is a tip-of-tree version of pptr published to npm.

@foresightyj
Author

foresightyj commented Jan 4, 2018

@aslushnikov Thanks for the quick response. I turned on debugging and collected two logs. See original post. I will try the same code in another machine.

@foresightyj
Author

foresightyj commented Jan 4, 2018

@aslushnikov Sorry. The original test result was collected on a Windows Server 2008 machine (the original post had this wrong; I have just corrected it). This is our deployment server, which runs TeamCity and IIS hosting a dozen web sites. I am trying to set up Puppeteer as one of the TeamCity build steps, so it is better to run it without a head.

My own computer is a Windows 7 machine. The same code run on my own Windows 7 machine shows a different result: headless is much faster than with a head. See screenshot:

image

@aslushnikov
Contributor

@foresightyj Ah! Looks like everything's fine then. Do you want to close the issue?

@foresightyj
Author

@aslushnikov Not quite. It is fast on my own computer but not on the build server, which is where I really want to make it work.

I posted it here in the hope that someone might have experienced a similar issue or that the official team might give me some hints.

Here are a few places in headless.log where it seems to have spent far too much time:

Thu, 04 Jan 2018 09:15:58 GMT puppeteer:protocol ◀ RECV {"id":2,"result":{"targetId":"(4FF9A0D357B850B93784A30FA034144A)"}}
Thu, 04 Jan 2018 09:16:05 GMT puppeteer:protocol ◀ RECV {"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"(83E2830C84CE35E39E62649364B9B168)","type":"page","title":"about:blank","url":"about:blank","attached":false}}}

Thu, 04 Jan 2018 09:16:07 GMT puppeteer:session ◀ RECV {"method":"Page.lifecycleEvent","params":{"frameId":"(4FF9A0D357B850B93784A30FA034144A)","loaderId":"8952.1","name":"networkIdle","timestamp":105719.945644}}
Thu, 04 Jan 2018 09:16:18 GMT puppeteer:protocol ◀ RECV {"method":"Target.receivedMessageFromTarget","params":{"sessionId":"(4FF9A0D357B850B93784A30FA034144A):1","message":"{"id":10,"result":{"frameId":"(4FF9A0D357B850B93784A30FA034144A)"}}","targetId":"(4FF9A0D357B850B93784A30FA034144A)"}}

Thu, 04 Jan 2018 09:16:22 GMT puppeteer:session ◀ RECV {"method":"Network.loadingFinished","params":{"requestId":"8952.14","timestamp":105736.780863,"encodedDataLength":0}}
Thu, 04 Jan 2018 09:16:26 GMT puppeteer:protocol ◀ RECV {"method":"Target.receivedMessageFromTarget","params":{"sessionId"...
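
A minimal sketch of how pauses like these could be located automatically instead of by eye. It assumes every log line starts with a UTC timestamp in the format shown in the excerpts above; the function name `findGaps` and the threshold parameter are hypothetical, chosen only for illustration.

```javascript
// Sketch: report large pauses between consecutive timestamped log lines.
// Assumes each line starts with a UTC timestamp such as
// "Thu, 04 Jan 2018 09:15:58 GMT", as in the excerpts above.
const TIMESTAMP = /^(\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} GMT) /;

function findGaps(lines, thresholdMs) {
  const gaps = [];
  let previous = null; // { time, line } of the last timestamped line seen
  for (const line of lines) {
    const match = TIMESTAMP.exec(line);
    if (!match) continue; // skip blank or non-timestamped lines
    const time = Date.parse(match[1]);
    if (previous !== null && time - previous.time >= thresholdMs) {
      gaps.push({ ms: time - previous.time, before: previous.line, after: line });
    }
    previous = { time, line };
  }
  return gaps;
}
```

Feeding it the lines of headless.log (for example after reading the file and splitting on newlines) with a threshold of 5000 would list every pause of five seconds or more, together with the messages on either side of it.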

@foresightyj
Author

foresightyj commented Jan 4, 2018

@aslushnikov The build server hosts TeamCity and IIS, so it has much more network I/O than my own computer, but I still do not see how that would affect Puppeteer's performance. Besides, the comparison was done on the same machine with and without a head, all other things being equal. It is beyond my understanding.

I can understand if the official team cannot help in this case as you cannot repeat the bug in your environment. I will close the issue in that case.

@aslushnikov
Contributor

@foresightyj this sounds like a process priority issue. I don't have much experience with Windows, but some googling gave me a plausible entry on serverfault.

@foresightyj
Author

foresightyj commented Jan 4, 2018

@aslushnikov That is a very plausible guess. I am not sure, but Windows might assign higher priority to processes with a UI?

I will follow your hint and come back with results. Thanks a lot!

@foresightyj
Author

I tried raising the priority of all chrome.exe processes to high, but it made no difference. Code used:

const puppeteer = require('puppeteer');
const child = require('child_process');

const headless = process.argv[2] === "headless";

// Windows-only: raise every running chrome.exe process to high priority.
const setPriorityCmd = "wmic process where name=\"chrome.exe\" call setpriority \"high priority\"";

(async () => {
    const timeout = 30000;
    const browser = await puppeteer.launch({
        headless: headless,
        timeout,
    });
    const page = await browser.newPage();

    // Run after launch so that the chrome.exe processes already exist.
    console.log(child.execSync(setPriorityCmd).toString());

    const label = "Go to Bing " + (headless ? "without" : "with") + " head.";
    console.time(label);
    await page.goto("https://www.bing.com/", { timeout });
    const title = await page.title();
    console.log("Title: " + title);
    console.timeEnd(label);
    await browser.close();
})();

Process Explorer shows that all chrome.exe processes run with a priority of 13, which is high priority.

image

priority

And the result is still bad:

image

@foresightyj
Author

I placed the goto statement in a for loop and collected the following result:

image

It always takes more than 10 seconds to go to www.bing.com. In hindsight, I realized that thread priority should not cause such delays so deterministically. The delay is around 16 seconds 99% of the time. There must be another reason.
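
A minimal sketch of how the repeated measurement could be summarised, so that a tight cluster around one value (as with the 16 seconds here) stands out from random jitter. The helper names `timeRuns` and `summarize` are hypothetical; `action` stands for whatever is being timed, for instance a function that calls `page.goto` on an already open page.

```javascript
// Sketch: time an async action several times and summarise the spread.
// `action` is any function returning a promise, e.g. () => page.goto(url).
const { performance } = require('perf_hooks');

async function timeRuns(action, runs) {
  const durations = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await action(i);
    durations.push(performance.now() - start);
  }
  return durations; // milliseconds, one entry per run
}

function summarize(durations) {
  const sorted = [...durations].sort((a, b) => a - b);
  const sum = sorted.reduce((total, d) => total + d, 0);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { min: sorted[0], median, mean: sum / sorted.length, max: sorted[sorted.length - 1] };
}
```

If the minimum, median and maximum all sit close together, the delay looks like a fixed wait somewhere rather than scheduling noise, which matches the reasoning above.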

I tried the same script on another Windows Server 2008 machine (same version of the OS) and found that the time taken is roughly the same (still more than 10 seconds on average).

image

I also tested it on a Windows Server 2012 R2 machine (one of our production machines, which runs on a VPS). It runs rather quickly there.

image

@foresightyj
Author

foresightyj commented Jan 6, 2018

Another update. One of my colleagues' work computers also runs Windows Server 2008, and the script runs quickly on his computer. This is really a weird bug. The only difference now is the physical machines because the two windows 8 servers are blade servers.

@foresightyj
Author

@aslushnikov Sorry, I accidentally closed the issue just now. I will leave it open for a few days. Hopefully someone else who experiences similar issues can discuss them with me here. I hope you won't mind.

@foresightyj foresightyj reopened this Jan 6, 2018
@aslushnikov
Contributor

This is really a weird bug. The only difference now is the physical machines because the two windows 8 servers are blade servers.

@foresightyj these are the hardest to debug. How is your progress on this?

@butorinio

I solved this using the flag: --deterministic-fetch
https://peter.sh/experiments/chromium-command-line-switches/#deterministic-fetch

@foresightyj
Author

@aslushnikov Thanks for asking. No progress. The weird thing is that when it is run as a Teamcity build step, the browser never shows up even though the headless option is false. It runs just as fast, so effectively I have it running headlessly.

@igor-butorin I tried the flag but it didn't help: it takes forever and stops at an unhandled promise rejection. The documentation says that it will run slower with this flag. But I will read the whole documentation about those Chromium flags and try some of them when I have time. Thanks!

@foresightyj
Author

foresightyj commented Jan 12, 2018

@aslushnikov By the way, I do not mind if you close this issue. I'll leave it to you.

@aslushnikov
Contributor

Thanks for asking. No progress. The weird thing is that when it is run as a Teamcity build step, the browser never shows up even though the headless option is false.

@foresightyj I'd try asking Teamcity team for help.

@aslushnikov By the way, I do not mind if you close this issue. I'll leave it to you.

Closing, I don't think we'll be of much help with this.

@bluermind

Has anybody fixed the issue yet?

@joeldouglass

joeldouglass commented Apr 10, 2018

I'm seeing this same issue when running in a Windows service using node-windows. Runs pretty fast with {headless: false}, but times out after 30s almost every time when run with {headless: true}. The same app runs very fast as a standalone node app from the command line.

@Robula

Robula commented May 17, 2018

macOS: 10.13.4
Node: 8.10.0
Puppeteer: 1.4.0

I am also seeing a horrendous amount of time added to my requests when running with headless: true.
Here is a simple code example:

import * as puppeteer from 'puppeteer';

// Render a trivial page and take a full-page screenshot.
const renderPage = async (browser: puppeteer.Browser) => {
  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 1080 });
  await page.setContent('<h1>Some really simple HTML</h1>');
  const screenshot = await page.screenshot({ fullPage: true }); // Sets buffer but isn't used in this example.
  await page.close();
};

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const promises = [];
  console.time('Timer');
  // Start ten renders concurrently and wait for all of them.
  for (let i = 0; i < 10; i++) {
    promises.push(renderPage(browser));
  }
  await Promise.all(promises);
  console.timeEnd('Timer');
  await browser.close(); // Added so the process can exit; outside the timed section.
})();

Running the above code with headless: true results in 7747.209ms
Running the above code with headless: false results in 4775.771ms

This is a HUGE difference in performance. Obviously I am working with more complex HTML than a simple <h1> tag and the disparity between the two modes is far greater.

@eliseumds

About 20% faster here with { headless: false }.

Puppeteer 1.4.0
NodeJS 10.1.0

@ChiggerChug

Hey guys, so I've found a workaround for this. I found the answer in another thread unrelated to puppeteer but also has the problem with chromium being extremely slow in headless mode. I noticed that when I tried to debug chromium that network requests were the slow bit in terms of headless mode so I searched for that.

const browser = await puppeteer.launch({args: ["--proxy-server='direct://'", '--proxy-bypass-list=*']})

When launching the browser add these 2 arguments to the list and it seems (for me anyway) to resolve all issues with speed.

Link to thread if you're interested: codeceptjs/CodeceptJS#561 and props to oligee80 (who found the answer) for saving me about a week of work rolling back a bunch of product releases! 🎉
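
A minimal sketch of how these two flags could be merged into existing launch options without duplicating them. The flag strings are copied verbatim from the comment above; the helper names `withDirectConnection` and `launchDirect` are hypothetical. The thread does not establish why the flags help; one reading, which is an assumption here, is that they stop headless Chromium from waiting on proxy detection.

```javascript
// Sketch: add the two proxy-related flags from this comment to launch options.
// The flag strings are copied verbatim from the comment above.
const DIRECT_CONNECTION_ARGS = ["--proxy-server='direct://'", '--proxy-bypass-list=*'];

function withDirectConnection(options = {}) {
  const existing = options.args || [];
  // Only append flags that the caller has not already supplied.
  const missing = DIRECT_CONNECTION_ARGS.filter((flag) => !existing.includes(flag));
  return { ...options, args: [...existing, ...missing] };
}

// Not called automatically: requires the puppeteer package to be installed.
async function launchDirect(options) {
  const puppeteer = require('puppeteer');
  return puppeteer.launch(withDirectConnection(options));
}
```

Calling `launchDirect({ headless: true })` would then behave like the one-line launch call above, while leaving any other options and arguments untouched.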

@SkyCTing

@ChiggerChug I tried that and it works, thank you~~

@RomHartmann

I have been getting conflicting results for the different websites that I hit. Sometimes headless takes ages longer than headful, other times about the same. Here is my experiment.

Methodology
I loaded the same page 100 times in both headful and headless mode and recorded the time taken by the different stages.

I've removed the timekeeping code (and lots of other code) for legibility. You'll also note that I've commented out the ublock extension, which is necessary for blocking out video and other ads that otherwise cause the second page not to load. (The extension works fine in headful mode.)

const puppeteer = require('puppeteer');
const fs = require('fs');

// Hypothetical placeholders: the original post omitted these definitions.
const url = process.argv[2];
const filepath = __dirname + '/screenshot';

var is_ok = false;
var headless = true; // the semicolon matters: the next line starts with "("

(async () => {
    for (let i = 0; i < 100; i++) {
        await main();
    }

    process.exit(Number(!is_ok));
})();

async function main() {
    const start = new Date();
    // const ext = __dirname + '/ublock';
    // const datadir = __dirname + '/ublock_data';
    const browser = await puppeteer.launch({
        ignoreHTTPSErrors: true,
        dumpio: false,
        // userDataDir: datadir,
        headless: headless,
        args: [
            '--no-sandbox',
            '--disable-setuid-sandbox',
            '--ignore-certificate-errors',
            '--proxy-server="direct://"',
            '--proxy-bypass-list=*',
            // `--disable-extensions-except=${ext}`,
            // `--load-extension=${ext}`
        ]
    });
    // browser_open time

    const page = await browser.newPage();

    await page.setViewport({
        width: 960,
        height: 800
    });
    await page.goto(url, {
            waitUntil: 'networkidle2',
            timeout: 60000,
        }).then((response) => {
            is_ok = response.ok(); // ok() is a method on the response object
        })
        .catch((error) => {
            console.error('died with Error:', error.toString());
        });

    // pageload time

    await page.screenshot({
            path: filepath + ".jpg",
            type: 'jpeg',
            fullPage: true,
        }).catch((error) => {
        console.error('Died on await target.screenshot:', error);
    });

    await browser.close().catch((error) => {
        console.error('Died on await browser.close():', error);
    });

    // screenshot_time
    // total_time
}
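
The timekeeping code was removed from the script above, so what follows is only a sketch of how the stage markers left in its comments (browser_open, pageload, screenshot_time, total_time) might be recorded. The `createStageTimer` helper is hypothetical and is not the code that produced the results below.

```javascript
// Sketch: record how long each stage of a run takes.
// `now` is injectable so the timer can be exercised with a fake clock.
function createStageTimer(now = Date.now) {
  const start = now();
  let last = start;
  const stages = {};
  return {
    // Store the time elapsed since the previous mark under `name`.
    mark(name) {
      const current = now();
      stages[name] = current - last;
      last = current;
    },
    // Return all recorded stage durations plus the total since creation.
    report() {
      return { ...stages, total_time: now() - start };
    },
  };
}
```

In a script like the one above, a timer would be created at the top of `main()`, `mark('browser_open')` called after the launch, `mark('pageload')` after the navigation, and `mark('screenshot_time')` after the browser closes, with `report()` collected once per iteration.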

Results

Test 1: https://techcrunch.com/2016/09/12/slacks-director-of-engineering-leslie-miley-doesnt-believe-in-diversity-quotas/
(No ublock)

Headful:
no_everything_headful
Headless:
clean_headless

There is no significant difference between headful and headless mode.

Test 2: http://www.pcworld.com/article/3117032/software-productivity/microsoft-may-finally-have-its-slack-killer.html
(with ublock)

Headful:
full_headful
Headless:
ublock_headless

About a quarter of the screengrabs exceeded the 60s timeout.

Conclusion
Other websites I tested had headless execution on par with headful, even when including ublock, though those other websites were not quite as ad-heavy.

I'm not sure why this one site in particular behaves how it does. It looks like the headful browser helps the ublock extension close off ads faster or something like that.

So yeah, not sure what to do with this information myself, but... here you go.

@hexydec

hexydec commented Sep 25, 2018

Hey guys, so I've found a workaround for this. I found the answer in another thread unrelated to puppeteer but also has the problem with chromium being extremely slow in headless mode. I noticed that when I tried to debug chromium that network requests were the slow bit in terms of headless mode so I searched for that.

const browser = await puppeteer.launch({args: ["--proxy-server='direct://'", '--proxy-bypass-list=*']})

When launching the browser add these 2 arguments to the list and it seems (for me anyway) to resolve all issues with speed.

Link to thread if you're interested: Codeception/CodeceptJS#561 and props to oligee80 (who found the answer) for saving me about a week of work rolling back a bunch of product releases! 🎉

We have a proxy, this fixed the issue for me.

@MaDetho

MaDetho commented Sep 29, 2018

Try setting a valid UserAgent. This worked for me.

await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36')
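
A minimal sketch of an alternative to hard-coding a version string: deriving the user agent from the browser's own default. It relies on the headless default user agent containing the token "HeadlessChrome", as it did at the time of this thread; that this token is what slows some sites down is an assumption, not something the thread confirms. The helper names are hypothetical.

```javascript
// Sketch: derive a regular-looking user agent from the browser's default one.
// Headless Chrome reports "HeadlessChrome/<version>" in its default user agent.
function stripHeadlessToken(userAgent) {
  return userAgent.replace('HeadlessChrome/', 'Chrome/');
}

// Not called automatically: needs a browser and page from puppeteer.launch().
async function useRegularUserAgent(browser, page) {
  const defaultUserAgent = await browser.userAgent();
  await page.setUserAgent(stripHeadlessToken(defaultUserAgent));
}
```

This keeps the reported Chrome version in step with the bundled Chromium, whereas a pasted string such as the one above stays at 69.0.3497.100 after upgrades.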

@daniel-mf

Thanks @MaDetho , that worked for me too.

@int2018

int2018 commented Oct 23, 2018

Hi all,
I found this thread but I come from a different corner.
I set up a Dell Optiplex 9020 with Linux Mint 19, and it runs very nicely with two displays attached, one on the VGA port and the other on the DisplayPort (no graphics card, just Intel HD 6400). So everything is fine and I can access the system via VNC.

But as soon as I boot the machine without displays connected, the performance is really poor. Nothing else is changed; there is just no display connected. When I plug the displays in and boot again, the performance is very good again. I don't run any special programs; I just start the Linux Mint file manager, start a terminal and so on. No page-load testing or anything like that. Everything is much slower in headless mode.
OK, this thread is about a Chrome issue and a workaround, but the common problem is that performance is low in headless mode. Maybe there is some similarity here?

My idea was that with the display(s) connected, "the graphics memory" is somehow activated in a different way (or at all?), since I have only Intel on-board graphics. Maybe without displays connected the Intel HD graphics is sleeping, and the graphics are rendered via the regular CPU? OK, just wild guessing...
Or is there something "integrated" into Linux Mint which has the same effect as in your Chrome problem? And maybe there is an OS switch for this (in Linux)?

Many questions, and I hope there is an explanation for this ...
Thanks in advance :-)

@basmith

basmith commented Sep 17, 2019

@ChiggerChug 's change worked for me. In my case, I'm on a VPN that restricts normal internet traffic, using Puppeteer to work with an intranet application. Headful was fine, but headless took around four minutes with timeout disabled. Perhaps headless is trying to open socket connections (for telemetry or otherwise)?

@fgroupindonesia

Very strange... I ran into a similar case, and it was resolved after setting a valid user agent.

@dajiangqingzhou

Try setting a valid UserAgent. This worked for me.

await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36')

It helped me a lot. Thank you!!!!!!!!!

@uniquejava

Try setting a valid UserAgent. This worked for me.

await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36')

Thank you. I'm on macOS; in headless mode, setting a userAgent like this is 5x faster! 👍

@parthasarathydNU

parthasarathydNU commented Aug 4, 2023

I observed the following:

Upgrading Node to 20.2.0 made browser creation very fast.

Running Docker command - Attempt 1
Starting
CreatingBrowserInstance: 886.537ms
Browser created
Page created
Done
Running Docker command - Attempt 2
Starting
CreatingBrowserInstance: 466.263ms
Browser created
Page created
Done
Running Docker command - Attempt 3 ...

Here is the GitHub repo with the code base - Docker File, Simple Node app that takes a screenshot of google.com

https://github.com/parthasarathydNU/docker-m1-puppeteer

@RopoMen

RopoMen commented Oct 9, 2023

@parthasarathydNU it's because you use Puppeteer 21.0.1 and you use 'new' headless. Read more https://developer.chrome.com/articles/new-headless/

Edit: I was just curious whether these flags #1718 (comment) could be dropped in 'new' headless. Your results are promising, ty!
