Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Target creation event listeners are sometimes not executed early enough #2669

Closed
berstend opened this issue Jun 3, 2018 · 4 comments
Closed

Comments

@berstend
Copy link

berstend commented Jun 3, 2018

Note: Please see my updated comment as well.

Steps to reproduce

Tell us about your environment:

  • Puppeteer version: v1.3.0, v1.4.0, next
  • Node.js version: v8, v9, v10

What steps will reproduce the problem?

Please include code that reproduces the issue.

'use strict'

const puppeteer = require('puppeteer')

const delay = ms => new Promise(resolve => setTimeout(resolve, ms))
const PUPPETEER_ARGS = ['--no-sandbox', '--disable-setuid-sandbox']

puppeteer.launch({ args: PUPPETEER_ARGS }).then(async browser => {
  // Modify user agent when page is created
  browser.on('targetcreated', async (target) => {
    if (target.type() !== 'page') { return }
    const page = await target.page()
    let ua = await page.browser().userAgent()
    ua = ua.replace('HeadlessChrome/', 'Chrome/')
    await page.setUserAgent(ua)
  })

  // Define test case
  const testCase = async (index) => {
    const page = await browser.newPage()
    // await delay(50) // would fix this issue 99% (but not 100%) of the time
    const ua1 = await page.evaluate(() => window.navigator.userAgent)
    console.log(index, 'ua1', !ua1.includes('HeadlessChrome') ? 'PASS' : 'FAIL')

    await page.goto('about:blank')
    const ua2 = await page.evaluate(() => window.navigator.userAgent)
    console.log(index, 'ua2', !ua2.includes('HeadlessChrome') ? 'PASS' : 'FAIL')
  }
  // Run many test cases at once
  await Promise.all(
    [...Array(5)].map((slot, index) => testCase(index))
  )

  await browser.close()
})

What is the expected result?

  • All headers are being modified and PASS is returned for each test

What happens instead?

  • Sometimes the user-agent is not being modified for the first request (ua1), resulting in FAIL

Context

Apologies if this has been reported before, I did a thorough search and could only find similar issues: #1378 #386

I'm developing a plugin framework for puppeteer (not a fork but augmenting your guys amazing work) and ran into a couple of edge-case issues with target creation event listeners.

Sometimes the listeners are not executing fast enough when a new target is created, resulting in a page not being modified quick enough for e.g. header modifications to take effect.

Overloading browser.newPage()/browserContext.newPage() to a return with a small delay (e.g. 50ms) fixes this behaviour 99% of the time, but unfortunately not always and also feels like a very brittle solution in general.

I assume this is an issue by design (event listeners not being blocking)?

I wonder how to best mitigate this behaviour, other then to add a random delay to methods that return a new target. Would be great to better understand where this problem originates from so I can dig deeper into the puppeteer code to find a reliable workaround. :)

Thanks!

@berstend
Copy link
Author

berstend commented Jun 3, 2018

The issue rarely comes up when adding a resolve delay of 50ms to methods returning a new target, but my CI is testing so often that it can still happen. 😄

@berstend
Copy link
Author

berstend commented Jun 3, 2018

I noticed this only affects newly created pages. When adding await page.goto("about:blank") right after .newPage(), before the business logic, the problem is gone. This adds no measurable delay.

Looking into side-effects of that mitigation now and how to mitigate implicitly created targets like e.g. `window.open).

@berstend
Copy link
Author

berstend commented Jun 3, 2018

I dug deep into the lifecycle of targets to find a fix (especially for implicitly created ones) but so far didn't find a solution.

I get the impression that when a target is created through e.g. window.open Chrome will navigate and load the page so fast that pptr (or targetcreated consumers) have no time to modify the request.

I therefore think this issue can be split in two:

  • Targets created implicitly: Bug/Feature: Intercept Immediately #1378
    • My guess is that the solution is to find a way to pause navigation of newly created targets somehow for event listeners to be able to modify the request
    • I've added a window.open test case over there
    • edit, confirmed (popups are not attached early enough)
  • Targets created through Puppeteer (this ticket)
    • Given that pptr knows what's about to happen there should be a better way than my aforementioned hackfix, to add page.goto('about:blank') after page creation, to make event consumers happy :)
    • edit, this fix works reliably and is now shipped in my code.

If anyone with more knowledge of pptr internals can confirm my suspicions that'd be great 😄

@aslushnikov
Copy link
Contributor

@berstend first of all, thanks for a good repro I could play with! 🎉

I wonder how to best mitigate this behaviour, other then to add a random delay to methods that return a new target. Would be great to better understand where this problem originates from so I can dig deeper into the puppeteer code to find a reliable workaround. :)

In your example, there's a race between your test function and initialization process you run in the targetcreated event listener: these two run in parallel for every newly created page. It's mitigated by setting up page explicitly before running a test for it.

The following worked just fine for me:

const puppeteer = require('puppeteer');

const delay = ms => new Promise(resolve => setTimeout(resolve, ms))
const PUPPETEER_ARGS = ['--no-sandbox', '--disable-setuid-sandbox']

puppeteer.launch({ args: PUPPETEER_ARGS }).then(async browser => {
  // Modify user agent when page is created
  async function setupPage(page) {
    let ua = await page.browser().userAgent()
    ua = ua.replace('HeadlessChrome/', 'Chrome/')
    await page.setUserAgent(ua)
  }

  // Define test case
  const testCase = async (index) => {
    const page = await browser.newPage()
    await setupPage(page);
    // await delay(50) // would fix this issue 99% (but not 100%) of the time
    const ua1 = await page.evaluate(() => window.navigator.userAgent)
    console.log(index, 'ua1', !ua1.includes('HeadlessChrome') ? 'PASS' : 'FAIL')

    await page.goto('about:blank')
    const ua2 = await page.evaluate(() => window.navigator.userAgent)
    console.log(index, 'ua2', !ua2.includes('HeadlessChrome') ? 'PASS' : 'FAIL')
  }
  // Run many test cases at once
  await Promise.all(
    [...Array(5)].map((slot, index) => testCase(index))
  )

  await browser.close()
})

However, for the implicitly-created pages (or the ones that appeared due to window.open), we should provide a missing capability to "pause" page before it continues its business. This will be implemented as a part of #1378.

Hope this helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants