Emit new Page objects when new tabs created #386

Closed
notgne2 opened this Issue Aug 19, 2017 · 25 comments

notgne2 commented Aug 19, 2017

Not sure how this would work, or whether it is possible at all given how Puppeteer works. A nice feature would be for the browser to emit an event with a new Page when a new tab is opened, whether by clicking a target="_blank" link or through some other means.

michaelfward commented Aug 21, 2017

I think I'd like to take a whack at this one. Does anyone have any objections / suggestions? @aslushnikov

Contributor

aslushnikov commented Aug 21, 2017

@michaelfward take a look at Target domain, e.g. Target.targetCreated event https://chromedevtools.github.io/devtools-protocol/tot/Target/#event-targetCreated

michaelfward commented Aug 21, 2017

@aslushnikov thanks! I should have a PR out for this in the next 24 hours

nwhitmont commented Aug 21, 2017

Yes! This would be super useful!

I have a similar question open on Stack Overflow: Puppeteer: How to handle multiple tabs?

nwhitmont commented Aug 28, 2017

@michaelfward Any update?

Member

ebidel commented Aug 28, 2017

@nwhitmont did you see #554?

michaelfward commented Aug 28, 2017

@nwhitmont apologies, I saw @JoelEinbinder 's PR and figured I'd turn to other issues! Does his implementation take care of your use case?

nwhitmont commented Aug 30, 2017

@michaelfward Ah ha! That explains it ;-) Thanks for the update.

JasonBoy commented Sep 28, 2017

One workaround: inside an evaluate call, get the anchor and call ele.removeAttribute('target'), then click it.
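
The workaround described above could look roughly like the following. This is a minimal sketch, not a tested recipe: `clickInSameTab` is a hypothetical helper name, and it assumes the element matched by `selector` is an anchor whose click navigates the current tab once its target attribute is gone.

```javascript
// Hypothetical helper: make a target="_blank" link open in the current tab.
async function clickInSameTab(page, selector) {
  // Runs inside the page: drop the target attribute so the link navigates in place.
  await page.$eval(selector, el => el.removeAttribute('target'));
  // Start waiting for the navigation before clicking so it is not missed.
  await Promise.all([page.waitForNavigation(), page.click(selector)]);
}
```

After this resolves, the same `page` object points at the linked document, so no second Page has to be tracked.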

pirate commented Oct 13, 2017

Not sure how you guys were planning on implementing this, but an API like this would be awesome!

event:newpage Gets triggered by things like window.open() or links clicked with target="_blank".

page.on('newpage', async (new_page) => {
    const url = new_page.url;
    console.log('Browser opened new tab', url);
    const page = await new_page.page();
});

If anyone is looking for a workaround, for now I'm using:

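// Make window.open() navigate the current tab instead of opening a new one.
// Note: this override only lasts until the page navigates.
await page.evaluate(() => {
    window.open = (new_url) => { window.location.href = new_url; };
});
page.on('load', () => {
    // This handler runs in Node, not in the page, so read the URL from the page object.
    console.log('Opened new url', page.url());
});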

ithinkihaveacat added a commit to ithinkihaveacat/puppeteer that referenced this issue Oct 31, 2017

feat(Browser): introduce Browser.pages() (GoogleChrome#554)
This patch:
- introduces Target class that represents any inspectable target, such as service worker or page
- emits events when targets come and go
- introduces target.page() to instantiate a page from a target

Fixes GoogleChrome#386, fixes GoogleChrome#443.

Contributor

avimar commented Nov 9, 2017

A few questions.

Goal: I'm filling in a form and getting back a list of links. Clicking a link on the dynamic page and then navigating back is slow and fragile, so I'm opening each link in a new window instead.

So now, when I click the links, I can access the new window.
But I'm running into a few issues.

  1. I have a request handler to block CSS and images (a TON of requests), but I can't seem to get it attached in time, even with this code (mostly from https://stackoverflow.com/questions/45806684/puppeteer-how-to-handle-multiple-tabs):
browser.on('targetcreated', async function () {
    console.log('attaching pre-filter');
    const pageList = await browser.pages();
    const page = pageList[pageList.length - 1];
    await page.setRequestInterception(true); // make sure it won't load extra stuff
    page.on('request', blockLoading);
});

I see it's returning a target object, but I actually can't figure out from the API or logging it what that actually is.

Where can I get an explanation of targets?

Why isn't there an event for/with the actual new tab that opens, that should be triggered right away?

Also, is there a simpler way to just attach the handler to the entire browser once, rather than the entire time? Or some way to ensure the listener is idempotently only attached once?

  2. When clicking a link that opens in a new tab, I go from awaiting a click() to handling an event. It seems the event isn't fired as soon as the tab opens (as I can see with headless: false) but rather later, seemingly tied to some sort of network response.

What's a good way to wait for and grab this event in an await/promise code flow?

Right now, if I wait long enough to grab the handle for the second page, make sure everything opens into the same tab, and ignore the AssertionError [ERR_ASSERTION]: Request is already handled! then it does seem to work... but pretty flaky ATM.
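
On the question of attaching the listener only once per page, one possible approach is to remember which Page objects have already been set up. The sketch below is an illustration under assumptions, not a Puppeteer feature: `attachRequestFilterOnce` and `instrumentedPages` are hypothetical names, and it assumes the same Page object is returned for a given tab each time. It does not make the handler attach any earlier, so early requests can still be missed.

```javascript
// Pages that already have the request filter installed.
// A WeakSet lets closed pages be garbage-collected.
const instrumentedPages = new WeakSet();

// Hypothetical helper: install `onRequest` on `page` at most once.
async function attachRequestFilterOnce(page, onRequest) {
  if (instrumentedPages.has(page)) return false; // already set up, do nothing
  instrumentedPages.add(page); // mark first so overlapping calls don't double-attach
  await page.setRequestInterception(true);
  page.on('request', onRequest);
  return true;
}

// Possible use with the handler from the comment above (not run here):
//   browser.on('targetcreated', async target => {
//     const page = await target.page(); // null for non-page targets
//     if (page) await attachRequestFilterOnce(page, blockLoading);
//   });
```

Registering the same request handler twice on one page means each request is answered twice, which may be one source of the "Request is already handled!" assertion mentioned above.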

Collaborator

JoelEinbinder commented Nov 9, 2017

The target object is defined at https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#class-target
It is something devtools can inspect, usually a page. Calling target.page() connects Puppeteer to the tab and generates a Page object.

New tabs aren't opened immediately on click. A way to await events is to create a new promise.

const newPagePromise = new Promise(x => browser.once('targetcreated', target => x(target.page()));
await page.click('my-link');
const newPage = await newPagePromise;

We currently don't have a way to attach to targets as soon as they are created, so there is no way to set up request interception for new tabs that won't miss some early requests. This is a known issue.
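
The "create a promise, then click, then await" pattern above can be wrapped in a small helper. This is a sketch only: `triggerAndWaitFor` is a hypothetical name, and it assumes nothing about the emitter beyond a Node-style `once(eventName, listener)` method.

```javascript
// Hypothetical helper: subscribe first, then run the action that fires the event.
async function triggerAndWaitFor(emitter, eventName, trigger) {
  // The listener is registered synchronously, before trigger() runs,
  // so an event fired while the trigger is running cannot be missed.
  const pending = new Promise(resolve => emitter.once(eventName, resolve));
  await trigger();
  return pending; // resolves with the first argument passed to the event
}

// With Puppeteer this might be used as (not run here):
//   const target = await triggerAndWaitFor(browser, 'targetcreated', () => page.click('my-link'));
//   const newPage = await target.page();
```

The ordering is the important part: if the click were awaited before the listener existed, a fast 'targetcreated' event could fire unobserved and the promise would never resolve.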

Contributor

avimar commented Nov 9, 2017

Collaborator

JoelEinbinder commented Nov 9, 2017

The target might be a service worker if you are working with service workers, or something else like a chrome extension. In that case you should use .on instead of .once and check target.type() (and probably target.url()) to make sure it is the page you want.

Contributor

avimar commented Nov 10, 2017

> A way to await events is to create a new promise.

That worked great, my code is now running pretty smoothly -- thanks! (It needed an extra ) at the end.)

I'm a newb with await, so can you explain why a small change to not use a new variable didn't work?

const newPagePromise = new Promise(x => browser.once('targetcreated', target => x(target.page())));
await page.click('my-link');
await newPagePromise;
newPagePromise.waitForSelector(...)

It looks like I'm getting a promise object, but with the page data inside a rejection handler...? There's no waitForSelector function.

> We currently don't have a way to attach to targets as soon as they are created, so there is no way to set up request interception for new tabs that won't miss some early requests. This is a known issue.

Is there an issue number for this? (I couldn't find one.) It would be great to add an intercept handler to the entire browser object so that I can be sure to intercept everything.

Collaborator

JoelEinbinder commented Nov 10, 2017

await doesn't overwrite the promise with the value, it just returns the value.

const newPagePromise = ...
const newPage = await newPagePromise;
newPage.waitForSelector(...)

bitliner commented Dec 30, 2017

Sometimes I avoid the need to switch pages altogether by overriding the target="_blank" attribute:

const element = await page.$(selector);

// Runs in the page: make the link open in the current tab
await page.evaluate((el) => {
    el.target = '_self';
}, element);

await element.click();

franklsm1 commented Jun 5, 2018

I tried using @JoelEinbinder's solution, but sometimes when the new page took a while to load it was returning null or throwing errors. I created a little function based off his solution to validate the page has loaded before returning it.

 function getNewPageWhenLoaded() {
    return new Promise((x) => browser.once('targetcreated', async (target) => {
        const newPage = await target.page();
        const newPagePromise = new Promise(() => newPage.once('domcontentloaded', () => x(newPage)));
        const isPageLoaded = await newPage.evaluate(() => document.readyState);
        return isPageLoaded.match('complete|interactive') ? x(newPage) : newPagePromise;
    }));
}

This function can be used just like the previous solution:

const newPagePromise = getNewPageWhenLoaded();
await page.click('my-link');
const newPage = await newPagePromise;

yeyu456 commented Jul 1, 2018

@franklsm1 thanks for your function. But if you launch Puppeteer with devtools open, the first target will be the devtools window, with target type 'other', and target.page() will return null. So make sure to turn off devtools, or wait for the second fired event.

function getNewPageWhenLoaded() {
    return new Promise((x) => browser.once('targetcreated', async (target) => {
        const newPage = await target.page(); //return null when turn on devtools
        ......
    }));
}

hidenny commented Aug 30, 2018

Is it possible to add a timeout for it? It keeps waiting if the click doesn't actually open a new tab.

xprudhomme commented Sep 13, 2018

@JoelEinbinder : Thanks for your solution, improved with @franklsm1 's code it works like a charm !

However, I had a situation where the code got stuck at the following line (the Promise never resolved):
const newPage = await newPagePromise; // just after await page.click('my-link');

This might happen once every 10 000 clicks on a link. I am not interested in fixing why it happened but rather would love to be able to handle the never resolved Promise.

Is there any way to implement a timeout/watch dog, something similar to the pollInterval function used in the FrameManager.js's WaitTask class code?
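
One generic way to guard against a promise that never resolves is to race it against a timer. The sketch below is plain JavaScript rather than a Puppeteer API; `withTimeout` is a hypothetical helper name and the 5000 ms in the usage comment is an arbitrary assumption.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms, message = `Timed out after ${ms} ms`) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  // Whichever settles first wins; always clear the timer so it cannot keep Node alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Possible use with the snippets above (not run here):
//   const newPagePromise = getNewPageWhenLoaded();
//   await page.click('my-link');
//   const newPage = await withTimeout(newPagePromise, 5000);
```

A timeout only stops the caller from waiting. Any 'targetcreated' listener registered by the wrapped promise stays attached until it fires, so a caller that retries may want to remove it explicitly.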

stephen-james commented Sep 27, 2018

@yeyu456 :

> @franklsm1 thanks for your function. But if you launch Puppeteer with devtools open, the first target will be the devtools window, with target type 'other', and target.page() will return null. So make sure to turn off devtools, or wait for the second fired event.
>
> function getNewPageWhenLoaded() {
>     return new Promise((x) => browser.once('targetcreated', async (target) => {
>         const newPage = await target.page(); //return null when turn on devtools
>         ......
>     }));
> }

For this you can follow @JoelEinbinder's advice: use on instead of once and check the target type.

I use something like this which seems to do the trick

    const getNewPageWhenLoaded =  async () => {
        return new Promise(x =>
            global.browser.on('targetcreated', async target => {
                if (target.type() === 'page') {
                    const newPage = await target.page();
                    const newPagePromise = new Promise(y =>
                        newPage.once('domcontentloaded', () => y(newPage))
                    );
                    const isPageLoaded = await newPage.evaluate(
                        () => document.readyState
                    );
                    return isPageLoaded.match('complete|interactive')
                        ? x(newPage)
                        : x(newPagePromise);
                }
            })
        );
    };
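
A listener added with on stays registered after the promise resolves, so calling a helper like the one above repeatedly accumulates listeners. A variant that removes its own listener once a matching target arrives might look like this. It is a sketch under assumptions: `waitForNewPage` and `isWanted` are hypothetical names, the browser object is assumed to expose Node-style `on` and `off`, and it does not wait for the page to finish loading.

```javascript
// Hypothetical helper: resolve with the Page of the next matching 'page' target.
function waitForNewPage(browser, isWanted = () => true) {
  return new Promise((resolve, reject) => {
    const onTarget = async target => {
      // Skip service workers, devtools windows and other non-page targets.
      if (target.type() !== 'page' || !isWanted(target)) return;
      browser.off('targetcreated', onTarget); // stop listening once matched
      try {
        resolve(await target.page());
      } catch (err) {
        reject(err);
      }
    };
    browser.on('targetcreated', onTarget);
  });
}
```

The optional `isWanted` predicate is a place to check target.url(), as suggested earlier in the thread.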

lg commented Nov 19, 2018

for people in the future, make sure to check out the new browser.waitForTarget: https://github.com/GoogleChrome/puppeteer/blob/v1.10.0/docs/api.md#browserwaitfortargetpredicate-options
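
As a rough illustration of how browser.waitForTarget can replace the hand-written promises above, including the timeout asked about earlier, consider the sketch below. `openInNewTab` and `clickLink` are hypothetical names, and the 5000 ms default is an arbitrary assumption.

```javascript
// Hypothetical helper: run an action that opens a tab, then return the new Page.
async function openInNewTab(browser, page, clickLink, timeout = 5000) {
  const opener = page.target(); // remember which tab does the opening
  await clickLink();
  // Resolves with the first page target opened by `opener`;
  // rejects if none shows up within `timeout` milliseconds.
  const target = await browser.waitForTarget(
    t => t.type() === 'page' && t.opener() === opener,
    { timeout }
  );
  return target.page();
}

// Possible use (not run here):
//   const newPage = await openInNewTab(browser, page, () => page.click('my-link'));
```

Matching on the opener rather than on the URL avoids having to know in advance where the link leads.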

dheerajbhaskar commented Jan 2, 2019

Using lg's suggestion and code from another user, shankarregmi (#2912 (comment)), here's an alternative:

const link = await page.$('selector');
// Read the link text up front; this assumes the text is the destination URL
const linkText = await page.evaluate(el => el.textContent, link);
await link.click();    // opens a new page in a new tab in Chromium
// wait until the target is available [see browser.targets()]
const newTarget = await browser.waitForTarget(target => target.url() === linkText);

// get all pages
const pages = await browser.pages();
for (const page of pages) {
    console.log(page.url());   // the new page now appears
}

Edit1: the above code doesn't wait until the page is loaded. With stephen-james' and franklsm1's code, the issue for me was that "domcontentloaded" was not fired (perhaps it was in earlier versions).
Edit2: Below is my working code, which attempts the same thing. I've also created an issue to help make this process simpler: #3718

const pageTarget = this._page.target(); //save this to know that this was the opener
await resultItem.element.click(); //click on a link
const newTarget = await this._browser.waitForTarget(target => target.opener() === pageTarget); //check that you opened this page, rather than just checking the url
const newPage = await newTarget.page(); //get the page object
// await newPage.once("load",()=>{}); //this doesn't work; wait till page is loaded
await newPage.waitForSelector("body"); //wait for page to be loaded