Hash-only navigation doesn't work #257

Closed
aslushnikov opened this issue Aug 15, 2017 · 27 comments · Fixed by #2338

@aslushnikov
Contributor

Since hash-only navigation causes no network requests and fires no load event,
the following gets stuck:

const puppeteer = require('puppeteer');
(async() => {
  let browser = await puppeteer.launch();
  let page = await browser.newPage();
  await page.goto('https://example.com');
  await page.goto('https://example.com#ohh'); // <== stuck here
  await browser.close();
})();
@aslushnikov
Contributor Author

Hash navigation breaks all our navigation primitives:

page.waitForNavigation();
page.goto();
page.goForward();
page.goBack();

@joelgriffith
Contributor

I've seen this in just about every headless driver out there, and am suspecting it's an issue in the browser itself...

aslushnikov added a commit to aslushnikov/puppeteer that referenced this issue Aug 18, 2017
This patch teaches the following methods to support anchor
navigation:

- `page.goto`
- `page.waitForNavigation`
- `page.goBack`
- `page.goForward`

Fixes puppeteer#257.
@aslushnikov aslushnikov self-assigned this Aug 19, 2017
@DMQ

DMQ commented Sep 6, 2017

@aslushnikov I got the same issue, has this issue been solved?

@Garbee
Contributor

Garbee commented Sep 6, 2017

When it is solved the issue will be updated with further information in regards to it.

@onamission

onamission commented Sep 6, 2017

I am not sure if this is completely relevant to this thread, but I just created a script that allows us to navigate to hashes on a page to take various screenshots. The reason I question the relevance is because we have some JS working in the background that assists our navigation, so I don't know if this script would work without the page JS or not. Anyway, here is what works for me (sorry, it is node6):

getScreenshotOfSlides(page, url, slides, buffer, options) {
    var self = this;
    // Options passed to page.screenshot(). Assumed to be defined elsewhere in
    // the original script; declared here so the snippet stands alone.
    var screenshotOptions = {};
    if (options && options.path) {
        screenshotOptions.path = options.path;
    }
    // No slides left: hand back the collected screenshots.
    if (!slides || !slides.length) {
        return Promise.resolve(buffer);
    }
    var slide = slides.shift();
    var pageUrl = url + "/#/" + slide;   // our URLs use slashes around the hash
    return page.goto(pageUrl, { waitUntil: 'networkidle' })
        .then(() => page.screenshot(screenshotOptions))
        .then(res => {
            buffer.push(res);
            // Recurse for the remaining slides; resolves with the full buffer.
            return self.getScreenshotOfSlides(page, url, slides, buffer, options);
        });
}

Using the browser inspector and Charles, it appears that the only network traffic this creates is the initial call to the server. After that has rendered, the network goes quiet.

This is a solution for the issue I was trying to solve that led me to #491, which brought me here.

I hope this helps someone.

@DMQ

DMQ commented Sep 7, 2017

@Garbee thanks!

@DMQ

DMQ commented Sep 7, 2017

@onamission thanks! It works for me.

await page.goto(tag.url, {waitUntil: 'networkidle'})

@Means88

Means88 commented Sep 20, 2017

And with the history API

// page
history.pushState(null, null, url);

// puppeteer
await page.goBack(); // <=

BTW, page.url() returns the original url, is it a feature or a bug?

@doctyper

doctyper commented Oct 14, 2017

@Means88 This works as a workaround:

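await page.goto(url, {waitUntil: 'networkidle'});
// Renamed from `url`: redeclaring `url` with const after using it above throws a ReferenceError.
const currentUrl = await page.evaluate('location.href');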

@CommanderXL

Will the problem of anchor navigations be solved in 0.13?

@jfsiii

jfsiii commented Nov 9, 2017

The incompatibility with the History API means that sites using react-router will have issues with page.url(), page.waitForNavigation(), etc.

Here are some of my workarounds:

I use page.waitForSelector() instead of page.waitForNavigation() if possible.

I use these two functions for dealing with the URL

const getLocation = async (page) => page.evaluate(() => location)
const getLocationProp = async (page, prop) => (await getLocation(page))[prop]

and these for history:

const getHistory = async (page) => page._client.send('Page.getNavigationHistory')
const getHistoryEntry = async (page, index) => (await getHistory(page)).entries[index]
const getCurrentHistoryEntry = async (page) => {
  const { entries, currentIndex } = await getHistory(page)
  return entries[currentIndex]
}

based on https://github.com/GoogleChrome/puppeteer/blob/7d18275fb981e01cec4a4fbac61a9c66e46947bc/lib/Page.js#L532-L533

@MrNice

MrNice commented Nov 16, 2017

I'm running into this issue trying to take screenshots of our production SPA for a perceptual diff test. If there's any way to help track down the root source of this weird behavior (goto not working within an app is bothersome), please point me in the right direction and I'll try to fix it.

@joelgriffith
Contributor

It’s an issue in Chrome's remote protocol. I think/hope they’re looking into it. As an alternative, for the time being, I think you can monitor network activity and take your screenshot once the network has gone silent.

By no means a permanent solution, but it should hopefully get you by in the meantime.
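
The network-monitoring idea above could be sketched roughly as follows. This is only an illustration, not Puppeteer's own idle detection: `waitForNetworkQuiet`, `idleMs`, and `maxInflight` are hypothetical names, and the sketch assumes `page` emits Puppeteer's `request`, `requestfinished`, and `requestfailed` events and supports `on`/`off`.

```javascript
// Resolves once no more than `maxInflight` requests have been in flight for
// `idleMs` consecutive milliseconds. Hypothetical helper, not a Puppeteer API.
function waitForNetworkQuiet(page, { idleMs = 500, maxInflight = 0 } = {}) {
  return new Promise((resolve) => {
    let inflight = 0;
    let timer = null;

    const finish = () => {
      // Detach listeners so the helper leaves no trace on the page object.
      page.off('request', onStart);
      page.off('requestfinished', onEnd);
      page.off('requestfailed', onEnd);
      resolve();
    };
    const update = () => {
      if (inflight > maxInflight) {
        // Too busy: cancel any pending "quiet" timer.
        clearTimeout(timer);
        timer = null;
      } else if (timer === null) {
        // Quiet enough: start counting the idle window.
        timer = setTimeout(finish, idleMs);
      }
    };
    const onStart = () => { inflight += 1; update(); };
    const onEnd = () => { inflight = Math.max(0, inflight - 1); update(); };

    page.on('request', onStart);
    page.on('requestfinished', onEnd);
    page.on('requestfailed', onEnd);
    update(); // the network may already be quiet when we start
  });
}
```

A caller would typically trigger the hash change first and then `await waitForNetworkQuiet(page)` before calling `page.screenshot()`. Requests that started before the helper was attached are not counted, so this is best-effort.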

@Aymkdn

Aymkdn commented Dec 19, 2017

Using await page.goto(url, {waitUntil: 'networkidle'}) works for me.

Note that the documentation says:

networkidle0  - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms.
networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.

But these parameters don't seem to work...

@Aymkdn

Aymkdn commented Dec 21, 2017

Switching between different versions of Puppeteer...

Result:

  • v0.12.0: page.goto('same.url#different_hash', {waitUntil: 'networkidle'}): PASSED
  • v0.13.0: page.goto('same.url#different_hash', {waitUntil: 'networkidle0'}): FAILED
  • 1.0.0rc: page.goto('same.url#different_hash', {waitUntil: 'networkidle0'}): FAILED

So it looks like there is a regression in the latest versions, or I don't understand the new options...

@stweedie

stweedie commented Jan 5, 2018

This is related to a problem in Chrome's protocol. In short, hash navigation won't trigger any of the page initialization events.

A puppeteer problem is that page.goto FORCES you to attempt to listen to one of those events (unless there's some undocumented configuration option), timing out with an error if (when) they never come. The only reliable way around this is to set a low timeout, catch (and disregard) the error that will come, and then manually check if the page has loaded with your own logic.

Is there some way around this?
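
The low-timeout approach described above might look something like the sketch below. `gotoHash` and `gotoTimeout` are hypothetical names; the sketch assumes the URL differs from the current one only by its hash and that `page.goto` accepts a `timeout` option.

```javascript
// Hypothetical helper: navigate to a hash-only URL without relying on
// page.goto() ever settling, then verify the result manually.
async function gotoHash(page, url, { gotoTimeout = 1000 } = {}) {
  try {
    await page.goto(url, { timeout: gotoTimeout });
  } catch (err) {
    // Expected on affected versions: no load event fires for a hash change.
    // This also swallows genuine failures, hence the explicit check below.
  }
  const expectedHash = new URL(url).hash;
  const actualHash = await page.evaluate(() => location.hash);
  if (actualHash !== expectedHash) {
    throw new Error(`Expected hash ${expectedHash}, got ${actualHash}`);
  }
  return actualHash;
}
```

The check on `location.hash` stands in for "your own logic"; a real script would more likely wait for whatever content the hash change is supposed to reveal.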

@yi-ge

yi-ge commented Mar 12, 2018

How can this problem be solved in the new version (1.1.1)?

thank you.

@ebidel
Contributor

ebidel commented Mar 12, 2018

https://github.com/GoogleChromeLabs/puppeteer-examples/blob/master/hash_navigation.js shows how to listen for hashchange events and react accordingly. You might be able to extract ideas from that for a workaround.
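
For readers who cannot open the link, here is a minimal sketch of the general idea of waiting on `hashchange`. It is not the code from the linked example. `setHashAndWait` is a hypothetical name, and the sketch assumes `hash` is non-empty and includes the leading `#`.

```javascript
// Sets location.hash inside the page and resolves with the new hash once the
// browser fires `hashchange`. Relies on page.evaluate() waiting for the
// promise returned by the page function.
function setHashAndWait(page, hash) {
  return page.evaluate((newHash) => new Promise((resolve) => {
    if (location.hash === newHash) {
      resolve(location.hash); // already there, so no hashchange would fire
      return;
    }
    window.addEventListener('hashchange', () => resolve(location.hash), { once: true });
    location.hash = newHash;
  }), hash);
}
```

Because the listener is registered before the hash is assigned, the event cannot be missed. The sketch has no timeout of its own, so a caller may want to race it against one.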

@yi-ge

yi-ge commented Mar 13, 2018

@ebidel Thank you very much.

@intellix

intellix commented Mar 14, 2018

Version: 1.1.1

Just to add to this, since I don't think anyone has mentioned it yet: this also seems to cause problems with SPAs that use HTML5-style URLs in combination with waitForNavigation.

Originally we were doing this:

await page.goto(`${domain}/${path}`, { waitUntil: 'networkidle0' });

But as observed with headless mode turned off, you lose the performance benefit of an SPA because every page is loaded from scratch again. So I've added this to speed up page switching: if a matching link exists, it is clicked instead, so a full reload doesn't occur:

const link = await page.$(`a[href="${path}"]`);
if (link) {
  return await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle0' }),
    link.click(),
  ]);
}

await page.goto(`${domain}/${path}`, { waitUntil: 'networkidle0' });

The clicks work and everything is fast, but the waitForNavigation never resolves. We're using URLs like: www.mysite.com, www.mysite.com/contact, www.mysite.com/about-us so this doesn't seem to be limited to hashed URLs

Not even the following works:

await Promise.all([
  page.waitForNavigation({ waitUntil: 'networkidle0' }),
  new Promise(resolve => setTimeout(() => resolve(link.click()), 500)),
]);

Workaround is to enforce waitFor function selectors throughout to determine page readiness but was hoping to re-use the networkidle0 functionality. Perhaps that could be allowed in a standard waitFor call?

With workaround though, you have missing images in screenshots as you're unable to wait for all network requests to finish
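
One possible way to narrow the missing-images gap in the selector-based workaround is to also wait until every image in the document reports that it has finished loading. The sketch below is an assumption-laden illustration: `clickAndWaitForView` and its parameters are hypothetical names, and it relies on `page.click`, `page.waitForSelector`, and `page.waitForFunction`.

```javascript
// Hypothetical helper: click a client-side-routed link, then decide readiness
// from the DOM instead of from a navigation event.
async function clickAndWaitForView(page, linkSelector, readySelector, timeout = 10000) {
  await page.click(linkSelector);
  await page.waitForSelector(readySelector, { timeout });
  // Wait until every <img> currently in the document has finished loading.
  // `complete` is also true for images that failed to load.
  await page.waitForFunction(
    () => Array.from(document.images).every((img) => img.complete),
    { timeout }
  );
}
```

This only covers `<img>` elements present when the check runs; CSS background images and other late requests are not detected, so it is not a full substitute for networkidle0.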

Maybe related to: #1412

aslushnikov added a commit to aslushnikov/puppeteer that referenced this issue Apr 6, 2018
This roll includes:
- https://crrev.com/548598 - DevTools: implement Page.setBypassCSP method
- https://crrev.com/548690 - DevTools: introduce Page.navigatedWithinDocument event

References puppeteer#1229, puppeteer#257.
aslushnikov added a commit to aslushnikov/puppeteer that referenced this issue Apr 10, 2018
This patch fixes puppeteer navigation primitives to work with
same-document navigation.

Same-document navigation happens when document's URL is changed,
but document instance is not re-created. Some common scenarios
for same-document navigation are:
- History API
- anchor navigation

With this patch:
- pptr starts dispatching `framenavigated` event when frame's URL gets
changed due to same-document navigation
- `page.waitForNavigation` now works with same-document navigation
- `page.goBack()` and `page.goForward()` are handled correctly.

Fixes puppeteer#257.
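
With a Puppeteer version that includes this patch, the click-then-wait pattern from earlier in the thread should settle for same-document navigation too. A minimal sketch, assuming a hypothetical helper name `clickAndWaitForNavigation` and a link that changes only the hash or uses the History API:

```javascript
// Clicks `selector` and waits for the navigation it triggers. On versions
// with the fix, this is expected to settle for History API and anchor
// navigation as well as for full page loads.
async function clickAndWaitForNavigation(page, selector, timeout = 10000) {
  const [response] = await Promise.all([
    page.waitForNavigation({ timeout }), // start waiting before the click
    page.click(selector),
  ]);
  // For same-document navigation there is no network response, so
  // `response` is expected to be null; page.url() reflects the new URL.
  return { response, url: page.url() };
}
```

Starting `waitForNavigation` before the click avoids a race in which the navigation finishes before the wait is registered.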
@maximkoshelenko

Guys, can this issue affect my test? For GUID login testing I'm trying to go to the URL 'http://www.obgyn.net?25AA8E3E', but the actual navigation goes to http://www.obgyn.net/?25AA8E3E, which is breaking my test. Is there any way to get rid of the '/' before the '?'? Thanks in advance.

@x-name

x-name commented May 21, 2018

maximkoshelenko, I think a URL without a path (the slash /) is simply incorrect.

@alexey-sh

For those who don't want to fight with pptr

import type { Page } from 'puppeteer';

// Polls page.url() every 500 ms until it differs from the URL at call time.
async function waitForNavigation(page: Page, timeout = 10_000) {
    const initialUrl = page.url();
    const start = Date.now();
    return new Promise((resolve, reject) => {
        const check = () => {
            if (page.url() !== initialUrl) {
                resolve(true);
                return; // stop polling once the URL has changed
            }
            if (Date.now() - start > timeout) {
                reject(new Error('Wait for navigation timeout'));
            } else {
                setTimeout(check, 500);
            }
        };
        check();
    });
}
