-
-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wait for lazy loaded elements when using the fullPage
option
#40
Changes from 7 commits
d81b43a
d4000ad
d61dc4c
5dec125
8662efa
a940db1
c165d3d
4b80210
16eec30
990ad56
e0ebd01
5f1aa86
ebb5c1c
7ae2147
46ee934
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -128,6 +128,8 @@ const parseCookie = (url, cookie) => { | |
return ret; | ||
}; | ||
|
||
const imagesHaveLoaded = () => [...document.images].map(element => element.complete); | ||
|
||
const captureWebsite = async (input, options) => { | ||
options = { | ||
inputType: 'url', | ||
|
@@ -321,6 +323,39 @@ const captureWebsite = async (input, options) => { | |
await options.beforeScreenshot(page, browser); | ||
} | ||
|
||
if (screenshotOptions.fullPage) { | ||
// Get the height of the rendered page | ||
const bodyHandle = await page.$('body'); | ||
const bodyBoundingHeight = await bodyHandle.boundingBox(); | ||
await bodyHandle.dispose(); | ||
|
||
// Scroll one viewport at a time, pausing to let content load | ||
const viewportHeight = viewportOptions.height; | ||
let viewportIncrement = 0; | ||
while (viewportIncrement + viewportHeight < bodyBoundingHeight) { | ||
const navigationPromise = page.waitForNavigation(); | ||
/* eslint-disable no-await-in-loop */ | ||
await page.evaluate(_viewportHeight => { | ||
/* eslint-disable no-undef */ | ||
window.scrollBy(0, _viewportHeight); | ||
/* eslint-enable no-undef */ | ||
}, viewportHeight); | ||
await navigationPromise; | ||
/* eslint-enable no-await-in-loop */ | ||
viewportIncrement += viewportHeight; | ||
} | ||
|
||
// Scroll back to top | ||
await page.evaluate(_ => { | ||
/* eslint-disable no-undef */ | ||
window.scrollTo(0, 0); | ||
/* eslint-enable no-undef */ | ||
}); | ||
|
||
// Some extra delay to let images load | ||
await page.waitForFunction(imagesHaveLoaded, {timeout: 60}); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think the user would expect it to take up to one minute. If we keep this, it needs to be documented. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Might also be added as an argument with default value set to timeout 60. Might want to comment on this or implement some other better functionality? Maybe it's also worth looking into alternate functions, such as the
In case it might be useful for documentation purposes; I retrieved part of the solution from https://stackoverflow.com/questions/51651830/how-to-wait-for-all-images-to-load-from-page-evaluate-function-in-puppeteer-when There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would look into something better. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. using |
||
} | ||
|
||
const buffer = await page.screenshot(screenshotOptions); | ||
|
||
await page.close(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this line wait for
networkidle
instead of the defaultloaded
? I assume this promise will be awaited to let the images load, right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably agree with you, but care to clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Waiting for network idle allows lazy elements to load (elements loaded after the "load" event of the document). I suggest doing as follows:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also worth to keep in mind that some (old) websites can actually never fulfill the requirements for the network idle navigation event - for example frequent chat requests.
In the end, the propsed flow above should work just fine (most cases).