Merge pull request #75 from the-markup/puppeteer-page-load-option
Adding the first draft of Puppeteer Page Load Option markdown file
# **Analysis of Puppeteer's `page.goto()` `waitUntil` Options**

While investigating timeout issues in the Blacklight tool, we explored tuning the `waitUntil` option of Puppeteer's `page.goto()` method. This option determines when the navigation is considered complete.

`page.goto()` navigates the page to a URL and resolves once the condition named by `waitUntil` is met, or rejects if the navigation timeout elapses first. `waitUntil` accepts the following values:

- `'load'` - Wait for the `load` event, which fires once the page and its dependent resources (stylesheets, scripts, images, subframes) have loaded. This is the default.
- `'domcontentloaded'` - Wait for the `DOMContentLoaded` event, which fires once the HTML has been parsed, without waiting for stylesheets, images, and subframes.
- `'networkidle0'` - Wait until there have been no network connections for at least 500 ms.
- `'networkidle2'` - Wait until there have been no more than 2 network connections for at least 500 ms.

In the end, no single option solved the timeout issues completely. However, these notes capture our analysis of the tradeoffs between the different strategies.
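
To make the tradeoffs concrete, here is a minimal sketch of how the four `waitUntil` values listed above could be timed against the same URL. It is not taken from the Blacklight code. `PageLike`, `StrategyResult`, and `compareWaitUntil` are hypothetical names introduced for illustration, and `PageLike` only mirrors the `goto(url, { waitUntil, timeout })` call shape so the example stands alone without importing Puppeteer.

```typescript
// The four values discussed in these notes, as a union type.
type WaitUntil = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";

// Hypothetical stand-in for the one Puppeteer Page method used here.
// A real Puppeteer Page is expected to satisfy this shape.
interface PageLike {
  goto(
    url: string,
    options?: { waitUntil?: WaitUntil; timeout?: number },
  ): Promise<unknown>;
}

interface StrategyResult {
  waitUntil: WaitUntil;
  ok: boolean; // false if goto() rejected, for example on a timeout
  elapsedMs: number;
  error?: string;
}

// Navigate once per strategy and record how long each attempt took, or why
// it failed. The 30 s default matches Puppeteer's default navigation timeout.
// Caveat: later navigations may hit the browser cache, so the timings are
// only a rough comparison.
async function compareWaitUntil(
  page: PageLike,
  url: string,
  timeoutMs = 30_000,
): Promise<StrategyResult[]> {
  const strategies: WaitUntil[] = [
    "load",
    "domcontentloaded",
    "networkidle0",
    "networkidle2",
  ];
  const results: StrategyResult[] = [];
  for (const waitUntil of strategies) {
    const start = Date.now();
    try {
      await page.goto(url, { waitUntil, timeout: timeoutMs });
      results.push({ waitUntil, ok: true, elapsedMs: Date.now() - start });
    } catch (err) {
      results.push({
        waitUntil,
        ok: false,
        elapsedMs: Date.now() - start,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  }
  return results;
}
```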

We ultimately combined multiple approaches. The table below records part of the investigation into how Puppeteer decides that a page has loaded:

| Option | Advantages | Disadvantages |
|---|---|---|
| `load` | - Ensures all resources are fully loaded.<br> - Straightforward and easy to understand. | - Can result in longer load times.<br> - Waits unnecessarily if our script doesn't interact with or depend on all resources.<br> - If any resource stalls and never finishes loading, the `load` event does not fire, and Puppeteer will wait until the timeout. |
| `domcontentloaded` | - Faster than `load` because it doesn't wait for stylesheets, images, and subframes to finish loading.<br> - Suitable if our script only interacts with the DOM. | - If our script interacts with or depends on resources that load after the DOM, it might run before these resources are ready. |
| `networkidle0` | - Useful for pages that load additional resources after the `load` event.<br> - Waits until there have been no network connections for at least 500 ms. | - Can result in longer load times.<br> - If the page continuously makes new network requests, the `networkidle0` condition might never be met, and Puppeteer will wait until the timeout. |
| `networkidle2` | - Similar to `networkidle0`, but tolerates up to 2 open network connections.<br> - Useful for pages that keep a couple of connections open indefinitely. | - Can result in longer load times.<br> - If the page continuously keeps more than 2 network connections open, the `networkidle2` condition might never be met, and Puppeteer will wait until the timeout. |
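
These notes do not spell out how the approaches were eventually combined. Purely as an illustration, one possible shape is a fallback chain: try a stricter condition with a bounded timeout, then retry with a cheaper one if it fails. Everything named below (`PageLike`, `Attempt`, `DEFAULT_ATTEMPTS`, `gotoWithFallback`, and the 15-second timeouts) is a hypothetical choice, not something taken from Blacklight.

```typescript
type WaitUntil = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";

// Hypothetical stand-in for the one Puppeteer Page method used here.
interface PageLike {
  goto(
    url: string,
    options?: { waitUntil?: WaitUntil; timeout?: number },
  ): Promise<unknown>;
}

interface Attempt {
  waitUntil: WaitUntil;
  timeoutMs: number;
}

// Assumed ordering: the stricter condition first, the cheaper one last.
const DEFAULT_ATTEMPTS: Attempt[] = [
  { waitUntil: "networkidle2", timeoutMs: 15_000 },
  { waitUntil: "domcontentloaded", timeoutMs: 15_000 },
];

// Try each attempt in order and return the waitUntil value that succeeded.
// Note that every retry calls goto() again, which restarts the navigation.
// If all attempts fail, the last error is rethrown.
async function gotoWithFallback(
  page: PageLike,
  url: string,
  attempts: Attempt[] = DEFAULT_ATTEMPTS,
): Promise<WaitUntil> {
  let lastError: unknown;
  for (const { waitUntil, timeoutMs } of attempts) {
    try {
      await page.goto(url, { waitUntil, timeout: timeoutMs });
      return waitUntil;
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError ?? new Error("no attempts configured");
}
```

With a real Puppeteer page, the call would look like `const used = await gotoWithFallback(page, url)`, and `used` tells the caller how complete the page is likely to be. The cost of this design is that a page which fails the first attempt spends that attempt's full timeout before the fallback starts, so the per-attempt timeouts need to be budgeted against the overall limit.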