rickypc/puppeteer-page-pool

Puppeteer Page Pool

A Page resource pool for Puppeteer. Use it to reuse Puppeteer Page resources or to throttle how many are in use at once.

Installation

$ npm install --save puppeteer-page-pool

API Reference

Provides a Puppeteer Page resource pool.

Example

// use PagePool directly.
const PagePool = require('puppeteer-page-pool');

// Instantiate PagePool with default options.
const pagePool = new PagePool();
// Launch the browser and proceed with pool creation.
await pagePool.launch();
// Acquire and release the page seamlessly.
await pagePool.process(async (page) => {
  // Any page actions...
  await page.goto('https://angular.io');
});
// All done.
await pagePool.destroy();

Example

// create subclass as a child of PagePool.
class MyPagePool extends PagePool {
  constructor (options) {
    super(options);
    this.mine = true;
  }

  async takeOff () {
    // Launch the browser and proceed with pool creation.
    await this.launch();
    // Acquire and release the page seamlessly.
    await this.process(async (page) => {
      // Any page actions...
      await page.goto('https://angular.io');
    });
    // All done.
    await this.destroy();
  }
}

// Instantiate MyPagePool with default options.
const myPagePool = new MyPagePool();
// Custom action.
await myPagePool.takeOff();

Example

// use a different puppeteer library.
const puppeteer = require('puppeteer-extra');
// See https://bit.ly/32X27uf
puppeteer.use(require('puppeteer-extra-plugin-angular')());
const customPagePool = new MyPagePool({
  puppeteer,
});
// Custom action.
await customPagePool.takeOff();

Example

// instantiate with customized options.
const optionsPagePool = new MyPagePool({
  // See factory section of https://github.com/coopernurse/node-pool#createPool
  async onPageCreated (page) {
    // Bound function that will be called after page is created.
  },
  async onPageDestroy (page) {
    // Bound function that will be called right before page is destroyed.
  },
  async onValidate (page) {
    // Bound function that will be called to validate the validity of the page.
  },
  // See opts section of https://bit.ly/2GXZbUR
  poolOptions: {
    log: true,
  },
  puppeteer,
  // See https://bit.ly/2M6kVCd
  puppeteerOptions: {
    // I want to see all the actions :)
    headless: false,
  },
});
// Custom action.
await optionsPagePool.takeOff();

Example

// parallel processes.
const parallelPagePool = new PagePool({
  // See opts section of https://bit.ly/2GXZbUR
  poolOptions: {
    max: 3,
  },
  puppeteer,
  // See https://bit.ly/2M6kVCd
  puppeteerOptions: {
    headless: false,
  },
});
// Launch the browser and proceed with pool creation.
await parallelPagePool.launch();

const promises = [
  'https://angular.io',
  'https://www.chromium.org',
  'https://santatracker.google.com',
].map((url) => {
  // Acquire and release the page seamlessly.
  return parallelPagePool.process(async (page, data) => {
    // Navigate to given Url and wait until Angular is ready
    // if it's an angular page.
    await page.navigateUntilReady(data.url);
    await page.screenshot({
      fullPage: true,
      path: `${data.url.replace(/https?:|\//g, '')}-screenshot.png`,
    });
  }, { url });
});

// Wait until it's all done.
await Promise.all(promises);

// All done.
await parallelPagePool.destroy();
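
The parallel example rejects as soon as any single `process` call fails, because `Promise.all` short-circuits on the first rejection. As a hedged sketch, the hypothetical helper `processAll` below (not part of this library) uses `Promise.allSettled` to report a per-URL outcome instead. It assumes only that the pool exposes `process(handler, ...args)` and that `process` rejects when the handler throws, which the documentation here does not state explicitly.

```javascript
// Hypothetical helper, not part of puppeteer-page-pool.
// Runs `handler` once per URL through any object that exposes
// process(handler, ...args) and reports which URLs succeeded.
async function processAll(pool, urls, handler) {
  const outcomes = await Promise.allSettled(
    urls.map((url) => pool.process(handler, { url })),
  );
  // Pair each settled outcome with the URL that produced it.
  return outcomes.map((outcome, index) => ({
    url: urls[index],
    ok: outcome.status === 'fulfilled',
    error: outcome.status === 'rejected' ? outcome.reason : null,
  }));
}
```

With the pool from the example above, `await processAll(parallelPagePool, urls, handler)` would yield one `{ url, ok, error }` entry per URL, so a single failed screenshot does not hide the results of the others.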

PagePool ⏏

Kind: Exported class

new PagePool(options)

Create a PagePool instance.

| Param | Type | Description |
| --- | --- | --- |
| options | Options | PagePool options. |

Example

const PagePool = require('puppeteer-page-pool');
const pagePool = new PagePool({});

pagePool.destroy() ⇒ null

Close and release all page resources, as well as clean up after itself.

Kind: instance method of PagePool
Returns: null - Null value.
Example

let pagePool = new PagePool();
pagePool = await pagePool.destroy();
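
If an action throws between `launch()` and `destroy()`, the browser may be left running. A minimal sketch of one way to guard against that, using a hypothetical `withPool` helper (not provided by this library) that relies only on the `launch()` and `destroy()` methods documented here:

```javascript
// Hypothetical helper, not part of puppeteer-page-pool.
// Launches the pool, runs `work`, and always destroys the pool afterwards.
async function withPool(pool, work) {
  await pool.launch();
  try {
    return await work(pool);
  } finally {
    // Runs whether `work` resolved or threw.
    await pool.destroy();
  }
}

// Usage with the real library (assumes the package is installed):
// const PagePool = require('puppeteer-page-pool');
// await withPool(new PagePool(), (pool) => pool.process(async (page) => {
//   await page.goto('https://angular.io');
// }));
```

The `try`/`finally` keeps cleanup in one place, so callers do not have to remember to call `destroy()` on every error path.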

pagePool.launch()

Launch the browser and create all page resources.

Kind: instance method of PagePool
Example

const pagePool = new PagePool();
await pagePool.launch();

pagePool.process(handler, ...args)

Process the given args using the provided handler.

Kind: instance method of PagePool

| Param | Type | Description |
| --- | --- | --- |
| handler | ActionHandler | Action handler. |
| ...args | * | Action handler arguments. |

Example

const args = { key: 'value' };
const pagePool = new PagePool();
await pagePool.process((page, data) => {}, args);

PagePool~PoolEventHandler : function

Pool factory event handler.

Kind: inner typedef of PagePool

| Param | Type | Description |
| --- | --- | --- |
| page | Object | The page resource. |

Example

const poolEventHandler = (page) => {
  // Do something...
};

PagePool~Options : Object

PagePool instantiation options.

Kind: inner typedef of PagePool

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| [onPageDestroy] | PoolEventHandler | | The function called right before a page is destroyed. |
| [onPageCreated] | PoolEventHandler | | The function called after a page is created. |
| [onValidate] | PoolEventHandler | | The function called to validate a page resource. |
| [poolOptions] | Object | {} | The pool instantiation options. See https://bit.ly/2GXZbUR |
| [puppeteer] | Object | require('puppeteer') | The Puppeteer library to use. |
| [puppeteerOptions] | Object | {} | Puppeteer launch options. See https://bit.ly/2M6kVCd |

Example

const options = {
  async onPageDestroy (page) {},
  async onPageCreated(page) {},
  async onValidate(page) {},
  poolOptions: {},
  puppeteer,
  puppeteerOptions: {},
};
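
The handlers in the example above are empty, so here is a hedged sketch of what they might contain. The Puppeteer calls used (`page.setViewport`, `page.setDefaultNavigationTimeout`, `page.url`, `page.isClosed`) are real Page methods, but the specific values, the name `hookOptions`, and the idea that `onValidate` should resolve to a boolean are assumptions. The boolean contract mirrors the `validate` function of the underlying generic-pool factory; this library's documentation does not spell it out.

```javascript
// Illustrative options object; names and values are assumptions.
const hookOptions = {
  async onPageCreated(page) {
    // Give every pooled page the same viewport and navigation timeout.
    await page.setViewport({ width: 1280, height: 800 });
    page.setDefaultNavigationTimeout(30000);
  },
  async onPageDestroy(page) {
    // Last chance to inspect the page before it is closed.
    console.log(`Destroying page at ${page.url()}`);
  },
  async onValidate(page) {
    // Assumption: resolve to true while the page is still usable.
    return !page.isClosed();
  },
  poolOptions: {
    max: 2,
    // generic-pool only runs validation when testOnBorrow (or testOnReturn)
    // is enabled. Whether PagePool sets this itself is not stated here.
    testOnBorrow: true,
  },
};
```

Passing `hookOptions` to `new PagePool(hookOptions)` would then apply the same setup to every page the pool creates.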

PagePool~ActionHandler : function

Action handler function that is executed with page resource from the pool.

Kind: inner typedef of PagePool

| Param | Type | Description |
| --- | --- | --- |
| page | Object | The page resource. |
| ...args | * | Action handler arguments. |

Example

const actionHandler = (page, ...args) => {
  // Do something...
};
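
An action handler is an ordinary function, so it can be defined once and reused across `process` calls. The sketch below is a hypothetical handler named `recordTitle` that uses Puppeteer's `page.goto` and `page.title`. It writes its result onto the `data` argument because the documentation here does not say whether `process` passes the handler's return value back to the caller.

```javascript
// Hypothetical action handler, not part of puppeteer-page-pool.
// Visits data.url and stores the document title on the same data object.
async function recordTitle(page, data) {
  await page.goto(data.url, { waitUntil: 'domcontentloaded' });
  data.title = await page.title();
}

// Usage with a launched pool (assumes the package is installed):
// const job = { url: 'https://angular.io' };
// await pagePool.process(recordTitle, job);
// console.log(job.title);
```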

Development Dependencies

You will need to install Node.js as a local development dependency. The npm package manager comes bundled with all recent releases of Node.js.

npm install will attempt to resolve any npm module dependencies that have been declared in the project’s package.json file, installing them into the node_modules folder.

$ npm install

Run Linter

To check that the code follows the project's code style, run:

$ npm run lint

Run Unit Tests

To make sure we did not break anything, let's run:

$ npm test

Contributing

If you would like to contribute code to the Puppeteer Page Pool repository, you can do so through GitHub by forking the repository and sending a pull request.

If you do not agree to the Contribution Agreement, do not contribute any code to the Puppeteer Page Pool repository.

When submitting code, please make every effort to follow existing conventions and style in order to keep the code as readable as possible. Please also include appropriate test cases.

That's it! Thank you for your contribution!

License

Copyright (c) 2018 - 2020 Richard Huang.

This module is free software, licensed under: GNU Affero General Public License (AGPL-3.0).

Documentation and other similar content are provided under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.