Skip to content
Headless chrome/chromium automation library (unofficial port of puppeteer)
Python HTML Other
Branch: dev
Clone or download
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
docs Allow options to be passed into pyppeteer.defaultArgs Sep 28, 2018
pyppeteer
tests
.coveragerc
.gitignore
.noserc Separate test server module Sep 1, 2017
.travis.yml Allow failure of windows tests on travis May 10, 2019
CHANGES.md Accept a list of arguments as ignoreDefaultArgs option Sep 28, 2018
LICENSE Update license description Aug 30, 2017
MANIFEST.in
Makefile
README.md Update README Aug 21, 2018
appveyor.yml
dodo.py Temporary disable spell check May 9, 2019
mypy.ini Disable mypy check on setup.py Sep 13, 2018
requirements-dev.txt Add spelling check on CI Sep 13, 2018
requirements-docs.txt Cannot pass documentation build with sphinx 1.8 Sep 13, 2018
requirements-test.txt
setup.py urllib3 > 1.25 fails in ssl error now May 10, 2019
spell.txt Accept a list of arguments as ignoreDefaultArgs option Sep 28, 2018
tox.ini Remove spell check dependencies on tox/travis May 9, 2019

README.md

Pyppeteer

PyPI PyPI version Documentation Travis status AppVeyor status codecov

Unofficial Python port of puppeteer JavaScript (headless) chrome/chromium browser automation library.

Installation

Pyppeteer requires python 3.6+. (experimentally supports python 3.5)

Install by pip from PyPI:

python3 -m pip install pyppeteer

Or install latest version from github:

python3 -m pip install -U git+https://github.com/miyakogi/pyppeteer.git@dev

Usage

Note: When you run pyppeteer first time, it downloads a recent version of Chromium (~100MB). If you don't prefer this behavior, run pyppeteer-install command before running scripts which uses pyppeteer.

Example: open web page and take a screenshot.

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Example: evaluate script on the page.

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Pyppeteer has almost same API as puppeteer. More APIs are listed in the document.

Puppeteer's document and troubleshooting are also useful for pyppeteer users.

Differences between puppeteer and pyppeteer

Pyppeteer is to be as similar as puppeteer, but some differences between python and JavaScript make it difficult.

These are differences between puppeteer and pyppeteer.

Keyword arguments for options

Puppeteer uses object (dictionary in python) for passing options to functions/methods. Pyppeteer accepts both dictionary and keyword arguments for options.

Dictionary style option (similar to puppeteer):

browser = await launch({'headless': True})

Keyword argument style option (more pythonic, isn't it?):

browser = await launch(headless=True)

Element selector method name ($ -> querySelector)

In python, $ is not usable for method name. So pyppeteer uses Page.querySelector()/Page.querySelectorAll()/Page.xpath() instead of Page.$()/Page.$$()/Page.$x(). Pyppeteer also has shorthands for these methods, Page.J(), Page.JJ(), and Page.Jx().

Arguments of Page.evaluate() and Page.querySelectorEval()

Puppeteer's version of evaluate() takes JavaScript raw function or string of JavaScript expression, but pyppeteer takes string of JavaScript. JavaScript strings can be function or expression. Pyppeteer tries to automatically detect the string is function or expression, but sometimes it fails. If expression string is treated as function and error is raised, add force_expr=True option, which force pyppeteer to treat the string as expression.

Example to get page content:

content = await page.evaluate('document.body.textContent', force_expr=True)

Example to get element's inner text:

element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)

Future Plan

  1. Catch up development of puppeteer
    • Not intend to add original API which puppeteer does not have

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

You can’t perform that action at this time.