Automated control of Chrome using Node.js based on the chrome devtools protocol

A high-fidelity user behavior simulator that automates Chrome or Chromium from Node.js, based on the Chrome DevTools Protocol.

auto-chrome borrows from puppeteer. It was rewritten because, in practice, puppeteer has a number of strange bugs that cause threads to block repeatedly and are difficult to fix, and some of its implementation details do not behave as expected.

The Chrome DevTools Protocol API is too low-level to be developer-friendly. The puppeteer API is feature-rich, but it is still cumbersome and has some bugs that are hard to avoid.

auto-chrome is designed with simplicity and ease of use in mind, focusing on simplifying common application scenarios and meeting customization needs through extensions.


  • Supports auto-focus: auto-chrome follows the currently active tab automatically, so the active tab is always directly available. This avoids the pain of switching between multiple tabs in practice and reduces the confusion caused by manual tab switching.

  • Supports active navigation monitoring, so that operations do not become confused when the user neglects to declare navigation behavior explicitly.

  • Supports automatic release of messages that time out while waiting in the queue. Because of various uncontrollable factors, devtools does not guarantee a response to every message sent. Without a timeout mechanism, such messages would wait forever and the task could not continue.

  • Decouples the mouse, keyboard, and touch input devices from individual pages, avoiding frequent switching between multiple pages.

  • Supports high-fidelity input simulation, including mouse trajectories and touch gestures.

  • Simplified error handling: a task keeps running even when something goes wrong, so that threads do not block persistently as far as possible.

  • Supports GPS geolocation.
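The timeout release described in the list above can be illustrated with a small pending-message table. This is only a sketch under assumptions, not auto-chrome's actual implementation: the class name `MessageQueue`, the `transport` object, and the choice to resolve with an error object instead of rejecting are all hypothetical. The default of 150000 mirrors the documented `timeOut` default.

```javascript
// Hypothetical sketch: a pending-message table with per-message timeouts.
// `transport` is any object with a send(string) method (e.g. a WebSocket).
class MessageQueue {
  constructor(transport, timeOut = 150000) {
    this.transport = transport;
    this.timeOut = timeOut;
    this.nextId = 1;
    this.pending = new Map(); // id -> { resolve, timer }
  }

  send(method, params = {}) {
    const id = this.nextId++;
    return new Promise((resolve) => {
      // Release the wait if no response arrives in time.
      const timer = setTimeout(() => {
        this.pending.delete(id);
        resolve({ id, error: { message: `${method} timed out after ${this.timeOut} ms` } });
      }, this.timeOut);
      this.pending.set(id, { resolve, timer });
      this.transport.send(JSON.stringify({ id, method, params }));
    });
  }

  // Call this with each raw message received from devtools.
  receive(raw) {
    const message = JSON.parse(raw);
    const entry = this.pending.get(message.id);
    if (!entry) return; // an event, or a response that already timed out
    clearTimeout(entry.timer);
    this.pending.delete(message.id);
    entry.resolve(message);
  }
}
```

Resolving with an error object, rather than rejecting, is one way to let a task carry on after a lost response; a caller that prefers exceptions could reject instead.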


npm install auto-chrome

chromium installation

Because of network restrictions in some environments, auto-chrome does not install Chromium as an npm dependency the way puppeteer does. You need to download Chromium manually and specify its installation path in the launch.executablePath configuration item.

Recommended source:

chrome devtools terminology

  • Target represents a target object in the browser, which can be one of the browser, page, iframe, or other types. When type is page, targetId corresponds to the frame id of the main frame.

  • Session The session mechanism allows multiple sessions to be created. You can bind a separate session to each Target, or let multiple Targets share the same session.

  • Page A browser tab. Chrome allows multiple Pages to be open, but only one Page is active at any time.

  • Runtime JavaScript runtime, used to inject JS code into a web page to manipulate the DOM.

  • Frame A frame in a web page. The main Frame may contain multiple child Frames.

  • Context The context in which the JavaScript runs. Since a page may contain Frames, each Frame has a separate runtime, so a unique contextId needs to be generated to distinguish between them.


A 301 redirect navigates to a new URL and can trigger several context switches in a row, which results in a context mismatch. 301 redirects are hard to detect and hard to predict, so take extra care when debugging.
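One way to avoid using a stale contextId after such switches is to keep a table of the latest context per Frame, updated from the Runtime domain events of the devtools protocol. The sketch below is hypothetical (the `ContextTracker` class is not part of auto-chrome); it assumes the caller forwards each devtools event as a `{ method, params }` object.

```javascript
// Hypothetical sketch: keep the latest default contextId for each frame.
class ContextTracker {
  constructor() {
    this.byFrame = new Map(); // frameId -> contextId
  }

  // Feed every devtools event ({ method, params }) into this handler.
  handle({ method, params }) {
    if (method === "Runtime.executionContextCreated") {
      const { id, auxData } = params.context;
      // Track only the frame's default context, not isolated worlds.
      if (auxData && auxData.isDefault) this.byFrame.set(auxData.frameId, id);
    } else if (method === "Runtime.executionContextDestroyed") {
      for (const [frameId, id] of this.byFrame) {
        if (id === params.executionContextId) this.byFrame.delete(frameId);
      }
    } else if (method === "Runtime.executionContextsCleared") {
      this.byFrame.clear();
    }
  }

  // Returns undefined while the frame has no live context.
  contextFor(frameId) {
    return this.byFrame.get(frameId);
  }
}
```

Looking up `contextFor(frameId)` immediately before each evaluation, instead of caching the id, keeps a chain of redirects from leaving an outdated id in use.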

Page navigation

Browser navigation events fall into two kinds: predictable and unpredictable. Navigation can be triggered in many ways, and mouse actions, keyboard actions, and JS scripts may all trigger navigation events that are not known in advance. If the switch to the new page is not timed correctly, messages end up being sent to the wrong context.

Predictable navigation

For explicit operations such as chrome.newPage() and page.goto(), which always involve navigation, autoChrome handles the navigation internally, so no additional handling is needed when calling them.

Unpredictable navigation

  • Navigation may refresh the current tab or create a new tab

  • Unpredictable navigation behavior of jump links triggered by JS

  • One or more 301 redirects after clicking a link

  • Uncertainty of redirects due to browser 301 caching

In the cases above, it is impossible to predict accurately whether an action will trigger a navigation event. autoChrome implements automatic navigation by polling, which is not very time-efficient and applies only to a limited set of scenarios.

// Keyboard example: wait for navigation
await Promise.all(["Enter"), chrome.autoNav()]);

// Mouse example
await Promise.all(["#input"), chrome.autoNav()]);
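The polling approach mentioned above can be pictured as two loops: one that watches for a navigation to begin, and one that waits for the page to finish loading. The following is a sketch under assumptions rather than the library's code. The `pollNavigation` function, the `state` object, and the `detectTime` and `interval` options are hypothetical; `loadTimeout` plays the role of the configuration item of the same name.

```javascript
// Hypothetical sketch of polling-based navigation detection.
// `state` is an assumed object whose flags are updated elsewhere by
// devtools event handlers: { navigating: boolean, loaded: boolean }.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollNavigation(state, { detectTime = 1000, loadTimeout = 30000, interval = 50 } = {}) {
  // Phase 1: watch for a navigation to start within the detection window.
  const detectDeadline = Date.now() + detectTime;
  while (!state.navigating) {
    if (Date.now() >= detectDeadline) return "none"; // no navigation seen
    await sleep(interval);
  }
  // Phase 2: wait for the load to finish, but no longer than loadTimeout.
  const loadDeadline = Date.now() + loadTimeout;
  while (!state.loaded) {
    if (Date.now() >= loadDeadline) return "timeout";
    await sleep(interval);
  }
  return "loaded";
}
```

The cost noted in the text is visible here: when nothing navigates, the caller still waits out the whole detection window before continuing.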

High-DPI screen resolution

If Chrome is running on a high-DPI device, touch events may be seriously misaligned. In that case, try the "--force-device-scale-factor=" startup parameter to adjust the scaling.
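As a configuration fragment, the flag would go into the args array of the options. The value 1 below is only an assumption for illustration; the right factor depends on the display.

```javascript
// Config fragment; the factor 1 is an assumed value, tune it for your display.
const options = {
  args: ["--force-device-scale-factor=1"],
};
```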


  • options Object Global instance configuration options; page-level settings take priority over these

    • args[arg, ...] Array Array of Chrome startup parameters

      • arg String A Chrome startup parameter
    • executablePath String Path to the Chrome executable

    • userDataDir String Path to the user profile. Separate profiles define separate Chrome instances, which supports parallelism in cluster mode

    • emulate Object Device emulation. This configuration is unreliable for the initial tab, probably because the initial targetCreated event is not caught.

      • viewport Object

        • mobile Boolean Mobile device, default false

        • width Number Screen width, default adaptive screen width

        • height Number Screen height, default adaptive screen height

      • geolocation Object geolocation, use Google Maps coordinates

        • longitude Number Longitude

        • latitude Number Latitude

        • accuracy Number Accuracy

    • headless Boolean Run in headless (hidden) mode, default false

    • devtools Boolean Automatically turn on devtools for each page, default false

    • timeOut Number Message response timeout, default 150000

    • ignoreHTTPSErrors Boolean Ignore https errors, default false

    • disableDownload Boolean Disable downloading of files, default false

    • loadTimeout Number Maximum time auto-navigation waits for a page to load, in ms

  • return Chrome Chrome class instance
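Put together, an options object built from the fields listed above might look like the following. All values are placeholders chosen for illustration (the paths, the viewport size, the coordinates, and the loadTimeout value are assumptions). The object would be passed to autoChrome(options).

```javascript
// Example values only; paths, sizes and coordinates are placeholders.
const options = {
  executablePath: "/path/to/chromium/chrome", // manually downloaded Chromium
  userDataDir: "/path/to/user-data",          // one profile per Chrome instance
  args: ["--start-maximized"],
  emulate: {
    viewport: { mobile: true, width: 375, height: 667 },
    geolocation: { longitude: 114.05, latitude: 22.54, accuracy: 14 },
  },
  headless: false,
  devtools: false,
  timeOut: 150000,         // message response timeout (documented default)
  ignoreHTTPSErrors: false,
  disableDownload: false,
  loadTimeout: 30000,      // ms; assumed value, no default is documented
};
```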

class: Chrome


Mouse or touch event instance, a reference to page.clicker of the currently active page


Keyboard event instance, a reference to page.keyboard of the currently active page


Map object containing all open pages

The page that is currently active


  • url String Address of the page to open; a blank page is opened by default


Closes the specified tab by pageId

  • pageId String The id of the page to close


Creates a standalone browser environment that can only be run in incognito mode.

chrome.send(method, params)

Sends a raw Chrome DevTools Protocol message

  • method String method name

  • params Object parameters


  • time Number Wait timeout

Polls for navigation and handles it automatically


Close the browser

class: Page


Mouse or touch event instance. The touch instance is used when the corresponding autoChrome(options) configuration item is true; otherwise the mouse instance is used


Keyboard instance


Device emulation. Calling this method directly may cause confusion: page.emulate() is normally run by an event when the tab is created, and a manual call takes effect late and overrides the earlier settings.

  • options Object options

    • mobile Boolean Mobile device

    • width Number screen width

    • height Number screen height

    • geolocation Object geolocation

      • longitude Number Longitude

      • latitude Number Latitude

      • accuracy Number precision


Open a new page inside the tab

Injects a JS function into the page and gets the return value after execution

  • pageFunction Function The function to inject

  • ...arg * Serializable arguments; functions are not supported

  • return Object Information about the remote resource, RemoteObject


Select a single element

  • selector String CSS selector

  • return Object A single Element instance


Selects multiple elements

  • selector String CSS selector

  • return Array Array of Element instances

Click on an element with a CSS selector


Clicks an element by CSS selector, with built-in navigation handling

  • selector String CSS selector

page.type(selector, text, options)

Focuses an input element by CSS selector and types text into it

  • selector String CSS selector

  • text String input text

  • options Object configuration information

    • delay Number input interval, ms

page.send(method, params)

Sends a raw Chrome DevTools Protocol message that includes the session

  • method String method name

  • params Object parameters


Scrolls to the viewable area of the specified element, trying to center it along the Y-axis

  • selector String CSS selector


Focuses the element with a CSS selector

  • selector String CSS selector


Gets the element coordinates by CSS selector; the values come from the getBoundingClientRect() function

  • selector String CSS selector


Close the tab


Navigate back to the previous page in the tab's history

Navigate forward to the next page in the tab's history

class: Element

Implements traceable remote elements, which avoids submitting and executing the same code repeatedly.

For large objects or DOM objects, it is not practical to return them directly, so an incremental mechanism for remote operations is needed. devtools implements state tracking by saving the execution results of injected functions and returning a reference id, so that incremental operations can be performed on top of existing remote results.
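At the protocol level, this reference mechanism corresponds to Runtime.evaluate returning a RemoteObject with an objectId, and Runtime.callFunctionOn running a follow-up function against that objectId. The helpers below only build the raw message payloads, as a sketch of the idea. Their names (`selectMessage`, `getPropertyMessage`) are hypothetical and are not part of auto-chrome.

```javascript
// Hypothetical helpers that build raw devtools protocol payloads.

// Step 1: evaluate without returnByValue, so devtools keeps the result
// remotely and answers with a RemoteObject that carries an objectId.
function selectMessage(selector) {
  return {
    method: "Runtime.evaluate",
    params: { expression: `document.querySelector(${JSON.stringify(selector)})` },
  };
}

// Step 2: run a follow-up function against the saved result, with the
// remote object bound to `this`, and return only a small serializable value.
function getPropertyMessage(objectId, name) {
  return {
    method: "Runtime.callFunctionOn",
    params: {
      objectId,
      functionDeclaration: "function (name) { return this[name]; }",
      arguments: [{ value: name }],
      returnByValue: true,
    },
  };
}
```

Each payload could be sent with page.send(method, params); the objectId for the second step would come from the response to the first.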


  • selector String

  • return Object Element instance

Selects a single element and generates a remote reference object


  • selector String

  • return Array Array of Element instances

Selects multiple elements and generates a remote reference object


  • name String

Gets the value of the specified property of the element

elment.set(name, value)

  • name String Property name

  • value * Property value

Sets the value of the specified property of the element


  • value String Value to assign

Gets or sets the value; only for form elements


Focus the element


Get the element size and coordinates with the getBoundingClientRect function


Quickly scrolls the element into the viewable area

class: Mouse

mouse.click(x, y, options)

Adds mousemoved track simulation. The original click may trigger the move event only once, for efficiency reasons.

The click operation already includes a move, so in most cases a separate move is unnecessary, unless you only want to move the mouse without clicking.

  • options Object

    • steps Number The number of times the mousemoved event is triggered, default 20

mouse.move(x, y, options)

The default value of steps is changed to 20. The original value is 1, which triggers the event only once. For the same move distance, fewer triggers mean a faster move.

  • options Object options

    • steps Number The number of times the mousemoved event is triggered, the default value is 20
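To make the role of steps concrete, the sketch below splits a move into evenly spaced points, one per mousemoved event. It assumes plain linear interpolation, which is how puppeteer spaces its moves; auto-chrome may shape its track differently. The function name `mouseTrack` is hypothetical.

```javascript
// Hypothetical sketch: intermediate points for a mouse move, assuming
// linear interpolation between the start and target coordinates.
function mouseTrack(from, to, steps = 20) {
  const points = [];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps; // fraction of the distance covered at step i
    points.push({
      x: from.x + (to.x - from.x) * t,
      y: from.y + (to.y - from.y) * t,
    });
  }
  return points; // the last point is exactly the target
}
```

With steps set to 1 the array holds only the target point, which matches the single-trigger behavior described above.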

mouse.scroll(x, y, step)

Scroll to specified coordinates, currently only supports vertical scrolling

  • x Number horizontal coordinate, 0

  • y Number vertical coordinate

  • step Number step length

class: Touch

touch.slide({start, end, steps})

Simulate touch single swipe gesture

  • start Object Start coordinates

    • x Number touchstart x-coordinate

    • y Number touchstart y coordinate

  • end Object end coordinates

    • x Number touchend x coordinate

    • y Number touchend y coordinate

  • steps Number number of touchmove triggers

  • delay Number dwell time before touch release, used for slide inertia control
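In terms of raw protocol messages, a swipe like this maps onto a sequence of Input.dispatchTouchEvent calls: one touchStart, several touchMove events, and a final touchEnd. The sketch below builds that sequence under assumptions: the helper name `slideMessages`, the linear spacing, and the default of 20 steps are hypothetical and are not taken from auto-chrome.

```javascript
// Hypothetical sketch: the devtools messages behind a single-finger swipe.
function slideMessages({ start, end, steps = 20 }) {
  const touch = (type, touchPoints) => ({
    method: "Input.dispatchTouchEvent",
    params: { type, touchPoints },
  });
  const messages = [touch("touchStart", [{ x: start.x, y: start.y }])];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps; // evenly spaced points along the swipe
    messages.push(
      touch("touchMove", [
        { x: start.x + (end.x - start.x) * t, y: start.y + (end.y - start.y) * t },
      ])
    );
  }
  // touchEnd must carry no touch points: the finger has been lifted.
  messages.push(touch("touchEnd", []));
  return messages;
}
```

A caller would send these in order and pause for delay milliseconds before the final touchEnd, since the dwell before release is what controls slide inertia.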

touch.scroll(x, y, options)

Scrolls the page to the specified visual coordinates by touch

  • x Number Target x coordinate

  • y Number target y-coordinate

  • options Object

    • interval Number The interval between successive swipes, default 2000, in ms

