Automated control of Chrome or Chromium from Node.js: a high-fidelity user-behavior simulator built on the Chrome DevTools Protocol.
auto-chrome borrows from puppeteer. It was rewritten because, in practice, puppeteer showed various strange bugs that caused threads to block constantly and were difficult to fix, and some of its implementation details did not behave as expected.
The raw Chrome DevTools Protocol API is too low-level and not developer-friendly. The puppeteer API is feature-rich, but it is still cumbersome and has some bugs that are hard to avoid.
auto-chrome is designed for simplicity and ease of use. It focuses on simplifying common scenarios and meets customization needs through extensions.
- auto-chrome supports auto-focus: it follows the currently active tab automatically and exposes it via chrome.page. This avoids the pain of switching between multiple tabs by hand and the confusion that manual switching causes.
- Supports active navigation monitoring, so that operations do not become confused when the user neglects to define navigation behavior explicitly.
- Supports automatic release of messages whose wait has timed out. Because of various uncontrollable factors, devtools does not guarantee a response to every message sent. Without a timeout mechanism, such a message would wait forever and the task could not continue.
- Decouples mouse, keyboard, and touch input devices from the page, avoiding frequent switching between multiple pages.
- Supports high-fidelity input simulation, including mouse trajectories and touch gestures.
- Simplified error handling: even if a task fails, execution can continue, so that threads are kept from blocking persistently as far as possible.
- Supports GPS location.
npm install auto-chrome
Because of network conditions, auto-chrome does not install Chromium as an npm dependency the way puppeteer does. You need to download Chromium manually and specify its installation path in the launch.executablePath configuration item.
Recommended source: https://npm.taobao.org/mirrors/chromium-browser-snapshots/
- Target: Represents a target object in the browser, which can be the browser, a page, an iframe, or another type. When type is page, targetId corresponds to the frame id of the main frame.
- Session: The session mechanism is used to create multiple sessions. You can bind a separate session to each Target, or let multiple Targets share the same session.
- Page: A browser tab. Chrome allows multiple Pages to be open, but only one Page is active at a time.
- Runtime: The JavaScript runtime, used to inject JS code into a web page to manipulate the DOM.
- Frame: A frame in a web page. The main Frame may contain multiple child frames.
- Context: The context in which JavaScript runs. Since a page may contain several Frames and each Frame has a separate runtime, a unique contextId is generated to distinguish between them.
A 301 redirect to a new URL triggers several context switches in a row, which can cause a context mismatch. 301 redirects are hard to detect and hard to predict, so take extra care when debugging.
Browser navigation events fall into two kinds, predictable and unpredictable. Navigation can be triggered in many ways: the mouse, the keyboard, or a JS script may each trigger an unexpected navigation event. If the navigation switch is not timed correctly, messages end up in the wrong context.
Operations such as chrome.newPage() and page.goto() explicitly include navigation behavior. autoChrome wraps them internally, so they need no additional handling when used.
- Navigation may refresh the current tab or create a new tab.
- Jump links triggered by JS have unpredictable navigation behavior.
- Clicking a link may be followed by one or more 301 redirects.
- Browser caching of 301 responses makes redirects uncertain.
In the cases above, it is impossible to predict accurately whether an action will trigger a navigation event. autoChrome implements automatic navigation by polling, which has the disadvantages of poor time efficiency and limited application scenarios.
// Keyboard example: wait for the navigation triggered by the key press
await Promise.all([chrome.keyboard.press("Enter"), chrome.autoNav()]);
// Mouse example
await Promise.all([page.click("#input"), chrome.autoNav()]);
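The polling idea behind automatic navigation can be illustrated with a generic helper. This is only a sketch of the technique and not autoChrome's actual implementation; the name `pollUntil` and its `interval` and `timeout` options are hypothetical.

```javascript
// Hypothetical helper: call `check` repeatedly until it returns a truthy
// value or the deadline passes. Resolves with that value, or with
// undefined if the timeout is reached first.
async function pollUntil(check, { interval = 100, timeout = 5000 } = {}) {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    const result = await check();
    if (result) return result;
    // Sleep between checks so the loop does not spin.
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
  return undefined;
}
```

The trade-off matches the one described above: a short interval reacts quickly but wastes checks, while a long interval adds latency after every action that does navigate.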
If Chrome is running on a high-DPI (high-resolution) display, you may hit a serious bug where touch events are misaligned. In that case, try the "--force-device-scale-factor=" startup parameter to adjust the scaling.
- options Object: Global instance configuration options, with lower priority than page.
  - args[ars, ...] Array: Array of Chrome startup parameters.
    - ars String: A Chrome startup parameter.
  - executablePath String: Path to the Chrome executable.
  - userDataDir String: Path to the user profile. Separate paths define separate Chrome instances, which supports parallelism in cluster mode.
  - emulate Object: Device emulation. This configuration is less effective for the initial tab, probably because the initial targetCreated event is not caught.
    - viewport Object
      - mobile Boolean: Mobile device, default false.
      - width Number: Screen width. By default it adapts to the screen width.
      - height Number: Screen height. By default it adapts to the screen height.
    - geolocation Object: Geolocation, using Google Maps coordinates.
      - longitude Number: Longitude.
      - latitude Number: Latitude.
      - accuracy Number: Accuracy.
  - headless Boolean: Headless (hidden) execution mode, default false.
  - devtools Boolean: Automatically open devtools for each page, default false.
  - timeOut Number: Message response timeout, default 150000.
  - ignoreHTTPSErrors Boolean: Ignore HTTPS errors, default false.
  - disableDownload Boolean: Disable file downloads, default false.
  - loadTimeout Number: Maximum time auto-navigation waits for a page to load, in ms.
- return Chrome: Chrome class instance.
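As a sketch, an options object that uses only the keys documented above might look like this. Every value is a placeholder chosen for illustration (paths, coordinates, and timeouts are not recommendations), and the commented call at the end only assumes the autoChrome(options) entry point mentioned in this document.

```javascript
// Illustrative launch options; all values are placeholders.
const options = {
  executablePath: "/path/to/chromium", // placeholder: your local Chromium binary
  userDataDir: "/tmp/auto-chrome-profile", // placeholder profile directory
  args: ["--force-device-scale-factor=1"], // example startup parameter
  emulate: {
    viewport: { mobile: true, width: 400, height: 800 },
    geolocation: { longitude: 114.05, latitude: 22.54, accuracy: 14 },
  },
  headless: false,
  devtools: false,
  timeOut: 150000, // documented default
  ignoreHTTPSErrors: false,
  disableDownload: false,
  loadTimeout: 30000, // placeholder, in ms
};

// Assumed call shape, not verified here:
// const chrome = await autoChrome(options);
```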
Mouse or touch event instance; references page.clicker of the currently active page.
Keyboard event instance; references page.keyboard of the currently active page.
Map object containing all open pages.
The currently active page.
- url String: Address of the page to open. By default a blank page is opened.
Closes the specified tab by pageId.
- pageId String: The id of the page to close.
Creates a standalone browser environment that can only be run in incognito mode.
Sends a raw Chrome DevTools Protocol message.
- method String: Method name.
- params Object: Parameters.
- time Number: Wait timeout.
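The wait timeout on a message can be pictured as a race between the pending response and a timer. The following is a minimal sketch under that assumption, not the library's actual code; `withTimeout` and its `label` parameter are hypothetical names.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms`.
function withTimeout(promise, ms, label = "message") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins. The timer is always cleared so that a
  // fast response does not leave a pending timer behind.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

With a wrapper like this, a message that never receives a response surfaces as an ordinary rejected promise, which the caller can catch and move past instead of waiting indefinitely.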
Polling-based monitoring for automatic navigation
Close the browser
Mouse or touch event instance. The touch instance is used when the emulate.viewport.mobile configuration item of autoChrome(options) is true; otherwise the mouse instance is used.
Keyboard instance
Device emulation. Calling this method directly may cause confusion: normally page.emulate() is executed, driven by an event, when the tab is created, and manual calls suffer from a delayed-override problem.
- options Object: Options.
  - mobile Boolean: Mobile device.
  - width Number: Screen width.
  - height Number: Screen height.
  - geolocation Object: Geolocation.
    - longitude Number: Longitude.
    - latitude Number: Latitude.
    - accuracy Number: Accuracy.
Open a new page inside a tab
Injects a JS function into the page and returns its result after execution.
- pageFunction Function: The function to inject.
- arg *: Serializable arguments; functions are not supported.
- return Object: Information about the remote resource (RemoteObject).
Selects a single element.
- selector String: CSS selector.
- return Object: A single Elment instance.
Selects multiple elements.
- selector String: CSS selector.
- return Array: Array of Elment instances.
Clicks an element by CSS selector.
Clicks an element by CSS selector, with built-in navigation handling.
- selector String: CSS selector.
Focuses an input by CSS selector and types text into it.
- selector String: CSS selector.
- text String: Text to type.
- options Object: Configuration.
  - delay Number: Input interval, in ms.
Sends a raw Chrome DevTools Protocol message that includes the session.
- method String: Method name.
- params Object: Parameters.
Scrolls the specified element into the viewable area, trying to center it along the Y-axis.
- selector String: CSS selector.
Focuses an element by CSS selector.
- selector String: CSS selector.
Gets an element's coordinates by CSS selector. The values come from the getBoundingClientRect() function.
- selector String: CSS selector.
Close the tab
Navigate to the previous page in the history
Navigate to the next page in the history
Implements traceable remote elment references, which avoids submitting the same code and executing it repeatedly.
Returning large objects or DOM objects directly is impractical, so remote operations need an incremental mechanism. devtools tracks state by saving the result of an injected function and returning a reference id, so that further operations can build on existing remote results.
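At the protocol level, this reference mechanism corresponds to the Chrome DevTools Protocol methods Runtime.evaluate (which can return a RemoteObject carrying an objectId) and Runtime.callFunctionOn (which runs a function against that objectId). The sketch below only builds the two message payloads. The helper names and the example objectId are hypothetical, and this is not presented as how auto-chrome constructs its messages internally.

```javascript
// Step 1: ask for a handle to the result rather than a serialized copy.
function buildEvaluate(expression) {
  return {
    method: "Runtime.evaluate",
    params: { expression, returnByValue: false },
  };
}

// Step 2: run a function with `this` bound to the remote object.
function buildCallOn(objectId, fn, args = []) {
  return {
    method: "Runtime.callFunctionOn",
    params: {
      objectId,
      functionDeclaration: fn.toString(),
      arguments: args.map((value) => ({ value })),
      returnByValue: true,
    },
  };
}

const select = buildEvaluate('document.querySelector("#input")');
// "remote-object-1" stands in for the objectId the browser would return.
const readAttr = buildCallOn(
  "remote-object-1",
  function (name) { return this.getAttribute(name); },
  ["placeholder"]
);
```

Because the second message refers to the element by id, the selector query does not have to be sent or executed again for each follow-up operation.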
- selector String
- return Object: Elment instance.
Selects a single element and generates a remote reference object.
- selector String
- return Array: Array of Elment instances.
Selects multiple elements and generates a remote reference object.
- name String
Gets the value of the specified attribute of the elment.
- name String: Attribute name.
- value *: Attribute value.
Sets the value of the specified attribute of the elment.
- value String: Value to assign.
Gets or sets the value; only for form elements.
Focuses the element.
Gets the element's size and coordinates with the getBoundingClientRect function.
Quickly scrolls the element into the viewable area.
Adds mousemoved trajectory simulation. The original click triggers the event only once, probably for efficiency reasons.
The click operation already includes a move, so in most cases a separate move is unnecessary, unless you only want to move the mouse without clicking.
- options Object
  - steps Number: Number of times the mousemoved event is triggered, default 20.
The default value of steps is changed to 20; the original value is 1, i.e. the event is triggered only once. For the same move distance, fewer triggers mean a faster move.
- options Object: Options.
  - steps Number: Number of times the mousemoved event is triggered, default 20.
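To make the effect of steps concrete, here is a minimal sketch that splits a straight-line move into intermediate points, one per mousemoved event. It assumes simple linear interpolation. The library's real trajectory may be shaped differently, and `interpolatePath` is a hypothetical name.

```javascript
// Hypothetical helper: return `steps` points along the straight line from
// `from` to `to`, ending exactly on `to`. Each point would correspond to
// one mousemoved event.
function interpolatePath(from, to, steps = 20) {
  const points = [];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps; // fraction of the distance covered so far
    points.push({
      x: from.x + (to.x - from.x) * t,
      y: from.y + (to.y - from.y) * t,
    });
  }
  return points;
}
```

With steps set to 1 the result is a single point at the target, which matches the "triggered only once" behavior described above. Larger values spread the same distance over more events.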
Scrolls to the specified coordinates. Currently only vertical scrolling is supported.
- x Number: Horizontal coordinate, 0.
- y Number: Vertical coordinate.
- step Number: Step length.
Simulates a single touch swipe gesture.
- start Object: Start coordinates.
  - x Number: touchstart x coordinate.
  - y Number: touchstart y coordinate.
- end Object: End coordinates.
  - x Number: touchend x coordinate.
  - y Number: touchend y coordinate.
- steps Number: Number of touchmove events triggered.
- delay Number: Dwell time before the touch is released, used to control slide inertia.
Scrolls the page to the specified visual coordinates by touch.
- x Number: Target x coordinate.
- y Number: Target y coordinate.
- options Object
  - interval Number: Interval between successive swipes, in ms, default 2000.