feat(keyboard): add pressSequence() for batched key presses #40734

Closed
SebTardif wants to merge 2 commits into microsoft:main from SebTardif:feat/keyboard-press-sequence

Conversation

@SebTardif
Contributor

Summary

  • Add keyboard.pressSequence(keys, options) to press an array of named keys in a single protocol round-trip
  • Follows the same server-side batching pattern as the existing keyboard.type() for characters

Motivation

keyboard.type("hello") batches character-level key presses into a single protocol message, with all character events processed server-side. There is no equivalent for named keys (Space, ArrowDown, Enter, etc.).

When automating keyboard-driven interactions like drag-and-drop via accessibility patterns (Space to grab, Arrow keys to move, Space to drop) or multi-step form navigation, each keyboard.press() call requires a separate protocol round-trip:

// Current: 5 protocol round-trips
await page.keyboard.press('Space');
await page.keyboard.press('ArrowDown');
await page.keyboard.press('ArrowDown');
await page.keyboard.press('ArrowDown');
await page.keyboard.press('Space');

// New: 1 protocol round-trip
await page.keyboard.pressSequence(['Space', 'ArrowDown', 'ArrowDown', 'ArrowDown', 'Space'], { delay: 100 });

This is particularly relevant for:

  • Keyboard-based drag-and-drop: Mouse-based dragTo() does not work with popular DnD libraries like react-beautiful-dnd and dnd-kit (#13855, #35749). The common workaround is keyboard-based reordering (Space/Arrow/Space), which requires multiple sequential keyboard.press() calls.

  • AI browser agents: Tools like browser-use, Playwright MCP, and other automation agents rely heavily on keyboard sequences for form navigation. Each keyboard.press() call incurs protocol serialization + WebSocket frame + async scheduling overhead, which compounds in tight loops. See browser-use#4683 for a related request about faster input clearing.

  • Remote automation: When Playwright connects to a remote browser via connect() or connect_over_cdp(), network latency raises per-call overhead from ~5-10ms locally to 50-200ms. A 5-key drag sequence therefore costs roughly 25-50ms of pure overhead locally, but ~250ms to ~1s remotely.
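
The remote-automation figures come from multiplying the per-call overhead by the number of calls. A minimal sketch of that arithmetic, using only the latency ranges quoted above; `estimateOverheadMs` and `RangeMs` are hypothetical names for illustration, not Playwright APIs:

```typescript
// Hypothetical helper: total protocol overhead for N sequential calls,
// given a per-call latency range in milliseconds.
interface RangeMs { min: number; max: number; }

function estimateOverheadMs(calls: number, perCall: RangeMs): RangeMs {
  return { min: calls * perCall.min, max: calls * perCall.max };
}

// Per-call overhead ranges quoted in the description above.
const local: RangeMs = { min: 5, max: 10 };
const remote: RangeMs = { min: 50, max: 200 };

// Five separate keyboard.press() calls pay the overhead five times.
const localTotal = estimateOverheadMs(5, local);     // { min: 25, max: 50 }
const remoteTotal = estimateOverheadMs(5, remote);   // { min: 250, max: 1000 }

// A single batched call pays the round-trip overhead once.
const batchedRemote = estimateOverheadMs(1, remote); // { min: 50, max: 200 }
```

This counts protocol overhead only; any `delay` between presses adds to wall-clock time in both variants.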

This batching cannot be moved into a single page.evaluate() call: CDP keyboard events go through the browser's native input pipeline (Input.dispatchKeyEvent), triggering IME, contenteditable, autocomplete, and framework-specific event handlers that synthetic dispatchEvent() calls do not trigger.

API

// Press keys sequentially with optional delay between presses
await page.keyboard.pressSequence(['Space', 'ArrowDown', 'Space'], { delay: 100 });

// Supports modifier combinations (same format as keyboard.press)
await page.keyboard.pressSequence(['ControlOrMeta+A', 'Backspace']);

// Empty array is a no-op
await page.keyboard.pressSequence([]);
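
For readers unfamiliar with the key-string format, the sketch below shows one way an entry such as 'ControlOrMeta+A' could be split into modifiers and a final key. It is illustrative only: `parseKeyCombo` and `KeyCombo` are hypothetical names, and this is not claimed to be Playwright's actual parser.

```typescript
// Illustrative only: split a key string such as "ControlOrMeta+A" into
// its modifiers and the final key. Not Playwright's own implementation.
interface KeyCombo { modifiers: string[]; key: string; }

function parseKeyCombo(combo: string): KeyCombo {
  const tokens: string[] = [];
  let current = '';
  for (const ch of combo) {
    // A '+' separates tokens only while a token is in progress, so that
    // "Control++" yields the '+' key itself as the final token.
    if (ch === '+' && current !== '') {
      tokens.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  tokens.push(current);
  const key = tokens.pop() as string;
  if (key === '') throw new Error(`Invalid key string: "${combo}"`);
  return { modifiers: tokens, key };
}

// Each entry of a sequence is parsed independently.
const parsedSequence = ['ControlOrMeta+A', 'Backspace'].map(k => parseKeyCombo(k));
```

Under these assumptions a plain key like 'Backspace' parses to an empty modifier list, and an empty or dangling string such as 'A+' is rejected.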

Implementation

The change follows the standard 6-step API addition process: docs, client, protocol, dispatcher, server, tests. The server-side pressSequence() method iterates the key array and delegates each press to the existing press() method, applying the delay between consecutive presses. The PR adds 7 tests covering basic sequencing, delay timing, empty and single-element arrays, modifier combos, event ordering, and invalid key error handling.
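
The loop described above can be sketched as follows. This is a minimal approximation of the described semantics, not the code from the PR: `KeyPresser`, `Sleep`, and the injected `sleep` parameter are assumptions made so the sketch can run without a browser.

```typescript
// Minimal stand-in for the part of the keyboard that the loop needs.
interface KeyPresser {
  press(key: string): Promise<void>;
}

// Injected so the sketch can be exercised without real timers.
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = ms =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

async function pressSequence(
  keyboard: KeyPresser,
  keys: string[],
  options: { delay?: number } = {},
  sleep: Sleep = realSleep,
): Promise<void> {
  const delay = options.delay ?? 0;
  for (let i = 0; i < keys.length; i++) {
    // The delay sits between consecutive presses, so none before the first.
    if (i > 0 && delay > 0) await sleep(delay);
    // Delegate to the existing single-key press; an invalid key rejects here
    // and stops the rest of the sequence.
    await keyboard.press(keys[i]);
  }
}
```

An empty array never enters the loop, which matches the no-op behavior listed under API. Run client-side against a real page.keyboard, the same helper would still issue one round-trip per key; the saving described in the PR comes from running the loop on the server.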

@pavelfeldman
Member

It is best to start by filing an issue; please see the contributors' guide.

@SebTardif
Contributor Author

Filed #40740 as requested.
