@emregucerr

why

OpenAI's computer-use-preview model tends to clear input fields before typing by sending CTRL+A followed by DEL. However, the agent handler does not handle meta keys properly, so the combination is not applied as a chord and an extra 'A' is typed into the field.

what changed

Added a special case in the agent handler for keys arrays that include a meta key.

caveats

[IMPORTANT] This PR completely ignores Windows machines and only works on Linux and macOS.

@changeset-bot

changeset-bot bot commented Apr 18, 2025

⚠️ No Changeset found

Latest commit: fc74166

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@greptile-apps greptile-apps bot left a comment

PR Summary

Fixed key combination handling in the Stagehand agent to properly manage CTRL+A and DEL sequences when clearing input fields, particularly for OpenAI's computer-use-preview model.

  • Added special case in /lib/handlers/agentHandler.ts to handle meta key combinations by pressing keys simultaneously
  • Modified key mapping to use Meta on macOS and Control on other platforms
  • Fixed issue where CTRL+A was incorrectly typing extra 'A' character in input fields
  • LIMITATION: Implementation only works on Linux and macOS, not Windows
  • IMPORTANT: Ensure users are aware of platform-specific behavior for key combinations
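
The special case summarized above depends on recognizing when a keys array is a combination rather than a series of independent key presses. A minimal sketch of such a check follows. The names `MODIFIER_NAMES`, `isModifierKey`, and `isKeyCombination` are hypothetical and do not come from the PR, and the list of modifier spellings is an assumption about what the model may emit.

```typescript
// Hypothetical list of modifier spellings; the PR itself only checks CTRL, CMD and COMMAND.
const MODIFIER_NAMES = new Set(["CTRL", "CONTROL", "CMD", "COMMAND", "META", "ALT", "SHIFT"]);

/** Returns true when `key` is one of the modifier spellings above (case-insensitive). */
function isModifierKey(key: string): boolean {
  return MODIFIER_NAMES.has(key.toUpperCase());
}

/** Returns true when `keys` has more than one entry and at least one entry is a modifier. */
function isKeyCombination(keys: string[]): boolean {
  return keys.length > 1 && keys.some(isModifierKey);
}
```

With this check, `["CTRL", "A"]` would be routed to the combination path, while `["DEL"]` would keep going through the ordinary single-key path.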

1 file(s) reviewed, 2 comment(s)

Comment on lines 292 to 296
const playwrightKeys = keys.map((key) => {
  if (key.includes("CTRL")) return "Meta";
  if (key.includes("CMD") || key.includes("COMMAND"))
    return "Meta";
  return this.convertKeyName(key);
logic: Always mapping CTRL to Meta will break Windows functionality. Need to handle Windows platform separately instead of ignoring it.

Suggested change
- const playwrightKeys = keys.map((key) => {
-   if (key.includes("CTRL")) return "Meta";
-   if (key.includes("CMD") || key.includes("COMMAND"))
-     return "Meta";
-   return this.convertKeyName(key);
+ const playwrightKeys = keys.map((key) => {
+   if (key.includes("CTRL")) return process.platform === "darwin" ? "Meta" : "Control";
+   if (key.includes("CMD") || key.includes("COMMAND"))
+     return "Meta";
+   return this.convertKeyName(key);
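
The platform-dependent mapping suggested above can be pulled out into a small pure function, which makes it testable without a browser. This is only a sketch under stated assumptions: `toPlaywrightKey` is a hypothetical name, the `platform` argument is expected to carry a value like Node's `process.platform`, and `fallback` stands in for the handler's `convertKeyName`, whose body is not shown in this conversation.

```typescript
/**
 * Maps a model-provided key name to a Playwright key name.
 * `platform` is expected to be a Node `process.platform` value such as "darwin", "linux" or "win32".
 * `fallback` is a stand-in for the handler's convertKeyName(); by default it returns the key unchanged.
 */
function toPlaywrightKey(
  key: string,
  platform: string,
  fallback: (key: string) => string = (k) => k,
): string {
  const upper = key.toUpperCase();
  if (upper === "CTRL" || upper === "CONTROL") {
    // As in the suggested change: Meta (Command) on macOS, Control everywhere else.
    return platform === "darwin" ? "Meta" : "Control";
  }
  if (upper === "CMD" || upper === "COMMAND") {
    return "Meta";
  }
  return fallback(key);
}
```

A caller would pass `process.platform` as the second argument. Note that `process.platform` describes the machine running Node, which may differ from the machine running the browser when the browser is remote.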

Comment on lines 299 to 310
// Press all keys down in sequence
for (const key of playwrightKeys) {
  await this.stagehandPage.page.keyboard.down(key);
}

// Small delay to ensure the combination is registered
await new Promise((resolve) => setTimeout(resolve, 100));

// Release all keys in reverse order
for (const key of playwrightKeys.reverse()) {
  await this.stagehandPage.page.keyboard.up(key);
}
style: Key release order should match press order for consistent behavior. Currently pressing in forward order but releasing in reverse.
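
To make the press and release order explicit, the loop under review could be wrapped in a helper that takes the release order as a parameter. The sketch below is illustrative only: `KeyboardLike` and `pressChord` are hypothetical names, and the interface covers just the `down` and `up` methods that the reviewed code calls on `page.keyboard`. It copies the array before reversing, because `Array.prototype.reverse()` reverses in place and would otherwise reorder the caller's array.

```typescript
/** Minimal keyboard surface used by the sketch: the `down` and `up` calls from the reviewed code. */
interface KeyboardLike {
  down(key: string): Promise<unknown>;
  up(key: string): Promise<unknown>;
}

/**
 * Holds all `keys` down in the given order, waits `holdMs`, then releases them.
 * `releaseOrder` defaults to "reverse" (what the PR does); "forward" matches the review suggestion.
 */
async function pressChord(
  keyboard: KeyboardLike,
  keys: readonly string[],
  holdMs = 0,
  releaseOrder: "reverse" | "forward" = "reverse",
): Promise<void> {
  const pressed: string[] = [];
  try {
    for (const key of keys) {
      await keyboard.down(key);
      pressed.push(key);
    }
    if (holdMs > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, holdMs));
    }
  } finally {
    // Release only keys that were actually pressed, using a copy so `keys` is never mutated.
    const toRelease = releaseOrder === "reverse" ? [...pressed].reverse() : [...pressed];
    for (const key of toRelease) {
      await keyboard.up(key);
    }
  }
}
```

The `finally` block is a design choice of the sketch, not of the PR: it releases any keys already held if a `down` call throws partway through, so a modifier is not left stuck.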

@emregucerr emregucerr closed this Apr 24, 2025