feat(computer-use): verify screenshot flow, OCR click trust, key chords #299

Merged
GCWing merged 1 commit into GCWing:main from bobleer:feat/computer-use-ocr-ax-prompt
Mar 28, 2026

@bobleer commented Mar 28, 2026

Summary

  • Verify screenshot: After a click / key / type / scroll / drag action, recommend a screenshot to confirm the resulting UI state (Cowork-style); exposed via recommend_screenshot_to_verify_last_action and the interaction state.
  • Pointer trust: After move_to_text (OCR globals) and after type_text, relax the stale-capture guard so that Enter/search flows are not blocked.
  • Key chords: Adjust modifier timing for Cmd/Ctrl combos; add arrow key name aliases; make the chord path work on non-macOS platforms.
  • macOS: AX/OCR and ui_locate refinements (screen_ocr, macos_ax_ui, ui_locate_common).
  • Core: Updates to computer_use_host, tool implementations, and claw_mode.md; web-ui session-config i18n.
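The verify-screenshot and pointer-trust behavior described above can be pictured as a small state machine. The following is a minimal sketch, not the code in this PR: only `pending_verify_screenshot`, `recommend_screenshot_to_verify_last_action`, `move_to_text`, and `type_text` come from the description, while the `Action` enum, the `InteractionState` struct, the `pointer_trusted` field, and `guard_allows_commit` are hypothetical names chosen for illustration. The exact rules for when trust is dropped are an assumption.

```rust
/// Hypothetical action kinds, named after the tools mentioned in the summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Screenshot,
    Click,
    Key,
    TypeText,
    Scroll,
    Drag,
    /// OCR-based pointer move (`move_to_text`).
    MoveToText,
}

/// Hypothetical interaction state; only `pending_verify_screenshot`
/// is a name taken from the PR description.
#[derive(Debug, Default)]
struct InteractionState {
    /// Set after a committed UI action until a fresh screenshot is taken.
    pending_verify_screenshot: bool,
    /// Assumed flag: whether the stale-capture guard lets the next
    /// click or key press through without a new capture.
    pointer_trusted: bool,
}

impl InteractionState {
    /// Update the state after an action has been performed.
    fn record(&mut self, action: Action) {
        match action {
            // A fresh capture clears the pending flag and restores trust.
            Action::Screenshot => {
                self.pending_verify_screenshot = false;
                self.pointer_trusted = true;
            }
            // Assumption: these may change the UI, so the last capture is stale.
            Action::Click | Action::Key | Action::Scroll | Action::Drag => {
                self.pending_verify_screenshot = true;
                self.pointer_trusted = false;
            }
            // Typing is a committed action, but trust is kept so that a
            // following Enter is not blocked.
            Action::TypeText => {
                self.pending_verify_screenshot = true;
                self.pointer_trusted = true;
            }
            // An OCR move positions the pointer on known text, so it is trusted.
            Action::MoveToText => {
                self.pointer_trusted = true;
            }
        }
    }

    /// Hint for the tool result: should the model take a screenshot now?
    fn recommend_screenshot_to_verify_last_action(&self) -> bool {
        self.pending_verify_screenshot
    }

    /// Assumed guard check performed before a click or key press.
    fn guard_allows_commit(&self) -> bool {
        self.pointer_trusted
    }
}
```

In this sketch a `type_text` followed by Enter passes the guard, while a click followed by another click does not until a screenshot is recorded; in both cases the screenshot recommendation stays set until the next capture.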

Testing

  • cargo check -p bitfun-desktop (pass)

Notes

10 files changed; branched from upstream/main.

- Track pending_verify_screenshot and recommend screenshot after committed UI actions
- Trust pointer after OCR move_to_text and after type_text (avoid blocking Enter)
- Key chords: modifier hold timing; arrow key name aliases; chord handling on non-macOS
- macOS AX/OCR and ui_locate refinements; core tool result and claw_mode prompt updates
- Session-config i18n for new interaction hints
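To illustrate the key-chord items above, here is a hedged sketch of one way a chord string could be expanded into an ordered event list: modifiers go down first, are held briefly, the main key is tapped, and the modifiers are released in reverse order. Everything in it is an assumption made for illustration, including the `KeyEvent` type, the `normalize_key`, `is_modifier`, and `chord_events` functions, the specific alias spellings, and the `+` separator; the PR's actual implementation and timing values are not shown in this description.

```rust
/// Hypothetical low-level events a chord expands into.
#[derive(Debug, Clone, PartialEq, Eq)]
enum KeyEvent {
    Down(String),
    Up(String),
    /// Pause in milliseconds so the OS registers the held modifiers.
    Wait(u64),
}

/// Map a few illustrative aliases to canonical key names.
/// The alias list is an assumption, not the one used in the PR.
fn normalize_key(name: &str) -> String {
    let lower = name.trim().to_ascii_lowercase();
    match lower.as_str() {
        "arrowup" | "up_arrow" => "up".to_string(),
        "arrowdown" | "down_arrow" => "down".to_string(),
        "arrowleft" | "left_arrow" => "left".to_string(),
        "arrowright" | "right_arrow" => "right".to_string(),
        "command" => "cmd".to_string(),
        "control" => "ctrl".to_string(),
        other => other.to_string(),
    }
}

/// Canonical names treated as modifiers in this sketch.
fn is_modifier(key: &str) -> bool {
    matches!(key, "cmd" | "ctrl" | "alt" | "shift")
}

/// Expand a chord such as "Cmd+ArrowUp" into an ordered event list.
fn chord_events(chord: &str, hold_ms: u64) -> Vec<KeyEvent> {
    let keys: Vec<String> = chord.split('+').map(normalize_key).collect();
    let (mods, mains): (Vec<String>, Vec<String>) =
        keys.into_iter().partition(|k| is_modifier(k));

    let mut events = Vec::new();
    // Press every modifier first.
    for m in &mods {
        events.push(KeyEvent::Down(m.clone()));
    }
    // Hold briefly before the main key, only if something is held.
    if !mods.is_empty() {
        events.push(KeyEvent::Wait(hold_ms));
    }
    // Tap each non-modifier key.
    for k in &mains {
        events.push(KeyEvent::Down(k.clone()));
        events.push(KeyEvent::Up(k.clone()));
    }
    // Release modifiers in reverse order of pressing.
    for m in mods.iter().rev() {
        events.push(KeyEvent::Up(m.clone()));
    }
    events
}
```

Because the expansion is a pure function over strings, the same path can run on any platform, with only the final step of sending each event to the OS being platform-specific.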
@GCWing GCWing merged commit a3d4215 into GCWing:main Mar 28, 2026
4 checks passed
