support claude computer use on livekit-plugins-browser #4882
theomonnom merged 22 commits into main
- Revert ProviderTool to just having id (remove definition)
- Add AnthropicTool/ComputerUse in anthropic plugin (matches OpenAI pattern)
- Route AnthropicTool.to_dict() through anthropic LLM plugin
- Remove provider tool appending from core format handler
- JSON-encode screenshot content in FunctionCallOutput.output
- Remove FunctionCallOutput.content field
- Hardcode f-keys, remove redundant comments, fix lint
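As a rough illustration of the two ideas in this commit (a provider tool that serializes itself, and a screenshot carried as a JSON string in a plain output field), here is a minimal sketch. The class body, the default display size, the tool version string, and the helper `encode_screenshot_output` are assumptions made for illustration; they are not taken from the plugin's code.

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class ComputerUse:
    """Hypothetical stand-in for a provider tool that serializes itself."""

    display_width_px: int = 1280
    display_height_px: int = 720
    # Assumed tool version; the commit message does not say which one is used.
    tool_type: str = "computer_20250124"

    def to_dict(self) -> dict:
        # Shape of a computer-use tool entry as sent in a `tools` list.
        return {
            "type": self.tool_type,
            "name": "computer",
            "display_width_px": self.display_width_px,
            "display_height_px": self.display_height_px,
        }


def encode_screenshot_output(png_bytes: bytes) -> str:
    """JSON-encode an image block so it fits in a string-typed output field."""
    block = {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": "image/png",
            "data": base64.b64encode(png_bytes).decode("ascii"),
        },
    }
    return json.dumps([block])
```

Keeping the output a plain string means no extra `content` field is needed on the output object; the provider plugin can decode the JSON when it builds the request.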
…CEF keystroke, execute validates args
…eclaim
- type_text: use KEY_NAME_TO_VK for correct VK codes, CHAR-only for
shifted punctuation to avoid collisions (e.g. ord('!')=33=VK_PRIOR)
- add BrowserSession.reclaim_agent_focus() that broadcasts to participants
- use it in BrowserAgent for both initial start and interrupt recovery
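The collision mentioned in this commit can be reproduced in a few lines. The sketch below is illustrative only: `naive_vk` and `collisions` are hypothetical helpers, and the table lists just a handful of Win32 virtual-key codes that share values with ASCII punctuation.

```python
# A few Win32 virtual-key codes whose values overlap ASCII punctuation.
NAVIGATION_VK = {
    0x21: "VK_PRIOR",  # Page Up, same value as ord('!')
    0x22: "VK_NEXT",   # Page Down, same value as ord('"')
    0x23: "VK_END",    # same value as ord('#')
    0x24: "VK_HOME",   # same value as ord('$')
    0x25: "VK_LEFT",   # same value as ord('%')
}


def naive_vk(ch: str) -> int:
    """The problematic approach: reuse the character code as a VK code."""
    return ord(ch)


def collisions(text: str) -> list:
    """Return (char, colliding VK name) pairs for characters that would misfire."""
    return [
        (ch, NAVIGATION_VK[naive_vk(ch)])
        for ch in text
        if naive_vk(ch) in NAVIGATION_VK
    ]
```

Typing "hi!" through the naive path would send a Page Up key-down for the exclamation mark, which is why the commit sends shifted punctuation as CHAR-only events.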
… + shift
Add SHIFTED_CHAR_TO_VK mapping (US layout) so that shifted punctuation such as !@#$% sends RAWKEYDOWN with the correct base-key VK code and MOD_SHIFT, matching what Chrome actually dispatches. Uppercase letters also include MOD_SHIFT. Unknown characters (Unicode) fall back to CHAR-only.
…rection
- CHAR events now pass the character code (e.g. 97 for 'a') as windows_key_code instead of the VK code (65), matching Chrome behavior
- Revert horizontal scroll: positive deltaX = scroll left in CEF
- Wrap shifted characters with explicit Shift RAWKEYDOWN/KEYUP events
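The keystroke rules described in these commits can be sketched as a function that expands one character into an event list. This is an assumption-laden illustration, not the plugin's code: `KeyEvent`, `events_for_char`, the `MOD_SHIFT` bit value, and the five-entry mapping are all made up for the example.

```python
from typing import List, NamedTuple

MOD_SHIFT = 1 << 1  # assumed flag value, for illustration only
VK_SHIFT = 0x10

# US-layout subset: shifted character -> VK code of the unshifted base key.
SHIFTED_CHAR_TO_VK = {"!": 0x31, "@": 0x32, "#": 0x33, "$": 0x34, "%": 0x35}


class KeyEvent(NamedTuple):
    kind: str  # "RAWKEYDOWN", "CHAR", or "KEYUP"
    windows_key_code: int
    modifiers: int


def events_for_char(ch: str) -> List[KeyEvent]:
    code = ord(ch)
    if ch in SHIFTED_CHAR_TO_VK:
        vk, shifted = SHIFTED_CHAR_TO_VK[ch], True
    elif ch.isascii() and ch.isalpha():
        vk, shifted = ord(ch.upper()), ch.isupper()
    elif ch.isascii() and ch.isdigit():
        vk, shifted = code, False
    else:
        # Unknown characters fall back to a single CHAR event.
        return [KeyEvent("CHAR", code, 0)]

    mods = MOD_SHIFT if shifted else 0
    events = []
    if shifted:
        events.append(KeyEvent("RAWKEYDOWN", VK_SHIFT, MOD_SHIFT))
    events.append(KeyEvent("RAWKEYDOWN", vk, mods))
    # The CHAR event carries the character code, not the VK code.
    events.append(KeyEvent("CHAR", code, mods))
    events.append(KeyEvent("KEYUP", vk, mods))
    if shifted:
        events.append(KeyEvent("KEYUP", VK_SHIFT, 0))
    return events
```

For 'a' this yields a key-down with VK 65, a CHAR with code 97, and a key-up with VK 65; for '!' the same triple is built around the '1' key and bracketed by Shift down and up.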
elif direction == "left":
    delta_x = pixels
elif direction == "right":
    delta_x = -pixels
🔴 Horizontal scroll directions are inverted (left scrolls right, right scrolls left)
The scroll method in PageActions assigns the wrong sign to delta_x for horizontal scrolling, causing "left" to scroll right and "right" to scroll left.
Root Cause
In CEF's SendMouseWheelEvent, positive deltaX scrolls right and negative deltaX scrolls left. The vertical scroll directions are implemented correctly: "down" → delta_y = -pixels (negative = down in CEF) and "up" → delta_y = pixels (positive = up in CEF).
However, the horizontal directions are reversed:
- "left" → delta_x = pixels (positive) → actually scrolls right
- "right" → delta_x = -pixels (negative) → actually scrolls left
They should be:
- "left" → delta_x = -pixels (negative) → scrolls left
- "right" → delta_x = pixels (positive) → scrolls right
Impact: When the Anthropic computer-use model requests a horizontal scroll, the browser will scroll in the opposite direction, causing incorrect behavior for any task requiring horizontal scrolling.
- elif direction == "left":
-     delta_x = pixels
- elif direction == "right":
-     delta_x = -pixels
+ elif direction == "left":
+     delta_x = -pixels
+ elif direction == "right":
+     delta_x = pixels
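This thread contains two opposite claims about the sign of deltaX: the review above says positive scrolls right, while the "Revert horizontal scroll" commit says positive scrolls left in CEF. One way to keep that disagreement from spreading through the code is to isolate the convention in a single flag. The function below is a hypothetical sketch; `scroll_deltas` and `positive_x_scrolls_left` are names invented here, and the default follows the revert commit rather than any verified CEF behavior.

```python
from typing import Tuple


def scroll_deltas(
    direction: str,
    pixels: int,
    positive_x_scrolls_left: bool = True,
) -> Tuple[int, int]:
    """Map a direction name to (delta_x, delta_y) wheel deltas."""
    # Vertical convention as stated in the review: negative = down, positive = up.
    if direction == "down":
        return (0, -pixels)
    if direction == "up":
        return (0, pixels)

    # Horizontal convention is the disputed part, so it lives in one place.
    left_sign = 1 if positive_x_scrolls_left else -1
    if direction == "left":
        return (left_sign * pixels, 0)
    if direction == "right":
        return (-left_sign * pixels, 0)

    raise ValueError(f"unknown scroll direction: {direction!r}")
```

Flipping the flag reproduces the suggested change without touching the branches, which makes the convention easy to verify against a real page and correct in one line.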
Broadcast agent cursor position on browser-agent-cursor data channel when executing computer tool actions with coordinates.
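A minimal sketch of what building such a broadcast message might look like. Only the topic name comes from the commit message; the payload shape (`x`/`y` keys in JSON) and the helper `cursor_payload` are assumptions, and the action dict mirrors the `coordinate` field that computer-use actions carry.

```python
import json
from typing import Optional

CURSOR_TOPIC = "browser-agent-cursor"  # topic name from the commit message


def cursor_payload(action: dict) -> Optional[bytes]:
    """Return an encoded cursor update, or None if the action has no coordinates."""
    coordinate = action.get("coordinate")
    if not coordinate or len(coordinate) != 2:
        # Actions such as "screenshot" or "key" carry no position.
        return None
    x, y = coordinate
    # Assumed wire format: a small JSON object with integer pixel coordinates.
    return json.dumps({"x": int(x), "y": int(y)}).encode("utf-8")
```

The resulting bytes would then be handed, together with `CURSOR_TOPIC`, to whatever call publishes data to room participants; actions that return `None` are simply not broadcast.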