
feat: agent workspace isolation — cooperative input, Agent Space, PiP monitor, Windows VDD #29

@richard-devbot

Context

The cursorless UI automation PR eliminates cursor/focus conflicts for accessibility-enabled elements. This issue tracks the next layer of agent/user coexistence — workspace-level isolation so the agent and user can genuinely work in parallel.

Backlog items

1. Linux AT-SPI support

Add pyatspi (the Python bindings for AT-SPI, the D-Bus accessibility API) as an optional dependency. Implement cursorless click/type for GNOME/GTK apps on Linux. On Wayland, input injection needs ydotool.
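A minimal sketch of how the optional-dependency check and the Wayland case might be wired together. The function names and the returned dictionary keys are hypothetical, not part of Operator-Use. The sketch only detects what is available and does not call pyatspi or ydotool itself.

```python
import importlib.util
import os
import shutil


def detect_session_type(env=None):
    """Return 'wayland', 'x11', or 'unknown' from standard session variables."""
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session in ("wayland", "x11"):
        return session
    # Fall back to the display variables when XDG_SESSION_TYPE is unset.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


def pick_linux_backends(env=None, has_pyatspi=None, has_ydotool=None):
    """Choose backends (hypothetical helper; the key names are assumptions)."""
    if has_pyatspi is None:
        # pyatspi is optional, so probe for it instead of importing it.
        has_pyatspi = importlib.util.find_spec("pyatspi") is not None
    if has_ydotool is None:
        has_ydotool = shutil.which("ydotool") is not None

    session = detect_session_type(env)
    cursorless = "atspi" if has_pyatspi else None
    # The issue only names ydotool for Wayland; other sessions are left unset.
    injection = "ydotool" if session == "wayland" and has_ydotool else None
    return {"session": session, "cursorless": cursorless, "injection": injection}
```

Passing `env`, `has_pyatspi`, and `has_ydotool` explicitly keeps the selection logic testable on machines without a Linux desktop session.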

2. Cooperative input locking

Detect user mouse/keyboard activity via CGEventTap (macOS) or a low-level WH_MOUSE_LL/WH_KEYBOARD_LL hook (Windows). Queue agent actions while the user is actively typing or clicking. Resume on idle. Prevents conflicts when the cursorless path falls back to coordinate events.
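The queue-then-resume behaviour could look roughly like the sketch below. It is platform-neutral: the class name, the `idle_seconds` threshold, and the `note_user_activity` entry point are assumptions. The real CGEventTap or WH_MOUSE_LL/WH_KEYBOARD_LL callback would simply call `note_user_activity()` on each user event.

```python
import time
from collections import deque


class CooperativeInputGate:
    """Hold agent actions while the user is active; run them once idle."""

    def __init__(self, idle_seconds=0.75, clock=time.monotonic):
        self._idle_seconds = idle_seconds  # assumed threshold, tune per platform
        self._clock = clock                # injectable for tests
        self._last_user_input = None
        self._pending = deque()

    def note_user_activity(self):
        """Call from the OS-level input hook on every user mouse/key event."""
        self._last_user_input = self._clock()

    def user_is_active(self):
        if self._last_user_input is None:
            return False
        return (self._clock() - self._last_user_input) < self._idle_seconds

    def submit(self, action):
        """Queue a zero-argument callable and run whatever is allowed now."""
        self._pending.append(action)
        return self.drain()

    def drain(self):
        """Run queued actions in order, stopping if the user becomes active."""
        results = []
        while self._pending and not self.user_is_active():
            action = self._pending.popleft()
            results.append(action())
        return results
```

An agent loop would call `drain()` periodically, for example from a timer, so queued actions resume shortly after the user goes idle. The idle check runs before every action, so a user keystroke that arrives mid-drain pauses the remaining queue.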

3. Agent Space (macOS Spaces)

Subscribe to NSWorkspaceActiveSpaceDidChangeNotification. When the agent opens an app, force it to Space 2 using CGSMoveWindowsToManagedSpace (private API) or AppleScript. User stays on Space 1 and sees zero agent windows.

4. Picture-in-picture agent monitor

A floating PySide6 or Electron window on the user's screen showing a live view of the agent's active window. User can observe without switching Space.
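Whichever toolkit is chosen, the monitor needs a placement rule for the thumbnail. The sketch below shows one possible rule: scale the agent window to fit a fraction of the screen while preserving aspect ratio, then pin it to the bottom-right corner. The function name, `max_fraction`, and `margin` are hypothetical defaults, not values from the issue.

```python
def pip_rect(source_w, source_h, screen_w, screen_h, max_fraction=0.25, margin=16):
    """Return (x, y, w, h) for a bottom-right thumbnail of the agent window."""
    if min(source_w, source_h, screen_w, screen_h) <= 0:
        raise ValueError("all dimensions must be positive")

    # Largest box the thumbnail may occupy on the user's screen.
    max_w = screen_w * max_fraction
    max_h = screen_h * max_fraction

    # Preserve aspect ratio and never upscale a small source window.
    scale = min(max_w / source_w, max_h / source_h, 1.0)
    w = max(1, round(source_w * scale))
    h = max(1, round(source_h * scale))

    # Anchor to the bottom-right corner with a fixed margin.
    x = screen_w - w - margin
    y = screen_h - h - margin
    return x, y, w, h
```

The resulting rectangle can be handed to the window's geometry setter in either PySide6 or Electron. Capturing the live frames is a separate concern that this sketch does not cover.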

5. Windows virtual display confinement (WindowsPC-MCP integration)

Port the Parsec Virtual Display Driver layer from https://github.com/ShikeChen01/WindowsPC-MCP into Operator-Use as an optional Windows-only plugin. Creates a dedicated virtual monitor for the agent — spatial isolation on top of the API-level isolation already shipped.

Reference
