
Linux Support

Claude edited this page Jun 30, 2026 · 1 revision

Linux Support (native X11)

The mouse, keyboard, screenshot, process, recording, window-management, and background/unfocused targeting tools all work natively on Linux. On Linux, the hwnd parameter is an X11 window ID.

Prerequisites

You need an X11 session (or XWayland) and the helper packages below.

Debian/Ubuntu:

sudo apt install xdotool wmctrl x11-utils imagemagick xvfb

Fedora:

sudo dnf install xdotool wmctrl xorg-x11-utils ImageMagick xorg-x11-server-Xvfb

pyautogui (used for foreground mouse/keyboard input) also needs python3-xlib and a screenshot backend such as scrot.

Check what the server can see:

linux_status {}

Returns {display, session_type, xdotool, wmctrl, xvfb, imagemagick_import, scrot}.
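For troubleshooting outside the server, a similar report can be approximated by hand. The sketch below is not the server's implementation: the `probe_linux_status` name, the mapping from report keys to executables, and the boolean values are all assumptions made for illustration.

```python
import os
import shutil

# Assumed mapping from report key to the executable looked up on PATH.
HELPERS = {
    "xdotool": "xdotool",
    "wmctrl": "wmctrl",
    "xvfb": "Xvfb",
    "imagemagick_import": "import",
    "scrot": "scrot",
}


def probe_linux_status(env=None):
    """Return a dict shaped like the linux_status result (values are assumptions)."""
    env = os.environ if env is None else env
    status = {
        "display": env.get("DISPLAY"),                 # e.g. ":0" or ":99"
        "session_type": env.get("XDG_SESSION_TYPE"),   # e.g. "x11" or "wayland"
    }
    search_path = env.get("PATH")
    for key, executable in HELPERS.items():
        # True when the helper is found on the given PATH.
        status[key] = shutil.which(executable, path=search_path) is not None
    return status
```

A `False` entry points at a package from the install commands above that is still missing.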

Backends

Capability | Tool used
Window list/move/resize/action | xdotool (+ wmctrl for maximize)
Show/hide window | xdotool windowmap/windowunmap
Background type / keys | xdotool type/key --window (XSendEvent)
Background click | xdotool mousemove --window + click
Per-window capture | ImageMagick import -window (fallback: mss region)
Headless display | Xvfb
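To make the table concrete, here is a rough sketch of the command lines these backends correspond to. The helper function names are hypothetical, and the exact arguments the server passes are not documented here; only the xdotool and ImageMagick subcommands and flags named in the table are used.

```python
def type_argv(window_id, text):
    # Background typing: xdotool type --window <id> <text>
    return ["xdotool", "type", "--window", str(window_id), text]


def key_argv(window_id, key):
    # Background key press: xdotool key --window <id> <keysym>
    return ["xdotool", "key", "--window", str(window_id), key]


def click_argv(window_id, x, y, button=1):
    # Window-relative move, then a click delivered to the same window.
    wid = str(window_id)
    return ["xdotool", "mousemove", "--window", wid, str(x), str(y),
            "click", "--window", wid, str(button)]


def capture_argv(window_id, out_path):
    # Per-window capture with ImageMagick; the id is written in hex here,
    # the form X11 tools usually print (2097164 is 0x20000c).
    return ["import", "-window", hex(window_id), out_path]
```

Each list can be handed to `subprocess.run` as is; building argument lists rather than shell strings avoids quoting problems when the typed text contains spaces.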

Headless-with-GUI (Xvfb)

Start a virtual display:

create_virtual_display { "display": 99, "width": 1280, "height": 800 }
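For orientation, a call like this corresponds roughly to starting Xvfb by hand. The sketch below is an assumption about the equivalent command line, not the tool's actual code; the `xvfb_argv` helper and the 24-bit colour depth are illustrative choices.

```python
def xvfb_argv(display, width, height, depth=24):
    """Build an Xvfb command line for one screen of the given geometry."""
    # Xvfb :<display> -screen 0 <width>x<height>x<depth>
    return ["Xvfb", f":{display}", "-screen", "0", f"{width}x{height}x{depth}"]
```

With the values from the call above, `xvfb_argv(99, 1280, 800)` yields the arguments for `Xvfb :99 -screen 0 1280x800x24`.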

Launch a GUI app onto it:

launch_on_virtual_display { "display": 99, "command": "xterm -e bash" }

List its windows:

list_virtual_display_windows { "display": 99 }

Drive a window on that display — pass display so input is routed there:

type_text { "hwnd": 2097164, "display": 99, "text": "echo hi" }
win_send_keys { "hwnd": 2097164, "display": 99, "keys": ["enter"] }
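X11 clients pick their server from the DISPLAY environment variable, so routing input to a virtual display most likely comes down to setting it for the helper process. A minimal sketch, assuming the hypothetical helper name `env_for_display`:

```python
import os


def env_for_display(display, base_env=None):
    """Return a copy of the environment with DISPLAY pointing at :<display>."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = f":{display}"
    return env


# Example (requires xdotool and a running display :99):
#   import subprocess
#   subprocess.run(["xdotool", "type", "--window", "2097164", "echo hi"],
#                  env=env_for_display(99), check=True)
```

Copying the environment instead of mutating `os.environ` keeps calls against different displays from interfering with each other.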

Capture one window, or the whole display:

screenshot { "hwnd": 2097164, "display": 99 }
screenshot_virtual_display { "display": 99 }

Stop it:

stop_virtual_display { "display": 99 }

Caveats

  • Background typing uses XSendEvent — most GTK/Qt apps accept it; xterm ignores it unless launched with -xrm 'XTerm.vt100.allowSendEvents: true'. Focus the window first for strict apps.
  • Pure Wayland windows (not XWayland) are not controllable — no portable API exists. linux_status reports session_type.
  • window_action maximize requires wmctrl.
  • To drive a window on a virtual display, always pass display. Without it the default :0 is used and the window won't be found.
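As an illustration of the xterm caveat, the sketch below builds a launch_on_virtual_display payload whose command enables allowSendEvents. It assumes the command string is split with shell-style quoting, which this page does not confirm; the `xterm_launch_payload` helper is hypothetical.

```python
import json
import shlex


def xterm_launch_payload(display, shell="bash"):
    """Build a payload that starts xterm with synthetic events allowed."""
    argv = ["xterm", "-xrm", "XTerm.vt100.allowSendEvents: true", "-e", shell]
    # shlex.join quotes the resource string so it survives shell-style splitting.
    return {"display": display, "command": shlex.join(argv)}


payload = xterm_launch_payload(99)
print(json.dumps(payload))
# command: xterm -xrm 'XTerm.vt100.allowSendEvents: true' -e bash
```

Background typing into a window launched this way should then be accepted without focusing it first.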
