GUI AI Bridge (recommended command name: guib) is a local GUI automation CLI:
- get: read semantic GUI structure (accessibility tree)
- do: execute click, type, hotkey, swipe, and drag actions
- do instruction: convert natural-language instructions into executable action steps
The current version is Windows-focused.
This is the easiest path: no Python required. Download the EXE and add it to PATH.
- OS: Windows 10/11
- Recommended: run in a normal desktop session (avoid restricted remote sessions)
- Recommended: run target apps and guib at the same privilege level (both normal, or both admin)
- Open the repository page
- Go to Releases
- Download one of the following assets:
  - guib.exe (recommended, shorter command)
  - gui-ai-bridge.exe (full name)
If no release asset is available, use Section 4 (source install and local packaging).
Recommended location: C:\Tools\guib
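New-Item -ItemType Directory -Path C:\Tools\guib -Force | Out-Null
Copy-Item "$HOME\Downloads\guib.exe" "C:\Tools\guib\guib.exe" -Force
PowerShell (current user, recommended):
$target = "C:\Tools\guib"
# Read the persistent user-level PATH, not the merged PATH of this session
$current = [Environment]::GetEnvironmentVariable("Path", "User")
if ([string]::IsNullOrWhiteSpace($current)) {
    [Environment]::SetEnvironmentVariable("Path", $target, "User")
} elseif ($current -notlike "*$target*") {
    # Append only when the target is not already present
    [Environment]::SetEnvironmentVariable("Path", "$current;$target", "User")
}
# Also update the current session so guib works without reopening the terminal
if ($env:Path -notlike "*$target*") { $env:Path = "$env:Path;$target" }
CMD (current user, persistent):
setx PATH "%PATH%;C:\Tools\guib"
Note: setx does not refresh the current terminal session; open a new terminal window for the change to take effect. Because %PATH% expands to the combined system and user PATH, and setx truncates values longer than 1024 characters, prefer the PowerShell method when PATH is already long.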
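The append-if-missing logic of the PowerShell snippet above can be sketched in Python as a pure function. This is an illustration only, not part of guib: the name `append_to_path` is hypothetical, and unlike the wildcard match in the snippet (`-notlike "*$target*"`), it compares whole PATH entries, ignoring case and a trailing backslash.

```python
from typing import Optional


def append_to_path(current: Optional[str], target: str, sep: str = ";") -> str:
    """Return a PATH string that contains `target`, adding it only if missing.

    Hypothetical helper for illustration; it only computes the new value and
    does not write to the registry or the environment.
    """
    if current is None or not current.strip():
        return target

    # Compare whole entries, case-insensitively, ignoring a trailing backslash.
    wanted = target.rstrip("\\").lower()
    entries = [entry.strip() for entry in current.split(sep) if entry.strip()]
    if any(entry.rstrip("\\").lower() == wanted for entry in entries):
        return current

    # Avoid producing ";;" when the existing value already ends with a separator.
    return current.rstrip(sep) + sep + target
```

Comparing whole entries avoids a false positive such as `C:\Tools\guib2` being treated as if `C:\Tools\guib` were already present.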
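where guib
guib env check --json
guib get scan
If where guib shows C:\Tools\guib\guib.exe, PATH is configured correctly. In PowerShell, where is an alias for Where-Object, so run where.exe guib (or Get-Command guib) there instead.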
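The same PATH check can be done from a script. The sketch below uses only the standard-library `shutil.which`; the function name `locate_cli` and its `search_path` parameter are illustrative and not part of guib.

```python
import shutil
from typing import Optional


def locate_cli(name: str = "guib", search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of the first matching executable, or None.

    `search_path` defaults to the PATH of the current process.
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    found = locate_cli()
    print(found if found else "guib was not found on PATH; re-check the PATH step")
```

If the printed path is not the copy you installed, an older guib earlier on PATH is probably shadowing it.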
guib get scan
guib get screen --compact-tree --a11y-backend auto
guib get screen --json
get screen output includes:
- Snapshot Summary: node-level summary
- Backend Quality: backend quality scores
- Semantic Digest: page semantic digest
- Actionable Targets Top: high-priority actionable targets
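For scripted use, the `--json` form shown above can be captured and parsed. The sketch below assumes that `guib get screen --json` writes a single JSON document to stdout and exits with code 0 on success; neither detail is confirmed here, and the JSON schema is not documented in this README, so the result is returned as a generic Python object. The names `run_cli` and `capture_screen` are hypothetical.

```python
import json
import subprocess
from typing import Any, Callable, List

# Exactly the command shown in this README.
SCREEN_JSON_ARGV: List[str] = ["guib", "get", "screen", "--json"]


def run_cli(argv: List[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout


def capture_screen(runner: Callable[[List[str]], str] = run_cli) -> Any:
    """Parse the JSON snapshot (assumption: stdout holds one JSON document)."""
    return json.loads(runner(SCREEN_JSON_ARGV))


# Usage (requires guib on PATH):
#     snapshot = capture_screen()
#     print(type(snapshot))
```

Passing the runner as a parameter keeps the parsing logic testable on machines where guib is not installed.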
guib do click "Create repository" --window-target "edge" --a11y-backend auto
guib do type "hello" --window-target "edge"
guib do hotkey "ctrl+l" --window-target "edge"
guib do swipe down --window-target "edge" --distance 500
guib do drag 240 300 240 700 --window-target "edge"
guib do instruction "click search box then type \"Python GUI\" and press enter" --window-target "msedge.exe"
guib do instruction "press ctrl+l" --window-target "msedge.exe"
Use this section when no release binary is available or when you need to build locally.
- Python 3.10+
- Windows 10/11
cd d:\python_code\PythonApplication_guibridge
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install -r requirements-windows.txt
If pip reports an externally-managed-environment error, install into the .venv virtual environment instead of modifying the system Python.
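Before installing, it can help to confirm that the interpreter meets the Python 3.10+ requirement and that the virtual environment is active. The following is a small standalone sketch with hypothetical helper names; it relies on the fact that `sys.prefix` differs from `sys.base_prefix` inside a venv.

```python
import sys
from typing import Sequence, Tuple

MINIMUM_PYTHON: Tuple[int, int] = (3, 10)  # requirement stated in this section


def meets_minimum(version: Sequence[int], minimum: Tuple[int, int] = MINIMUM_PYTHON) -> bool:
    """True when the (major, minor) part of `version` is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


def in_virtualenv(prefix: str, base_prefix: str) -> bool:
    """Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix."""
    return prefix != base_prefix


if __name__ == "__main__":
    print("Python version OK:", meets_minimum(sys.version_info))
    print("Running inside a venv:", in_virtualenv(sys.prefix, sys.base_prefix))
```

If the second line prints False after activation, the `python` on PATH is probably not the one inside `.venv`.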
pytest -q
python -m PyInstaller --noconfirm --clean guib.spec
Output binary:
dist\guib.exe
Then return to Sections 2.3 and 2.4 to copy dist\guib.exe to a fixed directory and add that directory to PATH.
- Command not found (guib is not recognized)
  - Run where guib
  - Confirm C:\Tools\guib is in PATH
  - Restart the terminal
- E_PERMISSION
  - Current build is Windows-only
  - Check for a privilege mismatch between the target app and guib (for example, the target app started as administrator while guib runs with normal privileges)
- E_NOT_FOUND
  - Run guib get screen --compact-tree --a11y-backend auto first
  - Retry backends in order: cdp -> ia2 -> msaa -> uia -> hwnd
- Too little semantic output
  - Switch backend and capture again
  - Use --full-tree when needed
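The backend retry order listed for E_NOT_FOUND can be automated along the following lines. This is a sketch under stated assumptions: that `--a11y-backend` accepts the names `cdp`, `ia2`, `msaa`, `uia`, and `hwnd` as values (the README only shows `auto`), and that guib exits with a non-zero code on failure. The helper names are hypothetical.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Order taken from the troubleshooting list above.
BACKEND_ORDER = ("cdp", "ia2", "msaa", "uia", "hwnd")


def click_argv(label: str, window: str, backend: str) -> List[str]:
    """Build a `guib do click` command line for one backend.

    Assumption: backend names are valid values for --a11y-backend.
    """
    return ["guib", "do", "click", label,
            "--window-target", window, "--a11y-backend", backend]


def first_working_backend(attempt: Callable[[str], bool],
                          order: Sequence[str] = BACKEND_ORDER) -> Optional[str]:
    """Try each backend in order; return the first that succeeds, else None."""
    for backend in order:
        if attempt(backend):
            return backend
    return None


def make_click_attempt(label: str, window: str) -> Callable[[str], bool]:
    """Return an attempt function that treats exit code 0 as success (assumption)."""
    def attempt(backend: str) -> bool:
        result = subprocess.run(click_argv(label, window, backend),
                                capture_output=True, text=True)
        return result.returncode == 0
    return attempt


# Usage (requires guib on PATH):
#     backend = first_working_backend(make_click_attempt("Create repository", "edge"))
```

Stopping at the first success keeps the number of repeated UI actions low, which matters because each attempt may have side effects in the target app.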
- For complex pages, use the loop: capture -> small action -> capture again.
- For browser and Electron apps, do not keep retrying a single backend that fails; switch backends instead.
- For a detailed execution playbook, see guide.md.
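The capture -> small action -> capture again loop from the tips above might be scripted as shown below. It is a sketch, not guib's own API: `capture` and `act` are caller-supplied functions (for example, thin wrappers around `guib get screen` and `guib do ...`), and treating two equal snapshots as "nothing changed" is an assumption.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Example steps built only from commands shown in this README.
EXAMPLE_STEPS: List[List[str]] = [
    ["guib", "do", "hotkey", "ctrl+l", "--window-target", "edge"],
    ["guib", "do", "type", "hello", "--window-target", "edge"],
]


def run_steps(steps: Sequence[List[str]],
              capture: Callable[[], Any],
              act: Callable[[List[str]], None]) -> List[Tuple[List[str], bool]]:
    """Capture, perform one small action, then capture again.

    Returns (step, changed) pairs, where `changed` is True when the snapshot
    taken after the action differs from the one taken before it.
    """
    results = []
    before = capture()
    for step in steps:
        act(step)
        after = capture()
        results.append((step, after != before))
        before = after  # the next step is judged against the latest screen
    return results
```

A step whose `changed` flag is False is a natural point to stop and re-read the screen with another backend instead of continuing blindly.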
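- Snapshot Summary: brief node statistics (total nodes, named nodes, actionable nodes, input nodes) plus a short hash, used to quickly judge whether the current capture is meaningful and whether it is identical to the previous capture. It is useful for quick deduplication and replay verification.
- Semantic Digest: semantic lines extracted from readable nodes (for example buttons, labels, and row items) that let an AI model quickly grasp the key information on a page. It does not replace the full tree; it is meant for prompting and prioritization.
Usage tips:
- When Snapshot Summary shows few named nodes or a low actionable_nodes count, first switch backends or use --full-tree; consider maximizing the window only if those retries do not help.
- The number of Semantic Digest lines helps gauge page complexity: if there are very few lines, the page's semantic signal is weak, and manual or coordinate-based strategies are needed as a supplement.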
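The first usage tip can be expressed as a small decision function. Everything specific in this sketch is an assumption made for illustration: the summary is modeled as a dict, `named_nodes` is a hypothetical key (only `actionable_nodes` is named above), and the thresholds are arbitrary.

```python
from typing import Mapping


def suggest_next(summary: Mapping[str, int],
                 retries_exhausted: bool,
                 min_named: int = 5,
                 min_actionable: int = 1) -> str:
    """Suggest the next step from Snapshot Summary counts.

    `min_named` and `min_actionable` are illustrative thresholds, not values
    defined by guib.
    """
    named = summary.get("named_nodes", 0)            # hypothetical key name
    actionable = summary.get("actionable_nodes", 0)  # key name mentioned in the tips
    if named >= min_named and actionable >= min_actionable:
        return "proceed"
    if not retries_exhausted:
        return "switch backend or use --full-tree"
    return "maximize window"
```

For example, `suggest_next({"named_nodes": 2, "actionable_nodes": 0}, retries_exhausted=False)` returns `"switch backend or use --full-tree"`.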
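- E_PERMISSION: run whoami /priv to check whether the current user's privileges match those of the target app; if they differ, rerun guib at the same privilege level (both administrator or both normal user).
- E_NOT_FOUND: first run guib get screen --full-tree --a11y-backend auto to collect more debugging information, then try each backend in order (cdp -> ia2 -> msaa -> uia -> hwnd). Use --json to export the full tree for offline analysis.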