Merged
25 commits
c662276
Sync docs to AudioCaptureConfig API after the SonarCloud refactor
JE-Chen Apr 26, 2026
1cadfa0
Redesign Remote Desktop tab around a connection-card layout
JE-Chen Apr 26, 2026
d3717a1
Split remote_desktop_tab.py into a gui/remote_desktop/ subpackage
JE-Chen Apr 26, 2026
141d1fe
Fix FakeState.mouse_actions Tuple annotation on Python 3.10
JE-Chen Apr 27, 2026
bee283f
Add operations layer + USB Phase 2 chain (rounds 22-47)
JE-Chen Apr 27, 2026
7fd79bb
Document operations layer + USB passthrough chain in /docs
JE-Chen Apr 27, 2026
3696152
Update READMEs (features + mermaid + directory tree) for rounds 22-47
JE-Chen Apr 27, 2026
fcd61b4
Stop je_auto_control_dev shadowing the editable install in CI
JE-Chen Apr 27, 2026
5916e50
Add Python 3.13 + 3.14 to CI matrices
JE-Chen Apr 27, 2026
fbd3b25
Fix two CI flakes surfaced by GitHub Windows runners
JE-Chen Apr 27, 2026
d4083b5
Bump WS auth timeouts to 60 s for slow CI runners
JE-Chen Apr 27, 2026
0929195
Mark WS handshake tests flaky with reruns=2 for slow CI runners
JE-Chen Apr 27, 2026
9b9b1ff
Cap pre-auth handshake recv at 5 s so bad-protocol clients fail fast
JE-Chen Apr 27, 2026
9a0924b
Revert "Cap pre-auth handshake recv at 5 s so bad-protocol clients fa…
JE-Chen Apr 27, 2026
fe957f1
Fix WS handshake over-reading and dropping the next protocol frame
JE-Chen Apr 27, 2026
cdd1f65
Address SonarCloud + Codacy BLOCKER and bug findings on PR #182
JE-Chen Apr 27, 2026
b447af6
Sweep SonarCloud minor lints on PR #182
JE-Chen Apr 27, 2026
e652904
Sweep JS / HTML smells in web_viewer + swagger + mic-worklet
JE-Chen Apr 27, 2026
c2be066
Triage SonarCloud security hotspots on PR #182
JE-Chen Apr 27, 2026
81b841f
Reduce cognitive complexity in WebRTC + REST hot paths (S3776)
JE-Chen Apr 27, 2026
38d238d
Tighten Sonar/Codacy markers and clear remaining PR #182 findings
JE-Chen Apr 27, 2026
0c60c8f
Pop the remote desktop into its own window (AnyDesk-style)
JE-Chen Apr 27, 2026
f444507
Hoist _CollapsibleSection import to fix ruff F821
JE-Chen Apr 27, 2026
03c9dfe
Make Remote Desktop tabs responsive at any window size
JE-Chen Apr 27, 2026
51fc521
Expose remote desktop over MCP and document recent UX changes
JE-Chen Apr 27, 2026
2 changes: 1 addition & 1 deletion .github/workflows/dev.yml
@@ -17,7 +17,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: [ "3.10", "3.11", "3.12" ]
python-version: [ "3.10", "3.11", "3.12", "3.13", "3.14" ]

steps:
- uses: actions/checkout@v4
87 changes: 87 additions & 0 deletions .github/workflows/quality.yml
@@ -0,0 +1,87 @@
name: AutoControl Code Quality

# Static analysis (ruff, bandit) plus the headless pytest suite added in
# rounds 22-30. Decoupled from the existing dev/stable workflows, which
# run legacy standalone test scripts and exist for hardware integration
# coverage on Windows runners.

on:
push:
branches: [ "dev", "main", "stable" ]
pull_request:
branches: [ "dev", "main", "stable" ]
workflow_dispatch:

permissions:
contents: read

jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.12"
cache: "pip"

- name: Install ruff
run: |
python -m pip install --upgrade pip
pip install ruff

- name: Run ruff
run: ruff check je_auto_control/

security:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.12"
cache: "pip"

- name: Install bandit
run: |
python -m pip install --upgrade pip
pip install bandit

- name: Run bandit (recursive, skip tests + i18n dicts)
run: bandit -r je_auto_control/ -c pyproject.toml

pytest-headless:
runs-on: windows-2022
strategy:
fail-fast: false
matrix:
python-version: [ "3.10", "3.11", "3.12", "3.13", "3.14" ]
steps:
- uses: actions/checkout@v4

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: "pip"

- name: Install dependencies
run: |
python -m pip install --upgrade pip wheel
# Install the editable package FIRST so its source dir is the
# one Python sees on subsequent imports. We deliberately
# avoid `pip install -r dev_requirements.txt` here because
# that file pulls in `je_auto_control_dev` (a separate PyPI
# package), which ships its own snapshot of `je_auto_control/`
# straight into site-packages and masks the editable install
# for any sub-package the snapshot doesn't include
# (admin, usb, remote_desktop, vision, …).
pip install -e .
pip install ruff==0.15.9 bandit==1.9.4 pytest==9.0.2 pytest-timeout==2.4.0 pytest-rerunfailures==15.1 PySide6==6.11.0

- name: Run headless pytest suite
run: pytest test/unit_test/headless/ -v --tb=short --timeout=120
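
The install-step comment above describes a site-packages snapshot masking the editable install. A minimal sketch of how that could be detected before running the suite, assuming the repository root is the expected import location; `package_origin` and `is_shadowed` are hypothetical helper names and are not part of this PR:

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_origin(name: str) -> Optional[Path]:
    """Return the directory a top-level package would be imported from.

    Uses find_spec, so the package itself is not imported.
    Returns None when the package cannot be found.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve().parent


def is_shadowed(name: str, expected_root: Path) -> bool:
    """True when `name` resolves somewhere other than under `expected_root`."""
    origin = package_origin(name)
    if origin is None:
        return True  # not importable at all, so the expected copy is not in use
    expected = expected_root.resolve()
    return expected != origin and expected not in origin.parents
```

In a CI step placed after `pip install -e .`, something like `is_shadowed("je_auto_control", Path.cwd())` could fail the job early if a snapshot in site-packages wins the import.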
2 changes: 1 addition & 1 deletion .github/workflows/stable.yml
@@ -21,7 +21,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: [ "3.10", "3.11", "3.12" ]
python-version: [ "3.10", "3.11", "3.12", "3.13", "3.14" ]

steps:
- uses: actions/checkout@v4
76 changes: 68 additions & 8 deletions README.md
@@ -63,7 +63,7 @@
- **OCR** — extract text from screen regions using Tesseract; wait for, click, or locate rendered text; regex search and full-region dump
- **LLM Action Planner** — translate a plain-language description into a validated `AC_*` action list using Claude
- **Runtime Variables & Control Flow** — `${var}` substitution at execution time, plus `AC_set_var` / `AC_inc_var` / `AC_if_var` / `AC_for_each` / `AC_loop` / `AC_retry` for data-driven scripts
- **Remote Desktop** — stream this machine's screen and accept remote input over a token-authenticated TCP protocol, *or* connect to another machine and view + control it (host + viewer GUIs included). Optional TLS (HTTPS-grade encryption), WebSocket transport (ws:// + wss:// for browser / firewall-friendly clients), persistent 9-digit Host ID, host→viewer audio streaming, bidirectional clipboard sync (text + image), and chunked file transfer (drag-drop + progress bar; arbitrary destination path; no size cap)
- **Remote Desktop** — stream this machine's screen and accept remote input over a token-authenticated TCP protocol, *or* connect to another machine and view + control it (host + viewer GUIs included). Optional TLS (HTTPS-grade encryption), WebSocket transport (ws:// + wss:// for browser / firewall-friendly clients), persistent 9-digit Host ID, host→viewer audio streaming, bidirectional clipboard sync (text + image), and chunked file transfer (drag-drop + progress bar; arbitrary destination path; no size cap). Plus folder sync (additive mirror — local deletions never propagate) and a self-hosted coturn TURN config bundle generator (turnserver.conf + systemd unit + docker-compose + README). **AnyDesk-style popout**: when the viewer authenticates, the live remote desktop opens in its own resizable top-level window so the control panel stays uncluttered. The Remote Desktop tabs are wrapped in `QScrollArea` so the panel stays usable on small windows and stretches edge-to-edge on 4K displays. Driveable headlessly via `je_auto_control` and over MCP through the new `ac_remote_*` tools
- **Clipboard** — read/write system clipboard text on Windows, macOS, and Linux
- **Screenshot & Screen Recording** — capture full screen or regions as images, record screen to video (AVI/MP4)
- **Action Recording & Playback** — record mouse/keyboard events and replay them
@@ -73,8 +73,8 @@
- **Event Triggers** — fire scripts when an image appears, a window opens, a pixel changes, or a file is modified
- **Run History** — SQLite-backed run log across scheduler / triggers / hotkeys / REST with auto error-screenshot artifacts
- **Report Generation** — export test records as HTML, JSON, or XML reports with success/failure status
- **MCP Server** — JSON-RPC 2.0 Model Context Protocol server (stdio + HTTP/SSE) so Claude Desktop / Claude Code / custom tool-use loops can drive AutoControl. ~90 tools, full protocol coverage (resources, prompts, sampling, roots, logging, progress, cancellation, elicitation), bearer-token auth + TLS, audit log, rate limit, plugin hot-reload, CI fake backend
- **Remote Automation** — TCP socket server **and** REST API server to receive automation commands
- **MCP Server** — JSON-RPC 2.0 Model Context Protocol server (stdio + HTTP/SSE) so Claude Desktop / Claude Code / custom tool-use loops can drive AutoControl. ~100 tools, full protocol coverage (resources, prompts, sampling, roots, logging, progress, cancellation, elicitation), bearer-token auth + TLS, audit log, rate limit, plugin hot-reload, CI fake backend. New in this release: `ac_remote_host_start` / `ac_remote_host_stop` / `ac_remote_host_status` / `ac_remote_viewer_connect` / `ac_remote_viewer_disconnect` / `ac_remote_viewer_status` / `ac_remote_viewer_send_input` wrap the same singleton remote-desktop registry the GUI uses, so a model can spin up a host, open a viewer to another machine, and forward mouse / keyboard / type / hotkey actions through the active session
- **Remote Automation** — TCP socket server **and** hardened REST API: bearer-token auth, per-IP rate limit + lockout, SQLite audit hook, Prometheus `/metrics`, OpenAPI-style endpoint table (`/health`, `/screen_size`, `/sessions`, `/screenshot`, `/execute`, `/audit/list`, `/audit/verify`, `/inspector/recent`, `/usb/devices`, `/diagnose`, ...), and a vanilla-JS browser dashboard at `/dashboard` (any phone with HTTP reach can monitor the host)
- **Plugin Loader** — drop `.py` files exposing `AC_*` callables into a directory and register them as executor commands at runtime
- **Shell Integration** — execute shell commands within automation workflows with async output capture
- **Callback Executor** — trigger automation functions with callback hooks for chaining operations
@@ -84,6 +84,15 @@
- **GUI Application** — built-in PySide6 graphical interface with live language switching (English / 繁體中文 / 简体中文 / 日本語)
- **CLI Runner** — `python -m je_auto_control.cli run|list-jobs|start-server|start-rest`
- **Cross-Platform** — unified API across Windows, macOS, and Linux (X11)
- **Multi-Host Admin Console** — register N AutoControl REST endpoints in one address book, poll them in parallel for health/sessions/jobs, broadcast actions to all of them. Persisted to `~/.je_auto_control/admin_hosts.json` (mode 0600 on POSIX). Bad-token hosts surface as unhealthy with the actual HTTP error
- **Tamper-Evident Audit Log** — SQLite events table with SHA-256 hash chain (`prev_hash` + `row_hash` per row); editing any past row breaks the chain. `verify_chain()` walks rows top-down and reports the first broken link. Legacy tables get backfilled at startup ("trust on first use")
- **WebRTC Packet Inspector** — process-global rolling window of `StatsSnapshot` samples (default 600 / ~10 min @ 1Hz) fed by the existing WebRTC stats pollers. Per-metric `last/min/max/avg/p95` for RTT, FPS, bitrate, packet loss, jitter
- **USB Device Enumeration** — read-only cross-platform device listing. Tries pyusb (libusb) first; falls back to platform-specific (Windows `Get-PnpDevice`, macOS `system_profiler`, Linux `/sys/bus/usb/devices`). Phase 2 (passthrough) intentionally deferred pending design review
- **System Diagnostics** — single-command "is everything OK?" probe across platform, optional deps, executor command count, audit chain, screenshot, mouse, disk space, REST registry. CLI exits 0 if all green / 1 otherwise; REST `/diagnose`; severity-tagged GUI tab
- **USB Hotplug Events** — polling-based hotplug watcher (`UsbHotplugWatcher`) with bounded ring buffer + sequence-numbered events; `GET /usb/events?since=N` lets late subscribers catch up. GUI auto-refresh toggle on the USB tab.
- **OpenAPI 3.1 + Swagger UI** — `GET /openapi.json` (auth-gated, generated from the live route table) + `GET /docs` (browser Swagger UI with bearer token bar). Drift test in CI catches new routes added without metadata.
- **Configuration Bundle** — single-file JSON export/import of user config (admin hosts, address book, trusted viewers, known hosts, host service, IDs). Atomic write with `<name>.bak.<timestamp>` backups; CLI `python -m je_auto_control.utils.config_bundle export|import`; `POST /config/{export,import}`; GUI buttons on the REST API tab.
- **USB Passthrough (experimental, opt-in)** — wire-level protocol over a WebRTC `usb` DataChannel (10 opcodes, CREDIT-based flow control, 16 KiB payload cap). Host-side `UsbPassthroughSession` end-to-end on the Linux libusb backend; Windows `WinUSB` backend with full ctypes wiring (hardware-unverified); macOS `IOKit` skeleton. Viewer-side blocking client (`UsbPassthroughClient` → `ClientHandle.control_transfer / bulk_transfer / interrupt_transfer`). Persistent ACL (`~/.je_auto_control/usb_acl.json`, default deny, mode 0600) with host-side prompt QDialog and tamper-evident audit-log integration. Default off — opt-in via `enable_usb_passthrough(True)` or `JE_AUTOCONTROL_USB_PASSTHROUGH=1`. Phase 2e external security review checklist included; default-on requires sign-off.
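
The Tamper-Evident Audit Log bullet above describes a `prev_hash` + `row_hash` chain per row. A minimal in-memory sketch of that idea follows. The genesis value, the JSON canonicalisation, and the function names (`append_row`, `first_broken_link`) are assumptions for illustration, not the project's actual schema or its `verify_chain()` signature:

```python
import hashlib
import json
from typing import Dict, List, Optional

GENESIS = "0" * 64  # assumed prev_hash for the very first row


def row_hash(prev_hash: str, payload: Dict[str, object]) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the row."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_row(rows: List[dict], payload: Dict[str, object]) -> None:
    """Append a row whose hash commits to every row before it."""
    prev = rows[-1]["row_hash"] if rows else GENESIS
    rows.append(
        {"payload": payload, "prev_hash": prev, "row_hash": row_hash(prev, payload)}
    )


def first_broken_link(rows: List[dict]) -> Optional[int]:
    """Walk rows top-down; return the index of the first bad row, else None."""
    prev = GENESIS
    for index, row in enumerate(rows):
        recomputed = row_hash(prev, row["payload"])
        if row["prev_hash"] != prev or row["row_hash"] != recomputed:
            return index
        prev = row["row_hash"]
    return None
```

Because each `row_hash` feeds into the next row's `prev_hash`, editing any earlier payload changes its recomputed hash and the walk stops at that row.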

---
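
The WebRTC Packet Inspector bullet in the feature list mentions a rolling window (default 600 samples) with per-metric `last/min/max/avg/p95`. A minimal sketch of such a window for a single metric, assuming a nearest-rank definition of p95; `RollingMetric` is a hypothetical name and the real inspector may compute percentiles differently:

```python
from collections import deque
from typing import Deque, Dict


class RollingMetric:
    """Fixed-size window of samples for one metric (e.g. RTT in ms)."""

    def __init__(self, maxlen: int = 600) -> None:
        # deque(maxlen=...) silently drops the oldest sample when full.
        self._samples: Deque[float] = deque(maxlen=maxlen)

    def add(self, value: float) -> None:
        self._samples.append(value)

    def summary(self) -> Dict[str, float]:
        """Return last/min/max/avg/p95 over the current window."""
        if not self._samples:
            return {}
        ordered = sorted(self._samples)
        count = len(ordered)
        # Nearest-rank p95: ceil(0.95 * count), done in integer arithmetic.
        rank = (95 * count + 99) // 100
        return {
            "last": self._samples[-1],
            "min": ordered[0],
            "max": ordered[-1],
            "avg": sum(ordered) / count,
            "p95": ordered[rank - 1],
        }
```

At one sample per second, the default `maxlen=600` keeps roughly the last ten minutes, which matches the window size quoted in the bullet.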

@@ -105,6 +114,7 @@ flowchart LR
APIUser[["Custom Anthropic /<br/>OpenAI tool loops"]]
HTTPClient[["HTTP / SSE clients"]]
TCPClient[["Socket / REST clients"]]
Browser[["Browser<br/>(/dashboard · /docs)"]]
GUIUser[["PySide6 GUI"]]
CLIUser[["python -m<br/>je_auto_control[.cli]"]]
Library[["Library users<br/>(import je_auto_control)"]]
@@ -114,8 +124,9 @@ flowchart LR
direction TB
Stdio["MCP stdio<br/>JSON-RPC 2.0"]
HTTPMCP["MCP HTTP /<br/>SSE + auth + TLS"]
REST["REST server<br/>:9939"]
REST["REST server :9939<br/>bearer auth · rate-limit ·<br/>OpenAPI · /metrics · /dashboard"]
Socket["Socket server<br/>:9938"]
WebRTC["WebRTC sessions<br/>(remote desktop ·<br/>files · audio · USB)"]
end

subgraph MCP["mcp_server/"]
@@ -137,6 +148,28 @@ flowchart LR
IOUtils["clipboard/ · cv2_utils/ ·<br/>shell_process/ · json/"]
end

subgraph Ops["Operations Layer (utils/)"]
direction TB
Admin["admin/<br/>multi-host poll +<br/>broadcast"]
Audit["remote_desktop/<br/>audit_log<br/>(SHA-256 chain)"]
Inspector["remote_desktop/<br/>webrtc_inspector"]
Diag["diagnostics/<br/>self-test"]
ConfigB["config_bundle/<br/>export/import"]
end

subgraph USB["USB"]
direction TB
UsbEnum["usb/<br/>list + hotplug events"]
UsbPass["usb/passthrough/<br/>session · client · ACL ·<br/>libusb · WinUSB · IOKit"]
end

subgraph Remote["Remote Desktop (utils/remote_desktop/)"]
direction TB
RDHost["host · webrtc_host ·<br/>signaling · multi_viewer"]
RDFiles["webrtc_files · file_sync ·<br/>clipboard_sync · audio"]
RDTrust["trust_list · fingerprint ·<br/>turn_config · lan_discovery"]
end

subgraph Backends["Per-OS Backends"]
direction TB
Win["windows/<br/>Win32 ctypes"]
@@ -149,6 +182,7 @@ flowchart LR
HTTPClient --> HTTPMCP
TCPClient --> Socket
TCPClient --> REST
Browser --> REST

Stdio --> Dispatcher
HTTPMCP --> Dispatcher
@@ -167,13 +201,27 @@ flowchart LR
Resources --> Wrapper

REST --> Executor
REST --> Ops
REST --> USB
Socket --> Executor
WebRTC --> Remote
WebRTC --> UsbPass

GUIUser --> Wrapper
GUIUser --> Recorder
GUIUser --> Ops
GUIUser --> USB
GUIUser --> Remote
CLIUser --> Executor
Library --> Wrapper
Library --> Executor
Library --> Ops

Admin --> REST
Inspector -.- WebRTC
Audit -.- REST
Audit -.- USB
UsbPass --> Backends

Wrapper --> Backends
Vision -.- Wrapper
@@ -203,11 +251,17 @@ je_auto_control/
├── vision/ # VLM-based locator (Anthropic / OpenAI backends)
├── ocr/ # Tesseract-backed text locator
├── clipboard/ # Cross-platform clipboard (text + image)
├── llm/ # Plain-language → AC_* action planner
├── scheduler/ # Interval + cron scheduler
├── hotkey/ # Global hotkey daemon
├── triggers/ # Image/window/pixel/file triggers
├── run_history/ # SQLite run log + error-screenshot artifacts
├── rest_api/ # Stdlib HTTP/REST server
├── rest_api/ # Stdlib HTTP/REST server — auth · audit · rate-limit · OpenAPI · /metrics · dashboard · Swagger UI
├── admin/ # Multi-host AdminConsoleClient (poll + broadcast)
├── diagnostics/ # System self-test runner + CLI
├── config_bundle/ # Single-file user-config export / import
├── usb/ # Cross-platform enumeration, hotplug events, passthrough/{protocol, session, viewer client, ACL, libusb / WinUSB / IOKit}
├── remote_desktop/ # WebRTC host + viewer, signalling, multi-viewer, file/clipboard/audio sync, audit log (hash chain), trust list, TURN config, mDNS discovery, WebRTC stats inspector
├── plugin_loader/ # Dynamic AC_* plugin discovery
├── socket_server/ # TCP socket server for remote automation
├── shell_process/ # Shell command manager
@@ -570,11 +624,17 @@ viewer = RemoteDesktopViewer(
```

**Audio streaming (host → viewer).** Optional `sounddevice` dep; opt
in with `enable_audio=True` on the host, attach an `AudioPlayer` (or
your own callback) on the viewer:
in with an `AudioCaptureConfig` on the host, attach an `AudioPlayer`
(or your own callback) on the viewer:

```python
host = RemoteDesktopHost(token="tok", enable_audio=True)
from je_auto_control.utils.remote_desktop import AudioCaptureConfig
host = RemoteDesktopHost(
token="tok",
audio_config=AudioCaptureConfig(enabled=True), # default mic
)
# Or pick a loopback / monitor device:
# audio_config=AudioCaptureConfig(enabled=True, device=12)

from je_auto_control.utils.remote_desktop import AudioPlayer
player = AudioPlayer(); player.start()