HandsOn

A unified remote machine control layer for AI agents. Control any computer — at any level from BIOS to desktop — through a single MCP (Model Context Protocol) interface.

HandsOn supports multiple backends: macOS native (Peekaboo) and IP-KVM hardware (Rock5B, PiKVM, NanoKVM). AI agents get the same core tools regardless of backend; the Peekaboo backend additionally provides semantic UI element detection.

Why HandsOn Exists

Software-based remote control (VNC, SSH, screen sharing) fails in many critical scenarios:

  • macOS SIP blocks software virtual HID devices; VNC can't reach FileVault, Recovery Mode, or firmware screens.
  • BIOS/UEFI/boot menus require a real USB keyboard — no software solution works before the OS boots.
  • AI agents need a consistent API to see screens and send input, regardless of the target machine or access method.

HandsOn solves all three: hardware-level KVM when you need it, macOS accessibility when you have it, and one MCP interface for the agent.

Architecture

┌──────────────────────────────────────────────────────┐
│  AI Agent (Claude Code / Claude Desktop / Cursor)    │
│  Sees screenshots, reasons about UI, sends commands  │
└──────────────┬───────────────────────────────────────┘
               │ MCP (stdio or streamable-http)
┌──────────────▼───────────────────────────────────────┐
│  handson MCP Server (python -m handson.mcp_server)   │
│                                                      │
│  Core tools (all backends):                          │
│    screenshot, mouse_click, type_text, press_key,    │
│    hotkey, mouse_scroll, mouse_move, status          │
│                                                      │
│  Extended tools (Peekaboo only):                     │
│    see (element detection), click_element,           │
│    type_in_element                                   │
│                                                      │
│  ┌──────────┐ ┌────────┐ ┌────────┐ ┌─────────────┐ │
│  │ peekaboo │ │ rock5b │ │ pikvm  │ │ nanokvm     │ │
│  │ backend  │ │backend │ │backend │ │ backend     │ │
│  └────┬─────┘ └───┬────┘ └───┬────┘ └──────┬──────┘ │
└───────┼────────────┼─────────┼──────────────┼────────┘
        │ CLI/pipe   │ HTTP+WS │ HTTPS REST   │ HTTP+WS
        ▼            ▼         ▼              ▼
    Peekaboo      Rock5B    PiKVM          NanoKVM
    (macOS)      (local)   (remote)       (remote)

Backend Tiers

| Tier | Backend | Strengths | Limitations |
| --- | --- | --- | --- |
| Tier 1: Semantic | Peekaboo (macOS) | Accessibility API: free, instant, accurate element detection; element-level click/type | macOS only; no BIOS/boot/FileVault |
| Tier 2: Pixel | Rock5B, PiKVM, NanoKVM | Works everywhere: BIOS, bootloader, disk encryption, any OS | Screenshot-only vision; agent uses its own eyes to pick coordinates |

How the agent works with each tier:

  • Peekaboo: see → annotated screenshot + element map {B1: "Save", T1: "Name field"} → click_element("B1")
  • IP-KVM: screenshot → agent sees the JPEG (multimodal vision) → mouse_click(x=0.8, y=0.3)
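The two flows above can be sketched as a small planning function that returns the next tool call. This is only an illustration of the decision an agent makes: `find_element_id` and `plan_click` are hypothetical helpers, not part of HandsOn, and the `"left"` button value is an assumed default.

```python
def find_element_id(element_map, label):
    """Return the ID whose label matches (case-insensitive), or None."""
    for element_id, text in element_map.items():
        if text.lower() == label.lower():
            return element_id
    return None


def plan_click(backend, label, element_map=None, guess_xy=None):
    """Return (tool_name, arguments) for clicking the thing called `label`.

    Tier 1 (Peekaboo): use the element map produced by `see`.
    Tier 2 (IP-KVM): fall back to normalized coordinates the agent
    picked by looking at the screenshot.
    """
    if backend == "peekaboo" and element_map:
        element_id = find_element_id(element_map, label)
        if element_id is not None:
            return ("click_element", {"element_id": element_id})
    if guess_xy is None:
        raise ValueError("no element match and no coordinates supplied")
    x, y = guess_xy
    # "left" is an assumed button name, not confirmed by this README.
    return ("mouse_click", {"x": x, "y": y, "button": "left"})


if __name__ == "__main__":
    elements = {"B1": "Save", "T1": "Name field"}
    print(plan_click("peekaboo", "Save", element_map=elements))
    print(plan_click("pikvm", "Save", guess_xy=(0.8, 0.3)))
```

The fallback keeps the agent usable when `see` finds no matching element, for example on a custom-drawn control without accessibility metadata.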

Quick Start

As an MCP Server (for AI agents)

pip install aiohttp "mcp[cli]"

# macOS with Peekaboo
python -m handson.mcp_server --backend peekaboo

# Rock5B (connect to local KVM server)
python -m handson.mcp_server --backend rock5b --url http://localhost:8080

# PiKVM (remote)
python -m handson.mcp_server --backend pikvm --url https://pikvm.local \
  --user admin --password admin

# NanoKVM (remote)
python -m handson.mcp_server --backend nanokvm --url http://nanokvm.local \
  --user admin --password admin
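
Several MCP clients (Claude Desktop, for example) launch stdio servers from a JSON file with an `mcpServers` map. The sketch below generates such an entry for the commands above. The server name `handson`, the helper `client_config`, and the choice to pass credentials through the `env` map (using the variables listed under Configuration) are assumptions for illustration; check your client's documentation for where the file lives.

```python
import json


def client_config(backend, url=None, user=None, password=None):
    """Build an `mcpServers` entry that starts the HandsOn server over stdio."""
    args = ["-m", "handson.mcp_server", "--backend", backend]
    if url:
        args += ["--url", url]

    # Credentials go in the environment rather than on the command line,
    # so they do not show up in process listings.
    env = {}
    if user:
        env["HANDSON_KVM_USER"] = user
    if password:
        env["HANDSON_KVM_PASSWORD"] = password

    server = {"command": "python", "args": args}
    if env:
        server["env"] = env
    return {"mcpServers": {"handson": server}}


if __name__ == "__main__":
    config = client_config("pikvm", "https://pikvm.local", "admin", "admin")
    print(json.dumps(config, indent=2))
```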

As a Rock5B KVM Web Server

See docs/ROCK5B.md for hardware setup, wiring, configuration, and troubleshooting.

MCP Tools Reference

Core Tools (all backends)

| Tool | Parameters | Description |
| --- | --- | --- |
| screenshot | (none) | Capture current screen (JPEG) |
| mouse_move | x: float, y: float | Move cursor (0.0-1.0 normalized) |
| mouse_click | x: float, y: float, button: str | Click at position |
| mouse_double_click | x: float, y: float | Double-click |
| mouse_scroll | x: float, y: float, clicks: int | Scroll (positive=up) |
| type_text | text: str | Type a string |
| press_key | key: str | Press and release a key (enter, tab, f1, KeyA, etc.) |
| hotkey | keys: str | Combo, comma-separated (ctrl,alt,delete or cmd,shift,t) |
| status | (none) | Backend info (JSON) |
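
Because the mouse tools take normalized coordinates, an agent that reasons in pixels needs a conversion step. The following is a minimal sketch, assuming the origin is the top-left corner of the screenshot (this README does not say). `to_normalized` and `hotkey_arg` are hypothetical helper names.

```python
def to_normalized(px, py, width, height):
    """Convert pixel coordinates to the 0.0-1.0 range used by the mouse tools.

    Assumes (0, 0) is the top-left corner; values are clamped so that a
    point slightly outside the image still maps to a valid position.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    x = min(max(px / width, 0.0), 1.0)
    y = min(max(py / height, 0.0), 1.0)
    return (x, y)


def hotkey_arg(*keys):
    """Join key names into the comma-separated string the hotkey tool expects."""
    return ",".join(key.strip() for key in keys)


if __name__ == "__main__":
    # Centre of a 1920x1080 screenshot.
    print(to_normalized(960, 540, 1920, 1080))
    print(hotkey_arg("ctrl", "alt", "delete"))
```

Normalized coordinates mean the same call works regardless of the resolution of the target machine or of the screenshot the backend returns.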

Extended Tools (Peekaboo backend only)

| Tool | Parameters | Description |
| --- | --- | --- |
| see | (none) | Detect UI elements → annotated screenshot + element map |
| click_element | element_id: str | Click element by ID (e.g. B1) |
| type_in_element | element_id: str, text: str | Type into element (e.g. T1) |

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| HANDSON_BACKEND | rock5b | Backend type (peekaboo, rock5b, pikvm, nanokvm) |
| HANDSON_KVM_URL | (none) | Backend URL |
| HANDSON_KVM_USER | (none) | Backend username |
| HANDSON_KVM_PASSWORD | (none) | Backend password |
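
As an illustration of how these variables might be resolved, here is a minimal sketch that reads them with the documented default for `HANDSON_BACKEND`. The function `load_settings` is hypothetical and is not HandsOn's actual loader; how the variables interact with the `--backend`, `--url`, `--user`, and `--password` flags is not specified here.

```python
import os

VALID_BACKENDS = ("peekaboo", "rock5b", "pikvm", "nanokvm")


def load_settings(env=None):
    """Resolve HandsOn settings from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    backend = env.get("HANDSON_BACKEND", "rock5b")  # documented default
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return {
        "backend": backend,
        "url": env.get("HANDSON_KVM_URL"),          # None when unset
        "user": env.get("HANDSON_KVM_USER"),
        "password": env.get("HANDSON_KVM_PASSWORD"),
    }


if __name__ == "__main__":
    settings = load_settings({"HANDSON_BACKEND": "pikvm",
                              "HANDSON_KVM_URL": "https://pikvm.local"})
    print(settings["backend"], settings["url"])
```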

License

Apache License 2.0. See LICENSE.
