Pupil

Pupil is a Windows MCP server that lets an AI agent perceive your UI as structured data, indicate controls with an on-screen overlay, and act on the desktop when you accept.

Demo in which the human operator only presses Tab, Tab, Tab...

Click the GIF for the full-quality MP4.

Why Pupil

Today, working with an agent on a real desktop usually means a chat back-and-forth: you describe what you see, the agent describes what to do, you do it, you describe again. It works, but it's slow and a lot gets lost in translation.

Two things make that loop hard:

agents can't reliably see what's on screen, so context comes from your words (or repeated screenshots sent through the model);
and for many steps they need you to act — clicking a specific button, typing into a specific field, confirming a dialog — because they don't have hands on your machine.

Pupil turns that chat into something more like working side by side. The agent gets a structured view of the UI instead of guessing from screenshots, and when it needs you it draws an overlay card on the exact control to click or field to fill. You stay in charge — you can accept, skip, or ignore — and as a bonus the agent can also execute the action itself when you let it, so the same channel covers "show me", "do this", and "let me do it for you".

It's not a full autopilot. It's a tighter loop between what the agent sees, what it asks for, and what actually happens on your screen.

Examples

Overlay cards for each indicate type (info, warning, wait, danger, click, action, input):
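
For callers that build indicate requests programmatically, the seven types listed above could be modelled as an enumeration so that typos fail early. This is only a sketch: the IndicateType class and parse_indicate_type helper are hypothetical names, not part of Pupil, and docs/MCP.md remains the authoritative contract.

```python
from enum import Enum


class IndicateType(str, Enum):
    """The seven indicate types named in this README (hypothetical wrapper)."""

    INFO = "info"
    WARNING = "warning"
    WAIT = "wait"
    DANGER = "danger"
    CLICK = "click"
    ACTION = "action"
    INPUT = "input"


def parse_indicate_type(raw: str) -> IndicateType:
    """Normalise a string and reject anything outside the listed set."""
    try:
        return IndicateType(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(t.value for t in IndicateType)
        raise ValueError(
            f"unknown indicate type {raw!r}; expected one of: {allowed}"
        ) from None
```

Whether Pupil itself accepts mixed-case or padded values is not stated here, so the normalisation step is a convenience of this sketch rather than documented behaviour.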

Quick start (Windows)

From the repo root, run .\scripts\build.ps1. It builds the .NET core, copies pupil-core.exe into app\vendor\win32-x64\, then runs pnpm install and pnpm rebuild electron under app\.
Point Cursor’s MCP config at the Node entrypoint. Replace <path-to-pupil-repo> with the absolute path to your clone (forward slashes are fine on Windows):

{
  "mcpServers": {
    "pupil": {
      "command": "node",
      "args": ["<path-to-pupil-repo>/app/bin/pupil-mcp.js"]
    }
  }
}

Reload MCP or restart Cursor so the server starts.
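
If you would rather generate the config entry than edit the path by hand, a small script can do the backslash-to-forward-slash conversion for you. The following is a minimal sketch, not something shipped with Pupil: the function name pupil_mcp_config and the example path C:\src\pupil are assumptions for illustration.

```python
import json
from pathlib import PureWindowsPath


def pupil_mcp_config(repo_root: str) -> dict:
    """Build the mcpServers entry from this section for a given clone path."""
    # PureWindowsPath parses Windows-style paths on any OS; as_posix()
    # returns the same path with forward slashes.
    entry = PureWindowsPath(repo_root, "app", "bin", "pupil-mcp.js").as_posix()
    return {"mcpServers": {"pupil": {"command": "node", "args": [entry]}}}


if __name__ == "__main__":
    # Placeholder path: substitute the absolute path to your own clone.
    print(json.dumps(pupil_mcp_config(r"C:\src\pupil"), indent=2))
```

The printed JSON can be pasted into Cursor's MCP config; where that file lives depends on your Cursor setup.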

If native binaries are missing or locked, run .\scripts\kill.ps1 before rebuilding. .\scripts\smoke.ps1 runs a basic syntax + bridge check. A legacy Python server in mcp/main.py exists for reference — see docs/MCP.md.

How it works

flowchart LR
    Agent[AI_Agent] -->|MCP_stdio| Shim[pupil-mcp.js]
    Shim -->|IPC| Daemon[Electron_daemon]
    Daemon -->|spawn| Core[pupil_core]

app/bin/pupil-mcp.js — MCP stdio entry; loads app/src/shim and the Electron overlay daemon.
core/ — .NET native sidecar (pupil-core.exe) that does the actual perception.
mcp/ — optional Python server (mcp/main.py) — legacy / minimal; most setups use Node only.
scripts/ — Windows build, smoke, and kill helpers.
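
To make the first hop in the diagram concrete: MCP's stdio transport carries JSON-RPC 2.0 messages, one JSON object per line, and tool invocations use the tools/call method. The sketch below only builds and frames such a message; it does not launch Pupil or perform the initialize handshake a real client sends first. The tool name perceive is taken from the docs/MCP.md summary in this README, while the helper name tools_call and the empty arguments object are placeholders, since the real parameters are defined in docs/MCP.md.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any value unique per request works


def tools_call(name: str, arguments: dict) -> str:
    """Frame an MCP tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the frame stays one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


if __name__ == "__main__":
    # Placeholder arguments: see docs/MCP.md for the real perceive parameters.
    print(tools_call("perceive", {}), end="")
```

In normal use the agent's MCP client produces these frames for you; the sketch is mainly useful for understanding or debugging what travels over stdin.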

Documentation

docs/MCP.md — full MCP & indicator contract (perceive / indicate, types, accept semantics, response shape).

Status & community

Early development, Windows-focused today. This is my first open-source project — feedback, bug reports, and questions are very welcome via GitHub Issues.

License

MIT

