mac-automation is a local command line tool for automating a Mac desktop session. It can inspect Microsoft Edge tabs, read the active page, capture scoped screenshots, query the macOS Accessibility tree, send clicks and keystrokes, run simple scheduled workflows, and smoke-test Swift apps.
It is built for agent workflows, but it is not an agent. It is a small control plane with plain JSON output and artifact files you can inspect.
The sharp edge is intentional: this tool works against your real desktop and your real browser profile. Start with read-only commands, use dry-runs for anything that writes, and do not run it on accounts you are not willing to automate.
- Check macOS permissions for Accessibility, Screen Recording, Edge automation, OCR, Playwright, and Xcode tooling.
- List and activate Microsoft Edge tabs from the current profile.
- Run a small allowlist of read-only JavaScript snippets in Edge.
- Read the active Edge page without dumping unrelated tabs.
- Capture the active window by default, with Retina and display-origin handling.
- Search the Accessibility tree and click/type by coordinates.
- Check LinkedIn notifications through Edge, with DOM, Accessibility, and OCR fallbacks.
- Prepare or submit exact LinkedIn posts, with dry-run support and write audit logs.
- Install and run a launchd notification-check workflow.
- Build/test Swift apps with xcodebuild, collect result bundles, and capture simulator or app UI evidence.
- Expose a stdin/stdout JSON-RPC helper called AutomationHub.
Requirements:
- macOS
- Bun
- Microsoft Edge for browser-profile workflows
- Xcode command line tools for Swift and app smoke tests
bun install
bun run automationctl doctor --json
doctor tells you what is ready and what needs permission. macOS will usually prompt for Automation permission the first time the tool controls Edge or System Events.
Edge DOM reads require this Edge setting:
View > Developer > Allow JavaScript from Apple Events
bun run automationctl permissions status --json
bun run automationctl edge tabs --domain linkedin.com --json
bun run automationctl edge open https://www.linkedin.com/notifications/ --reuse-tab --json
bun run automationctl edge js 'document.title' --json
bun run automationctl edge read-active --max-chars 12000 --json
bun run automationctl screen screenshot --json
Broad tab listings are redacted by default. If you ask for every tab, you will get counts unless you narrow the selector or explicitly request more detail.
Read notifications:
bun run automationctl linkedin notifications --json
Prepare a post without submitting it:
bun run automationctl linkedin post --text "Posting test from mac-automation." --dry-run --json
Submit only after you have checked the dry-run output:
bun run automationctl linkedin post --text "Posting test from mac-automation." --confirm-public-write --json
By default, audit logs store hashes and outcomes, not the full post text.
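The dry-run and submit commands above can be chained so that the public write is never reached without a review step. The following is a minimal sketch and not part of the tool: `post_with_review` and the `automationctl` shell function are hypothetical names, and the typed "yes" prompt is an assumed way of gating the write. It uses only the flags shown above.

```shell
# Thin wrapper so the rest of the script can call automationctl directly.
automationctl() { bun run automationctl "$@"; }

# Hypothetical helper: dry-run first, then submit the same exact text
# only if the operator types "yes" on stdin.
post_with_review() {
  local text="$1" answer

  # Step 1: dry-run. Stop here if the dry-run itself fails.
  automationctl linkedin post --text "$text" --dry-run --json || return 1

  # Step 2: ask for explicit confirmation (prompt goes to stderr).
  printf 'Submit this exact text publicly? Type "yes" to continue: ' >&2
  read -r answer
  if [ "$answer" != "yes" ]; then
    echo "Aborted; nothing was submitted." >&2
    return 2
  fi

  # Step 3: the real write, using the same text that was dry-run.
  automationctl linkedin post --text "$text" --confirm-public-write --json
}

# Usage:
#   post_with_review "Posting test from mac-automation."
```

Passing the same variable to both calls keeps the submitted text identical to the text that was reviewed.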
Preview the LaunchAgent:
bun run automationctl workflow install-launchd notification-check --interval-minutes 60 --dry-run --json
Install it:
bun run automationctl workflow install-launchd notification-check --interval-minutes 60 --json
Run and inspect the latest result:
bun run automationctl workflow run notification-check --json
bun run automationctl workflow last notification-check --json
Workflow output is written under ~/.automationhub/runs/ and ~/.automationhub/latest/.
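Because this README states that workflow failures keep their artifacts but still exit nonzero, a calling script can branch on the exit code and then look at the latest result. The sketch below shows one way to do that; `run_and_report` and the `automationctl` shell function are hypothetical names, and the layout of files inside ~/.automationhub/latest/ is not assumed.

```shell
# Thin wrapper so the rest of the script can call automationctl directly.
automationctl() { bun run automationctl "$@"; }

# Hypothetical helper: run a workflow, and on failure print the latest
# recorded result before passing the original exit code back to the caller.
run_and_report() {
  local workflow="$1" rc=0

  automationctl workflow run "$workflow" --json || rc=$?

  if [ "$rc" -ne 0 ]; then
    echo "workflow $workflow exited with code $rc; see $HOME/.automationhub/latest/" >&2
    # Show whatever the tool recorded for the failed run.
    automationctl workflow last "$workflow" --json || true
  fi

  return "$rc"
}

# Usage:
#   run_and_report notification-check || echo "check failed" >&2
```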
Build:
bun run automationctl app smoke --project ./App.xcodeproj --scheme App --json
Run XCTest on a simulator and collect evidence:
bun run automationctl app smoke \
--project ./App.xcodeproj \
--scheme App \
--test \
--destination 'platform=iOS Simulator,name=iPhone 17' \
--bundle-id com.example.App \
--json
Drive a local macOS app by accessibility identifier:
bun run automationctl app smoke \
--project ./App.xcodeproj \
--scheme App \
--bundle-id com.example.App \
--app-path /Applications/App.app \
--tap-id sample.saveButton \
--json
See examples/SampleSwiftUIApp.swift for the identifier pattern expected by app smoke and ax search.
AutomationHub accepts one JSON-RPC request per line on stdin:
printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"permissions.status"}' | ./bin/AutomationHub
printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"edge.tabs.list"}' | ./bin/AutomationHub
The helper is for local process integration. It does not start a network server.
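The one-request-per-line framing can be generated rather than typed by hand. This is a rough sketch under stated assumptions: `rpc_request` and `rpc_batch` are hypothetical helpers, they handle only methods without params (like the two examples above), and whether AutomationHub keeps reading after the first line on a single stdin stream has not been confirmed here.

```shell
# Hypothetical helper: print one JSON-RPC 2.0 request on a single line.
# The id is emitted as a number; the method name must not contain quotes.
rpc_request() {
  local id="$1" method="$2"
  printf '{"jsonrpc":"2.0","id":%s,"method":"%s"}\n' "$id" "$method"
}

# Hypothetical helper: emit one request per line with increasing ids,
# so each response can be matched back to its request.
rpc_batch() {
  local id=1 method
  for method in "$@"; do
    rpc_request "$id" "$method"
    id=$((id + 1))
  done
}

# Usage (requires ./bin/AutomationHub to exist):
#   rpc_request 1 permissions.status | ./bin/AutomationHub
#   rpc_batch permissions.status edge.tabs.list | ./bin/AutomationHub
```

Giving each request a distinct id matters once more than one request shares a stream, since JSON-RPC 2.0 responses carry the id of the request they answer.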
- Generic edge js is read-only and allowlisted.
- Cookie, storage, password, token, and authorization data are blocked in generic browser helpers.
- Screenshots are scoped to the active window unless you pass --full-screen.
- Public writes require exact text and support --dry-run.
- Workflow failures keep artifacts but still exit nonzero.
- The tool does not bypass CAPTCHAs, bot detection, platform limits, or terms of service.
The repo includes a Codex skill at skills/automationhub-multi-harness. Copy it into your Codex skills directory if you want future sessions to know how to use and maintain this tool:
mkdir -p "${CODEX_HOME:-$HOME/.codex}/skills"
cp -R skills/automationhub-multi-harness "${CODEX_HOME:-$HOME/.codex}/skills/"
The skill covers CLI, JSON-RPC, Edge, Accessibility, screenshots, LinkedIn, launchd, and app smoke-test harnesses.
bun run typecheck
bun test
bash skills/automationhub-multi-harness/scripts/smoke_automationhub.sh
If you change a command, flag, JSON-RPC method, result shape, artifact, or safety behavior, update the skill references in the same change.
MIT.