Swader/mac-automation

mac-automation

mac-automation is a local command line tool for automating a Mac desktop session. It can inspect Microsoft Edge tabs, read the active page, capture scoped screenshots, query the macOS Accessibility tree, send clicks and keystrokes, run simple scheduled workflows, and smoke-test Swift apps.

It is built for agent workflows, but it is not an agent. It is a small control plane with plain JSON output and artifact files you can inspect.

The sharp edge is intentional: this tool works against your real desktop and your real browser profile. Start with read-only commands, use dry-runs for anything that writes, and do not run it on accounts you are not willing to automate.

What it can do

  • Check macOS permissions for Accessibility, Screen Recording, Edge automation, OCR, Playwright, and Xcode tooling.
  • List and activate Microsoft Edge tabs from the current profile.
  • Run a small allowlist of read-only JavaScript snippets in Edge.
  • Read the active Edge page without dumping unrelated tabs.
  • Capture the active window by default, with Retina and display-origin handling.
  • Search the Accessibility tree and click/type by coordinates.
  • Check LinkedIn notifications through Edge, with DOM, Accessibility, and OCR fallbacks.
  • Prepare or submit exact LinkedIn posts, with dry-run support and write audit logs.
  • Install and run a launchd notification-check workflow.
  • Build/test Swift apps with xcodebuild, collect result bundles, and capture simulator or app UI evidence.
  • Expose a stdin/stdout JSON-RPC helper called AutomationHub.

Install

Requirements:

  • macOS
  • Bun
  • Microsoft Edge for browser-profile workflows
  • Xcode command line tools for Swift and app smoke tests

Install dependencies, then check what is ready:

bun install
bun run automationctl doctor --json

The doctor command reports what is ready and what still needs permission. macOS usually prompts for Automation permission the first time the tool controls Edge or System Events.

Edge DOM reads require this Edge setting:

View > Developer > Allow JavaScript from Apple Events

Basic commands

bun run automationctl permissions status --json
bun run automationctl edge tabs --domain linkedin.com --json
bun run automationctl edge open https://www.linkedin.com/notifications/ --reuse-tab --json
bun run automationctl edge js 'document.title' --json
bun run automationctl edge read-active --max-chars 12000 --json
bun run automationctl screen screenshot --json

Broad tab listings are redacted by default. If you ask for every tab, you will get counts unless you narrow the selector or explicitly request more detail.
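
Because each command above accepts --json, a caller can treat stdout as a single JSON document. The following is a minimal TypeScript sketch under that assumption. The result shapes are not documented in this README, so the hypothetical helper parseCliJson returns an unknown value and leaves field validation to the caller.

```typescript
// Hypothetical helper: not part of mac-automation.
// Assumes the command printed exactly one JSON value on stdout.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

function parseCliJson(stdout: string): ParseResult {
  const text = stdout.trim();
  if (text.length === 0) {
    return { ok: false, error: "empty output" };
  }
  try {
    return { ok: true, value: JSON.parse(text) as unknown };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `invalid JSON: ${message}` };
  }
}

// Usage with made-up output; the real fields may differ.
const sample = parseCliJson('{"title":"Example"}\n');
console.log(sample.ok ? sample.value : sample.error);
```

Returning a tagged result instead of throwing keeps the failure path explicit, which tends to suit agent loops that must decide whether to retry or stop.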

LinkedIn workflows

Read notifications:

bun run automationctl linkedin notifications --json

Prepare a post without submitting it:

bun run automationctl linkedin post --text "Posting test from mac-automation." --dry-run --json

Submit only after you have checked the dry-run output:

bun run automationctl linkedin post --text "Posting test from mac-automation." --confirm-public-write --json

By default, audit logs store hashes and outcomes, not the full post text.
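
The dry-run-then-confirm sequence can be wrapped so that the confirmed text is exactly the text that was reviewed. This is a minimal sketch that uses only the flags shown above. The names linkedinPostArgs and reviewedText are hypothetical, and the function only builds an argument list; it does not run anything.

```typescript
// Hypothetical wrapper around the two commands shown above.
// It builds argv arrays only; spawning the process is left to the caller.
type PostMode = "dry-run" | "confirm";

function linkedinPostArgs(
  text: string,
  mode: PostMode,
  reviewedText?: string,
): string[] {
  if (text.trim().length === 0) {
    throw new Error("post text must not be empty");
  }
  // Refuse to confirm unless the text matches the dry-run text exactly.
  if (mode === "confirm" && reviewedText !== text) {
    throw new Error("text differs from the reviewed dry-run text");
  }
  const flag = mode === "dry-run" ? "--dry-run" : "--confirm-public-write";
  return ["run", "automationctl", "linkedin", "post", "--text", text, flag, "--json"];
}

const draft = "Posting test from mac-automation.";
console.log(["bun", ...linkedinPostArgs(draft, "dry-run")].join(" "));
```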

Scheduling

Preview the LaunchAgent:

bun run automationctl workflow install-launchd notification-check --interval-minutes 60 --dry-run --json

Install it:

bun run automationctl workflow install-launchd notification-check --interval-minutes 60 --json

Run and inspect the latest result:

bun run automationctl workflow run notification-check --json
bun run automationctl workflow last notification-check --json

Workflow output is written under ~/.automationhub/runs/ and ~/.automationhub/latest/.
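
For orientation, launchd expresses its StartInterval key in seconds, so a 60-minute interval would correspond to 3600. The sketch below assumes the generated LaunchAgent uses StartInterval, which this README does not confirm; the dry-run output shows what is actually written. Both function names are hypothetical.

```typescript
// Assumption: the LaunchAgent schedules runs with launchd's StartInterval key,
// which is expressed in seconds. The README does not say which key the tool
// actually writes; check the --dry-run output.
function startIntervalSeconds(intervalMinutes: number): number {
  if (!Number.isInteger(intervalMinutes) || intervalMinutes <= 0) {
    throw new RangeError("interval must be a positive whole number of minutes");
  }
  return intervalMinutes * 60;
}

// Hypothetical plist fragment, for illustration only.
function startIntervalFragment(intervalMinutes: number): string {
  const seconds = startIntervalSeconds(intervalMinutes);
  return `<key>StartInterval</key>\n<integer>${seconds}</integer>`;
}

console.log(startIntervalFragment(60));
```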

Swift app smoke tests

Build:

bun run automationctl app smoke --project ./App.xcodeproj --scheme App --json

Run XCTest on a simulator and collect evidence:

bun run automationctl app smoke \
  --project ./App.xcodeproj \
  --scheme App \
  --test \
  --destination 'platform=iOS Simulator,name=iPhone 17' \
  --bundle-id com.example.App \
  --json

Drive a local macOS app by accessibility identifier:

bun run automationctl app smoke \
  --project ./App.xcodeproj \
  --scheme App \
  --bundle-id com.example.App \
  --app-path /Applications/App.app \
  --tap-id sample.saveButton \
  --json

See examples/SampleSwiftUIApp.swift for the identifier pattern expected by app smoke and ax search.
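
When app smoke is called from a script, the flag combinations above can be assembled from a typed options object. This is a sketch only: AppSmokeOptions and appSmokeArgs are hypothetical names, and the flags are limited to those that appear in the examples above.

```typescript
// Hypothetical typed wrapper over the app smoke flags shown above.
interface AppSmokeOptions {
  project: string;
  scheme: string;
  test?: boolean;
  destination?: string;
  bundleId?: string;
  appPath?: string;
  tapId?: string;
}

// Builds the arguments to pass to `bun`; it does not spawn the process.
function appSmokeArgs(opts: AppSmokeOptions): string[] {
  const args = [
    "run", "automationctl", "app", "smoke",
    "--project", opts.project,
    "--scheme", opts.scheme,
  ];
  if (opts.test) args.push("--test");
  if (opts.destination) args.push("--destination", opts.destination);
  if (opts.bundleId) args.push("--bundle-id", opts.bundleId);
  if (opts.appPath) args.push("--app-path", opts.appPath);
  if (opts.tapId) args.push("--tap-id", opts.tapId);
  args.push("--json");
  return args;
}

const simulatorRun = appSmokeArgs({
  project: "./App.xcodeproj",
  scheme: "App",
  test: true,
  destination: "platform=iOS Simulator,name=iPhone 17",
  bundleId: "com.example.App",
});
console.log(simulatorRun);
```

Keeping the destination as a single array element avoids shell quoting problems with its space and comma when the list is handed to a spawn-style API.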

JSON-RPC helper

AutomationHub accepts one JSON-RPC request per line on stdin:

printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"permissions.status"}' | ./bin/AutomationHub
printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"edge.tabs.list"}' | ./bin/AutomationHub

The helper is for local process integration. It does not start a network server.
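
A caller that talks to the helper from code needs to frame requests as single lines and read back JSON-RPC 2.0 responses, which carry either a result or an error member. The sketch below shows that framing in TypeScript. It assumes the helper writes one response per line, and because result shapes are not documented here, results are typed as unknown. encodeRequest and decodeResponse are hypothetical names.

```typescript
// Sketch of line-delimited JSON-RPC 2.0 framing for a helper like AutomationHub.
// Method names come from the examples above; everything else is an assumption.
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface RpcError {
  code: number;
  message: string;
}

type RpcOutcome =
  | { id: number | string | null; ok: true; result: unknown }
  | { id: number | string | null; ok: false; error: RpcError };

// Serialize one request as a single line terminated by "\n".
function encodeRequest(id: number, method: string, params?: unknown): string {
  const request: RpcRequest = { jsonrpc: "2.0", id, method };
  if (params !== undefined) {
    request.params = params;
  }
  return JSON.stringify(request) + "\n";
}

// Parse one response line into a tagged outcome.
function decodeResponse(line: string): RpcOutcome {
  const msg = JSON.parse(line) as {
    id?: number | string | null;
    result?: unknown;
    error?: RpcError;
  };
  const id = msg.id ?? null;
  if (msg.error !== undefined) {
    return { id, ok: false, error: msg.error };
  }
  return { id, ok: true, result: msg.result };
}

console.log(encodeRequest(1, "permissions.status"));
```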

Safety model

  • Generic edge js is read-only and allowlisted.
  • Cookie, storage, password, token, and authorization data are blocked in generic browser helpers.
  • Screenshots are scoped to the active window unless you pass --full-screen.
  • Public writes require exact text and support --dry-run.
  • Workflow failures keep artifacts but still exit nonzero.
  • The tool does not bypass CAPTCHAs, bot detection, platform limits, or terms of service.

Skill for Codex

The repo includes a Codex skill at skills/automationhub-multi-harness. Copy it into your Codex skills directory if you want future sessions to know how to use and maintain this tool:

mkdir -p "${CODEX_HOME:-$HOME/.codex}/skills"
cp -R skills/automationhub-multi-harness "${CODEX_HOME:-$HOME/.codex}/skills/"

The skill covers CLI, JSON-RPC, Edge, Accessibility, screenshots, LinkedIn, launchd, and app smoke-test harnesses.

Development

bun run typecheck
bun test
bash skills/automationhub-multi-harness/scripts/smoke_automationhub.sh

If you change a command, flag, JSON-RPC method, result shape, artifact, or safety behavior, update the skill references in the same change.

License

MIT.
