feat: Phase 2 & 3 — Memory, API, MCP, Takeover, Skills, Menu Bar, SDK Ecosystem by terryso · Pull Request #1 · terryso/axion

terryso · 2026-05-16T12:22:46Z

Summary

This PR merges Phase 2 (Growth Features) and Phase 3 (Vision Features) into master, bringing Axion from an MVP CLI tool to a full desktop automation platform. 50 commits, 569 files changed, 71k+ lines added.

Phase 2 — Growth Features (Epic 4–7)

Epic 4: Cross-run Memory — Auto-extract app operation patterns after each run; Planner injects historical experience for more accurate plans (axion memory list/clear, --no-memory)
Epic 5: HTTP API Server — REST API + SSE event stream for external integrations (axion server, task submission, real-time progress, auth, concurrency)
Epic 6: MCP Server Mode — Act as MCP stdio server for external agents like Claude Code (axion mcp)
Epic 7: Takeover & Fast Mode — Pause/resume when automation gets stuck; --fast mode reduces LLM calls for simple tasks

Phase 3 — Vision Features (Epic 8–11)

Epic 8: Multi-window Workflows — Cross-app coordination, window layout management (arrange_windows with tile/cascade), blocking dialog detection
Epic 9: Record → Compile → Skill Reuse — Record desktop operations and compile into reusable skills with zero LLM cost (axion record, axion skill compile/run/list/delete)
Epic 10: Menu Bar App (AxionBar) — Native macOS menu bar app with task panel, SSE real-time progress, global hotkeys, skill quick trigger
Epic 11: Third-party SDK Ecosystem — Agent project template, plugin tool registration (@Tool macro), developer documentation and examples

Other Changes

Test migration to Swift Testing framework (all test targets)
README rewritten in English and Chinese with accurate tool list (21 tools)
Playwright MCP server integration for web automation
SIGINT handler for graceful Helper process cleanup
SDK dependency bumped to 0.3.2

Test plan

All unit tests pass (swift test --filter for Tools/Models/MCP/Services/Core/CLI)
Integration tests verified on macOS with AX permissions
CI pipeline green
Manual acceptance tests passed for Epic 4–9
Smoke test: axion run "Open Calculator" on clean master merge
Verify axion server, axion mcp, axion record commands work end-to-end

🤖 Generated with Claude Code

…Story 4.1) Add AppMemoryExtractor to extract operation summaries from SDK message streams, MemoryCleanupService for 30-day expiry cleanup, and Memory status check in axion doctor. RunCommand now collects tool pairs during execution and persists App knowledge entries organized by bundle identifier domain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
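The 30-day expiry mentioned above can be sketched as a simple date filter. This is a minimal illustration, not the actual MemoryCleanupService: the `MemoryEntry` type, its fields, and the `removingExpired` function are hypothetical names introduced here.

```swift
import Foundation

// Hypothetical entry shape; the real App Memory model is not shown in this PR.
struct MemoryEntry {
    let bundleID: String
    let summary: String
    let createdAt: Date
}

/// Returns only the entries created within the last `maxAgeDays`, measured from `now`.
func removingExpired(_ entries: [MemoryEntry], now: Date = Date(), maxAgeDays: Int = 30) -> [MemoryEntry] {
    let cutoff = now.addingTimeInterval(-Double(maxAgeDays) * 86_400)
    return entries.filter { $0.createdAt >= cutoff }
}

// A fixed reference time keeps the example deterministic.
let now = Date(timeIntervalSince1970: 1_800_000_000)
let entries = [
    MemoryEntry(bundleID: "com.apple.calculator", summary: "fresh", createdAt: now.addingTimeInterval(-5 * 86_400)),
    MemoryEntry(bundleID: "com.apple.TextEdit", summary: "stale", createdAt: now.addingTimeInterval(-45 * 86_400)),
]
let kept = removingExpired(entries, now: now)
print(kept.map(\.summary))  // ["fresh"]
```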

…assed) Verified Memory extraction, domain organization, expiry cleanup, corruption resilience, and doctor status reporting with real CLI commands against live API. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…amiliarity tracking (Story 4.2) Implement cross-run learning system that extracts AX tree structure features, identifies high-frequency operation patterns, marks failure experiences, and auto-marks familiar apps after 3+ successful runs. Fix tool name mismatch (get_ax_tree → get_accessibility_tree) found during acceptance testing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
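The "familiar after 3+ successful runs" rule could look roughly like the sketch below. Only the threshold comes from the commit message; the `FamiliaritySketch` type and its methods are hypothetical and do not claim to match the FamiliarityTracker in the codebase.

```swift
// Hypothetical tracker: counts successful runs per bundle identifier.
struct FamiliaritySketch {
    static let threshold = 3  // "3+ successful runs" from the commit message
    private var successCounts: [String: Int] = [:]

    /// Records one run; failed runs do not count toward familiarity.
    mutating func record(bundleID: String, succeeded: Bool) {
        guard succeeded else { return }
        successCounts[bundleID, default: 0] += 1
    }

    /// An app is familiar once its success count reaches the threshold.
    func isFamiliar(_ bundleID: String) -> Bool {
        successCounts[bundleID, default: 0] >= Self.threshold
    }
}

var tracker = FamiliaritySketch()
tracker.record(bundleID: "com.apple.calculator", succeeded: true)
tracker.record(bundleID: "com.apple.calculator", succeeded: true)
tracker.record(bundleID: "com.apple.calculator", succeeded: false)
let familiarBefore = tracker.isFamiliar("com.apple.calculator")  // two successes so far
tracker.record(bundleID: "com.apple.calculator", succeeded: true)
let familiarAfter = tracker.isFamiliar("com.apple.calculator")   // third success recorded
```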

…management commands (Story 4.3) Inject accumulated App Memory (profiles, patterns, failures, familiarity) into Planner system prompt for more accurate plans. Add axion memory list/clear commands and --no-memory flag. All 6 ACs verified via manual acceptance testing (11/11 pass) and 578 unit tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add Hummingbird-based HTTP API server with REST endpoints for submitting and querying desktop automation tasks. Includes server subcommand, async task execution via AgentRunner, actor-based RunTracker, and comprehensive unit tests (624 tests, 0 failures). Manual acceptance: 10/10 passed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The open-agent-sdk-swift on GitHub now requires swift-mcp 2.0.0. Update axion to match and adapt to breaking API changes:
- Remove ParameterValue conformance (replaced by @Schemable macro)
- Change Tool.Content to ContentBlock (renamed in swift-mcp 2.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>


Implement Server-Sent Events (SSE) endpoint for real-time monitoring of agent task execution. Includes EventBroadcaster actor for multi-client pub/sub, SSE event models, replay buffer for completed runs, and integration with AgentRunner for step-level event emission. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
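An actor-based pub/sub with a replay buffer, as described above, might be structured like this. It is a sketch under assumptions: `BroadcasterSketch`, its method names, and the event strings are hypothetical, events are plain strings, and handling of client disconnects is omitted. It relies on `AsyncStream.makeStream()` (Swift 5.9 or later).

```swift
// Hypothetical sketch, not Axion's actual EventBroadcaster.
actor BroadcasterSketch {
    private var continuations: [Int: AsyncStream<String>.Continuation] = [:]
    private var nextID = 0
    private var replayBuffer: [String] = []
    private var finished = false

    /// Subscribes a client. Buffered events are delivered first, so a client
    /// that joins late (or after the run completed) still sees the full history.
    func subscribe() -> AsyncStream<String> {
        let (stream, continuation) = AsyncStream<String>.makeStream()
        for event in replayBuffer { continuation.yield(event) }
        if finished {
            continuation.finish()
        } else {
            continuations[nextID] = continuation
            nextID += 1
        }
        return stream
    }

    /// Buffers the event and fans it out to every live subscriber.
    func publish(_ event: String) {
        replayBuffer.append(event)
        for continuation in continuations.values { continuation.yield(event) }
    }

    /// Marks the run as complete and closes all live streams.
    func finish() {
        finished = true
        for continuation in continuations.values { continuation.finish() }
        continuations.removeAll()
    }
}

let broadcaster = BroadcasterSketch()
await broadcaster.publish("step_started")
let live = await broadcaster.subscribe()   // joins mid-run
await broadcaster.publish("step_completed")
await broadcaster.finish()
let late = await broadcaster.subscribe()   // joins after completion

var liveEvents: [String] = []
for await event in live { liveEvents.append(event) }
var lateEvents: [String] = []
for await event in late { lateEvents.append(event) }
```

Both subscribers end up with the same two events: the live one receives the first from the buffer and the second as it is published, while the late one receives both from the buffer.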



Add hands-on acceptance testing document covering Epic 4 (Memory), Epic 5 (HTTP API), Epic 6 (MCP Server), and Epic 7 (Takeover/Fast Mode) with real commands for verification. Includes review patches from Story 4.1-4.3 (code fixes, test improvements, security hardening), epic retrospectives, and sprint status updates. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering>

- Add 30 new QA automate tests across Stories 8.1/8.2/8.3 (CrossAppWorkflowTests, PlannerPromptMultiWindowTests, TraceWindowContextTests, WindowManagementToolTests additions)
- Add Epic 8 retrospective document
- Update README tool count from 16 to 22
- Update sprint-status: epic-8-retrospective done

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>


- Fix SkillRunCommand --allow-foreground flag defaulting to true (ArgumentParser validation error)
- Add Epic 9 manual acceptance test document with real command verification
- Update manual-e2e-test-checklist.md with recording/skill test cases
- Update README with Record and Replay Skills section
- Update sprint-status, project-context, and epics for Epic 9 completion

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>


StatusBarController.sendNotification used UNUserNotificationCenter directly, which crashes in test environment without an app bundle. Extract NotificationSending protocol with injectable mock to enable isolated testing. Also includes Epic 10/11 retrospective updates and sprint-status sync. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>


… Testing Migrate 24 test files from XCTest to Swift Testing framework:
- TM-1: AxionCoreTests (11 files)
- TM-2: AxionHelperTests Models (4), Services (7), MCP (1)

Key changes: import XCTest → import Testing, class → @Suite struct, func test_xxx → @Test("xxx") func xxx(), XCTAssert* → #expect(), XCTUnwrap → try #require(), XCTAssertThrowsError → #expect(throws:). ServiceContainerTests uses .serialized to prevent parallel race conditions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
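For readers unfamiliar with the mapping listed in the commit above, here is a small migrated suite. The `Stack` type and test names are hypothetical and exist only to keep the example self-contained; the comments note the XCTest construct each line would have replaced. It is meant to live in a test target and run via `swift test`.

```swift
import Testing

// Hypothetical type under test, defined here so the example stands alone.
struct Stack {
    enum StackError: Error { case empty }
    private var items: [Int] = []
    var top: Int? { items.last }
    mutating func push(_ value: Int) { items.append(value) }
    mutating func pop() throws -> Int {
        guard let last = items.popLast() else { throw StackError.empty }
        return last
    }
}

@Suite("Stack")                                      // was: final class StackTests: XCTestCase
struct StackTests {
    @Test("push then pop returns the pushed value")  // was: func test_pushThenPop()
    func pushThenPop() throws {
        var stack = Stack()
        stack.push(7)
        let top = try #require(stack.top)            // was: try XCTUnwrap(stack.top)
        #expect(top == 7)                            // was: XCTAssertEqual(top, 7)
        let popped = try stack.pop()
        #expect(popped == 7)
    }

    @Test("pop on an empty stack throws")
    func popEmptyThrows() {
        var stack = Stack()
        #expect(throws: Stack.StackError.self) {     // was: XCTAssertThrowsError(try stack.pop())
            try stack.pop()
        }
    }
}
```

Because each `@Test` runs on a fresh instance of the suite struct, per-test setup moves into `init`, which matches the setUp/tearDown conversion described in the later migration commits.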

Migrate HelperMCPServerTests and HelperProcessSmokeTests from XCTest to Swift Testing (@Suite/@Test/#expect). Includes prior TM-2/TM-3 changes for ServiceContainerTests and all Tool test files. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Migrate Config, Planner, Executor, Engine, Verifier, IO, Helper, and Trace test files from XCTest to Swift Testing framework. setUp/tearDown converted to init/deinit with ~Copyable structs. Env-var-dependent tests use EnvGate actor to isolate global state. Add --no-parallel --quiet to both Makefile and CI to prevent parallel test races. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>


Migrate 10 final XCTest files to Swift Testing:
- AxionE2ETests (5): CorePipeline, HelperLifecycle, MockLLM, RealLLM, Helpers
- AxionCLITests/Memory (5): AppMemoryExtractor, AppProfileAnalyzer, FamiliarityTracker, MemoryCleanupService, MemoryContextProvider

Fix CI failure: DocumentationTests now skips when SDK repo is unavailable. `import XCTest` fully eliminated from Tests/. All 1561 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>


The arrange_windows implementation uses NSScreen.visibleFrame whose origin.y includes the menu bar height (62px on CI). Changed absolute y==0 assertions to relative offsets (y differences) so tests pass regardless of screen configuration. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
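The difference between absolute and relative assertions can be shown with a toy tiling function. Everything here is a stand-in: `Rect` replaces CGRect, and `tileHorizontally` is a hypothetical simplification of the arrange_windows tiling logic. The 62-point inset is the CI value quoted in the commit.

```swift
// Hypothetical stand-in for CGRect so the example has no platform dependencies.
struct Rect: Equatable { var x, y, width, height: Double }

/// Tiles `count` windows side by side inside the visible frame.
func tileHorizontally(count: Int, in visible: Rect) -> [Rect] {
    let tileWidth = visible.width / Double(count)
    return (0..<count).map { index in
        Rect(x: visible.x + Double(index) * tileWidth, y: visible.y,
             width: tileWidth, height: visible.height)
    }
}

// Two screen configurations: no vertical inset, and a 62-point inset as seen on CI.
let configurations = [
    Rect(x: 0, y: 0, width: 1440, height: 900),
    Rect(x: 0, y: 62, width: 1440, height: 838),
]

var checkedConfigurations = 0
for visible in configurations {
    let frames = tileHorizontally(count: 2, in: visible)
    // Brittle: `frames[0].y == 0` holds only for the first configuration.
    // Robust: compare against the visible frame and between windows.
    precondition(frames[0].y - visible.y == 0)
    precondition(frames[1].y - frames[0].y == 0)
    precondition(frames[1].x - frames[0].x == visible.width / 2)
    checkedConfigurations += 1
}
```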

…TaskCancellation withTaskCancellationHandler does not fire on SIGINT — it only responds to cooperative Swift Task cancellation. Replace with DispatchSource signal handler so Ctrl-C properly stops recording and saves the file. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
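A DispatchSource-based SIGINT handler generally follows the pattern below. This is a sketch rather than the code from the commit: `installSIGINTHandler` is a hypothetical helper, and the handler body only signals a semaphore where the real command would stop recording and save the file. The example sends SIGINT to itself to show the handler firing.

```swift
import Dispatch
import Foundation

/// Installs a SIGINT handler and returns the source, which the caller must keep alive.
func installSIGINTHandler(queue: DispatchQueue, onInterrupt: @escaping () -> Void) -> DispatchSourceSignal {
    // Ignore the default disposition so the process is not terminated
    // before the dispatch source gets a chance to observe the signal.
    signal(SIGINT, SIG_IGN)
    let source = DispatchSource.makeSignalSource(signal: SIGINT, queue: queue)
    source.setEventHandler { onInterrupt() }
    source.resume()
    return source
}

let interrupted = DispatchSemaphore(value: 0)
let source = installSIGINTHandler(queue: DispatchQueue(label: "signals")) {
    // Hypothetical cleanup point: stop recording and flush the file here.
    interrupted.signal()
}

kill(getpid(), SIGINT)  // simulate Ctrl-C
let result = interrupted.wait(timeout: .now() + 2)
source.cancel()
```

The handler runs on an ordinary dispatch queue, so unlike a raw `signal()` callback it is not restricted to async-signal-safe calls.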

…nner Document-based apps (TextEdit, Pages) show an Open panel on launch that blocks all automation. Instead of auto-dismissing (which would interfere with tasks that need to open a specific file), detect the blocking dialog via window title keywords and include it in the launch_app result as `blocking_dialog: { window_id, title }`. The Planner then decides whether to dismiss (Cmd+N, Escape) or interact with the dialog based on the task. Follows OpenClick's detection approach: title keyword matching (open/save/import/export + Chinese equivalents) with minimum window size filter. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
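Title keyword matching with a minimum size filter could be sketched as follows. The `WindowInfo` type, the specific Chinese keywords, and the size threshold are assumptions made for illustration; the commit does not list the actual values.

```swift
import Foundation

// Hypothetical window description; field names echo the `blocking_dialog` payload.
struct WindowInfo {
    let windowID: Int
    let title: String
    let width: Double
    let height: Double
}

enum DialogDetection {
    // Assumed keyword list: English terms from the commit plus guessed Chinese equivalents.
    static let keywords = ["open", "save", "import", "export", "打开", "存储", "导入", "导出"]
    // Assumed threshold to skip tooltips and other tiny windows.
    static let minimumWidth = 200.0
    static let minimumHeight = 100.0

    /// Returns the first window that is large enough and whose title contains a keyword.
    static func blockingDialog(in windows: [WindowInfo]) -> WindowInfo? {
        windows.first { window in
            guard window.width >= minimumWidth, window.height >= minimumHeight else { return false }
            let title = window.title.lowercased()
            return keywords.contains { title.contains($0) }
        }
    }
}

let windows = [
    WindowInfo(windowID: 1, title: "Open", width: 50, height: 20),     // too small, ignored
    WindowInfo(windowID: 2, title: "Untitled", width: 800, height: 600),
    WindowInfo(windowID: 3, title: "打开", width: 700, height: 450),
]
let dialog = DialogDetection.blockingDialog(in: windows)
```

The detector only reports the dialog; deciding whether to dismiss it or interact with it stays with the Planner, as the commit describes.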

When recording on non-English macOS, app_switch events captured localized names (e.g. "计算器") which launch_app couldn't resolve since the actual file is Calculator.app. Now records bundle_id alongside localized name, and skill compiler prefers bundle_id for launch_app arguments. Also adds localized display name fallback matching in AppLauncherService. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
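The "prefer bundle_id" rule is small enough to show in a few lines. `AppSwitchEvent` and `launchArgument(for:)` are hypothetical names; the commit only states that the bundle identifier is recorded alongside the localized name and preferred at compile time.

```swift
// Hypothetical recorded event shape.
struct AppSwitchEvent {
    let localizedName: String
    let bundleID: String?   // may be absent in recordings made before this fix
}

/// Chooses the launch_app argument: the bundle ID when present, otherwise the recorded name.
func launchArgument(for event: AppSwitchEvent) -> String {
    event.bundleID ?? event.localizedName
}

let withBundleID = launchArgument(for: AppSwitchEvent(localizedName: "计算器", bundleID: "com.apple.calculator"))
let nameOnly = launchArgument(for: AppSwitchEvent(localizedName: "Calculator", bundleID: nil))
```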

UNUserNotificationCenter.current() crashes with NSException when the process has no valid bundle proxy (e.g. running from swift build debug directory). Add a guard to skip notification setup in non-bundle mode. Also add scripts/build-bar-bundle.sh to create a proper .app bundle for development and testing of the AxionBar macOS menu bar app. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

AppMemoryExtractor relied solely on toolResult.isError to classify runs as success/failure, but AxionHelper tools (e.g. launch_app) catch errors and return structured JSON with "error" and "message" fields instead of throwing. This caused the MCP framework to leave isError false, so failures were recorded as successes in App Memory. Add contentContainsErrorPayload() to also detect error payloads in result JSON content, fixing failure tagging in memory entries and profiles. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
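One plausible shape for the payload check is shown below. The function name comes from the commit, but its signature and body are assumptions: this version treats a result as an error payload when it parses as a JSON object containing both an "error" and a "message" key. `runFailed` is a hypothetical helper showing how the two signals combine.

```swift
import Foundation

/// Assumed implementation: true when `content` is a JSON object with "error" and "message" keys.
func contentContainsErrorPayload(_ content: String) -> Bool {
    guard let data = content.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
    else { return false }
    return object["error"] != nil && object["message"] != nil
}

/// A tool call counts as failed if the framework flagged it or the payload says so.
func runFailed(isError: Bool, content: String) -> Bool {
    isError || contentContainsErrorPayload(content)
}

let structuredFailure = #"{"error": "app_not_found", "message": "No such application"}"#
let normalResult = #"{"pid": 1234, "window_id": 7}"#
let failedSilently = runFailed(isError: false, content: structuredFailure)
let succeeded = runFailed(isError: false, content: normalResult)
```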

…LM timeout
- Pass the user's actual input (e.g. credentials) as resume context instead of the fixed string "用户已完成手动操作" ("the user has completed the manual operation")
- Improve the takeover prompt to explain both the manual-desktop and text-input options
- Add a 90s resume watchdog: if the LLM doesn't respond after takeover resume, interrupt and suggest running a follow-up task
- Fix AxionHelper crash when AX elements return NaN/Infinity bounds
- Fix AppBundleTests bundle ID expectation mismatch

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
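A resume watchdog of this kind is commonly built by racing the operation against a timer in a task group. The sketch below uses that generic pattern; `withWatchdog` and `WatchdogError` are hypothetical names, and short timeouts stand in for the 90 seconds mentioned in the commit. It assumes the guarded operation cooperates with task cancellation.

```swift
enum WatchdogError: Error { case timedOut }

/// Runs `operation`, throwing `WatchdogError.timedOut` if it has not finished within `seconds`.
func withWatchdog<T: Sendable>(
    seconds: Double,
    _ operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        group.addTask { try await operation() }
        group.addTask {
            try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
            throw WatchdogError.timedOut
        }
        // Whichever child finishes first decides the outcome; the other is cancelled.
        defer { group.cancelAll() }
        return try await group.next()!
    }
}

// A fast operation completes before the watchdog fires.
let fast = try await withWatchdog(seconds: 1) { () -> String in "ok" }

// A slow operation is interrupted by the watchdog.
var timedOut = false
do {
    _ = try await withWatchdog(seconds: 0.05) { () -> String in
        try await Task.sleep(nanoseconds: 5_000_000_000)
        return "late"
    }
} catch WatchdogError.timedOut {
    timedOut = true
}
```

On timeout, the caller would be the place to surface the "run a follow-up task" suggestion to the user.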

…uming old agent After takeover, close the current agent and create a new one with a minimal context (original task + takeover summary + current screen state). This avoids the 25K+ token context that caused LLM API timeouts with the old resume approach. Inspired by openclick's replan-after-takeover pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Built-in Playwright MCP alongside axion-helper so agents can use DOM-level web interactions (form filling, clicking, navigation) instead of relying on AX tree which doesn't expose web form elements. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… for web tasks Revert the replan-with-fresh-context approach back to agent.resume() since Playwright MCP keeps context small. Update planner prompt to instruct agent to prefer Playwright for any web/URL/browser task, which eliminates the AX tree limitation that caused takeovers in the first place. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Rewrite English and Chinese READMEs with Phase 2 (Memory, HTTP API, MCP Server, Takeover, Fast Mode) and Phase 3 (multi-window, record & skills, menu bar app, SDK ecosystem) features
- Correct MCP tool count from 22 to 21, replace non-existent tools with actual tools (start_recording, stop_recording)
- Add SIGINT handler in RunCommand for graceful Helper cleanup
- Bump open-agent-sdk-swift to 0.3.2, add .playwright-mcp to gitignore

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

terryso and others added 30 commits May 13, 2026 10:09

chore: update Package.resolved for swift-mcp 2.0.4

fee8c07

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chore: update BMad configs and add story automator tooling

9f712c1

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat(story-5.3): feat: add server command API authentication and concurrency (Story 5.3)

db81bb6

feat(story-6.1): feat: add MCP server mode exposing Axion tools via stdio (Story 6.1)

9cf6f8e

feat(story-6.2): feat: add axion mcp CLI command with agent integration (Story 6.2)

9f21341

feat(story-7.1): feat: add pause protocol and user takeover support (Story 7.1)

6bbee7a

feat(story-7.2): feat: add fast mode for quick automation tasks (Story 7.2)

6a7be0b

feat(story-8.3): feat(story-8.3): add window layout management tools

e8172cf

feat(story-9.1): feat(story-9.1): add operation recording engine

491f5cb

feat(story-9.2): feat(story-9.2): add recording compile to reusable skill

bbff9a5

feat(story-9.3): feat(story-9.3): add skill library management and execution

28efb08

feat(story-10.1): feat(story-10.1): add menubar status and service communication

c7f51ca

feat(story-10.2): feat(story-10.2): add task management and realtime panel

ea02d61

feat(story-10.3): feat(story-10.3): add global hotkey and skill quick trigger

775004b

feat(story-11.1): feat(story-11.1): add agent project template and scaffold CLI

85e6bde

feat(story-11.2): update sprint-status and add story spec for plugin tool registration

6cd0311

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat(story-11.2): feat(story-11.2): add plugin tool registration and agent extension

01d2358

feat(story-11.3): feat(story-11.3): add developer documentation and examples

a26156b

terryso and others added 20 commits May 15, 2026 19:49

fix(ci): align test command with Makefile flags

4a52b02

Add --quiet, --skip AxionCLIIntegrationTests, --skip AxionE2ETests to match make test exactly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix(ci): remove --quiet flag for coverage output

a94f7b9

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat(tm-6): migrate AxionCLITests API, MCP, Output (17 files) to Swift Testing

799019c

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat(tm-7): migrate AxionCLITests Commands (14 files) to Swift Testing

3f67314

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat(tm-8): migrate all Integration tests (15 files) to Swift Testing

25a772b

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chore: remove DocumentationTests (external SDK doc validation)

240e3be

These tests validated docs in a sibling open-agent-sdk-swift repo, not Axion's own code. Not relevant for unit test suite. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

terryso merged commit 14e6208 into master May 16, 2026
2 checks passed

terryso merged 50 commits into master from feature/phase3-vision-features
