feat: Phase 2 & 3 — Memory, API, MCP, Takeover, Skills, Menu Bar, SDK Ecosystem #1
Merged
…Story 4.1) Add AppMemoryExtractor to extract operation summaries from SDK message streams, MemoryCleanupService for 30-day expiry cleanup, and Memory status check in axion doctor. RunCommand now collects tool pairs during execution and persists App knowledge entries organized by bundle identifier domain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
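The 30-day expiry cleanup can be pictured as a simple cutoff filter. The sketch below is an assumption about the general shape only: `MemoryEntry`, its fields, and `removeExpired` are hypothetical names, not the actual MemoryCleanupService API.

```swift
import Foundation

// Hypothetical entry type; the real App Memory schema is not shown in this PR.
struct MemoryEntry {
    let domain: String   // e.g. an app bundle identifier
    let createdAt: Date
}

/// Keeps only the entries created within the last `maxAgeDays` days.
func removeExpired(_ entries: [MemoryEntry],
                   now: Date = Date(),
                   maxAgeDays: Int = 30) -> [MemoryEntry] {
    let cutoff = now.addingTimeInterval(-Double(maxAgeDays) * 24 * 60 * 60)
    return entries.filter { $0.createdAt >= cutoff }
}
```

Passing `now` as a parameter keeps the function deterministic in tests.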
…assed) Verified Memory extraction, domain organization, expiry cleanup, corruption resilience, and doctor status reporting with real CLI commands against live API. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…amiliarity tracking (Story 4.2) Implement cross-run learning system that extracts AX tree structure features, identifies high-frequency operation patterns, marks failure experiences, and auto-marks familiar apps after 3+ successful runs. Fix tool name mismatch (get_ax_tree → get_accessibility_tree) found during acceptance testing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
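A minimal sketch of the "familiar after 3+ successful runs" rule, assuming a plain per-app counter. `FamiliarityRecord` is a hypothetical type for illustration; the real tracker may store more state.

```swift
// Hypothetical per-app record; only successful runs count toward familiarity.
struct FamiliarityRecord {
    private(set) var successfulRuns = 0
    let threshold: Int

    init(threshold: Int = 3) {
        self.threshold = threshold
    }

    /// True once the app has reached the required number of successful runs.
    var isFamiliar: Bool { successfulRuns >= threshold }

    mutating func record(success: Bool) {
        if success { successfulRuns += 1 }
    }
}
```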
…management commands (Story 4.3) Inject accumulated App Memory (profiles, patterns, failures, familiarity) into Planner system prompt for more accurate plans. Add axion memory list/clear commands and --no-memory flag. All 6 ACs verified via manual acceptance testing (11/11 pass) and 578 unit tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add Hummingbird-based HTTP API server with REST endpoints for submitting and querying desktop automation tasks. Includes server subcommand, async task execution via AgentRunner, actor-based RunTracker, and comprehensive unit tests (624 tests, 0 failures). Manual acceptance: 10/10 passed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
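An actor is a natural fit for tracking runs because it serializes access to the shared run table across concurrent HTTP requests. The following is a sketch under assumptions: `RunTrackerSketch`, `RunStatus`, and the method names are hypothetical and do not mirror the actual RunTracker.

```swift
import Foundation

// Hypothetical status model for a submitted task.
enum RunStatus: Equatable {
    case running
    case completed
    case failed(String)
}

actor RunTrackerSketch {
    private var runs: [String: RunStatus] = [:]

    /// Registers a new run and returns its identifier.
    func start() -> String {
        let id = UUID().uuidString
        runs[id] = .running
        return id
    }

    /// Records the final status of a known run; unknown IDs are ignored.
    func finish(_ id: String, with status: RunStatus) {
        guard runs[id] != nil else { return }
        runs[id] = status
    }

    /// Returns the current status, or nil if the run ID is unknown.
    func status(of id: String) -> RunStatus? {
        runs[id]
    }
}
```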
The open-agent-sdk-swift on GitHub now requires swift-mcp 2.0.0. Update axion to match and adapt to breaking API changes:
- Remove ParameterValue conformance (replaced by @Schemable macro)
- Change Tool.Content to ContentBlock (renamed in swift-mcp 2.0)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement Server-Sent Events (SSE) endpoint for real-time monitoring of agent task execution. Includes EventBroadcaster actor for multi-client pub/sub, SSE event models, replay buffer for completed runs, and integration with AgentRunner for step-level event emission. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
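The pub/sub-with-replay idea can be sketched with an actor that fans events out to `AsyncStream` continuations. This is an illustration under assumptions, not the EventBroadcaster from this PR: events are plain strings, every event is buffered, and `AsyncStream.makeStream` requires Swift 5.9 or later.

```swift
import Foundation

actor EventBroadcasterSketch {
    private var subscribers: [AsyncStream<String>.Continuation] = []
    private var replayBuffer: [String] = []
    private var finished = false

    /// Subscribes a client. Buffered events are delivered first, so a client
    /// that connects late (or after completion) still sees the whole run.
    func subscribe() -> AsyncStream<String> {
        let (stream, continuation) = AsyncStream<String>.makeStream()
        for event in replayBuffer {
            continuation.yield(event)
        }
        if finished {
            continuation.finish()
        } else {
            subscribers.append(continuation)
        }
        return stream
    }

    /// Buffers the event and forwards it to every live subscriber.
    func publish(_ event: String) {
        guard !finished else { return }
        replayBuffer.append(event)
        for continuation in subscribers {
            continuation.yield(event)
        }
    }

    /// Marks the run as complete and closes all live streams.
    func finish() {
        finished = true
        for continuation in subscribers {
            continuation.finish()
        }
        subscribers.removeAll()
    }
}
```

Removing a subscriber when its client disconnects is omitted here for brevity; a real SSE endpoint would need it to avoid holding continuations for dead connections.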
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…urrency (Story 5.3)
Add hands-on acceptance testing document covering Epic 4 (Memory), Epic 5 (HTTP API), Epic 6 (MCP Server), and Epic 7 (Takeover/Fast Mode) with real commands for verification. Includes review patches from Stories 4.1-4.3 (code fixes, test improvements, security hardening), epic retrospectives, and sprint status updates.
Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
- Add 30 new QA automate tests across Stories 8.1/8.2/8.3 (CrossAppWorkflowTests, PlannerPromptMultiWindowTests, TraceWindowContextTests, WindowManagementToolTests additions)
- Add Epic 8 retrospective document
- Update README tool count from 16 to 22
- Update sprint-status: epic-8-retrospective done
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Fix SkillRunCommand --allow-foreground flag defaulting to true (ArgumentParser validation error)
- Add Epic 9 manual acceptance test document with real command verification
- Update manual-e2e-test-checklist.md with recording/skill test cases
- Update README with Record and Replay Skills section
- Update sprint-status, project-context, and epics for Epic 9 completion
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tool registration Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
StatusBarController.sendNotification used UNUserNotificationCenter directly, which crashes in test environment without an app bundle. Extract NotificationSending protocol with injectable mock to enable isolated testing. Also includes Epic 10/11 retrospective updates and sprint-status sync. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
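The protocol extraction described above follows the usual dependency-injection pattern. A minimal sketch, assuming a single `send(title:body:)` requirement; the method signature, `MockNotificationSender`, and `StatusBarControllerSketch` are hypothetical, and only the protocol name `NotificationSending` comes from the commit.

```swift
// Assumed shape of the protocol; the real requirements may differ.
protocol NotificationSending {
    func send(title: String, body: String)
}

// Test double that records notifications instead of touching
// UNUserNotificationCenter, so it works without an app bundle.
final class MockNotificationSender: NotificationSending {
    private(set) var sent: [(title: String, body: String)] = []

    func send(title: String, body: String) {
        sent.append((title: title, body: body))
    }
}

// Stand-in for the controller: it depends on the protocol, not the system API.
final class StatusBarControllerSketch {
    private let notifier: NotificationSending

    init(notifier: NotificationSending) {
        self.notifier = notifier
    }

    func taskDidFinish(named name: String) {
        notifier.send(title: "Axion", body: "Task finished: \(name)")
    }
}
```

In production code the injected value would be a thin wrapper around UNUserNotificationCenter; tests pass the mock and assert on `sent`.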
… Testing
Migrate 24 test files from XCTest to Swift Testing framework:
- TM-1: AxionCoreTests (11 files)
- TM-2: AxionHelperTests Models (4), Services (7), MCP (1)
Key changes: import XCTest → import Testing, class → @Suite struct, func test_xxx → @Test("xxx") func xxx(), XCTAssert* → #expect(), XCTUnwrap → try #require(), XCTAssertThrowsError → #expect(throws:). ServiceContainerTests uses .serialized to prevent parallel race conditions.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Migrate HelperMCPServerTests and HelperProcessSmokeTests from XCTest to Swift Testing (@Suite/@Test/#expect). Includes prior TM-2/TM-3 changes for ServiceContainerTests and all Tool test files. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Migrate Config, Planner, Executor, Engine, Verifier, IO, Helper, and Trace test files from XCTest to Swift Testing framework. setUp/tearDown converted to init/deinit with ~Copyable structs. Env-var-dependent tests use EnvGate actor to isolate global state. Add --no-parallel --quiet to both Makefile and CI to prevent parallel test races. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add --quiet, --skip AxionCLIIntegrationTests, --skip AxionE2ETests to match make test exactly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…t Testing Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Migrate 10 final XCTest files to Swift Testing:
- AxionE2ETests (5): CorePipeline, HelperLifecycle, MockLLM, RealLLM, Helpers
- AxionCLITests/Memory (5): AppMemoryExtractor, AppProfileAnalyzer, FamiliarityTracker, MemoryCleanupService, MemoryContextProvider
Fix CI failure: DocumentationTests now skips when SDK repo is unavailable. `import XCTest` fully eliminated from Tests/. All 1561 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
These tests validated docs in a sibling open-agent-sdk-swift repo, not Axion's own code. Not relevant for unit test suite. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The arrange_windows implementation uses NSScreen.visibleFrame whose origin.y includes the menu bar height (62px on CI). Changed absolute y==0 assertions to relative offsets (y differences) so tests pass regardless of screen configuration. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…TaskCancellation withTaskCancellationHandler does not fire on SIGINT — it only responds to cooperative Swift Task cancellation. Replace with DispatchSource signal handler so Ctrl-C properly stops recording and saves the file. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
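A sketch of the DispatchSource approach, assuming the handler only needs to run a cleanup closure. `installSIGINTHandler` is a hypothetical helper name; the Dispatch and signal calls are standard.

```swift
import Foundation

/// Installs a SIGINT handler that runs `onInterrupt` on `queue`.
/// The returned source must be kept alive for as long as the handler should fire.
func installSIGINTHandler(queue: DispatchQueue = .main,
                          onInterrupt: @escaping @Sendable () -> Void) -> DispatchSourceSignal {
    // Ignore the default disposition so Ctrl-C does not terminate the
    // process before the Dispatch handler gets a chance to run.
    signal(SIGINT, SIG_IGN)

    let source = DispatchSource.makeSignalSource(signal: SIGINT, queue: queue)
    source.setEventHandler {
        onInterrupt()
    }
    source.resume()
    return source
}

// Example use (commented out so the sketch has no side effects when loaded):
// let source = installSIGINTHandler {
//     print("Stopping recording and saving file")
//     exit(0)
// }
// dispatchMain()
```

If the handler is installed on the main queue, the main queue has to be serviced (for example via `dispatchMain()` or a run loop), otherwise the handler never runs.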
…nner
Document-based apps (TextEdit, Pages) show an Open panel on launch that
blocks all automation. Instead of auto-dismissing (which would interfere
with tasks that need to open a specific file), detect the blocking dialog
via window title keywords and include it in the launch_app result as
`blocking_dialog: { window_id, title }`. The Planner then decides whether
to dismiss (Cmd+N, Escape) or interact with the dialog based on the task.
Follows OpenClick's detection approach: title keyword matching (open/save/
import/export + Chinese equivalents) with minimum window size filter.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
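The detection approach above (title keyword matching plus a minimum window size) can be sketched as follows. Everything here is illustrative: the types, the Chinese keyword list, and the size thresholds are assumptions, not values taken from the implementation.

```swift
import Foundation

// Hypothetical window model; the helper's real window type is not shown here.
struct WindowInfo {
    let windowID: Int
    let title: String
    let width: Double
    let height: Double
}

// Mirrors the `blocking_dialog: { window_id, title }` shape from the commit message.
struct BlockingDialog: Equatable {
    let windowID: Int
    let title: String
}

// English keywords come from the commit message; the Chinese ones are assumed.
let dialogKeywords = ["open", "save", "import", "export", "打开", "保存", "导入", "导出"]

/// Returns the first sufficiently large window whose title contains a dialog keyword.
func detectBlockingDialog(in windows: [WindowInfo],
                          minWidth: Double = 200,
                          minHeight: Double = 100) -> BlockingDialog? {
    for window in windows where window.width >= minWidth && window.height >= minHeight {
        let title = window.title.lowercased()
        if dialogKeywords.contains(where: { title.contains($0) }) {
            return BlockingDialog(windowID: window.windowID, title: window.title)
        }
    }
    return nil
}
```

Plain substring matching is deliberately loose, so a document titled "Export notes" would also match; the size filter only screens out small auxiliary windows.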
When recording on non-English macOS, app_switch events captured localized names (e.g. "计算器") which launch_app couldn't resolve since the actual file is Calculator.app. Now records bundle_id alongside localized name, and skill compiler prefers bundle_id for launch_app arguments. Also adds localized display name fallback matching in AppLauncherService. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
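The "prefer bundle_id" rule might look like the sketch below. The event type and the argument keys (`bundle_id`, `app_name`) are assumptions for illustration; the actual launch_app argument names are not shown in this PR.

```swift
// Hypothetical recorded event: localized name plus an optional bundle identifier.
struct AppSwitchEvent {
    let localizedName: String
    let bundleID: String?
}

/// Builds launch arguments, preferring the locale-independent bundle identifier
/// and falling back to the recorded (possibly localized) display name.
func launchAppArguments(for event: AppSwitchEvent) -> [String: String] {
    if let bundleID = event.bundleID, !bundleID.isEmpty {
        return ["bundle_id": bundleID]
    }
    return ["app_name": event.localizedName]
}
```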
UNUserNotificationCenter.current() crashes with NSException when the process has no valid bundle proxy (e.g. running from swift build debug directory). Add a guard to skip notification setup in non-bundle mode. Also add scripts/build-bar-bundle.sh to create a proper .app bundle for development and testing of the AxionBar macOS menu bar app. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AppMemoryExtractor relied solely on toolResult.isError to classify runs as success/failure, but AxionHelper tools (e.g. launch_app) catch errors and return structured JSON with "error" and "message" fields instead of throwing. This caused the MCP framework to leave isError false, so failures were recorded as successes in App Memory. Add contentContainsErrorPayload() to also detect error payloads in result JSON content, fixing failure tagging in memory entries and profiles. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
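A minimal sketch of the payload check, assuming an error payload is a top-level JSON object that carries both an "error" and a "message" field. The function names below are hypothetical stand-ins; the real contentContainsErrorPayload() may apply different rules.

```swift
import Foundation

/// Returns true when `content` parses as a JSON object with both
/// an "error" and a "message" key.
func contentContainsErrorPayloadSketch(_ content: String) -> Bool {
    guard let data = content.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
    else {
        return false
    }
    return object["error"] != nil && object["message"] != nil
}

/// A result counts as a failure if the framework flagged it
/// or if the content itself carries an error payload.
func isFailure(isError: Bool, content: String) -> Bool {
    isError || contentContainsErrorPayloadSketch(content)
}
```

Non-JSON content and JSON arrays fall through to `false`, so ordinary text results are never misclassified as failures.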
…LM timeout
- Pass user's actual input (e.g. credentials) as resume context instead of the fixed "用户已完成手动操作" ("User has completed the manual operation") string
- Improve takeover prompt to explain both manual-desktop and text-input options
- Add 90s resume watchdog: if the LLM doesn't respond after takeover resume, interrupt and suggest running a follow-up task
- Fix AxionHelper crash when AX elements return NaN/Infinity bounds
- Fix AppBundleTests bundle ID expectation mismatch
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
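For the NaN/Infinity bounds fix, one defensive pattern is to validate bounds before converting or encoding them. In Swift, `Int(Double.nan)` traps at runtime and `JSONEncoder` throws on non-finite numbers by default. The commit does not say which path crashed, so this is only a sketch with a hypothetical `Bounds` type.

```swift
// Hypothetical bounds model for an accessibility element.
struct Bounds: Equatable {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// Returns the bounds unchanged when every component is finite,
/// or nil when any component is NaN or infinite.
func finiteBounds(_ bounds: Bounds) -> Bounds? {
    let values = [bounds.x, bounds.y, bounds.width, bounds.height]
    guard values.allSatisfy({ $0.isFinite }) else { return nil }
    return bounds
}
```

Callers can then skip or omit elements whose bounds come back as nil instead of passing bad values downstream.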
…uming old agent After takeover, close the current agent and create a new one with a minimal context (original task + takeover summary + current screen state). This avoids the 25K+ token context that caused LLM API timeouts with the old resume approach. Inspired by OpenClick's replan-after-takeover pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add built-in Playwright MCP alongside axion-helper so agents can use DOM-level web interactions (form filling, clicking, navigation) instead of relying on the AX tree, which doesn't expose web form elements. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… for web tasks Revert the replan-with-fresh-context approach back to agent.resume() since Playwright MCP keeps context small. Update planner prompt to instruct agent to prefer Playwright for any web/URL/browser task, which eliminates the AX tree limitation that caused takeovers in the first place. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Rewrite English and Chinese READMEs with Phase 2 (Memory, HTTP API, MCP Server, Takeover, Fast Mode) and Phase 3 (multi-window, record & skills, menu bar app, SDK ecosystem) features
- Correct MCP tool count from 22 to 21, replace non-existent tools with actual tools (start_recording, stop_recording)
- Add SIGINT handler in RunCommand for graceful Helper cleanup
- Bump open-agent-sdk-swift to 0.3.2, add .playwright-mcp to gitignore
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
This PR merges Phase 2 (Growth Features) and Phase 3 (Vision Features) into master, bringing Axion from an MVP CLI tool to a full desktop automation platform. 50 commits, 569 files changed, 71k+ lines added.
Phase 2 — Growth Features (Epic 4–7)
- … (`axion memory list/clear`, `--no-memory`)
- … (`axion server`, task submission, real-time progress, auth, concurrency)
- … (`axion mcp`)
- … `--fast` mode reduces LLM calls for simple tasks
Phase 3 — Vision Features (Epic 8–11)
- … (`arrange_windows` with tile/cascade), blocking dialog detection
- … (`axion record`, `axion skill compile/run/list/delete`)
- … (`@Tool` macro), developer documentation and examples
Other Changes
Test plan
- … (`swift test --filter` for Tools/Models/MCP/Services/Core/CLI)
- … `axion run "Open Calculator"` on clean master merge
- … `axion server`, `axion mcp`, `axion record` commands work end-to-end
🤖 Generated with Claude Code