Conversation
lilyshen0722
added a commit
that referenced
this pull request
Apr 15, 2026
…hook SDK); amend ADR-003

Three new ADRs codifying the driver surface, plus a revision-history amendment to ADR-003 reflecting what actually shipped in Phases 1, 1.1, 2a, 2b and removing the OpenClaw coupling from its migration path.

## ADR-004 — Commonly Agent Protocol (CAP)

Freezes the driver-facing HTTP surface at four concepts across six routes: poll (`GET /events`), ack (`POST /events/:id/ack`), post (`POST /pods/:id/messages`), memory (`GET/PUT /memory` + `POST /memory/sync`). Bearer-token auth, pull-only, at-least-once delivery, runtime-opaque kernel. Non-goals explicitly close the door on push model, gRPC, streaming, published SDK packages, per-handler rate limits.

## ADR-005 — Local CLI Wrapper Driver

`commonly agent attach <cli>` + `run` pair. Adapter pattern: one file per wrapped CLI (`claude`, `codex`, `cursor`, `gemini` in v1; `openclaw` deliberately not — it stays a native driver among many). Adapters are pure argv-spawn shims (~30-60 LOC each). Session continuity via `~/.commonly/sessions.json`; memory bridge transparently reads `sections.long_term` before spawn and syncs back after. Serialized per agent; pull-only; CLI auth lives in the wrapped CLI's own home directory.

## ADR-006 — Webhook SDK + Self-Serve Install

Ships a single-file Python SDK + Node SDK (~80 LOC each, no deps beyond stdlib + Node built-in fetch), a `commonly agent init --language python|node` scaffolder, and formalizes the existing `runtimeType: 'webhook'` as self-serve for authed users under the invite-only posture. `createdBy` stamps every install for audit; pod-scope only; instance and DM scope stay admin-gated. SDK is live-copy (not published) in v1 — publishing packages waits for proof via external driver authors.

## ADR-003 amendment

- Revision-history block added naming Phases 1/1.1/2a/2b PRs.
- 4 new load-bearing invariants capturing what shipped: cross-writer dedup invalidation; server-stamped byteSize/updatedAt/schemaVersion; canonical-stringify for dedup keys; mode-dependent array-section merge semantics.
- Runtime driver table reordered: OpenClaw moved from first row to one-of-many; native, local-CLI-wrapper (ADR-005), and webhook-SDK (ADR-006) named as peer drivers.
- Phase 3 reframed driver-agnostic. Concrete Phase-3 deliverable added: two-driver cross-check (one CLI-wrapper agent + one webhook-SDK agent in one pod, each reading/writing its own memory) as end-to-end proof that memory is kernel-shaped, not OpenClaw-shaped.
- Tool-surface table marks which tools are shipped (Phase 2b) vs future (Phase 4).

## Reviewer pass (subagent, pre-commit)

Approve with suggestions. One Critical fixed (ADR-004 "four verbs" vs "five routes" mismatch — reframed as 4 concepts / 6 routes). All 6 Important + relevant Nits applied:

- ADR-006 correctly reflects that `runtimeType: 'webhook'` already exists (formalize + drop admin-gate, not a new type introduction).
- ADR-006 SDK `poll_events` arg clarified — CAP v1 has no long-poll; callers sleep between calls; revisit when ADR-004 open-question #2 resolves.
- ADR-006 `--secret` / `webhookSecret` marked reserved for future push driver ADR, not dead code.
- ADR-005 Phase 1 split into 1a (skeleton + run-loop) and 1b (claude adapter + memory bridge) so the memory bridge gets an isolated review pass as the first cross-driver memory consumer.
- ADR-005 §Memory bridge now explicitly calls out that the wrapper supplies `content` + `visibility` only; byteSize/updatedAt/schemaVersion are server-stamped (per ADR-003 invariant #9).
- ADR-004 rate-limiting non-goal rewritten without PR reference; policy stated directly.

CLAUDE.md Key Documentation Files updated with ADR-004/005/006 entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
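The "canonical-stringify for dedup keys" invariant can be illustrated with a minimal sketch. The commit does not show the serialization ADR-003 actually uses, so the sorted-key JSON form, the SHA-256 hash, and the `visibility` value below are assumptions chosen only to show why key order must not affect the dedup key.

```python
import hashlib
import json


def canonical_stringify(value):
    """Serialize JSON-compatible data with sorted keys and no extra whitespace.

    Assumed canonical form; the real ADR-003 rules are not shown in this PR.
    """
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def dedup_key(section):
    """Hash the canonical form so logically equal payloads share one key."""
    return hashlib.sha256(canonical_stringify(section).encode("utf-8")).hexdigest()


# Two writers that build the same section with different key order.
a = {"content": "remember the deploy window", "visibility": "private"}
b = {"visibility": "private", "content": "remember the deploy window"}
print(dedup_key(a) == dedup_key(b))
```

Without a canonical form, two drivers that emit the same section with keys in a different order would produce different dedup keys and the write would not be recognized as a duplicate.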
lilyshen0722
pushed a commit
that referenced
this pull request
Apr 15, 2026
Builds on Phase 1a (#194). Wires the first real CLI adapter and the ADR-003 memory bridge into the local-CLI wrapper's run loop.

- cli/src/lib/adapters/claude.js: subprocess wrapper for the `claude` CLI. detect() runs `claude --version` (+ best-effort `which` for path display). spawn() builds argv `-p <prompt> --output-format text --session-id <sid>`, prepends the §Memory bridge preamble when ctx.memoryLongTerm is non-empty, mints a UUID on first turn and returns it as newSessionId so the run loop persists it, 5-min SIGTERM timeout. Test seam: `ctx._spawnImpl` swaps out child_process.spawn for unit tests — ADR-005 §Adapter pattern now documents the pattern so future adapters don't each invent a different one.
- cli/src/lib/memory-bridge.js: two CAP shims the run loop calls around every spawn. readLongTerm() extracts sections.long_term.content from GET /memory (returns '' for 404/network errors; non-404 HTTP errors surface via onError instead of swallowing). syncBack() POSTs /memory/sync with mode:'patch', sourceRuntime:'local-cli', content+visibility ONLY per ADR-003 invariant #9. No-op on empty summary.
- cli/src/commands/agent.js: performRun now calls readLongTerm before spawn → injects ctx.memoryLongTerm; if adapter returns memorySummary, calls syncBack after post. Sync failure is non-fatal (pod message already posted; original err preserved via `{ cause }` for CLI debug).
- Tests: 3 new suites — adapters.claude.test.mjs (detect + spawn argv + preamble + timeout), memory-bridge.test.mjs (including the 404-vs-non-404 split), run-loop.test.mjs extended to verify end-to-end memory wiring. 52/52 passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
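The argv construction described for the `claude` adapter can be sketched as follows. The real adapter is JavaScript; this Python version is only an illustration. The flags come from the commit message, while the preamble wording, the function name `build_claude_argv`, and returning `None` on later turns are assumptions.

```python
import uuid

# Hypothetical preamble text; the actual wording is defined in ADR-005 §Memory bridge.
MEMORY_PREAMBLE = "Long-term memory:\n"


def build_claude_argv(prompt, session_id=None, memory_long_term=""):
    """Return (argv, new_session_id) for one adapter turn.

    new_session_id is set only when a session id was minted on this turn,
    so the caller knows it has something new to persist.
    """
    new_session_id = None
    if session_id is None:
        # First turn: mint a UUID and report it back to the run loop.
        session_id = str(uuid.uuid4())
        new_session_id = session_id
    if memory_long_term:
        # Prepend the memory preamble only when there is memory to inject.
        prompt = f"{MEMORY_PREAMBLE}{memory_long_term}\n\n{prompt}"
    argv = [
        "claude",
        "-p", prompt,
        "--output-format", "text",
        "--session-id", session_id,
    ]
    return argv, new_session_id
```

Keeping argv construction in a pure function, separate from the spawn itself, is what makes a test seam like `ctx._spawnImpl` cheap: the argv can be asserted without starting a subprocess.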
samxu01
pushed a commit
that referenced
this pull request
Apr 15, 2026
Builds on Phase 1a (#194). Wires the first real CLI adapter and the ADR-003 memory bridge into the local-CLI wrapper's run loop.

- cli/src/lib/adapters/claude.js: subprocess wrapper for the `claude` CLI. detect() runs `claude --version` (+ best-effort `which` for path display). spawn() builds argv `-p <prompt> --output-format text --session-id <sid>`, prepends the §Memory bridge preamble when ctx.memoryLongTerm is non-empty, mints a UUID on first turn and returns it as newSessionId so the run loop persists it, 5-min SIGTERM timeout. Test seam: `ctx._spawnImpl` swaps out child_process.spawn for unit tests — ADR-005 §Adapter pattern now documents the pattern so future adapters don't each invent a different one.
- cli/src/lib/memory-bridge.js: two CAP shims the run loop calls around every spawn. readLongTerm() extracts sections.long_term.content from GET /memory (returns '' for 404/network errors; non-404 HTTP errors surface via onError instead of swallowing). syncBack() POSTs /memory/sync with mode:'patch', sourceRuntime:'local-cli', content+visibility ONLY per ADR-003 invariant #9. No-op on empty summary.
- cli/src/commands/agent.js: performRun now calls readLongTerm before spawn → injects ctx.memoryLongTerm; if adapter returns memorySummary, calls syncBack after post. Sync failure is non-fatal (pod message already posted; original err preserved via `{ cause }` for CLI debug).
- Tests: 3 new suites — adapters.claude.test.mjs (detect + spawn argv + preamble + timeout), memory-bridge.test.mjs (including the 404-vs-non-404 split), run-loop.test.mjs extended to verify end-to-end memory wiring. 52/52 passing.

Co-authored-by: Lily Shen <lilyshen20021002@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
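The 404-vs-non-404 split in the memory bridge can be sketched like this. It is a Python illustration of behavior the commit describes for a JavaScript module; the `HttpError` class, the injected `get_memory` callable, returning `''` after a surfaced error, and the nesting of the sync payload under `sections.long_term` are all assumptions.

```python
class HttpError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def read_long_term(get_memory, on_error=None):
    """Return sections.long_term.content, or '' when memory is absent or unreachable."""
    try:
        doc = get_memory()  # stands in for GET /memory
    except HttpError as err:
        if err.status != 404 and on_error is not None:
            on_error(err)  # non-404 HTTP errors are surfaced, not swallowed
        return ""
    except OSError:
        return ""  # network failure is treated like "no memory yet"
    long_term = (doc.get("sections") or {}).get("long_term") or {}
    return long_term.get("content") or ""


def build_sync_payload(summary, visibility="private"):
    """Build the POST /memory/sync body, or None when there is nothing to sync."""
    if not summary:
        return None  # no-op on empty summary
    return {
        "mode": "patch",
        "sourceRuntime": "local-cli",
        # Only content + visibility; byteSize/updatedAt/schemaVersion are server-stamped.
        "sections": {"long_term": {"content": summary, "visibility": visibility}},
    }
```

The point of the split is that a 404 is an expected state for a new agent, while a 500 or 401 usually signals a problem the operator should see.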
lilyshen0722
pushed a commit
that referenced
this pull request
Apr 15, 2026
… install
Three artifacts in one PR per ADR-006 §Migration path Phase 1:
1. Backend self-serve webhook install
- models/AgentRegistry.ts: `ephemeral?: boolean`; search() filters ephemeral
rows so they never appear in the marketplace catalog. Direct getByName()
still resolves them for install/uninstall.
- routes/registry/install.ts: pod-member who installs a webhook agent with
no published manifest gets an ephemeral registry row synthesized in
their name. Membership check fronts everything; non-webhook installs
without manifest still 404; podId required (ADR-006 invariant 7);
agentName regex-validated (returns 400 vs Mongoose 500); structured
[cap self-serve-install] log fires on every synthesis.
- routes/registry/pod-agents.ts: comment noting the ephemeral GC gap
(ADR-006 OQ #1) for the day someone wires the janitor.
- __tests__/integration/self-serve-install.test.js: 5 cases — happy path
(token works), non-webhook 404, malformed name 400, non-member 403,
catalog excludes ephemeral.
2. Python SDK
- examples/sdk/python/commonly.py: ~150 LOC, stdlib only. Class Commonly
with the four CAP concepts as five methods (poll_events / ack /
post_message / get_memory / sync_memory) + a run() loop. CommonlyError
carries .status + .body.
ADR-003 invariant #9 documented in sync_memory; run() docstring matches
the actual no-ack-on-handler-error semantics (kernel re-delivers).
- examples/hello-world-python/bot.py: ~50 LOC echo template; reads token
from COMMONLY_TOKEN env or KEY=VALUE-formatted .commonly-env file.
3. CLI scaffolder
- cli/src/commands/agent.js: `init --language python --name <n> --pod <id>`
subcommand and an exported performInit core. Refuses to clobber any of
the 3 output files (`commonly.py`, `<name>.py`, `.commonly-env`); writes
.commonly-env with mode 0600 in KEY=VALUE format. Reads SDK + template
via import.meta.url (works from any cwd; will need bundling at Phase 4
publish — comment links to ADR section).
- cli/__tests__/agent-init.test.mjs: 4 cases — full happy path with
byte-for-byte SDK/template equality, /runtime-tokens fallback, clobber
refusal (asserts NO install POST issued), unknown-language reject.
57/57 CLI tests, 5/5 backend integration tests, type-check clean.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
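The no-ack-on-handler-error semantics mentioned for `run()` can be sketched as a single poll iteration. The method names `poll_events` and `ack` come from the commit message; the event shape (`{"id": ...}`), the `run_once` helper, and the `FakeClient` stand-in are assumptions for illustration and are not the SDK's actual code.

```python
def run_once(client, handler):
    """Poll once; ack only the events whose handler returned without raising."""
    acked = []
    for event in client.poll_events():
        try:
            handler(event)
        except Exception:
            # No ack: under at-least-once delivery the kernel re-delivers this event.
            continue
        client.ack(event["id"])
        acked.append(event["id"])
    return acked


class FakeClient:
    """In-memory stand-in for the SDK client, used only to exercise run_once."""

    def __init__(self, events):
        self.pending = list(events)
        self.acked = []

    def poll_events(self):
        return list(self.pending)

    def ack(self, event_id):
        self.acked.append(event_id)
        self.pending = [e for e in self.pending if e["id"] != event_id]


def flaky_handler(event):
    if event["id"] == "e2":
        raise RuntimeError("handler failed")


client = FakeClient([{"id": "e1"}, {"id": "e2"}, {"id": "e3"}])
print(run_once(client, flaky_handler), [e["id"] for e in client.pending])
```

Because a failed event stays pending and is delivered again on a later poll, handlers written against this loop should be idempotent.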
samxu01
added a commit
that referenced
this pull request
May 4, 2026
…end state

v3 and v4 made two assumptions that turned out to be wrong:

1. Designed a new /v2/marketplace page from scratch — but the route already exists and renders v1 AppsMarketplacePage (771 lines) with Discover|Installed sub-tabs. /v2/agents/browse and /v2/skills also exist as separate v2 routes rendering v1 components. The right move is to extend AppsMarketplacePage in place with kind tabs (Apps · Agents · Skills · Integrations), not author a new file.
2. Framed /api/marketplace/* (Randy, PR #215+#230) as legacy AR — but per #215's description, it's "ADR-001 Phase 2 for marketplace operations — first dual-write path between Installable (canonical) and AgentRegistry (compat shim)." The marketplace backend is Installable-canonical TODAY; what was missing is frontend wiring.

v5 corrections:

- Part 4a rewritten: extend existing AppsMarketplacePage with kind tabs, wire each tab to the right backend (Agents → Installable, Skills → AR for now, Apps/Integrations unchanged). Soft-redirect /v2/agents/browse and /v2/skills with deep-link query params.
- Part 4d shrunk: the shared piece is a small <MarketplaceCardList> subcomponent (~250 lines) extracted FROM the page in Phase 3 and reused BY the inspector tabs in Phase 4. v4's heavyweight <V2MarketplaceList> was over-engineering.
- Phasing collapsed from 8 active to 5 active. No new top-level page to design; v1 deprecation simplifies to one cleanup phase.
- §"Relationship to Installable taxonomy" rewritten to reflect the per-kind split — agents are already on Installable, skills are still on AR. ADR-013 is the marketplace-frontend track from ADR-011 by definition for the agent surface.
- Open question #7 (accelerate Phase 3?) closed/re-framed: the agent read-path switch is implicitly done by Randy's dual-write; the skills read-path switch defers until separately motivated.
- Open question #8 (nav-rail placement) closed: page already mounted.
- Open question #9 (fork ownership) narrowed: Randy's /api/marketplace/mine likely covers it; just verify response shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samxu01
added a commit
that referenced
this pull request
May 4, 2026
…ll surfaces (Marketplace + inspector tabs) (#287)

* docs(adr): ADR-013 agent file production — extension tool, dev skill bundles, v2 install UI

Decision spans four coordinated changes:

1. `commonly_attach_file` extension tool wrapping the upload endpoint and `[[upload:...]]` directive — protocol glue, not a skill, lives alongside `commonly_post_message` in the OpenClaw `commonly` extension.
2. Toolchain in `clawdbot-gateway` Dockerfile centered on OfficeCLI (Apache-2.0, pinned 30 MB static binary) for DOCX/XLSX/PPTX, plus pandoc for md→PDF, markitdown + pypdf for parse direction. ~170 MB image growth, replacing the ~600 MB python-office + LibreOffice stack the first draft proposed before OfficeCLI discovery.
3. Default skill bundles for the four production dev presets (Theo / Nova / Pixel / Ops) — currently shipping with empty `defaultSkills: []`. Adds `github`, `officecli`, `pandic-office`, `markdown-converter`, `pdf` (plus `tmux` for the engineers).
4. V2 inspector Skills tab (<400 lines, vs v1's 2,061-line catalog page) — search + installed list + recommended row + escape hatch to v1 Browse.

Includes "Relationship to Installable taxonomy (ADR-001)" mapping each piece to the future Installable model and the migration story when ADR-001 Phase 3 (currently paused per ADR-011) unpauses. None of the work in this ADR becomes throwaway when Phase 3 lands; the data source flips, the wire shape doesn't.

Rejected alternatives captured: server-side `/render` service, copying Anthropic's proprietary skills bundle, authoring duplicate Commonly-specific office skills, sub-agents per format, skill-only approach (no kernel tool), and the python-office + LibreOffice stack the first draft proposed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(adr): ADR-013 v4 — expand Part 4 to cover full v2 install surfaces

v3 had Part 4 scoped to "inspector Skills tab only" with agent install/fork pushed entirely out of scope. That left the user-asked v2 marketplace page and Agents tab undesigned. v4 expands Part 4 into four sections:

4a. Top-level V2 Marketplace page (/v2/marketplace) — unified Agents + Skills browse, replaces v1 AgentsHub.tsx (4954 lines) and SkillsCatalogPage.tsx (2061 lines) over time. Wires on top of existing /api/marketplace/* (PR #215+#230) and /api/skills/*.
4b. Inspector Skills tab — per-pod contextual install. Shows which agents have which skills.
4c. Inspector Agents tab — per-pod agent membership + install/fork affordances. Per-installed-agent: Talk to / Configure / Fork. Per-browse-agent: Install / Fork.
4d. Shared <V2MarketplaceList> component (~400 lines) consumed by both inspector tabs and the Marketplace page. Future apps and widgets ports cleanly onto the same component.

Phasing expanded from 4 phases to 8 active + 2 deferred to cover the new surfaces and a v1 deprecation cycle. Open questions #8–10 added covering nav-rail placement, fork ownership backend, and PersonalityBuilder UX migration.

This ADR now also lands the active "Marketplace frontend" track from ADR-011 alongside the file-production work — they're the same surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(adr): ADR-013 v5 — correct framing after verifying backend/frontend state

v3 and v4 made two assumptions that turned out to be wrong:

1. Designed a new /v2/marketplace page from scratch — but the route already exists and renders v1 AppsMarketplacePage (771 lines) with Discover|Installed sub-tabs. /v2/agents/browse and /v2/skills also exist as separate v2 routes rendering v1 components. The right move is to extend AppsMarketplacePage in place with kind tabs (Apps · Agents · Skills · Integrations), not author a new file.
2. Framed /api/marketplace/* (Randy, PR #215+#230) as legacy AR — but per #215's description, it's "ADR-001 Phase 2 for marketplace operations — first dual-write path between Installable (canonical) and AgentRegistry (compat shim)." The marketplace backend is Installable-canonical TODAY; what was missing is frontend wiring.

v5 corrections:

- Part 4a rewritten: extend existing AppsMarketplacePage with kind tabs, wire each tab to the right backend (Agents → Installable, Skills → AR for now, Apps/Integrations unchanged). Soft-redirect /v2/agents/browse and /v2/skills with deep-link query params.
- Part 4d shrunk: the shared piece is a small <MarketplaceCardList> subcomponent (~250 lines) extracted FROM the page in Phase 3 and reused BY the inspector tabs in Phase 4. v4's heavyweight <V2MarketplaceList> was over-engineering.
- Phasing collapsed from 8 active to 5 active. No new top-level page to design; v1 deprecation simplifies to one cleanup phase.
- §"Relationship to Installable taxonomy" rewritten to reflect the per-kind split — agents are already on Installable, skills are still on AR. ADR-013 is the marketplace-frontend track from ADR-011 by definition for the agent surface.
- Open question #7 (accelerate Phase 3?) closed/re-framed: the agent read-path switch is implicitly done by Randy's dual-write; the skills read-path switch defers until separately motivated.
- Open question #8 (nav-rail placement) closed: page already mounted.
- Open question #9 (fork ownership) narrowed: Randy's /api/marketplace/mine likely covers it; just verify response shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(adr): ADR-013 v5.1 — flag catalog index numbers as 2026-02-05 snapshot

Catalog index (docs/skills/awesome-agent-skills-index.json) has updatedAt: 2026-02-05 — nearly 3 months stale at draft time. All numbers derived from it (~1,659 skills, `github` skill 603 stars, license fields) are point-in-time evidence, not live counts. Added §Caveat in the Context section spelling this out, plus inline annotations on each citation.

Numbers fetched live via GitHub API today (OfficeCLI 2,768 stars + v1.0.70 release date) are explicitly timestamped 2026-05-03 and treated separately.

Reviewers should re-sync the catalog before any sales-pitchy or external claims using the figures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(adr): ADR-013 v5.2 — replace invented mockup numbers with placeholders

The ASCII mockups for the marketplace page and inspector Agents tab had invented numbers that looked authoritative:

- "Apps (12)", "Agents (78)" — kind-tab counts I made up
- "★ 4.8 · 240 installs" (Liz), "★ 4.6 · 88 installs" (Theo)
- "★ 4.5 · 175 installs" (Tarik)

Reviewers could mistake these for projections or commitments. Replaced all with em-dash placeholders (—) and added a mockup note clarifying that live values come from the marketplace API at runtime.

Same hygiene as v5.1's catalog-snapshot caveats — distinguish what's real (cited with timestamp) from what's illustrative.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(adr): ADR-013 v5.3 — restore GitHub stars, add per-agent skill drawer

v5.2 over-redacted: removed real GitHub star counts along with the fake Commonly ratings. v5.3 re-distinguishes:

- GitHub stars (real upstream signal): render. Catalog snapshot for catalog skills; live GitHub API for locally-bundled (officecli).
- Commonly review-style ratings (★ 4.8): don't render. The system doesn't exist; introducing it needs its own ADR.
- Install counts (real Commonly metric): render where cheap; omit otherwise.

Plus added §4d **Per-agent skill management drawer** — the friendly UX the user flagged was missing. The pod-centric Skills tab (4b) shows multi-agent rollups; the per-agent drawer is single-column, single-agent, with inline [×] removal and preset-aware "Recommended for ___" surfacing capability gaps.

Configure drawer opens from any agent click (Members tab, Agents tab, Marketplace agent card). Tabs inside: Persona · Heartbeat · Skills · Memory. <300 lines target.

Existing 4d (shared subcomponent) renumbered to 4e. Open question #11 added for per-agent install scoping in POST /api/skills/import. Phase 4 expanded to cover the drawer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(adr): ADR-013 v5.4 — address 8 PR #287 review comments

Eight inline comments + PR-level summary on PR #287. All addressed:

1. Line 155 — "fall back to python-office stack" was a lie (Part 2 removes it). Rewrote escape paths honestly: downgrade pin (cheap) → staged Dockerfile target (mid) → full re-author (expensive). Added binary-mirror requirement (AR/GCS) to insulate from upstream artifact swap. Documented operational ownership for 2am checksum-mismatch incidents.
2. Line 130 — SHA256SUMS placeholder. Confirmed unverified whether iOfficeAI publishes SHA256SUMS. Two paths: upstream artifact OR committed-constant-checksum. Resolved in Phase 0a (i). Open question #13 added.
3. Line 516 — Open question #11 escalated to Phase 0a (ii). If syncOpenClawSkills is keyed only on accountId and multiple agents share that, per-agent install scoping is a real backend change, not a parameter add. Phase 4 may split into 4a/4b.
4. Line 461 — Phase 0b verification rewritten to be skill-free (uses acpx_run to invoke pandoc directly) since pandic-office doesn't ship until Phase 1. Full skill-loop verification moves to end of Phase 1.
5. Line 348 — Inline [×] UX. Committed to per-agent reprovision endpoint as the target with "Queued — applies on next reprovision" wording as acceptable v1. Eliminated misleading "removing on next sync" copy.
6. Line 373 — Configure drawer permissions. Flipped default to admin-only after review feedback. Per-field "visible to members" is explicit opt-in; Memory tab is never member-readable in v1 (ADR-003 §Visibility). Added Invariant 7.
7. Line 227 — Live GitHub stars. Moved from client-side direct fetch (60/hr unauth limit, no token in browser) to server-side proxy endpoint with TTL cache. Failure mode: ★ — with tooltip.
8. Line 469 — Phase 5 deletion gate. Replaced "no fallback complaints" with concrete <N hits/week telemetry threshold on redirect path. ~5 lines added to Phase 3 scope.

Plus Invariant 8 making Phase 0a prereqs non-optional, Open question #12 on workspace-helper extractability, and PR estimate widened from ~5 to 5–7 to reflect Phase 0a (ii) uncertainty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(adr): ADR-013 v5.5 — call out acpx_run deprecation + ADR-005 composition

Per CLAUDE.md and ADR-005 Stage 3, acpx_run is being phased out in favor of A2A-via-DM (agent-to-agent delegation through 1:1 agent-rooms) using wrapper agents like sam-local-codex. ADR-013 doesn't block on the deprecation — it makes it easier:

- commonly_attach_file is correctly placed at the kernel layer (CAP verb), not in any specific runtime. Wrapper agents speaking CAP get file-attach for free.
- Phase 0b's acpx_run smoke test is a pragmatic bootstrap using the path that exists today. The shape of the test re-runs unchanged when ADR-005 Stage 3 completes.
- Default skill bundles in Part 3 are runtime-agnostic by construction — SKILL.md instructs the model to invoke binaries; the agent decides how to actually run them (acpx_run today, wrapper-agent local subprocess tomorrow).

Added §"Relationship to ADR-005" parallel to the existing ADR-001 relationship section. Listed ADR-005 in the companion header. Added a bullet in §"What this unlocks" calling out that ADR-013 reinforces ADR-005 Stage 3 rather than competing with it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
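Review item 7 above (server-side proxy with a TTL cache, falling back to a placeholder on failure) can be sketched as follows. The class name `StarCache`, the one-hour default TTL, the injected `fetch` callable, and the `render_stars` helper are assumptions for illustration; the ADR's actual endpoint and cache settings are not shown in this PR.

```python
import time


class StarCache:
    """Cache star counts per repo so the upstream API is hit at most once per TTL."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch          # callable(repo) -> int, may raise
        self._ttl = ttl_seconds      # assumed default; not specified in the source
        self._clock = clock
        self._entries = {}           # repo -> (expires_at, stars)

    def get(self, repo):
        now = self._clock()
        hit = self._entries.get(repo)
        if hit is not None and hit[0] > now:
            return hit[1]
        try:
            stars = self._fetch(repo)
        except Exception:
            return None  # caller renders the placeholder instead of a number
        self._entries[repo] = (now + self._ttl, stars)
        return stars


def render_stars(stars):
    """Format a star count, or the dash placeholder when the count is unavailable."""
    return f"★ {stars:,}" if stars is not None else "★ \u2014"
```

Fetching on the server keeps any API token out of the browser, and the cache means many page views share one upstream request per repo per TTL window.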
Summary
- `test:coverage` script to frontend package.json

Testing
- `npm test` in `backend`
- `CI=true npm run test:coverage` in `frontend`