feat: v1.6.0 — unified AI-agent interface (P0-P5)#5

Closed
chenliuyun wants to merge 17 commits into main from feat/ai-agent-unification

@chenliuyun
Collaborator

Summary

Make the CLI the single entry point for AI agents (Claude Code, OpenClaw,
custom runners) controlling SwitchBot — replacing scattered skills, Channel
bots, and one-off scripts. Six coordinated phases, each one commit, each
independently shippable.

P0 — Unified JSON envelope. Every --json response now ships as
{schemaVersion,ok,data,meta:{command,durationMs,requestId}}. Errors also
use the envelope and go to stdout in JSON mode (not stderr) so
agents can parse one stream. --json-legacy is available as a one-release
escape hatch and is slated for removal in v1.7.

P1 — MCP shadow-event subscription. New switchbot://events resource
pushes notifications/resources/updated on every MQTT shadow event via a
ref-counted subscription manager. events_recent tool exposes the
in-process 100-event ring buffer for polling-style clients.

P2 — MCP tool coverage parity. 8 → 15 tools. Adds devices_batch,
plan_run, webhook_setup/query/update/delete, quota_status — all go
through the same shared executors the CLI uses (no re-implementation).

P3 — MCP per-request profile (HTTP transport). Multi-tenant hosts can
route requests to different credential profiles via
x-switchbot-profile:<name> header or ?profile=<name> query. Stdio is
unchanged (boot-time loadConfig). createSwitchBotMcpServer accepts an
optional configResolver for dependency injection.

P4 — Server-authoritative quota. Response interceptor captures
X-Ratelimit-Remaining opportunistically; quota status now shows both
the local (authoritative for planning) and server (advisory) numbers, with
a freshness indicator.

P5 — Destructive-guard + audit trail. Refused destructive attempts
are recorded with result: "refused" (plus caller: "cli"|"mcp" and
destructive/confirmed flags) across single-command, batch, plan, and
every MCP tool. The audit log rotates at 10MB and writes with 0600 perms.

Test plan

  • npm run build clean
  • npx tsc --noEmit clean
  • npx vitest run — 725/725 pass
  • CLI --version guard passes (1.6.0)
  • Inspector: resources/list shows switchbot://events; subscribe fires on device change
  • Inspector: tools/list returns 15 tools
  • HTTP: x-switchbot-profile: <name> routes to the matching profile
  • --audit-log writes a refused entry when a destructive command is rejected

Notes

  • Backward compatible (human output unchanged; legacy JSON shape available via --json-legacy).
  • No new runtime dependencies.
  • Release notes link in README points to GitHub Releases (npm page does not show CHANGELOG).

chenliuyun added 17 commits April 19, 2026 11:38
…ackoff

Phase A implementation: MQTT infrastructure for event streaming.

- src/mqtt/types.ts: MqttCredential, DeviceShadowEvent, StreamFilter interfaces
- src/mqtt/credential.ts: fetch credential from /v1.1/iot/credential endpoint, cache with 1h TTL
- src/mqtt/client.ts: TLS client wrapper with exponential backoff reconnection
  - Parameters: initial 1s delay, 2x multiplier, 60s max, ±20% jitter, 5 max attempts
  - Connection stability tracking: reset attempt counter after 30s stable
  - AbortSignal support for graceful SIGINT handling
- package.json: version 1.4.0, add mqtt ^5.15.1 dependency
- package-lock.json: synced via npm install --package-lock-only

No observable behavior change yet; next phases add CLI commands.
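
The backoff parameters listed above (1s initial delay, 2x multiplier, 60s cap, ±20% jitter, 5 attempts) can be sketched roughly as follows. This is a minimal illustration, not the code in src/mqtt/client.ts: `BackoffOptions`, `backoffDelay`, and `abortableSleep` are hypothetical names, and the injectable `rand` parameter is an assumption added to make the jitter testable.

```typescript
// Illustrative sketch only; names are hypothetical, not taken from src/mqtt/client.ts.
interface BackoffOptions {
  initialMs: number;   // initial delay (1s in the commit message)
  multiplier: number;  // growth factor per attempt (2x)
  maxMs: number;       // ceiling on the un-jittered delay (60s)
  jitter: number;      // 0.2 means ±20%
  maxAttempts: number; // callers give up after this many attempts (5)
}

const DEFAULTS: BackoffOptions = {
  initialMs: 1_000,
  multiplier: 2,
  maxMs: 60_000,
  jitter: 0.2,
  maxAttempts: 5,
};

// Delay before retry number `attempt` (1-based). `rand` is injectable for tests.
function backoffDelay(
  attempt: number,
  opts: BackoffOptions = DEFAULTS,
  rand: () => number = Math.random,
): number {
  const base = Math.min(opts.initialMs * opts.multiplier ** (attempt - 1), opts.maxMs);
  // Map rand() in [0, 1) to a factor in [1 - jitter, 1 + jitter).
  const factor = 1 + (rand() * 2 - 1) * opts.jitter;
  return Math.round(base * factor);
}

// Sleep that rejects as soon as the signal aborts, so a pending backoff
// does not hold up shutdown on SIGINT.
function abortableSleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("aborted"));
    };
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}
```

A retry loop would call `backoffDelay(attempt)` for attempts 1 through `maxAttempts` and pass the result to `abortableSleep` together with the command's AbortSignal.
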
Tests for Phase A MQTT infrastructure:
- credential.test.ts: credential fetching, TTL-based caching, cache invalidation
- client.test.ts: connection setup, exponential backoff configuration, jitter, cancellation

All tests pass; full test suite still green (672 passing + Phase A 14 passing).

Added parseEventStreamFilter and matchEventStreamFilter for MQTT event matching.
Supports simple deviceId/type filters like 'deviceId=ABC' or 'type=Motion\ Sensor'.
Separate from existing device list filter (which uses FilterClause).
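
A key=value filter of this kind might look like the sketch below. It is not the implementation in src/utils/filter.ts: the `*Sketch` names and type shapes are hypothetical, and it assumes the parsed event exposes top-level `deviceId` and `deviceType` fields.

```typescript
// Hypothetical shapes for illustration; the real StreamFilter and event types may differ.
interface StreamFilterSketch {
  deviceId?: string;
  type?: string;
}

interface ShadowEventSketch {
  deviceId: string;
  deviceType: string;
}

// Parse a single "key=value" expression such as "deviceId=ABC".
function parseFilterSketch(expr: string): StreamFilterSketch {
  const eq = expr.indexOf("=");
  if (eq <= 0) throw new Error(`invalid filter "${expr}": expected key=value`);
  const key = expr.slice(0, eq).trim();
  const value = expr.slice(eq + 1).trim();
  if (key !== "deviceId" && key !== "type") {
    throw new Error(`unsupported filter key "${key}"`);
  }
  if (value === "") throw new Error(`filter "${key}" needs a value`);
  return key === "deviceId" ? { deviceId: value } : { type: value };
}

// An event matches when every key present in the filter equals the event's field.
function matchFilterSketch(filter: StreamFilterSketch, event: ShadowEventSketch): boolean {
  if (filter.deviceId !== undefined && filter.deviceId !== event.deviceId) return false;
  if (filter.type !== undefined && filter.type !== event.deviceType) return false;
  return true;
}
```

An empty filter object matches every event, which keeps the unfiltered stream path and the filtered path on the same code.
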
Phase B implementation: 'switchbot events stream' for real-time device state.

- src/commands/events.ts: new stream subcommand alongside existing tail (webhook receiver)
  - Options: --filter deviceId=/type=X, --max N, --probe (connectivity check), --no-cache
  - JSONL output to stdout, human-readable status to stderr
  - MQTT message handler extracts shadow update and applies filters
  - Reuses AbortController + SIGINT/SIGTERM cleanup from events tail
- src/utils/filter.ts: added parseEventStreamFilter() and matchEventStreamFilter()
  - Simple key=value syntax separate from device list filters (FilterClause)

Feature depends on SwitchBot IoT MQTT service (non-standard, documented in help).
All tests still passing (673/673). No breaking changes to existing commands.

Phase C implementation: MQTT-backed device monitoring.

- src/commands/watch.ts: new --via-mqtt flag switches to MQTT push instead of polling
  - watchViaMqtt(): subscribes to MQTT shadow updates, emits same TickEvent format
  - Field-level diff tracking works identically to polling mode
  - Respects --max N and --include-unchanged flags
  - Uses shared AbortController/SIGINT cleanup pattern
- Extracted watchViaPolling() for code clarity (existing polling logic unchanged)

Note: MQTT mode does NOT fall back to polling when the broker is unavailable (per design).
If the MQTT connection fails, the user sees an error and can retry with --interval instead.

All tests passing (673/673).

Phase D implementation: documentation update.

- Add Release notes link to top navigation (per the release rule: publishing to npm requires a README link)
- Add 'events' entry to Table of Contents
- New 'events — receive MQTT device updates' section with:
  - Command examples (stream, filter, probe, no-cache, JSON output)
  - Clear disclaimer: MQTT service is non-standard, undocumented, subject to change
  - Expected output format (JSONL with shadow update schema)
  - Credential caching info (1h TTL)

Explains the feature clearly without using the "experimental" label (per user request).

- Import IClientOptions from mqtt for proper type checking
- Use Partial<IClientOptions> for TLS configuration
- Wrap Buffer instances in arrays (mqtt package requirement)
- Add explicit return types to event handlers
- Fix mqtt.end() call signature (force disconnect + callback)

All tests passing (673/673). Build clean.

Covers all seven findings from the post-implementation review plus
the /iot/credential integration fixes found during smoke testing:

1. events stream filter mismatch — `--filter deviceId=` and
   `--filter type=` silently matched nothing because the matcher read
   ctx.deviceMac/ctx.deviceType from the shadow payload, but those
   fields live on webhook bodies. Added matchShadowEventFilter that
   reads top-level deviceId/deviceType from the parsed shadow event.

2. ErrorSubKind extensions — introduced MqttError with subKind values
   mqtt-tls-failed, mqtt-connect-timeout, mqtt-disconnected (wired into
   ErrorSubKind union, buildErrorPayload, handleError). Credential
   fetch now maps 401/429 to ApiError(auth-failed/quota-exceeded) and
   network timeouts to MqttError(mqtt-connect-timeout). Classify
   initial connect failures via classifyMqttConnectError so cert
   errors surface as mqtt-tls-failed.

3. Runtime reconnect loop — previously MqttTlsClient only retried the
   initial connect; mid-session disconnects silently stopped streaming
   with reconnectAttempts and checkConnectionStability sitting as dead
   code. Now attaches a close handler post-connect that drives
   connectWithRetry on drop, exhausting 5 attempts before emitting an
   mqtt-disconnected MqttError to the registered runtime-error
   handler. Sleep between attempts is now abortable, so end() cancels
   a pending backoff immediately instead of waiting out the delay.

4. Credential cache path — replaced the literal '~' fallback with
   os.homedir(). Also clean up the .tmp file if the atomic
   rename fails so we do not leave orphan writes behind.

5. events stream smoke tests — added tests/commands/events-stream.test.ts
   covering extractShadowEvent parsing and the end-to-end filter path
   that broke before finding #1. Added tests/mqtt/errors.test.ts for
   MqttError classification and buildErrorPayload integration.
   Extended credential tests for 401/429/null-body handling.

6. Quota SIGINT/SIGTERM — quota.ts previously registered global
   signal handlers that called process.exit(130/143), short-circuiting
   command-layer cleanup (watch / events stream finally blocks). The
   handlers now only flush the counter; they fall back to the
   conventional exit code only when quota is the sole listener, so
   short one-shot commands keep their old behavior while long-running
   commands retain control of their own exit path.

7. Status cache cross-process staleness — status.json was read into
   a process-local hot cache with no invalidation, so a long-running
   MCP server could not see writes from a concurrent one-shot CLI.
   loadStatusCache now stats mtime before every read and reloads when
   the file has changed on disk; saveStatusCache/clearStatusCache/
   resetStatusCache update the tracked mtime accordingly. Same-process
   reads remain zero-IO when mtime is unchanged.

8. /iot/credential integration — the endpoint rejected POST {} with
   statusCode 190 "param is invalid". The signing convention differs
   from the public OpenAPI: nonce is the literal string "OpenClaw"
   (not a UUID), the HMAC signature is NOT uppercased, the `t` header
   is numeric, and the body requires a 12-char random `instanceId`.
   Response shape is also nested under body.channels.mqtt with
   topics: {status: string} (wrap to single-element array for our
   subscribe path). Error messages surface at the outer `message`
   field, not body.message. Split buildCredentialHeaders from the
   OpenAPI buildAuthHeaders to keep both conventions clean.

9. TLS material encoding — caBase64/certBase64/keyBase64 are a
   misnomer: the /iot/credential response carries literal PEM text
   in those fields. Decoding them as base64 corrupted the material
   ("PEM routines::no start line"). Pass the strings through to mqtt
   as-is. Also align connect options with OpenClaw's reference
   implementation (keepalive: 60, reschedulePings: true) and dispose
   the prior client before reconnecting so stale listeners from a
   dead TCP socket do not leak.

Tests: 697/697 passing (+24 new).
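
The mtime-based invalidation described in finding 7 can be illustrated with a small sketch. It is not the project's loadStatusCache: `FileOps` and `MtimeCache` are hypothetical names, and file access is injected so the example stays independent of the filesystem. In Node the two operations could be backed by fs.statSync(path).mtimeMs and fs.readFileSync.

```typescript
// Hypothetical file operations, injected so the sketch has no direct I/O.
interface FileOps {
  mtimeMs(path: string): number | null; // null when the file does not exist
  read(path: string): string;
}

class MtimeCache<T> {
  private hot: { mtimeMs: number; value: T } | null = null;
  private path: string;
  private ops: FileOps;

  constructor(path: string, ops: FileOps) {
    this.path = path;
    this.ops = ops;
  }

  load(): T | null {
    const mtimeMs = this.ops.mtimeMs(this.path);
    if (mtimeMs === null) {
      this.hot = null; // file removed, possibly by another process
      return null;
    }
    if (this.hot !== null && this.hot.mtimeMs === mtimeMs) {
      return this.hot.value; // unchanged on disk: skip the read and the parse
    }
    const value = JSON.parse(this.ops.read(this.path)) as T;
    this.hot = { mtimeMs, value };
    return value;
  }
}
```

A writer in the same process would also update the tracked mtime after saving, so that its own write does not trigger a reload on the next read.
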
…credential preemptive refresh, error classification

- Extract shadow event parsing to shared src/mqtt/shadow.ts
- Add status cache writes for MQTT events stream + watch --via-mqtt
- Credential preemptive refresh (refresh 10min before expiry)
- Profile-aware credential cache path: mqtt-credential.<profile>.json
- SIGINT handler consistency in events stream (process.on + finally cleanup)
- Extend MQTT error classification: add mqtt-network-unreachable for ECONNREFUSED/EHOSTUNREACH/ENETUNREACH
- Verbose JSON parse error logging (--verbose shows malformed message info)
- Improve error messages (config file path + reason)
- Add JSDoc to MqttCredential.topics field
- package.json: exports field, version 1.5.0, typecheck script
- CI: add npm run typecheck step
- Update tests to match new error message format
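
The preemptive refresh rule (refresh 10 minutes before expiry) reduces to a single comparison. The sketch below is illustrative only: `CachedCredentialSketch`, its `expiresAt` field, and `needsRefresh` are hypothetical and not taken from src/mqtt/credential.ts.

```typescript
// Refresh window from the commit message: 10 minutes before expiry.
const REFRESH_WINDOW_MS = 10 * 60 * 1000;

// Hypothetical cache record; `expiresAt` is an epoch timestamp in milliseconds.
interface CachedCredentialSketch {
  expiresAt: number;
}

// True when there is no cached credential or it expires within the window.
function needsRefresh(cred: CachedCredentialSketch | null, now: number = Date.now()): boolean {
  if (cred === null) return true;
  return cred.expiresAt - now <= REFRESH_WINDOW_MS;
}
```

With a 1h TTL, a credential fetched at time 0 would be reused for the first 50 minutes and refetched on the first request after that.
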
Every --json response now wraps the payload in a stable envelope so agents
(OpenClaw, Claude Code, GPT-Actions, etc.) can parse one shape across every
command:

  { "schemaVersion": "1", "ok": true, "data": <payload>,
    "meta": { "command": "devices.status", "durationMs": 123 } }

Errors use the same envelope with ok=false and go to stdout (previously
stderr) so agents can consume a single stream for both success and failure.
Human mode output is unchanged.

Streaming commands (devices watch, events stream/tail) keep emitting bare
JSON per line — the envelope applies to one-shot responses only.

Back-compat: --json-legacy opts out of the envelope and restores the
v1.5.0 bare-payload shape. Planned removal in v1.7.0.

Bump package 1.5.0 -> 1.6.0.
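
For an agent consuming the CLI, parsing the envelope might look like the sketch below. The field names follow the example above; `parseEnvelope` and `ParsedEnvelope` are hypothetical helper names. The commit message does not say which field carries the error details when ok is false, so the failure branch simply keeps the raw object.

```typescript
interface EnvelopeMeta {
  command: string;
  durationMs: number;
  requestId?: string; // listed in the PR summary; absent from the example above
}

type ParsedEnvelope =
  | { ok: true; data: unknown; meta: EnvelopeMeta }
  | { ok: false; meta: EnvelopeMeta; raw: Record<string, unknown> };

// Parse one --json response read from stdout.
function parseEnvelope(stdout: string): ParsedEnvelope {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("not a JSON object");
  }
  const obj = parsed as Record<string, unknown>;
  if (typeof obj.schemaVersion !== "string" || typeof obj.ok !== "boolean") {
    throw new Error("missing schemaVersion or ok: not an envelope");
  }
  // Assumption: meta is trusted as-is; a stricter client would validate it too.
  const meta = obj.meta as EnvelopeMeta;
  return obj.ok
    ? { ok: true, data: obj.data, meta }
    : { ok: false, meta, raw: obj };
}
```

Because success and failure share one stream and one shape, the caller only needs to branch on `ok`. Streaming commands emit bare JSON per line and would not go through this helper.
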
Adds an MCP resource switchbot://events that agents can subscribe to via
resources/subscribe. A shared ref-counted MqttTlsClient starts on the first
subscriber, pushes notifications/resources/updated on every shadow event,
and tears down on the last unsubscribe.

A ring buffer keeps the last 100 events so a new events_recent tool can
serve agents that prefer polling over subscription. Shadow events are also
written through to the status cache for any downstream reader.

Advertises capabilities.resources.subscribe=true and tears down the
subscription manager on server close.
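
The two mechanisms in this commit, a fixed-size buffer of recent events and a shared client that starts on the first subscriber and stops on the last, can be sketched as below. `RingBuffer` and `RefCountedSubscription` are hypothetical names for illustration and are not the project's subscription manager.

```typescript
// Fixed-capacity buffer that keeps only the most recent items.
class RingBuffer<T> {
  private items: T[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) this.items.shift(); // drop the oldest
  }

  // Up to `limit` newest items, oldest first.
  recent(limit: number = this.capacity): T[] {
    return limit <= 0 ? [] : this.items.slice(-limit);
  }
}

// Starts a shared resource on the first subscriber and stops it on the last.
class RefCountedSubscription {
  private count = 0;
  private start: () => void;
  private stop: () => void;

  constructor(start: () => void, stop: () => void) {
    this.start = start;
    this.stop = stop;
  }

  subscribe(): void {
    if (this.count === 0) this.start();
    this.count++;
  }

  unsubscribe(): void {
    if (this.count === 0) return; // ignore unbalanced calls
    this.count--;
    if (this.count === 0) this.stop();
  }
}
```

In this picture the MQTT message handler would push each shadow event into a `RingBuffer` of capacity 100, and a polling-style tool would answer from `recent()`.
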
…6.0 P2)

Add MCP tools so AI agents no longer have to shell out for common tasks:

  - devices_batch — run one command across many devices (yes:true for destructive)
  - plan_run      — validate + execute a SwitchBot plan (v1.0)
  - webhook_setup / webhook_query / webhook_update / webhook_delete
  - quota_status  — today's local API counter

To avoid code duplication, the shared executors are extracted from the CLI
actions:
  - runBatchCommand() in commands/batch.ts — pool + destructive pre-flight
  - runPlan()         in commands/plan.ts  — step-by-step execution loop

Both helpers return structured results that the CLI and MCP paths format
the same way. Destructive guards are preserved in both surfaces (yes:true
via MCP maps to --yes via CLI) so agents cannot bypass them.

Catalog size: 8 tools -> 15 tools. No new runtime dependencies.

Multi-tenant MCP hosts can now route each request to a different SwitchBot
account by sending an `x-switchbot-profile: <name>` header (or `?profile=`
query string). The profile name maps to `~/.switchbot/profiles/<name>.json`
and is resolved fresh per request.

Changes:
  - createClient() accepts a `{ token, secret }` override, bypassing
    loadConfig() when provided.
  - loadConfigForProfile(profile?) — like loadConfig() but does NOT read
    process.argv, so it's safe for server contexts. Throws instead of
    calling process.exit so the HTTP handler can surface errors cleanly.
  - createSwitchBotMcpServer({ configResolver }) — optional resolver
    invoked lazily on every tool call. Defaults to loadConfigForProfile().
    Every tool handler now builds its axios client via getClient() so
    per-request credentials are honored.
  - HTTP transport reads x-switchbot-profile / ?profile= and wires the
    resolver. Stdio transport is unchanged (one session = one profile).
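
Reading the profile name from a request could look like the sketch below. `resolveProfileName` is a hypothetical helper, not the code in the HTTP transport. Two points are assumptions that the commit message does not state: the header takes precedence over the query string, and names are restricted to a conservative character set because they end up in a file path.

```typescript
// Hypothetical helper: pick the profile name from a header or the query string.
// Header keys are expected in lowercase, as Node's http module provides them.
function resolveProfileName(
  headers: Record<string, string | string[] | undefined>,
  requestUrl: string,
): string | undefined {
  const header = headers["x-switchbot-profile"];
  const fromHeader = Array.isArray(header) ? header[0] : header;
  // A base URL is needed because request URLs are usually path-only.
  const fromQuery = new URL(requestUrl, "http://localhost").searchParams.get("profile");
  const name = (fromHeader || fromQuery || "").trim();
  if (name === "") return undefined; // caller falls back to the default profile
  // Assumed validation: reject separators and traversal before building a path.
  if (!/^[A-Za-z0-9_-]+$/.test(name)) {
    throw new Error(`invalid profile name: ${name}`);
  }
  return name;
}
```

Throwing on a bad name, instead of exiting, matches the stated goal of letting the HTTP handler surface errors cleanly.
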
SwitchBot does not publish a dedicated quota endpoint, but some responses
include an X-Ratelimit-Remaining header. Capture it opportunistically in
the axios response interceptor and report both local + server-authoritative
numbers so agents can reason about the real remaining budget.

Changes:
  - utils/quota.ts: in-memory serverObservation with recordServerQuota /
    getServerQuota / clearServerQuota helpers. todayUsage() now includes
    an optional `server: { remaining, observedAt }` field.
  - api/client.ts: response interceptor reads the ratelimit header and
    calls recordServerQuota(). Local counter remains authoritative for
    planning since the header is not guaranteed.
  - commands/quota.ts: human mode prints "Server remaining: N (fresh)"
    when the observation is <10min old. JSON mode nests it under
    today.server.
  - commands/mcp.ts: quota_status tool now returns `serverQuotaKnown`
    as a real boolean and the server object when present.
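
The opportunistic capture and the 10-minute freshness check can be sketched as follows. The function names carry a `Sketch` suffix or are otherwise hypothetical, and this is not the code in utils/quota.ts. The "stale" label for older observations is an assumption, since the commit message only specifies the "(fresh)" case.

```typescript
const FRESH_MS = 10 * 60 * 1000; // observations younger than 10 minutes count as fresh

interface ServerQuotaObservation {
  remaining: number;
  observedAt: number; // epoch milliseconds
}

let serverObservation: ServerQuotaObservation | null = null;

// Called with the raw X-Ratelimit-Remaining header value; ignores anything unparseable.
function recordServerQuotaSketch(headerValue: unknown, now: number = Date.now()): void {
  const n = typeof headerValue === "string" ? Number.parseInt(headerValue, 10) : NaN;
  if (Number.isFinite(n) && n >= 0) {
    serverObservation = { remaining: n, observedAt: now };
  }
}

// Human-mode line, or null when the server has never reported a value.
function describeServerQuota(now: number = Date.now()): string | null {
  const obs = serverObservation;
  if (obs === null) return null;
  const fresh = now - obs.observedAt < FRESH_MS;
  return `Server remaining: ${obs.remaining} (${fresh ? "fresh" : "stale"})`;
}
```

Since the header is not guaranteed, a missing or malformed value leaves the previous observation in place, and the local counter stays the basis for planning.
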
Consolidate destructive-guard audit trail across CLI and MCP surfaces:

- audit.ts: rotate to <file>.1 at 10MB; enforce 0600 perms on every write;
  add optional destructive/confirmed/caller fields and a new
  'refused' result to AuditEntry; expose writeRefusalAudit helper.
- devices.ts (single command), batch.ts (both CLI and runBatchCommand),
  plan.ts (runPlan destructive-skip), mcp.ts (send_command, plan_run via
  caller threading): call writeRefusalAudit whenever a destructive call
  is blocked by missing --yes / confirm:true.
- lib/devices.ts executeCommand now records the destructive flag so the
  log shows which calls were against flagged commands.
- tests: rotation at >10MB moves the log to .1 and starts a fresh file;
  writeRefusalAudit emits a refused/destructive/caller entry.
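
A refused entry and the rotation check might be shaped like the sketch below. Only the `result: "refused"`, `caller`, `destructive`, and `confirmed` fields come from the commit message. The remaining fields, the other result values, the one-JSON-object-per-line format, and reading 10MB as 10 * 1024 * 1024 bytes are assumptions, and the helper names are hypothetical.

```typescript
// Assumption: 10MB is interpreted as 10 MiB.
const MAX_AUDIT_BYTES = 10 * 1024 * 1024;

interface AuditEntrySketch {
  t: string;                            // ISO timestamp (assumed field)
  deviceId: string;                     // assumed field
  command: string;                      // assumed field
  result: "ok" | "error" | "refused";   // only "refused" is confirmed by the commit
  caller?: "cli" | "mcp";
  destructive?: boolean;
  confirmed?: boolean;
}

// Entry recorded when a destructive command is blocked for lack of confirmation.
function refusalEntry(
  deviceId: string,
  command: string,
  caller: "cli" | "mcp",
  now: Date = new Date(),
): AuditEntrySketch {
  return {
    t: now.toISOString(),
    deviceId,
    command,
    result: "refused",
    caller,
    destructive: true,
    confirmed: false,
  };
}

// One JSON object per line (assumed log format).
function toAuditLine(entry: AuditEntrySketch): string {
  return JSON.stringify(entry) + "\n";
}

// Rotate once the current file exceeds the size limit.
function shouldRotate(currentSizeBytes: number): boolean {
  return currentSizeBytes > MAX_AUDIT_BYTES;
}
```

A writer would check `shouldRotate` against the current file size, rename the log to `<file>.1` when it returns true, then append the line and re-apply 0600 permissions.
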
- MCP server now exposes 15 tools, not 8 — list them grouped by
  concern (control/read, plans & events, webhooks, diagnostics)
  and note HTTP-transport profile routing + destructive-guard
  audit behavior.
- Test count bumped to 725 (was 592) to match the current suite.
… envelope

Review response for v1.6.0 PR #5.

- plan_run now threads an AxiosInstance factory through runPlan, so HTTP-
  transport callers with x-switchbot-profile route to the right tenant
  instead of the server-default credentials.
- EventSubscriptionManager takes an options-object constructor with a
  configResolver; when unset it falls back to loadConfig() for stdio.
- MQTT credential cache file is now content-addressed (sha256 of
  token + secret, truncated), so two tenants on one HTTP server can't
  share ~/.switchbot/mqtt-credential.json. The cache is written with
  0600 perms and an explicit post-rename chmod.
- Replace remaining bare stderr writes of {error:...} JSON in devices,
  history, expand, mcp port validation, batch, config, and format with
  printErrorEnvelope so JSON-mode errors reliably land on stdout.
- ErrorPayload.kind now includes 'guard' for the destructive-refusal
  branches.
- New regression tests: runPlan honors getClient, credential cache is
  content-addressed + 0600, EventSubscriptionManager constructor shape.
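
A content-addressed cache filename can be derived as in the sketch below, using Node's built-in crypto module. The 16-character truncation, the ":" separator, and the filename pattern are assumptions, and `credentialCacheFileName` is a hypothetical name. The review response only states that the name is a truncated sha256 of token + secret.

```typescript
import { createHash } from "node:crypto";

// Derive a per-tenant cache filename from the credentials themselves,
// so two tenants on one server never share a cache file.
function credentialCacheFileName(token: string, secret: string): string {
  // A separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
  const digest = createHash("sha256").update(`${token}:${secret}`).digest("hex");
  // Truncation length is an assumption; 16 hex chars = 64 bits of the digest.
  return `mqtt-credential.${digest.slice(0, 16)}.json`;
}
```

Only the digest appears in the filename, so the token and secret are not exposed through directory listings.
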
@chenliuyun chenliuyun closed this Apr 19, 2026
@chenliuyun chenliuyun deleted the feat/ai-agent-unification branch April 19, 2026 09:19
chenliuyun pushed a commit that referenced this pull request Apr 20, 2026
MCP initialize response now reports the accurate serverInfo.version from package.json
instead of a hardcoded '2.0.0'. This fixes the misleading version shown to MCP clients
that gate capabilities on the reported version. Updated four references in mcp.ts
(initialize response, account_overview, /healthz, /ready endpoints) to use the
centralized VERSION constant imported from version.ts.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
chenliuyun pushed a commit that referenced this pull request Apr 20, 2026
Document every fix landed in this branch beyond the history-aggregate
feature: bugs #1, #4, #5, #6, #8, #9, #10, #11, #12, #13, #14, #15,
#16, #17, #18 from the OpenClaw v2.4.0 smoke-test report. Call out
the deferred items (#2, #7) explicitly so readers don't assume they
were overlooked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>