feat: v2.0.0 — MQTT events, agent hardening, and JSON contract#6
Merged
chenliuyun merged 16 commits into main on Apr 19, 2026
added 16 commits
April 19, 2026 15:53
…ting
- MCP HTTP now binds 127.0.0.1 by default (not 0.0.0.0)
- Add --bind <host> flag to override (must have --auth-token for external)
- Add --auth-token <token> flag for Bearer auth (fallback: SWITCHBOT_MCP_TOKEN env)
- Add --cors-origin <url> flag (repeatable) for CORS preflight
- Add --rate-limit <n> flag (default 60 req/min) per profile
- Constant-time token comparison to prevent timing attacks
- Graceful shutdown on SIGTERM/SIGINT with 30s drain timeout
- Startup log now shows truth about binding (e.g. 'listening on http://127.0.0.1:3030/mcp')
- All tests pass (659/659)
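The constant-time comparison mentioned above could look roughly like the sketch below. It is not taken from the PR: the helper names tokensMatch and bearerToken are hypothetical, and hashing both sides before comparing is one common way to satisfy the equal-length requirement of Node's crypto.timingSafeEqual.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper (name not from the PR): compares a presented token with
// the expected one without leaking, via timing, how many characters matched.
export function tokensMatch(presented: string, expected: string): boolean {
  // timingSafeEqual throws when the buffers differ in length, so both sides
  // are hashed first to get two fixed-size 32-byte digests.
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

// Hypothetical helper: pulls the token out of an "Authorization: Bearer ..." value.
export function bearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(.+)$/i.exec(header.trim());
  return match ? match[1] : null;
}
```

A plain === comparison can return early on the first differing character, which is the timing signal this approach avoids.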
- New src/lib/idempotency.ts with LRU cache (1024 entries, 60s TTL)
- Modify executeCommand() to accept optional { idempotencyKey } param
- Thread cache through idempotencyCache.run() for transparent dedup
- No key = always execute (backward compat)
- Expired/new keys trigger fresh execution and cache update
- All tests pass (659/659)
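A minimal sketch of how an LRU cache with a TTL can deduplicate keyed executions, assuming a Map-based design. The class shape, the constructor parameters, and the injectable clock are assumptions for illustration; only the limits (1024 entries, 60s TTL) and the no-key passthrough come from the commit message.

```typescript
// Hypothetical sketch; not the contents of src/lib/idempotency.ts.
type Entry<T> = { value: Promise<T>; expiresAt: number };

export class IdempotencyCache<T> {
  private entries = new Map<string, Entry<T>>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxEntries = 1024, ttlMs = 60_000, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  run(key: string | undefined, fn: () => Promise<T>): Promise<T> {
    if (key === undefined) return fn(); // no key: always execute

    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) {
      // Re-insert so the key becomes the most recently used entry.
      this.entries.delete(key);
      this.entries.set(key, hit);
      return hit.value;
    }

    // New or expired key: execute and cache the in-flight promise, so
    // concurrent callers with the same key share one execution.
    const value = fn();
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });

    // A Map iterates in insertion order, so the first key is the least recently used.
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return value;
  }
}
```

This sketch also replays a cached failure until the entry expires; whether the real cache does so is not stated in the commit message.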
…ency-key-prefix integration
Thread idempotency keys through the CLI interface:
- devices command: add --idempotency-key <key> to replay single commands safely
- devices batch: add --idempotency-key-prefix <prefix> to derive per-device keys
Examples:
  switchbot devices command BOT1 turnOn --idempotency-key abc123
  switchbot devices batch turnOn --ids A,B,C --idempotency-key-prefix batch-001
All 659 tests passing. Backward compatible: idempotency is opt-in.
…ructure
Lay foundation for real-time event streaming:
- src/mqtt/client.ts: New MQTT client with reconnect logic, auth refresh callbacks, state management (connecting/connected/reconnecting/failed)
- src/mcp/events-subscription.ts: Event subscription manager with ring buffer (1000 events), overflow detection, per-subscriber filtering, idle cleanup
- src/commands/mcp.ts: Integrate shared EventSubscriptionManager into HTTP serve mode, with graceful shutdown
Features:
- Auth refresh callbacks on reconnect failure for cert rotation scenarios
- Synthetic events for overflow notices (events.dropped) and reconnection (events.reconnected)
- Per-subscriber event filtering using existing filter grammar
- Idle subscriber cleanup after 10 minutes
- Exponential backoff for reconnection (1s, 2s, 4s, ...30s)
Note: MQTT credential resolution still TBD, awaiting SwitchBot MQTT endpoint documentation.
All 659 tests passing. Foundation ready for event streaming integration.
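Two of the mechanisms listed above, the capped exponential backoff and the fixed-size ring buffer with overflow detection, can be sketched as follows. The names backoffDelayMs and RingBuffer are hypothetical and the real src/mqtt/client.ts and src/mcp/events-subscription.ts may be structured quite differently; only the 1s/2s/4s/.../30s schedule and the 1000-event capacity come from the commit message.

```typescript
// Delay before reconnect attempt n (0-based): 1s, 2s, 4s, ... capped at 30s.
export function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Fixed-capacity buffer that overwrites the oldest item once full and counts
// how many items were overwritten (a possible trigger for an overflow notice).
export class RingBuffer<T> {
  private items: T[] = [];
  private start = 0; // index of the oldest item once the buffer is full
  private capacity: number;
  dropped = 0;

  constructor(capacity = 1000) {
    this.capacity = capacity;
  }

  push(item: T): void {
    if (this.items.length < this.capacity) {
      this.items.push(item);
      return;
    }
    this.items[this.start] = item; // overwrite the oldest slot
    this.start = (this.start + 1) % this.capacity;
    this.dropped += 1;
  }

  // Snapshot of the most recent `limit` items, ordered oldest to newest.
  recent(limit: number = this.capacity): T[] {
    if (limit <= 0) return [];
    const ordered = [...this.items.slice(this.start), ...this.items.slice(0, this.start)];
    return ordered.slice(-limit);
  }
}
```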
Add detailed error information to help agents make intelligent retry decisions:
- ErrorPayload: new fields retryAfterMs, transient, errorClass
- ApiError: track Retry-After header value and classify transience
- batch command: failed[] now returns {deviceId, error: ErrorPayload} instead of {deviceId, error: string}
- schemaVersion bumped to "1.1" (backward-compatible additive change)
Error classification:
- transient: true for 429, 5xx, connection timeouts (can retry)
- errorClass: network|api|device-offline|device-busy|guard|usage
- retryAfterMs: parsed from Retry-After header when available
All 659 tests passing. Agents can now examine error.errorClass to branch on error type and use retryAfterMs to determine backoff.
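As an illustration of how an agent might branch on these fields, here is a small sketch. The field names transient, retryAfterMs, and errorClass and the errorClass values come from the commit message; the message field, the optionality of the fields, the retryDelayMs helper, and its fallback backoff are assumptions.

```typescript
type ErrorClass = "network" | "api" | "device-offline" | "device-busy" | "guard" | "usage";

// Assumed shape: only the three new field names are confirmed by the commit.
interface ErrorPayload {
  message: string;
  transient?: boolean;
  retryAfterMs?: number;
  errorClass?: ErrorClass;
}

// Hypothetical helper: milliseconds to wait before retrying, or null when the
// error should not be retried.
export function retryDelayMs(error: ErrorPayload, attempt: number, maxAttempts = 3): number | null {
  if (!error.transient || attempt >= maxAttempts) return null;
  // Prefer the server-provided hint parsed from Retry-After.
  if (error.retryAfterMs !== undefined) return error.retryAfterMs;
  // Otherwise fall back to a capped exponential backoff (an assumption).
  return Math.min(30_000, 1_000 * 2 ** attempt);
}
```

For example, a 429 carrying retryAfterMs: 1500 would yield a 1500 ms wait, while a usage error with transient: false would not be retried.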
Add account_overview MCP tool and CLI command for bootstrap initialization:
- Bundles: device list, IR remotes, scenes, quota usage, cache status, MQTT state
- Single call replaces: list_devices + list_scenes + quota status + cache show
- Includes MQTT connection state in HTTP mode (eventManager.getState())
- schemaVersion 1.1, version 1.7.0 in response
Useful for:
- Agent cold-start (one call to understand account state)
- Periodic health checks (cache age, quota, MQTT connection)
- Integration debugging
All 659 tests passing.
Add observability infrastructure for production monitoring:
- src/logger.ts: pino logger factory (LOG_LEVEL, LOG_FORMAT env vars)
- /healthz endpoint: always 200, returns {ok, version, pid, uptimeSec}
- /ready endpoint: 200 when MQTT connected, 503 otherwise
- /metrics endpoint: Prometheus text format (0.0.4) with gauges:
  - switchbot_mqtt_connected
  - switchbot_mqtt_subscribers
  - process_uptime_seconds
No debug logging added yet (deferred to Phase G part 2 when needed).
Health endpoints bypass auth/rate limiting for orchestrator liveness probes.
All 659 tests passing.
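For readers unfamiliar with the Prometheus text format, a rough sketch of rendering the three gauges named above is shown here. The renderMetrics function, the MetricsSnapshot type, and the HELP strings are made up for illustration; only the metric names come from the commit message.

```typescript
// Assumed input shape; the real server derives these values from its own state.
export interface MetricsSnapshot {
  mqttConnected: boolean;
  subscribers: number;
  uptimeSec: number;
}

// Renders the gauges in Prometheus text exposition format, typically served
// with the content type "text/plain; version=0.0.4".
export function renderMetrics(s: MetricsSnapshot): string {
  const gauges = [
    { name: "switchbot_mqtt_connected", help: "1 when the MQTT client is connected, else 0", value: s.mqttConnected ? 1 : 0 },
    { name: "switchbot_mqtt_subscribers", help: "Number of active event subscribers", value: s.subscribers },
    { name: "process_uptime_seconds", help: "Process uptime in seconds", value: s.uptimeSec },
  ];
  return gauges
    .map((g) => `# HELP ${g.name} ${g.help}\n# TYPE ${g.name} gauge\n${g.name} ${g.value}\n`)
    .join("");
}
```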
Add production deployment files:
- Dockerfile: multi-stage build, Node 20-alpine, unprivileged user (10001), healthcheck
- docker-compose.example.yml: example setup with env vars, healthcheck
- contrib/systemd/switchbot-mcp.service: systemd unit with hardening (ProtectSystem, PrivateTmp)
Usage:
  docker build -t switchbot:1.7 .
  docker-compose --env-file .env up
Or systemd:
  sudo cp contrib/systemd/switchbot-mcp.service /etc/systemd/system/
  sudo systemctl enable --now switchbot-mcp
All 659 tests passing.
Improve agent developer experience with richer documentation:
- Upgraded tool descriptions for send_command and list_devices (120+ chars with context)
- docs/schema-versioning.md: explains v1→v1.1 backward-compatibility and migration path
- Clarified that schemaVersion "1.1" is backward-compatible with "1" parsers
Schema versioning policy:
- Additive changes (new optional fields) → minor bump (1.1, 1.2, ...)
- Breaking changes → major bump (2.0)
- Parsers pinning "1" continue to work on 1.1+ (backward-compatible)
- Migration guide included for v1.6 → v1.7 (batch error payload change)
All 659 tests passing.
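Under the policy above, a consumer that pins a major version only needs to compare the part before the first dot. A minimal sketch, with isCompatible as a hypothetical helper name:

```typescript
// Returns true when schemaVersion belongs to the pinned major line,
// e.g. a parser pinning "1" accepts "1", "1.1", "1.2" but not "2.0".
export function isCompatible(pinnedMajor: string, schemaVersion: string): boolean {
  const major = schemaVersion.split(".")[0];
  return major === pinnedMajor;
}
```

Comparing the major component as a whole string, rather than using a prefix match, avoids treating "10.0" as compatible with a parser pinned to "1".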
Bumps package.json 1.7.0 → 2.0.0 and refreshes the hard-coded version strings inside the MCP server, /healthz, /ready, and account_overview.
Adds tsconfig.build.json (sourceMap:false, declaration:false) plus a build:prod + clean + prepublishOnly pipeline so the published tarball drops .js.map and .d.ts files. Result against the prior build:
- package size: 140.2 kB → 83.0 kB (−41%)
- unpacked: 622.7 kB → 328.1 kB (−47%)
- files: 144 → 45
A CLI binary has no consumers that import its types or need shipped source maps; local dev still emits both via the default tsc target.
Version 2.0.0 is the first npm release after 1.3.2 and carries three breaking changes that land over the following commits: JSON envelope with top-level schemaVersion, batch.failed[].error shape from string to object, and HTTP MCP default bind flipped to 127.0.0.1.
Previously, HTTP MCP requests extracted x-switchbot-profile / ?profile but used the value only as a rate-limit bucket key. Every tool call then resolved credentials via the process-global --profile flag in loadConfig(), so multi-tenant HTTP deployments silently collapsed all traffic onto the default account.
This change introduces src/lib/request-context.ts, a tiny AsyncLocalStorage wrapper with withRequestContext() and getActiveProfile(). loadConfig() and configFilePath() now read the active profile via getActiveProfile(), which prefers the ALS context and falls back to the CLI flag when no HTTP context is active. The HTTP handler wraps each request in withRequestContext so tool calls land in the right account.
Also rejects unknown profiles with 401 before entering MCP dispatch, so probing for valid profile names is closed off and agents get a clear error instead of a confusing credentials-missing exit.
Stdio mode is unchanged: no request context, so getActiveProfile() goes straight to the flag lookup.
Tests: tests/lib/request-context.test.ts covers concurrent isolation, nested contexts, and flag fallback.
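The shape of such a wrapper might resemble the sketch below. The function names withRequestContext and getActiveProfile come from the commit message, but their signatures, the RequestContext type, and the cliProfile parameter (standing in for the process-global --profile lookup) are assumptions.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Assumed context shape; the real file may carry more than the profile name.
interface RequestContext {
  profile: string;
}

const storage = new AsyncLocalStorage<RequestContext>();

// Runs fn with ctx visible to everything it calls, including async continuations.
export function withRequestContext<T>(ctx: RequestContext, fn: () => T): T {
  return storage.run(ctx, fn);
}

// Prefers the per-request profile; falls back to the CLI value when no
// request context is active (for example in stdio mode).
export function getActiveProfile(cliProfile?: string): string | undefined {
  return storage.getStore()?.profile ?? cliProfile;
}
```

Because each storage.run() call gets its own store, two concurrent HTTP requests see their own profile even when their awaits interleave.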
…lope
Every --json response now emits {schemaVersion:'1.1', data:...} on
success and {schemaVersion:'1.1', error:...} on failure, fulfilling
the contract documented in docs/schema-versioning.md.
- src/utils/output.ts: printJson wraps payload in {schemaVersion, data};
  handleError JSON branch wraps in {schemaVersion, error}
- src/commands/capabilities.ts: switch raw console.log to printJson
- src/commands/schema.ts: drop non-json-mode raw branch, always use printJson
- docs/schema-versioning.md: add envelope shape examples, migration guide
  from v1.x, note that batch.summary.schemaVersion is the historical
  nested location kept for back-compat
- All test files updated to unwrap .data (success) or .error (failure)
  from the parsed envelope
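To make the envelope concrete, here is a sketch of both sides: producing it and unwrapping it in a consumer. The helper names wrapData, wrapError, and unwrap are hypothetical, and the error object is reduced to a message field for brevity.

```typescript
const SCHEMA_VERSION = "1.1";

type Envelope<T> =
  | { schemaVersion: string; data: T }
  | { schemaVersion: string; error: { message: string } };

// Producer side: success and failure share the same top-level schemaVersion.
export function wrapData<T>(data: T): Envelope<T> {
  return { schemaVersion: SCHEMA_VERSION, data };
}

export function wrapError(error: { message: string }): Envelope<never> {
  return { schemaVersion: SCHEMA_VERSION, error };
}

// Consumer side: returns the payload on success, throws on an error envelope.
export function unwrap<T>(raw: string): T {
  const parsed = JSON.parse(raw) as Envelope<T>;
  if ("error" in parsed) throw new Error(parsed.error.message);
  return parsed.data;
}
```

A consumer that previously read parsed.foo would, after unwrapping, read the same field from the returned payload.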
…//events resource
- src/mqtt/client.ts: add 'disabled' to MqttState
- src/mqtt/credential.ts: new file: resolve MQTT config from
  SWITCHBOT_MQTT_HOST / USERNAME / PASSWORD env vars; returns null
  when any are absent
- src/mcp/events-subscription.ts: getState() returns 'disabled' (not
  'idle') when no client; add getRecentEvents(limit) to expose ring
  buffer for MCP resource reads
- src/commands/mcp.ts:
  - import getMqttConfig and call eventManager.initialize() on startup
    if creds present; log a warning and leave manager disabled if not
  - remove dead mqttInitialized variable
  - /ready: returns 503 + {ready:false, reason:'mqtt disabled', mqtt:'disabled'}
    when MQTT is not configured; 503 + reason:'mqtt failed' on failure
  - /metrics: add switchbot_mqtt_state{state=...} gauge (one per state)
    so dashboards can distinguish disabled/connecting/connected/failed
  - register switchbot://events MCP resource backed by the ring buffer;
    returns {state, count, events[]} snapshot when read
  - add resources:{} to server capabilities
- tests/commands/mcp-http-health.test.ts: new file covering /ready 503
  + reason, /metrics state gauge, and EventSubscriptionManager defaults
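The credential resolution and the /ready behavior described above can be sketched together. getMqttConfig is named in the commit, but its signature here (taking the environment as a parameter) is an assumption, as are the expanded variable names SWITCHBOT_MQTT_USERNAME and SWITCHBOT_MQTT_PASSWORD, the readiness helper, the 200 response body, and the reason string for states other than disabled and failed.

```typescript
interface MqttConfig {
  host: string;
  username: string;
  password: string;
}

type MqttState = "disabled" | "connecting" | "connected" | "reconnecting" | "failed";

// Returns null when any of the three variables is absent or empty.
export function getMqttConfig(env: Record<string, string | undefined>): MqttConfig | null {
  const host = env.SWITCHBOT_MQTT_HOST;
  const username = env.SWITCHBOT_MQTT_USERNAME; // assumed expansion of "USERNAME"
  const password = env.SWITCHBOT_MQTT_PASSWORD; // assumed expansion of "PASSWORD"
  if (!host || !username || !password) return null;
  return { host, username, password };
}

// Hypothetical helper mapping the MQTT state to a /ready response.
export function readiness(state: MqttState): {
  status: number;
  body: { ready: boolean; mqtt: MqttState; reason?: string };
} {
  if (state === "connected") return { status: 200, body: { ready: true, mqtt: state } };
  return { status: 503, body: { ready: false, reason: `mqtt ${state}`, mqtt: state } };
}
```

In a Node process the first function would typically be called as getMqttConfig(process.env); passing the environment in keeps the sketch easy to test.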
- src/mqtt/client.ts: delete scheduleStableEvent() and its caller in
  onConnect(); the timer body only nulled itself and never emitted
  anything. Also remove the unused stableThresholdMs field.
- src/mcp/events-subscription.ts: replace empty catch {} with
  log.debug({err, topic}, ...) so JSON parse failures on shadow payloads
  are visible at debug level instead of silently discarded; simplify the
  no-op try/rethrow in subscribe() to a direct parseFilter() call.
…qtt and events
Type safety:
- src/mqtt/client.ts: replace (err as any).code with (err as NodeJS.ErrnoException).code
- src/mcp/events-subscription.ts: import Device type and construct a Device-compatible shape instead of casting a partial object as any
New tests:
- tests/lib/idempotency.test.ts: LRU eviction, TTL expiry, concurrent same-key behavior, undefined-key passthrough, clear()
- tests/logger.test.ts: LOG_LEVEL=warn silences debug; LOG_LEVEL=debug enables it; setLogLevel/getLogLevel roundtrip
chenliuyun pushed a commit that referenced this pull request on Apr 20, 2026
The v2.4.0 release notes claimed "MCP tools mirror the tier in
meta.agentSafetyTier" but only aggregate_device_history (added in 2.5.0
work) actually exposed it. This fix adds _meta: { agentSafetyTier: <tier> }
to all other 10 MCP tool registrations, matching their CLI safety tiers
from COMMAND_META:
- list_devices, get_device_status, get_device_history, query_device_history,
list_scenes, search_catalog, describe_device, account_overview: read
- send_command, run_scene: action
Also adds tests/mcp/tool-meta.test.ts to verify every tool has _meta and
spot-check key tiers match expected values.
Fixes bug #6.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
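A check in the spirit of the test described above could be sketched like this. The ToolRegistration shape and the toolsMissingTier helper are hypothetical, and the SafetyTier union lists only the two tiers named in this commit message; the project may define others.

```typescript
type SafetyTier = "read" | "action";

// Assumed minimal view of a tool registration.
interface ToolRegistration {
  name: string;
  _meta?: { agentSafetyTier?: SafetyTier };
}

// Returns the names of tools that do not expose _meta.agentSafetyTier.
export function toolsMissingTier(tools: ToolRegistration[]): string[] {
  return tools.filter((t) => !t._meta?.agentSafetyTier).map((t) => t.name);
}
```

A test can then assert that the returned list is empty for the full registry and spot-check individual tiers.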
chenliuyun pushed a commit that referenced this pull request on Apr 20, 2026
Document every fix landed in this branch beyond the history-aggregate feature: bugs #1, #4, #5, #6, #8, #9, #10, #11, #12, #13, #14, #15, #16, #17, #18 from the OpenClaw v2.4.0 smoke-test report. Call out the deferred items (#2, #7) explicitly so readers don't assume they were overlooked.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
This PR bumps 1.3.2 → 2.0.0 and lands four phases of improvements plus a set of contract fixes discovered during review.
Breaking changes
- JSON envelope: every --json response is now {schemaVersion:'1.1', data:...} or {schemaVersion:'1.1', error:...}. Consumers that read parsed.foo must now read parsed.data.foo (or parsed.error on failure).
- batch.failed[].error shape: string → ErrorPayload object. Read .message for the old string content; use .transient / .retryAfterMs for retry decisions.
- HTTP MCP default bind: 0.0.0.0 → 127.0.0.1. Pass --bind 0.0.0.0 --auth-token <token> to restore external reachability.
New features (Phases A–I, already on branch since v1.3.2)
- … (kind, transient, retryAfterMs, errorClass)
- account_overview MCP tool for agent cold-start
- … (/healthz, /ready), metrics (/metrics), structured logging (pino)
Contract fixes (this fixup wave)
- … AsyncLocalStorage: multi-tenant HTTP now actually routes each request to the correct SwitchBot account.
- … {schemaVersion, data|error} envelope (the breaking change). All ~20 printJson callsites now wrap automatically.
- EventSubscriptionManager properly initialized from env vars (SWITCHBOT_MQTT_*); /ready returns 503 + reason:'mqtt disabled' when MQTT creds absent; /metrics adds switchbot_mqtt_state{state=...} gauge; switchbot://events MCP resource registered.
- scheduleStableEvent timer removed; swallowed JSON parse errors in MQTT shadow handler replaced with log.debug; no-op try/rethrow removed.
- … (NodeJS.ErrnoException, Device shape); new tests for IdempotencyCache, logger, EventSubscriptionManager defaults, /ready + /metrics health endpoints.
Migration guide (consumers of the JSON output)
| Before | After |
| --- | --- |
| parsed.foo | parsed.data.foo |
| parsed.error (on stderr) | parsed.error.message etc. |
| parsed.failed[i].error (string) | parsed.failed[i].error.message |
| mcp serve binds 0.0.0.0 | mcp serve binds 127.0.0.1; add --bind 0.0.0.0 --auth-token $T |
Test plan
- npm run build: clean TypeScript compile, zero errors
- npm test: 685 tests pass (41 test files)
- … parsed.data.*)
- tests/commands/mcp-http-health.test.ts: /ready 503, /metrics state gauge, EventSubscriptionManager defaults
- tests/lib/idempotency.test.ts, tests/logger.test.ts
🤖 Generated with Claude Code