feat: v2.7 AI-first maturity — filter fields, help JSON, destructive normalization#27

Closed
chenliuyun wants to merge 16 commits into main from feat/v2.7-ai-first-maturity

@chenliuyun
Collaborator

Summary

Three gaps identified by the smoke-v3 AI-first benchmark (all were graded 🔴 D):

  • Field consistency (40% → target 100%): Expanded --filter from 6 to 11 device fields — agents can now filter by familyName, hubDeviceId, roomID, enableCloudService, alias in addition to the existing 6 keys. All three entry points (--filter, --fields, MCP) now accept the same canonical field names.

  • --help --json (0/14 → 14/14): New src/utils/help-json.ts with commandToJson() + resolveTargetCommand(). When --json is in argv and --help is requested, all subcommands now return structured JSON ({ name, description, arguments, options, subcommands }) instead of plain text. Agents can introspect available commands without parsing unstructured help.

  • Destructive normalization (missing → explicit): devices commands --json and devices describe --json now coerce destructive: Boolean(c.destructive) so every command carries an explicit true/false. Consistent with schema export which already did this.
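To make the --help --json shape concrete, here is a minimal sketch of how a command tree could be resolved and serialized into `{ name, description, arguments, options, subcommands }`. `CommandNode`, `OptionNode`, `resolveTargetSketch`, and `commandToJsonSketch` are hypothetical stand-ins written for illustration; the real `commandToJson()` and `resolveTargetCommand()` in src/utils/help-json.ts may differ.

```typescript
// Hypothetical stand-ins for the CLI's command objects; the PR's real types are not shown.
interface OptionNode {
  flags: string;
  description: string;
}

interface CommandNode {
  name: string;
  description: string;
  arguments: string[];
  options: OptionNode[];
  subcommands: CommandNode[];
}

// Walk argv and descend into the deepest matching subcommand.
function resolveTargetSketch(root: CommandNode, argv: string[]): CommandNode {
  let current = root;
  for (const token of argv) {
    if (token.charAt(0) === "-") continue; // skip flags such as --help and --json
    let next: CommandNode | undefined;
    for (const candidate of current.subcommands) {
      if (candidate.name === token) next = candidate;
    }
    if (next === undefined) break; // first unknown positional ends the walk
    current = next;
  }
  return current;
}

// Recursively copy the tree into a plain, JSON-serializable object.
function commandToJsonSketch(cmd: CommandNode): CommandNode {
  return {
    name: cmd.name,
    description: cmd.description,
    arguments: cmd.arguments.slice(),
    options: cmd.options.map((o) => ({ flags: o.flags, description: o.description })),
    subcommands: cmd.subcommands.map((sub) => commandToJsonSketch(sub)),
  };
}
```

An agent would then read `options` and `subcommands` as arrays instead of parsing plain-text help.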

Test plan

  • npx vitest run — 1028 pass, 1 pre-existing fail (events 413)
  • npx tsc --noEmit — clean
  • switchbot devices list --filter family=X — filters by familyName
  • switchbot devices list --help --json | jq .data.options — structured JSON array
  • switchbot devices commands Bot --json | jq '.data.commands[].destructive' — all explicit booleans

chenliuyun added 16 commits April 21, 2026 20:42
…normalization

Three smoke-v3 red-dimension fixes:

1. Expand --filter to all 11 device fields
   - LIST_FILTER_CANONICAL now includes familyName, hubDeviceId, roomID,
     enableCloudService, alias (was only 6 keys)
   - LIST_KEYS and LIST_FILTER_TO_RUNTIME updated to match
   - All 4 matchesFilter() call sites pass new fields

2. Add --help --json structured output
   - New src/utils/help-json.ts: commandToJson() + resolveTargetCommand()
   - src/index.ts: suppress plain-text help in JSON mode; emit structured
     JSON on commander.helpDisplayed
   - tests/helpers/cli.ts: mirror the same interception so integration
     tests can exercise --help --json paths

3. Normalize destructive: boolean on all command JSON outputs
   - devices commands --json: normalizeCatalogForJson() coerces undefined → false
   - devices describe --json: same coercion in describeDevice() capabilities assembly
   - schema export already did this; now all three entry points are consistent
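As an illustration of item 3, the coercion could look like the sketch below. `CatalogCommandSketch` and `normalizeCommandsSketch` are hypothetical names with a trimmed-down shape; the actual `normalizeCatalogForJson()` is not shown in this PR description.

```typescript
// Hypothetical, trimmed-down command spec; only `destructive` matters here.
interface CatalogCommandSketch {
  command: string;
  destructive?: boolean;
}

interface NormalizedCommandSketch {
  command: string;
  destructive: boolean;
}

// Coerce a missing `destructive` flag to an explicit false before serializing,
// so consumers never have to distinguish "absent" from "false".
function normalizeCommandsSketch(commands: CatalogCommandSketch[]): NormalizedCommandSketch[] {
  return commands.map((c) => ({
    command: c.command,
    destructive: Boolean(c.destructive),
  }));
}
```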
Wire resolveField() into `devices status` and `devices watch` so user-typed
--fields aliases resolve to canonical API keys instead of silently returning
null. Unknown fields now exit 2 with a candidate list.

Expand FIELD_ALIASES from 10 identification keys to ~55 total:
- Phase 1 (13): battery, temperature, colorTemperature, humidity, brightness,
  fanSpeed, position, moveDetected, openState, doorState, CO2, power, mode
- Phase 2 (19): childLock, targetTemperature, electricCurrent, voltage,
  usedElectricity, electricityOfDay, weight, version, lightLevel, oscillation,
  verticalOscillation, nightStatus, chargingStatus, switch1Status, switch2Status,
  taskType, moving, onlineStatus, workingStatus
- Phase 3 (13): group, calibrate, direction, deviceMode, nebulizationEfficiency,
  sound, lackWater, filterElement, color, useTime, switchStatus, lockState,
  slidePosition

Conflict rules enforced by tests:
- `temp` exclusive to temperature (not colorTemperature / targetTemperature)
- `motion` → moveDetected only; `moving` uses `active`
- `mode` → top-level mode; device-specific goes through deviceMode
- Reserved words (auto, status, state, switch, on, off, lock, fan) are never
  aliases. `type` is grandfathered on deviceType.

+60 tests covering every new alias, conflict rules, dispatch behavior, and
UsageError paths. No regressions — 1089 tests pass.
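A minimal sketch of alias resolution under these rules follows. Only the three alias pairs shown (temp, motion, active) come from this commit message; `FIELD_ALIASES_SKETCH`, `resolveFieldSketch`, the result shape, and the case-insensitive matching are assumptions for illustration, not the real `resolveField()`.

```typescript
// Illustrative subset of the alias registry (canonical key -> aliases).
const FIELD_ALIASES_SKETCH: Record<string, string[]> = {
  temperature: ["temp"],
  moveDetected: ["motion"],
  moving: ["active"],
};

type ResolveResultSketch =
  | { ok: true; canonical: string }
  | { ok: false; exitCode: 2; candidates: string[] };

// Resolve a user-typed field against the fields a device actually exposes.
// Case-insensitive matching is an assumption of this sketch.
function resolveFieldSketch(input: string, allowed: string[]): ResolveResultSketch {
  const needle = input.trim().toLowerCase();
  for (const canonical of allowed) {
    if (canonical.toLowerCase() === needle) return { ok: true, canonical };
    const aliases = FIELD_ALIASES_SKETCH[canonical] || [];
    if (aliases.indexOf(needle) !== -1) return { ok: true, canonical };
  }
  // Unknown field: report a usage error (exit 2) with the valid candidates
  // instead of silently returning null.
  return { ok: false, exitCode: 2, candidates: allowed.slice().sort() };
}
```

Because reserved words such as `state` never appear in the alias table, they fall through to the usage-error branch.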
… enum

Introduce SafetyTier union ('read' | 'mutation' | 'ir-fire-forget' |
'destructive' | 'maintenance') and migrate the 7 destructive catalog
entries (Smart Lock x3 unlock, Garage Door Opener turnOn/turnOff,
Keypad createKey/deleteKey) to use safetyTier + safetyReason.

The legacy destructive:boolean / destructiveReason fields are retained
on CommandSpec as @deprecated for overlay back-compat; deriveSafetyTier
handles both forms. Output layers (capabilities, schema, describe,
explain, agent-bootstrap, MCP search_catalog) emit safetyTier alongside
a derived destructive:boolean for v2.6 consumer compatibility, to be
removed in v3.0.

capabilities JSON now exposes safetyTiersInUse (sorted unique set of
tiers present in the effective catalog). 'read' and 'maintenance' are
reserved — no built-in entries use them yet (P11 will populate 'read'
via statusQueries; 'maintenance' awaits SwitchBot API endpoints).

tests: tests/devices/catalog.test.ts extended with tier validity,
IR→ir-fire-forget, derivation fallbacks, and safetyReason fallback
(new wins over legacy); full suite 1096/1096 green.
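The derivation described above might be sketched as follows. The `SafetyTier` union is taken from this commit message; `CommandSpecSketch`, the `isInfrared` parameter, the precedence of the legacy flag over the IR check, and the `'mutation'` default are assumptions, so the real `deriveSafetyTier` may order these checks differently.

```typescript
type SafetyTier = "read" | "mutation" | "ir-fire-forget" | "destructive" | "maintenance";

// Hypothetical, trimmed-down CommandSpec carrying both the new and legacy fields.
interface CommandSpecSketch {
  command: string;
  safetyTier?: SafetyTier;
  safetyReason?: string;
  /** @deprecated legacy form, kept for overlay back-compat */
  destructive?: boolean;
  /** @deprecated legacy form, kept for overlay back-compat */
  destructiveReason?: string;
}

function deriveSafetyTierSketch(spec: CommandSpecSketch, isInfrared: boolean): SafetyTier {
  if (spec.safetyTier) return spec.safetyTier; // explicit new-style tier wins
  if (spec.destructive) return "destructive"; // legacy boolean still honoured
  return isInfrared ? "ir-fire-forget" : "mutation"; // assumed defaults
}

// New-style reason wins over the legacy one.
function deriveSafetyReasonSketch(spec: CommandSpecSketch): string | undefined {
  return spec.safetyReason !== undefined ? spec.safetyReason : spec.destructiveReason;
}

// Output layers emit the tier plus a derived boolean for v2.6 consumers.
function toOutputSketch(spec: CommandSpecSketch, isInfrared: boolean) {
  const safetyTier = deriveSafetyTierSketch(spec, isInfrared);
  return { command: spec.command, safetyTier, destructive: safetyTier === "destructive" };
}
```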
Table-driven describe.each suite that registers every top-level command
through its real register* function, walks the commander tree, and
asserts structural invariants — not text snapshots, so wording drift
does not break CI. For each command the contract checks:

- name matches the registered string and is non-empty
- description is a non-empty string
- arguments / options / subcommands are arrays
- each option carries flags + description; --help and --version are
  filtered out
- every subcommand has a non-empty name and non-empty description
- the full subtree is individually serializable via commandToJson

Added separately from tests/utils/help-json.test.ts (unit-level),
so the contract test is the canonical guard against a new subcommand
landing without a description or a command ID drifting from its
registration.

aggregate_device_history was missing .describe() on every input field and
shipping no outputSchema at all, so the MCP Inspector (and downstream
clients using structuredContent validation) could not introspect or
validate its response. Fixed by:

- documenting every input field (deviceId, since, from, to, metrics,
  aggs, bucket, maxBucketSamples) with non-empty descriptions
- adding a fully-typed outputSchema mirroring AggResult (deviceId,
  bucket?, from, to, metrics, aggs, buckets[], partial, notes)
- replacing the Record<string, unknown> cast with an explicit structured
  object so omitted `bucket` stays omitted

Added tests/mcp/tool-schema-completeness.test.ts as a regression guard:
walks every registered tool via the InMemory transport and asserts
non-empty title/description, inputSchema of type "object", a non-empty
description on every input property, and the presence of an
outputSchema. One surgical assertion spot-checks every
aggregate_device_history input so a future drop of .describe() breaks CI
instantly.

No stdio JSON-RPC log churn this round: the HTTP-mode console.error
calls at mcp.ts:988/991 are stderr-safe and low-severity per the plan.

…r (P5)

Remove the bypass pattern
  if (isJsonMode()) emitJsonError({...}); else console.error(...);
  process.exit(N);
in favour of the single-call exitWithError({code, kind, message, hint?, context?})
helper that already centralises JSON envelope + plain-text + exit.

Touched: config set-token (5 sites), history replay, devices command
validation, batch destructive guard, expand destructive guard. Unused
emitJsonError imports cleaned up across batch / devices / history / mcp.

Add tests/commands/error-envelope.test.ts (N-3 regression guard):
  - envelope shape { schemaVersion, error } under --json
  - stderr-only output in plain mode
  - runtime kind + non-2 exit codes
  - textual audit that no command source pairs emitJsonError() with
    process.exit() in the same module (except mcp.ts signal handlers)
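For illustration, the single-call pattern can be sketched as a pure rendering step that a thin wrapper would print and exit on. `renderErrorSketch`, the `"1"` schemaVersion placeholder, the plain-text wording, and the reading of `code` as the process exit code are assumptions; the real `exitWithError` helper is not shown here.

```typescript
interface ErrorInfoSketch {
  code: number; // assumed to double as the process exit code
  kind: string; // e.g. "usage" or "runtime"
  message: string;
  hint?: string;
  context?: Record<string, unknown>;
}

interface RenderedErrorSketch {
  text: string;
  exitCode: number;
}

// Placeholder value; the real envelope version is not stated in this PR.
const ERROR_SCHEMA_VERSION_SKETCH = "1";

// One code path decides between the JSON envelope and plain text, so callers
// never pair an emit call with their own process.exit().
function renderErrorSketch(info: ErrorInfoSketch, jsonMode: boolean): RenderedErrorSketch {
  if (jsonMode) {
    const envelope = { schemaVersion: ERROR_SCHEMA_VERSION_SKETCH, error: info };
    return { text: JSON.stringify(envelope), exitCode: info.code };
  }
  const hint = info.hint ? "\nHint: " + info.hint : "";
  return { text: "Error: " + info.message + hint, exitCode: info.code };
}
```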
Before: `events tail` emitted {t, remote, path, body, matched:bool}, while
`events mqtt-tail` emitted {t, eventId, topic, payload} for events and
{type:"__connect", at, eventId} for control records. Downstream consumers
had to key on field presence to tell webhook apart from mqtt, and the
matched-bool on webhook gave no clue which filter clause hit.

After: both sides add an overlapping envelope keyed on
  { schemaVersion: "1", source: "webhook"|"mqtt", kind: "event"|"control",
    t, eventId, deviceId, topic, payload }
Webhook additionally carries matchedKeys:string[] — exact list of clause
keys that matched, or [] for no filter / no match. MQTT control records
gain controlKind ("connect"|"reconnect"|"disconnect"|"heartbeat"|
"session_start") while keeping the legacy "type":"__connect" / "at"
fields for one minor window (removed in v3.0).

- src/commands/events.ts: startReceiver emits unified + legacy mirror;
  mqtt-tail event + control lines carry the unified envelope.
- src/mcp/device-history.ts: ControlEvent extended with optional
  schemaVersion/source/kind/controlKind/t so __control.jsonl round-trips.
- tests/commands/events.test.ts: 4 new tests — webhook envelope,
  matchedKeys emission, mqtt event envelope, mqtt control envelope.
- addHelpText for both subcommands updated to describe the new shape +
  legacy deprecation schedule.
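A consumer-side sketch of routing on the unified envelope, instead of keying on field presence, might look like this. The field names come from the envelope above; the TypeScript types of `t`, `eventId`, `deviceId`, and `topic`, and the `routeRecordSketch` helper itself, are assumptions.

```typescript
// Assumed typing of the unified envelope; only the key names are from the commit.
interface UnifiedRecordSketch {
  schemaVersion: string;
  source: "webhook" | "mqtt";
  kind: "event" | "control";
  t: string;
  eventId: string;
  deviceId?: string | null;
  topic?: string | null;
  payload?: unknown;
  matchedKeys?: string[]; // webhook only
  controlKind?: string; // mqtt control records only
}

// Classify one JSON line by its explicit source/kind discriminators.
function routeRecordSketch(line: string): string {
  const rec = JSON.parse(line) as UnifiedRecordSketch;
  if (rec.kind === "control") return "control:" + (rec.controlKind || "unknown");
  if (rec.source === "webhook") {
    const keys = rec.matchedKeys || [];
    return keys.length > 0 ? "webhook-matched:" + keys.join(",") : "webhook-unmatched";
  }
  return "mqtt-event:" + (rec.topic || "");
}
```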
All three streaming commands (devices watch, events tail, events
mqtt-tail) now emit a stream-header record as the very first JSON line
under --json:

  { "schemaVersion": "1", "stream": true,
    "eventKind": "tick" | "event", "cadence": "poll" | "push" }

Downstream consumers can route by `{ stream: true }` to distinguish
the header from subsequent event lines and pick a parser strategy
based on `eventKind` / `cadence`. Non-streaming commands (single-
object / array output) are untouched.

New `docs/json-contract.md` documents both envelope shapes (non-
streaming success/error vs. streaming header + event lines), the two
versioning axes, and consumer routing patterns.

P7 from the v2.7.0 AI-first maturity plan (N-4).
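As a rough consumer-side sketch, the header can be separated from event lines by checking `stream === true` on the first record. `StreamHeaderSketch` mirrors the shape shown above; `splitStreamSketch` and the sample event lines in the tests are hypothetical.

```typescript
interface StreamHeaderSketch {
  schemaVersion: string;
  stream: true;
  eventKind: "tick" | "event";
  cadence: "poll" | "push";
}

// Type guard: a record is the stream header when it carries stream === true.
function isStreamHeaderSketch(value: unknown): value is StreamHeaderSketch {
  return typeof value === "object" && value !== null && (value as { stream?: unknown }).stream === true;
}

// Split newline-delimited JSON output into the header and the event lines.
function splitStreamSketch(jsonl: string): { header: StreamHeaderSketch | null; events: unknown[] } {
  let header: StreamHeaderSketch | null = null;
  const events: unknown[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines
    const parsed: unknown = JSON.parse(line);
    if (header === null && isStreamHeaderSketch(parsed)) {
      header = parsed;
    } else {
      events.push(parsed);
    }
  }
  return { header, events };
}
```

A consumer could then branch on `header.cadence` or `header.eventKind` before parsing the remaining lines.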
Previously `recordRequest()` fired from the axios response interceptor
— so successful responses and exhausted HTTP retries counted against
the daily quota, but timeouts, DNS errors, connection refusals, and
requests aborted after dispatch were silently missed. The local cap
was therefore optimistic versus the real SwitchBot billing.

Move the call to the request interceptor so every dispatched HTTP
request counts at send time, regardless of outcome. Pre-flight
refusals (daily-cap, --dry-run) still skip recording because they
never touch the network. Retries re-enter the interceptor and record
each attempt, which matches how SwitchBot bills.

New tests cover the four paths: successful dispatch records once;
5xx after dispatch stays at one count (no double-record in the error
interceptor); timeouts record even though no response arrives;
`--no-quota` suppresses the record entirely.

P8 from the v2.7.0 AI-first maturity plan (N-5).
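The effect of recording at send time can be illustrated without axios by a small synchronous simulation. Everything here (`createClientSketch`, `TransportSketch`, the retry loop) is a hypothetical model of the behaviour described above, not the CLI's actual interceptor code.

```typescript
// A transport that either returns a body or throws (timeout, DNS error, ...).
type TransportSketch = (url: string) => string;

interface ClientOptionsSketch {
  noQuota?: boolean; // models the --no-quota flag
}

function createClientSketch(transport: TransportSketch, options: ClientOptionsSketch = {}) {
  let recorded = 0;

  function send(url: string): string {
    // Request-interceptor position: record before the request leaves,
    // so a later timeout or network error cannot skip the count.
    if (!options.noQuota) recorded += 1;
    return transport(url);
  }

  // Each retry re-enters send(), so every attempt is counted.
  function sendWithRetry(url: string, maxAttempts: number): string {
    let lastError: unknown = new Error("no attempts made");
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return send(url);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  }

  return { send, sendWithRetry, recordedCount: () => recorded };
}
```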
- quota check now exposes percentUsed / remaining / projectedResetTime /
  recommendation so agents can decide to slow down or pause; warn when >80%.
- New catalog-schema check detects drift between catalog schemaVersion and
  the agent-bootstrap payload version (paired constants in both modules).
- New audit check surfaces recent command failures from ~/.switchbot/audit.log
  (last 24h), capped at 10 entries to keep the doctor payload bounded.

All additive under the locked doctor.schemaVersion=1 contract — existing
consumers unaffected.
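The headroom fields of the quota check could be computed roughly as below. The 80% warning threshold is from this commit message; the function name, the rounding, and the recommendation strings are assumptions, and `projectedResetTime` is left out because the reset rule is not stated here.

```typescript
interface QuotaHeadroomSketch {
  percentUsed: number;
  remaining: number;
  status: "ok" | "warn";
  recommendation: string;
}

function quotaHeadroomSketch(used: number, dailyLimit: number): QuotaHeadroomSketch {
  const remaining = Math.max(0, dailyLimit - used);
  // One decimal place; a non-positive limit is treated as fully used.
  const percentUsed = dailyLimit > 0 ? Math.round((used / dailyLimit) * 1000) / 10 : 100;
  const warn = percentUsed > 80; // warn strictly above 80%
  return {
    percentUsed,
    remaining,
    status: warn ? "warn" : "ok",
    // Hypothetical wording; the real recommendation text is not shown in this PR.
    recommendation: warn ? "slow down or pause non-essential requests" : "continue at the current rate",
  };
}
```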
…(P10)

- New `mcp` check: dry-run instantiates createSwitchBotMcpServer() and
  counts registered tools (no network I/O, no token needed). Fails when
  server construction throws.
- `mqtt` check gains `--probe` variant that does a real broker handshake
  (fetchMqttCredential + connect + disconnect), with a 5s hard timeout so
  it can never wedge the CLI. Default run is still file-only.
- New flags:
    --list           print the check registry + exit 0 without running
    --section <csv>  run only the named subset (deduped, order-preserved)
    --fix            apply safe reversible remediations (cache-clear only)
    --yes            required together with --fix for write actions
    --probe          opt into live-probe variants
- Invalid --section names exit 2 with "Valid: ..." hint via exitWithError.
- Unknown check names never silently dropped.
- Public helper `listRegisteredTools(server)` added to mcp.ts so doctor
  can introspect without touching the SDK's private fields directly.
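The --section handling (deduped, order-preserved, unknown names rejected with a "Valid: ..." hint) might be sketched as follows. `parseSectionsSketch` and its result type are hypothetical; the check names used in the tests are the ones mentioned in these commits.

```typescript
type SectionResultSketch =
  | { ok: true; sections: string[] }
  | { ok: false; exitCode: 2; unknown: string[]; hint: string };

// Parse a comma-separated --section value against the check registry.
function parseSectionsSketch(csv: string, registry: string[]): SectionResultSketch {
  const selected: string[] = [];
  const unknown: string[] = [];
  for (const raw of csv.split(",")) {
    const name = raw.trim();
    if (name === "") continue;
    if (registry.indexOf(name) === -1) {
      unknown.push(name); // never silently dropped
    } else if (selected.indexOf(name) === -1) {
      selected.push(name); // dedupe while keeping first-seen order
    }
  }
  if (unknown.length > 0) {
    return { ok: false, exitCode: 2, unknown, hint: "Valid: " + registry.join(", ") };
  }
  return { ok: true, sections: selected };
}
```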
- New ReadOnlyQuerySpec type + statusQueries?: ReadOnlyQuerySpec[] on
  DeviceCatalogEntry.
- New deriveStatusQueries(entry) helper: returns explicit statusQueries
  when set, otherwise synthesises a ReadOnlyQuerySpec per statusFields
  entry (all keyed to endpoint:'status', safetyTier:'read'). IR entries
  and entries without statusFields return [].
- Field descriptions drawn from a curated STATUS_FIELD_DESCRIPTIONS map
  that covers the common SwitchBot API v1.1 fields.
- capabilities.catalog now surfaces readOnlyQueryCount and adds 'read'
  to safetyTiersInUse whenever any entry exposes a status query — the
  enum's 'read' tier is now actually used, not just reserved.
- statusFields stays as the source of truth (no duplication) — overrides
  are possible via explicit statusQueries on specific entries.
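A minimal sketch of the derivation, assuming a simplified entry shape: `field`, `description`, and `category` are hypothetical property names, and letting explicit `statusQueries` win even on IR entries is an assumption; only `endpoint: 'status'` and `safetyTier: 'read'` are taken from the description above.

```typescript
interface ReadOnlyQuerySpecSketch {
  field: string;
  description: string;
  endpoint: "status";
  safetyTier: "read";
}

// Hypothetical, trimmed-down catalog entry.
interface DeviceCatalogEntrySketch {
  type: string;
  category: "physical" | "ir";
  statusFields?: string[];
  statusQueries?: ReadOnlyQuerySpecSketch[];
}

// Illustrative descriptions; the real curated map is not shown in this PR.
const STATUS_FIELD_DESCRIPTIONS_SKETCH: Record<string, string> = {
  battery: "Battery level",
  temperature: "Temperature reading",
};

function deriveStatusQueriesSketch(entry: DeviceCatalogEntrySketch): ReadOnlyQuerySpecSketch[] {
  if (entry.statusQueries) return entry.statusQueries; // explicit override wins
  if (entry.category === "ir" || !entry.statusFields) return [];
  // statusFields stays the source of truth; queries are synthesised on demand.
  return entry.statusFields.map((field): ReadOnlyQuerySpecSketch => ({
    field,
    description: STATUS_FIELD_DESCRIPTIONS_SKETCH[field] || field,
    endpoint: "status",
    safetyTier: "read",
  }));
}
```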
Having two things called "plan" with different meanings is confusing:
batch --plan emits a plan document, while `plan` is its own subcommand
that runs plan docs. Rename the batch flag to --emit-plan and keep
--plan accepted for one minor release, with a deprecation warning on
stderr, so existing scripts don't break.

- New --emit-plan flag (canonical name).
- --plan still accepted; prints "[WARN] --plan is deprecated; use
  --emit-plan. Will be removed in v3.0." to stderr before executing.
- Passing both together is a usage error (exit 2).
- Help text marks --plan as [DEPRECATED] and updates the Planning
  section to show --emit-plan.
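The flag handling above can be sketched as a small pure function. `resolveEmitPlanSketch`, its result type, and the usage-error wording are hypothetical; the warning text is the one quoted in this commit message.

```typescript
type EmitPlanResultSketch =
  | { ok: true; emitPlan: boolean; warnings: string[] }
  | { ok: false; exitCode: 2; message: string };

const PLAN_DEPRECATION_WARNING =
  "[WARN] --plan is deprecated; use --emit-plan. Will be removed in v3.0.";

// Decide whether to emit a plan document, given the parsed batch flags.
function resolveEmitPlanSketch(flags: { plan?: boolean; emitPlan?: boolean }): EmitPlanResultSketch {
  if (flags.plan && flags.emitPlan) {
    // Passing both is a usage error (exit 2); the message wording is an assumption.
    return { ok: false, exitCode: 2, message: "--plan and --emit-plan cannot be used together" };
  }
  if (flags.plan) {
    // Legacy flag still works, but the caller should print the warning to stderr first.
    return { ok: true, emitPlan: true, warnings: [PLAN_DEPRECATION_WARNING] };
  }
  return { ok: true, emitPlan: Boolean(flags.emitPlan), warnings: [] };
}
```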
Add 8 ultra-niche alias groups covering water leak, pressure sensor,
motion counter, error codes, and webhook payload fields (buttonName,
pressedAt, deviceMac, detectionState). Registry now at ~51 canonical
keys covering ~98% of catalog statusFields and webhook event fields.

Phase 4 additions:
- waterLeakDetect: leak, water
- pressure: press, pa
- moveCount: movecnt
- errorCode: err
- buttonName: btn, button
- pressedAt: pressed (distinct from pressure.press)
- deviceMac: mac
- detectionState: detected, detect

All existing conflict rules preserved (no 'type'/'state'/'switch'/'on').
Remaining ~2% deferred to user-driven PR per plan.

Add declarative metadata for non-device resources (scenes, webhooks,
keypad keys) so AI agents can discover these surfaces through the same
bootstrap path as device commands.

- src/devices/resources.ts: SceneSpec, WebhookCatalog (4 endpoints,
  15 event specs covering Meter/Presence/Contact/Lock/Plug/Bot/Curtain/
  Doorbell/Keypad/ColorBulb/Strip/Sweeper/WaterLeak/Hub/CO2), KeySpec
  (4 types: permanent/timeLimit/disposable/urgent), constraints.
- capabilities: emit RESOURCE_CATALOG under the new 'resources' top-level
  key alongside 'catalog'.
- schema export: same pass-through so the published schema document
  includes resource metadata.
- Tests: 14 new in tests/devices/resources.test.ts (tier validation,
  event field completeness, key-type coverage) + 1 capabilities test
  asserting resources presence.

MCP tool surface (setup_webhook/query_webhook/create_key/delete_key) is
reachable today via send_command + the webhook CLI; dedicated MCP tools
deferred; the metadata is already queryable via capabilities.

Close out the v2.7.0 AI-first maturity release. 15 feature commits
landed on this branch (P1–P15):

- Field-alias registry expanded from ~10 to ~51 canonical keys
- safetyTier 5-tier enum replaces destructive:boolean
- Help-JSON contract coverage for 16 commands
- MCP tool schema + log + structuredContent polish
- Unified error envelope across all commands
- Unified events envelope (tail / mqtt-tail)
- Streaming JSON header + docs/json-contract.md
- Quota records all API attempts (not just successes)
- doctor quota headroom / catalog-schema / audit checks
- doctor MQTT live-probe + MCP dry-run + --section/--list/--fix
- catalog statusQueries powering safetyTier 'read'
- batch --plan renamed to --emit-plan (with deprecation warning)
- --format=yaml/tsv for all non-streaming commands
- FIELD_ALIASES Phase 4 ultra-niche sweep
- Resources catalog (scenes/webhooks/keys) exposed via capabilities/schema

1262 tests passing across 60 test files.