Skip to content

release: flashduty skill v1.3.15 — generated cards + withdraw from skills.sh#57

Merged
ysyneu merged 21 commits into
mainfrom
feat/ai-sre
Jun 21, 2026
Merged

release: flashduty skill v1.3.15 — generated cards + withdraw from skills.sh#57
ysyneu merged 21 commits into
mainfrom
feat/ai-sre

Conversation

@ysyneu

@ysyneu ysyneu commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Cuts v1.3.15. Ships the skilldoc generated-cards skill system (#56): __dump-commands cobra-tree oracle, harvester/validator/generator, 19 domain reference cards, the incident-summary.sh compound-flow script — and withdraws the public skills from skills.sh (deletes the 10 superseded drifted per-domain skills, de-advertises both READMEs, marks the consolidated flashduty skill hidden: true). CLI commands unchanged. Authorized by Boss (发版cli).

ysyneu added 21 commits June 20, 2026 11:06
The validator harvests fduty references from inline backtick spans as well as fenced blocks. Prose mentions — a bare `fduty` word or a templated `fduty <group> <verb>` — resolved to an empty/placeholder command path and were wrongly flagged unknown-command. Skip an unresolved example whose command-path words are empty or contain a placeholder, mirroring the existing placeholder tolerance for flag values. A non-placeholder name that still doesn't resolve (e.g. statuspage) is still reported — catching command-name drift is the validator's job.
…rated reference)

First card of the generated-cards skill: a lean SKILL.md router (shared 3-layer model, toon/empty-result/safety conventions, domain index) plus reference/status-page.md. Its factual fence (every verb + flags + enums + --data body, synced from the cobra tree via make gen-cards) is wrapped by hand-written routing, hot flow, status values, gotchas, and a worked example. make check-cards green.
cligen folds a single required *_id field into a positional argument (Use becomes e.g. "change-active-list <page-id>", Args=requireExactArg); the same-named --flag stays registered but passing it WITHOUT the positional fails the binary's Args check ("missing page_id"). The oracle discarded Use (Path uses c.Name(), which strips the placeholder), so the generator rendered these as --page-id flags — teaching an invocation that fails at runtime. Live E2E caught this; the validator can't, since the flag genuinely exists.

Capture c.Use in the dump and render folded fields as positionals: the verb heading carries the signature (### change-active-list <page-id>) and the field shows as a (positional, required) row instead of a --flag. Commands needing two required ids (change-info, change-delete, change-timeline-*) don't fold and correctly keep --page-id/--change-id flags. Normalization matches plural array wires (incident-ids) to singular placeholders (<incident-id>).
Regenerate the status-page fence (now positional-aware) and fix the curated hot-flow + worked example to pass page-id positionally on change-create/change-active-list/change-list, while keeping --page-id/--change-id flags on the two-id timeline verbs. Add a positional-args convention note to SKILL.md and a leading gotcha. make check-cards green.
… validated as fduty flags

An example like `fduty member list --json | jq --argjson ids …` tokenized the whole line, attributing jq's --argjson to the fduty command (unknown-flag false positive). Truncate the invocation at the first standalone shell operator (| || && ; > …); tokens after it are a different command. Cross-domain cards pipe to jq heavily, so this is the shared chokepoint.
…l domains

Cards for incident, monit, channel, enrichment, insight, alert, template, role, rum, team, schedule, member, calendar, field, route, oncall — each a lean curated router (route-here, intent->verb, hot flow, gotchas, worked example) wrapping a generated factual fence (every verb + flags + enums + positional + --data, synced from the cobra tree). Authored by per-domain agents from the dump (truth) plus harvested judgment from the old fc-safari bootstrap references and cli per-domain skills; agents used the dump to override drifted source command names. SKILL.md domain index now routes all 17 domains. make check-cards green (two issues caught + fixed: member-list --name->--query drift in team.md; the harvester pipe false-positive).
…terminism note

monit-query (datasource-side RCA: diagnose log_patterns/metric_trends, raw rows passthrough) and monit-agent (on-box host/db diagnostics: catalog->invoke up to 8 tools) are the diagnostic sibling groups the old fc-safari SKILL.md cited as deeper probes; they were excluded from the bulk rollout. Add lean cards (discover-datasource-first, time-in-query-not-flags, transient-5xx-retry, catalog-before-invoke, heredoc for SQL params) + index rows. Add a SKILL.md note: fduty answers config/RBAC/enrichment/monit/oncall questions directly — don't grep docs or dispatch a subagent for what the CLI covers. check-cards green (19 cards).
…ichment/route/oncall cards

Live cross-domain eval caught three judgment-layer errors the static validator can't (wrong data PROVENANCE in prose, not wrong command syntax):

- enrichment + route claimed integration-id comes from `channel list`. It does not — channel list/info surface channel_id, never an integration id. A real integration_id comes from `alert list` (every alert carries integration_id + integration_name). The eval agent fed channel ids to enrichment and brute-forced 47 of them → 400 'Integration not found'. Fixed the provenance + added a 'don't enumerate every id' guard.
- oncall's name-resolution example assumed `member list --json` is a top-level array keyed by person_id; it is actually .items[] keyed by member_id (+ member_name). Replaced with a correct single-pass join (capture member list, join person_ids→member_name) and a 'don't loop refining jq; person_ids is a fine answer' guard.
…scalar/array folding

From the post-implementation quality gates (adversarial bug review + simplify pass):

- BUG 1 (validator false negative, the high-value one): a field cligen folds into
  a required positional (e.g. status-page change-active-list <page-id>) keeps its
  same-named flag registered, so the old validator waved through `--page-id 5`
  even though the binary rejects it at the Args check. This is the exact class of
  error that previously only a live E2E run caught. Now that the oracle carries
  cobra's Use, thread the folded set into commandIndex and report it statically as
  kind "positional-as-flag". check-cards now enforces the positional convention
  across all cards and prevents regressions.
- BUG 2 (generator over-suppression): foldedFlagNames matched on a trailing-"s"-
  stripped key, so an unrelated plural flag (--types) collided with a scalar
  <type> positional and was dropped from the docs. Replaced with exact matching:
  a scalar positional folds its exact flag; an array positional (<incident-id>
  [<id2>...]) folds the plural *-ids wire. Verified the suppression set is
  identical for every real command (no fence drift; all 19 cards stay fresh).
- BUG 4 + simplify: removed dead ["/"] stripping in placeholderInner (only <...>
  tokens reach it), deleted the now-unused normalizeID, and inlined the
  single-caller commandWords helper.

Tests: added TestValidate_FoldedPositionalAsFlag (folded-as-flag flagged; positional
usage and a non-folding two-id command stay clean) and
TestFoldedFlagNames_ExactScalarAndArrayPlural. Full suite + check-cards green.
A live Chinese-terminology eval (12 real product-term queries) confirmed routing is correct end-to-end, but exposed a vocabulary trap worth hardening: Flashduty's product calls a `channel` a "协作空间" (collaboration space), not the naive "频道". SKILL.md's domain index already maps 协作空间→channel (so all 12 queries routed correctly), but the channel card's own "Route here when" trigger list omitted it. Add 协作空间 + an explicit "协作空间 IS the channel noun" note so the card is self-consistent and robust if read directly.
… fallback; fix war-room verbs

Addresses the pre-PR review findings + the dropped Safari-runtime line:

- gen-cards (review Bug 1+2): the Makefile target and the stale-fence error hint both pointed at `make gen-cards`, but that only regenerated status-page — drift in any other card was unfixable via the documented command. `skilldoc gen` now takes an optional group: with none it regenerates every dump group that has a card (set derived from the dump ∩ existing cards, no hardcoded list; cardless groups like webhook are skipped). Makefile gen-cards drops the status-page arg.
- incident.md (review Bug 3): the war-room intent-table row wrote `war-room list` / `war-room create` (space) but the real leaf verbs are fully hyphenated `war-room-list` / `war-room-create`. The validator can't catch this (the span isn't fduty-led), so it was a latent wrong-invocation. A full scan of all cards confirms this was the only such case.
- SKILL.md: restore the `fduty: command not found` CDN install fallback (dropped in the lean rewrite) under Auth & availability — generic, one line, actionable.

New test TestRunGenAll_FillsEveryCardAndSkipsCardless. Full suite + check-cards green.
… errcheck

Two CI failures on PR #56, both real:

- build (windows-latest): `skilldoc check` reported all 19 cards stale. Root cause:
  the fence freshness check compares the card's fence block byte-for-byte against a
  generated (LF) render, but Windows git checks the cards out with CRLF, so every
  comparison mismatched. Fix the root cause — normalize CRLF→LF on read (the
  generated fence is canonical LF) so the tool is insensitive to checkout settings —
  and add a .gitattributes pinning the cards to LF so the committed files stay
  canonical. ubuntu/macos were already green.
- lint (golangci-lint / errcheck): two unchecked fmt.Fprint* return values in
  main.go (the 'cards OK' line and the per-issue print loop). Propagate both errors.

New regression test TestRunCheck_CRLFCardIsClean. Local: go test green, golangci-lint
0 issues, check-cards OK.
…index, fault-analysis script

Rework the flashduty skill per review feedback:

- SKILL.md is now a clean PUBLIC router: strip Safari-internal behavior
  (credential injection, subagent dispatch, the `task` tool). Only the
  reference/*.md cards and scripts/ are shared with the Safari embed; each
  consumer owns its own SKILL.md.
- Merge the `oncall` domain into `schedule` (one on-call / 值班 domain).
  `oncall who` is kept as the deprecated current-on-call lookup; the durable
  path is schedule-native `info --start now`. Drop reference/oncall.md.
- Add reference/change.md (the `change` group was previously uncovered).
- Make the domain index bilingual (Chinese + English routing terms per row).
- Add scripts/incident-summary.sh: one call fetches all six fault-analysis
  aspects (detail/alerts/timeline/similar/post-mortem/changes) so a compound
  summary is written from real output instead of fabricated piecemeal.
…main skills

The cli publicly advertised 10 hand-written per-domain skills via `npx skills add` (README + skills.sh). They are superseded by the consolidated, oracle-validated `skills/flashduty/` skill and had drifted from the binary (`flashduty-statuspage` taught `statuspage create-incident`, which does not exist).

Until the consolidated skill is hardened for public release (anti-fabrication guard + evals-in-CI, like vercel-labs/agent-browser), withhold it from public discovery instead of shipping it half-ready: delete the 10 superseded per-domain skill trees; remove the "Agent Skills" advertisement from README.md + README_zh.md; mark skills/flashduty/SKILL.md hidden:true (kept in-repo as Safari's source-of-truth + the skilldoc validator target).

The consolidated skill stays and validates clean (`make check-cards`). Safari embeds its own copy unchanged; reference/ + scripts/ remain byte-identical.
feat(skilldoc): generated+validated flashduty skill — lean SKILL.md router + 19 domain cards
@ysyneu ysyneu merged commit 6a9294e into main Jun 21, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant