Add a /research skill #8
Merged
Evidence-based investigation with adversarial validation. Recommends a separate /research skill scoped to open-ended, output-agnostic research. Includes plain-language summary, for/against evidence table, and four cross-referenced artifacts (investigation angles + adversarial pass).
plan-a-feature Steps 1-5: behavioral spec for a new /research skill (question -> evidence -> options landscape -> recommendation -> adversarial validation), 11 full + 3 trivial decisions. Three forks settled by user: web+codebase reach, new research agent + reuse, and small/medium/large swarm sizing. No tech-notes qualified.
plan-a-feature Steps 5.5-7. Medium-size review team (junior-developer, gap-analyzer, edge-case-explorer, adversarial-security-analyst). Resolved 16 major + 6 minor findings: added untrusted-web-source handling (D16), research sizing signals (D15), compound-question (D17), hybrid-routing (D18), output-collision guard (D19); strengthened evidence sourcing (D11) and validator charter (D7); dropped gap-analyzer from the roster per user (D4). Decision log + findings log updated and cross-referenced.
plan-a-feature Step 8. project-manager (synthesis mode) verified all 22 findings discharged in-file, confirmed cross-reference invariants and no mechanics leak, and fixed a broken anchor (D14 promoted to heading so the spec's #d14-invocation-surface link resolves).
D20: rollout plan owned by plan-implementation, ~14+ files with the count/sizing surfaces enumerated. D21: group /research next to /investigate under a relabeled "Investigation & research" grouping. Spec Open Items, Summary, and Out of Scope updated; decision log and findings log cross-referenced. OI-3 remains, pending the skills-calling-skills investigation.
Full /investigate run (3 evidence-based-investigators + claude-code-guide + adversarial-validator). Adversarial pass overturned the naive "blanket-ban" reading: data-fetch sub-skills are evidenced-unreliable, orchestration is underdetermined (unsupported assertion, no documented failure), recommended pattern is Agent-tool dispatch + inline discovery. Decisive for OI-3 (V8): /research invokes no skills (routing = naming a sibling, not calling it), so it already complies; only build-time check is that SKILL.md allowed-tools omits Skill. Broader six-file guidance contradiction tracked as a separate ADR-worthy Han maintenance item. Investigation artifact added; OI-3 closed; spec/decision-log/findings cross-referenced. All open items resolved.
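The build-time check named above (SKILL.md allowed-tools omits Skill) could be sketched roughly as follows. This is a minimal illustration, not the project's actual check: it assumes the frontmatter sits between the first two `---` markers and that `allowed-tools` is a single comma-separated line. The sample documents and the helper names `allowed_tools` and `omits_skill` are hypothetical.

```python
import re

def allowed_tools(skill_md: str) -> list[str]:
    """Return bare tool names from an assumed single-line allowed-tools entry."""
    # Assumption: frontmatter is the text between the first two '---' markers.
    parts = skill_md.split("---", 2)
    if len(parts) < 3:
        return []
    for line in parts[1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "allowed-tools":
            # Capture each tool name and skip any "(...)" qualifier after it.
            return re.findall(r"([A-Za-z_][\w-]*)(?:\([^)]*\))?", value)
    return []

def omits_skill(skill_md: str) -> bool:
    """True when the Skill tool is absent from allowed-tools."""
    return "Skill" not in allowed_tools(skill_md)

# Hypothetical frontmatter samples, for illustration only.
OK_DOC = "---\nname: research\nallowed-tools: Read, Grep, WebSearch, Agent, Bash(git log:*)\n---\n# body\n"
BAD_DOC = "---\nname: research\nallowed-tools: Read, Skill, Agent\n---\n# body\n"

print(omits_skill(OK_DOC), omits_skill(BAD_DOC))
```

A check like this would only guard the frontmatter; it says nothing about whether the skill body tries to call another skill.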
New swarming skill plugin/skills/research/ (SKILL.md + report template) and plugin/agents/research-analyst.md, implementing the spec at docs/plans/research-skill/feature-specification.md. Sized small/medium/large with research-specific signals (D15); question -> sourced evidence -> options landscape -> recommendation -> adversarial validation spine (D6/D7); untrusted-web-source controls: data-not-instruction, web/codebase isolation, corroboration (D11/D16); compound/hybrid/redirect classification (D8/D17/D18); output-collision guard (D19); allowed-tools includes web + Agent and omits Skill per D22.
docs/skills/research.md and docs/agents/research-analyst.md (coverage rule). Reciprocal 'use research' boundary statements added to all five neighbors per D9 — investigate, plan-a-feature, coding-standard, gap-analysis, architectural-analysis — in both SKILL.md descriptions and long-form 'Do not invoke for' sections, completing the bidirectional disambiguation /research's own description already declares.
Counts bumped to 19 skills / 22 agents across CLAUDE.md, README.md, docs/concepts.md, and every long-form doc footer. /research registered as the 7th sizing-aware skill in sizing.md (enumeration + table), concepts.md, skills/README.md, README.md, and quickstart.md. Skills index grouping relabeled 'Investigation & research' per D21 with the /research entry; research-analyst added to the agents index. New quickstart Path E plus a combining-paths example. Bidirectional Related-docs link between investigate and research. No version bump, no CHANGELOG change, manifests auto-discover.
YAGNI is a planning/implementation gate, not a research standard. Drop the See-also breadcrumb, the dedicated ## YAGNI section, and the Related-docs bullet from docs/skills/research.md and docs/agents/research-analyst.md, matching the convention used by other non-YAGNI skill/agent docs (project-discovery, update-pr-description, project-scanner). /research was never registered in yagni.md or the concepts YAGNI list, so no index change is needed.
D23: evidence required by default; operator can opt into exploratory (evidence-optional) mode; report always labels every claim's evidence status and states the recommendation's evidence basis. D24: one fixed report structure — plain-language Summary at top, Research Results (minimal tech detail), indexed Options to Consider, Recommendation with evidence basis, Validation, an indexed Artifacts registry (link + summary per source), and a References section at the very bottom; all cross-referenced inline by artifact ID for full traceability. Spec Outcome/Primary Flow/Edge Cases/User Interactions and decision-log cross-refs (D1->D24, D11->D23) updated; user-input decision count 5->7.
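As a rough illustration of the single fixed report structure in D24, a conformance check might compare a report's top-level headings against the required order. The section names come from the decision text above; the function name `structure_problems` and the idea of passing headings as a plain list are assumptions made for this sketch.

```python
# Section order as listed in D24 (Summary at top, References at the very bottom).
SECTIONS = [
    "Summary",
    "Research Results",
    "Options to Consider",
    "Recommendation",
    "Validation",
    "Artifacts",
    "References",
]

def structure_problems(headings: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the order conforms."""
    problems = [f"missing section: {s}" for s in SECTIONS if s not in headings]
    problems += [f"unexpected section: {h}" for h in headings if h not in SECTIONS]
    # Compare the known headings, in the order they appear, with the fixed order.
    present = [h for h in headings if h in SECTIONS]
    expected = [s for s in SECTIONS if s in present]
    if present != expected:
        problems.append("sections out of order")
    return problems

print(structure_problems(SECTIONS))
```

Swapping two sections, or dropping References, would each produce one entry in the returned list.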
SKILL.md: detect strict (default) vs exploratory evidence mode in Step 1 and thread it through briefs; Step 6 compiles an indexed Artifacts registry (link + summary + trust class + corroboration status) instead of a flat evidence list; Step 7 synthesizes plain Research Results + indexed Options + Recommendation with explicit evidence basis; Step 8 renders the one fixed structure. Report template rebuilt: Summary (plain, top) -> Research Results -> Options to Consider -> Recommendation -> Validation -> Artifacts -> References (bottom), cross-referenced by artifact ID for full traceability. research-analyst agent output format + rules updated to artifacts/results/options with evidence-mode handling. Long-form docs (research.md, research-analyst.md) updated for the new structure and the evidence-mode override.
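The indexed Artifacts registry and the strict/exploratory evidence modes can be pictured with a small data model. This is a sketch under stated assumptions, not the skill's implementation: the field names, the "A1"-style ID format, the trust-class labels, and the status strings "evidenced" and "unevidenced" are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Artifact:
    artifact_id: str   # assumed ID format, e.g. "A1"
    link: str
    summary: str
    trust_class: str   # assumed labels, e.g. "codebase" or "web"
    corroborated: bool

@dataclass(frozen=True)
class Claim:
    text: str
    cites: tuple[str, ...]  # artifact IDs cited inline by this claim

def label_claims(claims, registry, mode="strict"):
    """Label every claim's evidence status; strict mode rejects uncited claims."""
    known = {a.artifact_id for a in registry}
    labeled = []
    for claim in claims:
        dangling = [c for c in claim.cites if c not in known]
        if dangling:
            # A citation must resolve to a registry entry to stay traceable.
            raise ValueError(f"unknown artifact IDs: {dangling}")
        if claim.cites:
            status = "evidenced"
        elif mode == "exploratory":
            status = "unevidenced"
        else:
            raise ValueError(f"strict mode requires evidence: {claim.text!r}")
        labeled.append((claim.text, status))
    return labeled

registry = [Artifact("A1", "docs/example.md", "hypothetical source", "codebase", True)]
claims = [Claim("Option 1 fits the codebase", ("A1",)), Claim("Option 2 may scale", ())]
print(label_claims(claims, registry, mode="exploratory"))
```

In strict mode the second claim above would raise rather than be labeled, which mirrors the "evidence required by default" rule in D23.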
Full code-review (large; junior-developer + adversarial-security-analyst + manual conformance pass) at docs/plans/research-skill/artifacts/code-review.md.
Fixed:
- CRIT-001: extend shared adversarial-validator with a 4th, generally-applicable strategy (Challenge the Evidence-Gathering Integrity: injection/astroturfing/staleness/single-source) so D7's web-reach defense is enforced at the agent level, not only via brief text; vocabulary, anti-pattern, rules, and long-form doc updated. Additive and valuable for /investigate and planning consumers too.
- WARN-001: Step 5 brief exclusion now also bars operator/CLAUDE context, matching the Operating Principle (closes an exfiltration precondition).
- WARN-002: added the missing ## Sizing section to docs/skills/research.md.
- WARN-003: CLAUDE.md and docs/sizing.md six-skill enumerations now include /research (seven).
- SUGG-001..005: template Summary/cross-ref contradiction; codebase-explorer added to research-analyst Related docs; directory link now targets the file; role identity tightened to the token budget; argument-hint surfaces the evidence-mode opt-in.
Surfaced, not fixed: WARN-004 (em-dashes). writing-voice.md bans them but every plugin file uses them; project-pattern deference makes this a repo-wide reconciliation, not a /research-only correction.
mxriverlynn added a commit that referenced this pull request May 26, 2026
allowed-tools is Level 1 frontmatter, always loaded in every conversation. The list previously enumerated 17 Bash runners covering every major stack. Pruning to the runners most users actually invoke (npm/npx/pnpm/yarn, pytest/python3, go, cargo, make, bundle/rake, plus git and find) drops four entries: mix, dotnet, gradle, mvn. Users on Elixir, .NET, or JVM stacks will see a one-time permission prompt for their build tool. Other stacks lose nothing. The cross-language posture is preserved where it matters most; the always-loaded token cost shrinks.
Summary
This PR adds /research, a sizing-aware skill that takes an open-ended question and returns an evidence-backed, adversarially-validated report recommending an option without producing any committed artifact, so that Han has a question-shaped sibling to /investigate instead of overloading the bug pipeline.
- /research skill (Han's 7th sizing-aware skill, 19th skill overall) plus a new research-analyst agent (22nd agent). It reuses codebase-explorer for codebase-grounded evidence and adversarial-validator to attack the result. The investigation behind this (recommendation.md) deliberately rejected expanding /investigate or building a two-mode skill, on Han's single-responsibility rule.
- adversarial-validator agent with a new, generally-applicable 4th strategy ("challenge the evidence-gathering integrity"). This is the load-bearing change: it moves the web-reach threat model (indirect prompt injection, astroturfing, single-source laundering) from brief text into the agent's hardcoded contract, and also strengthens /investigate. Reviewers should weigh whether this 4th strategy is correctly scoped as additive and low-risk for existing consumers.
Behavior changes
- /investigate is a symptom→root-cause→fix pipeline; /plan-a-feature, /coding-standard, /gap-analysis, and /architectural-analysis each do research only as a bounded step toward a fixed artifact. There was no way to research options before committing to anything.
- /research <question> classifies and sizes the question, fans research agents out across the codebase, the open web, and provided material, consolidates a numbered evidence/artifact registry, builds an options landscape, recommends one option, runs an adversarial-validation pass that can overturn the recommendation, and writes a fixed-structure report (plain-language Summary → Research Results → Options → Recommendation + evidence basis → Validation → Artifacts registry → References). It never emits a spec, standard, gap report, architecture assessment, or code. Out-of-scope, hybrid, and compound requests are routed or split rather than forced through.
- /investigate and 4 other neighbors now route research-shaped requests back to /research.
What to look at first
- adversarial-validator 4th-strategy change (plugin/agents/adversarial-validator.md). This is the only edit that changes behavior of an existing, multi-consumer agent. CRIT-001 in the code review found the web-reach threat model was depending on brief text overriding the agent's closed 3-strategy contract; the fix promotes it into the agent. Confirm the "all three"/"minimum 5" wording was updated consistently and that it stays additive for /investigate.
- Untrusted-web-source controls (plugin/skills/research/SKILL.md Operating Principles + Step 5, research-analyst.md anti-patterns). The skill reaches the live web, so untrusted content is a first-class input. Decisions D16 (data-not-instruction, context isolation, trust labeling) and D11 (corroboration, retrieval dates) are the defenses. WARN-001 specifically tightened Step 5's brief exclusion to match the stricter Operating Principle; it is worth confirming the two now agree.
- Em-dashes. writing-voice.md bans them unconditionally, but every existing plugin file already uses them. The code review surfaced this (WARN-004) as a repo-wide standard-vs-practice contradiction and deliberately did not de-em-dash /research in isolation, since that would make it the lone outlier. This wants a repo-wide decision, not a fix in this PR.
Files of interest
- plugin/skills/research/SKILL.md: the skill's behavior: classification, sizing, roster, web-reach controls, 8-step flow.
- plugin/agents/research-analyst.md: the new agent owning the web/prior-art and option-comparison angles, with the data-not-instruction anti-patterns.
- plugin/agents/adversarial-validator.md: the shared-agent 4th-strategy change; the only behavior edit affecting an existing consumer (/investigate).
- docs/plans/research-skill/recommendation.md: why /research is a separate skill, with the adversarial validation that corrected the original evidence.
- docs/plans/research-skill/artifacts/decision-log.md: D1–D24, where every behavioral tradeoff (web reach, roster, sizing, untrusted-source controls, evidence mode, report structure) was settled.