Add ranked next-action diagnostics for detect / doctor#47

Merged
pengfei-threemoonslab merged 5 commits into main from claude/gallant-mclean-36d7a4
May 7, 2026

@pengfei-threemoonslab
Contributor

Summary

  • Adds a ranked, machine-readable recovery surface so a coding agent that hits a common first-run failure (no shipgate.yaml, zero tools, MCP/OpenAPI artifact-only repo, dynamic toolsets, missing source file, unresolved CHANGE_ME, non-agent / pure-prompt workspace, production target without permissions) gets a routable next step in JSON without having to consult docs.
  • New cli/diagnostics.py module: NextAction + Diagnostic Pydantic models with strict validators, a 10-entry catalog under stable SHIP-DIAG-* ids, and pure-functional resolvers (diagnose_detect, diagnose_doctor, diagnose_missing_manifest, top_next_actions).
  • detect --json and each doctor --json payload now carry diagnostics: [...] and next_actions: [...] blocks alongside the existing single-string next_action. The legacy field stays string-typed for every kind by projecting from the rank-1 action (Edit <path>, Stop: <why>, etc.), so consumers that only read next_action keep working.
  • AGENTS_SHIPGATE_AGENT_MODE=1 error JSON gains the same next_actions array. Audited every emit site: 7 in main.py, 2 in scenario.py, and the local helper in apply_patches.py.
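
The back-compat projection in the third bullet can be pictured roughly as follows. This is a minimal sketch, not the code in cli/diagnostics.py: the helper name `project_legacy_next_action`, the plain-dict action shape, and the `rank` key are assumptions made for illustration. Only the `Edit <path>` and `Stop: <why>` string forms come from the description above.

```python
from typing import Any


def project_legacy_next_action(next_actions: list[dict[str, Any]]) -> str:
    """Collapse a ranked action list into the legacy single-string field.

    Hypothetical helper: the real projection may differ in detail.
    """
    # The rank-1 action is the one with the smallest rank value.
    top = min(next_actions, key=lambda action: action["rank"])
    kind = top["kind"]
    if kind == "command":
        return top["command"]
    if kind == "edit":
        return f"Edit {top['path']}"
    if kind == "stop":
        return f"Stop: {top['why']}"
    # Assumption: any other kind falls back to its explanation text.
    return top["why"]
```

Whatever the kind, the result is a plain string, which is what lets consumers that only read next_action keep working.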

Type

  • CLI or GitHub Action behavior
  • Documentation only (also)

Behavior change (deliberate, one)

agents-shipgate doctor no longer raises InputParseError(3) when a required tool_sources[].path does not resolve. It now exits 0 with:

  • unresolved_sources: [{id, declared_path, line}] in the per-manifest payload
  • a SHIP-DIAG-MISSING-SOURCE-FILE diagnostic whose rank-1 action is an edit pointing at shipgate.yaml:<line>

agents-shipgate scan is unchanged — it still raises InputParseError(3) on the same condition. Documented in AGENTS.md, CHANGELOG.md, and docs/diagnostics.md. A regression-guard test asserts scan still exits 3.

Notes for reviewers

  • No report.json schema bump. Diagnostics are pre-scan recovery hints; per-finding remediation already lives in v0.7 fields (autofix_safe, suggested_patch_kind, docs_url).
  • DetectResult extension is additive — new workspace_signals block (Python file count, pyproject/requirements/prompts/tools dir presence). Existing fields and JSON output are unchanged.
  • _collect_placeholders extracted to cli/discovery/placeholders.py so init and the new doctor diagnostic share one implementation. Now also returns line so edit actions can point at shipgate.yaml:<line>.
  • Negative-control precedence is explicit: PURE_PROMPT_EXPERIMENT > NON_AGENT_LIBRARY > NO_AGENT_SURFACE. Asserted in tests.
  • NextAction validates kind/field correlation via a model_validator: kind="command" requires command; kind="edit" requires path; kind="stop" rejects command. Empty next_actions lists are rejected at the Diagnostic level (min_length=1).
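
The kind/field correlation in the last bullet can be sketched without Pydantic. The PR uses Pydantic models with a model_validator and min_length=1; the version below is a dependency-free stand-in built on stdlib dataclasses, and the `rank` and `why` field names are assumptions. It only illustrates the three rules and the non-empty list constraint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NextAction:
    rank: int                      # assumed field name
    kind: str                      # "command" | "edit" | "stop" (other kinds omitted)
    why: str                       # assumed field name
    command: Optional[str] = None
    path: Optional[str] = None

    def __post_init__(self) -> None:
        # Stand-in for the Pydantic model_validator described above.
        if self.kind == "command" and not self.command:
            raise ValueError('kind="command" requires command')
        if self.kind == "edit" and not self.path:
            raise ValueError('kind="edit" requires path')
        if self.kind == "stop" and self.command is not None:
            raise ValueError('kind="stop" rejects command')


@dataclass(frozen=True)
class Diagnostic:
    id: str
    next_actions: tuple[NextAction, ...]

    def __post_init__(self) -> None:
        # Stand-in for min_length=1 on the next_actions field.
        if len(self.next_actions) < 1:
            raise ValueError("next_actions must contain at least one action")
```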

Verification

CI is authoritative. Local checks run:

  • python -m pytest -q — 463 passed (35 new in tests/test_diagnostics.py, 6 new integration tests in tests/test_cli.py)
  • End-to-end smoke against missing-manifest, empty-workspace, zero-tools, missing-source-file, and CHANGE_ME-unresolved scenarios — each produces the expected diagnostic id and rank-1 action
  • Cross-command consistency: scan and doctor produce the same rank-1 next action for the missing-manifest case

Release-readiness notes

  • No user-code import added to default scan paths
  • No network access added to default scan paths
  • New or changed check IDs are documented in docs/checks.md — N/A (these are SHIP-DIAG-* diagnostics, documented in docs/diagnostics.md)
  • Report/schema changes are additive or documented in STABILITY.md: the report.json schema is unchanged. The DetectResult and inspect_sources payload additions are additive only.

Test plan

  • Confirm a coding agent receives a routable rank-1 action for each catalogued first-run failure
  • Confirm next_action (string) stays present in every JSON output for back-compat
  • Confirm doctor --json exits 0 on unresolved required source; scan exits 3 on the same input

🤖 Generated with Claude Code

pengfei-threemoonslab and others added 2 commits May 7, 2026 14:29
…ures

A coding agent that hits a common first-run failure (no shipgate.yaml,
zero tools, MCP/OpenAPI artifact-only repo, dynamic toolsets, missing
source file, unresolved CHANGE_ME, non-agent workspace, pure prompt
experiment) now gets a ranked, structured recovery hint in JSON
without having to consult human-facing docs.

- New cli/diagnostics.py with NextAction / Diagnostic models, a
  10-entry catalog, and pure-functional resolvers.
- detect --json and doctor --json gain diagnostics[] and
  next_actions[] alongside the existing single-string next_action
  (which now projects from the rank-1 action so it stays string-typed
  even for stop / edit kinds).
- AGENTS_SHIPGATE_AGENT_MODE=1 errors carry the same next_actions
  array. Audit covered all emit sites in main.py, scenario.py, and
  apply_patches.py.
- Behavior change: doctor --json no longer raises InputParseError(3)
  on a required tool_sources path that doesn't resolve. It now exits
  0 with unresolved_sources[] and a SHIP-DIAG-MISSING-SOURCE-FILE
  diagnostic. scan is unchanged — still raises 3 on the same condition.
- _collect_placeholders extracted to discovery/placeholders.py and
  enriched with line numbers so edit actions can target
  shipgate.yaml:<line>.
- DetectResult gains a workspace_signals block (Python file count,
  pyproject/requirements/prompts/tools dir hits) so the resolver can
  discriminate the three negative-control cases.
- 41 new tests across test_diagnostics.py and test_cli.py covering
  model invariants, catalog stability, every resolver, precedence,
  cross-command consistency, and the doctor behavior change. Full
  suite green (463 passed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P1-1 — doctor's non-JSON output now surfaces unresolved required
sources and any diagnostics (id, severity, rank-1 action) and exits 3
when a required tool_sources path is unresolved. The --json contract
is unchanged: agents still get exit 0 with structured diagnostics.

P1-2 — Ruff: sort imports in diagnostics.py and test_diagnostics.py,
add strict=True to the doctor zip().

P2-1 — diagnose_missing_manifest now receives the workspace derived
from --config / --workspace, not Path.cwd(). New helper
_missing_manifest_workspace centralises the rule. An agent that runs
"agents-shipgate scan -c /tmp/repo/shipgate.yaml" from elsewhere now
gets a rank-1 detect command targeting /tmp/repo.

P2-2 — _resolve_source_paths catches the containment failure case
(declared path resolves outside base_dir) in addition to the missing-
file case. Each unresolved entry carries a `reason` field
("missing" | "outside_manifest_dir"). diagnose_doctor uses the reason
to tailor the SHIP-DIAG-MISSING-SOURCE-FILE diagnostic message.

P2-3 — diagnose_doctor edit-action paths now use str(manifest_path)
instead of manifest_path.name, so workspace and nested-manifest runs
emit "subdir/shipgate.yaml:<line>" or absolute paths instead of an
ambiguous "shipgate.yaml:<line>".

Plus 4 regression tests in test_cli.py covering each finding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Contributor Author

Thanks for the review — all five findings addressed in 0393fd3.

[P1] Human doctor silently passes a missing required source — fixed. Doctor's non-JSON output now prints Unresolved required sources: (with id, declared path, and <config>:<line>) and a Diagnostics: block with the rank-1 action per diagnostic, then exits 3 if any payload has unresolved_sources. The --json exit-0 contract for agents is preserved (regression-tested in test_doctor_emits_unresolved_source_diagnostic_without_failing); the human path's loud failure is regression-tested in test_doctor_human_output_fails_loudly_on_missing_required_source.

[P1] Ruff — fixed. Imports re-sorted in cli/diagnostics.py and tests/test_diagnostics.py; zip() in main.py:628 got strict=True. python -m ruff check . is clean.

[P2] Missing-manifest recovery uses caller cwd — fixed. New _missing_manifest_workspace(config=, workspace=) helper centralises the rule: prefer --workspace, fall back to the config's parent directory, only use Path.cwd() when the config is bare like shipgate.yaml. Both scan and doctor ConfigError handlers route through it. Regression-tested in test_missing_manifest_recovery_uses_config_workspace (invokes from an unrelated cwd, asserts the rank-1 command points at the config's parent).
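
The workspace rule above, as it stood in this round, might look like the sketch below. It is not the helper from main.py: the function name drops the leading underscore, the signature is an assumption, and "bare" is interpreted here as "the config path has no directory component".

```python
from pathlib import Path
from typing import Optional


def missing_manifest_workspace(config: str, workspace: Optional[str]) -> Path:
    """Pick the directory that recovery commands should target (illustrative)."""
    if workspace:
        # An explicit --workspace always wins.
        return Path(workspace)
    parent = Path(config).parent
    # A bare "shipgate.yaml" has parent Path("."), which carries no
    # directory information, so fall back to the caller's cwd.
    if parent == Path("."):
        return Path.cwd()
    return parent
```

With this rule, a config of /tmp/repo/shipgate.yaml routes recovery at /tmp/repo regardless of where the command is invoked from.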

[P2] Missing-source detection misses containment failures — fixed. _resolve_source_paths now applies a relative_to(base_resolved) check after the exists() check; entries get a structured reason: "missing" | "outside_manifest_dir". diagnose_doctor uses the reason to tailor the diagnostic's why. Regression-tested with a real outside-the-manifest file in test_doctor_flags_outside_manifest_dir_source_as_diagnostic.
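
A minimal sketch of the exists-then-containment ordering described above. The function name `unresolved_reason` and its signature are hypothetical; only the two reason strings and the use of relative_to() against the resolved base directory come from this comment.

```python
from pathlib import Path
from typing import Optional


def unresolved_reason(base_dir: Path, declared_path: str) -> Optional[str]:
    """Return None when the source resolves inside base_dir, else a reason."""
    base_resolved = base_dir.resolve()
    candidate = (base_resolved / declared_path).resolve()
    if not candidate.exists():
        return "missing"
    try:
        # relative_to() raises ValueError when candidate is not under base.
        candidate.relative_to(base_resolved)
    except ValueError:
        return "outside_manifest_dir"
    return None
```

Resolving both sides first means a declared path such as ../outside.json is judged by where it actually lands, not by how it is spelled.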

[P2] Edit actions drop manifest directories — fixed. diagnose_doctor switched from manifest_path.name to str(manifest_path) for edit-action targets, so workspace and nested-manifest runs now point at subdir/shipgate.yaml:<line> (or absolute paths) instead of an ambiguous shipgate.yaml:<line>. Regression-tested in test_doctor_edit_action_paths_include_manifest_directory.

Full local run: python -m ruff check . is clean; python -m pytest -q reports 467 passed.

P1 — distinguish missing-manifest from invalid-manifest in agent mode.
ConfigError covers two failure shapes: file-not-found and
exists-but-unparseable (invalid YAML, schema validation failure,
unsupported version, etc.). Both used to dispatch to
SHIP-DIAG-MISSING-MANIFEST whose rank-1 action is `detect / init` —
which is the wrong recovery for an existing-but-invalid file (init
refuses to overwrite, so the agent would loop). New
SHIP-DIAG-INVALID-MANIFEST diagnostic with an `edit <path>` rank-1
action; new `_diagnose_config_error` helper dispatches by file
existence in both scan and doctor handlers.

P2 — POSIX-shell-quote dynamically-interpolated paths in `command`
fields so a coding-agent shell runner doesn't word-split workspaces or
manifest paths containing spaces. Applied to
`diagnose_missing_manifest` (workspace), `diagnose_doctor`'s zero-tools
re-run command (manifest path), and apply_patches' malformed_patch
re-run command (--out parent). `next_actions[].command` is still a
single string per the v1 contract; argv-style structured commands
remain a future option.

P3 — clarify the human-vs-JSON exit code split in CHANGELOG.md,
docs/diagnostics.md, and AGENTS.md. The doctor behavior change is
scoped narrowly: `--json` exits 0 with diagnostics (agent contract);
non-JSON exits 3 (human contract); scan is unchanged regardless.
Docs now spell this out and call out that diagnostics influence
exit codes only on `doctor` + `MISSING-SOURCE-FILE`.

Plus regression tests:
- invalid-manifest dispatches to INVALID-MANIFEST (both schema-invalid
  and unparseable-YAML), not MISSING-MANIFEST.
- workspace-with-spaces command round-trips through shlex.split().
- diagnose_invalid_manifest unit test confirms edit-action target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Contributor Author

Round 2 addressed in 0699069.

[P1] Invalid manifests dispatched to MISSING-MANIFEST — fixed. New SHIP-DIAG-INVALID-MANIFEST diagnostic covers the "file exists but the loader rejected it" case (invalid YAML, non-object YAML, unsupported version, schema validation failure). Rank-1 action is kind="edit", path=<manifest> with the loader's error in why, plus a follow-up command to re-run doctor --json after fixing. New _diagnose_config_error helper in main.py checks Path(config).is_file() and dispatches accordingly; both scan and doctor ConfigError handlers route through it. Two regression tests (test_invalid_manifest_dispatches_to_invalid_diagnostic for schema-invalid manifests, test_invalid_yaml_manifest_dispatches_to_invalid_diagnostic for unparseable YAML) assert the dispatch is correct and that the legacy next_action string never starts with agents-shipgate detect for an existing file.
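
The dispatch-by-existence rule can be reduced to a few lines. This is a sketch under assumptions: the function name `pick_manifest_diagnostic_id` is hypothetical and it returns only the diagnostic id, whereas the real helper presumably builds the full diagnostic with its actions.

```python
from pathlib import Path


def pick_manifest_diagnostic_id(config: str) -> str:
    """Choose a diagnostic id for a ConfigError based on file existence."""
    if Path(config).is_file():
        # The file exists but the loader rejected it: recovery is an edit,
        # because init would refuse to overwrite it.
        return "SHIP-DIAG-INVALID-MANIFEST"
    # Nothing on disk: recovery is detect / init.
    return "SHIP-DIAG-MISSING-MANIFEST"
```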

[P2] Path quoting — fixed. New _quote_path() helper in cli/diagnostics.py wraps shlex.quote(). Applied to:

  • diagnose_missing_manifest workspace path (rank-1 detect command and rank-2 init command)
  • diagnose_doctor zero-tools re-run command (manifest path)
  • apply_patches.py's malformed_patch re-run command (--out parent)

Regression test test_missing_manifest_command_quotes_workspace_with_spaces asserts the emitted command for a workspace at tmp/space path/repo dir round-trips through shlex.split() and yields the original path verbatim. Unit test test_command_quotes_workspace_with_spaces covers the resolver in isolation. Kept next_actions[].command a single string per the v1 contract; argv-style structured commands remain a deliberate future option (called out in the original plan and review).
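
The round-trip property those tests assert can be shown in isolation. In this sketch `quote_path` mirrors the described wrapper around shlex.quote(), and the command string follows the detect --workspace ... --json form quoted elsewhere in this thread; the exact command text emitted by the resolver is an assumption.

```python
import shlex


def quote_path(path: str) -> str:
    """POSIX-shell-quote a path so spaces survive word splitting."""
    return shlex.quote(path)


workspace = "tmp/space path/repo dir"
command = f"agents-shipgate detect --workspace {quote_path(workspace)} --json"

# A shell runner that word-splits the string recovers the path as one token.
argv = shlex.split(command)
```

Because the field stays a single string, quoting at construction time is what keeps a POSIX shell runner from splitting the workspace into several arguments.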

[P3] Stale exit-code docs — fixed. The doctor behavior change is now spelled out as:

  • doctor --json → exit 0 with structured diagnostic (agent contract)
  • doctor (no --json) → exit 3 with human-readable diagnostic block (human contract)
  • scan → unchanged, exits 3 regardless of --json

Updated CHANGELOG.md, docs/diagnostics.md (both the "advisory exit codes" preamble and the "Doctor behavior change" section now describe the divergence and call out that it's bounded to MISSING-SOURCE-FILE on doctor only), and AGENTS.md. The diagnostic catalog table also gained SHIP-DIAG-INVALID-MANIFEST with its rank-1 action.
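
The three-way split above fits in a small decision function. This is a sketch for the unresolved-required-source case only: the function name and parameters are hypothetical, and every other failure mode is ignored.

```python
def exit_code(command: str, json_mode: bool, unresolved_required_source: bool) -> int:
    """Exit code for the unresolved-required-source case only (illustrative)."""
    if not unresolved_required_source:
        return 0  # assumption: no other failures are being modelled
    if command == "doctor" and json_mode:
        return 0  # agent contract: structured diagnostic, no failure
    # doctor without --json (human contract) and scan in any mode exit 3.
    return 3
```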

Verification: python -m ruff check . clean; python -m pytest -q 472 passed (5 new since round 1).

P1 — extend the missing-vs-invalid manifest dispatch to cover
--workspace and glob configs.

`doctor` now catches `ConfigError` separately for the discovery phase
(no candidate manifests) and the per-path inspect phase (a specific
discovered manifest is invalid). The inner handler dispatches with
the failing path in scope, so workspace and glob runs surface
SHIP-DIAG-INVALID-MANIFEST pointing at the exact file.

`scan` was already correct after the v2 refactor (it uses the shared
`_diagnose_config_error` dispatcher), but the dispatcher itself now
walks every candidate manifest path — direct `-c <file>`,
`--workspace` discovery, or glob expansion — instead of only
recognising the bare `-c <file>` case. New `_candidate_manifest_paths`
helper centralises the enumeration; it never raises so the agent-mode
dispatch path remains panic-proof.

`_missing_manifest_workspace` now falls back to `cwd` when the config
is a glob, so the rank-1 detect command no longer carries literal `*`
characters.

P2 — `_quote_path` applied to the SHIP-DIAG-MCP-OPENAPI-ARTIFACT-ONLY
rank-1 command. The detect command for an artifact-only workspace
with spaces in the path now round-trips through shlex.split() like
the other generated commands.

Plus 4 regression tests:
- doctor --workspace with invalid shipgate.yaml dispatches to
  INVALID-MANIFEST with the correct path
- scan with a glob `*/shipgate.yaml` against an invalid file
  dispatches to INVALID-MANIFEST
- glob with no matches falls back to cwd-based MISSING-MANIFEST,
  not a workspace argument containing literal `*`
- artifact-only detect command for a spaced workspace shell-quotes
  correctly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Contributor Author

Round 3 addressed in 0b83fd5.

[P1] Invalid manifests in --workspace / glob mode — fixed in two places.

The doctor handler now catches ConfigError separately for the discovery phase vs the per-path inspect phase. The inner handler runs with the failing path in scope, so it dispatches diagnose_invalid_manifest(path, message=str(exc)) directly — no guessing which discovered manifest the loader choked on. Regression-tested in test_doctor_workspace_dispatches_invalid_manifest.

The _diagnose_config_error dispatcher (used by scan and by doctor's discovery-phase handler) now walks every candidate manifest path — direct -c <file>, --workspace discovery, or glob.glob(...) expansion — and emits SHIP-DIAG-INVALID-MANIFEST for the first one that's a real file. New _candidate_manifest_paths helper centralises the enumeration; it swallows any inner exception so the agent-mode dispatch path remains panic-proof. Regression-tested in test_scan_glob_dispatches_invalid_manifest.
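
A rough sketch of the enumerate-then-pick-first-real-file idea. It covers only the direct -c <file> and glob cases; --workspace discovery is left out because its rules are not described here. Both function names and the metacharacter set are assumptions.

```python
import glob
from pathlib import Path
from typing import Optional

GLOB_CHARS = set("*?[")  # assumed definition of "glob metacharacter"


def candidate_manifest_paths(config: str) -> list[Path]:
    """Enumerate manifest paths a config argument could refer to; never raises."""
    try:
        if any(ch in GLOB_CHARS for ch in config):
            return [Path(p) for p in sorted(glob.glob(config))]
        return [Path(config)]
    except Exception:
        # The error-reporting path must not fail itself.
        return []


def first_existing_manifest(config: str) -> Optional[Path]:
    """Return the first candidate that is a real file, if any."""
    for path in candidate_manifest_paths(config):
        if path.is_file():
            return path
    return None
```

A non-None result would route to SHIP-DIAG-INVALID-MANIFEST for that file; None would fall through to the missing-manifest case.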

_missing_manifest_workspace now falls back to cwd when the config string contains glob metacharacters, so the rank-1 detect --workspace ... command no longer carries a literal *. Regression-tested in test_glob_with_no_matches_yields_workspace_cwd_not_glob_chars.

[P2] MCP-OPENAPI-ARTIFACT-ONLY workspace not quoted — fixed. _quote_path() now wraps the workspace in the rank-1 command for that diagnostic too. Regression-tested in test_artifact_only_command_quotes_workspace_with_spaces (creates a spaced workspace path, plants mcp-tools.json to trigger the diagnostic, asserts the emitted command round-trips through shlex.split() and yields the original path verbatim).

Verification: python -m ruff check . clean; python -m pytest -q 476 passed (4 new since round 2).

P1 — split scan's CLI option-parsing into its own try/except so flag
errors (`--format txt`, `--ci-mode yolo`, `--fail-on banana`,
`--packet-format docx`) don't reach the manifest dispatch helper. They
now emit a `kind="review"` action with guidance to fix the flag value
and re-run, instead of misreporting the manifest as invalid.

P2 — `_missing_manifest_workspace` now derives the longest non-glob
path prefix for glob configs instead of unconditionally falling back
to cwd. New `_glob_non_glob_prefix` helper walks `Path.parts` and
stops at the first component containing a glob metacharacter. So
`scan -c /tmp/repo/*/shipgate.yaml` from /tmp/elsewhere now routes
recovery at /tmp/repo. Purely-relative globs with no leading non-glob
component keep the existing cwd-fallback (there's no useful prefix
to route to).

Plus 4 regression tests:
- bad `--fail-on` flag does not dispatch to INVALID-MANIFEST
- parametrized coverage for `--format`, `--ci-mode`, `--packet-format`
- absolute glob with no matches routes to the glob prefix, not cwd
- relative glob with no matches still falls back to cwd

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Contributor Author

Round 4 addressed in 80ae697.

[P2] Scan option errors misreported as invalid manifests — fixed. CLI option parsing (_parse_formats, _parse_packet_formats, the --ci-mode literal check, _parse_fail_on) now lives in its own try/except block ahead of any manifest work. A ConfigError raised from option parsing emits a kind="review" action with guidance to fix the flag value and re-run; the manifest dispatch helper is never reached. Regression-tested in test_scan_bad_flag_value_does_not_dispatch_to_invalid_manifest (the --fail-on banana case from the review) and a parametrized companion test_scan_other_bad_flag_values_skip_manifest_diagnostic covering --format, --ci-mode, and --packet-format. doctor doesn't have these option parsers so no change needed there.

[P2] Absolute glob no-match recovery targets caller cwd — fixed. New _glob_non_glob_prefix(config) helper walks Path(config).parts and returns the longest leading sequence of components without glob metacharacters; _missing_manifest_workspace calls it for glob configs. So scan -c /tmp/repo/*/shipgate.yaml from /tmp/elsewhere now emits agents-shipgate detect --workspace /tmp/repo --json. Purely-relative globs with no leading non-glob component (e.g. */shipgate.yaml or **/shipgate.yaml) keep the existing cwd-fallback — there's no useful prefix to route to. Both behaviors regression-tested: test_absolute_glob_no_match_targets_glob_prefix and test_relative_glob_no_match_still_falls_back_to_cwd.
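
The prefix walk can be sketched as below. This is not the helper from main.py: the name drops the leading underscore, the metacharacter set is an assumption, and returning None for "no useful prefix" is one possible way to signal the cwd fallback.

```python
from pathlib import Path
from typing import Optional

GLOB_CHARS = set("*?[")  # assumed definition of "glob metacharacter"


def glob_non_glob_prefix(config: str) -> Optional[Path]:
    """Longest leading run of path components with no glob metacharacters.

    Intended for glob configs only; returns None when the very first
    component already contains a metacharacter.
    """
    prefix: list[str] = []
    for part in Path(config).parts:
        if any(ch in GLOB_CHARS for ch in part):
            break
        prefix.append(part)
    if not prefix:
        return None
    return Path(*prefix)
```

For /tmp/repo/*/shipgate.yaml the walk stops at the `*` component and yields /tmp/repo; for */shipgate.yaml nothing precedes the glob, so the caller would fall back to cwd.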

Verification: python -m ruff check . clean; python -m pytest -q 482 passed (6 new since round 3).

@pengfei-threemoonslab pengfei-threemoonslab merged commit 5e95749 into main May 7, 2026
1 check passed
@pengfei-threemoonslab pengfei-threemoonslab deleted the claude/gallant-mclean-36d7a4 branch May 7, 2026 22:44