Add ranked next-action diagnostics for detect / doctor by pengfei-threemoonslab · Pull Request #47 · ThreeMoonsLab/agents-shipgate

pengfei-threemoonslab · 2026-05-07T21:30:36Z

Summary

Adds a ranked, machine-readable recovery surface so a coding agent that hits a common first-run failure (no shipgate.yaml, zero tools, MCP/OpenAPI artifact-only repo, dynamic toolsets, missing source file, unresolved CHANGE_ME, non-agent / pure-prompt workspace, production target without permissions) gets a routable next step in JSON without having to consult docs.
New cli/diagnostics.py module: NextAction + Diagnostic Pydantic models with strict validators, a 10-entry catalog under stable SHIP-DIAG-* ids, and pure-functional resolvers (diagnose_detect, diagnose_doctor, diagnose_missing_manifest, top_next_actions).
detect --json and each doctor --json payload now carry diagnostics: [...] and next_actions: [...] blocks alongside the existing single-string next_action. The legacy field stays string-typed for every kind by projecting from the rank-1 action (Edit <path>, Stop: <why>, etc.), so consumers that only read next_action keep working.
AGENTS_SHIPGATE_AGENT_MODE=1 error JSON gains the same next_actions array. Audited every emit site: 7 in main.py, 2 in scenario.py, and the local helper in apply_patches.py.
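The projection from the rank-1 action to the legacy string can be pictured roughly as follows. This is a minimal sketch, not the code in cli/diagnostics.py: the `NextAction` dataclass, its field names, and `project_legacy_next_action` are hypothetical stand-ins, and only the `Edit <path>` and `Stop: <why>` shapes are taken from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NextAction:
    # Hypothetical stand-in for the PR's Pydantic model; field names are assumed.
    rank: int
    kind: str  # e.g. "command", "edit", "stop"
    why: str
    command: Optional[str] = None
    path: Optional[str] = None


def project_legacy_next_action(actions: List[NextAction]) -> str:
    """Collapse a ranked action list into the single legacy next_action string."""
    top = min(actions, key=lambda action: action.rank)
    if top.kind == "command" and top.command:
        return top.command
    if top.kind == "edit" and top.path:
        return f"Edit {top.path}"
    if top.kind == "stop":
        return f"Stop: {top.why}"
    # Kinds this sketch does not model fall back to the explanation text.
    return top.why
```

Whatever the rank-1 kind is, the result is always a plain string, which is what lets consumers that only read next_action keep working.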

Type

CLI or GitHub Action behavior
Documentation only (also)

Behavior change (deliberate, one)

agents-shipgate doctor no longer raises InputParseError(3) when a required tool_sources[].path does not resolve. It now exits 0 with:

unresolved_sources: [{id, declared_path, line}] in the per-manifest payload
a SHIP-DIAG-MISSING-SOURCE-FILE diagnostic whose rank-1 action is an edit pointing at shipgate.yaml:<line>

agents-shipgate scan is unchanged — it still raises InputParseError(3) on the same condition. Documented in AGENTS.md, CHANGELOG.md, and docs/diagnostics.md. A regression-guard test asserts scan still exits 3.

Notes for reviewers

No report.json schema bump. Diagnostics are pre-scan recovery hints; per-finding remediation already lives in v0.7 fields (autofix_safe, suggested_patch_kind, docs_url).
DetectResult extension is additive — new workspace_signals block (Python file count, pyproject/requirements/prompts/tools dir presence). Existing fields and JSON output are unchanged.
_collect_placeholders extracted to cli/discovery/placeholders.py so init and the new doctor diagnostic share one implementation. Now also returns line so edit actions can point at shipgate.yaml:<line>.
Negative-control precedence is explicit: PURE_PROMPT_EXPERIMENT > NON_AGENT_LIBRARY > NO_AGENT_SURFACE. Asserted in tests.
NextAction validates kind/field correlation via a model_validator: kind="command" requires command; kind="edit" requires path; kind="stop" rejects command. Empty next_actions lists are rejected at the Diagnostic level (min_length=1).
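The kind/field correlation rules in the last note can be illustrated with a stdlib-only sketch. It uses dataclasses with `__post_init__` as a stand-in for the Pydantic `model_validator` and `min_length=1` constraint the PR describes, so the class bodies and error messages below are assumptions rather than the actual models.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NextAction:
    # Assumed fields; the real model is a Pydantic class with more validation.
    kind: str
    why: str
    command: Optional[str] = None
    path: Optional[str] = None

    def __post_init__(self) -> None:
        # Mirror the three correlation rules listed in the notes.
        if self.kind == "command" and not self.command:
            raise ValueError('kind="command" requires command')
        if self.kind == "edit" and not self.path:
            raise ValueError('kind="edit" requires path')
        if self.kind == "stop" and self.command is not None:
            raise ValueError('kind="stop" rejects command')


@dataclass(frozen=True)
class Diagnostic:
    id: str
    next_actions: List[NextAction]

    def __post_init__(self) -> None:
        # Stand-in for the min_length=1 constraint on next_actions.
        if len(self.next_actions) < 1:
            raise ValueError("next_actions must contain at least one action")
```

Rejecting malformed actions at construction time means a resolver bug surfaces as a validation error rather than as an unroutable action in an agent's JSON.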

Verification

CI is authoritative. Local checks run:

python -m pytest -q — 463 passed (35 new in tests/test_diagnostics.py, 6 new integration tests in tests/test_cli.py)
End-to-end smoke against missing-manifest, empty-workspace, zero-tools, missing-source-file, and CHANGE_ME-unresolved scenarios — each produces the expected diagnostic id and rank-1 action
Cross-command consistency: scan and doctor produce the same rank-1 next action for the missing-manifest case

Release-readiness notes

No user-code import added to default scan paths
No network access added to default scan paths
New or changed check IDs are documented in docs/checks.md — N/A (these are SHIP-DIAG-* diagnostics, documented in docs/diagnostics.md)
Report/schema changes are additive or documented in STABILITY.md — report.json schema is unchanged. The DetectResult and inspect_sources payload additions are additive only.

Test plan

Confirm a coding agent receives a routable rank-1 action for each catalogued first-run failure
Confirm next_action (string) stays present in every JSON output for back-compat
Confirm doctor --json exits 0 on unresolved required source; scan exits 3 on the same input

🤖 Generated with Claude Code

…ures

A coding agent that hits a common first-run failure (no shipgate.yaml, zero tools, MCP/OpenAPI artifact-only repo, dynamic toolsets, missing source file, unresolved CHANGE_ME, non-agent workspace, pure prompt experiment) now gets a ranked, structured recovery hint in JSON without having to consult human-facing docs.

- New cli/diagnostics.py with NextAction / Diagnostic models, a 10-entry catalog, and pure-functional resolvers.
- detect --json and doctor --json gain diagnostics[] and next_actions[] alongside the existing single-string next_action (which now projects from the rank-1 action so it stays string-typed even for stop / edit kinds).
- AGENTS_SHIPGATE_AGENT_MODE=1 errors carry the same next_actions array. Audit covered all emit sites in main.py, scenario.py, and apply_patches.py.
- Behavior change: doctor --json no longer raises InputParseError(3) on a required tool_sources path that doesn't resolve. It now exits 0 with unresolved_sources[] and a SHIP-DIAG-MISSING-SOURCE-FILE diagnostic. scan is unchanged: it still raises 3 on the same condition.
- _collect_placeholders extracted to discovery/placeholders.py and enriched with line numbers so edit actions can target shipgate.yaml:<line>.
- DetectResult gains a workspace_signals block (Python file count, pyproject/requirements/prompts/tools dir hits) so the resolver can discriminate the three negative-control cases.
- 41 new tests across test_diagnostics.py and test_cli.py covering model invariants, catalog stability, every resolver, precedence, cross-command consistency, and the doctor behavior change. Full suite green (463 passed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

P1-1: doctor's non-JSON output now surfaces unresolved required sources and any diagnostics (id, severity, rank-1 action) and exits 3 when a required tool_sources path is unresolved. The --json contract is unchanged: agents still get exit 0 with structured diagnostics.

P1-2: Ruff: sort imports in diagnostics.py and test_diagnostics.py, add strict=True to the doctor zip().

P2-1: diagnose_missing_manifest now receives the workspace derived from --config / --workspace, not Path.cwd(). New helper _missing_manifest_workspace centralises the rule. An agent that runs "agents-shipgate scan -c /tmp/repo/shipgate.yaml" from elsewhere now gets a rank-1 detect command targeting /tmp/repo.

P2-2: _resolve_source_paths catches the containment failure case (declared path resolves outside base_dir) in addition to the missing-file case. Each unresolved entry carries a `reason` field ("missing" | "outside_manifest_dir"). diagnose_doctor uses the reason to tailor the SHIP-DIAG-MISSING-SOURCE-FILE diagnostic message.

P2-3: diagnose_doctor edit-action paths now use str(manifest_path) instead of manifest_path.name, so workspace and nested-manifest runs emit "subdir/shipgate.yaml:<line>" or absolute paths instead of an ambiguous "shipgate.yaml:<line>".

Plus 4 regression tests in test_cli.py covering each finding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-07T21:50:19Z

Thanks for the review — all five findings addressed in 0393fd3.

[P1] Human doctor silently passes a missing required source — fixed. Doctor's non-JSON output now prints Unresolved required sources: (with id, declared path, and <config>:<line>) and a Diagnostics: block with the rank-1 action per diagnostic, then exits 3 if any payload has unresolved_sources. The --json exit-0 contract for agents is preserved (regression-tested in test_doctor_emits_unresolved_source_diagnostic_without_failing); the human path's loud failure is regression-tested in test_doctor_human_output_fails_loudly_on_missing_required_source.

[P1] Ruff — fixed. Imports re-sorted in cli/diagnostics.py and tests/test_diagnostics.py; zip() in main.py:628 got strict=True. python -m ruff check . is clean.

[P2] Missing-manifest recovery uses caller cwd — fixed. New _missing_manifest_workspace(config=, workspace=) helper centralises the rule: prefer --workspace, fall back to the config's parent directory, only use Path.cwd() when the config is bare like shipgate.yaml. Both scan and doctor ConfigError handlers route through it. Regression-tested in test_missing_manifest_recovery_uses_config_workspace (invokes from an unrelated cwd, asserts the rank-1 command points at the config's parent).
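The workspace rule described in this fix might look roughly like the sketch below. It is an illustration only: the function name drops the leading underscore, the signature is assumed from the `config=, workspace=` hint above, and glob handling (added in a later round) is left out.

```python
from pathlib import Path
from typing import Optional


def missing_manifest_workspace(config: str, workspace: Optional[str] = None) -> Path:
    """Pick the directory a missing-manifest recovery command should target."""
    # 1. An explicit --workspace always wins.
    if workspace:
        return Path(workspace)
    # 2. Otherwise use the directory that would contain the manifest.
    parent = Path(config).parent
    # 3. A bare "shipgate.yaml" has parent ".", so only then fall back to cwd.
    if str(parent) in ("", "."):
        return Path.cwd()
    return parent
```

With this ordering, running `scan -c /tmp/repo/shipgate.yaml` from an unrelated directory yields `/tmp/repo` rather than the caller's cwd.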

[P2] Missing-source detection misses containment failures — fixed. _resolve_source_paths now applies a relative_to(base_resolved) check after the exists() check; entries get a structured reason: "missing" | "outside_manifest_dir". diagnose_doctor uses the reason to tailor the diagnostic's why. Regression-tested with a real outside-the-manifest file in test_doctor_flags_outside_manifest_dir_source_as_diagnostic.
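A minimal sketch of the two-step check (existence first, then containment) is shown below. The function name and return convention are hypothetical; only the `exists()` / `relative_to(base_resolved)` ordering and the two reason strings come from the description above.

```python
from pathlib import Path
from typing import Optional


def unresolved_reason(base_dir: Path, declared_path: str) -> Optional[str]:
    """Return None when the path resolves inside base_dir, else a reason string."""
    base_resolved = base_dir.resolve()
    candidate = (base_resolved / declared_path).resolve()
    # Step 1: the file has to exist at all.
    if not candidate.exists():
        return "missing"
    # Step 2: the resolved file must stay under the manifest directory.
    try:
        candidate.relative_to(base_resolved)
    except ValueError:
        return "outside_manifest_dir"
    return None
```

Resolving both sides before comparing is what catches a declared path such as `../outside.json`, which exists on disk but escapes the manifest directory.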

[P2] Edit actions drop manifest directories — fixed. diagnose_doctor switched from manifest_path.name to str(manifest_path) for edit-action targets, so workspace and nested-manifest runs now point at subdir/shipgate.yaml:<line> (or absolute paths) instead of an ambiguous shipgate.yaml:<line>. Regression-tested in test_doctor_edit_action_paths_include_manifest_directory.

Verification: python -m ruff check . clean; python -m pytest -q 467 passed (4 new since the initial push).

P1: distinguish missing-manifest from invalid-manifest in agent mode. ConfigError covers two failure shapes: file-not-found and exists-but-unparseable (invalid YAML, schema validation failure, unsupported version, etc.). Both used to dispatch to SHIP-DIAG-MISSING-MANIFEST, whose rank-1 action is `detect / init`, which is the wrong recovery for an existing-but-invalid file (init refuses to overwrite, so the agent would loop). New SHIP-DIAG-INVALID-MANIFEST diagnostic with an `edit <path>` rank-1 action; new `_diagnose_config_error` helper dispatches by file existence in both scan and doctor handlers.

P2: POSIX-shell-quote dynamically-interpolated paths in `command` fields so a coding-agent shell runner doesn't word-split workspaces or manifest paths containing spaces. Applied to `diagnose_missing_manifest` (workspace), `diagnose_doctor`'s zero-tools re-run command (manifest path), and apply_patches' malformed_patch re-run command (--out parent). `next_actions[].command` is still a single string per the v1 contract; argv-style structured commands remain a future option.

P3: clarify the human-vs-JSON exit code split in CHANGELOG.md, docs/diagnostics.md, and AGENTS.md. The doctor behavior change is scoped narrowly: `--json` exits 0 with diagnostics (agent contract); non-JSON exits 3 (human contract); scan is unchanged regardless. Docs now spell this out and call out that diagnostics influence exit codes only on `doctor` + `MISSING-SOURCE-FILE`.

Plus regression tests:
- invalid-manifest dispatches to INVALID-MANIFEST (both schema-invalid and unparseable-YAML), not MISSING-MANIFEST.
- workspace-with-spaces command round-trips through shlex.split().
- diagnose_invalid_manifest unit test confirms edit-action target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-07T22:01:18Z

Round 2 addressed in 0699069.

[P1] Invalid manifests dispatched to MISSING-MANIFEST — fixed. New SHIP-DIAG-INVALID-MANIFEST diagnostic covers the "file exists but the loader rejected it" case (invalid YAML, non-object YAML, unsupported version, schema validation failure). Rank-1 action is kind="edit", path=<manifest> with the loader's error in why, plus a follow-up command to re-run doctor --json after fixing. New _diagnose_config_error helper in main.py checks Path(config).is_file() and dispatches accordingly; both scan and doctor ConfigError handlers route through it. Two regression tests (test_invalid_manifest_dispatches_to_invalid_diagnostic for schema-invalid manifests, test_invalid_yaml_manifest_dispatches_to_invalid_diagnostic for unparseable YAML) assert the dispatch is correct and that the legacy next_action string never starts with agents-shipgate detect for an existing file.
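The existence-based dispatch can be sketched as below. The payload shape (plain dicts with `rank`, `kind`, `path`, `command`, `why` keys) and the exact command strings are assumptions for illustration; the real helper returns the PR's Diagnostic models and, after a later round, also handles workspace and glob configs.

```python
from pathlib import Path


def diagnose_config_error(config: str, message: str) -> dict:
    """Choose a diagnostic from whether the manifest file actually exists."""
    if Path(config).is_file():
        # The file is there but the loader rejected it: tell the agent to edit it.
        return {
            "id": "SHIP-DIAG-INVALID-MANIFEST",
            "next_actions": [
                {"rank": 1, "kind": "edit", "path": config, "why": message},
                # Illustrative follow-up; the exact command text is assumed.
                {"rank": 2, "kind": "command", "command": "agents-shipgate doctor --json",
                 "why": "Re-run doctor after fixing the manifest."},
            ],
        }
    # No file at all: route to discovery instead of editing.
    return {
        "id": "SHIP-DIAG-MISSING-MANIFEST",
        "next_actions": [
            {"rank": 1, "kind": "command",
             "command": "agents-shipgate detect --workspace . --json", "why": message},
        ],
    }
```

Keying on `is_file()` avoids the loop described above, where an agent is told to run init against a manifest that already exists and init refuses to overwrite it.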

[P2] Path quoting — fixed. New _quote_path() helper in cli/diagnostics.py wraps shlex.quote(). Applied to:

diagnose_missing_manifest workspace path (rank-1 detect command and rank-2 init command)
diagnose_doctor zero-tools re-run command (manifest path)
apply_patches.py's malformed_patch re-run command (--out parent)

Regression test test_missing_manifest_command_quotes_workspace_with_spaces asserts the emitted command for a workspace at tmp/space path/repo dir round-trips through shlex.split() and yields the original path verbatim. Unit test test_command_quotes_workspace_with_spaces covers the resolver in isolation. Kept next_actions[].command a single string per the v1 contract; argv-style structured commands remain a deliberate future option (called out in the original plan and review).
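The round-trip property the regression test asserts can be reproduced in a few lines. `detect_command` is a hypothetical helper written for this example; the `shlex.quote()` and `shlex.split()` calls are the standard-library functions named above.

```python
import shlex


def detect_command(workspace: str) -> str:
    """Build a single-string command that is safe to hand to a POSIX shell."""
    # shlex.quote wraps the path in single quotes when it contains spaces.
    return f"agents-shipgate detect --workspace {shlex.quote(workspace)} --json"


command = detect_command("tmp/space path/repo dir")
# Splitting the way a POSIX shell would recovers the original path as one argument.
argv = shlex.split(command)
print(argv[3])
```

Without the quoting step, the same split would break the workspace into three separate arguments.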

[P3] Stale exit-code docs — fixed. The doctor behavior change is now spelled out as:

doctor --json → exit 0 with structured diagnostic (agent contract)
doctor (no --json) → exit 3 with human-readable diagnostic block (human contract)
scan → unchanged, exits 3 regardless of --json
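Read as a decision table, the three lines above reduce to the sketch below. Both functions are hypothetical and deliberately ignore every other failure mode; they only model the unresolved-required-source condition.

```python
def doctor_exit_code(json_mode: bool, has_unresolved_required_source: bool) -> int:
    """Exit code for doctor, considering only an unresolved required source."""
    if has_unresolved_required_source and not json_mode:
        return 3  # human contract: fail loudly
    return 0  # agent contract (--json), or nothing unresolved


def scan_exit_code(has_unresolved_required_source: bool) -> int:
    """scan does not consult --json for this condition."""
    return 3 if has_unresolved_required_source else 0
```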

Updated CHANGELOG.md, docs/diagnostics.md (both the "advisory exit codes" preamble and the "Doctor behavior change" section now describe the divergence and call out that it's bounded to MISSING-SOURCE-FILE on doctor only), and AGENTS.md. The diagnostic catalog table also gained SHIP-DIAG-INVALID-MANIFEST with its rank-1 action.

Verification: python -m ruff check . clean; python -m pytest -q 472 passed (5 new since round 1).

P1: extend the missing-vs-invalid manifest dispatch to cover --workspace and glob configs. `doctor` now catches `ConfigError` separately for the discovery phase (no candidate manifests) and the per-path inspect phase (a specific discovered manifest is invalid). The inner handler dispatches with the failing path in scope, so workspace and glob runs surface SHIP-DIAG-INVALID-MANIFEST pointing at the exact file. `scan` was already correct after the v2 refactor (it uses the shared `_diagnose_config_error` dispatcher), but the dispatcher itself now walks every candidate manifest path (direct `-c <file>`, `--workspace` discovery, or glob expansion) instead of only recognising the bare `-c <file>` case. New `_candidate_manifest_paths` helper centralises the enumeration; it never raises so the agent-mode dispatch path remains panic-proof. `_missing_manifest_workspace` now falls back to `cwd` when the config is a glob, so the rank-1 detect command no longer carries literal `*` characters.

P2: `_quote_path` applied to the SHIP-DIAG-MCP-OPENAPI-ARTIFACT-ONLY rank-1 command. The detect command for an artifact-only workspace with spaces in the path now round-trips through shlex.split() like the other generated commands.

Plus 4 regression tests:
- doctor --workspace with invalid shipgate.yaml dispatches to INVALID-MANIFEST with the correct path
- scan with a glob `*/shipgate.yaml` against an invalid file dispatches to INVALID-MANIFEST
- glob with no matches falls back to cwd-based MISSING-MANIFEST, not a workspace argument containing literal `*`
- artifact-only detect command for a spaced workspace shell-quotes correctly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-07T22:14:57Z

Round 3 addressed in 0b83fd5.

[P1] Invalid manifests in --workspace / glob mode — fixed two places.

The doctor handler now catches ConfigError separately for the discovery phase vs the per-path inspect phase. The inner handler runs with the failing path in scope, so it dispatches diagnose_invalid_manifest(path, message=str(exc)) directly — no guessing which discovered manifest the loader choked on. Regression-tested in test_doctor_workspace_dispatches_invalid_manifest.

The _diagnose_config_error dispatcher (used by scan and by doctor's discovery-phase handler) now walks every candidate manifest path — direct -c <file>, --workspace discovery, or glob.glob(...) expansion — and emits SHIP-DIAG-INVALID-MANIFEST for the first one that's a real file. New _candidate_manifest_paths helper centralises the enumeration; it swallows any inner exception so the agent-mode dispatch path remains panic-proof. Regression-tested in test_scan_glob_dispatches_invalid_manifest.

_missing_manifest_workspace now falls back to cwd when the config string contains glob metacharacters, so the rank-1 detect --workspace ... command no longer carries a literal *. Regression-tested in test_glob_with_no_matches_yields_workspace_cwd_not_glob_chars.

[P2] MCP-OPENAPI-ARTIFACT-ONLY workspace not quoted — fixed. _quote_path() now wraps the workspace in the rank-1 command for that diagnostic too. Regression-tested in test_artifact_only_command_quotes_workspace_with_spaces (creates a spaced workspace path, plants mcp-tools.json to trigger the diagnostic, asserts the emitted command round-trips through shlex.split() and yields the original path verbatim).

Verification: python -m ruff check . clean; python -m pytest -q 476 passed (4 new since round 2).

P1: split scan's CLI option-parsing into its own try/except so flag errors (`--format txt`, `--ci-mode yolo`, `--fail-on banana`, `--packet-format docx`) don't reach the manifest dispatch helper. They now emit a `kind="review"` action with guidance to fix the flag value and re-run, instead of misreporting the manifest as invalid.

P2: `_missing_manifest_workspace` now derives the longest non-glob path prefix for glob configs instead of unconditionally falling back to cwd. New `_glob_non_glob_prefix` helper walks `Path.parts` and stops at the first component containing a glob metacharacter. So `scan -c /tmp/repo/*/shipgate.yaml` from /tmp/elsewhere now routes recovery at /tmp/repo. Purely-relative globs with no leading non-glob component keep the existing cwd-fallback (there's no useful prefix to route to).

Plus 4 regression tests:
- bad `--fail-on` flag does not dispatch to INVALID-MANIFEST
- parametrized coverage for `--format`, `--ci-mode`, `--packet-format`
- absolute glob with no matches routes to the glob prefix, not cwd
- relative glob with no matches still falls back to cwd

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-07T22:34:11Z

Round 4 addressed in 80ae697.

[P2] Scan option errors misreported as invalid manifests — fixed. CLI option parsing (_parse_formats, _parse_packet_formats, the --ci-mode literal check, _parse_fail_on) now lives in its own try/except block ahead of any manifest work. A ConfigError raised from option parsing emits a kind="review" action with guidance to fix the flag value and re-run; the manifest dispatch helper is never reached. Regression-tested in test_scan_bad_flag_value_does_not_dispatch_to_invalid_manifest (the --fail-on banana case from the review) and a parametrized companion test_scan_other_bad_flag_values_skip_manifest_diagnostic covering --format, --ci-mode, and --packet-format. doctor doesn't have these option parsers so no change needed there.

[P2] Absolute glob no-match recovery targets caller cwd — fixed. New _glob_non_glob_prefix(config) helper walks Path(config).parts and returns the longest leading sequence of components without glob metacharacters; _missing_manifest_workspace calls it for glob configs. So scan -c /tmp/repo/*/shipgate.yaml from /tmp/elsewhere now emits agents-shipgate detect --workspace /tmp/repo --json. Purely-relative globs with no leading non-glob component (e.g. */shipgate.yaml or **/shipgate.yaml) keep the existing cwd-fallback — there's no useful prefix to route to. Both behaviors regression-tested: test_absolute_glob_no_match_targets_glob_prefix and test_relative_glob_no_match_still_falls_back_to_cwd.
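The prefix walk can be sketched as follows. This is an approximation under stated assumptions: the set of metacharacters (`*`, `?`, `[`) is a guess at what the helper treats as glob syntax, the function returns `None` for "no usable prefix" so the caller can fall back to cwd, and `PurePosixPath` is used instead of `Path` only to keep the example platform-independent.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Assumed metacharacter set; the PR does not spell out which characters count.
GLOB_CHARS = frozenset("*?[")


def glob_non_glob_prefix(config: str) -> Optional[PurePosixPath]:
    """Return the longest leading run of path components with no glob characters."""
    prefix: List[str] = []
    for part in PurePosixPath(config).parts:
        if any(ch in GLOB_CHARS for ch in part):
            break  # stop at the first component that contains glob syntax
        prefix.append(part)
    # An empty prefix means the glob is purely relative: nothing useful to route to.
    return PurePosixPath(*prefix) if prefix else None
```

For `/tmp/repo/*/shipgate.yaml` the walk keeps `/`, `tmp`, and `repo`, then stops at `*`, which gives the `/tmp/repo` workspace quoted in the command above.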

Verification: python -m ruff check . clean; python -m pytest -q 482 passed (6 new since round 3).

pengfei-threemoonslab and others added 2 commits May 7, 2026 14:29

pengfei-threemoonslab merged commit 5e95749 into main May 7, 2026
1 check passed

pengfei-threemoonslab deleted the claude/gallant-mclean-36d7a4 branch May 7, 2026 22:44

pengfei-threemoonslab merged 5 commits into main from claude/gallant-mclean-36d7a4
