Address PR #111 review round 3: runtime wrapper + doctor + markdown

pengfei-threemoonslab · claude · pengfei-threemoonslab · commit 943afda92904 · 2026-05-22T14:23:37.000-07:00
Three more findings from PR #111 review: two P1s and one P2. The v0.20
third-party adapter surface now has parity across all three consumers
(scan dispatcher, doctor introspection, markdown report) and the
documented lenient/strict contract holds end-to-end. 4 new regression
tests, 32 total in test_adapter_entry_point_discovery.py.

P1 #1: adapter runtime failures bypassed loaded_adapters.runtime_errors.

The dispatcher in ``_load_sources`` called ``adapter.load(...)``
directly for ALL adapters. ``run_validated_adapter`` existed in
``inputs/adapter_validation.py`` but was never invoked during a real
scan. A third-party adapter that raised at runtime aborted ``run_scan``
with the raw exception, no report was emitted, and ``--strict-plugins``
never saw the failure. Contradicted the documented "lenient mode
records the failure, continues" contract.

Fix:

- ``_load_inputs`` now builds a
  ``third_party_records: dict[str, LoadedAdapter]`` map from discovery
  (one entry per valid third-party adapter, keyed by source_type) and
  threads it to ``_load_sources(..., third_party_records=...)``.
- ``_load_sources`` (pass 1 / pass 2) checks each adapter's source_type
  against the map. Built-ins keep the direct ``adapter.load(...)`` call
  shape: a built-in raising is a scanner bug and must abort loudly.
  Third-party adapters route through ``run_validated_adapter`` instead,
  which catches every exception, captures it into
  ``LoadedAdapter.runtime_errors`` (and mirrors into the matching
  ``loaded_adapters[].runtime_errors`` dict), and returns ``None`` so
  the dispatcher skips ``_absorb``.
- ``_invoke_per_source_adapter`` learned a ``third_party_record`` kwarg
  that switches to the wrapper. Built-in optional-source
  ``InputParseError`` handling is preserved on the built-in branch.

Result: lenient-mode third-party crashes are captured, the scan
completes, the report contains the failed adapter row with
``runtime_errors`` populated, and ``--strict-plugins`` exits 4 as
documented.
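
The lenient/strict contract described for P1 #1 can be sketched in
isolation. This is a minimal illustration under stated assumptions, not
the project's code: ``AdapterRecord``, ``run_wrapped``, ``dispatch`` and
``exit_code`` are hypothetical stand-ins for ``LoadedAdapter``,
``run_validated_adapter``, the ``_load_sources`` dispatcher and the
``--strict-plugins`` exit logic, whose real signatures may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class AdapterRecord:
    """Hypothetical stand-in for a discovered third-party adapter record."""

    source_type: str
    load: Callable[[], Any]
    runtime_errors: list[str] = field(default_factory=list)


def run_wrapped(record: AdapterRecord) -> Optional[Any]:
    """Call a third-party adapter; capture any exception on its record."""
    try:
        return record.load()
    except Exception as exc:  # lenient mode: record the failure, keep going
        record.runtime_errors.append(f"{type(exc).__name__}: {exc}")
        return None


def dispatch(
    adapters: dict[str, Callable[[], Any]],
    third_party: dict[str, AdapterRecord],
) -> list[Any]:
    """Call built-ins directly; route third-party adapters via the wrapper."""
    results: list[Any] = []
    for source_type, load in adapters.items():
        record = third_party.get(source_type)
        if record is None:
            # Built-in branch: an exception here propagates to the caller.
            results.append(load())
            continue
        result = run_wrapped(record)
        if result is not None:
            # None means the failure was captured, so there is nothing to absorb.
            results.append(result)
    return results


def exit_code(third_party: dict[str, AdapterRecord], strict_plugins: bool) -> int:
    """Strict mode turns any captured runtime error into exit code 4."""
    failed = any(record.runtime_errors for record in third_party.values())
    return 4 if strict_plugins and failed else 0
```

In this sketch a built-in that raises still propagates out of
``dispatch``, while a third-party failure only shows up in
``runtime_errors`` and, under strict mode, in the exit code.
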

P1 #2: doctor couldn't introspect manifests using third-party source
types.

``inspect_sources`` (the doctor command's manifest-introspection entry
point) called ``_load_sources`` against the global ``REGISTRY`` (which
is intentionally builtin-only since the PR #111 P1 #1+#2
per-scan-clone refactor) and did NOT run adapter discovery. So a
manifest with ``tool_sources[].type: demo_source`` that scanned cleanly
now made ``doctor`` crash with
``ConfigError: No adapter registered for source type 'demo_source'``.

Fix:

- ``inspect_sources`` mirrors the ``_load_inputs`` pattern: build
  ``scan_registry = REGISTRY.clone()``, run
  ``discover_third_party_adapters(scan_registry, …)`` (honoring the
  same ``plugins_enabled`` env / override semantics), pass the per-scan
  registry and ``third_party_records`` to ``_load_sources``.
- New ``plugins_enabled: bool | None = None`` kwarg defaults to
  ``None`` so the existing env var governs (no doctor CLI flag change
  required).
- ``loaded_adapters`` surfaced in the doctor payload alongside
  ``policy_packs`` so an operator can confirm what was discovered
  without running a full scan.

P2 #3: Markdown reports hid adapter validation failures.

``report/markdown.py`` rendered ``loaded_policy_packs`` and
``loaded_plugins`` but had no equivalent ``loaded_adapters`` section. A
``load_failed`` adapter appeared in ``report.json`` but not in
``report.md``: the default human report was blind to skipped
third-party extensions.

Fix:

- New ``_append_loaded_adapters(lines, report)`` renders a
  ``## Loaded Adapters`` section listing each entry as
  ``- distribution version: source_type — \`validation_status\``
  followed by per-error indented bullets (validation_error and
  runtime_error). Hidden when ``loaded_adapters`` is empty so clean
  repos see no extra noise.
- Wired into the main rendering flow right after
  ``_append_loaded_plugins``.
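
The ``## Loaded Adapters`` layout described for P2 #3 can be
illustrated with a standalone sketch. It follows the format stated
above but makes assumptions: ``render_loaded_adapters`` and
``escape_md`` are hypothetical names, and ``escape_md`` is only a crude
placeholder for the report's real escaping helper.

```python
from typing import Any

SEPARATOR = "\u2014"  # the dash between source_type and status in the described format


def escape_md(text: Any) -> str:
    """Hypothetical, simplified escaping: drop backticks and newlines."""
    return str(text).replace("`", "'").replace("\n", " ")


def render_loaded_adapters(adapters: list[dict[str, Any]]) -> list[str]:
    """Return the Markdown lines for the section, or [] when nothing was loaded."""
    if not adapters:
        return []  # hidden entirely so clean repos see no extra noise
    lines = ["## Loaded Adapters", ""]
    for adapter in adapters:
        distribution = adapter.get("distribution") or "unknown distribution"
        version = adapter.get("version")
        source_type = adapter.get("source_type") or "unknown source_type"
        status = adapter.get("validation_status") or "unknown"
        suffix = f" {version}" if version else ""
        lines.append(
            f"- {escape_md(distribution)}{escape_md(suffix)}: "
            f"{escape_md(source_type)} {SEPARATOR} `{escape_md(status)}`"
        )
        # One indented bullet per error so a failed adapter is hard to miss.
        for err in adapter.get("validation_errors") or []:
            lines.append(f"  - validation_error: {escape_md(err)}")
        for err in adapter.get("runtime_errors") or []:
            lines.append(f"  - runtime_error: {escape_md(err)}")
    lines.append("")
    return lines
```

Returning the lines rather than appending to a shared list is a choice
made for this sketch so it can be exercised on its own; the function
named in the fix takes ``lines`` and ``report`` instead.
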

Regression tests (4 new in tests/test_adapter_entry_point_discovery.py):

- ``test_runtime_error_in_third_party_adapter_captured_not_propagated``:
  a third-party adapter that raises ``RuntimeError`` at runtime doesn't
  abort ``run_scan``; the failure lands in
  ``loaded_adapters[].runtime_errors``; the scan exits 0.
- ``test_strict_plugins_exits_on_third_party_adapter_runtime_error``:
  end-to-end via CliRunner: ``--strict-plugins`` elevates exit code 4
  specifically on the new adapter-runtime-error path.
- ``test_doctor_resolves_third_party_source_types``: a manifest with
  ``type: demo_source`` and a matching third-party adapter is
  introspected successfully by ``inspect_sources``; the adapter runs;
  the doctor payload surfaces ``loaded_adapters[]``.
- ``test_markdown_report_renders_loaded_adapters_section``: a
  ``load_failed`` adapter shows up in ``report.md`` under the
  ``## Loaded Adapters`` heading with its status and validation error.

Verification: 1268 tests pass; ruff clean; schema roundtrip --check
exits 0; coverage 88.06%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/src/agents_shipgate/cli/scan.py b/src/agents_shipgate/cli/scan.py
@@ -377,13 +377,30 @@ def _load_inputs(
     # ``monkeypatch.setitem(REGISTRY._adapters, …)`` still work.
     scan_registry = REGISTRY.clone()
     loaded_adapters: list[dict[str, Any]] = []
-    discover_third_party_adapters(
+    discovery_records = discover_third_party_adapters(
         scan_registry,
         plugins_enabled=plugins_enabled,
         loaded_adapters=loaded_adapters,
     )
+    # v0.20 (PR #111 review follow-up #2): map of source_type → valid
+    # LoadedAdapter record. Used by ``_load_sources`` to route
+    # third-party adapter ``load()`` calls through
+    # ``run_validated_adapter`` so runtime exceptions land in
+    # ``loaded_adapters[].runtime_errors`` instead of crashing the
+    # scan. Invalid records (validation_status != "valid") are
+    # excluded: they never registered on ``scan_registry`` and so the
+    # dispatcher will never reach them.
+    third_party_records: dict[str, Any] = {
+        record.adapter.source_type: record
+        for record in discovery_records
+        if record.adapter is not None
+    }
     loaded_sources, artifact_bag = _load_sources(
-        manifest, base_dir, verbose=verbose, registry=scan_registry
+        manifest,
+        base_dir,
+        verbose=verbose,
+        registry=scan_registry,
+        third_party_records=third_party_records,
     )
     logger.debug(
         "loaded sources",
@@ -1246,7 +1263,25 @@ def run_scan(
 
 
 
-def inspect_sources(*, config_path: Path, verbose: bool = False) -> dict[str, object]:
+def inspect_sources(
+    *,
+    config_path: Path,
+    verbose: bool = False,
+    plugins_enabled: bool | None = None,
+) -> dict[str, object]:
+    """``doctor``'s manifest-introspection entry point.
+
+    v0.20 (PR #111 review fix follow-up #3): mirrors ``_load_inputs``'s
+    per-scan registry clone + adapter discovery so ``doctor`` can
+    inspect manifests that reference third-party source types. Before
+    this fix, the global ``REGISTRY`` was builtin-only (by design,
+    after the per-scan-registry refactor), so a manifest with
+    ``tool_sources[].type: demo_source`` scanned fine but ``doctor``
+    raised ``ConfigError: No adapter registered``.
+    """
+
+    from agents_shipgate.inputs.protocol import discover_third_party_adapters
+
     manifest = load_manifest(config_path)
     base_dir = config_path.resolve().parent
     unresolved_sources = _resolve_source_paths(manifest, base_dir, config_path)
@@ -1264,7 +1299,30 @@ def inspect_sources(*, config_path: Path, verbose: bool = False) -> dict[str, ob
                 ]
             }
         )
-    loaded_sources, artifact_bag = _load_sources(manifest, base_dir, verbose=verbose)
+    # v0.20 (PR #111 review follow-up #3): build a per-scan registry
+    # for ``doctor`` so it sees the same adapter set as ``scan``. The
+    # global ``REGISTRY`` is builtin-only by design after the
+    # per-scan-clone refactor; without this discovery step,
+    # third-party source types would be unresolvable here.
+    scan_registry = REGISTRY.clone()
+    loaded_adapters: list[dict[str, Any]] = []
+    discovery_records = discover_third_party_adapters(
+        scan_registry,
+        plugins_enabled=plugins_enabled,
+        loaded_adapters=loaded_adapters,
+    )
+    third_party_records: dict[str, Any] = {
+        record.adapter.source_type: record
+        for record in discovery_records
+        if record.adapter is not None
+    }
+    loaded_sources, artifact_bag = _load_sources(
+        manifest,
+        base_dir,
+        verbose=verbose,
+        registry=scan_registry,
+        third_party_records=third_party_records,
+    )
     adk_artifacts = artifact_bag.get("google_adk", GoogleAdkArtifacts)
     langchain_artifacts = artifact_bag.get("langchain", LangChainArtifacts)
     crewai_artifacts = artifact_bag.get("crewai", CrewAiArtifacts)
@@ -1312,6 +1370,11 @@ def inspect_sources(*, config_path: Path, verbose: bool = False) -> dict[str, ob
             else None
         ),
         "policy_packs": [pack.model_dump(mode="json") for pack in policy_packs.loaded],
+        # v0.20 (PR #111 review follow-up #3): surface third-party
+        # adapter discovery results in the doctor payload so the
+        # operator can confirm which extensions were loaded (or why
+        # they were skipped) without running a full scan.
+        "loaded_adapters": loaded_adapters,
         "baseline": _default_baseline_status(base_dir),
         "warnings": warnings,
         "unresolved_sources": unresolved_sources,
@@ -1399,6 +1462,7 @@ def _load_sources(
     *,
     verbose: bool,
     registry: Any = None,
+    third_party_records: dict[str, Any] | None = None,
 ) -> tuple[list[LoadedToolSource], ArtifactBag]:
     """Dispatch every adapter through the supplied ``registry``.
 
@@ -1430,9 +1494,22 @@ def _load_sources(
     ``_load_inputs`` (notably the legacy tests in
     ``tests/test_adapter_registry.py``). New code should always pass
     a per-scan registry.
+
+    v0.20 (PR #111 review fix follow-up #2): ``third_party_records``
+    maps each validated third-party ``source_type`` to its
+    ``LoadedAdapter`` record (from ``discover_third_party_adapters``).
+    When set, the dispatcher routes those adapters through
+    ``run_validated_adapter`` so any exception during their
+    ``load()`` call is captured into
+    ``loaded_adapters[].runtime_errors`` and the scan continues in
+    lenient mode (or trips ``--strict-plugins`` exit 4 in strict
+    mode). Built-in adapters keep the direct call shape — a built-in
+    raising means the scanner itself is broken and must abort loudly.
     """
     if registry is None:
         registry = REGISTRY
+    if third_party_records is None:
+        third_party_records = {}
     per_source_loaded: list[LoadedToolSource] = []
     per_scan_loaded: list[LoadedToolSource] = []
     bag = ArtifactBag()
@@ -1447,9 +1524,20 @@ def _load_sources(
         adapter = registry.require(source.type)
         if adapter.scope != "per_source":
             continue
+        third_party_record = third_party_records.get(source.type)
         result = _invoke_per_source_adapter(
-            adapter, source, base_dir, manifest, verbose=verbose
+            adapter,
+            source,
+            base_dir,
+            manifest,
+            verbose=verbose,
+            third_party_record=third_party_record,
         )
+        if result is None:
+            # Third-party adapter raised at runtime; the wrapper
+            # captured the failure into runtime_errors and we skip
+            # absorbing the (None) result.
+            continue
         _absorb(result, source.type, per_source_loaded, bag, adapter)
 
     # Pass 2 — every per-scan adapter fires once, in registry order.
@@ -1458,7 +1546,22 @@ def _load_sources(
     # configured) and manifest-only adapters (openai_api,
     # anthropic_api, n8n).
     for adapter in registry.per_scan_adapters():
-        result = adapter.load(None, base_dir, manifest)
+        third_party_record = third_party_records.get(adapter.source_type)
+        if third_party_record is not None:
+            from agents_shipgate.inputs.adapter_validation import (
+                run_validated_adapter,
+            )
+
+            result = run_validated_adapter(
+                third_party_record,
+                source=None,
+                base_dir=base_dir,
+                manifest=manifest,
+            )
+            if result is None:
+                continue
+        else:
+            result = adapter.load(None, base_dir, manifest)
         _absorb(result, adapter.source_type, per_scan_loaded, bag, adapter)
 
     return per_source_loaded + per_scan_loaded, bag
@@ -1552,7 +1655,35 @@ def _invoke_per_source_adapter(
     manifest: AgentsShipgateManifest,
     *,
     verbose: bool,
-) -> LoadedAdapterResult:
+    third_party_record: Any = None,
+) -> LoadedAdapterResult | None:
+    """Invoke a per_source adapter and return its result.
+
+    For **built-in** adapters: catch ``InputParseError`` only when the
+    source is marked ``optional`` (returning a warning-only stub);
+    any other exception propagates. A built-in raising means the
+    scanner is broken and must abort loudly.
+
+    For **third-party** adapters (``third_party_record`` is the
+    matching ``LoadedAdapter``): route through
+    ``run_validated_adapter``, which captures ALL exceptions into the
+    record's ``runtime_errors`` list and returns ``None``. Returning
+    ``None`` signals the caller to skip ``_absorb`` for this source —
+    the scan continues in lenient mode and ``--strict-plugins`` sees
+    the runtime error on exit.
+    """
+
+    if third_party_record is not None:
+        from agents_shipgate.inputs.adapter_validation import (
+            run_validated_adapter,
+        )
+
+        return run_validated_adapter(
+            third_party_record,
+            source=source,
+            base_dir=base_dir,
+            manifest=manifest,
+        )
     try:
         return adapter.load(source, base_dir, manifest)
     except InputParseError:
diff --git a/src/agents_shipgate/report/markdown.py b/src/agents_shipgate/report/markdown.py
@@ -104,6 +104,7 @@ def render_markdown_report(report: ReadinessReport) -> str:
     _append_source_warnings(lines, report)
     _append_loaded_policy_packs(lines, report)
     _append_loaded_plugins(lines, report)
+    _append_loaded_adapters(lines, report)
     _append_tool_surface(lines, report)
     _append_action_surface_diff(lines, report)
     _append_tool_surface_diff(lines, report)
@@ -461,6 +462,44 @@ def _append_loaded_policy_packs(lines: list[str], report: ReadinessReport) -> No
     lines.append("")
 
 
+def _append_loaded_adapters(lines: list[str], report: ReadinessReport) -> None:
+    """v0.20 (PR #111 review follow-up #4): render third-party adapter
+    provenance in the human-readable report.
+
+    Before this section landed, ``load_failed`` and other invalid
+    adapters appeared in ``report.json`` only — the default human
+    report was blind to skipped third-party extensions. Show the
+    validation status prominently so the reviewer sees what was
+    skipped (and why) without opening the JSON.
+    """
+
+    adapters = getattr(report, "loaded_adapters", None) or []
+    if not adapters:
+        return
+    lines.extend(["## Loaded Adapters", ""])
+    for adapter in adapters:
+        distribution = adapter.get("distribution") or "unknown distribution"
+        version = adapter.get("version")
+        source_type = adapter.get("source_type") or "unknown source_type"
+        status = adapter.get("validation_status") or "unknown"
+        suffix = f" {version}" if version else ""
+        head = (
+            f"- {_safe_markdown_text(distribution)}"
+            f"{_safe_markdown_text(suffix)}: "
+            f"{_safe_markdown_text(source_type)} — "
+            f"`{_safe_markdown_text(status)}`"
+        )
+        lines.append(head)
+        # Surface every validation_error / runtime_error on its own
+        # indented bullet so a reviewer scanning the report cannot
+        # miss an adapter that failed to load or crashed at runtime.
+        for err in adapter.get("validation_errors") or []:
+            lines.append(f"  - validation_error: {_safe_markdown_text(err)}")
+        for err in adapter.get("runtime_errors") or []:
+            lines.append(f"  - runtime_error: {_safe_markdown_text(err)}")
+    lines.append("")
+
+
 def _append_tool_surface(lines: list[str], report: ReadinessReport) -> None:
     surface = report.tool_surface
     lines.extend(
diff --git a/tests/test_adapter_entry_point_discovery.py b/tests/test_adapter_entry_point_discovery.py