Add plugin validation pipeline + --strict-plugins (M5) #83

Merged
pengfei-threemoonslab merged 1 commit into main from claude/m5-plugin-validation on May 16, 2026

pengfei-threemoonslab commented on May 16, 2026

Summary

Hardens the third-party plugin loader on agents_shipgate.checks entry points. Adds five load-time validation gates, a runtime result wrapper that blocks check-ID smuggling, and an opt-in --strict-plugins flag. Promotes three new loaded_plugins[] fields to required at v0.17. Existing plugins continue to work unchanged.

Part of the v0.17 Trust Hardening Pass (M5 in the trust-hardening plan).

What changes

Load-time gates (src/agents_shipgate/checks/plugin_validation.py):

  1. load: entry_point.load() exceptions are captured (a failing load no longer aborts the scan).
  2. signature: must accept exactly one required positional parameter; rejects required keyword-only parameters, since the scanner calls plugins as plugin(context) with no kwargs.
  3. metadata: AGENTS_SHIPGATE_METADATA must parse as CheckMetadata; both id and check_id are accepted as the identifier key (v0.17 alias for symmetry with Finding.check_id).
  4. id_collision: plugin check IDs cannot shadow built-ins (including legacy aliases) or earlier plugins in the same scan.
  5. bad_floor: floor_severity cannot exceed default_severity.
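
The gate ordering above can be sketched as a function that returns the first failing status. This is a simplified illustration, not the code in plugin_validation.py: `GateResult`, `validate_plugin`, the severity names in `SEVERITY_RANK`, and the assumption that the metadata dict is an attribute of the loaded object are all hypothetical, and the signature gate is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Hypothetical severity names and ordering, used only for this sketch.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3}


@dataclass
class GateResult:
    status: str = "valid"
    errors: List[str] = field(default_factory=list)


def validate_plugin(load: Callable[[], object], taken_ids: Set[str]) -> GateResult:
    """Run the load, metadata, id_collision and bad_floor gates in order."""
    # Gate 1 (load): a failing load is recorded instead of aborting the scan.
    try:
        plugin = load()
    except Exception as exc:
        return GateResult("load_failed", [f"{type(exc).__name__}: {exc}"])

    # Gate 2 (signature) is omitted from this sketch.

    # Gate 3 (metadata), simplified: accept "id" or "check_id" as the key.
    # Assumption: the metadata dict hangs off the loaded object.
    meta = getattr(plugin, "AGENTS_SHIPGATE_METADATA", None)
    check_id = meta.get("id", meta.get("check_id")) if isinstance(meta, dict) else None
    if not isinstance(check_id, str) or not check_id:
        return GateResult("bad_metadata", ["metadata needs a string 'id' or 'check_id'"])
    floor = meta.get("floor_severity")
    default = meta.get("default_severity")
    for name in (floor, default):
        if name is not None and name not in SEVERITY_RANK:
            return GateResult("bad_metadata", [f"unknown severity {name!r}"])

    # Gate 4 (id_collision): built-in and earlier plugin IDs are already taken.
    if check_id in taken_ids:
        return GateResult("id_collision", [f"check ID {check_id!r} is already registered"])

    # Gate 5 (bad_floor): the floor may not be stricter than the default.
    if floor is not None and default is not None:
        if SEVERITY_RANK[floor] > SEVERITY_RANK[default]:
            return GateResult("bad_floor", ["floor_severity exceeds default_severity"])

    taken_ids.add(check_id)  # later plugins in the same scan cannot reuse this ID
    return GateResult()
```

Seeding `taken_ids` with the built-in IDs and their legacy aliases before the first plugin is validated gives both halves of the id_collision rule from one set.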

Runtime validation (run_validated_plugin):

  • Plugin exceptions captured into loaded_plugins[].runtime_errors; scan continues.
  • Returned values must be list[Finding]; otherwise dropped.
  • Findings whose check_id differs from the plugin's declared id are dropped — a plugin cannot smuggle findings under another check ID. This is the load-bearing trust rule.
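
A minimal sketch of the three runtime rules, assuming stand-in `Finding` and `PluginRecord` types (the real models live in the package and have more fields). Whether dropped results are also logged to runtime_errors is an assumption of this sketch; the PR only states that they are dropped.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Finding:
    """Stand-in for the real Finding model; only check_id matters here."""
    check_id: str
    message: str


@dataclass
class PluginRecord:
    """Stand-in for one loaded_plugins[] entry."""
    declared_id: str
    runtime_errors: List[str] = field(default_factory=list)


def run_plugin_safely(plugin: Callable[[Any], Any], context: Any,
                      record: PluginRecord) -> List[Finding]:
    # Rule 1: exceptions are captured so the scan can continue.
    try:
        result = plugin(context)
    except Exception as exc:
        record.runtime_errors.append(f"{type(exc).__name__}: {exc}")
        return []

    # Rule 2: anything other than list[Finding] is dropped wholesale.
    if not isinstance(result, list) or not all(isinstance(f, Finding) for f in result):
        record.runtime_errors.append("result is not list[Finding]; dropped")
        return []

    # Rule 3: findings filed under a foreign check ID are dropped.
    kept = [f for f in result if f.check_id == record.declared_id]
    dropped = len(result) - len(kept)
    if dropped:
        record.runtime_errors.append(f"dropped {dropped} finding(s) with a foreign check_id")
    return kept
```

Filtering on the declared ID, rather than on a deny-list of built-in IDs, means a plugin also cannot impersonate another plugin.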

Report surface (loaded_plugins[] — required at v0.17):

  • validation_status: valid | load_failed | bad_signature | bad_metadata | id_collision | bad_floor
  • validation_errors: list[str] — empty for clean plugins.
  • runtime_errors: list[str] — empty for clean plugins.

The v0.7 frozen schema continues to pin the original 5-field required list (per the immutability contract). v0.17's 8-field required list is locked by a new paired test test_v17_loaded_plugins_required_includes_validation_fields.
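
For report consumers, the three required fields can be checked with a few lines of plain Python. The sketch below is illustrative only: `entry_problems` is a hypothetical helper, not part of the package, and it inspects only the three fields introduced here, not the five pre-existing ones.

```python
from typing import Any, Dict, List

VALIDATION_STATUSES = frozenset(
    {"valid", "load_failed", "bad_signature", "bad_metadata", "id_collision", "bad_floor"}
)
M5_FIELDS = ("validation_status", "validation_errors", "runtime_errors")


def entry_problems(entry: Dict[str, Any]) -> List[str]:
    """Return the problems found in one loaded_plugins[] entry (empty if none)."""
    problems = [f"missing field: {name}" for name in M5_FIELDS if name not in entry]
    if problems:
        return problems
    if entry["validation_status"] not in VALIDATION_STATUSES:
        problems.append(f"unknown validation_status: {entry['validation_status']!r}")
    for name in ("validation_errors", "runtime_errors"):
        value = entry[name]
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{name} must be a list of strings")
    return problems
```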

CLI (--strict-plugins):

  • Default lenient mode preserves v0.x behavior.
  • --strict-plugins exits 4 if any plugin failed validation or produced runtime errors.
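
The strict-mode decision reduces to a single pass over loaded_plugins[]. The following is a hedged sketch of that logic, not the `_apply_strict_plugins` helper itself; the function name and the dict-shaped entries are assumptions.

```python
from typing import Any, Dict, List

STRICT_PLUGINS_EXIT_CODE = 4  # exit code stated in the PR description


def strict_plugins_failed(loaded_plugins: List[Dict[str, Any]], strict: bool) -> bool:
    """True when --strict-plugins should turn the run into a failure."""
    if not strict:
        return False  # lenient default: problems are reported, never fatal
    return any(
        entry["validation_status"] != "valid" or entry["runtime_errors"]
        for entry in loaded_plugins
    )


# Usage sketch (after the scan has produced its report):
#     if strict_plugins_failed(report["loaded_plugins"], args.strict_plugins):
#         sys.exit(STRICT_PLUGINS_EXIT_CODE)
```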

CheckMetadata extension:

  • Accepts id or check_id via AliasChoices.
  • Adds floor_severity: Severity | None with a model validator that rejects floor > default. Foundation for M1 (manifest-side severity-override floor); unused at the catalog level today (all built-ins default to None).
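
The alias and the floor validator can be approximated with the standard library alone. The real model uses Pydantic's AliasChoices and a model validator; the dataclass below is only a stand-in that mimics both behaviours, and the `Severity` members shown are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Any, Mapping, Optional


class Severity(IntEnum):
    """Hypothetical levels; only their ordering matters for the sketch."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class CheckMetadataSketch:
    id: str
    default_severity: Severity
    floor_severity: Optional[Severity] = None

    def __post_init__(self) -> None:
        # Mirrors the model validator: floor > default is rejected.
        if self.floor_severity is not None and self.floor_severity > self.default_severity:
            raise ValueError("floor_severity cannot exceed default_severity")

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "CheckMetadataSketch":
        # Mirrors the alias: the first of "id" / "check_id" that is present wins.
        for key in ("id", "check_id"):
            if key in raw:
                identifier = raw[key]
                break
        else:
            raise ValueError("metadata needs 'id' or 'check_id'")
        floor = raw.get("floor_severity")
        return cls(
            id=identifier,
            default_severity=Severity[raw["default_severity"].upper()],
            floor_severity=None if floor is None else Severity[floor.upper()],
        )
```

Leaving floor_severity as None keeps every existing catalog entry valid, which matches the note that all built-ins default to None.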

Backward compatibility

  • Existing plugins using AGENTS_SHIPGATE_METADATA = {"id": ...} work unchanged.
  • loaded_plugins[] shape is additive — only new fields appear.
  • Default mode remains lenient; strict mode is opt-in via the new flag.
  • Promoting the new fields to required is the right move at v0.17 since the scanner always emits them and v0.17 is the current contract on main. The v0.7 frozen schema preserves the pre-M5 shape per the schema immutability rule.

Test plan

  • 20 new test cases in tests/test_plugin_validation.py covering each gate (including the keyword-only signature regression), runtime validation, and --strict-plugins exit behavior
  • Existing tests/test_plugins.py::test_report_includes_loaded_plugin_provenance updated for the new fields
  • New tests/test_reports.py::test_v17_loaded_plugins_required_includes_validation_fields locks the v0.17 8-field required shape
  • Full pytest suite passes (pytest --no-header -q green on Python 3.13)
  • python scripts/generate_schemas.py --check clean (no schema drift); CI runs this step before tests
  • docs/checks.json, docs/report-schema.v0.17.json, and llms-full.txt regenerated
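
A drift check of the kind `--check` performs can be sketched as "regenerate in memory, compare with the committed file". This is an assumption about the general technique, not a description of scripts/generate_schemas.py; `render_schema` and `has_drift` are hypothetical names.

```python
import json
from pathlib import Path


def render_schema(schema: dict) -> str:
    """Serialize deterministically so byte comparison is meaningful."""
    return json.dumps(schema, indent=2, sort_keys=True) + "\n"


def has_drift(schema: dict, committed: Path) -> bool:
    """True when the committed file is missing or differs from a fresh render."""
    if not committed.exists():
        return True
    return committed.read_text(encoding="utf-8") != render_schema(schema)
```

Running such a check in CI before the test suite surfaces a forgotten regeneration as a fast, explicit failure.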

Files

| File | Reason |
| --- | --- |
| src/agents_shipgate/checks/plugin_validation.py | New: 5 gates + runtime wrapper + strict_failure_messages |
| src/agents_shipgate/checks/registry.py | Refactor _plugin_check_records to use validator; back-compat _plugin_checks filter; run_checks delegates plugins to run_validated_plugin |
| src/agents_shipgate/core/models.py | CheckMetadata accepts id/check_id alias; adds floor_severity with model validator |
| src/agents_shipgate/cli/_helpers.py | _apply_strict_plugins post-scan helper; threads strict_plugins through _run_multi_scan |
| src/agents_shipgate/cli/_register_scan.py | New --strict-plugins flag wired through single + multi-scan paths |
| scripts/generate_schemas.py | Promote three new loaded_plugins[] fields to v0.17 required list |
| tests/test_plugin_validation.py | New: 20 cases incl. required-kw-only regression |
| tests/test_plugins.py | Update existing test for new loaded_plugins[] fields |
| tests/test_reports.py | New test_v17_loaded_plugins_required_includes_validation_fields |
| STABILITY.md | v0.17+ plugin validation paragraphs + 5-gate description |
| docs/checks.md | Plugin validation paragraph + smuggling rule + alias note |
| docs/checks.json | Regenerated (adds floor_severity: null to every catalog entry) |
| docs/report-schema.v0.17.json | Regenerated with 3 new required fields on loaded_plugins[].items |
| llms-full.txt | Regenerated to reflect doc updates |

🤖 Generated with Claude Code

Plugins were previously loaded with a single `callable(loaded)` check;
malformed entry points, signature mismatches, missing metadata,
id-collisions, and findings emitted under foreign check IDs all slipped
through silently. This change wraps every entry point in a five-gate
validator and a runtime result wrapper.

Load-time gates (`checks/plugin_validation.py::validate_entry_point`):
1. load — `entry_point.load()` exceptions are captured.
2. signature — must accept exactly one required positional param.
3. metadata — `AGENTS_SHIPGATE_METADATA` must parse as `CheckMetadata`;
   both `id` and `check_id` are accepted as the identifier key (M5
   alias for symmetry with `Finding.check_id`).
4. id_collision — plugin check IDs cannot shadow built-ins (including
   legacy aliases) or earlier plugins.
5. bad_floor — `floor_severity` cannot exceed `default_severity`.

Runtime validation (`run_validated_plugin`):
- Exceptions during the plugin call are captured into
  `loaded_plugins[].runtime_errors`; the scan continues.
- Returned values must be `list[Finding]`; otherwise dropped.
- Findings whose `check_id` differs from the plugin's declared id are
  dropped — a plugin cannot smuggle findings under another check ID.
  This is the load-bearing trust rule.

`loaded_plugins[]` gains three additive fields on every entry:
`validation_status`, `validation_errors`, `runtime_errors`. Fields stay
optional in the v0.16 JSON Schema (no schema-version bump) so the M5
PR remains purely additive; a future bump may promote them to required.

New CLI flag `--strict-plugins`: exit non-zero (code 4) if any plugin
failed validation or produced runtime errors. Default lenient mode
preserves v0.x behavior.

`CheckMetadata` gains `floor_severity: Severity | None` with a model
validator rejecting floor > default. The field is the foundation for
M1's manifest-side severity-override floor and is unused at the
catalog level today (all built-ins default to None).

Tests: 18 new cases in `tests/test_plugin_validation.py` (one per gate
plus runtime + strict-mode behavior); existing
`tests/test_plugins.py::test_report_includes_loaded_plugin_provenance`
updated for the new fields. Existing plugins using
`AGENTS_SHIPGATE_METADATA = {"id": ...}` continue to work unchanged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab force-pushed the claude/m5-plugin-validation branch from e6176ff to 5157336 on May 16, 2026 06:01

pengfei-threemoonslab (author) commented:

Thanks for the thorough review. All three findings addressed in 5157336 (force-pushed).

P1.1 — schema downgrade from v0.17 to v0.16. Confirmed. My branch forked at 940fdcf before #78 (M4) and #81 (contribution_rules) landed; the v0.16 / no-contribution_rules state in the previous push was stale-branch fallout, not an intentional contract change. Rebased onto current origin/main; v0.17 + release_decision.contribution_rules[] + docs/report-schema.v0.17.json are preserved.

P1.2 — check_catalog() without plugins_enabled=False. Confirmed. Same stale-branch issue: M4 added the plugins_enabled=False defense + --check mode + tests/test_schema_roundtrip.py + CI step. The rebase brings all of those back. python scripts/generate_schemas.py --check now passes against the committed schema after the M5 regen, and the CI step that runs it before tests is back in .github/workflows/ci.yml.

P1 follow-on — plugin validation fields are now required at v0.17. Since v0.17 is the current schema on main, the three new loaded_plugins[] fields (validation_status, validation_errors, runtime_errors) are now in the v0.17 required list (scripts/generate_schemas.py:684). The v0.7 frozen-schema test still locks the original 5-field shape (per the immutability contract); a paired test_v17_loaded_plugins_required_includes_validation_fields in tests/test_reports.py locks the v0.17 8-field shape.

P2 — signature gate misses required keyword-only params. Confirmed and fixed in src/agents_shipgate/checks/plugin_validation.py:_signature_error. The gate now collects required KEYWORD_ONLY parameters (no default, no **kwargs catch-all) and rejects them as bad_signature, since the scanner calls plugins as plugin(context) with no kwargs. Optional kw-only params (with defaults) and **kwargs are still accepted. Two regression tests cover both shapes:

  • test_gate_signature_rejects_required_keyword_only: def plugin(context, *, required_config) → bad_signature.
  • test_gate_signature_accepts_optional_keyword_only: def plugin(context, *, optional_config=None) → valid.
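
The rule described above can be expressed with the standard inspect module. The sketch below is an independent illustration rather than the body of `_signature_error`; the function name and error strings are assumptions.

```python
import inspect
from typing import Callable, Optional


def signature_error(func: Callable) -> Optional[str]:
    """Return an error message if func cannot be called as func(context)."""
    try:
        params = list(inspect.signature(func).parameters.values())
    except (TypeError, ValueError) as exc:
        return f"cannot inspect signature: {exc}"

    kinds = inspect.Parameter
    # Positional parameters without a default must be supplied by the caller.
    required_positional = [
        p.name for p in params
        if p.kind in (kinds.POSITIONAL_ONLY, kinds.POSITIONAL_OR_KEYWORD)
        and p.default is kinds.empty
    ]
    # Keyword-only parameters without a default can never be satisfied,
    # because the scanner passes no keyword arguments.
    required_kw_only = [
        p.name for p in params
        if p.kind is kinds.KEYWORD_ONLY and p.default is kinds.empty
    ]

    if len(required_positional) != 1:
        return f"expected exactly 1 required positional parameter, found {len(required_positional)}"
    if required_kw_only:
        return f"required keyword-only parameters not allowed: {', '.join(required_kw_only)}"
    return None
```

`**kwargs` has kind VAR_KEYWORD and keyword-only parameters with defaults have a non-empty default, so both pass through untouched, matching the two accepted shapes listed above.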

Full pytest suite green (Python 3.13 in a fresh .venv); python scripts/generate_schemas.py --check clean.

Diff stat vs origin/main:

14 files changed, 1255 insertions(+), 66 deletions(-)

No file changes outside the M5 surface and the additive loaded_plugins[] required-list bump.

pengfei-threemoonslab merged commit 95c38ee into main on May 16, 2026
1 check passed
pengfei-threemoonslab deleted the claude/m5-plugin-validation branch on May 16, 2026 06:12