schema: close remaining hand-sync seams in docs/output-schema.json (follow-up to #338) #384

@BartWaardenburg

Follow-up to #338. #338 made the in-scope definitions/ block in docs/output-schema.json derive from Rust schemars and gated drift on every cargo test. A handful of seams in the Rust to schema to wire chain are still hand-synced; this issue tracks closing them.

Sequencing

Dependency-first, not size-first. Item 5 is the structural backstop that lets every later refactor land safely (no silent shape regression on any of the 182 definitions). The hard chain is:

5 -> 1 -> 6

Item 5 (tighten drift gate) must precede item 1 (retire augment_finding_definition) because item 1 touches every finding definition and we want the gate to fire on every shape change. Item 1 must precede item 6 (document-root oneOf from a Rust FallowOutput enum) because item 6 wants actions modeled on the typed structs, not grafted on after derivation.

Item 6 is the strategic close: a typed FallowOutput enum with a kind discriminator makes fallow's JSON output machine-discriminable in O(1) without try-parsing 11 variants. That is the headline win for AI / agent consumers; it ships last only because it depends on items 1 and 5.

Item 4 is a tiny artifact-level change that opens the PR with the headline item visible.

Open seams

1. Retire augment_finding_definition

crates/cli/src/bin/schema_emit.rs::augment_finding_definition grafts an actions array and optional introduced flag onto every finding definition after derivation, because the Rust source structs do not carry those fields. The mapping from finding type to action $ref (finding_augmentation()) is hand-coupled to actions_for_issue_type / inject_health_actions / inject_dupes_actions in crates/cli/src/report/json.rs. Moving the runtime over to typed wrappers retires the post-pass and the coupling.

  • Add typed actions: Vec<IssueAction> (or per-finding action wrapper) to the Rust finding structs.
  • Add typed introduced: Option<AuditIntroduced> where applicable.
  • Switch the JSON layer in crates/cli/src/report/json.rs to serialize through the typed wrappers instead of post-pass injection.
  • Delete augment_finding_definition and the finding_definition_names() / finding_augmentation() lists.
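
A minimal sketch of what a finding struct could look like once it owns its actions. Only `IssueAction` and `AuditIntroduced` are names from this issue; `UnusedExportFinding`, its fields, the enum variants, and the `remove-export` action are hypothetical. The real structs would presumably also derive `Serialize` and `JsonSchema`, which is omitted here so the sketch compiles with the standard library alone.

```rust
// Hypothetical sketch: every name except `IssueAction` and `AuditIntroduced`
// is illustrative and not taken from the fallow source.

/// One suggested action for a finding (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct IssueAction {
    pub kind: &'static str,
    pub description: &'static str,
}

/// Whether an audit run considers the finding newly introduced (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AuditIntroduced {
    Introduced,
    Inherited,
}

/// A finding that carries its own actions, so a derived schema would see
/// `actions` and `introduced` without any post-pass.
#[derive(Debug, Clone, PartialEq)]
pub struct UnusedExportFinding {
    pub path: String,
    pub export_name: String,
    pub actions: Vec<IssueAction>,
    pub introduced: Option<AuditIntroduced>,
}

impl UnusedExportFinding {
    /// Attach the actions at construction time instead of injecting them
    /// into the serialized JSON afterwards.
    pub fn new(path: &str, export_name: &str) -> Self {
        Self {
            path: path.to_string(),
            export_name: export_name.to_string(),
            actions: vec![IssueAction {
                kind: "remove-export",
                description: "Remove the unused export",
            }],
            introduced: None,
        }
    }
}
```

With this shape, the finding-type-to-action mapping lives in one constructor per finding rather than in a separate list that has to be kept in step with the JSON layer.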

2. Retire augment_runtime_coverage_report

augment_runtime_coverage_report in schema_emit.rs hard-codes the runtime-coverage schema_version: "1" property onto the derived RuntimeCoverageReport definition. Bumping RUNTIME_COVERAGE_SCHEMA_VERSION in report/json.rs requires hand-editing the augmentation in the same PR (called out as MAINTENANCE: in source).

  • Add a typed schema_version: RuntimeCoverageSchemaVersion field to RuntimeCoverageReport.
  • Remove the post-pass augmentation.
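
One possible shape for the typed field, sketched with the standard library only. The marker type can only ever render as the constant, so a version bump is a one-line change. The `Display` impl stands in for whatever serde / schemars impls the real type would carry; that substitution and the elided report fields are assumptions.

```rust
use std::fmt;

// Mirrors the constant named in the issue; the value "1" is taken from the issue text.
pub const RUNTIME_COVERAGE_SCHEMA_VERSION: &str = "1";

/// Zero-sized marker: its only possible rendering is the constant above.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct RuntimeCoverageSchemaVersion;

impl fmt::Display for RuntimeCoverageSchemaVersion {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(RUNTIME_COVERAGE_SCHEMA_VERSION)
    }
}

/// Report carrying the version as a typed field (other fields elided).
#[derive(Debug, Default)]
pub struct RuntimeCoverageReport {
    pub schema_version: RuntimeCoverageSchemaVersion,
}
```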

3. Migrate dynamic envelope builders to typed structs

The typed envelope structs (CodeClimateOutput, ReviewEnvelopeOutput, CoverageSetupOutput) lock the schema shape via the drift gate, but the runtime emit path still builds the wire output as serde_json::Value via json! macros. A field added to a struct therefore does not flow to the wire automatically: the schema gate stays green while the wire output silently omits the field. The dead_code allow at crates/cli/src/output_envelope.rs:34-37 is the breadcrumb.

Three independent migrations; each can land as its own PR.

3a. CodeClimateOutput

  • Swap crates/cli/src/report/codeclimate.rs::cc_issue to construct CodeClimateIssue.
  • Update the dead_code allow list accordingly.

3b. ReviewEnvelopeOutput

  • Swap the review-envelope renderer in crates/cli/src/report/ci/review.rs to construct ReviewEnvelopeOutput / GitHubReviewComment / GitLabReviewComment.
  • Update the dead_code allow list accordingly.

3c. CoverageSetupOutput

  • Swap crates/cli/src/coverage/mod.rs::build_setup_json to construct CoverageSetupOutput.
  • Remove the trailing dead_code allow on output_envelope.rs once 3a + 3b + 3c are all in.

4. Add $id URL for SHA-pinning

Consumers that vendor docs/output-schema.json for runtime validation cannot SHA-pin a specific schema revision today.

  • Add a stable $id to the document root pointing at the canonical raw GitHub URL.
  • Document the pinning contract in docs/backwards-compatibility.md (replace main with a tag for stability; ajv does NOT fetch $id over the network by default).
  • Include a copy-pasteable ajv strict setup snippet so first-time pinners have a recipe.

5. Tighten drift gate to cover every committed definition

derived_definition_names() enumerates ~108 entries; the committed schema carries 182 definitions. The other ~74 (transitive helpers like AnalysisResults, AttributedCloneGroup, AuditSummary, TrendDirection, every kebab-case enum) are overwritten on regen and only protected by git diff --exit-code in CI, not by structural normalization. Schemars-version churn there can land silently if someone runs the regen without inspecting the diff.

  • Walk every key emitted by derived_definitions() in drift_tests::committed_definitions_match_derived_structurally, not just the explicit allow-list.
  • Keep the explicit allow-list of finding types separate (still needed by the augmentation pass until item 1 lands).
  • Audit the helpers for any shape divergence the loose gate missed; reconcile.
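
The "walk every key" idea can be sketched as a comparison over the union of both key sets. This is not the existing `drift_tests` code: `Drift` and `find_drift` are hypothetical names, and each definition is represented as an already-normalized string, whereas the real gate presumably compares normalized JSON values.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Differences between the committed and the derived definition maps.
#[derive(Debug, Default, PartialEq)]
pub struct Drift {
    pub missing_from_committed: Vec<String>,
    pub missing_from_derived: Vec<String>,
    pub shape_changed: Vec<String>,
}

impl Drift {
    pub fn is_clean(&self) -> bool {
        self.missing_from_committed.is_empty()
            && self.missing_from_derived.is_empty()
            && self.shape_changed.is_empty()
    }
}

/// Compare every definition name present on either side, not an allow-list.
/// Map values are assumed to be normalized shapes (plain strings here).
pub fn find_drift(
    committed: &BTreeMap<String, String>,
    derived: &BTreeMap<String, String>,
) -> Drift {
    // Union of both key sets, in sorted order for stable reporting.
    let names: BTreeSet<&String> = committed.keys().chain(derived.keys()).collect();
    let mut drift = Drift::default();
    for name in names {
        match (committed.get(name), derived.get(name)) {
            (Some(c), Some(d)) if c != d => drift.shape_changed.push(name.clone()),
            (Some(_), Some(_)) => {}
            (None, Some(_)) => drift.missing_from_committed.push(name.clone()),
            (Some(_), None) => drift.missing_from_derived.push(name.clone()),
            (None, None) => unreachable!("name came from one of the two maps"),
        }
    }
    drift
}
```

A test built on this would assert `find_drift(..).is_clean()`, so a shape change in a transitive helper fails the test instead of relying on someone reading the regen diff.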

6. Derive top-level oneOf / title / description from Rust

The document-root $schema, title, description, and the 11-entry oneOf discriminator in docs/output-schema.json are hand-maintained. Each oneOf branch is a $ref to a typed envelope, but the array structure and ordering are not generated. A typed top-level FallowOutput enum with #[serde(tag = "kind")] would let consumers discriminate on a single field rather than try-parsing every variant; this is the agent-discriminability headline.

  • Add JsonSchema to a top-level FallowOutput enum whose variants are the existing 11 envelope shapes.
  • Add a kind discriminator (or equivalent tag) so AI / agent consumers can route in O(1).
  • Emit the document-root from Rust instead of stitching the derived definitions into a hand-written wrapper.
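
A rough sketch of the discriminator idea, using the standard library only. Three of the envelope names come from this issue; the remaining variants are elided, and the `kind` strings and the `envelope_for_kind` helper are hypothetical. In the real enum the tag would come from `#[serde(tag = "kind")]` as proposed above, not from a hand-written method.

```rust
/// Stand-ins for three envelope structs named in the issue (bodies elided).
#[derive(Debug, PartialEq)]
pub struct CodeClimateOutput;
#[derive(Debug, PartialEq)]
pub struct ReviewEnvelopeOutput;
#[derive(Debug, PartialEq)]
pub struct CoverageSetupOutput;

/// Top-level output: one variant per envelope shape.
#[derive(Debug, PartialEq)]
pub enum FallowOutput {
    CodeClimate(CodeClimateOutput),
    ReviewEnvelope(ReviewEnvelopeOutput),
    CoverageSetup(CoverageSetupOutput),
    // ... remaining envelope variants elided ...
}

impl FallowOutput {
    /// Tag value written to the `kind` field (the strings are assumptions).
    pub fn kind(&self) -> &'static str {
        match self {
            FallowOutput::CodeClimate(_) => "codeclimate",
            FallowOutput::ReviewEnvelope(_) => "review-envelope",
            FallowOutput::CoverageSetup(_) => "coverage-setup",
        }
    }
}

/// Consumer side: pick the envelope type from the tag alone, with no
/// attempt to parse the payload against every variant in turn.
pub fn envelope_for_kind(kind: &str) -> Option<&'static str> {
    match kind {
        "codeclimate" => Some("CodeClimateOutput"),
        "review-envelope" => Some("ReviewEnvelopeOutput"),
        "coverage-setup" => Some("CoverageSetupOutput"),
        _ => None,
    }
}
```

A consumer reads `kind` once and dispatches; an unknown tag is an explicit `None` rather than a cascade of failed parses.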

Why one meta-issue

Outside the 5 -> 1 -> 6 chain, the seams are independent and can be picked up in any order, and each one is a small focused change rather than a multi-day migration. Tracking them as checkboxes here keeps the issue list tight and lets the next contributor pick whichever ladder rung is unblocked.
