test: validate writer output against Vega-Lite v6 JSON Schema #175

Open
cpsievert wants to merge 12 commits into posit-dev:main from cpsievert:feat/vegalite-schema-validation

cpsievert (Collaborator) commented Mar 6, 2026

Summary

Add Vega-Lite v6 JSON Schema validation to the writer test suite. Every test that produces a full spec now validates against the official schema, catching structural errors at test time rather than downstream in Altair or Vega-Embed.

  • Vendor VL v6 schema (src/writer/vegalite/schema/v6.json, ~1.8MB)
  • Add jsonschema = "0.44" dev-dependency
  • Add assert_valid_vegalite() test helper using LazyLock<Validator> (parses schema once per test run)
  • Add schema validation to 14 existing writer tests
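
The parse-once behaviour of the helper can be illustrated with a minimal, std-only sketch. `MarkValidator`, `COMPILE_COUNT`, and `assert_valid_mark` are hypothetical stand-ins invented for this example; the PR's actual helper wraps a `jsonschema` validator, which is not reproduced here.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Counts how often the stand-in "schema" is compiled (illustration only).
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a compiled schema validator.
struct MarkValidator {
    allowed_marks: HashSet<&'static str>,
}

impl MarkValidator {
    /// Simulates the expensive step (parsing and compiling a large schema).
    fn compile() -> Self {
        COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
        MarkValidator {
            allowed_marks: ["point", "line", "bar"].iter().copied().collect(),
        }
    }

    fn is_valid(&self, mark: &str) -> bool {
        self.allowed_marks.contains(mark)
    }
}

/// Initialized on first access, then shared by every test in the process.
static VALIDATOR: LazyLock<MarkValidator> = LazyLock::new(MarkValidator::compile);

/// Test helper in the spirit of `assert_valid_vegalite()`: panics on invalid input.
fn assert_valid_mark(mark: &str) {
    assert!(VALIDATOR.is_valid(mark), "invalid mark: {}", mark);
}
```

However many tests call `assert_valid_mark`, `compile()` runs only on the first access to `VALIDATOR`, which is the property that matters for a ~1.8MB schema.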

Bugs found and fixed

1. height/width: "container" on faceted specs (3 tests)

  • Root cause: The writer unconditionally set "height": "container" and "width": "container" at the top level. When apply_faceting() restructured the spec into a FacetSpec, these properties remained at the top level, which is invalid in VL v6.
  • Fix: Move width/height into the inner spec object when building faceted specs (both Wrap and Grid layouts).
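
A rough sketch of this fix, assuming a faceted spec is recognisable by a top-level `facet` key alongside an inner `spec` object. The `Json` enum and `move_size_into_inner_spec` are hypothetical stand-ins written for this example, not the writer's actual types or function names.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a JSON value (hypothetical; real code would likely use serde_json).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Obj(BTreeMap<String, Json>),
}

const SIZE_KEYS: [&str; 2] = ["width", "height"];

/// Moves top-level "width"/"height" into the inner "spec" of a faceted spec.
/// Specs without both "facet" and an object-valued "spec" are left untouched.
fn move_size_into_inner_spec(top: &mut BTreeMap<String, Json>) {
    let is_faceted = top.contains_key("facet")
        && matches!(top.get("spec"), Some(Json::Obj(_)));
    if !is_faceted {
        return;
    }

    // Detach the size properties from the top level first...
    let mut moved = Vec::new();
    for &key in SIZE_KEYS.iter() {
        if let Some(value) = top.remove(key) {
            moved.push((key, value));
        }
    }

    // ...then re-attach them to the inner spec object.
    if let Some(Json::Obj(inner)) = top.get_mut("spec") {
        for (key, value) in moved {
            inner.insert(key.to_string(), value);
        }
    }
}
```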

2. bin: "binned" on non-positional encodings (1 test)

  • Root cause: The writer unconditionally emitted "bin": "binned" for all aesthetics with a binned scale type. VL v6 only allows the "binned" string on positional channels (x, y). Non-positional channels (color, size, etc.) only accept true, a BinParams object, or null.
  • Fix: Guard bin: "binned" with !is_binned_legend so it's only emitted for positional aesthetics. Binned legend aesthetics already use threshold scales and don't need the bin property.
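
The guard can be sketched as a small pure function. `bin_property` and its signature are hypothetical; only the `is_binned_legend` flag name comes from the description above.

```rust
/// Hypothetical helper: returns the value to emit for an encoding's `bin`
/// property, or `None` when the property should be omitted entirely.
fn bin_property(is_binned_scale: bool, is_binned_legend: bool) -> Option<&'static str> {
    // Only positional aesthetics with a binned scale get `"bin": "binned"`.
    // Binned legend aesthetics rely on a threshold scale instead.
    if is_binned_scale && !is_binned_legend {
        Some("binned")
    } else {
        None
    }
}
```

Returning `None` rather than `Some("false")` or a null value reflects the stated fix: the property is simply not emitted for binned legend aesthetics.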

cpsievert and others added 6 commits March 6, 2026 11:32
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add jsonschema = "0.44" to dev-dependencies in src/Cargo.toml
- Vendor the Vega-Lite v6 JSON Schema to src/writer/vegalite/schema/v6.json
- Add sanitize_vegalite_schema() to rename non-URI-safe definition keys
  (angle brackets, parens, pipes, etc.) so jsonschema can compile the schema
- Add LazyLock<Validator> VL_SCHEMA and assert_valid_vegalite() helper
- Add test_schema_validation_catches_invalid_spec test that verifies a valid
  spec passes and an invalid mark value fails validation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
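
One plausible way to rename non-URI-safe definition keys is a per-character substitution. This is a sketch under that assumption: `sanitize_definition_key` and the exact set of allowed characters are illustrative and may differ from what `sanitize_vegalite_schema()` actually does.

```rust
/// Replaces every character outside `[A-Za-z0-9_.-]` with `_`.
/// Hypothetical mapping; distinct keys could collide after sanitizing,
/// which a real implementation would need to check for.
fn sanitize_definition_key(key: &str) -> String {
    key.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.') {
                c
            } else {
                '_'
            }
        })
        .collect()
}
```

The same mapping has to be applied both to the keys under the schema's definitions object and to every `$ref` that points at them, otherwise the references dangle.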
Add assert_valid_vegalite() calls to 14 writer tests that produce
full specs. 4 tests fail schema validation, revealing real writer bugs:

- 3 facet tests: "height":"container" invalid on FacetSpec in VL v6
- 1 legend test: "values" not a valid legend property in VL v6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In VL v6, the "binned" string value for the `bin` property is only
valid for positional channels (x, y). Non-positional channels such as
color, size, and opacity only accept a boolean, a BinParams object, or
null. Emitting `"bin": "binned"` on a color encoding therefore fails
schema validation.

Fix: guard the `bin: "binned"` assignment with `!is_binned_legend` so
it is only added for positional aesthetics. Binned legend aesthetics
already get the correct treatment via a threshold scale type and do not
need the `bin` property at all.

Also remove a duplicate `jsonschema` entry in src/Cargo.toml that was
introduced while retrofitting existing tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This comment was marked as resolved.

cpsievert and others added 6 commits March 6, 2026 14:55
Replace map_strings (which allocated for every string in the 1.8MB
schema) with rewrite_refs that only visits $ref values in objects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
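
The idea of visiting only `$ref` values, rather than mapping over every string, can be sketched as a recursive walk. The `Json` enum is a hypothetical stand-in for a real JSON value type, and this `rewrite_refs` signature is an assumption rather than the PR's actual one.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a JSON value (hypothetical; real code would likely use serde_json).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Applies `rewrite` to string values stored under a "$ref" key and
/// leaves every other string (descriptions, enum values, ...) untouched.
fn rewrite_refs(node: &mut Json, rewrite: &dyn Fn(&str) -> String) {
    match node {
        Json::Obj(map) => {
            for (key, value) in map.iter_mut() {
                if key.as_str() == "$ref" {
                    if let Json::Str(target) = value {
                        let rewritten = rewrite(target.as_str());
                        *target = rewritten;
                        continue;
                    }
                }
                // Not a string-valued $ref: keep descending.
                rewrite_refs(value, rewrite);
            }
        }
        Json::Arr(items) => {
            for item in items.iter_mut() {
                rewrite_refs(item, rewrite);
            }
        }
        Json::Str(_) => {}
    }
}
```

Because only `$ref` targets are reallocated, the cost scales with the number of references rather than with the number of strings in the schema.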
Add a VEGALITE_VERSION constant ("v6") used by the writer to construct
the $schema URL. The vendored schema file at schema/v6.json must match.
A test verifies the constant and writer URL stay in sync — when bumping
versions, update the constant, rename the vendored file, and update the
include_str! path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
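
A sketch of how a single version constant can drive both the `$schema` URL and the expected vendored path. The helper names `schema_url` and `vendored_schema_path` are hypothetical, and the URL pattern is assumed to follow Vega-Lite's published `vega.github.io/schema/vega-lite/<version>.json` form.

```rust
/// Single source of truth for the targeted Vega-Lite major version.
const VEGALITE_VERSION: &str = "v6";

/// Builds the `$schema` URL written into each spec (assumed URL pattern).
fn schema_url() -> String {
    format!("https://vega.github.io/schema/vega-lite/{}.json", VEGALITE_VERSION)
}

/// Path where the vendored schema is expected to live. `include_str!` needs a
/// string literal, so that path cannot be derived from the constant and has to
/// be updated by hand when the version is bumped.
fn vendored_schema_path() -> String {
    format!("src/writer/vegalite/schema/{}.json", VEGALITE_VERSION)
}
```

A sync test can then assert that the URL emitted by the writer ends with the constant plus `.json`, so a version bump that misses one of the three places fails loudly.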
Download the schema from the writer's $schema URL and compare it to
the vendored copy. Skips gracefully when offline. This catches drift
between the vendored file and upstream Vega-Lite releases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
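
The skip-when-offline behaviour can be sketched by injecting the download step, which keeps the comparison logic testable without network access. `DriftCheck` and `check_schema_drift` are hypothetical names, and treating every fetch error as "offline" is a simplifying assumption of this sketch.

```rust
/// Outcome of comparing the vendored schema with the upstream copy.
#[derive(Debug, PartialEq)]
enum DriftCheck {
    InSync,
    Drifted,
    SkippedOffline,
}

/// `fetch` stands in for the HTTP download. Any error is treated as
/// "offline" and the check is skipped instead of failing the test.
fn check_schema_drift<E>(
    fetch: impl FnOnce() -> Result<String, E>,
    vendored: &str,
) -> DriftCheck {
    match fetch() {
        Err(_) => DriftCheck::SkippedOffline,
        // Plain text comparison, ignoring surrounding whitespace; comparing
        // parsed JSON would be more robust to formatting differences.
        Ok(upstream) if upstream.trim() == vendored.trim() => DriftCheck::InSync,
        Ok(_) => DriftCheck::Drifted,
    }
}
```

A test built on this would fail only on `Drifted` and print a notice on `SkippedOffline`.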
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…agment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>