fix: preserve multi-type type union when schema has subschemas (#954)#1000
Open
ndreno wants to merge 24 commits intooxidecomputer:mainfrom
Open
fix: preserve multi-type type union when schema has subschemas (#954)#1000ndreno wants to merge 24 commits intooxidecomputer:mainfrom
type union when schema has subschemas (#954)#1000ndreno wants to merge 24 commits intooxidecomputer:mainfrom
Conversation
Cherry-picked from upstream PR oxidecomputer#991. When a schema defines an untagged enum with both integer and number types, serde tries variants in order. Having Number (f64) before Integer (i64) made the Integer variant unreachable since any integer is also a valid f64. Introduces ReorderedInstanceType to ensure Integer is always matched before Number during deserialization.
… defaults (oxidecomputer#918) When a JSON schema property is both required AND has an explicit default value, typify now recognizes the default and uses it for the Default impl and builder pre-population. Previously, required fields unconditionally mapped to StructPropertyState::Required, ignoring any schema default. This also means required fields with intrinsic defaults (Vec, HashMap, Option) now get serde(default) attributes, making deserialization more lenient while still generating correct Default impls.
…xidecomputer#986) When a JSON schema defines a named integer type with min/max bounds narrower than the native type's range (e.g., uint8 with max: 63), typify now generates TryFrom<T> with range validation instead of From<T>. This prevents constructing invalid values at the type level. Also generates custom Deserialize that validates through TryFrom, FromStr that parses and validates, and keeps the inner field private to enforce the constraint invariant. Adds TypeEntryNewtypeConstraints::Range variant and comprehensive unit tests for extract_integer_range helper.
…ecomputer#843) When schemars stores minimum/maximum as f64, whole numbers like 105 were rendered as "105.0" in generated doc comments. Added normalize_json_numbers() to convert whole-number floats to integers in the JSON schema output used for doc comments.
Tracks upstream PRs to integrate, issues to fix, and a phased approach from quick wins through architectural changes.
…xidecomputer#948) When enum variants come from symbolic values like =, >, <, >=, ≠, etc., sanitization previously stripped all non-identifier characters, producing empty or colliding names (all becoming "X") and causing a panic. Added expand_symbols() to convert common operators and symbols into English names (e.g. "=" → "eq", ">=" → "gt_eq", "≠" → "neq") as an intermediate step before falling back to X-replacement. Unicode and ASCII equivalents get distinct names to avoid collisions. Added test schema with the ComparatorString type from the original issue (Factorio schema) and comprehensive unit tests for expand_symbols.
…omputer#414) The anyOf fallback previously generated structs with #[serde(flatten)] Option<T> fields, which panicked at runtime on non-object types (oxidecomputer#895, oxidecomputer#710) and hit unreachable!() in defaults.rs (oxidecomputer#790). Replace with delegation to convert_one_of, which produces proper untagged enums that handle all types correctly. This is a one-line core change that fixes multiple issues at once. - Remove dead flattened_union_struct function - Add defensive handling in defaults.rs for non-struct flattened types - Add test schema exercising anyOf with strings, mixed primitives, overlapping objects, and ref+inline combinations - Gate mutual exclusivity functions as #[cfg(test)] (no longer used in production code path)
Additional test cases for the anyOf overhaul: - null + overlapping types (3-way anyOf) - mixed objects and primitives (flatten would have panicked) - single subschema (degenerate case) - many primitives (string, integer, number, boolean, null) All generate proper untagged enums with no #[serde(flatten)] on non-object types.
…er#480, oxidecomputer#954, oxidecomputer#489, oxidecomputer#435) if/then/else transformation (oxidecomputer#480): Transform if/then/else into oneOf in merge.rs before downstream processing. The then branch becomes allOf(if, then), the else branch becomes allOf(not(if), else). This produces proper untagged enums in Rust. not schema fixes (oxidecomputer#954, oxidecomputer#489, oxidecomputer#435): - Replace todo!() panic in convert_not with graceful fallback to serde_json::Value for unhandled not schemas - Replace panic!() for mixed-type not enum values with fallback - Handle null/array/object types in not enum value inference - Replace todo!() panics in merge.rs not handling with best-effort behavior (return schema unchanged instead of panicking) Test coverage: - if/then/else with basic discrimination and if/then without else - not: { type: "object" } and not: { type: "string" } (issue oxidecomputer#954) - Array items with not type constraint
18 new runtime tests verifying actual serde behavior, not just codegen: - PR oxidecomputer#991: IntOrStr deserializes 42 as Integer (not swallowed by Number) - PR oxidecomputer#918: RequiredWithDefaults::default() has correct values; deserializing {} uses schema defaults - PR oxidecomputer#986: Dscp TryFrom rejects 64, accepts 0-63; serde rejects out-of-range JSON values - PR oxidecomputer#948: All 9 comparator symbols round-trip through serde correctly - PR oxidecomputer#414: anyOf with object+string+integer deserializes without panic - PR oxidecomputer#954: not:{type:"object"} accepts primitives; array items work Also fixes pre-existing CustomMap test compilation (missing Default and is_empty) and adds serde derive feature to typify-test.
…xidecomputer#882, oxidecomputer#975) oxidecomputer#882 — Replace assertion panics in merge.rs for patternProperties and propertyNames with best-effort pass-through behavior. oxidecomputer#975 — When min and max are both specified, find the smallest integer type that contains the range (iterate forward instead of reverse). Integer [1..32] now generates NonZeroU8 instead of NonZeroU64. Fix range constraint codegen for NonZero types: use .get() to extract the inner value for comparison instead of invalid `as` cast. Runtime tests: SmallRange (NonZeroU8) TryFrom and serde validation.
…idecomputer#579, oxidecomputer#201, oxidecomputer#828, oxidecomputer#933) Two new preprocessing layers that transform schemas before the existing pipeline processes them, requiring zero changes to the core conversion: 1. Draft Normalizer (normalize.rs): Detects 2020-12 schemas and transforms keywords to draft-07: - $defs → definitions - prefixItems → items (array form), items → additionalItems - $ref alongside keywords → allOf wrapping - dependentRequired/dependentSchemas → dependencies - unevaluatedProperties → additionalProperties (best-effort) - $dynamicRef → $ref (best-effort) 2. Reference Bundler (bundler.rs): - Resolves non-$defs internal refs (#/properties/foo → #/definitions/Foo) - Bundles external file refs (other.json#/definitions/X) into local defs New public API: - TypeSpace::add_schema_from_value() — auto-normalizes any draft - TypeSpace::add_schema_with_externals() — handles multi-file schemas Safety: ref_key() and convert_reference() no longer panic on unrecognized refs — return proper errors instead. Tests: 10 normalizer unit tests, 8 bundler unit tests, 3 runtime integration tests (2020-12 schema, external refs, serde round-trip).
Add proper test schema fixtures for both modern JSON Schema drafts: - draft-2020-12.json: $defs, dependentRequired, unevaluatedProperties, $ref alongside other keywords - draft-2019-09.json: $defs, dependentSchemas, unevaluatedProperties Update test harness to use add_schema_from_value() instead of add_root_schema() so the normalizer runs on all test schemas. This ensures 2020-12/2019-09 keywords are properly transformed to draft-07 before the pipeline processes them. Coverage now spans: draft-04, draft-07, 2019-09, 2020-12.
README.md: - Update AnyOf section: now generates untagged enums (not broken flattened structs) - Add If/Then/Else section - Add Not schema section - Add JSON Schema 2020-12 Support section with keyword mapping table - Add External References section with API examples - Remove "Bounded numbers" from WIP (now fixed) CHANGELOG.adoc: - Add comprehensive unreleased section covering all 19 issues fixed - Document all new features: 2020-12 support, external refs, if/then/else, new APIs
The CLI now:
- Auto-discovers external schema files from the input file's directory
(or from --schema-dir if specified) and bundles them automatically
- Uses add_schema_from_value() for automatic 2020-12 normalization
- Supports non-$defs internal refs without panicking
New flag:
--schema-dir <DIR> Directory to search for external schemas
referenced via $ref (defaults to input file dir)
Usage:
cargo typify schema.json # auto-discovers externals
cargo typify schema.json --schema-dir ./schemas # explicit directory
When normalizing prefixItems, inject minItems/maxItems matching the tuple length for closed tuples (items: false). This is required by typify's tuple detection which checks minItems == maxItems. Now `prefixItems: [number, number], items: false` correctly generates a Rust tuple type `(f64, f64)` instead of erroring. Added runtime test deserializing [42.5, -73.0] as Coordinates tuple.
…decomputer#954) When a schema object had both `type: [a, b, c, ...]` and one of `oneOf`/`anyOf`/`allOf`/`not` on the same level, convert_schema_object matched an earlier arm that wildcarded the `instance_type` field (flagged by a pre-existing TODO: "we probably shouldn't ignore this"). The subschema branches were then converted in isolation, silently discarding the outer type union. For the schema in issue oxidecomputer#954: { "type": ["string", "number", "boolean", "array"], "oneOf": [ { "items": { "type": "string" } }, { "items": { "type": "number" } }, { "items": { "type": "boolean" } } ] } typify produced an enum with only three array variants, losing the string/number/boolean alternatives entirely. The fix tightens the match pattern to `None | Some(Single(_))`, letting `Some(Vec(_))` fall through to the existing merge arm which already folds the outer schema into each subschema branch via `try_merge_with_subschemas`. Single-type + subschema behaviour is preserved, so tolerant handling of schemas whose outer type conflicts with a branch (e.g. the rust-collisions fixture) is unchanged. Adds a fixture with eight cases covering the affected code path: - `type:[...]` combined with oneOf, anyOf, allOf, and not - explicit `type: array` in every oneOf branch (primitives pruned) - partially and fully unsatisfiable oneOf - simultaneous oneOf + allOf - singleton-type regression guard (rust-collisions pattern) No existing workspace fixture exercised `type: [...]` alongside subschemas on the same object, so this fix had zero pre-existing regression coverage.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #954.
Problem
When a schema has both a multi-type
type: [...]and a siblingoneOf/anyOf/allOf/noton the same object,convert_schema_objectmatches an arm that wildcardsinstance_type(flagged by a pre-existing// TODO we probably shouldn't ignore this...). The subschema branches are then converted in isolation, silently discarding the outer type union.For the schema from the issue thread:
{ "type": ["string", "number", "boolean", "array"], "oneOf": [ { "items": { "type": "string" } }, { "items": { "type": "number" } }, { "items": { "type": "boolean" } } ] }typify emitted only three array variants, dropping
string/number/booleanentirely:Fix
Tighten the earlier match arm from
instance_type: _toinstance_type: None | Some(SingleOrVec::Single(_)). Schemas withSome(Vec(_))fall through to the existing "Subschemas with other stuff" arm below, which already folds the outer schema into each subschema branch viatry_merge_with_subschemas. Behaviour forNone/Singleis unchanged, so the existing tolerant handling of schemas whose outer singleton type conflicts with a branch (e.g. therust-collisionsfixture) is preserved.After the fix, the issue's schema expands to:
This is not the exact factored shape discussed in the issue thread (hoisting the shared primitive alternatives above a single
Arrayvariant whose inner type is itself an enum). Reaching that would require an additional dedup/factoring pass across theoneOfbranches. This PR restores structural accuracy: every alternative from the outertypeunion is represented, and theitemsconstraint is applied to theArrayvariant of each branch.Tests
No existing fixture exercised
type: [...]alongside subschemas on the same object, so the bug had zero regression coverage. Addstypify/tests/schemas/type-array-with-subschemas.jsonwith eight cases:type:[...]withoneOf/ items-only branches (canonical Panic: Not yet implemented: unhandled not schema Object #954 case)type:[...]withanyOftype:[...]withallOf(constraint refinement, including a numericminLengththat manifests as a newtype wrapper)type:[...]withnot(type exclusion)type+ conflictingoneOfbranch (regression guard for therust-collisionspattern)type:[...]withoneOfwhere every branch pinstype: array(primitives correctly pruned)type:[...]withoneOfwhere some branches are unsatisfiable against the outer type uniontype:[...]withoneOffully unsatisfiable (emits empty enum viaconvert_never, no panic)type:[...]with simultaneousoneOfandallOfThe generated fixture compiles under
#![deny(warnings)]viatrybuild. Full workspacecargo testis green.Out of scope
The top-level anonymous-enum panic at
type_entry.rs:306(get_type_name(...).unwrap()) still trips for schemas whose subschema-with-type-union construct sits directly at the root. Wrapping the construct in$defsis sufficient today. That panic is a separate issue and not addressed here.