Skip to content

fix: preserve multi-type type union when schema has subschemas (#954)#1000

Open
ndreno wants to merge 24 commits intooxidecomputer:mainfrom
barbacane-dev:fix/issue-954-type-union-with-subschemas
Open

fix: preserve multi-type type union when schema has subschemas (#954)#1000
ndreno wants to merge 24 commits intooxidecomputer:mainfrom
barbacane-dev:fix/issue-954-type-union-with-subschemas

Conversation

@ndreno
Copy link
Copy Markdown

@ndreno ndreno commented Apr 22, 2026

Fixes #954.

Problem

When a schema has both a multi-type type: [...] and a sibling oneOf / anyOf / allOf / not on the same object, convert_schema_object matches an arm that wildcards instance_type (flagged by a pre-existing // TODO we probably shouldn't ignore this...). The subschema branches are then converted in isolation, silently discarding the outer type union.

For the schema from the issue thread:

{
  "type": ["string", "number", "boolean", "array"],
  "oneOf": [
    { "items": { "type": "string" } },
    { "items": { "type": "number" } },
    { "items": { "type": "boolean" } }
  ]
}

typify emitted only three array variants, dropping string / number / boolean entirely:

pub enum FooValue {
    Variant0(Vec<String>),
    Variant1(Vec<f64>),
    Variant2(Vec<bool>),
}

Fix

Tighten the earlier match arm from instance_type: _ to instance_type: None | Some(SingleOrVec::Single(_)). Schemas with Some(Vec(_)) fall through to the existing "Subschemas with other stuff" arm below, which already folds the outer schema into each subschema branch via try_merge_with_subschemas. Behaviour for None / Single is unchanged, so the existing tolerant handling of schemas whose outer singleton type conflicts with a branch (e.g. the rust-collisions fixture) is preserved.

After the fix, the issue's schema expands to:

pub enum FooValue {
    Variant0(FooValueVariant0),
    Variant1(FooValueVariant1),
    Variant2(FooValueVariant2),
}
pub enum FooValueVariant0 {
    Boolean(bool),
    Number(f64),
    String(String),
    Array(Vec<String>),
}
// Variant1 carries Array(Vec<f64>); Variant2 carries Array(Vec<bool>).

This is not the exact factored shape discussed in the issue thread (hoisting the shared primitive alternatives above a single Array variant whose inner type is itself an enum). Reaching that would require an additional dedup/factoring pass across the oneOf branches. This PR restores structural accuracy: every alternative from the outer type union is represented, and the items constraint is applied to the Array variant of each branch.

Tests

No existing fixture exercised type: [...] alongside subschemas on the same object, so the bug had zero regression coverage. Adds typify/tests/schemas/type-array-with-subschemas.json with eight cases:

  • type:[...] with oneOf / items-only branches (canonical Panic: Not yet implemented: unhandled not schema Object #954 case)
  • type:[...] with anyOf
  • type:[...] with allOf (constraint refinement, including a numeric minLength that manifests as a newtype wrapper)
  • type:[...] with not (type exclusion)
  • Singleton type + conflicting oneOf branch (regression guard for the rust-collisions pattern)
  • type:[...] with oneOf where every branch pins type: array (primitives correctly pruned)
  • type:[...] with oneOf where some branches are unsatisfiable against the outer type union
  • type:[...] with oneOf fully unsatisfiable (emits empty enum via convert_never, no panic)
  • type:[...] with simultaneous oneOf and allOf

The generated fixture compiles under #![deny(warnings)] via trybuild. Full workspace cargo test is green.

Out of scope

The top-level anonymous-enum panic at type_entry.rs:306 (get_type_name(...).unwrap()) still trips for schemas whose subschema-with-type-union construct sits directly at the root. Wrapping the construct in $defs is sufficient today. That panic is a separate issue and not addressed here.

ndreno added 24 commits March 19, 2026 10:00
Cherry-picked from upstream PR oxidecomputer#991. When a schema defines an untagged
enum with both integer and number types, serde tries variants in order.
Having Number (f64) before Integer (i64) made the Integer variant
unreachable since any integer is also a valid f64.

Introduces ReorderedInstanceType to ensure Integer is always matched
before Number during deserialization.
… defaults (oxidecomputer#918)

When a JSON schema property is both required AND has an explicit default
value, typify now recognizes the default and uses it for the Default impl
and builder pre-population. Previously, required fields unconditionally
mapped to StructPropertyState::Required, ignoring any schema default.

This also means required fields with intrinsic defaults (Vec, HashMap,
Option) now get serde(default) attributes, making deserialization more
lenient while still generating correct Default impls.
…xidecomputer#986)

When a JSON schema defines a named integer type with min/max bounds
narrower than the native type's range (e.g., uint8 with max: 63),
typify now generates TryFrom<T> with range validation instead of
From<T>. This prevents constructing invalid values at the type level.

Also generates custom Deserialize that validates through TryFrom,
FromStr that parses and validates, and keeps the inner field private
to enforce the constraint invariant.

Adds TypeEntryNewtypeConstraints::Range variant and comprehensive
unit tests for extract_integer_range helper.
…ecomputer#843)

When schemars stores minimum/maximum as f64, whole numbers like 105
were rendered as "105.0" in generated doc comments. Added
normalize_json_numbers() to convert whole-number floats to integers
in the JSON schema output used for doc comments.
Tracks upstream PRs to integrate, issues to fix, and a phased
approach from quick wins through architectural changes.
…xidecomputer#948)

When enum variants come from symbolic values like =, >, <, >=, ≠, etc.,
sanitization previously stripped all non-identifier characters, producing
empty or colliding names (all becoming "X") and causing a panic.

Added expand_symbols() to convert common operators and symbols into
English names (e.g. "=" → "eq", ">=" → "gt_eq", "≠" → "neq") as an
intermediate step before falling back to X-replacement. Unicode and
ASCII equivalents get distinct names to avoid collisions.

Added test schema with the ComparatorString type from the original
issue (Factorio schema) and comprehensive unit tests for expand_symbols.
…omputer#414)

The anyOf fallback previously generated structs with #[serde(flatten)]
Option<T> fields, which panicked at runtime on non-object types (oxidecomputer#895,
oxidecomputer#710) and hit unreachable!() in defaults.rs (oxidecomputer#790).

Replace with delegation to convert_one_of, which produces proper
untagged enums that handle all types correctly. This is a one-line
core change that fixes multiple issues at once.

- Remove dead flattened_union_struct function
- Add defensive handling in defaults.rs for non-struct flattened types
- Add test schema exercising anyOf with strings, mixed primitives,
  overlapping objects, and ref+inline combinations
- Gate mutual exclusivity functions as #[cfg(test)] (no longer used
  in production code path)
Additional test cases for the anyOf overhaul:
- null + overlapping types (3-way anyOf)
- mixed objects and primitives (flatten would have panicked)
- single subschema (degenerate case)
- many primitives (string, integer, number, boolean, null)

All generate proper untagged enums with no #[serde(flatten)] on
non-object types.
…er#480, oxidecomputer#954, oxidecomputer#489, oxidecomputer#435)

if/then/else transformation (oxidecomputer#480):
Transform if/then/else into oneOf in merge.rs before downstream
processing. The then branch becomes allOf(if, then), the else
branch becomes allOf(not(if), else). This produces proper untagged
enums in Rust.

not schema fixes (oxidecomputer#954, oxidecomputer#489, oxidecomputer#435):
- Replace todo!() panic in convert_not with graceful fallback to
  serde_json::Value for unhandled not schemas
- Replace panic!() for mixed-type not enum values with fallback
- Handle null/array/object types in not enum value inference
- Replace todo!() panics in merge.rs not handling with best-effort
  behavior (return schema unchanged instead of panicking)

Test coverage:
- if/then/else with basic discrimination and if/then without else
- not: { type: "object" } and not: { type: "string" } (issue oxidecomputer#954)
- Array items with not type constraint
18 new runtime tests verifying actual serde behavior, not just codegen:

- PR oxidecomputer#991: IntOrStr deserializes 42 as Integer (not swallowed by Number)
- PR oxidecomputer#918: RequiredWithDefaults::default() has correct values;
  deserializing {} uses schema defaults
- PR oxidecomputer#986: Dscp TryFrom rejects 64, accepts 0-63; serde rejects
  out-of-range JSON values
- PR oxidecomputer#948: All 9 comparator symbols round-trip through serde correctly
- PR oxidecomputer#414: anyOf with object+string+integer deserializes without panic
- PR oxidecomputer#954: not:{type:"object"} accepts primitives; array items work

Also fixes pre-existing CustomMap test compilation (missing Default
and is_empty) and adds serde derive feature to typify-test.
…xidecomputer#882, oxidecomputer#975)

oxidecomputer#882 — Replace assertion panics in merge.rs for patternProperties and
propertyNames with best-effort pass-through behavior.

oxidecomputer#975 — When min and max are both specified, find the smallest integer
type that contains the range (iterate forward instead of reverse).
Integer [1..32] now generates NonZeroU8 instead of NonZeroU64.

Fix range constraint codegen for NonZero types: use .get() to extract
the inner value for comparison instead of invalid `as` cast.

Runtime tests: SmallRange (NonZeroU8) TryFrom and serde validation.
…idecomputer#579, oxidecomputer#201, oxidecomputer#828, oxidecomputer#933)

Two new preprocessing layers that transform schemas before the existing
pipeline processes them, requiring zero changes to the core conversion:

1. Draft Normalizer (normalize.rs):
   Detects 2020-12 schemas and transforms keywords to draft-07:
   - $defs → definitions
   - prefixItems → items (array form), items → additionalItems
   - $ref alongside keywords → allOf wrapping
   - dependentRequired/dependentSchemas → dependencies
   - unevaluatedProperties → additionalProperties (best-effort)
   - $dynamicRef → $ref (best-effort)

2. Reference Bundler (bundler.rs):
   - Resolves non-$defs internal refs (#/properties/foo → #/definitions/Foo)
   - Bundles external file refs (other.json#/definitions/X) into local defs

New public API:
- TypeSpace::add_schema_from_value() — auto-normalizes any draft
- TypeSpace::add_schema_with_externals() — handles multi-file schemas

Safety: ref_key() and convert_reference() no longer panic on
unrecognized refs — return proper errors instead.

Tests: 10 normalizer unit tests, 8 bundler unit tests, 3 runtime
integration tests (2020-12 schema, external refs, serde round-trip).
Add proper test schema fixtures for both modern JSON Schema drafts:
- draft-2020-12.json: $defs, dependentRequired, unevaluatedProperties,
  $ref alongside other keywords
- draft-2019-09.json: $defs, dependentSchemas, unevaluatedProperties

Update test harness to use add_schema_from_value() instead of
add_root_schema() so the normalizer runs on all test schemas.
This ensures 2020-12/2019-09 keywords are properly transformed
to draft-07 before the pipeline processes them.

Coverage now spans: draft-04, draft-07, 2019-09, 2020-12.
README.md:
- Update AnyOf section: now generates untagged enums (not broken
  flattened structs)
- Add If/Then/Else section
- Add Not schema section
- Add JSON Schema 2020-12 Support section with keyword mapping table
- Add External References section with API examples
- Remove "Bounded numbers" from WIP (now fixed)

CHANGELOG.adoc:
- Add comprehensive unreleased section covering all 19 issues fixed
- Document all new features: 2020-12 support, external refs,
  if/then/else, new APIs
The CLI now:
- Auto-discovers external schema files from the input file's directory
  (or from --schema-dir if specified) and bundles them automatically
- Uses add_schema_from_value() for automatic 2020-12 normalization
- Supports non-$defs internal refs without panicking

New flag:
  --schema-dir <DIR>  Directory to search for external schemas
                      referenced via $ref (defaults to input file dir)

Usage:
  cargo typify schema.json                    # auto-discovers externals
  cargo typify schema.json --schema-dir ./schemas  # explicit directory
When normalizing prefixItems, inject minItems/maxItems matching the
tuple length for closed tuples (items: false). This is required by
typify's tuple detection which checks minItems == maxItems.

Now `prefixItems: [number, number], items: false` correctly generates
a Rust tuple type `(f64, f64)` instead of erroring.

Added runtime test deserializing [42.5, -73.0] as Coordinates tuple.
…decomputer#954)

When a schema object had both `type: [a, b, c, ...]` and one of
`oneOf`/`anyOf`/`allOf`/`not` on the same level, convert_schema_object
matched an earlier arm that wildcarded the `instance_type` field
(flagged by a pre-existing TODO: "we probably shouldn't ignore this").
The subschema branches were then converted in isolation, silently
discarding the outer type union.

For the schema in issue oxidecomputer#954:

    {
      "type": ["string", "number", "boolean", "array"],
      "oneOf": [
        { "items": { "type": "string" } },
        { "items": { "type": "number" } },
        { "items": { "type": "boolean" } }
      ]
    }

typify produced an enum with only three array variants, losing the
string/number/boolean alternatives entirely.

The fix tightens the match pattern to `None | Some(Single(_))`, letting
`Some(Vec(_))` fall through to the existing merge arm which already
folds the outer schema into each subschema branch via
`try_merge_with_subschemas`. Single-type + subschema behaviour is
preserved, so tolerant handling of schemas whose outer type conflicts
with a branch (e.g. the rust-collisions fixture) is unchanged.

Adds a fixture with eight cases covering the affected code path:
  - `type:[...]` combined with oneOf, anyOf, allOf, and not
  - explicit `type: array` in every oneOf branch (primitives pruned)
  - partially and fully unsatisfiable oneOf
  - simultaneous oneOf + allOf
  - singleton-type regression guard (rust-collisions pattern)

No existing workspace fixture exercised `type: [...]` alongside
subschemas on the same object, so this fix had zero pre-existing
regression coverage.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Panic: Not yet implemented: unhandled not schema Object

1 participant