fix: make_array accepts inputs differing only in nested-field nullability #22658

Open
schenksj wants to merge 1 commit into apache:main from schenksj:fix-22366-make-array-nullability
Conversation

@schenksj
Which issue does this PR close?

Closes #22366.

Rationale for this change

make_array panics with the error "Arrays with inconsistent types passed to MutableArrayData" when combining values whose types are identical except for the nullability of a nested field. One example is a struct field that is non-nullable when constructed from a literal but nullable when read from a column.

This blocks native execution of plans that construct structs from multiple sources (Delta Lake CDC writes, UNION within arrays), forcing workarounds that sacrifice performance.

Spark, Postgres, and Arrow's own concat all handle this by widening nullable flags rather than enforcing strict type equality. This PR brings make_array in line with that precedent: inputs that differ only in nested-field nullability are accepted, and the result widens nullable flags to true at every nesting level.

What changes are included in this PR?

  • Add merge_nullability, which OR-s nullable flags at every nesting level (struct fields, list elements, ...) using Arrow's Field::try_merge, and returns None (preserving prior behavior) for structurally-incompatible inputs.
  • array_array (the runtime shared by both make_array and Spark's array) computes a merged element type that is a supertype of all arguments and cheaply casts each argument up to it before building the list, so MutableArrayData no longer sees inconsistent types.
  • coerce_types_inner widens the per-argument struct types produced by try_type_union_resolution_with_struct to a single common type, so the declared return type matches the value produced at runtime.
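
The widening rule described in the list above can be sketched on a toy type model. This is only an illustration under stated assumptions: `Ty`, `Fld`, `fld`, `merge_field`, and `common_element_type` are hypothetical names invented here, not Arrow or DataFusion types, and the PR's actual `merge_nullability` delegates to Arrow's `Field::try_merge` rather than hand-rolling the recursion. The cast of each argument up to the merged type is not modeled.

```rust
// Toy stand-ins for a data type and a field. These are NOT Arrow's
// `DataType` / `Field`; they exist only to illustrate the widening rule.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int64,
    Utf8,
    Struct(Vec<Fld>),
    List(Box<Fld>),
}

#[derive(Debug, Clone, PartialEq)]
struct Fld {
    name: String,
    ty: Ty,
    nullable: bool,
}

// Hypothetical convenience constructor.
fn fld(name: &str, ty: Ty, nullable: bool) -> Fld {
    Fld { name: name.to_string(), ty, nullable }
}

/// Merge two types that may differ only in nested nullability.
/// Returns `None` when the shapes are structurally incompatible.
fn merge_nullability(a: &Ty, b: &Ty) -> Option<Ty> {
    match (a, b) {
        (Ty::Int64, Ty::Int64) => Some(Ty::Int64),
        (Ty::Utf8, Ty::Utf8) => Some(Ty::Utf8),
        (Ty::List(x), Ty::List(y)) => Some(Ty::List(Box::new(merge_field(x, y)?))),
        (Ty::Struct(xs), Ty::Struct(ys)) if xs.len() == ys.len() => {
            // Merge fields pairwise; any incompatible pair makes the whole merge fail.
            let merged: Option<Vec<Fld>> =
                xs.iter().zip(ys).map(|(x, y)| merge_field(x, y)).collect();
            merged.map(Ty::Struct)
        }
        _ => None,
    }
}

/// OR the nullable flags of two same-named fields and recurse into their types.
/// Treating a name mismatch as incompatible is an assumption of this sketch.
fn merge_field(a: &Fld, b: &Fld) -> Option<Fld> {
    if a.name != b.name {
        return None;
    }
    Some(Fld {
        name: a.name.clone(),
        ty: merge_nullability(&a.ty, &b.ty)?,
        nullable: a.nullable || b.nullable,
    })
}

/// Fold over all argument types to obtain one common element type.
/// Returns `None` for an empty slice or for incompatible arguments.
fn common_element_type(types: &[Ty]) -> Option<Ty> {
    let (first, rest) = types.split_first()?;
    rest.iter()
        .try_fold(first.clone(), |acc, t| merge_nullability(&acc, t))
}
```

In this model, merging a struct whose `id` field is non-nullable with one whose `id` field is nullable yields the nullable variant, while merging `Int64` with `Utf8` yields `None`, mirroring the "preserve prior behavior for incompatible inputs" rule stated above.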

Are these changes tested?

Yes:

  • A new unit test (make_array_relaxes_nested_field_nullability) reproduces the original panic at the make_array_inner boundary and asserts it now succeeds.
  • New sqllogictest coverage in array/make_array.slt for make_array over flat and nested structs.

Note: the SQL planner already normalizes nested-field nullability for struct construction from SQL literals/columns, so the panic is reached from sources with declared non-null nested schemas (e.g. Delta Lake CDC); the unit test exercises that path directly.

Are there any user-facing changes?

make_array (and Spark array) now succeed on inputs that previously panicked. There are no breaking API changes; the result type simply widens nested nullable flags where inputs disagree.

…lity

`make_array` panicked with "Arrays with inconsistent types passed to
MutableArrayData" when combining values whose types are identical except
for the nullability of a nested field (e.g. a struct field that is
non-nullable when built from a literal but nullable when read from a
column). This blocks native execution of plans that construct structs
from multiple sources (Delta Lake CDC writes, UNION within arrays).

Following the precedent of Spark, Postgres, and Arrow's own `concat`,
relax the strict element-type equality by widening nullable flags to
`true` at every nesting level:

- `array_array` now computes a merged element type that is a supertype
  of all arguments (OR-ing nullable flags via `Field::try_merge`) and
  cheaply casts each argument up to it before building the list, so
  `MutableArrayData` no longer sees inconsistent types.
- `coerce_types_inner` widens the per-argument struct types produced by
  `try_type_union_resolution_with_struct` to a single common type so the
  declared return type matches the value produced at runtime.

Adds a unit test reproducing the original panic and sqllogictest
coverage for `make_array` over (nested) structs.

Closes apache#22366

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) labels on May 30, 2026
Labels

functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

make_array: relax element-type equality to accept inputs differing only in nested-field nullability