RecordBatch::normalize() does not propagate top level null bitmap into the results #9732

@sqd

Description

Currently, RecordBatch::normalize() does not propagate the top-level struct's null bitmap into the null bitmaps of the resulting normalized arrays. As a result, a child element can appear non-null after normalization, losing the fact that the parent struct is null at that index.

To Reproduce
See the added test for RecordBatch::normalize() in the associated PR.

Expected behavior
RecordBatch::normalize() propagates the top-level struct's null bitmap into the null bitmaps of the resulting normalized arrays.

Additional context
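A non-null child value in a slot where the parent struct is null is perfectly legal according to the Arrow columnar format (https://github.com/apache/arrow/blob/main/docs/source/format/Columnar.rst#struct-validity):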

A struct array has its own validity bitmap that is independent of its child arrays' validity bitmaps. The validity bitmap for the struct array might indicate a null when one or more of its child arrays has a non-null value in its corresponding slot; or conversely, a child array might indicate a null in its validity bitmap while the struct array's validity bitmap shows a non-null value.
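
To make the expected behavior concrete, here is a minimal sketch of the propagation rule, assuming validity is modeled as plain `bool` slices (`true` = valid) rather than the arrow crate's own buffer types. `propagate_validity` is a hypothetical helper name used only for illustration, not an existing API, and this is not the code from the associated PR. A normalized child slot should be valid only if both the parent struct and the child are valid at that index:

```rust
/// Combine a parent struct's validity with one child's validity.
///
/// Simplified model (an assumption for illustration): `None` means
/// "no bitmap present", i.e. every slot is valid; `true` means non-null.
fn propagate_validity(parent: Option<&[bool]>, child: Option<&[bool]>) -> Option<Vec<bool>> {
    match (parent, child) {
        // Neither side has nulls, so the result needs no bitmap either.
        (None, None) => None,
        // Only one side has a bitmap: it fully determines the result.
        (Some(p), None) => Some(p.to_vec()),
        (None, Some(c)) => Some(c.to_vec()),
        // Both have bitmaps: a slot is valid only if parent AND child are valid.
        (Some(p), Some(c)) => {
            assert_eq!(p.len(), c.len(), "parent and child must have the same length");
            Some(p.iter().zip(c).map(|(&pv, &cv)| pv && cv).collect())
        }
    }
}
```

With a parent validity of `[true, false, true]` and a child validity of `[true, true, false]`, the sketch yields `[true, false, false]`: index 1 becomes null in the normalized child because the parent struct is null there, even though the child itself holds a non-null value.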
