feat(table): variant shredded reader #986

@laskoviymishka

Parent: #589

#929 / PR #932 cover the non-shredded variant path (struct<metadata: binary, value: binary>). The shredded path is the next piece. A shredded Parquet variant column appears as struct<metadata: binary, value: binary, typed_value: STRUCT>, where typed_value mirrors the shredded subtree. The reader reconstructs the full variant by preferring typed_value for each field and falling back to the residual value slice when typed_value is null.

The Parquet codec lives in arrow-go's parquet/variant (already in go.mod); the spec is Parquet Variant shredding. Add a new file, table/internal/variant_shredded.go, with a pure function ReassembleShreddedVariant(metadata, value, typedValue) (variant.Value, error) that walks the shredded struct per the spec. Call it from table/internal/parquet_files.go so that a shredded variant column reads identically to a non-shredded one; shredding should be invisible to the scanner.
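A minimal sketch of the per-field rule, to make the intended control flow concrete. It does not use arrow-go: shreddedField, Residual, Typed, and reassemble are hypothetical stand-ins, the residual value is modeled as already decoded, and the output is a plain Go map rather than a variant.Value. The error cases reflect my reading of the shredding spec (a non-object typed_value may not coexist with a residual value, and a residual object may not repeat a shredded field name) and should be checked against the spec before being relied on.

```go
package main

import (
	"errors"
	"fmt"
)

// shreddedField is a hypothetical stand-in for one (value, typed_value) pair.
// Residual models the decoded `value` column and Typed models `typed_value`.
// A nil entry means the Parquet column is null at this row. A real reader
// would also need a distinct representation for an encoded variant null.
type shreddedField struct {
	Residual any
	Typed    any // a scalar, or map[string]*shreddedField for a shredded object
}

// reassemble returns the reconstructed value and whether the field is present.
func reassemble(f *shreddedField) (any, bool, error) {
	// Both columns null: the field is missing from the enclosing object.
	if f == nil || (f.Typed == nil && f.Residual == nil) {
		return nil, false, nil
	}
	// Nothing was shredded for this row: fall back to the residual value.
	if f.Typed == nil {
		return f.Residual, true, nil
	}
	fields, isObject := f.Typed.(map[string]*shreddedField)
	if !isObject {
		// Assumption: a shredded scalar must not also carry a residual value.
		if f.Residual != nil {
			return nil, false, errors.New("value and non-object typed_value are both set")
		}
		return f.Typed, true, nil
	}
	// Partially shredded object: start from the residual fields, if any.
	out := make(map[string]any, len(fields))
	if f.Residual != nil {
		residual, ok := f.Residual.(map[string]any)
		if !ok {
			return nil, false, errors.New("residual value of a shredded object must be an object")
		}
		for name, v := range residual {
			out[name] = v
		}
	}
	// Then add every shredded field that is present at this row.
	for name, child := range fields {
		if _, dup := out[name]; dup {
			return nil, false, fmt.Errorf("field %q appears in both value and typed_value", name)
		}
		v, present, err := reassemble(child)
		if err != nil {
			return nil, false, fmt.Errorf("field %q: %w", name, err)
		}
		if present {
			out[name] = v
		}
	}
	return out, true, nil
}

func main() {
	col := &shreddedField{
		Residual: map[string]any{"extra": "kept in value"},
		Typed: map[string]*shreddedField{
			"id":   {Typed: int64(7)},      // fully shredded
			"name": {Residual: "fallback"}, // typed_value null, read from value
			"gone": {},                     // both null, so the field is absent
		},
	}
	v, _, err := reassemble(col)
	fmt.Println(v, err)
}
```

The real ReassembleShreddedVariant would additionally decode the residual bytes against metadata and emit a variant.Value through arrow-go's parquet/variant package; those calls are left out here rather than guessed at.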

Cross-client coverage: commit a Java-produced shredded variant fixture under table/internal/testdata/ and add a golden test asserting that iceberg-go reads the same variant.Value. Java writer PR in apache/iceberg: apache/iceberg#11500 and follow-ups.
