Handle missing repetition/definition levels #717

Closed
daniellerozenblit wants to merge 1 commit into facebook:dev from daniellerozenblit:export-D103245483


@daniellerozenblit
Contributor

Summary:
Per the Parquet spec, the definition and repetition level blocks are omitted from data pages when the column has no `OPTIONAL`/`REPEATED` ancestor. We did not previously handle this case, so parsing Parquet files with required columns failed.
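For reviewers less familiar with the level encoding, here is a minimal sketch of how the rule above can be derived from a schema path. It is illustrative only and is not the code in this diff. It assumes the standard Dremel-style counting (each non-`REQUIRED` field on the path adds one to the maximum definition level, each `REPEATED` field adds one to the maximum repetition level) and assumes the leaf itself is counted along with its ancestors. `LeafLevels` and `leaf_levels` are hypothetical names.

```python
from typing import NamedTuple, Sequence


class LeafLevels(NamedTuple):
    """Maximum levels for one leaf column (hypothetical helper type)."""

    max_definition_level: int
    max_repetition_level: int

    @property
    def has_definition_levels(self) -> bool:
        # A levels block only carries information when the maximum is above 0.
        return self.max_definition_level > 0

    @property
    def has_repetition_levels(self) -> bool:
        return self.max_repetition_level > 0


def leaf_levels(path_repetitions: Sequence[str]) -> LeafLevels:
    """Compute max levels from the repetition types on the path to a leaf.

    `path_repetitions` lists "REQUIRED" / "OPTIONAL" / "REPEATED" for every
    field from just below the schema root down to the leaf (assumption: the
    leaf's own repetition type is included).
    """
    max_def = sum(1 for r in path_repetitions if r in ("OPTIONAL", "REPEATED"))
    max_rep = sum(1 for r in path_repetitions if r == "REPEATED")
    return LeafLevels(max_def, max_rep)
```

Under these assumptions, a top-level `REQUIRED` column ends up with both flags false, which is the case that previously failed to parse.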

This diff adds tracking of `hasDefinitionLevels` / `hasRepetitionLevels` per leaf in the schema metadata, and consumes the repetition and definition level blocks independently in the lexer, each only when present.
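The conditional consumption can be sketched roughly as follows. This is a simplified illustration, not the lexer in this diff. It assumes a V1-style page body in which each level block that is present is prefixed by a 4-byte little-endian length, with repetition levels first, then definition levels, then values. `PageSections`, `_read_block`, and `split_page` are hypothetical names.

```python
import struct
from typing import NamedTuple, Optional, Tuple


class PageSections(NamedTuple):
    """Raw byte ranges of one data page body (hypothetical helper type)."""

    repetition_levels: Optional[bytes]
    definition_levels: Optional[bytes]
    values: bytes


def _read_block(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Read one length-prefixed block and return it with the new offset."""
    if offset + 4 > len(buf):
        raise ValueError("truncated level block length")
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("level block runs past end of page")
    return buf[start:end], end


def split_page(
    buf: bytes, has_repetition_levels: bool, has_definition_levels: bool
) -> PageSections:
    """Consume each level block only if the schema says it is present."""
    offset = 0
    rep: Optional[bytes] = None
    dfn: Optional[bytes] = None
    if has_repetition_levels:
        rep, offset = _read_block(buf, offset)
    if has_definition_levels:
        dfn, offset = _read_block(buf, offset)
    # Whatever remains is the encoded values section.
    return PageSections(rep, dfn, buf[offset:])
```

The two flags are checked independently, so a flat `OPTIONAL` column (definition levels only) and a fully `REQUIRED` column (neither block) both go through the same path.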

Differential Revision: D103245483

@meta-cla meta-cla Bot added the cla signed label May 1, 2026

meta-codesync Bot commented May 1, 2026

@daniellerozenblit has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103245483.


meta-codesync Bot commented May 4, 2026

This pull request has been merged in 1f1a45f.
