Wide schema performance: eliminate quadratic column-count scaling #9722

@HippoBaro

Describe the bug

Several independent code paths in the Parquet reader scale super-linearly with column count, in some cases quadratically, which leads to catastrophic performance on wide schemas. This is an umbrella issue that tracks the general problem; individual PRs will reference it for context.
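
To make the failure mode concrete, here is a minimal, standard-library-only sketch of one generic pattern that produces quadratic scaling: resolving each projected column by scanning the whole schema by name. It is an illustration only and is not taken from the Parquet reader. The function names `resolve_quadratic` and `resolve_linear` are hypothetical, and the schema is modeled as a plain slice of column names.

```rust
use std::collections::HashMap;

/// Hypothetical illustration: for every projected column, scan the whole
/// schema by name. With N schema columns and N projected columns this
/// performs O(N^2) string comparisons.
fn resolve_quadratic(schema: &[String], projection: &[String]) -> Vec<Option<usize>> {
    projection
        .iter()
        .map(|name| schema.iter().position(|col| col == name))
        .collect()
}

/// Hypothetical illustration: build a name -> index map once, then resolve
/// each projected column with an expected O(1) lookup, for O(N) work overall.
fn resolve_linear(schema: &[String], projection: &[String]) -> Vec<Option<usize>> {
    let mut index: HashMap<&str, usize> = HashMap::with_capacity(schema.len());
    for (i, name) in schema.iter().enumerate() {
        // Keep the first occurrence so results match `position` above.
        index.entry(name.as_str()).or_insert(i);
    }
    projection
        .iter()
        .map(|name| index.get(name.as_str()).copied())
        .collect()
}
```

Both functions return the same indices; only the cost differs. At a few dozen columns the difference is negligible, but at tens of thousands of columns the repeated scan dominates, which is the kind of behavior this issue is meant to track.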

To Reproduce
N/A

Expected behavior

Parquet reader/writer operations and Arrow structures (RecordBatch, etc.) should scale linearly (ideally sub-linearly) with column count wherever practical.

Additional context
