feat: enforce physical column ordering in Parquet files#6287
Open
g-talbot wants to merge 1 commit intogtt/docs-claude-mdfrom
rishabh
approved these changes
Apr 10, 2026
…treaming merge (#6281)

* feat: enforce physical column ordering in Parquet files

  Sort schema columns are written first (in their configured sort order), followed by all remaining data columns in alphabetical order. This physical layout enables a two-GET streaming merge during compaction: the footer GET provides the schema and offsets, then a single streaming GET from the start of the row group delivers sort columns first, allowing the compactor to compute the global merge order before data columns arrive.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: verify input column order is actually scrambled

  The sanity check only asserted presence, not ordering. Now it verifies that host appears before service in the input (scrambled), which is the opposite of the sort-schema order (service before host).

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: rustfmt test code

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: collapse nested if to satisfy clippy::collapsible_if

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
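The ordering rule in the commit message (sort schema columns first, in configured order, then the remaining columns alphabetically) can be illustrated with a small standalone sketch. This is not the PR's `reorder_columns` implementation: the function name `physical_column_order`, its signature, the plain byte-wise string sort, and the choice to skip sort columns that are absent from the input are all assumptions made for illustration.

```rust
/// Hypothetical helper (not the PR's code): returns column names in the
/// physical order described in the commit message.
/// `columns` is the input schema order; `sort_columns` is the configured
/// sort-schema order.
fn physical_column_order(columns: &[&str], sort_columns: &[&str]) -> Vec<String> {
    // Sort-schema columns come first, in their configured order.
    // Assumption: a sort column missing from the input is skipped.
    let mut ordered: Vec<String> = sort_columns
        .iter()
        .filter(|name| columns.contains(*name))
        .map(|name| name.to_string())
        .collect();

    // All remaining data columns follow in alphabetical order.
    // Assumption: "alphabetical" means Rust's default byte-wise string order.
    let mut rest: Vec<String> = columns
        .iter()
        .filter(|name| !sort_columns.contains(*name))
        .map(|name| name.to_string())
        .collect();
    rest.sort();

    ordered.extend(rest);
    ordered
}
```

With the scrambled input `["host", "value", "service", "env"]` and the sort schema `["service", "host"]`, this sketch yields `service, host, env, value`: `service` precedes `host` as in the sort schema, even though `host` came first in the input.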
Summary
Test plan
* reorder_columns unit test verifies sort columns first, then alphabetical
* quickwit-parquet-engine tests pass

🤖 Generated with Claude Code