feat: unified chunk grid with rectilinear chunk/shard support #3802
Open
maxrjones wants to merge 95 commits into zarr-developers:main
Summary
This PR contains an alternative implementation of the rectilinear chunk grid extension, building on the work in #3534 (RLE helpers, validation logic, and test cases were directly adopted). While the core feature of variable-sized chunks is the same, the internal architecture differs in ways that impact extensibility, performance, and release safety.
I appreciate the patience of those who contributed to #3534, and everyone who's been waiting on this feature. I know it's frustrating to see a new PR after #3534 was so close. That PR provided fundamental components, and I hope people will see the value here. I really believe it is worth the churn for the following reasons:
Key differences from #3534
- `DimensionGrid` protocol (`FixedDimension`, `VaryingDimension`). Adding a new dimension type (e.g. `TiledDimension` for periodic patterns like days-per-month) requires implementing that protocol, with no changes to indexing, codecs, or the `ChunkGrid` class. A prototype was built to verify this.
- `VaryingDimension` uses precomputed prefix sums for O(log n) lookups via binary search. See https://github.com/maxrjones/zarr-chunk-grid-tests for a performance comparison.
- `zarr.config.set({'array.rectilinear_chunks': True})` (or `ZARR_ARRAY__RECTILINEAR_CHUNKS=True`), disabled by default. This gives downstream libraries time to adapt, and gives us room to refine the API before it is finalized.
Design document:
`docs/design/chunk-grid.md` covers the full design, rationale, and a suggested PR sequence for splitting this into reviewable increments, if needed.
Downstream POCs (all passing):
TODO:
- `docs/user-guide/*.md`
- `changes/`
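Illustrative sketch for the prefix-sum lookup mentioned under "Key differences". This is not the code in this PR: the class names `DimensionGrid`, `FixedDimension`, and `VaryingDimension` are borrowed from the description above, but the method names (`nchunks`, `chunk_size`, `index_to_chunk`) and constructor signatures are assumptions made for the example. It shows how a per-dimension protocol lets a fixed and a varying dimension share one interface, and how precomputed offsets plus `bisect` give an O(log n) index-to-chunk lookup.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Protocol, Sequence


class DimensionGrid(Protocol):
    """Hypothetical per-dimension interface; method names are assumptions."""

    @property
    def nchunks(self) -> int: ...
    def chunk_size(self, chunk_index: int) -> int: ...
    def index_to_chunk(self, index: int) -> int: ...


class FixedDimension:
    """All chunks share one size; the last chunk may be clipped by the extent."""

    def __init__(self, size: int, extent: int) -> None:
        if size <= 0 or extent <= 0:
            raise ValueError("size and extent must be positive")
        self.size = size
        self.extent = extent

    @property
    def nchunks(self) -> int:
        return -(-self.extent // self.size)  # ceiling division

    def chunk_size(self, chunk_index: int) -> int:
        start = chunk_index * self.size
        return min(self.size, self.extent - start)

    def index_to_chunk(self, index: int) -> int:
        if not 0 <= index < self.extent:
            raise IndexError(index)
        return index // self.size  # O(1)


class VaryingDimension:
    """Each chunk has its own size; lookups use precomputed prefix sums."""

    def __init__(self, sizes: Sequence[int]) -> None:
        if not sizes or any(s <= 0 for s in sizes):
            raise ValueError("sizes must be non-empty and positive")
        self.sizes = tuple(sizes)
        # offsets[i] is the start of chunk i; offsets[-1] is the total extent.
        self.offsets = (0, *accumulate(self.sizes))

    @property
    def nchunks(self) -> int:
        return len(self.sizes)

    def chunk_size(self, chunk_index: int) -> int:
        return self.sizes[chunk_index]

    def index_to_chunk(self, index: int) -> int:
        if not 0 <= index < self.offsets[-1]:
            raise IndexError(index)
        # Binary search over the prefix sums: O(log n) in the number of chunks.
        return bisect_right(self.offsets, index) - 1


def chunk_coords(dims: Sequence[DimensionGrid], index: Sequence[int]) -> tuple:
    """Map an array index to chunk coordinates, one dimension at a time."""
    return tuple(d.index_to_chunk(i) for d, i in zip(dims, index))
```

With these assumed signatures, `chunk_coords([VaryingDimension([31, 28, 31]), FixedDimension(10, 25)], (59, 24))` resolves each axis independently, so a caller does not need to know which kind of dimension it is dealing with.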