Integrate RLE for var-length string dimensions #2938
This pull request has been linked to Shortcut Story #13310: Do not store chunk metadata inline with chunk.
lgtm, great work!
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release-2.8 release-2.8
# Navigate to the new working tree
cd .worktrees/backport-release-2.8
# Create a new branch
git switch --create backport-2938-to-release-2.8
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick --mainline 1 023cb536012048b2a7f6a3a14066862e55a9aaef
# Push it to GitHub
git push --set-upstream origin backport-2938-to-release-2.8
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release-2.8
Then, create a pull request where the
* Integrate RLE for var-length string dimensions
* Fix windows warning
* Fix dates in headers
* Various correctness fixes
* Fix backwards compatibility issues
* Add basic test for string attributes and fix filter UTs
* Fix merge issues

Co-authored-by: Ypatia Tsavliri <ypatia@tiledb.com>
This PR integrates the recently introduced RLE algorithms for string compression/decompression into the TileDB filter pipeline, using the also recently introduced tile-level filtering/unfiltering.
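For readers unfamiliar with run-length encoding on strings, the general idea can be sketched as follows. This is only a minimal illustration in Python, not the implementation added in this PR: the function names `rle_encode` and `rle_decode` are hypothetical, and the in-memory list of `(run_length, value)` pairs says nothing about the byte layout TileDB actually writes to disk.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(values: List[str]) -> List[Tuple[int, str]]:
    """Collapse consecutive equal strings into (run_length, value) pairs."""
    # groupby yields one group per run of adjacent equal items.
    return [(sum(1 for _ in group), value) for value, group in groupby(values)]


def rle_decode(runs: List[Tuple[int, str]]) -> List[str]:
    """Expand (run_length, value) pairs back into the original sequence."""
    out: List[str] = []
    for count, value in runs:
        out.extend([value] * count)
    return out


# Sorted string coordinates tend to contain long runs of equal values.
coords = ["aa", "aa", "aa", "b", "b", "ccc"]
runs = rle_encode(coords)
print(runs)  # [(3, 'aa'), (2, 'b'), (1, 'ccc')]
assert rle_decode(runs) == coords
```

The encoding only pays off when adjacent values repeat, which is typically the case for sorted dimension coordinates.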
A few improvements will come in following PRs:
Limitations: When RLE is used for string dimensions, we currently require that it is the only filter applied.
In the future this limitation could be relaxed, but some restriction will remain to ensure that RLE is applied to the string representation of the data, which is where run-length encoding of strings makes sense.
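The rule above can be expressed as a small validation check. The sketch below is a hypothetical illustration, not TileDB's API: the function `validate_string_dim_filters`, the plain-string filter names, and the error message are all assumptions made for the example.

```python
from typing import Sequence

RLE = "RLE"  # hypothetical filter name, used only for this sketch


def validate_string_dim_filters(filters: Sequence[str]) -> None:
    """Reject pipelines that combine RLE with other filters on a string dimension."""
    if RLE in filters and len(filters) != 1:
        raise ValueError(
            "RLE must be the only filter on a var-length string dimension, "
            f"got pipeline: {list(filters)}"
        )


validate_string_dim_filters(["RLE"])           # accepted: RLE alone
validate_string_dim_filters(["GZIP", "ZSTD"])  # accepted: no RLE involved

try:
    validate_string_dim_filters(["GZIP", "RLE"])
except ValueError as err:
    print(err)  # rejected: RLE would not see the original strings
```

The point of the restriction is that a filter placed before RLE changes the bytes, so RLE would no longer operate on the strings themselves.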
TYPE: FEATURE
DESC: RLE compression support for var-length string dimensions