Integrate RLE for var-length string dimensions #2938

Merged
merged 7 commits into from Mar 15, 2022

Conversation

ypatia
Contributor

@ypatia ypatia commented Mar 4, 2022

This PR integrates the recently introduced RLE algorithms for string compression/decompression into the TileDB filter pipeline, building on the tile-level filtering/unfiltering that was also introduced recently.

A few improvements will come in follow-up PRs:

  • do not hardcode the data sizes of the run length and string length fields
  • do not store offsets for RLE-encoded strings
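
To illustrate the two points above, here is a minimal sketch of run-length encoding for var-length strings. It is not the TileDB implementation: the function names, the record layout (run length, string length, string bytes) and the field widths are assumptions made for illustration, since the PR only states that the sizes are currently hardcoded.

```python
import struct
from itertools import groupby

# Assumed fixed field widths (hypothetical; chosen only for this sketch).
RUN_LEN_FMT = "<I"  # 4-byte little-endian run length
STR_LEN_FMT = "<H"  # 2-byte little-endian string length (strings up to 65535 bytes)


def rle_encode_strings(strings):
    """Encode a sequence of byte strings as (run length, string length, bytes) records."""
    out = bytearray()
    for value, group in groupby(strings):
        run = sum(1 for _ in group)  # number of consecutive repeats of `value`
        out += struct.pack(RUN_LEN_FMT, run)
        out += struct.pack(STR_LEN_FMT, len(value))
        out += value
    return bytes(out)


def rle_decode_strings(encoded):
    """Decode records back into a contiguous data buffer plus per-string offsets."""
    data = bytearray()
    offsets = []
    pos = 0
    run_size = struct.calcsize(RUN_LEN_FMT)
    len_size = struct.calcsize(STR_LEN_FMT)
    while pos < len(encoded):
        (run,) = struct.unpack_from(RUN_LEN_FMT, encoded, pos)
        pos += run_size
        (length,) = struct.unpack_from(STR_LEN_FMT, encoded, pos)
        pos += len_size
        value = encoded[pos:pos + length]
        pos += length
        for _ in range(run):
            offsets.append(len(data))  # offset of this string in the decoded buffer
            data += value
    return bytes(data), offsets
```

Because each record carries the string length, the decoder can rebuild the offsets on its own, which is why storing them separately for RLE-encoded strings looks redundant. Making `RUN_LEN_FMT` and `STR_LEN_FMT` depend on the input would be one way to avoid the hardcoded sizes.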

Limitations: When RLE is used for string dimensions, we currently require that it is the only filter applied.
This restriction could be relaxed in the future, but some constraint will remain so that RLE is always applied to the string representation of the data, which is where it makes sense.
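
The restriction can be pictured as a validation step on the filter list. The sketch below is an assumption-laden illustration, not TileDB code: `check_string_dim_filters` and the filter names passed as plain strings are hypothetical.

```python
def check_string_dim_filters(filter_names, is_var_string_dim):
    """Reject pipelines that combine RLE with other filters on a string dimension.

    `filter_names` is a list of filter names in pipeline order (hypothetical
    representation); `is_var_string_dim` marks a var-length string dimension.
    """
    if is_var_string_dim and "RLE" in filter_names and len(filter_names) != 1:
        raise ValueError(
            "RLE on a var-length string dimension must be the only filter, got: "
            + ", ".join(filter_names)
        )
```

For example, `check_string_dim_filters(["RLE"], True)` passes, while `check_string_dim_filters(["RLE", "ZSTD"], True)` raises `ValueError`. Other kinds of fields are not affected by this check.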


TYPE: FEATURE
DESC: RLE compression support for var-length string dimensions

@shortcut-integration

This pull request has been linked to Shortcut Story #13310: Do not store chunk metadata inline with chunk.

@ypatia ypatia force-pushed the yt/ch13310/enable_rle_for_string_dimensions branch 3 times, most recently from 4ec9459 to f5198f6 Compare March 4, 2022 12:40
@ypatia ypatia requested a review from Shelnutt2 March 4, 2022 12:41
@ypatia ypatia force-pushed the yt/ch13310/enable_rle_for_string_dimensions branch from 9b47790 to 8517270 Compare March 8, 2022 18:45
Two review threads on tiledb/sm/filter/compression_filter.cc (outdated, resolved)
@ypatia ypatia force-pushed the yt/ch13310/enable_rle_for_string_dimensions branch 2 times, most recently from 066a2d9 to fa99212 Compare March 11, 2022 10:08
@ypatia ypatia force-pushed the yt/ch13310/enable_rle_for_string_dimensions branch from fa99212 to 486495b Compare March 15, 2022 11:15
Member

@Shelnutt2 Shelnutt2 left a comment

lgtm, great work!

@Shelnutt2 Shelnutt2 merged commit 023cb53 into dev Mar 15, 2022
@Shelnutt2 Shelnutt2 deleted the yt/ch13310/enable_rle_for_string_dimensions branch March 15, 2022 12:30
@github-actions
Contributor

The backport to release-2.8 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release-2.8 release-2.8
# Navigate to the new working tree
cd .worktrees/backport-release-2.8
# Create a new branch
git switch --create backport-2938-to-release-2.8
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick --mainline 1 023cb536012048b2a7f6a3a14066862e55a9aaef
# Push it to GitHub
git push --set-upstream origin backport-2938-to-release-2.8
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release-2.8

Then, create a pull request where the base branch is release-2.8 and the compare/head branch is backport-2938-to-release-2.8.

Shelnutt2 pushed a commit that referenced this pull request Mar 16, 2022
* Integrate RLE for var-length string dimensions

* Fix windows warning

* Fix dates in headers

* Various correctness fixes

* Fix backwards compatibility issues

* Add basic test for string attributes and fix filter UTs

* Fix merge issues
Shelnutt2 added a commit that referenced this pull request Mar 17, 2022
* Integrate RLE for var-length string dimensions

* Fix windows warning

* Fix dates in headers

* Various correctness fixes

* Fix backwards compatibility issues

* Add basic test for string attributes and fix filter UTs

* Fix merge issues

Co-authored-by: Ypatia Tsavliri <ypatia@tiledb.com>