Fix offset of the string dictionary length stream #8538
lgtm
Codecov Report
@@             Coverage Diff              @@
##           branch-21.06    #8538    +/-  ##
================================================
- Coverage        82.83%   82.44%   -0.39%
================================================
  Files              109      109
  Lines            17896    17542     -354
================================================
- Hits             14824    14463     -361
- Misses            3072     3079       +7
Fixes #8514
The string dictionary length stream is RLE encoded, and rle_data_size and non_rle_data_size take this into account. However, when computing chunk stream offsets, these streams were treated as non-RLE and non_rle_data_size was not added to their offsets. This caused a discrepancy between non-RLE stream sizes and the available space, leading to overlap between chunk streams.
This PR adds non_rle_data_size to the offset to correct the discrepancy, and adds a test that uses decimal columns to increase the size of non-RLE encoded data and trigger the overflow.
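
To illustrate the kind of overlap described above, here is a minimal Python sketch. It is not the writer's actual code: it assumes a simplified buffer laid out as all non-RLE data followed by all RLE data, and `Stream`, `assign_offsets`, `has_overlap`, and the stream names and sizes are hypothetical. Only `non_rle_data_size` is taken from the description.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    name: str      # hypothetical stream label
    size: int      # encoded size in bytes
    is_rle: bool   # True if the stream is RLE encoded


def assign_offsets(streams, add_non_rle_base=True):
    """Return {name: (start, end)} for a buffer assumed to be [non-RLE | RLE].

    With add_non_rle_base=False, RLE streams are placed as if the non-RLE
    region did not exist, which models the bug described in this PR.
    """
    non_rle_data_size = sum(s.size for s in streams if not s.is_rle)
    non_rle_cursor = 0
    rle_cursor = 0
    offsets = {}
    for s in streams:
        if s.is_rle:
            base = non_rle_data_size if add_non_rle_base else 0
            start = base + rle_cursor
            rle_cursor += s.size
        else:
            start = non_rle_cursor
            non_rle_cursor += s.size
        offsets[s.name] = (start, start + s.size)
    return offsets


def has_overlap(offsets):
    """Check whether any two assigned byte ranges intersect."""
    spans = sorted(offsets.values())
    return any(prev_end > start
               for (_, prev_end), (start, _) in zip(spans, spans[1:]))


# Hypothetical chunk: a decimal column enlarges the non-RLE region.
streams = [
    Stream("decimal_data", 64, is_rle=False),
    Stream("string_dict_data", 32, is_rle=False),
    Stream("string_dict_length", 16, is_rle=True),
]

buggy = assign_offsets(streams, add_non_rle_base=False)
fixed = assign_offsets(streams, add_non_rle_base=True)
print("buggy overlaps:", has_overlap(buggy))
print("fixed overlaps:", has_overlap(fixed))
```

In this model, larger non-RLE streams (such as the decimal data in the new test) make the collision more likely to corrupt data, since the misplaced RLE stream lands inside a region that is actually in use.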