Fix reading of sparse columns from s3 #37978
Conversation
I'll try to add a test a bit later.
```cpp
bool need_to_check_marks_from_the_right = false;

/// If the end of range is inside the block, we will need to read it too.
if (right_mark.offset_in_decompressed_block > 0)
```
It looks like the main change is the removal of the need_to_check_marks_from_the_right flag. But it is not clear why this would be a fix. Could you write a comment about it?
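The boundary condition under discussion can be sketched as follows. `Mark` here is a hypothetical, heavily simplified stand-in for ClickHouse's real mark structure, used only to illustrate the check:

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for a ClickHouse mark: a position in the
// compressed file plus an offset inside the decompressed block.
struct Mark
{
    size_t offset_in_compressed_file = 0;
    size_t offset_in_decompressed_block = 0;
};

// If the right mark points into the middle of a decompressed block
// (offset_in_decompressed_block > 0), the requested range ends inside
// that block, so the block has to be read as well.
bool rangeEndsInsideBlock(const Mark & right_mark)
{
    return right_mark.offset_in_decompressed_block > 0;
}
```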
```cpp
    : right_mark_non_included - 1;

auto indices = collections::range(right_mark_included, marks_count);
auto it = std::upper_bound(indices.begin(), indices.end(), right_mark_included,
```
Upper bound looks ok, but it may be overkill. I think the only case where we can have many consecutive granules with the same offset is a LowCardinality dictionary.
Not only a LowCardinality dictionary, but also values of sparse columns (which is the point of this PR) and empty arrays. I tried to generalize these cases with upper bound.
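The generalization described above can be sketched like this. The function below works directly on a plain vector of mark offsets rather than ClickHouse's real index structures, so its name and signature are illustrative only:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Several consecutive granules can share the same offset in the compressed
// file when they occupy zero bytes: values of sparse columns, empty arrays,
// or a LowCardinality dictionary. To find where the data for a granule
// actually ends, skip forward to the first mark with a strictly greater
// offset; std::upper_bound does this in O(log n).
size_t firstMarkWithGreaterOffset(const std::vector<size_t> & mark_offsets,
                                  size_t right_mark_included)
{
    auto it = std::upper_bound(
        mark_offsets.begin() + right_mark_included, mark_offsets.end(),
        mark_offsets[right_mark_included]);
    return static_cast<size_t>(it - mark_offsets.begin());
}
```

For example, with offsets `{0, 0, 0, 100, 200}`, starting from mark 0 the first mark with a greater offset is mark 3: the three zero-byte granules are skipped in one search.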
Stress test: some issues with Keeper:
Stateless tests (release, DatabaseReplicated, actions): some issues with Keeper:
I'm not sure, but there is just no related diff in other PRs.
Let's try to revert; we have multiple failures on master each day.
Hi! I get this error; could it be related to this fix?
* Backport #37978 to 22.3: Fix reading of sparse columns from s3
* Merge pull request #38239 from CurtizJ/fix-reading-from-s3: Fix reading from s3 in some corner cases

Co-authored-by: robot-clickhouse <robot-clickhouse@clickhouse.com>
Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com>
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix reading of sparse columns from `MergeTree` tables that store their data in S3.