Fix undefined behavior in quantiles function #44067

Merged
merged 7 commits into master from fix-quantiles-msan on Dec 12, 2022


@alexey-milovidov alexey-milovidov commented Dec 9, 2022

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in official stable or prestable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix undefined behavior in the quantiles function, which might lead to uninitialized memory. Found by fuzzer. This closes #44066.

@alexey-milovidov alexey-milovidov added the pr-must-backport label Dec 9, 2022
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-bugfix label Dec 9, 2022
@davenger davenger self-assigned this Dec 9, 2022
if constexpr (is_decimal<Value>)
result[indices[i]] = Value(static_cast<typename Value::NativeType>(data.quantileInterpolated(levels[indices[i]])));
{
if (is_empty)
Member

Regarding the UBSan fix: this check is actually done inside quantileInterpolated(), but on empty input that function returns a NaN double. I fixed it in #43103, but that changes behavior, see here.
If that's OK, then I think it would be the preferable fix for the undefined behavior:
(1) it respects the onEmpty policy of ReservoirSampler* (not sure how important that is)
(2) it works for QuantileReservoirSampler::get() as well; the current fix addresses only getMany()

Member Author

Changing the behavior on an empty column is not OK. It may not be a big deal, but we have been very concerned about incompatibilities recently.

I thought about removing the onEmpty policy, since after this change the logic is duplicated, but did not get to that.

The point about get is valid; it should be fixed as well...
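
The guard discussed in this thread can be sketched in isolation. This is a minimal, self-contained illustration, not the ClickHouse code: `ToySampler`, its linear interpolation, and the choice of a value-initialized result for empty input are all assumptions made for the example. It shows why the emptiness check has to come before the cast: converting a NaN double to an integer type is undefined behavior in C++, so both `getMany` and `get` test for emptiness first.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a reservoir sampler; not the ClickHouse class.
struct ToySampler
{
    std::vector<double> samples;

    bool empty() const { return samples.empty(); }

    // Returns NaN when there are no samples, as described in the review comment.
    // The interpolation scheme itself is an assumption made for this sketch.
    double quantileInterpolated(double level) const
    {
        if (samples.empty())
            return std::numeric_limits<double>::quiet_NaN();

        std::vector<double> sorted = samples;
        std::sort(sorted.begin(), sorted.end());

        const double clamped = std::clamp(level, 0.0, 1.0);
        const double pos = clamped * static_cast<double>(sorted.size() - 1);
        const std::size_t left = static_cast<std::size_t>(pos);
        const std::size_t right = std::min(left + 1, sorted.size() - 1);
        const double frac = pos - static_cast<double>(left);
        return sorted[left] + (sorted[right] - sorted[left]) * frac;
    }
};

// Many quantiles at once: the empty case is decided before any cast,
// so a NaN never reaches static_cast<Int>.
template <typename Int>
std::vector<Int> getMany(const ToySampler & sampler, const std::vector<double> & levels)
{
    std::vector<Int> result(levels.size());
    const bool is_empty = sampler.empty();
    for (std::size_t i = 0; i < levels.size(); ++i)
    {
        if (is_empty)
            result[i] = Int{}; // assumed result for empty input
        else
            result[i] = static_cast<Int>(sampler.quantileInterpolated(levels[i]));
    }
    return result;
}

// Single quantile: the same guard, covering the get() path raised in the review.
template <typename Int>
Int get(const ToySampler & sampler, double level)
{
    if (sampler.empty())
        return Int{}; // assumed result for empty input
    return static_cast<Int>(sampler.quantileInterpolated(level));
}
```

With this shape, an empty sampler yields value-initialized results from both entry points, while `quantileInterpolated` itself still returns NaN to callers that work in floating point.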

@alexey-milovidov alexey-milovidov merged commit bca1829 into master Dec 12, 2022
@alexey-milovidov alexey-milovidov deleted the fix-quantiles-msan branch December 12, 2022 01:19
alexey-milovidov added a commit that referenced this pull request Dec 13, 2022
Backport #44067 to 22.10: Fix undefined behavior in `quantiles` function
alexey-milovidov added a commit that referenced this pull request Dec 13, 2022
Backport #44067 to 22.11: Fix undefined behavior in `quantiles` function
alexey-milovidov added a commit that referenced this pull request Dec 13, 2022
Backport #44067 to 22.9: Fix undefined behavior in `quantiles` function
@robot-ch-test-poll robot-ch-test-poll added the pr-backports-created label Dec 13, 2022
alexey-milovidov added a commit that referenced this pull request Dec 13, 2022
Backport #44067 to 22.8: Fix undefined behavior in `quantiles` function
alexey-milovidov added a commit that referenced this pull request Dec 13, 2022
Backport #44067 to 22.3: Fix undefined behavior in `quantiles` function
Development

Successfully merging this pull request may close these issues.

MSan and UBSan report in quantiles