[C++] Hashing array scalar with null bitmap and non-null 0s bitmap produces different hashes. #35521

Closed
micah-white opened this issue May 9, 2023 · 1 comment · Fixed by #35522
Labels: Component: C++ · Critical Fix (bugfixes for security vulnerabilities, crashes, or invalid data) · Type: bug

@micah-white
Contributor

Describe the bug, including details regarding any error messages, version, and platform.

Null bitmaps in arrays are generally set to nullptr when the array's null count is 0. However, in some cases the null bitmap is set to a 0s buffer instead. Semantically, a 0s bitmap and a nullptr bitmap are the same thing. However, the hashing algorithm for array scalars hashes the null bitmap whenever it exists, and the resulting hash is not 0. As a result, two scalars with the same semantic value but different internal representations of the null bitmap can hash to different values.
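
To illustrate the mismatch without depending on Arrow itself, here is a minimal standalone sketch. Everything in it is hypothetical: `ToyArray`, `HashNaive`, and `HashNullAware` are illustrative names, not Arrow types or functions, and FNV-1a stands in for whatever hash Arrow actually uses. `HashNaive` mixes in the bitmap whenever one is present, which is the behavior described above. `HashNullAware` shows one possible remedy, assuming it is acceptable to skip the bitmap when the null count is 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for an array's buffers; this is NOT Arrow's ArrayData.
struct ToyArray {
  std::vector<int32_t> values;
  // std::nullopt models a nullptr bitmap.
  std::optional<std::vector<uint8_t>> null_bitmap;
  int64_t null_count = 0;
};

// FNV-1a over raw bytes, continuing from an existing hash state.
// Swapped in for illustration only; not the hash Arrow uses.
inline uint64_t Fnv1a(uint64_t h, const uint8_t* data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    h ^= data[i];
    h *= 0x100000001b3ULL;  // 64-bit FNV prime
  }
  return h;
}

// Hash only the value bytes, starting from the 64-bit FNV offset basis.
inline uint64_t HashValues(const ToyArray& a) {
  const auto* bytes = reinterpret_cast<const uint8_t*>(a.values.data());
  return Fnv1a(0xcbf29ce484222325ULL, bytes,
               a.values.size() * sizeof(int32_t));
}

// Reported behavior: the bitmap is hashed whenever it exists,
// even if it carries no information (null_count == 0).
inline uint64_t HashNaive(const ToyArray& a) {
  uint64_t h = HashValues(a);
  if (a.null_bitmap) {
    h = Fnv1a(h, a.null_bitmap->data(), a.null_bitmap->size());
  }
  return h;
}

// One possible remedy (an assumption, not necessarily the merged fix):
// only mix in the bitmap when the array actually contains nulls.
inline uint64_t HashNullAware(const ToyArray& a) {
  uint64_t h = HashValues(a);
  if (a.null_count > 0 && a.null_bitmap) {
    h = Fnv1a(h, a.null_bitmap->data(), a.null_bitmap->size());
  }
  return h;
}

// Builds two semantically equal arrays and checks that the
// null-aware hash treats them identically.
inline void Demo() {
  ToyArray no_bitmap{{1, 2, 3}, std::nullopt, 0};
  ToyArray zero_bitmap{{1, 2, 3}, std::vector<uint8_t>{0}, 0};
  assert(HashNullAware(no_bitmap) == HashNullAware(zero_bitmap));
}
```

With these definitions, `HashNaive` gives different results for `no_bitmap` and `zero_bitmap`, while `HashNullAware` gives the same result for both.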

Component(s)

C++

@AlenkaF AlenkaF changed the title Hashing array scalar with null bitmap and non-null 0s bitmap produces different hashes. [C++] Hashing array scalar with null bitmap and non-null 0s bitmap produces different hashes. May 10, 2023
@jorisvandenbossche
Member

Related issue: #35360 (not strictly about the null bitmap, but more generally about the ScalarHash implementation lacking quite a few things)

@pitrou pitrou added this to the 13.0.0 milestone May 11, 2023
pitrou added a commit that referenced this issue May 11, 2023
### Are these changes tested?

I am new to contributing and am having a hard time creating a good test for this. The steps to reproduce this bug originally are too complicated for a simple test. I included my attempt at making a good test in the PR, but some help would be nice.

### Are there any user-facing changes?
No.

**This PR contains a "Critical Fix".**
* Closes: #35521

Lead-authored-by: micah-white <micahwhitecs@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
ArgusLi pushed a commit to Bit-Quill/arrow that referenced this issue May 15, 2023
rtpsw pushed a commit to rtpsw/arrow that referenced this issue May 16, 2023
@raulcd raulcd added the Critical Fix Bugfixes for security vulnerabilities, crashes, or invalid data. label Sep 12, 2023