Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C#][Integration] Validity buffer comparison is dependent on zeroed final bits #40870

Closed
paleolimbot opened this issue Mar 28, 2024 · 3 comments

Comments

@paleolimbot
Copy link
Member

paleolimbot commented Mar 28, 2024

Describe the bug, including details regarding any error messages, version, and platform.

I ran across this when implementing integration tests for nanoarrow (#39302), where I'd forgotten to zero the last few bits of each validity buffer when reading the testing integration JSON. This resulted in a failing check for nanoarrow -> C# because the check for validity buffer equivalence is just based on the bytes, not strictly the values of the relevant bits.

I will fix the zeroed final byte for the purposes of #39302; however, I wonder if it is worth updating the bit of code that does this comparison since I don't believe there's any guarantee about the values of bits outside the range [0, length] in the bitmap.

To double check this I at one point replaced:

Assert.True(
expectedValidityBuffer.Span.Slice(0, validityBitmapByteCount).SequenceEqual(actualValidityBuffer.Span.Slice(0, validityBitmapByteCount)),
"Validity buffers do not match.");
}

with

                    ReadOnlySpan<byte> expectedSpan = expectedValidityBuffer.Span.Slice(0, validityBitmapByteCount);
                    ReadOnlySpan<byte> actualSpan = actualValidityBuffer.Span.Slice(0, validityBitmapByteCount);

                    for (int i = 0; i < arrayLength; i++)
                    {
                        Assert.True(
                            BitUtility.GetBit(expectedSpan, i) == BitUtility.GetBit(actualSpan, i),
                            string.Format("bit at index {0}/{1} is not equal", i, arrayLength));
                    }

I'm happy to give this PR a shot if that is an acceptable change!

Component(s)

C#, Integration

@CurtHagenlocher
Copy link
Contributor

It seems fine to me. If we were concerned about performance we could always byte compare the first validityBitmapByteCount-1 bytes directly and only the last byte in a bitwise fashion, but I can't imagine it would matter much in test code.

@paleolimbot
Copy link
Member Author

That makes sense...I'll put a PR together 🙂

paleolimbot added a commit that referenced this issue Mar 28, 2024
…d final bits are not identical (#40873)

### Rationale for this change

Before fixing nanoarrow's testing JSON reader to align with other implementations and properly zero out the last few bits, integration tests failed because C#'s `CompareValidityBuffer()` was comparing the bytes of the validity buffer (including undefined final bits that are maybe not identical due to uninitialized memory or because the arrays are slices).

### What changes are included in this PR?

`CompareValidityBuffer()` now compares the memory for all except the last byte and compares the last byte bitwise.

### Are these changes tested?

They should be but I am not sure exactly where to add the test!

### Are there any user-facing changes?

No
* GitHub Issue: #40870

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
@paleolimbot
Copy link
Member Author

Issue resolved by pull request 40873
#40873

@paleolimbot paleolimbot added this to the 16.0.0 milestone Mar 28, 2024
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…ecified final bits are not identical (apache#40873)

### Rationale for this change

Before fixing nanoarrow's testing JSON reader to align with other implementations and properly zero out the last few bits, integration tests failed because C#'s `CompareValidityBuffer()` was comparing the bytes of the validity buffer (including undefined final bits that are maybe not identical due to uninitialized memory or because the arrays are slices).

### What changes are included in this PR?

`CompareValidityBuffer()` now compares the memory for all except the last byte and compares the last byte bitwise.

### Are these changes tested?

They should be but I am not sure exactly where to add the test!

### Are there any user-facing changes?

No
* GitHub Issue: apache#40870

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
tolleybot pushed a commit to tmct/arrow that referenced this issue May 4, 2024
…ecified final bits are not identical (apache#40873)

### Rationale for this change

Before fixing nanoarrow's testing JSON reader to align with other implementations and properly zero out the last few bits, integration tests failed because C#'s `CompareValidityBuffer()` was comparing the bytes of the validity buffer (including undefined final bits that are maybe not identical due to uninitialized memory or because the arrays are slices).

### What changes are included in this PR?

`CompareValidityBuffer()` now compares the memory for all except the last byte and compares the last byte bitwise.

### Are these changes tested?

They should be but I am not sure exactly where to add the test!

### Are there any user-facing changes?

No
* GitHub Issue: apache#40870

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
…ecified final bits are not identical (apache#40873)

### Rationale for this change

Before fixing nanoarrow's testing JSON reader to align with other implementations and properly zero out the last few bits, integration tests failed because C#'s `CompareValidityBuffer()` was comparing the bytes of the validity buffer (including undefined final bits that are maybe not identical due to uninitialized memory or because the arrays are slices).

### What changes are included in this PR?

`CompareValidityBuffer()` now compares the memory for all except the last byte and compares the last byte bitwise.

### Are these changes tested?

They should be but I am not sure exactly where to add the test!

### Are there any user-facing changes?

No
* GitHub Issue: apache#40870

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
…ecified final bits are not identical (apache#40873)

### Rationale for this change

Before fixing nanoarrow's testing JSON reader to align with other implementations and properly zero out the last few bits, integration tests failed because C#'s `CompareValidityBuffer()` was comparing the bytes of the validity buffer (including undefined final bits that are maybe not identical due to uninitialized memory or because the arrays are slices).

### What changes are included in this PR?

`CompareValidityBuffer()` now compares the memory for all except the last byte and compares the last byte bitwise.

### Are these changes tested?

They should be but I am not sure exactly where to add the test!

### Are there any user-facing changes?

No
* GitHub Issue: apache#40870

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…ecified final bits are not identical (apache#40873)

### Rationale for this change

Before fixing nanoarrow's testing JSON reader to align with other implementations and properly zero out the last few bits, integration tests failed because C#'s `CompareValidityBuffer()` was comparing the bytes of the validity buffer (including undefined final bits that are maybe not identical due to uninitialized memory or because the arrays are slices).

### What changes are included in this PR?

`CompareValidityBuffer()` now compares the memory for all except the last byte and compares the last byte bitwise.

### Are these changes tested?

They should be but I am not sure exactly where to add the test!

### Are there any user-facing changes?

No
* GitHub Issue: apache#40870

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants