mapleFU
changed the title
[C++][Parquet] Boolean encoding has inconsistent implemention
[C++][Parquet] Boolean encoding has more space than expected
Aug 1, 2023
mapleFU
changed the title
[C++][Parquet] Boolean encoding has more space than expected
[C++][Parquet] Boolean encoding has inconsistent implemention
Aug 1, 2023
By the way, `sink_.UnsafeAdvance(data.length());` reserves one byte per array slot, nulls included, so the total is the sum of the array lengths in bytes. The data should instead be encoded into one bit per non-null value, i.e. the sum of the non-null counts in bits.
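The size mismatch in the comment above can be made concrete with a little arithmetic. The following is a minimal standalone sketch, not Arrow or Parquet code: the helper names `BytesReservedPerSlot` and `BytesNeededBitPacked` are hypothetical, and the sketch assumes the intended output is bit-packed with eight values per byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: bytes reserved if the sink advances by one byte per
// array slot, nulls included (the behaviour the comment describes).
int64_t BytesReservedPerSlot(const std::vector<int64_t>& array_lengths) {
  int64_t total = 0;
  for (int64_t len : array_lengths) total += len;
  return total;
}

// Hypothetical helper: bytes needed when only non-null values are kept and
// packed eight to a byte.
int64_t BytesNeededBitPacked(const std::vector<int64_t>& array_lengths,
                             const std::vector<int64_t>& null_counts) {
  assert(array_lengths.size() == null_counts.size());
  int64_t bits = 0;
  for (std::size_t i = 0; i < array_lengths.size(); ++i) {
    bits += array_lengths[i] - null_counts[i];  // count non-null values only
  }
  return (bits + 7) / 8;  // round up to whole bytes
}
```

For two arrays of lengths 10 and 6 with 2 and 1 nulls, the first helper gives 16 bytes while the second gives 2 bytes (13 bits rounded up).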
… called several times (#36972)
### Rationale for this change
This fixes a bug in PLAIN encoding with `BooleanArray` input: the boolean encoder produces a wrong length when writing Arrow data.
This interface is not widely used.
### What changes are included in this PR?
Rewrite the PLAIN boolean encoder to use `TypedBufferBuilder` instead of an incorrect hand-rolled implementation.
### Are these changes tested?
Yes
### Are there any user-facing changes?
No.
* Closes: #36939
Lead-authored-by: mwish <maplewish117@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
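To illustrate why a bit-level builder avoids the length problem across repeated calls, here is a minimal standalone sketch. `BitAppender` is a hypothetical stand-in written for this example; it is not Arrow's `TypedBufferBuilder` and does not reproduce the code in the PR. It assumes least-significant-bit-first packing, as used by Parquet's PLAIN boolean encoding, and skips null slots entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit-packing buffer builder; not Arrow's API.
class BitAppender {
 public:
  // Append one boolean as a single bit, least-significant bit first.
  void Append(bool value) {
    if (bit_length_ % 8 == 0) bytes_.push_back(0);  // start a fresh byte
    if (value) {
      bytes_.back() |= static_cast<uint8_t>(1u << (bit_length_ % 8));
    }
    ++bit_length_;
  }

  // Append only the slots marked valid; null slots contribute no bits.
  void AppendNonNull(const std::vector<bool>& values,
                     const std::vector<bool>& is_valid) {
    assert(values.size() == is_valid.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
      if (is_valid[i]) Append(values[i]);
    }
  }

  std::size_t bit_length() const { return bit_length_; }
  const std::vector<uint8_t>& bytes() const { return bytes_; }

 private:
  std::vector<uint8_t> bytes_;
  std::size_t bit_length_ = 0;  // running total, carried across calls
};
```

Because the running bit count lives in the builder, a second call continues in the partially filled byte left by the first, so the final length depends only on the number of non-null values appended.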
Describe the bug, including details regarding any error messages, version, and platform.
For `PlainEncoder<BooleanType>`: if the values contain nulls, one code path only puts the length for `valid_bits`, while the other always outputs the length including nulls. The two implementations are inconsistent.
Component(s)
C++, Parquet