GH-47596: [C++][Parquet] Fix printing of large Decimal statistics (#47619)
### Rationale for this change
Parquet CLI tools fail to print the statistics for a Decimal column whose precision exceeds the maximum Decimal128 precision (38 digits).
Example:
```console
$ /build/build-test/debug/parquet-reader --only-metadata /tmp/pqfuzz/pq-table-1
...
Column 5: col_6 (FIXED_LEN_BYTE_ARRAY(11) / Decimal(precision=24, scale=7) / DECIMAL(24,7))
Column 6: col_7 (FIXED_LEN_BYTE_ARRAY(18) / Decimal(precision=43, scale=7) / DECIMAL(43,7))
...
Column 5
Values: 375, Null Values: 74, Distinct Values: 0
Max (exact: true): 98505381700645007.0205463, Min (exact: true): -99708959786297168.1726196
Compression: UNCOMPRESSED, Encodings: PLAIN(DICT_PAGE) RLE_DICTIONARY
Uncompressed Size: 3754, Compressed Size: 3754
Column 6
Values: 375, Null Values: 69, Distinct Values: 0
Max (exact: true): Parquet error: Failed to parse decimal value: Length of byte array passed to Decimal128::FromBigEndian was 18, but must be between 1 and 16
...
```
### What changes are included in this PR?
Use Decimal256 instead of Decimal128 when printing a Decimal statistic, so that values stored in more than 16 bytes can still be formatted.
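To illustrate why the byte width matters, here is a minimal, self-contained sketch of formatting a big-endian two's-complement value of arbitrary length as a decimal string. It is not the code in this PR and does not use Arrow: `FormatBigEndianDecimal` is a hypothetical helper written only for this example. The 18-byte case corresponds to the `FIXED_LEN_BYTE_ARRAY(18)` column above, which is wider than the 16 bytes that `Decimal128::FromBigEndian` accepts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an Arrow or Parquet API): formats a big-endian
// two's-complement integer of any byte width as a decimal string with the
// given scale (number of digits after the decimal point).
std::string FormatBigEndianDecimal(std::vector<uint8_t> bytes, int scale) {
  assert(scale >= 0);
  if (bytes.empty()) bytes.push_back(0);  // treat an empty input as zero

  const bool negative = (bytes[0] & 0x80) != 0;
  if (negative) {
    // Two's-complement negation: invert every byte, then add one.
    for (auto& b : bytes) b = static_cast<uint8_t>(~b);
    for (auto it = bytes.rbegin(); it != bytes.rend(); ++it) {
      if (++(*it) != 0) break;  // stop once the carry is absorbed
    }
  }

  auto is_zero = [&bytes] {
    return std::all_of(bytes.begin(), bytes.end(),
                       [](uint8_t b) { return b == 0; });
  };

  // Repeatedly divide the magnitude by 10, collecting remainders.
  // Digits are produced least-significant first.
  std::string digits;
  while (!is_zero()) {
    unsigned remainder = 0;
    for (auto& b : bytes) {
      const unsigned cur = remainder * 256 + b;
      b = static_cast<uint8_t>(cur / 10);
      remainder = cur % 10;
    }
    digits.push_back(static_cast<char>('0' + remainder));
  }

  // Pad so that at least one digit precedes the decimal point.
  while (digits.size() <= static_cast<std::size_t>(scale)) digits.push_back('0');
  std::reverse(digits.begin(), digits.end());

  if (scale > 0) {
    const std::size_t point = digits.size() - static_cast<std::size_t>(scale);
    digits.insert(point, ".");
  }
  return negative ? "-" + digits : digits;
}
```

For example, `FormatBigEndianDecimal({0x04, 0xD2}, 2)` yields `12.34`, and the same function handles an 18-byte input without any width check. A fixed-width type such as Decimal256 avoids the per-digit byte loop at the cost of an upper bound on the input length.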
### Are these changes tested?
Yes, by new tests.
### Are there any user-facing changes?
No.
* GitHub Issue: #47596
Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>