Describe the bug, including details regarding any error messages, version, and platform.
This bug was found from the comment below:
#41036 (comment)
The bug is in the grouper and appears when `are_cols_in_encoding_order` is set to `false` in the code below (`arrow/cpp/src/arrow/compute/row/grouper.cc`, line 582 at 2979d69):

    /* are_cols_in_encoding_order=*/true);
With `false`, the resulting number of groups differs from the result obtained with `are_cols_in_encoding_order=true`.
The encoder sorts columns by default. When only this comparison argument is set to `false`, the inputs to `CompareColumnsToRows`, `impl_ptr->encoder_.batch_all_cols()` and `impl_ptr->rows_`, are both already sorted, but the wrong column offset is used to access the compared column (`arrow/cpp/src/arrow/compute/row/compare_internal.cc`, lines 366 to 369 at 2979d69):

    uint32_t offset_within_row = rows.metadata().encoded_field_offset(
        are_cols_in_encoding_order
            ? static_cast<uint32_t>(icol)
            : rows.metadata().pos_after_encoding(static_cast<uint32_t>(icol)));
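To make the offset mix-up concrete, here is a minimal standalone sketch. It does not use Arrow itself: `ToyRowMetadata`, `MakeToyMetadata`, `OffsetWithinRow`, and `CheckToyOffsets` are hypothetical names, and the three column widths and their encoded order are assumptions chosen only for illustration. It shows that when the columns passed in are already in encoding order, applying `pos_after_encoding` a second time selects a different field's offset.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the row metadata; NOT the real Arrow class.
struct ToyRowMetadata {
  // original column index -> position after encoding (assumed mapping)
  std::vector<uint32_t> pos_after_encoding_map;
  // position after encoding -> byte offset within an encoded row
  std::vector<uint32_t> encoded_offsets;

  uint32_t pos_after_encoding(uint32_t original_col) const {
    return pos_after_encoding_map[original_col];
  }
  uint32_t encoded_field_offset(uint32_t encoded_pos) const {
    return encoded_offsets[encoded_pos];
  }
};

// Assumed example: original columns have widths 1, 8, and 4 bytes, and the
// toy encoder stores them in the order col1 (8), col2 (4), col0 (1).
ToyRowMetadata MakeToyMetadata() {
  return ToyRowMetadata{{2, 0, 1}, {0, 8, 12}};
}

// Mirrors the shape of the offset lookup quoted above.
uint32_t OffsetWithinRow(const ToyRowMetadata& md, uint32_t icol,
                         bool are_cols_in_encoding_order) {
  return md.encoded_field_offset(
      are_cols_in_encoding_order ? icol : md.pos_after_encoding(icol));
}

void CheckToyOffsets() {
  const ToyRowMetadata md = MakeToyMetadata();

  // The caller passes columns that are ALREADY in encoding order, so icol
  // is an encoded position and `true` yields the matching offsets.
  assert(OffsetWithinRow(md, 0, true) == 0);
  assert(OffsetWithinRow(md, 1, true) == 8);
  assert(OffsetWithinRow(md, 2, true) == 12);

  // With `false`, icol is treated as an original index and remapped again,
  // so each already-sorted column is compared against another field's bytes.
  assert(OffsetWithinRow(md, 0, false) == 12);
  assert(OffsetWithinRow(md, 1, false) == 0);
  assert(OffsetWithinRow(md, 2, false) == 8);
}
```

Calling `CheckToyOffsets()` from a `main` function runs the checks. In this toy setup the flag is only safe to set to `false` when the columns really are in their original, unsorted order.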
Component(s)
C++