[C++] Grouper produces a wrong num_groups when are_cols_in_encoding_order is set to false #41233

@ZhangHuiGui

Describe the bug, including details regarding any error messages, version, and platform.

Bug found from the comment below:
#41036 (comment)

It is a bug in the grouper that appears when are_cols_in_encoding_order is set to false in the code below:

/* are_cols_in_encoding_order=*/true);

This makes num_groups differ from the result obtained with are_cols_in_encoding_order=true.

The encoder sorts columns by default. When only this comparison argument is set to false, the inputs to CompareColumnsToRows (impl_ptr->encoder_.batch_all_cols() and impl_ptr->rows_) are both already sorted, but an incorrect column offset is used to access the compared column:

uint32_t offset_within_row = rows.metadata().encoded_field_offset(
    are_cols_in_encoding_order
        ? static_cast<uint32_t>(icol)
        : rows.metadata().pos_after_encoding(static_cast<uint32_t>(icol)));
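To make the mismatch concrete, here is a minimal toy model of the offset lookup. It is only a sketch, not Arrow's implementation: ToyMetadata, OffsetWithinRow, and the "wider columns first" ordering rule are hypothetical stand-ins, and only the names pos_after_encoding and encoded_field_offset are borrowed from the snippet above. It assumes the columns passed in are already in encoding order, as described in this report.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Toy stand-in for the row metadata (hypothetical, NOT Arrow's real class).
struct ToyMetadata {
  std::vector<uint32_t> widths;   // byte widths, in original column order
  std::vector<uint32_t> order;    // encoded position -> original column index
  std::vector<uint32_t> inverse;  // original column index -> encoded position
  std::vector<uint32_t> offsets;  // encoded position -> byte offset within a row

  explicit ToyMetadata(std::vector<uint32_t> w) : widths(std::move(w)) {
    order.resize(widths.size());
    std::iota(order.begin(), order.end(), 0u);
    // Assumed ordering rule, for illustration only: wider columns come first.
    std::stable_sort(order.begin(), order.end(),
                     [this](uint32_t a, uint32_t b) { return widths[a] > widths[b]; });
    inverse.resize(widths.size());
    offsets.resize(widths.size());
    uint32_t offset = 0;
    for (uint32_t pos = 0; pos < order.size(); ++pos) {
      inverse[order[pos]] = pos;
      offsets[pos] = offset;
      offset += widths[order[pos]];
    }
  }

  uint32_t pos_after_encoding(uint32_t icol) const { return inverse[icol]; }
  uint32_t encoded_field_offset(uint32_t pos) const { return offsets[pos]; }
};

// Mirrors the shape of the ternary in the snippet above.
uint32_t OffsetWithinRow(const ToyMetadata& md, uint32_t icol,
                         bool are_cols_in_encoding_order) {
  return md.encoded_field_offset(are_cols_in_encoding_order
                                     ? icol
                                     : md.pos_after_encoding(icol));
}
```

With original widths {1, 8, 4}, the toy encoding order is (8, 4, 1). If the caller already passes columns in that order, icol 0 is the 8-byte column at offset 0. Passing false remaps icol 0 through pos_after_encoding a second time and yields offset 12, so the comparison would read the wrong field in this model.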

Component(s)

C++
