[GLUTEN-12172][CH] Fix group limit first array result offset #12173

Merged: zzcclp merged 2 commits into main from bug_group_limit_empty_offsets on May 29, 2026
@lgbo-ustc
Contributor

What changes are proposed in this pull request?

RowNumGroupArraySorted writes aggregate results into a newly created ColumnArray. For the first output row, the array's offsets vector can be empty, but insertResultInto read result_array_offsets.back() before appending the first offset. Calling back() on an empty vector is undefined behavior and can crash when the aggregate top-k path writes its first result.

This change treats an empty offsets vector as having a previous offset of 0 before appending the next cumulative offset. It also adds a ClickHouse backend regression test that forces row_number top-k through the aggregate group limit path and validates the first array result row against vanilla Spark.
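
For readers unfamiliar with cumulative array offsets, the guard can be sketched roughly as below. This is a minimal standalone illustration, not the code from this patch: `Offsets` and `appendRowOffset` are hypothetical names, and a plain `std::vector` stands in for the real offsets column.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an array column's offsets:
// offsets[i] is the cumulative element count after row i.
using Offsets = std::vector<std::uint64_t>;

// Append the cumulative offset for a new row holding `row_size` elements.
// Calling back() on an empty vector is undefined behavior, so an empty
// vector is treated as having a previous offset of 0.
inline void appendRowOffset(Offsets & offsets, std::size_t row_size)
{
    const std::uint64_t previous = offsets.empty() ? 0 : offsets.back();
    offsets.push_back(previous + row_size);
}
```

With this guard, the first row of size 3 yields offsets `{3}`, and a following row of size 2 yields `{3, 5}`, without ever reading from an empty vector.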

Closes #12172

How was this patch tested?

Unit tests (UTs).

Was this patch authored or co-authored using generative AI tooling?

Co-authored using generative AI tooling.

@github-actions
Run Gluten Clickhouse CI on x86

@github-actions
Run Gluten Clickhouse CI on x86

Contributor

@zzcclp left a comment

LGTM

@zzcclp merged commit d87c2b0 into main on May 29, 2026
7 checks passed