[core] Fix FieldListaggAgg distinct mode incorrectly skipping values …#7652

Merged
JingsongLi merged 3 commits into apache:master from lszskye:FieldListaggAgg on Apr 16, 2026

@lszskye lszskye commented Apr 15, 2026

Purpose

Fix a bug in FieldListaggAgg.agg() where the distinct deduplication logic
incorrectly uses substring matching (BinaryString.contains()) instead of
exact token matching, causing valid values to be silently dropped.

For example, with the delimiter ",":

  • accumulator = "abc,def,asd"
  • inputField = "ab,xy"
  • Token "ab" is incorrectly skipped because "abc,def,asd".contains("ab")
    returns true
  • Result: "abc,def,asd,xy" (missing "ab")
  • Expected: "abc,def,asd,ab,xy"
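
The example above can be reproduced with a small standalone sketch. It uses plain `java.lang.String` rather than Paimon's `BinaryString`, and the class and method names (`ListaggDistinctSketch`, `appendDistinctBySubstring`, `appendDistinctByToken`) are hypothetical, chosen only for illustration. It is not the code changed in this PR; it only contrasts a substring check with an exact token check, assuming a non-empty accumulator.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual FieldListaggAgg implementation.
final class ListaggDistinctSketch {

    // Buggy approach: a token is treated as a duplicate if it appears
    // anywhere inside the accumulated string, even as part of another token.
    static String appendDistinctBySubstring(String acc, String input, String delimiter) {
        StringBuilder result = new StringBuilder(acc);
        for (String token : input.split(Pattern.quote(delimiter))) {
            if (result.toString().contains(token)) {
                continue; // "ab" is dropped here because "abc" contains it
            }
            result.append(delimiter).append(token);
        }
        return result.toString();
    }

    // Exact-token approach: split the accumulator into tokens and compare
    // whole tokens. LinkedHashSet keeps first-seen order.
    static String appendDistinctByToken(String acc, String input, String delimiter) {
        String quoted = Pattern.quote(delimiter);
        Set<String> seen = new LinkedHashSet<>(Arrays.asList(acc.split(quoted)));
        seen.addAll(Arrays.asList(input.split(quoted)));
        return String.join(delimiter, seen);
    }

    public static void main(String[] args) {
        // prints "abc,def,asd,xy" (missing "ab")
        System.out.println(appendDistinctBySubstring("abc,def,asd", "ab,xy", ","));
        // prints "abc,def,asd,ab,xy"
        System.out.println(appendDistinctByToken("abc,def,asd", "ab,xy", ","));
    }
}
```

`Pattern.quote` is used so that a custom delimiter containing regex metacharacters (for example `|`) is split literally.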

Tests

testFieldListAggDistinctShouldNotMatchSubstring
testFieldListAggDistinctSubstringWithCustomDelimiter

@JingsongLi JingsongLi left a comment

+1

@JingsongLi JingsongLi merged commit 79e1481 into apache:master Apr 16, 2026
12 checks passed
XiaoHongbo-Hope pushed a commit that referenced this pull request Apr 19, 2026
…7652)

Fix a bug in `FieldListaggAgg.agg()` where the distinct deduplication logic
incorrectly uses substring matching (`BinaryString.contains()`) instead of
exact token matching, causing valid values to be silently dropped.

For example, with delimiter `,`:
- accumulator = `"abc,def,asd"`
- inputField = `"ab,xy"`
- Token `"ab"` is incorrectly skipped because `"abc,def,asd".contains("ab")` returns `true`
- Result: `"abc,def,asd,xy"` (missing `"ab"`)
- Expected: `"abc,def,asd,ab,xy"`