
[Bug] Collect agg merge engine returns a duplicate list even when the field is set to distinct #4999

@neuyilan

Description


Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

master
1.0
0.9
0.8

Compute Engine

none

Minimal reproduce step

  1. Create a table with the collect aggregate function and distinct enabled on the field.

CREATE TABLE test_collect (
  id INT PRIMARY KEY NOT ENFORCED,
  f0 ARRAY<STRING>
) WITH (
  'merge-engine' = 'aggregation',
  'fields.f0.aggregate-function' = 'collect',
  'fields.f0.distinct' = 'true'
);

  2. Insert values.

INSERT INTO test_collect VALUES (1, ARRAY['A', 'B', 'A', 'A']);

  3. Query the results.

SELECT * FROM test_collect;

The result is as follows, which does not meet expectations:
1, [A, B, A, A]

What doesn't meet your expectations?

The returned list should be deduplicated.
For the example above, the result should be:

1, [A, B]
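
To make the expected semantics concrete, here is a minimal Java sketch of the deduplication I would expect from a distinct collect. It is only an illustration, not Paimon's actual implementation: the class `CollectDistinctSketch` and the helper `mergeDistinct` are hypothetical names, and keeping first-seen order is an assumption made so the output matches the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CollectDistinctSketch {

    // Hypothetical helper (not a Paimon API): merges an incoming array into the
    // accumulated one and drops duplicates, keeping first-seen order.
    static List<String> mergeDistinct(List<String> accumulated, List<String> incoming) {
        Set<String> seen = new LinkedHashSet<>();
        if (accumulated != null) {
            seen.addAll(accumulated);
        }
        if (incoming != null) {
            // Duplicates inside a single incoming array are dropped as well,
            // which is the case this issue is about.
            seen.addAll(incoming);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // A single insert with no previously accumulated value, as in the reproduce step.
        List<String> first = mergeDistinct(null, Arrays.asList("A", "B", "A", "A"));
        System.out.println(first); // prints [A, B]

        // A later record for the same key should stay deduplicated too.
        List<String> second = mergeDistinct(first, Arrays.asList("B", "C"));
        System.out.println(second); // prints [A, B, C]
    }
}
```

The point of the sketch is that the very first record for a key should be deduplicated just like later merges, so a single insert of ['A', 'B', 'A', 'A'] yields [A, B].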

Anything else?

no

Are you willing to submit a PR?

  • I'm willing to submit a PR!


Labels: bug
