[Fix] No duplicates in list_intersect #9947

Merged
merged 3 commits into from Dec 12, 2023

Conversation

taniabogatsch
Contributor

This PR changes the list_intersect rewrite from list_filter(l1, (x) -> list_contains(l2, x)) to list_filter(list_distinct(l1), (x) -> list_contains(l2, x)). As a result, list_intersect no longer returns duplicate values.

However, the drawback of this change is that list_intersect now requires list_distinct, which uses DuckDB's histogram aggregate function internally. The histogram function cannot yet handle nested inputs, nor some maximum values for huge integers and blobs. This limits the functionality of list_intersect but ensures correct results according to set logic.
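
To illustrate the difference between the two rewrites, here is a minimal Python sketch that models their semantics on plain lists. It is not DuckDB code: the function names `intersect_old` and `intersect_new` are hypothetical. The order-preserving deduplication is an assumption of the sketch, not a documented guarantee of list_distinct. NULL handling is ignored.

```python
def intersect_old(l1, l2):
    """Model of the previous rewrite: list_filter(l1, (x) -> list_contains(l2, x))."""
    # Every element of l1 that occurs in l2 is kept, so duplicates in l1 survive.
    return [x for x in l1 if x in l2]


def intersect_new(l1, l2):
    """Model of the new rewrite: list_filter(list_distinct(l1), (x) -> list_contains(l2, x))."""
    # Deduplicate l1 first. dict.fromkeys keeps first-seen order, which is an
    # assumption of this sketch, not a property claimed for list_distinct.
    distinct_l1 = list(dict.fromkeys(l1))
    return [x for x in distinct_l1 if x in l2]


l1 = [1, 1, 2, 3, 3]
l2 = [1, 3, 4]
print(intersect_old(l1, l2))  # [1, 1, 3, 3]
print(intersect_new(l1, l2))  # [1, 3]
```

With the old rewrite the duplicates in the first list carry through to the result. With the new one each matching value appears once, which is the set-logic behavior described above.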

@taniabogatsch
Contributor Author

Fixes #9942.

@github-actions github-actions bot marked this pull request as draft December 12, 2023 09:33
@taniabogatsch taniabogatsch marked this pull request as ready for review December 12, 2023 09:34
@Mytherin Mytherin merged commit 055e8fc into duckdb:main Dec 12, 2023
45 checks passed
@Mytherin
Collaborator

Thanks!

@taniabogatsch taniabogatsch deleted the list-intersect branch December 12, 2023 14:21
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Dec 14, 2023
Merge pull request duckdb/duckdb#9985 from hawkfish/summarize-overflow
Merge pull request duckdb/duckdb#9972 from yiyuanliu/lyy/fix-9949
Merge pull request duckdb/duckdb#9962 from Tishj/statement_error_expected_non_optional
Merge pull request duckdb/duckdb#9961 from Tishj/python_supply_config_to_connect
Merge pull request duckdb/duckdb#9947 from taniabogatsch/list-intersect
Merge pull request duckdb/duckdb#9568 from Jens-H/append-BigDecimal
Merge pull request duckdb/duckdb#9959 from Tishj/python_udf_arrow_side_effects
Merge pull request duckdb/duckdb#9855 from StarveZhou/issue_9795_arg
Merge pull request duckdb/duckdb#9944 from Tishj/pyproject_toml
Merge pull request duckdb/duckdb#9946 from lnkuiper/issue9718
Merge pull request duckdb/duckdb#9953 from mlafeldt/fix-httpfs-null-ptr
Merge pull request duckdb/duckdb#9897 from Tmonster/left_semi_anti_feature_rebased
Merge pull request duckdb/duckdb#9936 from hawkfish/empty-frames
Merge pull request duckdb/duckdb#9715 from Tishj/dependency_set