DISTINCT is not implemented for window functions #4545
Comments
Thanks for the report! This is a big can of worms that is on my radar for later this year. I wouldn't recommend wading in because the architecture will be changing underneath you for out-of-core operations, but hey, it's open source!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.
A lot of the necessary refactoring has been done, so we are getting closer.
I think I have a working merge sort tree now (which was less work than I feared) so progress continues to be made.
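For readers wondering why a merge sort tree helps here, the following is a minimal Python sketch of the general technique, not DuckDB's actual implementation. It assumes the common reduction in which each row stores the index of the previous row with the same value; a value is then counted once per frame, at the row whose previous occurrence falls before the frame start. The names `previous_occurrence`, `MergeSortTree`, and `count_distinct` are hypothetical, and NULL handling, FILTER, and EXCLUDE are left out.

```python
import bisect
import heapq


def previous_occurrence(values):
    """prev[i] = index of the previous row holding the same value, or -1."""
    last_seen = {}
    prev = []
    for i, v in enumerate(values):
        prev.append(last_seen.get(v, -1))
        last_seen[v] = i
    return prev


class MergeSortTree:
    """Segment tree in which every node keeps its range's items sorted."""

    def __init__(self, data):
        self.size = 1
        while self.size < max(len(data), 1):
            self.size *= 2
        self.tree = [[] for _ in range(2 * self.size)]
        for i, x in enumerate(data):
            self.tree[self.size + i] = [x]
        # Build parents bottom-up by merging the two sorted children.
        for node in range(self.size - 1, 0, -1):
            left, right = self.tree[2 * node], self.tree[2 * node + 1]
            self.tree[node] = list(heapq.merge(left, right))

    def count_less(self, lo, hi, bound):
        """Count positions i in [lo, hi) whose stored item is < bound."""
        count = 0
        lo += self.size
        hi += self.size
        while lo < hi:
            if lo & 1:
                count += bisect.bisect_left(self.tree[lo], bound)
                lo += 1
            if hi & 1:
                hi -= 1
                count += bisect.bisect_left(self.tree[hi], bound)
            lo //= 2
            hi //= 2
        return count


def count_distinct(tree, lo, hi):
    """Distinct values in frame [lo, hi): rows whose previous
    occurrence lies before the start of the frame."""
    return tree.count_less(lo, hi, lo)


values = ["a", "b", "a", "c", "b", "a"]
tree = MergeSortTree(previous_occurrence(values))
print(count_distinct(tree, 1, 5))  # frame b, a, c, b -> 3
```

Each frame query touches O(log n) nodes and does one binary search per node, so the cost per row stays polylogarithmic regardless of how the frame moves, which is the property a naive per-frame set rebuild lacks.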
Compile basic WindowDistinctAggregator build steps.
Import and compile aggregation method.
Refactor EXCLUDE "part" machinery for reuse.
Compile first back end implementation.
Add parser support. Not hooked up yet.
Change refactoring base to custom aggregator and back out previous refactor. Add missing finalisation.
Debug algorithm 1 implementation.
Basic COUNT(DISTINCT) working.
Add basic naive accelerator triggered by the window debug "separate" setting. Still needs distinct.
Add and test naive distinct aggregation.
Test naïve DISTINCT + EXCLUDE combination.
Add scaling tests. Divert DISTINCT EXCLUDE to the naive implementation.
Improve EXCLUDE DISTINCT test to actually do something.
Tania's PR feedback.
Mark's test feedback.
Mark's PR Feedback: Force naïve implementation when the optimiser is disabled.
Mark's PR Feedback: Coverage tests, fix FILTER.
Only fix FILTER when filtering.
Don't dispose of radix data when merging.
Call aggregate destructors for distinct tree nodes to avoid memory leaks.
Issue #4545: Windowed Distinct Aggregates
What happens?
COUNT(DISTINCT ...) over windows is not currently supported. For example, the query under "To Reproduce" below prints the error "DISTINCT is not implemented for window functions."
The lack of support starts in the parser and is inherited from the pg code.
If I wanted to work on this, where would I start?
Do you think this is a reasonable thing to try to implement, given your knowledge and preferences, and given that Postgres never did (and, as far as I can tell, neither did Presto, Spark, or MSSQL)?
To Reproduce
select count(distinct a) over(partition by b) from c
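To pin down the result this query is expected to return, here is a small pure-Python reference. It is only a sketch of the intended semantics, assuming standard SQL behaviour where COUNT(DISTINCT a) ignores NULLs and, with no ORDER BY or frame clause, every row sees its whole partition. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def count_distinct_over_partition(rows):
    """rows: list of (a, b) tuples from table c.

    Returns one value per input row: the number of distinct
    non-NULL `a` values among rows sharing that row's `b`.
    """
    distinct_by_partition = defaultdict(set)
    for a, b in rows:
        if a is not None:  # COUNT(DISTINCT ...) skips NULLs
            distinct_by_partition[b].add(a)
    return [len(distinct_by_partition[b]) for _, b in rows]


# Hypothetical contents of table c, as (a, b) pairs.
rows = [(1, "x"), (1, "x"), (2, "x"), (3, "y"), (None, "y")]
print(count_distinct_over_partition(rows))  # [2, 2, 2, 1, 1]
```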
OS:
Linux
DuckDB Version:
0.4.1.dev2358
DuckDB Client:
Python
Full Name:
Dror Speiser
Affiliation:
None
Have you tried this on the latest master branch?
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?