Open
What happens?
select("DISTINCT <columns>") silently ignores the DISTINCT keyword.
Expected behavior: either a) throw an exception if DISTINCT is not supported inside select(), or b) apply the DISTINCT.
For comparison, select(count("*")) throws an exception: _duckdb.BinderException: Binder Error: Aggregates cannot be present in a Project relation!
To Reproduce
import duckdb
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 1, 1], "b": [10, 10, 10, 20], "c": [0, 1, 2, 3]})
duckdb.execute("CREATE TABLE foo AS SELECT * FROM df")
sql_query_selectdistinct = duckdb.table("foo").select("DISTINCT a, b").sql_query()
print(sql_query_selectdistinct)
assert "SELECT DISTINCT a, b FROM main.foo" == sql_query_selectdistinct
assert duckdb.table("foo").select("a, b").distinct().sql_query() == sql_query_selectdistinct
Output:
SELECT a, b FROM main.foo
Both assertions fail
OS:
Windows 11 x86_64
DuckDB Package Version:
1.4.1
Python Version:
3.13.6
Full Name:
Clemens Korner
Affiliation:
AIT Austrian Institute of Technology
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration to reproduce the issue?
- Yes, I have