Description
Describe the bug
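If I generate a column with a window function and then try to select its first (or last) value with `first_value` / `last_value`, optionally with a filter, `collect()` fails with a `NotImplemented` error.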
To Reproduce
Steps to reproduce the behavior:
import datafusion as dfn
from datafusion import lit, col, functions as F
from datafusion.expr import Window, WindowFrame


def main() -> None:
    ctx = dfn.SessionContext()
    df = ctx.from_pydict(
        {"any_row": list(range(10))},
    )
    df = df.select(
        "any_row",
        lit(1).alias("ones"),
    )
    df = df.select(
        "any_row",
        F.sum(col("ones"))\
            .over(Window(window_frame=WindowFrame("rows", None, 0), order_by=col("any_row").sort(ascending=True))) \
            .alias("forward_row_sum"),
        F.sum(col("ones"))\
            .over(Window(window_frame=WindowFrame("rows", None, 0), order_by=col("any_row").sort(ascending=False))) \
            .alias("reverse_row_sum"),
    )
    df.collect()
    df.select(
        F.first_value(col("forward_row_sum"), order_by=col("any_row"))
    ).collect()
    df.select(
        F.last_value(col("reverse_row_sum"), filter=col("reverse_row_sum") >= 5, order_by=col("any_row").sort(ascending=True))
    ).collect()


if __name__ == "__main__":
    main()

Traceback (most recent call last):
  File "/Users/nick/repos/bug.py", line 39, in <module>
    main()
    ~~~~^^
  File "/Users/nick/repos/bug.py", line 26, in main
    ).collect()
    ~~~~~~~^^
  File "/Users/nick/repos/.venv/lib/python3.13/site-packages/datafusion/dataframe.py", line 681, in collect
    return self.df.collect()
           ~~~~~~~~~~~~~~~^^
Exception: DataFusion error: NotImplemented("Physical plan does not support logical expression AggregateFunction(AggregateFunction { func: AggregateUDF { inner: FirstValue { name: \"first_value\", signature: Signature { type_signature: Any(1), volatility: Immutable }, accumulator: \"<FUNC>\" } }, params: AggregateFunctionParams { args: [Column(Column { relation: None, name: \"sum(ones) ORDER BY [c19e557aec20e49b985bb070e969ba68f.any_row ASC NULLS FIRST] ROWS BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING\" })], distinct: false, filter: None, order_by: [Sort { expr: Column(Column { relation: Some(Bare { table: \"c19e557aec20e49b985bb070e969ba68f\" }), name: \"any_row\" }), asc: true, nulls_first: true }], null_treatment: Some(RespectNulls) } })")
Expected behavior
Each query should return the first (or last) value of the window-derived column.
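To make the expected values concrete, here is a minimal pure-Python sketch of what the two failing queries would presumably return. It makes no DataFusion calls. It assumes that `WindowFrame("rows", None, 0)` means a running sum from the start of the ordered partition up to the current row, and that `filter` is applied before `last_value` picks a row. The helper `running_sum_by` and the names `expected_first` / `expected_last` are hypothetical, for illustration only.

```python
# Pure-Python model of the reproduction's data; no DataFusion calls are used.
rows = list(range(10))   # the "any_row" column
ones = [1] * len(rows)   # the "ones" column


def running_sum_by(order):
    """Running sum of `ones`, visiting rows in `order`; result keyed by any_row."""
    total, out = 0, {}
    for r in order:
        total += ones[r]
        out[r] = total
    return out


# Assumed semantics: rows between unbounded preceding and the current row.
forward_row_sum = running_sum_by(sorted(rows))
reverse_row_sum = running_sum_by(sorted(rows, reverse=True))

# first_value(forward_row_sum), ordered by any_row ascending
expected_first = forward_row_sum[min(rows)]

# last_value(reverse_row_sum) with filter reverse_row_sum >= 5,
# ordered by any_row ascending
kept = [r for r in sorted(rows) if reverse_row_sum[r] >= 5]
expected_last = reverse_row_sum[kept[-1]]

print(expected_first, expected_last)  # 1 5
```

Under these assumptions the first query should yield 1 and the second should yield 5.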
Additional context
>>> import datafusion as dfn
>>> dfn.__version__
'50.1.0'