feat(array): implement min, max, any, all, sum, mean #9704
Thanks for implementing the other backends! I can't get Trino running on my dev machine, so debugging the failing tests is gonna be hard. If you're able to fix that, it would be awesome!
If you view a scalar as a one-row, one-column table, does that clarify why it makes sense as an operation on a "scalar"?
That wasn't my mental model for a scalar, but with that model, yes, this does make sense. This makes me want to write or edit something in the concepts section of the docs that explains this. I've been using ibis quite a lot and still never grokked that, which makes me think it isn't advertised enough.
Ok, I think mins() is probably the best we can do! I'll push this change.
How do I get codespell to ignore the looks-like-a-misspelling of
I need to fix the "See also" links; the URLs are wrong.
Regarding Polars, there's some incorrect behavior of any and all, so I'll leave those out for now. I've created a ticket upstream (pola-rs/polars#17917).
Actually, we can work around the Polars issue (which isn't an upstream bug, it turns out).
Ok, I was able to get Trino passing, as well as Snowflake and BigQuery. To ensure that BigQuery works I parameterized the
To be crystal clear here, the current implementation ensures that the semantics of the array aggregations match the behavior of the corresponding columnar aggregations. IMO this is the right choice: reasoning about aggregation in the two cases then requires only one conceptual model, for the "plural" array aggregates as well as the columnar ones.
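As a rough illustration of what "matching the columnar semantics" could mean in practice, here is a minimal pure-Python sketch. It assumes the usual SQL aggregate conventions (nulls are skipped, and an empty or all-null input yields NULL); the thread does not spell these rules out, and individual backends may differ. The function names are hypothetical and are not part of ibis.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def _non_null(values: Sequence[Optional[T]]) -> List[T]:
    """Drop nulls, as SQL aggregates conventionally do (assumption)."""
    return [v for v in values if v is not None]


def array_sum(values: Sequence[Optional[float]]) -> Optional[float]:
    """Sum of the non-null elements; None if there are none."""
    kept = _non_null(values)
    return sum(kept) if kept else None


def array_mean(values: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the non-null elements; None if there are none."""
    kept = _non_null(values)
    return sum(kept) / len(kept) if kept else None


def array_any(values: Sequence[Optional[bool]]) -> Optional[bool]:
    """True if any non-null element is true; None if there are none."""
    kept = _non_null(values)
    return any(kept) if kept else None


def array_all(values: Sequence[Optional[bool]]) -> Optional[bool]:
    """True if every non-null element is true; None if there are none."""
    kept = _non_null(values)
    return all(kept) if kept else None
```

Under these assumptions, an array behaves like a tiny column: the same null-handling rules apply whether the values sit in one array cell or are spread across rows.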
Also got Postgres and RisingWave working, using the same strategy as BigQuery, which is this:
SELECT (SELECT AGG(el) FROM UNNEST(array_column) AS el)
FROM t
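The subquery strategy quoted above can be sketched as a small string template. This is only an illustration, not how the PR generates SQL: the helper `array_agg_sql` and its parameters are hypothetical, the explicit `AS el` alias is included so that `el` is a defined name, and no identifier quoting or validation is attempted.

```python
def array_agg_sql(agg: str, array_column: str, table: str, alias: str = "el") -> str:
    """Render a scalar subquery that aggregates over an unnested array.

    Hypothetical helper for illustration only: identifiers are interpolated
    as-is, so it must not be used with untrusted input.
    """
    return (
        f"SELECT (SELECT {agg}({alias}) FROM UNNEST({array_column}) AS {alias})\n"
        f"FROM {table}"
    )


if __name__ == "__main__":
    # Show the rendered query for a MAX aggregation over each row's array.
    print(array_agg_sql("MAX", "array_column", "t"))
```

Because the inner SELECT is an ordinary aggregation over the unnested elements, it picks up whatever semantics the backend already gives the columnar aggregate.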
we were missing BooleanColumn.min, .max, etc
this is getting handled in the global config file
fixes #7073
Supported backends: