New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-17523: [C++] Add support to substrait function is_null, is_not_null and count #13969
Conversation
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename pull request title in the following format?
or
See also: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution! is_null
and is_not_null
look good to me. I think there may be a slight issue with count
but the issue exists with sum
and avg
as well so I think I would be ok with proceeding with this PR as-is and adding this to the list of functions to fix in ARROW-17534.
{{kSubstraitAggregateGenericFunctionsUri, "count"}, | ||
{"[1, 2, 30]"}, | ||
{int8()}, | ||
"[3]", | ||
"[2, 1]", | ||
int64()}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the count function takes in a single optional enum argument. I have no idea what the purpose of this argument is. However, in Substrait, "optional arguments" should still be included in the protobuf.
I realize now that I did not account for this in the other aggregate functions too :(.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm new to substrait. Maybe it means the enumeration arguments.
https://github.com/substrait-io/substrait/blob/main/text/simple_extensions_schema.yaml#L94
And here may have some clue https://substrait.io/expressions/scalar_functions/#optional-enumeration-properties
The lint error appears valid as well. Please run clang-format on the code. |
Fixed code style. |
Thanks @jinchengchenghh . I updated the title of the PR to conform to our style and I changed the description a little (since this is used for the git commit) and merged. |
|
Benchmark runs are scheduled for baseline = e48afd6 and contender = 74dae61. 74dae61 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
['Python', 'R'] benchmarks have high level of regressions. |
…null and count (apache#13969) Add Substrait function mapping for is_null, is_not_null, and count Authored-by: Chengcheng Jin <chengcheng.jin@intel.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
Thanks! It's ok. |
…null and count (apache#13969) Add Substrait function mapping for is_null, is_not_null, and count Authored-by: Chengcheng Jin <chengcheng.jin@intel.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
…null and count (apache#13969) Add Substrait function mapping for is_null, is_not_null, and count Authored-by: Chengcheng Jin <chengcheng.jin@intel.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
Add Substrait function mapping for is_null, is_not_null, and count