GH-35122: [C++] Add support for scalars in run_end_encode and run_end_decode #35123
Conversation
|
@jorisvandenbossche @benibus @zeroshade @westonpace is it OK/expected for/from vector functions to handle scalar inputs and always return an array? Because that's what I'm doing here. |
|
As far as I know, vector functions aren't used anywhere internally beyond CallFunction, so if it runs, then I think it's ok. |
|
Checking some other vector functions by passing them a scalar, there are some that raise a NotImplementedError (e.g. sort_indices, rank) and some that return an array (e.g. replace_with_mask, cumulative_sum, indices_nonzero). |
|
@jorisvandenbossche I'm now a bit unsure about this PR because in the context I tried to use this implementation, I would need to return an array expanded from the Scalar that had length equal to the other columns in the Think |
|
Perhaps a related question. Why are run end encode/decode vector functions in the first place? Wouldn't they be scalar functions? |
I just continued the work of Tobias and didn't put much thought into what they should have been. I don't see any reason for them not to be scalar functions. |
|
While they follow the typical requirement of a scalar function, namely the same shape for input and output, the kernel is not strictly speaking scalar because it is not independently element-wise: the order of the values in the input array impacts the exact result array. You could argue that the exact run_ends/values are an implementation detail, as long as they still "logically" represent the same values. |
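The order dependence described above can be illustrated with a minimal sketch in plain Python. This is not Arrow's kernel; the helper names `run_end_encode` and `run_end_decode` below are stand-ins written for this example, and they assume run ends are stored as cumulative, exclusive end offsets.

```python
from itertools import groupby


def run_end_encode(values):
    """Illustrative encoder: returns (run_ends, run_values) for a list."""
    run_ends, run_values = [], []
    end = 0
    for value, group in groupby(values):
        # Each group is a maximal run of consecutive equal values.
        end += sum(1 for _ in group)
        run_ends.append(end)
        run_values.append(value)
    return run_ends, run_values


def run_end_decode(run_ends, run_values):
    """Illustrative decoder: expands runs back into a flat list."""
    out = []
    start = 0
    for end, value in zip(run_ends, run_values):
        out.extend([value] * (end - start))
        start = end
    return out


# Same multiset of values, different order.
sorted_input = [1, 1, 2, 2]
shuffled_input = [1, 2, 1, 2]

sorted_encoded = run_end_encode(sorted_input)      # ([2, 4], [1, 2])
shuffled_encoded = run_end_encode(shuffled_input)  # ([1, 2, 3, 4], [1, 2, 1, 2])
```

The physical encoding of an element depends on its neighbours, which an element-wise (scalar) kernel would not allow, yet decoding either result gives back the original logical values.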
|
Fair points. I don't think I know the function registry well enough to answer. |
|
@jorisvandenbossche I think the requirements you listed make a good case for this not being a scalar function. I will close the PR because of that and because of my failed attempt at using this as a function that takes a scalar. |
Rationale for this change
To make run_end_encode and run_end_decode usable in all contexts, they should gracefully handle Scalar inputs.
What changes are included in this PR?
run_end_encode turn a regular Scalar into a RUN_END_ENCODED array.
run_end_decode turn a RUN_END_ENCODED [...]
Are these changes tested?
Yes, by unit tests.
Are there any user-facing changes?
No.
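As a rough illustration of what encoding a scalar could look like, here is a plain Python sketch under the assumption that a scalar maps to a single run. The function names and the `length` parameter are hypothetical and are not part of the Arrow API; `length` only reflects the need raised in the conversation for an array expanded to the length of the other columns.

```python
def encode_scalar(value, length=1):
    """Hypothetical helper: represent a scalar as one run of `length` values."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # One run covering positions [0, length): a single run end, a single value.
    return [length], [value]


def decode_runs(run_ends, run_values):
    """Hypothetical helper: expand (run_ends, run_values) into a flat list."""
    out = []
    start = 0
    for end, value in zip(run_ends, run_values):
        out.extend([value] * (end - start))
        start = end
    return out


single = encode_scalar(7)               # one run of length 1
broadcast = encode_scalar(7, length=4)  # one run of length 4
```

In this sketch the encoded form stays the same size regardless of `length`; only the single run end changes, so expanding a scalar to match other columns costs nothing until it is decoded.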