Following up on #33466: to handle dictionary arrays to and from Parquet directly and efficiently, without expanding them out, we first need a compute "unique" function kernel so that statistics on the dictionary, such as max/min, are easy to compute. In theory a dedicated "min_max" kernel would be more efficient for this, but it would take longer and be more involved to create at the moment.
In the meantime, it's still useful to be able to compute a uniquified version of an Arrow array.
Component(s)
Go
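To make the idea concrete, here is a minimal sketch of what "unique, then min/max" means, written against plain Go slices rather than Arrow arrays. The helper names `uniqueInt64` and `minMaxInt64` are hypothetical and are not part of the Arrow Go compute package; the sketch also assumes the result should keep values in order of first appearance, which the issue does not specify.

```go
package main

import "fmt"

// uniqueInt64 is a hypothetical helper (not the Arrow kernel). It returns the
// distinct values of in, in order of first appearance, using a hash set.
func uniqueInt64(in []int64) []int64 {
	seen := make(map[int64]struct{}, len(in))
	out := make([]int64, 0, len(in))
	for _, v := range in {
		if _, dup := seen[v]; dup {
			continue // already emitted this value
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

// minMaxInt64 scans values once and returns the smallest and largest value.
// ok is false when values is empty, since no statistics exist in that case.
func minMaxInt64(values []int64) (lo, hi int64, ok bool) {
	if len(values) == 0 {
		return 0, 0, false
	}
	lo, hi = values[0], values[0]
	for _, v := range values[1:] {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	return lo, hi, true
}

func main() {
	data := []int64{7, 3, 7, 9, 3, 3, 1}
	uniq := uniqueInt64(data)
	lo, hi, _ := minMaxInt64(uniq)
	fmt.Println(uniq, lo, hi) // prints: [7 3 9 1] 1 9
}
```

The min/max scan runs over the deduplicated values only, which is the reason a "unique" step helps for statistics when the input has many repeats. A dedicated "min_max" kernel would skip the intermediate set entirely, which matches the trade-off described above.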
### Rationale for this change
Implementing a kernel for computing the "unique" values in an arrow array, primarily for use in solving #33466.
### What changes are included in this PR?
Adds a "unique" function to the compute list and helper convenience functions.
### Are these changes tested?
Yes, unit tests are included.
### Are there any user-facing changes?
Just the new available functions.
* Closes: #34171
Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
fatemehp pushed a commit to fatemehp/arrow that referenced this issue on Feb 24, 2023