[Go][Compute] Add kernel for "unique" function #34171

Closed
zeroshade opened this issue Feb 13, 2023 · 0 comments · Fixed by #34172

Comments

@zeroshade
Member

Describe the enhancement requested

Following up on #33466: to handle dictionary arrays to and from Parquet directly and efficiently, without expanding them, we first need a "unique" compute kernel so that statistics such as min/max can be computed easily on the dictionary. In theory a dedicated "min_max" kernel would be more efficient, but it would be more involved and take longer to implement at the moment.

In the meantime, it's still useful to be able to compute a uniquified version of an Arrow array.
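
To illustrate the idea only, here is a minimal standard-library Go sketch of deduplicating values and then deriving min/max from the deduplicated set. It does not use the Arrow Go compute API and is not the kernel's implementation; `uniqueInOrder` and `minMax` are hypothetical helper names, and keeping values in first-appearance order is an assumption of this sketch.

```go
package main

import "fmt"

// uniqueInOrder is a hypothetical helper (not an Arrow API). It returns the
// distinct values of vals, keeping the order in which each first appears.
func uniqueInOrder[T comparable](vals []T) []T {
	seen := make(map[T]struct{}, len(vals))
	out := make([]T, 0, len(vals))
	for _, v := range vals {
		if _, ok := seen[v]; ok {
			continue // already emitted this value
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

// minMax is a hypothetical helper that scans a slice once and reports its
// smallest and largest values. ok is false when the slice is empty.
func minMax(vals []int64) (lo, hi int64, ok bool) {
	if len(vals) == 0 {
		return 0, 0, false
	}
	lo, hi = vals[0], vals[0]
	for _, v := range vals[1:] {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	return lo, hi, true
}

func main() {
	// Stand-in for the values of an array; a real Arrow array is not used here.
	values := []int64{30, 10, 30, 20, 10}

	dict := uniqueInOrder(values) // the "uniquified" values
	lo, hi, _ := minMax(dict)     // statistics computed on the unique set

	fmt.Println(dict, lo, hi) // prints "[30 10 20] 10 30"
}
```

Computing min/max over the unique values gives the same result as scanning the full array, but only touches each distinct value once, which is the motivation stated above for using "unique" as a stepping stone.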

Component(s)

Go

@zeroshade zeroshade self-assigned this Feb 13, 2023
zeroshade added a commit that referenced this issue Feb 14, 2023
### Rationale for this change

Implements a kernel that computes the "unique" values in an Arrow array, primarily for use in solving #33466.

### What changes are included in this PR?
Adds a "unique" function to the list of compute functions, along with convenience helper functions.

### Are these changes tested?
Yes, unit tests are included.

### Are there any user-facing changes?
Just the new available functions.

* Closes: #34171

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
@zeroshade zeroshade added this to the 12.0.0 milestone Feb 14, 2023
gringasalpastor pushed a commit to gringasalpastor/arrow that referenced this issue Feb 17, 2023
fatemehp pushed a commit to fatemehp/arrow that referenced this issue Feb 24, 2023