Implement string_concat kernel #1540

Closed
alamb opened this issue Apr 11, 2022 · 0 comments · Fixed by #1720
Labels
arrow (Changes to the arrow crate), enhancement (Any new improvement worthy of a entry in the changelog), good first issue (Good for newcomers)

Comments

alamb commented Apr 11, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We currently have a concat kernel at https://docs.rs/arrow/11.1.0/arrow/compute/kernels/concat/index.html

This concatenates whole Arrays end to end into a single Array.

There is also a need to concatenate strings together: something like string_concat that takes two StringArrays (or LargeStringArrays) and concatenates them element by element.
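To make the distinction concrete, here is a minimal std-only sketch that contrasts the two operations on plain slices of `Option<&str>`. The names `concat_arrays` and `string_concat_elementwise` are hypothetical and are not part of the arrow crate; the null handling (a null on either side yields a null) is an assumption taken from the example use in this issue.

```rust
/// Array-level concat: appends `right` after `left`, so the output length
/// is the sum of the input lengths.
fn concat_arrays<'a>(
    left: &[Option<&'a str>],
    right: &[Option<&'a str>],
) -> Vec<Option<&'a str>> {
    left.iter().chain(right.iter()).copied().collect()
}

/// Element-wise string concat: joins the strings found at each index, so the
/// output has the same length as the inputs.
/// Assumption: a null on either side yields a null in the output.
fn string_concat_elementwise(
    left: &[Option<&str>],
    right: &[Option<&str>],
) -> Result<Vec<Option<String>>, String> {
    if left.len() != right.len() {
        return Err(format!(
            "length mismatch: {} vs {}",
            left.len(),
            right.len()
        ));
    }
    Ok(left
        .iter()
        .zip(right.iter())
        .map(|(l, r)| match (l, r) {
            (Some(l), Some(r)) => Some(format!("{}{}", l, r)),
            _ => None,
        })
        .collect())
}
```

This version allocates one `String` per row, which is the kind of overhead a dedicated kernel would try to avoid.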

DataFusion has an implementation of string concat here: https://github.com/apache/arrow-datafusion/blob/28a6da3d2d175eb9d2f4ff8a6ea58e7c22dae97c/datafusion/physical-expr/src/expressions/binary.rs#L422 which @WinkerDu has kindly been improving.

Describe the solution you'd like
I suggest adding an optimized string_concat kernel in arrow-rs. @Dandandan outlines some good first optimizations here: apache/datafusion#2183 (comment)

The signature would be something like:

pub fn string_concat<OffsetSize: StringOffsetSizeTrait>(
    left: &GenericStringArray<OffsetSize>, 
    right: &GenericStringArray<OffsetSize>
) -> Result<GenericStringArray<OffsetSize>>
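As a rough illustration of what such a kernel might do internally, the sketch below works on a simplified offsets-plus-values layout and writes the output bytes into a single preallocated buffer. `SimpleStringArray` and `string_concat_sketch` are hypothetical stand-ins written with the standard library only; they are not the arrow-rs types, and this is not necessarily the set of optimizations referred to above.

```rust
/// Hypothetical, simplified stand-in for a string array: one contiguous byte
/// buffer, `len + 1` offsets into it, and one validity flag per slot.
/// (The real Arrow layout packs validity into a bitmap.)
#[derive(Debug, PartialEq)]
struct SimpleStringArray {
    offsets: Vec<usize>,
    values: Vec<u8>,
    valid: Vec<bool>,
}

impl SimpleStringArray {
    fn from_options(items: &[Option<&str>]) -> Self {
        let mut offsets = Vec::with_capacity(items.len() + 1);
        let mut values = Vec::new();
        let mut valid = Vec::with_capacity(items.len());
        offsets.push(0);
        for item in items {
            if let Some(s) = item {
                values.extend_from_slice(s.as_bytes());
            }
            valid.push(item.is_some());
            offsets.push(values.len());
        }
        Self { offsets, values, valid }
    }

    fn len(&self) -> usize {
        self.valid.len()
    }

    /// Raw bytes stored for slot `i` (empty for null slots).
    fn bytes(&self, i: usize) -> &[u8] {
        &self.values[self.offsets[i]..self.offsets[i + 1]]
    }

    fn get(&self, i: usize) -> Option<&str> {
        if self.valid[i] {
            // Every slot holds bytes copied from valid `&str` data.
            Some(std::str::from_utf8(self.bytes(i)).unwrap())
        } else {
            None
        }
    }
}

/// Element-wise concat that copies bytes straight into one output buffer
/// instead of building an intermediate `String` per row.
/// Assumption: a null on either side yields a null in the output.
fn string_concat_sketch(
    left: &SimpleStringArray,
    right: &SimpleStringArray,
) -> Result<SimpleStringArray, String> {
    if left.len() != right.len() {
        return Err(format!(
            "length mismatch: {} vs {}",
            left.len(),
            right.len()
        ));
    }
    // Upper bound on the output size, so the value buffer is allocated once.
    let mut values = Vec::with_capacity(left.values.len() + right.values.len());
    let mut offsets = Vec::with_capacity(left.len() + 1);
    let mut valid = Vec::with_capacity(left.len());
    offsets.push(0);
    for i in 0..left.len() {
        let is_valid = left.valid[i] && right.valid[i];
        if is_valid {
            values.extend_from_slice(left.bytes(i));
            values.extend_from_slice(right.bytes(i));
        }
        valid.push(is_valid);
        offsets.push(values.len());
    }
    Ok(SimpleStringArray { offsets, values, valid })
}
```

In this sketch the output validity is the logical AND of the two input validities, and null rows contribute no bytes, so consecutive offsets simply repeat for them.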

An example use:

    let left = [Some("foo"), Some("bar"), None].into_iter().collect::<StringArray>();
    let right = [None, Some("yyy"), Some("zzz")].into_iter().collect::<StringArray>();

    // Inputs are passed by reference, matching the proposed signature
    let res = string_concat(&left, &right).unwrap();

    // A null on either side produces a null in the output
    let expected = [None, Some("baryyy"), None].into_iter().collect::<StringArray>();
    assert_eq!(res, expected);


alamb added the enhancement, arrow, and good first issue labels Apr 11, 2022