Inline is_valid calls in PrimitiveIter/BooleanIter #1857

Open
jhorstmann opened this issue Jun 13, 2022 · 1 comment
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

@jhorstmann
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Briefly discussed in #1048 (comment)

The calls to is_valid do not seem to get inlined in PrimitiveIter and some other places, leading to slower than optimal performance.

Describe the solution you'd like

Mark ArrayData::is_valid/is_null as inline and also add is_valid_unchecked methods for use in trusted code.
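A minimal sketch of what the proposed pair of methods could look like. `ValidityBitmap` is a hypothetical stand-in used only for illustration, not the actual `ArrayData` type, and it assumes a least-significant-bit-first bitmap with one bit per element:

```rust
/// Hypothetical stand-in for a validity bitmap (not the arrow crate's type).
pub struct ValidityBitmap {
    bits: Vec<u8>, // assumed layout: LSB-first, one bit per element
    len: usize,    // number of elements covered by the bitmap
}

impl ValidityBitmap {
    pub fn new(bits: Vec<u8>, len: usize) -> Self {
        assert!(len <= bits.len() * 8, "bitmap too short for len");
        Self { bits, len }
    }

    /// Bounds-checked lookup. `#[inline]` makes the body available for
    /// inlining into callers in other crates.
    #[inline]
    pub fn is_valid(&self, i: usize) -> bool {
        assert!(i < self.len, "index out of bounds");
        // SAFETY: `i < self.len <= self.bits.len() * 8` was checked above.
        unsafe { self.is_valid_unchecked(i) }
    }

    /// Lookup without bounds checks, for use in trusted code.
    ///
    /// # Safety
    /// The caller must guarantee `i < self.len`.
    #[inline]
    pub unsafe fn is_valid_unchecked(&self, i: usize) -> bool {
        let byte = unsafe { *self.bits.get_unchecked(i >> 3) };
        (byte & (1 << (i & 7))) != 0
    }

    #[inline]
    pub fn is_null(&self, i: usize) -> bool {
        !self.is_valid(i)
    }
}
```

An iterator that already knows its index is in range could call `is_valid_unchecked` directly and skip the assertion on every element.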

Could also try using a ScalarBuffer from #1825 and a separate bitmap instead of a PrimitiveArray field in the iterator.

Describe alternatives you've considered

Additional context

Ideally I'd like code like the following to be completely inlined and basically be equivalent to a memcpy:

use arrow::array::Int32Array;

// Copy the array into `output`, writing 0 for null slots.
fn slice_from_array(a: &Int32Array, output: &mut [i32]) {
    output.iter_mut().zip(a.iter()).for_each(|(out, x)| {
        *out = x.unwrap_or(0);
    });
}

That would require the compiler to move the check for presence of a null buffer outside of the loop and generate two optimized loops, one for nullable and one for non-null arrays.
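For illustration, the two loops the compiler would ideally produce can be written out by hand. This is a sketch on plain slices rather than on `Int32Array`; the function name `copy_or_zero` and the `Option<&[u8]>` bitmap parameter (assumed LSB-first, `None` meaning no null buffer) are hypothetical:

```rust
/// Copies `values` into `output`, writing 0 for null slots.
/// `validity` is a hypothetical LSB-first bitmap; `None` means no nulls.
pub fn copy_or_zero(values: &[i32], validity: Option<&[u8]>, output: &mut [i32]) {
    let n = values.len().min(output.len());
    // The null-buffer check happens once, outside both loops.
    match validity {
        // No null buffer: a plain slice copy, which can lower to memcpy.
        None => output[..n].copy_from_slice(&values[..n]),
        // Null buffer present: test one validity bit per element.
        Some(bits) => {
            assert!(bits.len() * 8 >= n, "validity bitmap too short");
            for (i, (out, &v)) in output[..n].iter_mut().zip(&values[..n]).enumerate() {
                let valid = (bits[i >> 3] & (1 << (i & 7))) != 0;
                *out = if valid { v } else { 0 };
            }
        }
    }
}
```

The goal of the issue is for the iterator-based version to compile to roughly this shape without the branch being written manually.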

@jhorstmann jhorstmann added the enhancement Any new improvement worthy of a entry in the changelog label Jun 13, 2022
@jhorstmann jhorstmann changed the title Inline is_valid calls in PrimitiveIter Inline is_valid calls in PrimitiveIter/BooleanIter Jul 15, 2022
@jhorstmann
Contributor Author

In a simplified example, LLVM is able to completely unswitch a loop based on the presence of a validity bitmap and generate two separately optimized versions of a kernel: https://rust.godbolt.org/z/sxhsh3h7G
