Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect comment on arrow::compute::kernels::sort::sort_to_indices #5029

Closed
westonpace opened this issue Nov 2, 2023 · 2 comments · Fixed by #5033
Closed

Incorrect comment on arrow::compute::kernels::sort::sort_to_indices #5029

westonpace opened this issue Nov 2, 2023 · 2 comments · Fixed by #5033
Labels
arrow Changes to the arrow crate bug documentation Improvements or additions to documentation good first issue Good for newcomers help wanted

Comments

@westonpace
Copy link
Member

Describe the bug
The docs for arrow::compute::kernels::sort::sort state:

Floats are sorted using IEEE 754 totalOrder

The docs for arrow::compute::kernels::sort::sort_to_indices state:

For floating point arrays any NaN values are considered to be greater than any other non-null value

These are different, which I suppose is technically ok (though not very helpful). However, in practice, it seems that they both use total order.

To Reproduce

use std::sync::Arc;
use arrow_array::{ArrayRef, Float32Array};
pub fn main() {
    let pos_nan = f32::NAN;
    let neg_nan = -pos_nan;
    let zero = 0_f32;

    let vals: ArrayRef = Arc::new(Float32Array::from_iter_values(
        [zero, neg_nan, pos_nan].iter().copied(),
    ));
    // Prints NaN, 0.0, NaN -- matches the comment
    dbg!(arrow::compute::kernels::sort::sort(vals.as_ref(), None).unwrap());
    // Prints 1, 0, 2 -- matches total order, but not the comment
    dbg!(arrow::compute::kernels::sort::sort_to_indices(vals.as_ref(), None, None).unwrap());
}

Expected behavior
My personal preference would be to update the comment to reflect that both sorting methods use total order. If we can confirm that is the intention I can create a PR.

Additional context
N/A

@westonpace westonpace added the bug label Nov 2, 2023
@tustvold tustvold added documentation Improvements or additions to documentation good first issue Good for newcomers help wanted labels Nov 2, 2023
@tustvold
Copy link
Contributor

tustvold commented Nov 2, 2023

As you rightly identify the docs on sort_to_indices are out of date and should be updated

@tustvold
Copy link
Contributor

tustvold commented Nov 7, 2023

label_issue.py automatically added labels {'arrow'} from #5033

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate bug documentation Improvements or additions to documentation good first issue Good for newcomers help wanted
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants