ARROW-10864: [Rust] Use standard ordering for floats #8882
Codecov Report
@@ Coverage Diff @@
## master #8882 +/- ##
==========================================
- Coverage 82.60% 82.60% -0.01%
==========================================
Files 204 204
Lines 50189 50177 -12
==========================================
- Hits 41459 41448 -11
+ Misses 8730 8729 -1
Continue to review full report at Codecov.
@Dandandan what more work do you still need to do on this PR?
Thanks @nevi-me for reminding me. I updated and finalized the PR.
LGTM, needs rebase though :)
I've gone through the totalOrder logic, looks the same as what I've seen on the net.
I also saw this old RFC (rust-lang/rfcs#1249), which would be useful: by implementing Ord on floats, it could potentially give us totalOrder in the standard library.
@nevi-me yes, that is the info I also found. There are also nightly implementations of |
@nevi-me rebased, but it looks like the needs-rebase label hasn't been removed.
@Dandandan there is another implementation of float comparison in use for lexicographical ordering (sorting by multiple columns) in Ideally the In my understanding, the only difference in behaviour should be around negative NaN and that small difference shouldn't block this PR. |
I will try and look at this PR later today.
I read the code and verified we had test coverage for sorting of floating point nulls:
https://github.com/apache/arrow/blob/master/rust/arrow/src/compute/kernels/sort.rs#L1177
I vote Nice work @Dandandan
} else {
    unreachable!("Partition by nan is only applicable to float types")
}
// sorts f32 in IEEE 754 total ordering
It might be good here to provide a pointer to the original source implementation (e.g. https://doc.rust-lang.org/std/primitive.f64.html#method.total_cmp) and a TODO to switch to that API once it is stabilized.
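For readers following along, here is a minimal sketch of the bit-manipulation trick that a totalOrder comparison for `f32` typically relies on. It is modeled on the approach documented for the standard library's `total_cmp`, not copied from this PR's diff; the names `total_cmp_f32` and `sort_total_f32` are hypothetical and exist only for illustration.

```rust
use std::cmp::Ordering;

/// Hypothetical helper (not the PR's actual code): compares two `f32`
/// values following IEEE 754 totalOrder.
pub fn total_cmp_f32(l: f32, r: f32) -> Ordering {
    // Reinterpret the raw bits as signed integers.
    let mut left = l.to_bits() as i32;
    let mut right = r.to_bits() as i32;

    // For values with the sign bit set, flip every bit except the sign bit.
    // `x >> 31` is an arithmetic shift, so it yields all ones for negative
    // inputs and zero otherwise; shifting that mask right by one (as u32)
    // clears the sign bit of the mask. After the XOR, two's-complement
    // integer order matches the float total order.
    left ^= (((left >> 31) as u32) >> 1) as i32;
    right ^= (((right >> 31) as u32) >> 1) as i32;

    left.cmp(&right)
}

/// Hypothetical helper: sorts a slice in place with the comparator above.
pub fn sort_total_f32(values: &mut [f32]) {
    values.sort_by(|a, b| total_cmp_f32(*a, *b));
}
```

With this ordering, a NaN with the sign bit set sorts before negative infinity, `-0.0` sorts before `0.0`, and a NaN with the sign bit clear sorts after positive infinity, so no separate NaN partitioning step is needed. On toolchains where `f32::total_cmp` is stable (Rust 1.62 and later, to my knowledge), the standard method should be preferred over a hand-rolled version.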
I added comments here: #9129
…algorithm came from Follow up from apache/arrow#8882 Closes #9129 from alamb/alamb/doc-improvement Authored-by: Andrew Lamb <andrew@nerdnetworks.org> Signed-off-by: Neville Dipale <nevilledips@gmail.com>
This implements ordering using IEEE 754 ordering as mentioned in this discussion (only for sort now). I think this simplifies NaN-handling quite a bit. Performance-wise doesn't seem to be a big difference. apache#8685 Closes apache#8882 from Dandandan/float_order Authored-by: Heres, Daniel <danielheres@gmail.com> Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
This implements ordering using IEEE 754 ordering as mentioned in this discussion (only for sort now).
I think this simplifies NaN-handling quite a bit. Performance-wise, there doesn't seem to be a big difference.
#8685
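To illustrate why a total order simplifies NaN handling, the sketch below contrasts two ways of sorting `f64` values. Both function names are hypothetical, and the partition-based variant is only an assumption about what a "partition by NaN" style sort might look like, not the code this PR removed. The total-order variant uses `f64::total_cmp`, which was nightly-only when this PR was written and is stable in Rust 1.62 and later, to my knowledge.

```rust
/// Hypothetical helper: one possible shape of a NaN-partitioning sort.
/// NaNs are split off first because `partial_cmp` returns `None` for them,
/// then appended after the sorted non-NaN values.
pub fn sort_floats_partitioned(values: &mut Vec<f64>) {
    let (mut rest, nans): (Vec<f64>, Vec<f64>) =
        values.iter().copied().partition(|v| !v.is_nan());
    // Safe to unwrap: no NaN is left in `rest`.
    rest.sort_by(|a, b| a.partial_cmp(b).unwrap());
    rest.extend(nans);
    *values = rest;
}

/// Hypothetical helper: the same sort expressed with a total order.
/// No special case for NaN is required.
pub fn sort_floats_total(values: &mut [f64]) {
    values.sort_by(|a, b| a.total_cmp(b));
}
```

For inputs whose NaNs all have the sign bit clear, both variants place the NaNs last. They differ for NaNs with the sign bit set, which the total order places first, and for signed zeros, which `partial_cmp` treats as equal while the total order puts `-0.0` before `0.0`.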