-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make sort by multiple columns parallel #3549
Conversation
} | ||
} else { | ||
// no floats, so we can compare unchecked | ||
unsafe { a.partial_cmp(b).unwrap_unchecked() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this means IsFloat
should be an unsafe trait (unsafe to implement). Otherwise safe user code can implement it incorrectly on some local type (on which they also implement PartialOrd), leading to a soundness issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. We should make that trait unsafe indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation could also help understand what IsFloat
means.
For example, I could define a local type for 2D points on the integer grid (struct Point { x: i32, y: i32 }
). This struct could have a partial ordering implementation (a > b
iff a.x > b.x && a.y > b.y
), returning None
in some cases.
Implementing IsFloat
sounds easy in that case (it's not a float), but even then we shouldn't be implementing this trait at all here, since there is no easy way to properly behave in this sort function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not meant to be public. It's only for polars physical types.
It is public because its in one of our subcrates, but any user implementation can never be used here. Because we only get here via polars physical types.
But for correctness I will make it an unsafe trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it actually impossible to call this method on user-defined types?
PolarsNumericType
can be implemented on local types. I haven't tried yet, but is it really impossible to build a ChunkedArray<LocalType>
(or a ChunkedArray<Option<LocalType>>
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe you want to "seal" these traits to make sure they cannot be implemented for local types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes that would indeed be an elegant solution for this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PolarsNumericType can be implemented on local types. I haven't tried yet, but is it really impossible to build a ChunkedArray (or a ChunkedArray<Option>)?
Yes, that would blow up very quickly. You would not get your type into arrow arrays.
@@ -20,6 +20,8 @@ impl IsFloat for u8 {} | |||
impl IsFloat for u16 {} | |||
impl IsFloat for u32 {} | |||
impl IsFloat for u64 {} | |||
impl IsFloat for &str {} | |||
impl<T: IsFloat> IsFloat for Option<T> {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation is dangerous, as users could implement PartialOrd for Option<LocalType>
in ways that disagree with this.
For example, a user could have a float wrapper struct LocalType(f32)
, have impl PartialOrd for LocalType
by just forwarding the call to the wrapped float, and correctly implement IsFloat for LocalType
. This code would look totally fine under review.
They could also (arguably correctly) implement PartialOrd for Option<LocalType>
by returning None
if either of the two compared values is None
. Here again, nothing looks unusual during an audit.
Still, this would lead to a possible soundness issue if attempting to sort values of type Option<LocalType>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will make it an unsafe trait. For all types this is intended to be used for this is correct.
closes #3537