-
Notifications
You must be signed in to change notification settings - Fork 17.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/exp/slices: unexpected pdqsort behavior #52518
Comments
Maybe following remark is helpful: The original pdqsort assumes strict weak ordering as std::sort in C++. NaNs violate this assumption. The developer of pdqsort, Orson Peters, explains it here: https://youtu.be/jz-PBiWwNjc?t=826. |
@ulikunitz Thanks! For For the custom comparison function(both For It seems that disable this optimization can keep our previous compatibility? (some users rely on this behavior, for example, using custom functions to sort float64 slices), but it also will reduce the performance when the slice contains many duplicate elements. |
It's already the case (before pdqsort) that both So I think we can leave everything as It would certainly be worth adding a comment to |
CC @eliben |
Yes! The pdqsort now have adjacent non-NaN values that are sorted incorrectly when sorting a slice that contains NaNs. Added a warning in both Agree the idea that we can just leave everything as |
Change https://go.dev/cl/402534 mentions this issue: |
Thanks for handling this, @zhangyunhao116 ! |
Open this issue to discuss pdqsort behavior. cc @randall77 , please take a look, thanks!
{{GreaterOrEqual}}
instead of!{{Less}}
After we used
go run gen_sort_variants.go -generic
to generate code for packageslices
, we found that the slices test will fail in float64 slice(slices/sort_test.go:TestSortFloat64Slice
).The reason for this error is that in pdqsort we use
if a > 0 && !{{Less "data" "a-1" "pivot"}}
to detect if we selected the same pivot twice for partition, which is fine in most cases. But inslices/zsortordered.go
(generated codes), this line will beif a > 0 && !(data[a-1]<data[pivot])
, whendata[pivot]
ordata[a-1]
is NaN,!(data[a-1]<data[pivot])
is always true, resulting in incorrect algorithm behavior(mistakenly believe thatdata[a-1]
equalsdata[pivot]
).The fix is PatchSet18 at CL 371574, we can see the diff in https://go-review.googlesource.com/c/go/+/371574/17..18. In this patch, we use
{{GreaterOrEqual}}
instead of!{{Less}}
, for other files,{{GreaterOrEqual}}
is!{{Less}}
(keep the same as before), forslices/zsortordered.go
, the generated code will beif a > 0 && (data[a-1]>=data[pivot])
, which can solve the problem. This change should only work forzsortordered.go:pdqsort
, but it still changes StableSort's code due to global replacement, so we need to fix this in the new CL.The reason for these problems is that we use
if a > 0 && !{{Less "data" "a-1" "pivot"}}
in pdqsort to determine whether there are duplicate pivot selections (data [a-1]
must be less than or equal to pivot if exists). If this mechanism is removed, we can use!Less
instead of{{GreaterOrEqual}}
in all cases. I'm still not sure if there is a better way to implement this mechanism.BTW, if we use different
maxInsertion
in sort and slices as discussed earlier, it will also lead to inconsistent algorithm behavior, does this mean that we can't use differentmaxInsertion
in different versions?The text was updated successfully, but these errors were encountered: