-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: intrinsify strings.Compare #61725
Comments
@golang/compiler |
Can we bring back this removed optimization? I'm hitting the same issue. |
Another vote for something that recognizes They seem to have been written in such a way that the optimizer can throw away all the But Or should we also document |
Change https://go.dev/cl/532195 mentions this issue: |
Good. Yet, I'm a bit sad that this means giving up on It's a bit stunning to, for years, take the stance that you don't really need three-way compares, then build an entire 2nd search/sort library around three-way compares, and then fail to optimize the handful There are cases where three-way compares are not something to avoid, and leaving |
If we're going to give up and not optimize string comparisons, and if we can't intrinsify/specialize func Compare[T Ordered](x, y T) int {
if x == y {
return 0
}
if x < y {
return -1
}
if isNaN(x) {
if isNaN(y) {
return 0
}
return -1
}
return +1
} It's slightly worse for ints, slightly better for floats (both hard to measure), and better for strings (mostly equivalent to the previous This is sorting a 10 million random element slice, with
Edit: what would be a good benchmark so I could send this CL in? |
Change https://go.dev/cl/531997 mentions this issue: |
With the existing type Order struct {
Product string
Customer string
Price float64
}
orders := []Order{ /* ... */ }
// ORDER BY Customer ASC, Product ASC, Price DESC
slices.SortFunc(orders, func(a, b Order) int {
return cmp.Or(
strings.Compare(a.Customer, b.Customer), // could also be cmp.Compare if it can be equally optimized
strings.Compare(a.Product, b.Product),
cmp.Compare(b.Price, a.Price),
)
}) |
I've updated examples to use |
I imagine you guys are fond of It's, curious that http://go.dev/cl/532195 changes PS: someone should go change package |
I just forgot to update the benchmark and didn't mean anything else. I can revert my change to not use |
I'm all for optimizing But I disagree with having package We go from one extreme (dissuading users from using |
I agree that since cmp.Or isn’t short circuiting, it is a little weird to use it with strings.Compare to try to eke out a little performance that you just threw away. I still think it’s a pretty elegant way to write a struct comparison function though. |
@dr2chase , this may be an interesting thing to think about with the pure function analysis you've been experimenting with. If we knew or could tell that strings.Compare/cmp.Compare are pure, could we effectively compile a call to cmp.Or with a bunch of comparison arguments as a short circuited operation? Or maybe that whole approach is too implicit. |
To summarize the situation: Go does not compare strings in an optimal way. This is especially visible when sorting. As of now, the documentation in the
|
We should make both as fast as possible, and stop recommending people use either because of performance-over-clarity. |
[...] I also want to add that I see some platform-dependent performance here; taking the correct string sorting benchmark from stdlib, I get better performance for |
I suspect that this happens because the sorting implementation in Directly comparing
Sorting strings is slightly slower then sorting strings represented as Adding a common prefix to all strings / byte slices increases the perf. difference:
|
A better benchmark for this issue may be:
The
|
This should no longer be true after https://go-review.googlesource.com/c/go/+/503795 |
It still compares them twice, and none of those 2 takes advantage of length being different to short circuit. I tried to address that in https://go.dev/cl/531997 but results are inconsistent. |
My point is that |
Note that, since toolchain v1.19, for unknown reasons, slice related benchmarks are often slice length sensitive. So it is best to use several different slices lengths in benchmarks. |
Change https://go.dev/cl/578835 mentions this issue: |
The results of cmpstring are reuseable if the second call has the same arguments and memory. Note that this gets rid of cmpstring, but we still generate a redundant </<= test and branch afterwards, because the compiler doesn't know that cmpstring only ever returns -1,0,1. Update #61725 Change-Id: I93a0d1ccca50d90b1e1a888240ffb75a3b10b59b Reviewed-on: https://go-review.googlesource.com/c/go/+/578835 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Change https://go.dev/cl/642038 mentions this issue: |
I think we might have dropped the ball on this follow-up, which I think is valuable. Raised #71270. |
A comment by @rsc in
strings.Compare
says:I'm not sure this comment has aged well with the recent change to
slices.SortFunc
,where it now takes in an
func(T, T) int
instead offunc(T, T) bool
.This means that the utility of
strings.Compare
suddenly shot up.We should either intrinsify
strings.Compare
asruntime.cmpstring
OR (even better) recognize this pattern.That way,
cmp.Compare
onstring
types will benefit as well.The text was updated successfully, but these errors were encountered: