cmd/compile: mod operator is very slow on intel hardware with int64 #59089
Comments
This is expected, |
I have also taken a look at the Rust compiled code, it seems they cleverly check if the upper 32-bits of both the dividend and the divisor are all zero, and fall to faster
|
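The check described in the comment above can be sketched in Go source as follows. This is only an illustration of the idea, not the code that Rust or the Go compiler emits: `mod64` is a hypothetical helper name, and whether the 32-bit path is actually faster depends on the CPU and on the compiler choosing a 32-bit divide for the `uint32` operation.

```go
package main

import "fmt"

// mod64 is a hypothetical helper that mirrors the fast-path idea:
// if the upper 32 bits of both operands are zero, both values are
// non-negative and fit in 32 bits, so a 32-bit unsigned remainder
// yields the same result as the 64-bit signed one.
func mod64(a, b int64) int64 {
	if (uint64(a)|uint64(b))>>32 == 0 {
		// Narrow path: operands fit in uint32.
		// A zero divisor still panics here, as it does in the wide path.
		return int64(uint32(a) % uint32(b))
	}
	// Wide path: ordinary 64-bit signed remainder.
	return a % b
}

func main() {
	fmt.Println(mod64(10, 3))     // takes the narrow path
	fmt.Println(mod64(-7, 3))     // negative dividend: upper bits set, wide path
	fmt.Println(mod64(1<<40, 12)) // large dividend: wide path
}
```

Negative operands always take the wide path because their sign bits occupy the upper 32 bits, so the narrow path never has to reason about signedness.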
CC @randall77 |
Seems fine to put in a similar test + 32-bit path in our code. My only question is how universal is this? Skylake was mentioned - how does this fare on other processor vendors / families? |
It is generally true on x86 that an |
Change https://go.dev/cl/482658 mentions this issue: |
Change https://go.dev/cl/482656 mentions this issue: |
Change https://go.dev/cl/482657 mentions this issue: |
Change https://go.dev/cl/482659 mentions this issue: |
The same switch statement handles code generation for signed division of words, double words and quad words. Rather than using multiple switch statements to select the appropriate instructions, determine all of the correctly sized operands up front, then use them as needed.

Updates #59089

Change-Id: I2b7567c8e0ecb9904c37607332538c95b0521dca
Reviewed-on: https://go-review.googlesource.com/c/go/+/482657
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
In order to avoid a CPU exception resulting from signed overflow, the signed division code tests if the divisor is -1 and, if it is, runs fix up code to manually compute the quotient and remainder (thus avoiding IDIV and potential signed overflow). However, the way that this is currently structured means that the normal code path for the case where the divisor is not -1 results in five instructions and two branches (CMP, JEQ, followed by sign extension, IDIV and another JMP to skip over the fix up code).

Rework the fix up code such that the final JMP is incurred by the less likely code path where the divisor is -1, rather than by the more likely code path (which is already more expensive due to IDIV). This results in a four instruction sequence (CMP, JNE, sign extension, IDIV), with only a single branch.

Updates #59089

Change-Id: Ie8d065750a178518d7397e194920b201afeb0530
Reviewed-on: https://go-review.googlesource.com/c/go/+/482658
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
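As a rough model of what the fix up code computes, the following Go sketch spells out the two paths. It is an assumption-laden illustration rather than the compiler's generated code: `divMod` is a hypothetical name, and the Go compiler already applies this handling itself when it compiles `/` and `%`.

```go
package main

import "fmt"

// divMod is a hypothetical model of the signed division sequence.
// The common case (divisor != -1) falls straight through to the
// division; only the rare divisor == -1 case takes the fix up path.
func divMod(n, d int64) (q, r int64) {
	if d != -1 {
		// Likely path: corresponds to CMP, JNE, sign extension, IDIV.
		return n / d, n % d
	}
	// Fix up path: n / -1 is -n, which wraps to the minimum int64
	// when n is the minimum int64, and n % -1 is always 0.
	// No IDIV is executed here, so no overflow exception can occur.
	return -n, 0
}

func main() {
	const minInt64 = -1 << 63
	fmt.Println(divMod(-7, 2))
	fmt.Println(divMod(7, -1))
	fmt.Println(divMod(minInt64, -1))
}
```

The results match what the Go specification requires for `x / -1` and `x % -1` when `x` is the most negative value of its type: the quotient equals `x` and the remainder is 0, with no run-time panic.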
Change https://go.dev/cl/484438 mentions this issue: |
This adds benchmarks for division and modulus of 64 bit signed and unsigned integers.

Updates #59089

Change-Id: Ie757c6d74a1f355873e79619eae26ece21a8f23e
Reviewed-on: https://go-review.googlesource.com/c/go/+/482656
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What did you do?
The mod operator is very slow on Intel hardware when used on the int type.
What did you expect to see?
The mod operator optimized to handle int types on Intel hardware.
What did you see instead?
The mod operator is very slow when used on int types on Intel hardware.