Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make `i*::signum` a `const fn`. #61635

Merged
merged 2 commits into from Jun 8, 2019
Merged

Conversation

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jun 7, 2019

Ticks a box in #53718.

This uses a well-known branchless implementation of signum: (n > 0) as i32 - (n < 0) as i32.

Here's a playground comparing the two techniques. On x86 in release mode, the branchless implementation is able to replace a mov and cmov with a sar and add, so this should be a bit faster as well.

This is marked as a draft since I think I'll need to add #[rustc_const_unstable] somewhere. Perhaps the reviewer can point me in the right direction.

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Jun 7, 2019

r? @bluss

(rust_highfive has picked a reviewer for you, use r? to override)

Copy link
Contributor

oli-obk left a comment

src/libcore/num/mod.rs Show resolved Hide resolved
@oli-obk oli-obk assigned oli-obk and unassigned bluss Jun 7, 2019
This uses a well-known branchless implementation of `signum`.
Its `const`-ness is unstable and requires `#![feature(const_int_sign)]`.
@ecstatic-morse ecstatic-morse force-pushed the ecstatic-morse:const-signum branch from e080fac to bd899d0 Jun 7, 2019
@ecstatic-morse ecstatic-morse marked this pull request as ready for review Jun 7, 2019
@oli-obk

This comment has been minimized.

Copy link
Contributor

oli-obk commented Jun 7, 2019

can you also add some tests using this inside a constant and comparing the result with expected values?

There should already be tests for other const fns for integer types, you can add the new test to them.

@ecstatic-morse ecstatic-morse force-pushed the ecstatic-morse:const-signum branch from d79e75d to 1bf37ae Jun 7, 2019
@ecstatic-morse ecstatic-morse force-pushed the ecstatic-morse:const-signum branch from 1bf37ae to 7ec54f9 Jun 7, 2019
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Jun 8, 2019

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:17e9407e:start=1559951213612132101,finish=1559951215669921732,duration=2057789631
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
$ export AWS_ACCESS_KEY_ID=AKIA46X5W6CZEJZ6XT55
---
[00:53:41] .................................................................................................... 500/5656
[00:53:44] ........................i........................................................................... 600/5656
[00:53:48] .................................................................................................... 700/5656
[00:53:53] .................................................................................................... 800/5656
[00:53:57] ..........................F......................................................................... 900/5656
[00:54:04] ...................................................iiiii............................................ 1100/5656
[00:54:08] .................................................................................................... 1200/5656
[00:54:10] .................................................................................................... 1300/5656
[00:54:13] .................................................................................................... 1400/5656
---
[00:56:51] 
[00:56:51] 23 error[E0716]: temporary value dropped while borrowed
[00:56:51] 24   --> $DIR/const-int-sign.rs:7:30
[00:56:51] 25    |
[00:56:51] - LL |     let sgn: &'static bool = &(5_i32.signum());
[00:56:51] -    |              -------------    ^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
[00:56:51] + LL |     let sgn: &'static i32 = &(5_i32.signum());
[00:56:51] +    |              ------------    ^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
[00:56:51] 29    |              type annotation requires that borrow lasts for `'static`
[00:56:51] 30 LL |
[00:56:51] 
[00:56:51] 
[00:56:51] 
[00:56:51] The actual stderr differed from the expected stderr.
[00:56:51] Actual stderr saved to /checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/consts/const-int-sign/const-int-sign.stderr
[00:56:51] To update references, rerun the tests and pass the `--bless` flag
[00:56:51] To only update this specific test, also pass `--test-args consts/const-int-sign.rs`
[00:56:51] error: 1 errors occurred comparing output.
[00:56:51] status: exit code: 1
[00:56:51] status: exit code: 1
[00:56:51] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/consts/const-int-sign.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-C" "prefer-dynamic" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/consts/const-int-sign" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/consts/const-int-sign/auxiliary" "-A" "unused"
[00:56:51] ------------------------------------------
[00:56:51] 
[00:56:51] ------------------------------------------
[00:56:51] stderr:
[00:56:51] stderr:
[00:56:51] ------------------------------------------
[00:56:51] error[E0716]: temporary value dropped while borrowed
[00:56:51]   --> /checkout/src/test/ui/consts/const-int-sign.rs:2:29
[00:56:51]    |
[00:56:51] LL |     let x: &'static bool = &(5_i32.is_negative());
[00:56:51]    |            -------------    ^^^^^^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
[00:56:51]    |            type annotation requires that borrow lasts for `'static`
[00:56:51] ...
[00:56:51] LL | }
[00:56:51]    | - temporary value is freed at the end of this statement
[00:56:51]    | - temporary value is freed at the end of this statement
[00:56:51] 
[00:56:51] error[E0716]: temporary value dropped while borrowed
[00:56:51]   --> /checkout/src/test/ui/consts/const-int-sign.rs:4:29
[00:56:51]    |
[00:56:51] LL |     let y: &'static bool = &(5_i32.is_positive());
[00:56:51]    |            -------------    ^^^^^^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
[00:56:51]    |            type annotation requires that borrow lasts for `'static`
[00:56:51] ...
[00:56:51] LL | }
[00:56:51]    | - temporary value is freed at the end of this statement
[00:56:51]    | - temporary value is freed at the end of this statement
[00:56:51] 
[00:56:51] error[E0716]: temporary value dropped while borrowed
[00:56:51]   --> /checkout/src/test/ui/consts/const-int-sign.rs:7:30
[00:56:51]    |
[00:56:51] LL |     let sgn: &'static i32 = &(5_i32.signum());
[00:56:51]    |              ------------    ^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
[00:56:51]    |              type annotation requires that borrow lasts for `'static`
[00:56:51] LL |     //~^ ERROR temporary value dropped while borrowed
[00:56:51] LL | }
[00:56:51]    | - temporary value is freed at the end of this statement
---
[00:56:51] thread 'main' panicked at 'Some tests failed', src/tools/compiletest/src/main.rs:521:22
[00:56:51] note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
[00:56:51] 
[00:56:51] 
[00:56:51] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/ui" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "ui" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-6.0/bin/FileCheck" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "6.0.0\n" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--llvm-cxxflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"
[00:56:51] 
[00:56:51] 
[00:56:51] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[00:56:51] Build completed unsuccessfully in 0:52:44

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@ecstatic-morse ecstatic-morse force-pushed the ecstatic-morse:const-signum branch from 7ec54f9 to c2d8841 Jun 8, 2019
@ecstatic-morse ecstatic-morse force-pushed the ecstatic-morse:const-signum branch from c2d8841 to f6611db Jun 8, 2019
@ecstatic-morse

This comment has been minimized.

Copy link
Contributor Author

ecstatic-morse commented Jun 8, 2019

@oli-obk I think this is ready to go now.

@oli-obk

This comment has been minimized.

Copy link
Contributor

oli-obk commented Jun 8, 2019

@bors r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jun 8, 2019

📌 Commit f6611db has been approved by oli-obk

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jun 8, 2019

⌛️ Testing commit f6611db with merge 7f90abe...

bors added a commit that referenced this pull request Jun 8, 2019
Make `i*::signum` a `const fn`.

Ticks a box in #53718.

This uses a well-known branchless implementation of `signum`: `(n > 0) as i32 - (n < 0) as i32`.

Here's a [playground](https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=747cf191c4974bf66c9d75e509ae6e6e) comparing the two techniques. On x86 in release mode, the branchless implementation is able to replace a `mov` and `cmov` with a `sar` and `add`, so this should be a bit faster as well.

~~This is marked as a draft since I think I'll need to add `#[rustc_const_unstable]` somewhere. Perhaps the reviewer can point me in the right direction.~~
@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jun 8, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: oli-obk
Pushing 7f90abe to master...

@bors bors added the merged-by-bors label Jun 8, 2019
@bors bors merged commit f6611db into rust-lang:master Jun 8, 2019
2 checks passed
2 checks passed
Travis CI - Pull Request Build Passed
Details
homu Test successful
Details
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Jun 8, 2019

📣 Toolstate changed by #61635!

Tested on commit 7f90abe.
Direct link to PR: #61635

🎉 rls on windows: test-fail → test-pass (cc @Xanewok, @rust-lang/infra).

rust-highfive added a commit to rust-lang-nursery/rust-toolstate that referenced this pull request Jun 8, 2019
Tested on commit rust-lang/rust@7f90abe.
Direct link to PR: <rust-lang/rust#61635>

🎉 rls on windows: test-fail → test-pass (cc @Xanewok, @rust-lang/infra).
@ecstatic-morse ecstatic-morse deleted the ecstatic-morse:const-signum branch Jun 8, 2019
@ecstatic-morse ecstatic-morse mentioned this pull request Jun 8, 2019
30 of 59 tasks complete
@bjacob

This comment has been minimized.

Copy link

bjacob commented Jun 17, 2019

For what it's worth, regarding 'branchless' in the commit message, this code was/is branchless and more or less as efficient before and after this commit. Example on ARM64:
https://godbolt.org/z/9w4BPr
The old form compiled to 2 mask-and-select instructions (0 branches).
The new form compiles to 1 mask-and-select and 1 subtract-with-shifted-operand (it shifts by 31 bits to get the sign bit, that's how it evals (x < 0)).
So it's just 2 arithmetic instructions before and after.

Note though that the old form also informed the compiler that the result was either -1, 0 or +1. The new form looks to the compiler like the result might be any integer value. That might have side effects on optimization opportunities in inlined contexts.

@ecstatic-morse

This comment has been minimized.

Copy link
Contributor Author

ecstatic-morse commented Jun 17, 2019

@bjacob, I compared the generated assembly (albeit only on x86) in the PR message, so I'm aware that the optimized version of the old implementation is branchless. I've reproduced the full output below for you.

playground::signum: # @playground::signum
# %bb.0:
	xorl	%eax, %eax
	testl	%edi, %edi
	setg	%al
	sarl	$31, %edi
	addl	%edi, %eax
	retq
                                        # -- End function

playground::signum_branch: # @playground::signum_branch
# %bb.0:
	xorl	%ecx, %ecx
	testl	%edi, %edi
	setne	%cl
	movl	$-1, %eax
	cmovnsl	%ecx, %eax
	retq
                                        # -- End function

The main benefit of this PR is that signum is now callable in a const context (since the implementation has no branches). A side benefit is that the code is now faster (caveats: on x86, using clang instead of rustc, in a contrived microbenchmark, on whatever machine quick-bench.com uses). cmov is slower than your typical, constant-time arithmetic instruction on x86 (although the gap is closing with each new microarchitecture), and the new version uses one less register (on both ARM and x86).

Your last point is an interesting one, although I'm not aware of a set of optimizations that would make use this information. Perhaps you have an idea?

@bjacob

This comment has been minimized.

Copy link

bjacob commented Jun 17, 2019

right, sorry I had not followed the link in the commit message.
about the latter point, I was thinking about this: https://godbolt.org/z/kfha8c
See how switch_foo (old form) is simpler than switch_bar(new form) just because the logic is transparent to the compiler so it's able to merge the inline code at the logical level and not even evaluate the actual 'signum' value.

@bjacob

This comment has been minimized.

Copy link

bjacob commented Jun 17, 2019

On the other hand, I'm impressed that even in the new form, the compiler was able to deduce that the result could only be -1, 0 or 1 so it still elided the default: case.

@ecstatic-morse

This comment has been minimized.

Copy link
Contributor Author

ecstatic-morse commented Jun 17, 2019

Ah, I didn't consider branching on the result of signum. The original implementation will indeed be faster in that case since the compiler can unify the control flow (although I am also surprised that it it knows the range of outputs for the branchless version).

I don't know how common code like this is, since

match n {
    n if n > 0 => {}
    0 => {}
    _ => {}
}

is much more straightforward (I only use signum for multiplication), but it's good to know that this PR isn't an unconditional performance win.

@ecstatic-morse

This comment has been minimized.

Copy link
Contributor Author

ecstatic-morse commented Sep 17, 2019

@Centril This should get const-hack, although there may be a slight performance win from the convoluted version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.