Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rollup of 4 pull requests #84739

Closed
wants to merge 9 commits into from

Conversation

Dylan-DPC-zz
Copy link

Successful merges:

Failed merges:

r? @ghost
@rustbot modify labels: rollup

Create a similar rollup

mark-i-m and others added 9 commits April 27, 2021 21:20
On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop.  The yield instruction is a
nop.  The isb instruction puts the processor to sleep for some short time.  isb
is a good equivalent to the pause instruction on x86.

Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors.  The micro-benchmarks use
https://github.com/google/benchmark.git

$ cat a.cc
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);
BENCHMARK_MAIN();

$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------

AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns        0.485 ns   1000000000
BM_yield                 0.400 ns        0.400 ns   1000000000
BM_isb                    13.2 ns         13.2 ns     52993304

AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns        0.874 ns    801558633
BM_yield                 0.877 ns        0.875 ns    800002377
BM_isb                    13.0 ns         12.7 ns     55169412

Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns        0.315 ns   1000000000
BM_yield                 0.313 ns        0.313 ns   1000000000
BM_isb                    9.06 ns         9.06 ns     77259282

static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);

Intel Skylake processor:
BM_scalar_increment      0.295 ns        0.295 ns   1000000000
BM_pause                  41.7 ns         41.7 ns     16780553

Tested on Graviton2 aarch64-linux with `./x.py test`.
Reuse `sys::unix::cmath` on other platforms

Reuse `sys::unix::cmath` on all non-`windows` platforms.

`unix` is chosen as the canonical location instead of `unsupported` or `common` because `unsupported` doesn't make sense semantically and `common` is reserved for code that is supported on all platforms. Also `unix` is already the home of some non-`windows` code that is technically not exclusive to `unix` like `unix::path`.
Be stricter about rejecting LLVM reserved registers in asm!

LLVM will silently produce incorrect code if these registers are used as operands.

cc ````@rust-lang/wg-inline-asm````
[Arm64] use isb instruction instead of yield in spin loops

On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop.  The yield instruction is a
nop.  The isb instruction puts the processor to sleep for some short time.  isb
is a good equivalent to the pause instruction on x86.

Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors.  The micro-benchmarks use
https://github.com/google/benchmark.git

```
$ cat a.cc
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);
BENCHMARK_MAIN();

$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------

AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns        0.485 ns   1000000000
BM_yield                 0.400 ns        0.400 ns   1000000000
BM_isb                    13.2 ns         13.2 ns     52993304

AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns        0.874 ns    801558633
BM_yield                 0.877 ns        0.875 ns    800002377
BM_isb                    13.0 ns         12.7 ns     55169412

Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns        0.315 ns   1000000000
BM_yield                 0.313 ns        0.313 ns   1000000000
BM_isb                    9.06 ns         9.06 ns     77259282
```

```
static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);

Intel Skylake processor:
BM_scalar_increment      0.295 ns        0.295 ns   1000000000
BM_pause                  41.7 ns         41.7 ns     16780553
```

Tested on Graviton2 aarch64-linux with `./x.py test`.
@rustbot rustbot added the rollup A PR which is a rollup label Apr 30, 2021
@Dylan-DPC-zz
Copy link
Author

@bors r+ rollup=never p=5

@bors
Copy link
Contributor

bors commented Apr 30, 2021

📌 Commit 7a1462f has been approved by Dylan-DPC

@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Apr 30, 2021
@bors
Copy link
Contributor

bors commented Apr 30, 2021

⌛ Testing commit 7a1462f with merge 4a07567b863f1b1bf0d2d7341f366ce6778ba8e4...

@rust-log-analyzer
Copy link
Collaborator

The job dist-various-2 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
[RUSTC-TIMING] gimli test:false 5.402
[RUSTC-TIMING] object test:false 10.260
warning: dropping unsupported crate type `dylib` for target `x86_64-fortanix-unknown-sgx`

error: invalid register `rbx`: rbx is used internally by LLVM and cannot be used as an operand for inline asm
  --> library/std/src/sys/sgx/ext/arch.rs:37:13
   |
37 |             in("rbx") request,


error: invalid register `rbx`: rbx is used internally by LLVM and cannot be used as an operand for inline asm
  --> library/std/src/sys/sgx/ext/arch.rs:65:13
   |
65 |             in("rbx") targetinfo,

error: aborting due to 2 previous errors; 1 warning emitted

[RUSTC-TIMING] std test:false 1.859
[RUSTC-TIMING] std test:false 1.859
error: could not compile `std`

To learn more, run the command again with --verbose.
command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-fortanix-unknown-sgx" "-Zbinary-dep-depinfo" "-j" "16" "--release" "--locked" "--color" "always" "--features" "panic-unwind backtrace compiler-builtins-c" "--manifest-path" "/checkout/library/test/Cargo.toml" "--message-format" "json-render-diagnostics"
failed to run: /checkout/obj/build/bootstrap/debug/bootstrap dist --host= --target x86_64-fuchsia,aarch64-fuchsia,wasm32-unknown-unknown,wasm32-wasi,sparcv9-sun-solaris,x86_64-pc-solaris,x86_64-unknown-linux-gnux32,x86_64-fortanix-unknown-sgx,nvptx64-nvidia-cuda,armv7-unknown-linux-gnueabi,armv7-unknown-linux-musleabi,i686-unknown-freebsd
Build completed unsuccessfully in 0:24:28

@bors
Copy link
Contributor

bors commented Apr 30, 2021

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 30, 2021
@jackh726 jackh726 closed this Apr 30, 2021
@Dylan-DPC-zz Dylan-DPC-zz deleted the rollup-6182psa branch April 30, 2021 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rollup A PR which is a rollup S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants