Optimize try_eval_bits to avoid layout queries #64673

Merged
bors merged 1 commit into rust-lang:master from Mark-Simulacrum:opt-match-ck on Sep 30, 2019

@Mark-Simulacrum
Member

commented Sep 21, 2019

This specifically targets match checking, but is possibly more widely
useful as well. In code with large, single-value match statements, we
were previously spending a lot of time running layout_of for the
primitive types (integers, chars) -- which is essentially useless. This
optimizes the code to avoid those query calls by directly obtaining the
size for these types, when possible.
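As a rough illustration of the idea (not the actual rustc code), the sketch below reads the size of a primitive type directly off the type and only falls back to a slower, layout-style computation for everything else. `ToyTy`, `fast_size_bits`, `slow_size_bits`, `size_bits`, and `eval_bits` are all hypothetical names invented for this example.

```rust
/// Toy stand-in for a compiler type; not rustc's real `Ty`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ToyTy {
    Char,
    Int { bits: u32 },
    Uint { bits: u32 },
    /// Any other type; its size is only known to the slow layout path.
    Other { size_bits: u32 },
}

/// Fast path: sizes of primitives are known without computing a layout.
pub fn fast_size_bits(ty: ToyTy) -> Option<u32> {
    match ty {
        ToyTy::Char => Some(32),
        ToyTy::Int { bits } | ToyTy::Uint { bits } => Some(bits),
        ToyTy::Other { .. } => None,
    }
}

/// Slow path: stands in for an expensive `layout_of`-style query.
pub fn slow_size_bits(ty: ToyTy) -> u32 {
    match ty {
        ToyTy::Char => 32,
        ToyTy::Int { bits } | ToyTy::Uint { bits } => bits,
        ToyTy::Other { size_bits } => size_bits,
    }
}

/// Try the fast path first and fall back to the slow one, so the result
/// is the same as before for every type.
pub fn size_bits(ty: ToyTy) -> u32 {
    fast_size_bits(ty).unwrap_or_else(|| slow_size_bits(ty))
}

/// Truncate a raw 128-bit value to the size of `ty`.
pub fn eval_bits(raw: u128, ty: ToyTy) -> u128 {
    let bits = size_bits(ty);
    if bits >= 128 {
        raw
    } else {
        raw & ((1u128 << bits) - 1)
    }
}
```

Because the fast path returns `None` for anything it does not recognize, callers see the same answers as with the slow path alone; only the cost changes.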

It may be worth considering adding a size_of query in the future, which
might be far faster, especially if specialized for "const" cases --
match arms being the most obvious example. It's possible such a function
would benefit from not being a query as well, since it's trivially
evaluatable from the sty for many cases, whereas a query needs to hash
the input and such.
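To make the query-versus-function trade-off concrete, here is a small sketch under simplifying assumptions: a "query" is modeled as a memoizing wrapper around a `HashMap`, so even a cache hit has to hash its key, while the plain function just matches on the type. `ToyTy`, `size_of_direct`, and `SizeQuery` are hypothetical and far simpler than rustc's query system.

```rust
use std::collections::HashMap;

/// Toy stand-in for a primitive compiler type; not rustc's real `Ty`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum ToyTy {
    Char,
    Int { bits: u32 },
    Uint { bits: u32 },
}

/// Plain function: the size in bits is read straight off the type.
pub fn size_of_direct(ty: ToyTy) -> u32 {
    match ty {
        ToyTy::Char => 32,
        ToyTy::Int { bits } | ToyTy::Uint { bits } => bits,
    }
}

/// Query-style wrapper: memoizes results, so even a cache hit hashes `ty`.
#[derive(Default)]
pub struct SizeQuery {
    cache: HashMap<ToyTy, u32>,
    /// Number of times the underlying computation actually ran.
    pub misses: usize,
}

impl SizeQuery {
    pub fn get(&mut self, ty: ToyTy) -> u32 {
        if let Some(&bits) = self.cache.get(&ty) {
            return bits;
        }
        self.misses += 1;
        let bits = size_of_direct(ty);
        self.cache.insert(ty, bits);
        bits
    }
}
```

When the underlying computation is a single `match`, the hashing and lookup can cost more than the work being cached, which is the concern raised above.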

@Mark-Simulacrum

Member Author

commented Sep 21, 2019

@bors try @rust-timer queue

@rust-timer

commented Sep 21, 2019

Awaiting bors try build completion

@bors

Contributor

commented Sep 21, 2019

⌛️ Trying commit 511352c with merge 9b2e58c...

bors added a commit that referenced this pull request Sep 21, 2019
Optimize match checking to avoid layout queries

In code with large, single-value match statements, we were previously
spending a lot of time running layout_of for the primitive types
(integers, chars) -- which is essentially useless. This optimizes the
code to avoid those query calls by directly obtaining the size for these
types, when possible.
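For context, a "large, single-value match" is a match in which each arm is one literal of a primitive type. The function below is a made-up, five-arm example; the name `combining_class` and the table values are illustrative only, and real cases such as the `unicode_normalization` benchmark have far more arms.

```rust
/// Hypothetical lookup table written as a match: every arm is a single
/// `char` literal, so the match checker has to evaluate the bits of each
/// constant pattern.
pub fn combining_class(c: char) -> u8 {
    match c {
        '\u{0300}' => 230,
        '\u{0301}' => 230,
        '\u{0316}' => 220,
        '\u{0334}' => 1,
        '\u{0345}' => 240,
        _ => 0,
    }
}
```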

We fall back to the (slower) previous code if that fails, so this is not
a behavior change.

r? @Centril who I believe knows this code enough, but if not feel free to re-assign
@Centril

Member

commented Sep 21, 2019

@rust-highfive rust-highfive assigned oli-obk and unassigned Centril Sep 21, 2019
@bors

Contributor

commented Sep 22, 2019

☀️ Try build successful - checks-azure
Build commit: 9b2e58c

@rust-timer

commented Sep 22, 2019

Queued 9b2e58c with parent ed8b708, future comparison URL.

@rust-timer

commented Sep 22, 2019

Finished benchmarking try commit 9b2e58c, comparison URL.

@bjorn3

Contributor

commented Sep 22, 2019

unicode_normalization is ~30% faster! 3 other benchmarks got ~2% slower. The rest are stable.

@Mark-Simulacrum Mark-Simulacrum force-pushed the Mark-Simulacrum:opt-match-ck branch from 511352c to 9c62117 Sep 22, 2019
@Mark-Simulacrum

Member Author

commented Sep 22, 2019

Okay, moved the fast path into the try_eval_bits function. Locally, unicode_normalization is a bit slower (~0.05s) after doing so, but that might be noise (although I get pretty consistent results).

Let's get some official results though @bors try @rust-timer queue

@rust-timer

commented Sep 22, 2019

Awaiting bors try build completion

@bors

Contributor

commented Sep 22, 2019

⌛️ Trying commit 9c62117 with merge 64687bb...

bors added a commit that referenced this pull request Sep 22, 2019
Optimize match checking to avoid layout queries

@Mark-Simulacrum Mark-Simulacrum changed the title from "Optimize match checking to avoid layout queries" to "Optimize try_eval_bits to avoid layout queries" Sep 22, 2019
@bors

Contributor

commented Sep 22, 2019

☀️ Try build successful - checks-azure
Build commit: 64687bb

@rust-timer

commented Sep 22, 2019

Queued 64687bb with parent 4ff32c0, future comparison URL.

@bjorn3

Contributor

commented Sep 22, 2019

Compilation of four crates failed during benchmarking:

thread 'rustc' panicked at 'assertion failed: pos.checked_add(num_bytes).unwrap() <= self.mapped_file.len()', /cargo/registry/src/github.com-1ecc6299db9ec823/measureme-0.3.0/src/mmap_serialization_sink.rs:38:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.39.0-nightly (ed8b708c1 2019-09-21) running on x86_64-unknown-linux-gnu

note: compiler flags: -Z self-profile=/tmp/.tmp1D64N5/self-profile-output -Z self-profile-events=all -C debuginfo=2 --crate-type lib

note: some of the compiler flags provided by cargo are hidden

thread 'rustc' panicked at 'index 1073741840 out of range for slice of length 1073741824', src/libcore/slice/mod.rs:2583:5
stack backtrace:
   0:     0x7fd1648a2b04 - backtrace::backtrace::libunwind::trace::h61b15987b9420dc8
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/libunwind.rs:88
   1:     0x7fd1648a2b04 - backtrace::backtrace::trace_unsynchronized::h944547918bca7d09
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/mod.rs:66
   2:     0x7fd1648a2b04 - std::sys_common::backtrace::_print_fmt::hf631db7a19c7ecfe
                               at src/libstd/sys_common/backtrace.rs:76
   3:     0x7fd1648a2b04 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h356b821d79ead967
                               at src/libstd/sys_common/backtrace.rs:60
   4:     0x7fd1648db14c - core::fmt::write::haa7725ecee710b81
                               at src/libcore/fmt/mod.rs:1030
   5:     0x7fd164896d27 - std::io::Write::write_fmt::h677b5c4b9e48abad
                               at src/libstd/io/mod.rs:1412
   6:     0x7fd1648a7335 - std::sys_common::backtrace::_print::hdc27c79deedd181f
                               at src/libstd/sys_common/backtrace.rs:64
   7:     0x7fd1648a7335 - std::sys_common::backtrace::print::h0bb3a218c68a1b38
                               at src/libstd/sys_common/backtrace.rs:49
   8:     0x7fd1648a7335 - std::panicking::default_hook::{{closure}}::h9160d687734b4c2f
                               at src/libstd/panicking.rs:196
   9:     0x7fd1648a7026 - std::panicking::default_hook::h298b832ea14df44f
                               at src/libstd/panicking.rs:210
  10:     0x7fd164ddcf63 - rustc_driver::report_ice::hce2a6b74528a3743
  11:     0x7fd1648a7b1c - std::panicking::rust_panic_with_hook::h7c6406c2637b219f
                               at src/libstd/panicking.rs:477
  12:     0x7fd1648a75d2 - std::panicking::continue_panic_fmt::h8e5e175fd262b206
                               at src/libstd/panicking.rs:380
  13:     0x7fd1648a74c6 - rust_begin_unwind
                               at src/libstd/panicking.rs:307
  14:     0x7fd1648d4ada - core::panicking::panic_fmt::h864c751c34920017
                               at src/libcore/panicking.rs:85
  15:     0x7fd1648d5216 - core::slice::slice_index_len_fail::h5579666bd5db7c44
                               at src/libcore/slice/mod.rs:2583
  16:     0x7fd166856244 - <measureme::mmap_serialization_sink::MmapSerializationSink as core::ops::drop::Drop>::drop::hfdb5ca3a69a780c5
  17:     0x7fd164de1948 - alloc::sync::Arc<T>::drop_slow::h1e9b9b3bcf38be1c
  18:     0x7fd164de1c3e - alloc::sync::Arc<T>::drop_slow::h2d5c08f8d08065e0
  19:     0x7fd164df4f15 - <alloc::rc::Rc<T> as core::ops::drop::Drop>::drop::h226d778706bf29d2
  20:     0x7fd164db01bc - core::ptr::real_drop_in_place::h78ba5d2f1d1836e8
  21:     0x7fd164daa59c - rustc_interface::interface::run_compiler_in_existing_thread_pool::hf9ebc0f8adadba07
[...]
@rust-timer

commented Sep 22, 2019

Finished benchmarking try commit 64687bb, comparison URL.

@Mark-Simulacrum

Member Author

commented Sep 22, 2019

Failures are unrelated to this PR; they're due to newly introduced self-profile functionality for rustc.
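For readers puzzling over the panic messages quoted earlier: the slice length 1073741824 is 2^30 bytes (1 GiB), and the failing index 1073741840 is 16 bytes past it, which is consistent with a write running off the end of a fixed-size buffer. The sketch below shows that kind of bounds check on a hypothetical `FixedSink` type backed by a `Vec`. It is not measureme's actual `MmapSerializationSink`, and it returns an error where the real code asserts.

```rust
/// Hypothetical fixed-capacity sink; an in-memory `Vec` stands in for the
/// memory-mapped file.
pub struct FixedSink {
    buf: Vec<u8>,
    pos: usize,
}

impl FixedSink {
    pub fn with_capacity(cap: usize) -> Self {
        FixedSink { buf: vec![0; cap], pos: 0 }
    }

    /// Number of bytes written so far.
    pub fn written(&self) -> usize {
        self.pos
    }

    /// Write `bytes`, returning an error instead of panicking when the
    /// write would run past the end of the buffer.
    pub fn write(&mut self, bytes: &[u8]) -> Result<(), String> {
        let end = self
            .pos
            .checked_add(bytes.len())
            .ok_or_else(|| "position overflow".to_string())?;
        if end > self.buf.len() {
            return Err(format!(
                "index {} out of range for buffer of length {}",
                end,
                self.buf.len()
            ));
        }
        self.buf[self.pos..end].copy_from_slice(bytes);
        self.pos = end;
        Ok(())
    }
}
```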

@Mark-Simulacrum

Member Author

commented Sep 23, 2019

Looks like ~no difference, which is good -- more general code is probably better. This would more likely have made a difference if the benchmark crates used const generics.

@oli-obk I believe this should be ready to merge.

@oli-obk

Contributor

commented Sep 23, 2019

@bors r+

@bors

Contributor

commented Sep 23, 2019

📌 Commit 9c62117 has been approved by oli-obk

@Centril

Member

commented Sep 23, 2019

@bors rollup=never

@Centril

Member

commented Sep 28, 2019

@bors p=3

@bors

Contributor

commented Sep 29, 2019

⌛️ Testing commit 9c62117 with merge eea282a...

bors added a commit that referenced this pull request Sep 29, 2019
Optimize try_eval_bits to avoid layout queries

@rust-highfive

Collaborator

commented Sep 29, 2019

The job x86_64-gnu-debug of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2019-09-29T13:50:20.2613831Z == clock drift check ==
2019-09-29T13:50:20.2613979Z   local time: Sun Sep 29 13:50:19 UTC 2019
2019-09-29T13:50:20.2614140Z   network time: Sun, 29 Sep 2019 13:50:19 GMT
2019-09-29T13:50:20.2614297Z == end clock drift check ==
2019-09-29T13:50:20.7406115Z ##[error]Bash exited with code '1'.
2019-09-29T13:50:20.7458169Z ##[section]Starting: Upload CPU usage statistics
2019-09-29T13:50:20.7461403Z ==============================================================================
2019-09-29T13:50:20.7461479Z Task         : Bash
2019-09-29T13:50:20.7461552Z Description  : Run a Bash script on macOS, Linux, or Windows

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@bors

Contributor

commented Sep 29, 2019

💔 Test failed - checks-azure

@Mark-Simulacrum Mark-Simulacrum force-pushed the Mark-Simulacrum:opt-match-ck branch from 9c62117 to 06c6e75 Sep 29, 2019
@Mark-Simulacrum

Member Author

commented Sep 29, 2019

@bors r=oli-obk

@bors

Contributor

commented Sep 29, 2019

📌 Commit 06c6e75 has been approved by oli-obk

@bors

Contributor

commented Sep 29, 2019

⌛️ Testing commit 06c6e75 with merge d16ee89...

bors added a commit that referenced this pull request Sep 29, 2019
Optimize try_eval_bits to avoid layout queries

@bors

Contributor

commented Sep 30, 2019

☀️ Test successful - checks-azure
Approved by: oli-obk
Pushing d16ee89 to master...

@bors bors added the merged-by-bors label Sep 30, 2019
@bors bors merged commit 06c6e75 into rust-lang:master Sep 30, 2019
5 checks passed
homu: Test successful
pr: Build #20190929.46 succeeded
pr (Linux mingw-check): Linux mingw-check succeeded
pr (Linux x86_64-gnu-llvm-6.0): Linux x86_64-gnu-llvm-6.0 succeeded
pr (LinuxTools): LinuxTools succeeded
nnethercote added a commit to nnethercote/rust that referenced this pull request Oct 4, 2019
The `if let Some(val) = value.try_eval_bits(...)` branch in `from_const()` is
very hot for the `unicode_normalization` benchmark.

This commit introduces a special-case alternative for scalars that avoids
`try_eval_bits()` and all the functions it calls (`Const::eval()`,
`ConstValue::try_to_bits()`, `ConstValue::try_to_scalar()`, and
`Scalar::to_bits()`), instead extracting the result immediately.

The type and value checking done by `Scalar::to_bits()` is replicated by moving
it into a new function `Scalar::check_raw()` and using that new function in the
special case.
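A minimal sketch of the pattern described in the paragraphs above, assuming toy types: when the constant is already a scalar, read its bits directly after running the shared validity check, and otherwise signal that the caller should take the general path. `ToyScalar`, `ToyConst`, and `fast_bits` are hypothetical, and this `check_raw` returns a `bool` for the sake of the example; the real `Scalar::check_raw()` may well assert instead.

```rust
/// Toy scalar: raw bits plus the size in bytes they were created with.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ToyScalar {
    pub data: u128,
    pub size: u8,
}

/// Toy constant: either already a scalar, or something needing evaluation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ToyConst {
    Scalar(ToyScalar),
    Unevaluated,
}

/// Shared validity check (hypothetical analogue of `Scalar::check_raw()`):
/// the sizes must match and `data` must fit in `target_size` bytes.
pub fn check_raw(data: u128, size: u8, target_size: u8) -> bool {
    if size == 0 || size != target_size || target_size > 16 {
        return false;
    }
    let bits = u32::from(target_size) * 8;
    bits == 128 || (data >> bits) == 0
}

/// Special case: read the bits straight out of a scalar constant,
/// skipping the general evaluation chain.
pub fn fast_bits(c: ToyConst, target_size: u8) -> Option<u128> {
    match c {
        ToyConst::Scalar(s) if check_raw(s.data, s.size, target_size) => Some(s.data),
        // Anything else: the caller falls back to the general path.
        _ => None,
    }
}
```

Keeping the check in one function means the special case and the general path cannot drift apart in what they accept.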

PR rust-lang#64673 introduced some special-case handling of scalar types in
`Const::try_eval_bits()`. This handling is now moved out of that function into
the new `IntRange::integral_size_and_signed_bias` function.
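The sketch below is one plausible reading of what a function named `integral_size_and_signed_bias` computes, under the assumption that the "signed bias" is the sign bit of a signed integer type: XOR-ing raw two's-complement bits with it makes signed values sort correctly when compared as unsigned numbers. `ToyTy` and `biased` are hypothetical, and the signature here (bits as `u32`) is an assumption, not rustc's.

```rust
/// Toy stand-in for a compiler type; not rustc's real `Ty`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ToyTy {
    Char,
    Int { bits: u32 },
    Uint { bits: u32 },
    Other,
}

/// Returns (size in bits, bias) for integral types, `None` otherwise.
/// For signed integers the bias is the sign bit; for everything else it is 0.
pub fn integral_size_and_signed_bias(ty: ToyTy) -> Option<(u32, u128)> {
    match ty {
        ToyTy::Char => Some((32, 0)),
        ToyTy::Int { bits } if (1..=128).contains(&bits) => {
            Some((bits, 1u128 << (bits - 1)))
        }
        ToyTy::Uint { bits } if (1..=128).contains(&bits) => Some((bits, 0)),
        _ => None,
    }
}

/// Apply the bias so that signed order matches unsigned order.
pub fn biased(raw: u128, bias: u128) -> u128 {
    raw ^ bias
}
```

For an 8-bit signed type the bias is `0x80`, so the raw bits of -128, -1, 0, and 1 (`0x80`, `0xFF`, `0x00`, `0x01`) become `0x00`, `0x7F`, `0x80`, and `0x81`, which are in ascending order.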

This commit reduces the instruction count for
`unicode_normalization-check-clean` by about 10%.
@Mark-Simulacrum Mark-Simulacrum deleted the Mark-Simulacrum:opt-match-ck branch Oct 8, 2019
XiangQingW added a commit to XiangQingW/rust that referenced this pull request Oct 13, 2019
choller added a commit to choller/rust that referenced this pull request Oct 17, 2019