Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Optimize try_eval_bits to avoid layout queries #64673
Conversation
|
@bors try @rust-timer queue |
|
Awaiting bors try build completion |
Optimize match checking to avoid layout queries In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. We fallback to the (slower) previous code if that fails, so this is not a behavior change. r? @Centril who I believe knows this code enough, but if not feel free to re-assign
|
|
|
Queued 9b2e58c with parent ed8b708, future comparison URL. |
|
Finished benchmarking try commit 9b2e58c, comparison URL. |
|
unicode_normalization is ~30% faster! 3 other benchea got ~2% slower. The rest is stable. |
511352c
to
9c62117
|
Okay, moved into the try_eval_bits function -- locally that shows that this unicode_normalization is a bit slower (~0.05s) after doing so, but that might be noise (although I get pretty consistent results). Let's get some official results though @bors try @rust-timer queue |
|
Awaiting bors try build completion |
Optimize match checking to avoid layout queries In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. We fallback to the (slower) previous code if that fails, so this is not a behavior change. r? @Centril who I believe knows this code enough, but if not feel free to re-assign
|
|
|
Queued 64687bb with parent 4ff32c0, future comparison URL. |
|
Compilation of four crates failed during benchmarking:
|
|
Finished benchmarking try commit 64687bb, comparison URL. |
|
Failures are unrelated to this PR; they're due to newly introduced self-profile functionality for rustc. |
|
Looks like ~no difference which is good -- more general code is probably better. It's probably more likely this would've had a difference with const generics being used in the set of crate benchmarks. @oli-obk I believe this should be ready to merge. |
|
@bors r+ |
|
|
|
@bors rollup=never |
|
@bors p=3 |
Optimize try_eval_bits to avoid layout queries This specifically targets match checking, but is possibly more widely useful as well. In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. It may be worth considering adding a `size_of` query in the future which might be far faster, especially if specialized for "const" cases -- match arms being the most obvious example. It's possibly such a function would benefit from *not* being a query as well, since it's trivially evaluatable from the sty for many cases whereas a query needs to hash the input and such.
|
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
|
|
This specifically targets match checking, but is possibly more widely useful as well. In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. It may be worth considering adding a `size_of` query in the future which might be far faster, especially if specialized for "const" cases -- match arms being the most obvious example. It's possibly such a function would benefit from *not* being a query as well, since it's trivially evaluatable from the sty for many cases whereas a query needs to hash the input and such.
9c62117
to
06c6e75
|
@bors r=oli-obk |
|
|
Optimize try_eval_bits to avoid layout queries This specifically targets match checking, but is possibly more widely useful as well. In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. It may be worth considering adding a `size_of` query in the future which might be far faster, especially if specialized for "const" cases -- match arms being the most obvious example. It's possibly such a function would benefit from *not* being a query as well, since it's trivially evaluatable from the sty for many cases whereas a query needs to hash the input and such.
|
|
The `if let Some(val) = value.try_eval_bits(...)` branch in `from_const()` is very hot for the `unicode_normalization` benchmark. This commit introduces a special-case alternative for scalars that avoids `try_eval_bits()` and all the functions it calls (`Const::eval()`, `ConstValue::try_to_bits()`, `ConstValue::try_to_scalar()`, and `Scalar::to_bits()`), instead extracting the result immediately. The type and value checking done by `Scalar::to_bits()` is replicated by moving it into a new function `Scalar::check_raw()` and using that new function in the special case. PR rust-lang#64673 introduced some special-case handling of scalar types in `Const::try_eval_bits()`. This handling is now moved out of that function into the new `IntRange::integral_size_and_signed_bias` function. This commit reduces the instruction count for `unicode_normalization-check-clean` by about 10%.
The `if let Some(val) = value.try_eval_bits(...)` branch in `from_const()` is very hot for the `unicode_normalization` benchmark. This commit introduces a special-case alternative for scalars that avoids `try_eval_bits()` and all the functions it calls (`Const::eval()`, `ConstValue::try_to_bits()`, `ConstValue::try_to_scalar()`, and `Scalar::to_bits()`), instead extracting the result immediately. The type and value checking done by `Scalar::to_bits()` is replicated by moving it into a new function `Scalar::check_raw()` and using that new function in the special case. PR rust-lang#64673 introduced some special-case handling of scalar types in `Const::try_eval_bits()`. This handling is now moved out of that function into the new `IntRange::integral_size_and_signed_bias` function. This commit reduces the instruction count for `unicode_normalization-check-clean` by about 10%.
The `if let Some(val) = value.try_eval_bits(...)` branch in `from_const()` is very hot for the `unicode_normalization` benchmark. This commit introduces a special-case alternative for scalars that avoids `try_eval_bits()` and all the functions it calls (`Const::eval()`, `ConstValue::try_to_bits()`, `ConstValue::try_to_scalar()`, and `Scalar::to_bits()`), instead extracting the result immediately. The type and value checking done by `Scalar::to_bits()` is replicated by moving it into a new function `Scalar::check_raw()` and using that new function in the special case. PR rust-lang#64673 introduced some special-case handling of scalar types in `Const::try_eval_bits()`. This handling is now moved out of that function into the new `IntRange::integral_size_and_signed_bias` function. This commit reduces the instruction count for `unicode_normalization-check-clean` by about 10%.
This specifically targets match checking, but is possibly more widely
useful as well. In code with large, single-value match statements, we
were previously spending a lot of time running layout_of for the
primitive types (integers, chars) -- which is essentially useless. This
optimizes the code to avoid those query calls by directly obtaining the
size for these types, when possible.
It may be worth considering adding a
size_ofquery in the future whichmight be far faster, especially if specialized for "const" cases --
match arms being the most obvious example. It's possibly such a function
would benefit from not being a query as well, since it's trivially
evaluatable from the sty for many cases whereas a query needs to hash
the input and such.