8257815: Replace global log2 functions with efficient implementations #1663

Closed
wants to merge 30 commits into from

Conversation

Member

@cl4es cl4es commented Dec 7, 2020

This patch replaces the log2 functions in globalDefinitions.hpp with more efficient counterparts in utilities/powerOfTwo.hpp

Naming is hard, but I think the following scheme is reasonable:

  • log2i: any integral type. 0-hostile
  • log2i_allow_zero: any integral type. gracefully handles zero (adds a branch)
  • exact_log2i: any integral type. value must be a power of two

I chose log2i rather than log2 to stand out from the log2 functions defined in various standard libraries.

Going through all uses, quite a few calls to log2_long etc. could be replaced by exact_log2i, since their argument has already been checked to be a power of two. Most of the remaining uses seem able to use the 0-hostile variant, which avoids a branch.
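
To make the naming scheme above concrete, here is a minimal sketch of the three variants. It is not the code from the patch: the `_sketch` names, the `clz_sketch` helper (a stand-in for HotSpot's count_leading_zeros built on a GCC/Clang builtin), and the choice of returning -1 for zero are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for count_leading_zeros (assumes GCC or Clang).
// The value is first converted to the unsigned type of the SAME width,
// then widened to 64 bits, and the extra leading zero bits are subtracted.
template <typename T>
inline int clz_sketch(T x) {
  using U = typename std::make_unsigned<T>::type;
  static_assert(sizeof(T) <= sizeof(unsigned long long), "unsupported width");
  const int extra = 64 - static_cast<int>(sizeof(T) * 8);
  return __builtin_clzll(static_cast<unsigned long long>(static_cast<U>(x))) - extra;
}

// "0-hostile": the caller must guarantee x != 0.
template <typename T>
inline int log2i_sketch(T x) {
  assert(x != 0);
  return static_cast<int>(sizeof(T) * 8) - 1 - clz_sketch(x);
}

// Handles zero at the cost of one branch.
// Assumption: -1 is returned for zero; the thread does not specify the value.
template <typename T>
inline int log2i_allow_zero_sketch(T x) {
  return x == 0 ? -1 : log2i_sketch(x);
}

// The value must be a power of two (checked on the unsigned bit pattern).
template <typename T>
inline int exact_log2i_sketch(T x) {
  using U = typename std::make_unsigned<T>::type;
  const U u = static_cast<U>(x);
  assert(u != 0 && (u & (u - 1)) == 0);
  return log2i_sketch(x);
}
```

With these definitions, `log2i_sketch(1025)` yields 10 and `exact_log2i_sketch(64)` yields 6, while a negative 32-bit argument yields 31 because its bit pattern is treated as unsigned.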

To sanity check that calculating log2 using count_leading_zeros gives better performance I added a couple of trivial and short-running microbenchmarks to test_powerOfTwo. For small values (<= 1025) the new impl is ~5x faster, with a larger speed-up for larger integer values:

[ RUN      ] power_of_2.log2_long_micro
[       OK ] power_of_2.log2_long_micro (3581 ms)
[ RUN      ] power_of_2.log2_long_small_micro
[       OK ] power_of_2.log2_long_small_micro (549 ms)
[ RUN      ] power_of_2.log2i_micro
[       OK ] power_of_2.log2i_micro (259 ms)
[ RUN      ] power_of_2.log2i_small_micro
[       OK ] power_of_2.log2i_small_micro (113 ms)

I'm not sure if this naive microbenchmark carries its own weight, but it adds only a few seconds and can be useful for quickly checking this performance assumption on other hardware.
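
Before timing anything on new hardware, it may be worth checking that a loop-based log2 and a count-leading-zeros-based one agree. The sketch below does only that; `log2_loop_sketch` is an illustrative stand-in rather than the function removed by the patch, and all names here are hypothetical (GCC/Clang builtin assumed).

```cpp
#include <cassert>
#include <cstdint>

// Illustrative loop-based log2: shifts until the value is exhausted.
// Assumption: a stand-in for a "slow" implementation, not the original code.
inline int log2_loop_sketch(uint64_t x) {
  int i = -1;
  while (x != 0) {
    x >>= 1;
    ++i;
  }
  return i;
}

// Count-leading-zeros-based log2; precondition: x != 0.
inline int log2_clz_sketch(uint64_t x) {
  assert(x != 0);
  return 63 - __builtin_clzll(x);
}

// Returns true if both versions agree on [1, limit] and on every power of two.
inline bool implementations_agree(uint64_t limit) {
  for (uint64_t v = 1; v <= limit; ++v) {
    if (log2_loop_sketch(v) != log2_clz_sketch(v)) return false;
  }
  for (int s = 0; s < 64; ++s) {
    const uint64_t p = uint64_t{1} << s;
    if (log2_loop_sketch(p) != s || log2_clz_sketch(p) != s) return false;
  }
  return true;
}
```

The loop version does work proportional to the position of the highest set bit, which is one plausible reason the measured gap grows for larger values.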

(Intending this for 17)


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8257815: Replace global log2 functions with efficient implementations

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1663/head:pull/1663
$ git checkout pull/1663


@bridgekeeper bridgekeeper bot commented Dec 7, 2020

👋 Welcome back redestad! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.


@openjdk openjdk bot commented Dec 7, 2020

@cl4es The following labels will be automatically applied to this pull request:

  • hotspot
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.


@cl4es cl4es marked this pull request as ready for review Dec 7, 2020
@openjdk openjdk bot added the rfr label Dec 7, 2020

@mlbridge mlbridge bot commented Dec 7, 2020

Contributor

@TheRealMDoerr TheRealMDoerr commented Dec 7, 2020

Builds on PPC64le after JDK-8257798 was fixed.

Contributor

@shipilev shipilev commented Dec 7, 2020

There seem to be gtest/GTestWrapper.java failures on x86_32.

Member Author

@cl4es cl4es commented Dec 7, 2020

There was an issue with casting to uint64_t that only showed up in an existing test on 32-bit. I've implemented and tested a variant using std::make_unsigned which should be more robust by avoiding promoting 32-bit signed types to 64-bit unsigned ones.
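
The thread does not show the failing code, but the general hazard can be sketched as follows: casting a negative 32-bit value straight to uint64_t sign-extends it, whereas going through std::make_unsigned first keeps the original width. Both function names are hypothetical.

```cpp
#include <cstdint>
#include <type_traits>

// Direct widening: a negative 32-bit value is sign-extended, so the
// result has all upper 32 bits set.
template <typename T>
inline uint64_t widen_directly(T x) {
  return static_cast<uint64_t>(x);
}

// Convert to the same-width unsigned type first, then widen.
// The upper bits of the 64-bit result stay zero.
template <typename T>
inline uint64_t widen_via_make_unsigned(T x) {
  using U = typename std::make_unsigned<T>::type;
  return static_cast<uint64_t>(static_cast<U>(x));
}
```

For `int32_t(-1)`, the first function returns 0xFFFFFFFFFFFFFFFF and the second returns 0x00000000FFFFFFFF, so a log2 computed on the widened value would differ (63 versus 31).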


@kimbarrett kimbarrett commented Dec 8, 2020

Naming is hard, but I think the following scheme is reasonable:

* log2i: any integral type. 0-hostile

Not yet a review, but the "usual" name is "ilog2". Do a web search for that and you'll find lots of relevant hits. I like the short name having the non-zero precondition. Whether the longer version that tests for zero carries its weight is a question.


@mlbridge mlbridge bot commented Dec 8, 2020

Mailing list message from Kim Barrett on shenandoah-dev:

On Dec 7, 2020, at 8:18 PM, Kim Barrett <kbarrett at openjdk.java.net> wrote:

On Mon, 7 Dec 2020 12:00:48 GMT, Claes Redestad <redestad at openjdk.org> wrote:

Naming is hard, but I think the following scheme is reasonable:

* log2i: any integral type. 0-hostile

Not yet a review, but the "usual" name is "ilog2". Do a web search for that and you'll find lots of relevant hits. I like the short name having the non-zero precondition. Whether the longer version that tests for zero carries its weight is a question.

I think the approach to dealing with negative values should be reconsidered.

------------------------------------------------------------------------------
src/hotspot/share/utilities/powerOfTwo.hpp
49 // Log2 of any integral value, i.e., largest i such that 2^i <= x
50 // Precondition: x != 0
51 // For negative values this will return 63 for 64-bit types, 31 for
52 // 32-bit types, and so on.

I think the behavior for negative values here is wrong. The precondition
should be x > 0. That flows through into the implementation. This also
affects the design around the proposed _allow_zero function.

------------------------------------------------------------------------------
src/hotspot/share/utilities/powerOfTwo.hpp
80 inline int exact_log2(intptr_t value) {
81 return exact_log2i(value);
82 }

This is widening the domain to include negative values, which were
previously excluded since it had is_power_of_2 as a precondition, and that
function is false for negative values. I think the old behavior is correct
and the change is not.

------------------------------------------------------------------------------


@mlbridge mlbridge bot commented Dec 8, 2020

Mailing list message from Claes Redestad on shenandoah-dev:

Hi Kim,

On 2020-12-08 02:43, Kim Barrett wrote:

On Dec 7, 2020, at 8:18 PM, Kim Barrett <kbarrett at openjdk.java.net> wrote:

On Mon, 7 Dec 2020 12:00:48 GMT, Claes Redestad <redestad at openjdk.org> wrote:

Naming is hard, but I think the following scheme is reasonable:

* log2i: any integral type. 0-hostile

Not yet a review, but the "usual" name is "ilog2". Do a web search for that and you'll find lots of relevant hits. I like the short name having the non-zero precondition. Whether the longer version that tests for zero carries its weight is a question.

I think the approach to dealing with negative values should be reconsidered.

I kind of agree, but...

------------------------------------------------------------------------------
src/hotspot/share/utilities/powerOfTwo.hpp
49 // Log2 of any integral value, i.e., largest i such that 2^i <= x
50 // Precondition: x != 0
51 // For negative values this will return 63 for 64-bit types, 31 for
52 // 32-bit types, and so on.

I think the behavior for negative values here is wrong. The precondition
should be x > 0. That flows through into the implementation. This also
affects the design around the proposed _allow_zero function.

------------------------------------------------------------------------------
src/hotspot/share/utilities/powerOfTwo.hpp
80 inline int exact_log2(intptr_t value) {
81 return exact_log2i(value);
82 }

This is widening the domain to include negative values, which were
previously excluded since it had is_power_of_2 as a precondition, and that
function is false for negative values. I think the old behavior is correct
and the change is not.

------------------------------------------------------------------------------

I think you misread slightly: the behavior of the pre-existing code *is*
to similarly treat signed values as unsigned wrt checking power_of_2:

inline int exact_log2(intptr_t x) {
  assert(is_power_of_2((uintptr_t)x), "x must be a power of 2: " INTPTR_FORMAT, x);
  ...

inline int exact_log2_long(jlong x) {
  assert(is_power_of_2((julong)x), "x must be a power of 2: " JLONG_FORMAT, x);

exact_log2i does an equivalent check. So unless I'm misreading the
context I'm _preserving_ this behavior.
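
The difference between the two readings can be sketched like this. Neither function is the HotSpot is_power_of_2; both names are hypothetical and only illustrate how the cast to unsigned changes which inputs pass the check.

```cpp
#include <cstdint>
#include <limits>

// Check on the unsigned bit pattern: exactly one bit set.
inline bool is_power_of_2_unsigned_sketch(uintptr_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// Check on the signed value: non-positive inputs are rejected outright.
inline bool is_power_of_2_signed_sketch(intptr_t x) {
  if (x <= 0) return false;
  const uintptr_t u = static_cast<uintptr_t>(x);
  return (u & (u - 1)) == 0;
}

// The most negative intptr_t has the bit pattern 0x8000...0.
inline intptr_t min_intptr() {
  return std::numeric_limits<intptr_t>::min();
}
```

`is_power_of_2_unsigned_sketch(static_cast<uintptr_t>(min_intptr()))` is true, while `is_power_of_2_signed_sketch(min_intptr())` is false; for ordinary positive powers of two the two checks agree.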

I agree we could opt for a stricter precondition in the new method
(exact_ilog2?), while retrofitting exact_log2/exact_log2_long to be
backwards compatible w.r.t. accepting signed values that turn into
0x8000... when cast to unsigned. I think we should then follow-up
and remove exact_log2/-_long.

WDYT?

/Claes


@mlbridge mlbridge bot commented Dec 8, 2020

Mailing list message from Kim Barrett on shenandoah-dev:

On Dec 8, 2020, at 7:49 AM, Claes Redestad <claes.redestad at oracle.com> wrote:

Hi Kim,

On 2020-12-08 02:43, Kim Barrett wrote:

I think you misread slightly: the behavior of the pre-existing code *is*
to similarly treat signed values as unsigned wrt checking power_of_2:

inline int exact_log2(intptr_t x) {
  assert(is_power_of_2((uintptr_t)x), "x must be a power of 2: " INTPTR_FORMAT, x);
  ...

inline int exact_log2_long(jlong x) {
  assert(is_power_of_2((julong)x), "x must be a power of 2: " JLONG_FORMAT, x);

exact_log2i does an equivalent check. So unless I'm misreading the
context I'm _preserving_ this behavior.

You are right, I missed those pesky 'u's in those pesky casts. Yuck!

I agree we could opt for a stricter precondition in the new method
(exact_ilog2?), while retrofitting exact_log2/exact_log2_long to be
backwards compatible w.r.t. accepting signed values that turn into
0x8000... when cast to unsigned. I think we should then follow-up
and remove exact_log2/-_long.

WDYT?

That seems like a good plan to me. Similarly for ilog2.

I think with that approach you don't need the _allow_zero form either.
It's just a precondition of ilog2 that the argument is > 0.

Also, exact_ilog2 should use count_trailing_zeros.
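
A minimal sketch of the plan discussed above, assuming GCC/Clang builtins in place of HotSpot's count_leading_zeros and count_trailing_zeros. The `_sketch` names are hypothetical; the point is only the stricter x > 0 precondition and the use of trailing zeros for the exact variant.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Floor of log2(x); precondition: x > 0, so no zero or negative handling.
template <typename T>
inline int ilog2_sketch(T x) {
  assert(x > 0);
  using U = typename std::make_unsigned<T>::type;
  return 63 - __builtin_clzll(static_cast<unsigned long long>(static_cast<U>(x)));
}

// log2 of an exact power of two; preconditions: x > 0 and one bit set.
// For a single set bit, the number of trailing zeros IS the logarithm.
template <typename T>
inline int exact_ilog2_sketch(T x) {
  assert(x > 0);
  using U = typename std::make_unsigned<T>::type;
  const U u = static_cast<U>(x);
  assert((u & (u - 1)) == 0);
  return __builtin_ctzll(static_cast<unsigned long long>(u));
}
```

Because x > 0 is required, a signed minimum value such as 0x8000... is rejected by the assertion instead of being reported as a power of two.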


@openjdk openjdk bot commented Jan 4, 2021

@cl4es The command integrate can only be used in open pull requests.

@cl4es cl4es deleted the log2_template branch Jan 4, 2021
6 participants