8221404: C2: Convert RegMask and IndexSet to use uintptr_t #1102
Looks good in general.
You may want to compare RA times from -XX:+LogCompilation to see clear difference.
src/hotspot/share/opto/regmask.cpp
@@ -83,9 +81,10 @@ int RegMask::num_registers(uint ireg) {
case Op_VecA:
assert(Matcher::supports_scalable_vector(), "does not support scalable vector");
return SlotsPerVecA;
default:
// Op_VecS and the rest ideal registers.
Add assert to make sure we see only expected values here.
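One way to read this suggestion is a default branch that still returns a single slot but asserts that the value is one of the kinds expected to land there. The sketch below is not the HotSpot code: the enum, the constants, and the function name are hypothetical stand-ins, and the slot counts are assumed values chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical subset of ideal register kinds (not the real HotSpot list).
enum IdealRegSketch { Reg_Int, Reg_Long, Vec_S, Vec_D, Vec_X };

// Assumed slot counts, for illustration only.
constexpr int kSlotsPerLong = 2;
constexpr int kSlotsPerVecD = 2;
constexpr int kSlotsPerVecX = 4;

// Returns the number of register slots a value of the given kind occupies.
inline int num_registers_sketch(IdealRegSketch ireg) {
  switch (ireg) {
    case Reg_Long: return kSlotsPerLong;
    case Vec_D:    return kSlotsPerVecD;
    case Vec_X:    return kSlotsPerVecX;
    default:
      // Only the single-slot kinds are expected to reach the default branch;
      // anything else indicates a case that was forgotten above.
      assert((ireg == Reg_Int || ireg == Vec_S) && "unexpected ideal register kind");
      return 1;
  }
}
```

With the assert in place, a newly added multi-slot kind that nobody wired into the switch fails loudly in debug builds instead of silently being treated as one slot.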
src/hotspot/share/opto/indexSet.hpp
@@ -102,17 +102,17 @@ class IndexSet : public ResourceObj {
// All of BitBlocks fields and methods are declared private. We limit
// access to IndexSet and IndexSetIterator.

-// A BitBlock is composed of some number of 32 bit words. When a BitBlock
+// A BitBlock is composed of some number of 64 bit words. When a BitBlock
64- or 32-bit words
Using -XX:+CITime to get a breakdown of a sample run of Regalloc times for largeMethod_repeat_c2, baseline:
Patch:
Timings appear pretty stable from run-to-run. No significant change in other phases.
Looks good.
// Each block consists of 256 bits
block_index_length = 8,
// Split over 4 or 8 words depending on bitness
word_index_length = block_index_length - LogBitsPerWord,
Nice. I also thought about using ‘word’ definitions.
Thanks! @vidmik pushed me to come up with derived definitions here rather than adding another magic constant for 64-bit.
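To illustrate the idea of deriving the word layout from the word size instead of hard-coding a second constant, here is a minimal standalone sketch. It is not the IndexSet code: `bits_per_word`, `log_bits_per_word`, `log2_exact`, and `BlockSketch` are hypothetical names, with `log_bits_per_word` playing the role that `LogBitsPerWord` has in the diff.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Word size in bits: 32 or 64 depending on the target (assumed stand-in).
constexpr unsigned bits_per_word = sizeof(uintptr_t) * CHAR_BIT;

// Integer log2 for exact powers of two, evaluated at compile time.
constexpr unsigned log2_exact(unsigned n) { return n <= 1 ? 0 : 1 + log2_exact(n / 2); }
constexpr unsigned log_bits_per_word = log2_exact(bits_per_word);  // 5 or 6

// Each block consists of 256 bits...
constexpr unsigned block_index_length = 8;
// ...split over 8 or 4 words, derived rather than spelled out per bitness.
constexpr unsigned word_index_length = block_index_length - log_bits_per_word;
constexpr unsigned bits_per_block  = 1u << block_index_length;
constexpr unsigned words_per_block = 1u << word_index_length;

static_assert(words_per_block * bits_per_word == bits_per_block,
              "a block must hold exactly 256 bits on any word size");

// A tiny block that stores 256 membership bits in word-sized chunks.
struct BlockSketch {
  uintptr_t words[words_per_block] = {};

  void insert(unsigned element) {
    assert(element < bits_per_block && "element out of range");
    words[element >> log_bits_per_word] |= uintptr_t(1) << (element & (bits_per_word - 1));
  }

  bool member(unsigned element) const {
    assert(element < bits_per_block && "element out of range");
    return (words[element >> log_bits_per_word] >> (element & (bits_per_word - 1))) & 1;
  }
};
```

Because every size is computed from `bits_per_word`, the same source yields eight 32-bit words or four 64-bit words per block, and the `static_assert` catches any mismatch at compile time.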
@cl4es This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been no new commits pushed to the ➡️ To integrate this PR with the above commit message to the
Looks good to me.
/integrate
This patch refactors RegMask and IndexSet to use uintptr_t rather than int for storage, which may shorten some code paths and loops on 64-bit VMs. Making storage unsigned further allows for a few simplifications, e.g. in is_bound_set, where there was logic to deal with sign extension that can no longer happen.
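As a rough illustration of why unsigned storage simplifies bit tests, the sketch below checks whether the set bits of a word form one contiguous run. This is not the is_bound_set implementation; `is_single_run` and `top_bits` are hypothetical helpers. The point is that on `uintptr_t` the addition may wrap and the right shift fills with zeros, both with well-defined results, so no special handling of the top bit is needed.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// True if the set bits of 'word' form exactly one contiguous run.
inline bool is_single_run(uintptr_t word) {
  if (word == 0) return false;
  uintptr_t lowest  = word & (~word + 1);  // isolate the lowest set bit
  uintptr_t carried = word + lowest;       // wraps to 0 if the run reaches the top bit
  // Adding the lowest bit clears a contiguous run entirely; any bit that
  // survives in both values belongs to a second, separate run.
  return (carried & word) == 0;
}

// Returns the top 'n' bits of 'word', for 1 <= n <= word size in bits.
// On an unsigned type the vacated high bits are zero-filled, so the result
// never picks up copies of the top bit.
inline uintptr_t top_bits(uintptr_t word, unsigned n) {
  const unsigned bits = sizeof(uintptr_t) * CHAR_BIT;
  assert(n >= 1 && n <= bits);
  return word >> (bits - n);
}
```

With a signed word, the same addition could overflow (undefined behavior) and the shift could replicate the sign bit, which is the kind of case extra logic would have to guard against.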
To evaluate performance impact I created the included JMH microbenchmark which uses the RepeatCompilation command to repeat the compilation of a few methods: One trivial (trivialMath), one "regular" (mixHashCode), and one largish (largeMethod..) with a lot of locals. These are designed to put no stress, some stress and quite a bit of stress on register allocation:
Baseline:
Patched:
This shows that there's no significant change on trivialMath, while mixHashCode sees a small improvement (~2%) and largeMethod sees a larger improvement (~4-5%) on C2 and Tiered tests with compiler repetition.
Testing: tier 1-7 on all Oracle platforms, local testing and verification of linux-x86.
Download
$ git fetch https://git.openjdk.java.net/jdk pull/1102/head:pull/1102
$ git checkout pull/1102