8273459: Update code segment alignment to 64 bytes #5547
Conversation
Change align(x) to be relative to the current section's end rather than its size. Issue was uncovered with a raw align(64).
@asgibbons The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.
Webrevs
src/hotspot/cpu/x86/globals_x86.hpp (outdated)
@@ -44,7 +44,7 @@ define_pd_global(uintx, CodeCacheSegmentSize, 64 COMPILER1_AND_COMPILER2_PRES
 // the uep and the vep doesn't get real alignment but just slops on by
 // only assured that the entry instruction meets the 5 byte size requirement.
 #if COMPILER2_OR_JVMCI
-define_pd_global(intx, CodeEntryAlignment, 32);
+define_pd_global(intx, CodeEntryAlignment, 64);
How about setting a dynamic alignment based on MaxVectorSize, i.e. matching the alignment to MaxVectorSize? That way we avoid internal fragmentation on targets below AVX-512 (e.g. AVX2).
It could be done during VM initialization, similar to the following code:
https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/vm_version_x86.cpp#L1439
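A sketch of what such VM-initialization logic might look like (the placement and guard are my assumptions; FLAG_IS_DEFAULT, FLAG_SET_DEFAULT, MAX2 and MaxVectorSize are existing HotSpot names):

```cpp
// Hypothetical sketch, e.g. in VM_Version::get_processor_features():
// scale the default CodeEntryAlignment with the vector width so only
// AVX-512 targets (MaxVectorSize == 64) pay for 64-byte alignment.
if (FLAG_IS_DEFAULT(CodeEntryAlignment)) {
  FLAG_SET_DEFAULT(CodeEntryAlignment, MAX2((intx)MaxVectorSize, (intx)32));
}
```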
.gitignore (outdated)
@@ -16,5 +16,5 @@ NashornProfile.txt
 **/JTreport/**
 **/JTwork/**
 /src/utils/LogCompilation/target/
-/.project/
+**.project**
Changes to .gitignore should probably be made separately.
Yes. This change should be removed from the patch.
I share Dean's concern from the earlier discussion: CodeEntryAlignment is used in a lot of places, and we have to be careful about changes to it.
I found only 7 cases with align(64):
src/hotspot/cpu/x86//stubGenerator_x86_64.cpp: __ align(64);
src/hotspot/cpu/x86//stubGenerator_x86_64.cpp: __ align(64);
src/hotspot/cpu/x86//stubGenerator_x86_64.cpp: __ align(64);
src/hotspot/cpu/x86//stubGenerator_x86_64.cpp: __ align(64);
src/hotspot/cpu/x86//stubGenerator_x86_32.cpp: __ align(64);
src/hotspot/cpu/x86//stubGenerator_x86_32.cpp: __ align(64);
src/hotspot/cpu/x86//stubGenerator_x86_32.cpp: __ align(64);
It does not justify such a general change.
My suggestion would be to add a new align64() method that calls pc(), as you suggested in the original proposal. The new function may also use ...
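For reference, a minimal sketch of what such an align64() could look like (assuming a two-argument align(modulus, target) helper that pads with NOPs until target is a multiple of modulus):

```cpp
// Sketch: align the actual PC to an absolute 64-byte boundary, instead of
// aligning the section-relative offset the way plain align(64) does.
void MacroAssembler::align64() {
  align(64, (uint)(uintptr_t)pc());
}

// Assumed helper: emit NOPs until `target` reaches a multiple of `modulus`.
void MacroAssembler::align(int modulus, int target) {
  if (target % modulus != 0) {
    nop(modulus - (target % modulus));
  }
}
```

Because pc() is an absolute address, this works regardless of how the enclosing section happens to be aligned.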
I think I have not made the point clearly enough. The align() macro aligns relative to the start of the code section, not to an absolute address, so its result depends on how the section itself happens to be aligned. I believe this is entirely independent of CodeEntryAlignment. I think the bottom line is that align(64) cannot guarantee a true 64-byte boundary unless the section base is itself 64-byte aligned. Perhaps the appropriate thing to do is to put an assert in align() for requests that cannot be ensured. IMHO, the "right" thing to do is to mark the bytes requiring address alignment and handle the cases on copy. This would add significant complexity, however.
The following code suggests that instruction start addresses always honor CodeEntryAlignment. The current value of CodeEntryAlignment is 32; a 32-byte-aligned address is inherently 8- and 16-byte aligned, but not vice versa. Setting this value to 64 would cover the 8-, 16-, 32-, and 64-byte alignment constraints in stubGenerator_x86_64.cpp. However, there are several locations in the stub generators using __ align(CodeEntryAlignment), which would suddenly also get 64-byte alignment and create internal fragmentation; and, as Vladimir pointed out, only a handful of locations actually need 64-byte alignment. My suggestion was that, if you attempt this, its scope should be limited to AVX-512, hence the MaxVectorSize suggestion: the alignment problem will mostly surface for vector instructions, and MaxVectorSize can be set to 32 even on AVX-512 targets, so it is a robust indicator of the vector size and the associated alignment constraints.
It sounds like this new alignment requirement will only be needed for stubs (initially?), but as proposed it will affect all other types of CodeBlobs. Just looking at the effect during startup, I saw padding for BufferBlobs go from 24 to 56, RuntimeBlobs go from 0 to 32 and from 16 to 48, and nmethods go from 24 to 56. I would like to suggest again to use the actual alignment requirements of the CodeBuffer to determine the alignment of the CodeBlob.

I agree.

I disagree. Let's not mark individual bytes. The call to align() is enough to allow us to record the maximum alignment required by the CodeBuffer, and the added complexity is not at the individual instruction copy, but just in choosing the correct alignment value when creating the CodeBlob. For example, use MAX2(codebuffer->required_alignment(), CodeEntryAlignment) in place of CodeEntryAlignment. And for my own curiosity, I would like to hear from Intel what the expected effect on icache performance is from increasing the alignment of code.
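A sketch of that suggestion (required_alignment() and the bookkeeping are hypothetical; CodeBuffer, MAX2 and CodeEntryAlignment are existing names):

```cpp
// Hypothetical bookkeeping: each align(n) call records the strictest
// alignment the generated code has asked for.
void CodeBuffer::note_alignment(int modulus) {
  _required_alignment = MAX2(_required_alignment, modulus); // hypothetical field
}

// When creating the CodeBlob from the buffer, honor whichever is stricter:
int blob_alignment = MAX2(cb->required_alignment(), (int)CodeEntryAlignment);
```

This way only blobs that actually contain 64-byte-aligned code or data would pay the extra padding.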
@asgibbons To me, Vladimir Kozlov's suggestion of adding an align64() method that calls pc(), as you originally proposed, looks the best. It meets our purpose and is limited in scope.
I am back from vacation! I want to point out that when we generate code for these stubs we don't move them in the CodeCache (in contrast to compiled methods): https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L268

Based on that, using pc() for stub alignment is safe.
Reverted.
Hi @asgibbons. For stubs, a new runtime constant StubCodeEntryAlignment could be added which determines the start-address alignment of stub BufferBlobs. This would limit the impact of the changes and also prevent any unwanted fragmentation issues. Your current patch with the new align64 macro looks good; it also restricts the scope of the changes.
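Such a constant could be declared in the style of the existing globals.hpp flag tables (entirely hypothetical; this flag was not adopted in the final patch):

```cpp
  // Hypothetical flag, alongside CodeEntryAlignment in the flag table:
  product(intx, StubCodeEntryAlignment, 64,                                 \
          "Start address alignment of stub code blobs (in bytes)")          \
```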
@@ -1170,6 +1170,11 @@ void MacroAssembler::addpd(XMMRegister dst, AddressLiteral src) {
   }
 }

+// See 8273459. Function for ensuring 64-byte alignment, intended for stubs only.
I suggest extending the comment to explain why it will not work for nmethods: because they are copied when published.
I would be very careful changing loop code alignment. You would introduce a lot of NOP instructions into the code to align it, and they will be executed. I am fine with experimenting (running SPEC benchmarks) with different values.

Maybe, but what benefit can you get from a different alignment for stub code?
Thanks for the comments. I reverted the .gitignore change and added comments as requested. I also found a couple of unmodified align(64) calls, which I changed. As suggested, I added an assert in align() to flag spots where alignment cannot be ensured.
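The assert presumably looks something along these lines (a sketch; the exact condition and message are assumptions):

```cpp
// Sketch: plain align() works on section-relative offsets, so it can only
// guarantee alignments up to the section's own base alignment; anything
// stricter must go through align64().
void MacroAssembler::align(int modulus) {
  assert(modulus <= (int)CodeEntryAlignment,
         "alignment cannot be guaranteed relative to the section base");
  align(modulus, offset());
}
```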
Good. Let me test it before you push.
It was an alternative to creating a separate align64 macro, in anticipation that we may introduce more stubs or code-section constants that perform better when aligned to a 64-byte boundary on wide-vector targets like AVX-512. The new flag would be specific to stubs, and thus restrictive in scope, but at the same time parameterizable and could take different values per target. As you pointed out, there are currently only a handful of places where guaranteed 64-byte alignment is desired, so the current approach looks good.
Looks good to me as well.
Tested tiers 1-3 on Linux. All pass.
@asgibbons This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:

8273459: Update code segment alignment to 64 bytes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 10 new commits pushed to the master branch. Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.

As you do not have Committer status in this project, an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@vnkozlov, @sviswa7), but any other Committer may sponsor as well.
/integrate
@asgibbons Your change is now ready to be sponsored by a Committer.
@vnkozlov @jatin-bhateja @dean-long - Thank you all for the comments and testing. Much appreciated.
/sponsor
Going to push as commit 53b25bc.
Your commit was automatically rebased without conflicts.
@sviswa7 @asgibbons Pushed as commit 53b25bc.
Minimal build fails after this PR. |
Change the default code entry alignment to 64 bytes from 32 bytes. This allows for maintaining proper 64-byte alignment of data within a code segment, which is required by several AVX-512 instructions.
I ran into this while implementing Base64 encoding and decoding. Code segments which were allocated with the address mod 32 == 0 but with the address mod 64 != 0 would cause the align() macro to misalign. This is because the align macro aligns to the size of the code segment and not the offset of the PC. So align(64) would align the PC to a multiple of 64 bytes from the start of the segment, and not to a pure 64-byte boundary as requested. Changing the alignment of the segment to 64 bytes fixes the issue.
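A concrete illustration of the failure mode, with made-up addresses (not taken from the JDK sources):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  // A section base of 0x1020 is 32-byte aligned (0x1020 % 32 == 0)
  // but not 64-byte aligned (0x1020 % 64 == 32).
  const uintptr_t section_start = 0x1020;
  const uintptr_t offset = 8;   // current emit position within the section

  // Old behavior: align(64) rounds the section-relative offset up to 64,
  // which lands on 0x1060 -- still 32 bytes off a true 64-byte boundary.
  uintptr_t pc_old = section_start + ((offset + 63) & ~uintptr_t(63));
  printf("offset-based: 0x%lx %% 64 = %lu\n",
         (unsigned long)pc_old, (unsigned long)(pc_old % 64)); // 32: misaligned

  // Fixed behavior: round the PC itself up, landing on 0x1040.
  uintptr_t pc_new = (section_start + offset + 63) & ~uintptr_t(63);
  printf("pc-based:     0x%lx %% 64 = %lu\n",
         (unsigned long)pc_new, (unsigned long)(pc_new % 64)); // 0: aligned
  return 0;
}
```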
I have not seen any measurable difference in either performance or memory usage with the tests I have run.
See this article for the discussion thread.
Progress
Issue
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5547/head:pull/5547
$ git checkout pull/5547
Update a local copy of the PR:
$ git checkout pull/5547
$ git pull https://git.openjdk.java.net/jdk pull/5547/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 5547
View PR using the GUI difftool:
$ git pr show -t 5547
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5547.diff