8321137: Relax ICStub alignment #16911
👋 Welcome back shade! A progress list of the required criteria for merging this PR into |
This is a good idea. I was working on something similar, making the logic smarter so code start could still align with |
Driveby comment to consider... We have a similar situation with static call stubs. But there we have an isb instruction in the static call stub, which is there from when an nmethod is compiled, and which deals with this exact problem. Now, that stub is used to transfer control to the interpreter, so we can do inefficient things. As for ICStubs, it would probably not be very pretty. So instead, as I see it, we have kind of relied on ICStubs being spaced out in memory enough that the instruction fetcher would just not have it cached since the ICStub finalization, which happens in a safepoint and is followed by cross modifying code fences on all threads. Because of this, I think I would sleep a bit worse at night, especially regarding AArch64, if these stubs start to get too cozy with each other in memory. I sleep better with the social distancing policy of today, where the stubs are aware there is a dangerous disease to be worried about from their friends and not to catch it. Having said that, I am currently removing ICStubs altogether and hope to integrate in the not too distant future. Just my 50 I$...
To address @fisk's concerns, how about we reduce the alignment of ICStub code_begin to cache line size, and reduce the alignment of the ICStub header to alignof(ICStub)? The remaining problem, allowing ICStub_from_destination_address() to correctly map from code_begin back to the header, is still solvable and is something I have been working on. The solution is to align code_begin first and compute the header location from that, instead of the other way around.
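A minimal sketch of the "align code_begin first, then derive the header" idea from the comment above. It is not the HotSpot implementation: StubHeader, kCacheLine, code_begin_for() and header_for() are hypothetical names, the header fields are assumed, and 64 bytes is only an assumed cache line size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the ICStub header. The real field layout is not
// shown in this conversation, so these fields are assumptions.
struct StubHeader {
  int32_t size;          // total stub size in bytes
  void*   ic_site;       // call site this stub belongs to
  void*   cached_value;  // value loaded by the stub code
};

// Assumed cache line size; the real value is platform-specific.
constexpr uintptr_t kCacheLine = 64;

// An aligned code_begin minus sizeof(StubHeader) is only a valid header
// address if the code alignment is a multiple of the header alignment.
static_assert(kCacheLine % alignof(StubHeader) == 0,
              "code alignment must cover header alignment");

// Round x up to a multiple of a (a must be a power of two).
constexpr uintptr_t align_up(uintptr_t x, uintptr_t a) {
  return (x + a - 1) & ~(a - 1);
}

// Step 1: pick code_begin first, aligned to the cache line, leaving room
// for the header in front of it.
constexpr uintptr_t code_begin_for(uintptr_t free_addr) {
  return align_up(free_addr + sizeof(StubHeader), kCacheLine);
}

// Step 2: the header sits directly below code_begin, so mapping a
// destination address back to its header is a fixed subtraction.
constexpr uintptr_t header_for(uintptr_t code_begin) {
  return code_begin - sizeof(StubHeader);
}
```

With this ordering, the reverse mapping does not need to know how much padding was inserted: the padding ends up in front of the header rather than between the header and the code.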
That sounds like a good idea, @dean-long
All right, keeping one ICStub per cache line then! I think the easier way to achieve this is to align the entire stub at cache line size, and then align the code section at a smaller alignment. This assumes instruction and data cache line sizes are the same and covered by default. This also still assumes that calling into the ICStub entry at a more lax alignment is fine. The new patch still reduces the ICStub size from 128 bytes to 64 bytes on AArch64. (If I have time, I would look into actually aligning code_begin first, as @dean-long suggested, but I expect it to be more intrusive.)
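The size effect of the two layouts can be modeled roughly as follows. This is a sketch under stated assumptions, not the actual HotSpot sizing code: the function names are hypothetical, the 24-byte header and the 16-byte inner code alignment are assumed values, and only the 24-byte code size, the 128-byte old size and the 64-byte new size come from this PR.

```cpp
#include <cassert>
#include <cstddef>

// Round x up to a multiple of a.
constexpr std::size_t round_up(std::size_t x, std::size_t a) {
  return (x + a - 1) / a * a;
}

// Model of "aligning twice": header and code are each padded to the
// code entry alignment.
constexpr std::size_t size_aligned_twice(std::size_t header, std::size_t code,
                                         std::size_t entry_align) {
  return round_up(header, entry_align) + round_up(code, entry_align);
}

// Model of "one stub per cache line": the code section gets a smaller
// alignment inside the stub, and only the whole stub is padded to the line.
constexpr std::size_t size_per_cache_line(std::size_t header, std::size_t code,
                                          std::size_t code_align,
                                          std::size_t cache_line) {
  return round_up(round_up(header, code_align) + code, cache_line);
}

// Assumed 24-byte header, 24 bytes of code, 64-byte alignment and line size.
static_assert(size_aligned_twice(24, 24, 64) == 128, "old layout");
static_assert(size_per_cache_line(24, 24, 16, 64) == 64, "new layout");
```

Under these assumptions the model reproduces the 128-byte and 64-byte figures quoted for AArch64; other header sizes or alignments would give different numbers.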
Grrr, this does not work, because Another attempt is align both stub and code section to |
This version looks good too.
Depends on #16973 to avoid footprint regression on x86_64.
This apparently morphed into "Let's allocate ICStub per cache line" instead of relaxing the alignment. The footprint improvement on the most important architectures would be a side effect of this. I am going to restart this work in a separate PR to cleanly capture this development.
Closed in favor of #17277. |
WIP, submitting for others to poke holes in it.
Similarly to JDK-8284578, we would like to handle ICStub alignment. Currently, the small stub that takes only 24 bytes of code is covered by 128 bytes on AArch64. This is due to the same thing fixed by JDK-8284578 for interpreter codelets: aligning twice the CodeEntryAlignment.
128 bytes per ICStub means we deplete 10K ICBuffer with only 79 stubs. This actually happens multiple times even on a simple HelloWorld.java invocation that invokes some javac code, causing ICBufferFull safepoints. We can increase ICBuffer size, especially after JDK-8314220, but we cannot do this without limits, since it eats up code cache.
But if we assume that code entry alignment is not a strict requirement, and used to improve performance for frequently used code, then maybe we do not have to over-align the IC stub, given it is probably only used during IC transitions? It would significantly improve ICStub footprint and require smaller ICBuffer.
Current patch affects ICStub size in different ways on different platforms, since current size is effectively 2x CodeEntryAlignment, and new size is cache line size:
Additional testing:
tier1 tier2 tier3
tier1 tier2 tier3
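As a rough check of the footprint numbers in the description, the sketch below computes an upper bound on how many stubs fit in a buffer. The function name max_stubs is hypothetical, and the model ignores any buffer alignment or bookkeeping overhead, which is presumably why the description reports 79 stubs rather than the 80 computed here.

```cpp
#include <cassert>
#include <cstddef>

// Upper bound on stubs per buffer, ignoring any per-buffer overhead.
constexpr std::size_t max_stubs(std::size_t buffer_bytes,
                                std::size_t stub_bytes) {
  return buffer_bytes / stub_bytes;
}

// 10K ICBuffer with 128-byte stubs versus 64-byte stubs.
static_assert(max_stubs(10 * 1024, 128) == 80, "at most 80 stubs at 128 bytes");
static_assert(max_stubs(10 * 1024, 64) == 160, "at most 160 stubs at 64 bytes");
```

Halving the stub size roughly doubles the number of stubs that fit before an ICBufferFull safepoint is needed, for the same buffer size.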
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16911/head:pull/16911
$ git checkout pull/16911
Update a local copy of the PR:
$ git checkout pull/16911
$ git pull https://git.openjdk.org/jdk.git pull/16911/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16911
View PR using the GUI difftool:
$ git pr show -t 16911
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16911.diff