Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8290025: Remove the Sweeper #9741

Closed
wants to merge 13 commits into from
Closed

Conversation

fisk
Copy link
Contributor

@fisk fisk commented Aug 4, 2022

When the world was still young, the sweeper was built to unload bad smelling nmethods. While it has been going through various revisions, the GCs got support for class unloading, and the need for the GCs to get rid of nmethods with a different unpleasant scent.

The two systems would now compete for unloading nmethods, and the responsibility of throwing away nmethods would blur. The sweeper was still good at throwing away nmethods faster as it only needs to scan stacks, and not do a full GC.

With the advent of Loom, the situation has gotten even worse. The stacks are now also in the Java heap. The sweeper is unable to throw away nmethods without the liveness analysis of a full GC, which also performs code cache unloading, but isn't allowed to actually delete nmethods due to races with the sweeper. In a way we have the worst of both worlds, where both the sweeper and GC are crippled, unable to unload nmethods without the help of the other. And there are a very large number of complicated races that the JVM needs to deal with, especially with concurrent code cache unloading not interfering with concurrent sweeping. And concurrent sweeping not interfering with the application.

The sweeper cycle exposes 2 flavours of nmethods that are "dead" to the system. So whenever nmethods are used, we have to know they are not dead. But we typically don't have the tools to really know they are not dead. For example, one might think grabbing the CodeCache_lock and using an iterator that only walks is_alive() nmethods would help make sure you don't get dead nmethods in your iterator. However, that is not the case, because the CodeCache_lock can't be held across the entire zombie transition due to "reasons" that are not trivial to actually change. Because of this, code has to deal with nmethods flipping around randomly to a dead state.

I propose to get out of this sad situation, by removing the sweeper. If we need a full GC anyway to remove nmethods, we might as well let the GC do everything. This removes the notion of is_zombie(), is_unloaded() and hence is_alive() from the JVM. It also removes the notion of the orthogonal but related nmethodLocker to keep nmethods around, without preventing them from dying. We instead throw away nmethods the way we throw away pretty much anything else in the unloading GC code:

  1. Unlink
  2. Global sync
  3. Throw away
  4. Profit!
    This way, if you get a reference to an nmethod, it won't go away until the next safepoint poll, and will not flip around liveness due to concurrent transitions.

In the new model, we use nmethod entry barriers to keep track of the last time an nmethod was on-stack. This is then used to 1) prove that not_entrant nmethods that haven't been on-stack for an entire GC can be removed, and 2) heuristically remove nmethods that have never been called for N full GCs, where N is calculated based on code cache allocation rate, GC frequency, remaining free memory until "trouble", etc. Similar to metaspace, there is also some threshold GC trigger to start GC when the code cache is filling up, and nothing else is triggering full GCs. The threshold gets smaller as we approach a point of being uncomfortably close to code cache exhaustion. Past said point, we GC very aggressively, and you probably want a larger code cache.

I have tested this in mach5 tier1-7, I have run through perf aurora with no regressions, and also run an "internal large application" to see how it scales, also with no regressions. Since testing tier1-7 a few small tweaks have been made so I am running some extra testing.

I have tried to be as compatible as possible to previous sweeping related JVM flags, arguing that nothing in the flags implies whether the implementation is using a GC or a separate sweeper thread. However, the UseCodeAging flag I have obsoleted, as UseCodeCacheFlushing is the flag for deciding cold nmethods should be removed, and with the new mechanism for doing that, there is no need for UseCodeAging flag as well.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9741/head:pull/9741
$ git checkout pull/9741

Update a local copy of the PR:
$ git checkout pull/9741
$ git pull https://git.openjdk.org/jdk pull/9741/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 9741

View PR using the GUI difftool:
$ git pr show -t 9741

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9741.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 4, 2022

👋 Welcome back eosterlund! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

/label add hotspot

@openjdk openjdk bot added rfr Pull request is ready for review hotspot hotspot-dev@openjdk.org labels Aug 4, 2022
@openjdk
Copy link

openjdk bot commented Aug 4, 2022

@fisk
The hotspot label was successfully added.

@mlbridge
Copy link

mlbridge bot commented Aug 4, 2022

stefank
stefank approved these changes Aug 4, 2022
Copy link
Member

@stefank stefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Reviewed most code, but left JVMCI, JVMTI, and tests, for others to review.

@openjdk
Copy link

openjdk bot commented Aug 4, 2022

@fisk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8290025: Remove the Sweeper

Reviewed-by: stefank, kvn, iveresov, coleenp, vlivanov, mdoerr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 14 new commits pushed to the master branch:

  • 5a20bc4: 8292715: Cleanup Problemlist
  • 7b81a9c: 8289764: gc/lock tests failed with "OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects"
  • 76ee549: 8292329: Enable CDS shared heap for zero builds
  • 14623c6: 8292739: Invalid legacy entries may be returned by Provider.getServices() call
  • 568be58: 8290469: Add new positioning options to PassFailJFrame test framework
  • 69448f9: 8292679: Simplify thread creation in gtest and port 2 tests to new way
  • 3c2289d: 8215916: The failure reason of an optional JAAS LoginModule is not logged
  • 71ab5c9: 8292816: GPL Classpath exception missing from assemblyprefix.h
  • c062397: 8292713: Unsafe.allocateInstance should be intrinsified without UseUnalignedAccesses
  • a45a4b9: 8292194: G1 nmethod entry barrier disarm value wraps around too early
  • ... and 4 more: https://git.openjdk.org/jdk/compare/ad2e0c4df045261c04b00bfa1faf5c21392edc58...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 4, 2022
@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

Thanks for the review, @stefank

@luhenry
Copy link
Member

luhenry commented Aug 4, 2022

Hi @fisk, thanks for the amazing cleanup! We've had a bunch of issues with the code sweeper and AsyncGetCallTrace with trying to unwind from dead code blobs (or what looked like it at least). Have you run some stress tests for AsyncGetCallTrace and this change? It may be something we can look into to make sure it doesn't increase the chances of crashes, or even reduces the chances of crashes. CC @jbachorik @parttimenerd

@bulasevich
Copy link
Contributor

Hi Erik,

The change breaks the ARM32 port as the nmethod entry barriers are not implemented there yet. We need a way to work without nmethod entry barriers for the ARM32 platform.

# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/barrierSetNMethod_arm.cpp:38
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/boris/jdk-fisk/src/hotspot/cpu/arm/gc/shared/barrierSetNMethod_arm.cpp:38), pid=10206, tid=10207
#  Error: ShouldNotReachHere()
#
# JRE version: OpenJDK Runtime Environment (20.0) (fastdebug build 20-internal-adhoc.boris.jdk-fisk)
# Java VM: OpenJDK Server VM (fastdebug 20-internal-adhoc.boris.jdk-fisk, mixed mode, g1 gc, linux-arm)
# Problematic frame:
# V  [libjvm.so+0x2bd610]  BarrierSetNMethod::is_armed(nmethod*)+0x2c

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

Hi @fisk, thanks for the amazing cleanup! We've had a bunch of issues with the code sweeper and AsyncGetCallTrace with trying to unwind from dead code blobs (or what looked like it at least). Have you run some stress tests for AsyncGetCallTrace and this change? It may be something we can look into to make sure it doesn't increase the chances of crashes, or even reduces the chances of crashes. CC @jbachorik @parttimenerd

I would certainly appreciate your help in stress testing that.

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

Hi Erik,

The change breaks the ARM32 port as the nmethod entry barriers are not implemented there yet. We need a way to work without nmethod entry barriers for the ARM32 platform.


# To suppress the following error report, specify this argument

# after -XX: or in .hotspotrc:  SuppressErrorAt=/barrierSetNMethod_arm.cpp:38

#

# A fatal error has been detected by the Java Runtime Environment:

#

#  Internal Error (/home/boris/jdk-fisk/src/hotspot/cpu/arm/gc/shared/barrierSetNMethod_arm.cpp:38), pid=10206, tid=10207

#  Error: ShouldNotReachHere()

#

# JRE version: OpenJDK Runtime Environment (20.0) (fastdebug build 20-internal-adhoc.boris.jdk-fisk)

# Java VM: OpenJDK Server VM (fastdebug 20-internal-adhoc.boris.jdk-fisk, mixed mode, g1 gc, linux-arm)

# Problematic frame:

# V  [libjvm.so+0x2bd610]  BarrierSetNMethod::is_armed(nmethod*)+0x2c

Is there a plan for the arm32 port to support nmethod entry barriers? I would really appreciate if the solution to this problem is that arm32 implements nmethod entry barriers. You only need to support the STW data and code patching variation, and it should do pretty much exactly what said AArch64 port does. Arm32 is the only platform I know of that doesn't support nmethod entry barriers, and IMO the JVM should be able to assume this is a feature that we can rely on.

@bulasevich
Copy link
Contributor

Erik,

A time ago (not the current code base under review) I noticed that your code sweeper algorithm is more aggressive.
Capturing data from output of the -XX:+PrintMethodFlushing option (since the option does not exist in your code,
I've added the printout of CodeCache::unallocated_capacity), I build a CodeCache capacity picture
while the application run (jmh renaissance benchmark with -f 0 option). Before the changes, free space was steadily declining.
With your change, cold methods were quickly cleaned up despite the fact that the cache is less than half full.

(I try to insert pictures here: before and after the change)
before
remove-the-sweeper

My concern is that there may be performance degradation in cases where the application periodically changes activity while running.

thanks,
Boris

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

Erik,

A time ago (not the current code base under review) I noticed that your code sweeper algorithm is more aggressive.

Capturing data from output of the -XX:+PrintMethodFlushing option (since the option does not exist in your code,

I've added the printout of CodeCache::unallocated_capacity), I build a CodeCache capacity picture

while the application run (jmh renaissance benchmark with -f 0 option). Before the changes, free space was steadily declining.

With your change, cold methods were quickly cleaned up despite the fact that the cache is less than half full.

(I try to insert pictures here: before and after the change)

before

remove-the-sweeper

My concern is that there may be performance degradation in cases where the application periodically changes activity while running.

thanks,

Boris

I would be interested in seeing what the graph looks like with my current proposed change. I don't know what state my prototype was in when you performed measurements, and the heuristics have been tweaked.

Having said that, if half the code cache was filled up, it sounds like it's getting to a point where you do want to flush things that haven't been used in a long while, to avoid getting into the "red zone" of aggressive sweeping at 90%, in order to free up space for currently hot code. To me it looks like it is working kind of as expected.

Was there a regression due to unloading perceivably cold code?

You can get more information about the heuristic decisions with -Xlog:codecache with my change.

@parttimenerd
Copy link
Contributor

@luhenry @fisk I'm going to stress test it with my jdk-profiling-tester on M1 and x86 over night (comparing it with the current master).

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

@luhenry @fisk I'm going to stress test it with my jdk-profiling-tester on M1 and x86 over night (comparing it with the current master).

Thank you!

@xmas92
Copy link
Member

xmas92 commented Aug 4, 2022

There are still 15 comments that refer to zombie in the hotspot/share/** codebase. Would be nice to nuke them all to avoid stale, confusing and misleading information.

There is also some printing and non product flags which needs an overhaul in what terminology to use, and with what meaning, in the world where the sweeper and zombie nmethods are gone.

The second part is just an observation of future work, but it would be nice to have this patch at least deliver with no stale zombie comments.

Details
  • Comments

    • src/hotspot/share/code/codeCache.cpp:1290
      • Remove? Or just call when unloaded/unloading.
    • src/hotspot/share/code/compiledIC.cpp:370
      • Rephrase, confusing.
    • src/hotspot/share/code/compiledMethod.cpp:313
      • Rephrase, confusing.
    • src/hotspot/share/code/compiledMethod.cpp:511
      • Remove?
    • src/hotspot/share/code/compiledMethod.cpp:515
      • Remove? Confusing.
    • src/hotspot/share/code/compiledMethod.cpp:596
      • Remove zombie? Also what is unloaded nmethod in this context.
    • src/hotspot/share/code/dependencyContext.cpp:71
      • Rephrase, confusing.
    • src/hotspot/share/code/nmethod.cpp:1166
      • Remove?
    • src/hotspot/share/code/nmethod.cpp:1171
      • Remove? Confusing.
    • src/hotspot/share/code/nmethod.cpp:1184
      • Remove? Confusing.
    • src/hotspot/share/code/nmethod.cpp:1422
      • Rephrase, confusing. Many zombie interactions/assumptions in one comment.
    • src/hotspot/share/code/nmethod.hpp:268
      • Remove. zombie, unloaded enums no longer exist
    • src/hotspot/share/compiler/compileTask.cpp:243
      • Rephrase or remove, confusing.
    • src/hotspot/share/oops/method.cpp:1168
      • Rephrase, confusing.
    • src/hotspot/share/prims/jvmtiImpl.cpp:1079
      • Rephrase, confusing. Also meaning of unloaded.
  • Not product flag

    • ZombieALot and ZombieALotInterval
      • Needs an overhaul with respects to what terminology to use and what it is suppose to represent
  • Printing

    • src/hotspot/share/code/codeCache.cpp:1643
    • src/hotspot/share/code/codeHeapState.cpp
      • Needs an overhaul with respects to what terminology to use and what it is suppose to represent

@bulasevich
Copy link
Contributor

I would be interested in seeing what the graph looks like with my current proposed change. I don't know what state my prototype was in when you performed measurements, and the heuristics have been tweaked.

Below is a CodeCache chart (numbers from -Xlog:codecache this time) on the current proposed change. App gets 250MB of CodeCache, and executes 23 independent benchmarks.

изображение

Was there a regression due to unloading perceivably cold code?

No, specific of my run is a set of independent benchmarks: warmup, iteration, warmup, iteration - it does not get regression unless the CodeHeap is completely over. Sweeper itself does not affect performance, I checked it. My concern is about user applications.

Having said that, if half the code cache was filled up, it sounds like it's getting to a point where you do want to flush things that haven't been used in a long while, to avoid getting into the "red zone" of aggressive sweeping at 90%, in order to free up space for currently hot code. To me it looks like it is working kind of as expected.

In my picture, we can see that the methods are swept out when the CodeCache is 70% free. For me it is not expected. Even a super-cold method can come back. We should not flush it when we are far from memory starvation.

@bulasevich
Copy link
Contributor

Is there a plan for the arm32 port to support nmethod entry barriers?

I created a JBS record for this task: JDK-8291302
I do not have time for this task now. If no one takes it, I hope to find time for this in the late autumn. For now, please, make a "short term solution where if nmethod aging is turned off, you can run without nmethod entry barriers".

thank you!

if (nmethod_needs_unregister) {
Universe::heap()->unregister_nmethod(this);
}
flush_dependencies(/*delete_immediately*/true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush_dependencies is now only ever called with false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I removed it, and some remove_dependent methods that were only used in the true version that is no longer called.

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

There are still 15 comments that refer to zombie in the hotspot/share/** codebase. Would be nice to nuke them all to avoid stale, confusing and misleading information.

There is also some printing and non product flags which needs an overhaul in what terminology to use, and with what meaning, in the world where the sweeper and zombie nmethods are gone.

The second part is just an observation of future work, but it would be nice to have this patch at least deliver with no stale zombie comments.

Details

I found all mentions of zombie I could find. Notably left ZombieALot as it has never created zombies, so the name was already wrong and misleading. It should arguably be called DeoptimizeALot or something... but that's obviously already taken. I'd prefer not to not discuss its naming in this PR as it was already wrong and there are enough things to consider as it is in here.

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

Is there a plan for the arm32 port to support nmethod entry barriers?

I created a JBS record for this task: JDK-8291302 I do not have time for this task now. If no one takes it, I hope to find time for this in the late autumn. For now, please, make a "short term solution where if nmethod aging is turned off, you can run without nmethod entry barriers".

thank you!

Okay. I put in a special ARM32 mode for you for now. It will work but it won't remove cold nmethods. I really hope we can remove the special mode soon and assume all platforms have nmethod entry barriers. I don't like having special modes.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial review of few code/ files. Will do more later.

I am not comfortable with name unloading for this process because historically, for me, it is associated with class unloading and "unloading" corresponding nmethods. cleaning is more similar with sweeping. But it could be only me.

return _cold_gc_count;
}

void CodeCache::on_allocation() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about CodeCache::gc_on_allocation()? Otherwise it is not clear what it does in places where it is called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// Notify code cache unloading that we are about to allocate, which may
// or may not require freeing up memory first.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confusing comment. As I understand it could be: Check if memory should be freed before allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// Calculate the number of GCs after which an nmethod is expected to have been
// used in order to not be classed as cold.
void CodeCache::update_cold_gc_count() {
if (!MethodFlushing || !UseCodeCacheFlushing || NmethodSweepActivity == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test these cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@@ -1,5 +1,5 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it Git hiccup with the file renaming?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, haha.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Git automagically computes renaming based on how similar the files are. I just moved the CompilationLog class, and removed the file it came from, so it thinks it's a "rename". One of the pitfalls of git.


NoSafepointVerifier nsv;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is window after releasing all previous locks and here when nmethod could become dead. I don't see post_compiled_method() checking for that.
Also I don't see such NoSafepointVerifier in JVMCIRuntime::register_method().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nmethod can only get unloaded at safepoint polls. There aren't any when unlocking. The reason the NSV isn't just crossing the entire scope, is because some locks like Compile_lock will block when we take the lock, just not when we unlock it.
Unfortunately I can't put a similar NSV in the JVMCI installer, because when you run JVMCI with native image, there is a setter that uses JNI, which will check for safepoints. However, that only happens if the nmethod failed to install, which doesn't really matter, but it does prevent the NSV from being in the same corresponding location there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am talking about concurrent change of nmethod's state. When you publish nmethod (set Method::_code) and it is not locked it could be marked non-entrant. It depends on how OS's scheduler run/suspend this compiler's thread.
But you are right, unloading only happens in SFP. So the code should be still there and it is still alright to initialize tasks' fields.

nmethod::post_compiled_method_load_event() has NoSafepointVerifier already. Only task's fields settings is not covered. Do you think we still need it here?

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

I would be interested in seeing what the graph looks like with my current proposed change. I don't know what state my prototype was in when you performed measurements, and the heuristics have been tweaked.

Below is a CodeCache chart (numbers from -Xlog:codecache this time) on the current proposed change. App gets 250MB of CodeCache, and executes 23 independent benchmarks.

изображение

Was there a regression due to unloading perceivably cold code?

No, specific of my run is a set of independent benchmarks: warmup, iteration, warmup, iteration - it does not get regression unless the CodeHeap is completely over. Sweeper itself does not affect performance, I checked it. My concern is about user applications.

Having said that, if half the code cache was filled up, it sounds like it's getting to a point where you do want to flush things that haven't been used in a long while, to avoid getting into the "red zone" of aggressive sweeping at 90%, in order to free up space for currently hot code. To me it looks like it is working kind of as expected.

In my picture, we can see that the methods are swept out when the CodeCache is 70% free. For me it is not expected. Even a super-cold method can come back. We should not flush it when we are far from memory starvation.

  1. It isn't obvious based on the chart that my new strategy unloads more nmethods due to being cold, than we did before. The sweeper before would make nmethods not_entrant in one cycle, but might need 4 cycles to actually be able to delete them. That means that the previous approach had much more floating garbage in the code cache, but that doesn't necessarily imply that the number of nuked nmethods were smaller. The new approach is much more effective and can do all the things necessary to detect cold nmethods and unload them from the code cache, in a single cycle, which immediately brings down the code cache size. It still might be that my new algorithm is more aggressive, but the data does not conclude that.
  2. The UseCodeCacheFlushing flag is on by default. Its assumption is that you want to "Remove cold/old nmethods from the code cache". Based on what you are saying, it sounds more like we shouldn't do that at all for some applications where you want super cold nmethods to still be around. I guess this flag can be turned off then, if you have such an application.
    Note though that if the application for example idles, then there is no allocation rate in the code cache, and my new heuristics won't remove anything, because it is based on how long until the code cache becomes "too full", given the current allocation rate (which is likely 0 when idling), while the old heuristics have been reported to reap the entire code cache in such situations.

@fisk
Copy link
Contributor Author

fisk commented Aug 4, 2022

Initial review of few code/ files. Will do more later.

I am not comfortable with name unloading for this process because historically, for me, it is associated with class unloading and "unloading" corresponding nmethods. cleaning is more similar with sweeping. But it could be only me.

Thanks for reviewing this change! I'm using the term "code cache unloading", because that is what we have always called unloading of nmethods triggered by the GC. And now it is the GC that owns this completely, so code cache unloading is the way in which nmethods are freed. At least that was my reasoning. I'd like to split the "unloading" parts of the CodeCache to a separate file, but I decided it's better to do that in a separate patch, as this patch is large enough and I don't want to move around code as well in it. Hope this makes sense.

@fisk
Copy link
Contributor Author

fisk commented Aug 22, 2022

Can you please add a null check? C1 patching stubs contain null Oops on PPC64.

--- a/src/hotspot/share/gc/shared/barrierSetNMethod.cpp
+++ b/src/hotspot/share/gc/shared/barrierSetNMethod.cpp
@@ -72,7 +72,9 @@ bool BarrierSetNMethod::nmethod_entry_barrier(nmethod* nm) {
       // conversion that performs the load barriers. This is too subtle, so we instead
       // perform an explicit keep alive call.
       oop obj = NativeAccess<ON_PHANTOM_OOP_REF | AS_NO_KEEPALIVE>::oop_load(p);
-      Universe::heap()->keep_alive(obj);
+      if (obj != nullptr) {
+        Universe::heap()->keep_alive(obj);
+      }
     }
 
     virtual void do_oop(narrowOop* p) { ShouldNotReachHere(); }

Or should we fix that separately? Seems to be related to JDK-8292077.

Ah. Good find. I can add the null check.

@TheRealMDoerr
Copy link
Contributor

Can you please add a null check? C1 patching stubs contain null Oops on PPC64.

--- a/src/hotspot/share/gc/shared/barrierSetNMethod.cpp
+++ b/src/hotspot/share/gc/shared/barrierSetNMethod.cpp
@@ -72,7 +72,9 @@ bool BarrierSetNMethod::nmethod_entry_barrier(nmethod* nm) {
       // conversion that performs the load barriers. This is too subtle, so we instead
       // perform an explicit keep alive call.
       oop obj = NativeAccess<ON_PHANTOM_OOP_REF | AS_NO_KEEPALIVE>::oop_load(p);
-      Universe::heap()->keep_alive(obj);
+      if (obj != nullptr) {
+        Universe::heap()->keep_alive(obj);
+      }
     }
 
     virtual void do_oop(narrowOop* p) { ShouldNotReachHere(); }

Or should we fix that separately? Seems to be related to JDK-8292077.

Ah. Good find. I can add the null check.

I just noticed that there is already a bug open: https://bugs.openjdk.org/browse/JDK-8292368
I think I should better fix it separately in case it will be needed by backports. Thanks for taking a look!

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #9968 for the missing null check.

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed the whole PR (there were already enough reviews), but I like the sweeper removal and what I've seen looks good. Impressive how strongly the sweeper is interwoven with the rest of hotspot! I'll be glad to see it go away.

Please note that there's a little more merging required after recent changes.

I'd like to rerun it in our nightly testing and discuss with Thomas. Heuristics fine tuning could still be done as follow-up, but we should try to avoid major drawbacks.

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated labels Aug 23, 2022
@openjdk openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Aug 24, 2022
@fisk
Copy link
Contributor Author

fisk commented Aug 24, 2022

I haven't reviewed the whole PR (there were already enough reviews), but I like the sweeper removal and what I've seen looks good. Impressive how strongly the sweeper is interwoven with the rest of hotspot! I'll be glad to see it go away.

Please note that there's a little more merging required after recent changes.

I'd like to rerun it in our nightly testing and discuss with Thomas. Heuristics fine tuning could still be done as follow-up, but we should try to avoid major drawbacks.

I fixed the merge conflict. Please let me know when you feel ready on your end.

@TheRealMDoerr
Copy link
Contributor

I think the term "full GC" has led to some confusion. You mean that we now need a complete marking cycle before we can flush any unused nmethods, right?
People often think about terrible stop-the-world GCs which tidy up the whole Java heap etc. when reading "full GC".

@fisk
Copy link
Contributor Author

fisk commented Aug 24, 2022

I think the term "full GC" has led to some confusion. You mean that we now need a complete marking cycle before we can flush any unused nmethods, right?

People often think about terrible stop-the-world GCs which tidy up the whole Java heap etc. when reading "full GC".

Yes that's right. By "full GC", I meant a GC that traverses live objects in the entire heap, as opposed to a subset of it. For me, whether that operation is performed STW, or concurrently doesn't make it more or less "full". But I realize that many people associate the term with a full [implied STW] GC.

@TheRealMDoerr
Copy link
Contributor

TheRealMDoerr commented Aug 24, 2022

Thanks for the confirmation. Our CI found one issue in a test with small code cache which was already reported by Thomas above:
"Thread ... died with java.lang.VirtualMachineError: Out of space in CodeCache for adapters"
Do we still free up unused adapters (Method->_adapter which are classified as CodeBlobType::NonNMethod) or could this be missing, now?

One more thing: You are using BarrierSetNMethod::arm with value regardless of Continuations::enabled(). Can you please add the missing PPC64 part?

--- a/src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp
+++ b/src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp
@@ -122,7 +122,12 @@ void BarrierSetNMethod::disarm(nmethod* nm) {
 }
 
 void BarrierSetNMethod::arm(nmethod* nm, int arm_value) {
-  Unimplemented();
+  if (!supports_entry_barrier(nm)) {
+    return;
+  }
+
+  NativeNMethodBarrier* barrier = get_nmethod_barrier(nm);
+  barrier->release_set_guard_value(arm_value);
 }
 
 bool BarrierSetNMethod::is_armed(nmethod* nm) {

@fisk
Copy link
Contributor Author

fisk commented Aug 25, 2022

+  if (!supports_entry_barrier(nm)) {
+    return;
+  }
+
+  NativeNMethodBarrier* barrier = get_nmethod_barrier(nm);
+  barrier->release_set_guard_value(arm_value);

I added the missing PPC code, thank you. As far as adapters go, I don't believe we have ever removed any. We have relied on caching/sharing of adapters of the same signature to keep the footprint down.

@TheRealMDoerr
Copy link
Contributor

Thanks for adding it! Looks good from our side. Our CI didn't find any other problems and we can still look into handling of adapters after this change.

@fisk
Copy link
Contributor Author

fisk commented Aug 25, 2022

/integrate

@fisk
Copy link
Contributor Author

fisk commented Aug 25, 2022

Thanks for adding it! Looks good from our side. Our CI didn't find any other problems and we can still look into handling of adapters after this change.

Thank you Martin!

@openjdk
Copy link

openjdk bot commented Aug 25, 2022

Going to push as commit 054c23f.
Since your change was applied there have been 15 commits pushed to the master branch:

  • dc7e256: 8290376: G1: Refactor G1MMUTracker::when_sec
  • 5a20bc4: 8292715: Cleanup Problemlist
  • 7b81a9c: 8289764: gc/lock tests failed with "OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects"
  • 76ee549: 8292329: Enable CDS shared heap for zero builds
  • 14623c6: 8292739: Invalid legacy entries may be returned by Provider.getServices() call
  • 568be58: 8290469: Add new positioning options to PassFailJFrame test framework
  • 69448f9: 8292679: Simplify thread creation in gtest and port 2 tests to new way
  • 3c2289d: 8215916: The failure reason of an optional JAAS LoginModule is not logged
  • 71ab5c9: 8292816: GPL Classpath exception missing from assemblyprefix.h
  • c062397: 8292713: Unsafe.allocateInstance should be intrinsified without UseUnalignedAccesses
  • ... and 5 more: https://git.openjdk.org/jdk/compare/ad2e0c4df045261c04b00bfa1faf5c21392edc58...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 25, 2022
@openjdk openjdk bot closed this Aug 25, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 25, 2022
@openjdk
Copy link

openjdk bot commented Aug 25, 2022

@fisk Pushed as commit 054c23f.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@TheRealMDoerr
Copy link
Contributor

Some more information about the "Out of space in CodeCache for adapters" issue: The test was running with less than 240MB ReservedCodeCacheSize. The VM doesn't enable SegmentedCodeCache in this case. Seems like the new implementation doesn't make sure we have enough space for adapters without SegmentedCodeCache, but we're not using this case in production. May still be a a good enhancement.

@fisk
Copy link
Contributor Author

fisk commented Aug 29, 2022

Some more information about the "Out of space in CodeCache for adapters" issue: The test was running with less than 240MB ReservedCodeCacheSize. The VM doesn't enable SegmentedCodeCache in this case. Seems like the new implementation doesn't make sure we have enough space for adapters without SegmentedCodeCache, but we're not using this case in production. May still be a a good enhancement.

Hmm. Good observation. Without segmented code cache, these issues are indeed more prevalent by nature. I'm not sure what the use case is for running without segmented code cache.

@TheRealMDoerr
Copy link
Contributor

I guess <240MB ReservedCodeCacheSize is relevant for people who run tiny Java apps in small containers and try to minimize all memory sizes. Some of them might observe this regression. We could reserve more space for adapters or we could try to enable SegmentedCodeCache for smaller code cache sizes as well.

@fisk
Copy link
Contributor Author

fisk commented Sep 1, 2022

I guess <240MB ReservedCodeCacheSize is relevant for people who run tiny Java apps in small containers and try to minimize all memory sizes. Some of them might observe this regression. We could reserve more space for adapters or we could try to enable SegmentedCodeCache for smaller code cache sizes as well.

Enabling SegmentedCodeCache for smaller code cache sizes sounds like a good idea.

@bulasevich
Copy link
Contributor

Enabling SegmentedCodeCache for smaller code cache sizes sounds like a good idea

Moving to SegmentedCodeCache for all platforms results in a performance regression for small platforms, but I remember there were plans (@eastig?) to halve the (ReservedCodeCacheSize >= 240*M) limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated