Skip to content

8317350: Move code cache purging out of CodeCache::UnloadingScope #16011

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

tschatzl
Copy link
Contributor

@tschatzl tschatzl commented Oct 2, 2023

Hi all,

please review this refactoring that moves actual code cache flushing/purging out of CodeCache::UnloadingScope. Reasons:

  • I prefer that a destructor does not do anything substantial - in some cases, 90% of time is spent in the destructor in that extracted method (due to https://bugs.openjdk.org/browse/JDK-8316959)
  • imho it does not fit the class which does nothing but sets/resets some code cache unloading behavior (probably should be renamed to UnloadingBehaviorScope too in a separate CR).
  • other existing methods at that level are placed out of that (or any other) scope object too - which is already the case for when doing concurrent unloading.
  • putting it there makes future logging of the various phases a little bit easier, not having GCTraceTimer et al. in various places.

Testing: gha

Thanks,
Thomas


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8317350: Move code cache purging out of CodeCache::UnloadingScope (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16011/head:pull/16011
$ git checkout pull/16011

Update a local copy of the PR:
$ git checkout pull/16011
$ git pull https://git.openjdk.org/jdk.git pull/16011/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16011

View PR using the GUI difftool:
$ git pr show -t 16011

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16011.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 2, 2023

👋 Welcome back tschatzl! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title 8317350 8317350: Move code cache purging out of CodeCache::UnloadingScope Oct 2, 2023
@openjdk
Copy link

openjdk bot commented Oct 2, 2023

@tschatzl The following labels will be automatically applied to this pull request:

  • hotspot
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels Oct 2, 2023
@tschatzl tschatzl marked this pull request as ready for review October 3, 2023 12:58
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 3, 2023
@mlbridge
Copy link

mlbridge bot commented Oct 3, 2023

Webrevs

Comment on lines +2064 to +2068
CodeCache::do_unloading(unloading_occurred);
}

// Unload nmethods.
CodeCache::do_unloading(purged_class);
// Release unloaded nmethods's memory.
CodeCache::flush_unlinked_nmethods();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that recycling resources for nmethods involves two steps, unlink and free-unlinked, seems an insignificant impl detail in this caller context. Can that be hidden away from the public API, e.g. do_unloading perform these two steps directly?

Is there a reason why flush_unlinked_nmethods is outside the scope?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is maybe an insignificant detail in the context of stw collectors, but for gcs doing concurrent class/code unloading there needs to be a handshake/memory synchronization between unlinking and freeing, so these two operations need to be split.

I.e. what they do at a high level is:

unlink classes
unlink code cache
unlink other stuff

handshake

free unlinked classes
free unlinked code
free other stuff

Similarly I would like to split G1 class unloading into a STW part (unlinking stuff) and make all the freeing concurrent to avoid the additional (mostly) implementation overhead. Or at least start with that and then see if it is worth making everything concurrent. (The unlinking can be trimmed down further afaics).

Back to this CR: as the description states I think the current code wrongly hides this free-unlinked procedure in that UnloadingScope (there is a reason it is only used for the STW collectors) unnecessarily making the observed behavior (of one of the most time-consuming parts of class/code unloading) surprising.

Putting it on this level also allows more straightforward logging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is maybe an insignificant detail in the context of stw collectors

Then, could Serial and Parallel use APIs that don't expose these details? For instance, move flush_unlinked_nmethods inside CodeCache::do_unloading, as it is used only by those collectors.

Why is flush_unlinked_nmethods outside of UnloadingScope? This newly-introduced scope in the caller context seems extremely out of place, IMO.

Putting it on this level also allows more straightforward logging.

I don't get this. Can't see any log-print logic inside flush_unlinked_nmethods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, could Serial and Parallel use APIs that don't expose these details?

There is no such API. This change does not intend to expose such a new API too. Just moving the various phases of code/class unloading to the same level in the source code as apparent to me to keep surprises low (and simplify logging to be able to see problems in the first place. Then we can fix them in subsequent PRs).

Also I would like to keep the existing structure of class/code unloading for all collectors for uniformity (e.g. separate unlinking from free-unlinking - maybe call it "purging" in the future?) - so that they will have the same structure and print the same logging messages in the same order in the future (at least for the STW collectors including G1).

Note that other collectors have exactly the same phases, and unlinking is separate from purging everywhere else too. They just re-implement some phases using their own code (doing some renames in the process).

Having both the same structure/phases and same (more comprehensive) logging output for more collectors will make troubleshooting much easier.

There is no renaming in this change, e.g. CodeCache::do_unloading() (="unlinking") as this method does not do the complete unloading either indicated by having this external flush_unlinked_nmethods (="purging"). This is not the change to do this imo.
(Iirc CodeCache::do_unloading() still has does some purging in it anyway, but let's not digress too much here).

For instance, move flush_unlinked_nmethods inside CodeCache::do_unloading, as it is used only by those collectors.

It could, but then for (future) logging you would have to pass GCTraceTimer as the common way to do timing in gc code into do_unloading() which is some compiler code. Not sure if this is what we want as it is imo awkward. It is imho better, cleaner and sufficient to have timing outside at least initially. I.e. I would at this point prefer the style of

{ // Phase x
  GCTraceTimer x("do phase X");
  do_phase_x();
}

as how to time can be heavily collector dependent too. I would at the end of all that refactoring see if there is some useful coalescing of the methods into helpers that can be done.
Maybe these scopes should be put into separate helper methods, I haven't decided yet what's best.

Why is flush_unlinked_nmethods outside of UnloadingScope? This newly-introduced scope in the caller context seems extremely out of place, IMO.

I believe most of your questions stem from the naming. Please, do not be too hung up on names. The description of this PR already mentioned that UnloadingScope is a misnomer (and misnamed code is common in Hotspot code, or responsibilities changed over the course of years without fixing up the naming). So in addition to this class, be aware that there are lots of misnamed and not uniformly named methods in class/code unloading code too. UnloadingScope controls unlinking behavior by setting some globals and does not control the whole unloading process (ie. unlinking + purging).

Additionally UnloadingScope actually did not really contain flush_unlinked_nmethods earlier either. It has been the last call in the destructor of the UnloadingScope, outside, after the unlinking behavior has been reset. This is even more strange if you think about it.

So from my understanding of the code this scope object ought to enclose only the unlinking part of the unloading (i.e. the decision of what to unlink, that is CodeCache::do_unloading()) before. This change only at most exposes the existing (as you say ugly - I agree) structure.

Which isn't a bad thing to me to have exposed. It's not in scope of this change to fix this imo because that would mix it with other unrelated changes.

(UnloadingScope should be called something different, maybe after this discussion something like UnlinkingScope or similar? Idk.)

Putting it on this level also allows more straightforward logging.
I don't get this. Can't see any log-print logic inside flush_unlinked_nmethods.

... in the future. (https://bugs.openjdk.org/browse/JDK-8315504)[https://bugs.openjdk.org/browse/JDK-8315504] intends to add more timing/logging for every phase.

There is currently no plan for comprehensive logging inside the various phases of the class/code unloading (at least not to the level of selected methods I did in recent investigations).

What I intend to do is revisiting the phases in the future and move around work to better reflect the unlinking/purging split, and also link them together with a real UnloadingScope covering the whole unloading process (not to be mixed up with the current UnloadingScope) to allow gc-specific replacements/optimizations of the phases.

Obviously there can be helper methods that simplify things for various gcs. This PR isn't this change though.

Hth,
Thomas

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnloadingScope should be called something different, maybe after this discussion something like UnlinkingScope or similar?

OK, the new-scope starts to make sense if it's named UnlinkingScope.

In my mind, the concept of UnloadingScope includes unlinking + purging; the implementation supports my understanding, as I believe the destructor still belongs to the obj-on-stack. However, I now realize that this interpretation could be subjective.

This change only at most exposes the existing (as you say ugly - I agree) structure.

OK, that's all I wanna convey. Maybe this is an inevitable transient state.

@albertnetymk
Copy link
Member

/label add hotspot-gc

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Oct 16, 2023
@openjdk
Copy link

openjdk bot commented Oct 16, 2023

@albertnetymk
The hotspot-gc label was successfully added.

Comment on lines +2064 to +2068
CodeCache::do_unloading(unloading_occurred);
}

// Unload nmethods.
CodeCache::do_unloading(purged_class);
// Release unloaded nmethods's memory.
CodeCache::flush_unlinked_nmethods();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnloadingScope should be called something different, maybe after this discussion something like UnlinkingScope or similar?

OK, the new-scope starts to make sense if it's named UnlinkingScope.

In my mind, the concept of UnloadingScope includes unlinking + purging; the implementation supports my understanding, as I believe the destructor still belongs to the obj-on-stack. However, I now realize that this interpretation could be subjective.

This change only at most exposes the existing (as you say ugly - I agree) structure.

OK, that's all I wanna convey. Maybe this is an inevitable transient state.

@openjdk
Copy link

openjdk bot commented Oct 17, 2023

@tschatzl This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8317350: Move code cache purging out of CodeCache::UnloadingScope

Reviewed-by: ayang, iwalulya

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 226 new commits pushed to the master branch:

  • 9cf334f: 8318383: Remove duplicated checks in os::get_native_stack() in posix implementation
  • defc7e0: 8318454: TestLayoutPaths broken on Big Endian platforms after JDK-8317837
  • 3c70f2c: 8318418: hsdis build fails with system binutils on Ubuntu
  • 15acf4b: 8318324: Drop redundant default methods from FFM API
  • 1a09835: 8317358: G1: Make TestMaxNewSize use createTestJvm
  • 47bb1a1: 8318415: Adjust describing comment of os_getChildren after 8315026
  • 80bd22d: 8316144: Remove unused field jdk.internal.util.xml.impl.XMLStreamWriterImpl.Element._Depth
  • c0e154c: 8318089: Class space not marked as such with NMT when CDS is off
  • 24bc5bd: 8318104: macOS 10.13 check in TabButtonAccessibility.m can be removed
  • e25a49a: 8318471: ProblemList compiler/sharedstubs/SharedTrampolineTest.java
  • ... and 216 more: https://git.openjdk.org/jdk/compare/795e5dcc856491031b87a1f2a942681a582673ab...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 17, 2023
@tschatzl
Copy link
Contributor Author

Thanks @walulyai @albertnetymk for your reviews
/integrate

@openjdk
Copy link

openjdk bot commented Oct 20, 2023

Going to push as commit bd3bc2c.
Since your change was applied there have been 235 commits pushed to the master branch:

  • 292aad2: 8316436: ContinuationWrapper uses unhandled nullptr oop
  • 387504c: 8317575: AArch64: C2_MacroAssembler::fast_lock uses rscratch1 for cmpxchg result
  • d9ce525: 8318150: StaticProxySelector.select should not throw NullPointerExceptions
  • c46a54e: 8312777: notifyJvmtiMount before notifyJvmtiUnmount
  • 8f5f440: 8317692: jcmd GC.heap_dump performance regression after JDK-8292818
  • 684b91e: 8315064: j.text.ChoiceFormat provides no specification on quoting behavior
  • 1740950: 8314901: AES-GCM interleaved implementation using AVX2 instructions
  • cc8f8da: 8318322: Update IANA Language Subtag Registry to Version 2023-10-16
  • 599560a: 8317635: Improve GetClassFields test to verify correctness of field order
  • 9cf334f: 8318383: Remove duplicated checks in os::get_native_stack() in posix implementation
  • ... and 225 more: https://git.openjdk.org/jdk/compare/795e5dcc856491031b87a1f2a942681a582673ab...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 20, 2023
@openjdk openjdk bot closed this Oct 20, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Oct 20, 2023
@openjdk
Copy link

openjdk bot commented Oct 20, 2023

@tschatzl Pushed as commit bd3bc2c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tschatzl tschatzl deleted the submit/8317350-move-code-cache-out-of-unloadingscope branch October 24, 2023 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated shenandoah shenandoah-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

3 participants