@fandreuz
Contributor

@fandreuz fandreuz commented Oct 10, 2025

I propose to amend nmethod::is_cold to let GC collect not-entrant native nmethod instances.

Passes tier1 and tier2 (fastdebug).


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8369219: JNI::RegisterNatives causes a memory leak in CodeCache (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27742/head:pull/27742
$ git checkout pull/27742

Update a local copy of the PR:
$ git checkout pull/27742
$ git pull https://git.openjdk.org/jdk.git pull/27742/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27742

View PR using the GUI difftool:
$ git pr show -t 27742

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27742.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Oct 10, 2025

👋 Welcome back fandreuzzi! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 10, 2025

@fandreuz This change is no longer ready for integration - check the PR body for details.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Oct 10, 2025
@openjdk

openjdk bot commented Oct 10, 2025

@fandreuz The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@fandreuz fandreuz changed the title from "8369219: Let GC reclaim cold native wrappers" to "8369219: JNI::RegisterNatives can cause a memory leak in CodeCache" Oct 10, 2025
@fandreuz fandreuz marked this pull request as ready for review October 11, 2025 14:51
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 11, 2025
@mlbridge

mlbridge bot commented Oct 11, 2025

@dholmes-ora
Member

/label add hotspot-compiler

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Oct 13, 2025
@openjdk

openjdk bot commented Oct 13, 2025

@dholmes-ora
The hotspot-compiler label was successfully added.

@fandreuz fandreuz changed the title from "8369219: JNI::RegisterNatives can cause a memory leak in CodeCache" to "8369219: JNI::RegisterNatives causes a memory leak in CodeCache" Oct 13, 2025
  // nmethods that don't seem to be all that relevant any longer.
  bool nmethod::is_cold() {
-   if (!MethodFlushing || is_native_method() || is_not_installed()) {
+   if (!MethodFlushing || (is_native_method() && is_in_use()) || is_not_installed()) {
Member

So I guess we need to decide what to do about native wrappers that are still "in use", but are "cold" because they haven't been called in a while. The above change would keep them around forever. We could instead allow them to be cleaned up like regular nmethods.

Contributor Author

> We could instead allow them to be cleaned up like regular nmethods.

That sounds reasonable to me; native methods seem to be tracked like all other nmethods.

Removing is_native_method() altogether from the condition was the first implementation I had, and as far as I remember there was no failure in tier1 or tier2. Should I propose this alternative implementation as part of this PR?
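To make the three options in this thread easier to compare, here is a toy model of the first guard in nmethod::is_cold(). It is only a sketch: plain booleans stand in for the HotSpot queries named in the diff, the class and method names are hypothetical, and it assumes that a true guard means is_cold() reports "not cold" without running any later check.

```java
// Toy model only, not HotSpot code. Each boolean parameter stands in for the
// HotSpot query of the same name in the diff (MethodFlushing, is_native_method(),
// is_in_use(), is_not_installed()). A true result models the early
// "never cold" bail-out of nmethod::is_cold() (assumption stated above).
final class ColdGuardSketch {

    // Guard before the change: every native wrapper skips cold detection.
    static boolean bailsOutOriginal(boolean methodFlushing, boolean isNative,
                                    boolean inUse, boolean notInstalled) {
        return !methodFlushing || isNative || notInstalled;
    }

    // Guard in the diff above: only native wrappers that are still in use are skipped.
    static boolean bailsOutProposed(boolean methodFlushing, boolean isNative,
                                    boolean inUse, boolean notInstalled) {
        return !methodFlushing || (isNative && inUse) || notInstalled;
    }

    // Alternative discussed here: native wrappers get no special treatment.
    static boolean bailsOutAlternative(boolean methodFlushing, boolean isNative,
                                       boolean inUse, boolean notInstalled) {
        return !methodFlushing || notInstalled;
    }

    public static void main(String[] args) {
        // Native wrapper that is no longer in use (for example, made not entrant).
        System.out.println(bailsOutOriginal(true, true, false, false));    // true
        System.out.println(bailsOutProposed(true, true, false, false));    // false
        System.out.println(bailsOutAlternative(true, true, false, false)); // false

        // Native wrapper that is still in use.
        System.out.println(bailsOutOriginal(true, true, true, false));     // true
        System.out.println(bailsOutProposed(true, true, true, false));     // true
        System.out.println(bailsOutAlternative(true, true, true, false));  // false
    }
}
```

In this model the proposed guard and the alternative differ only for native wrappers that are still in use: the proposed guard keeps skipping them, while the alternative lets them fall through to the later checks like any other nmethod.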

Member

I am tempted to say yes, for consistency, but it probably won't make much of a difference either way. But now I am wondering, if these cold native wrappers continue to be immortal, then do we really need to give them nmethod entry barriers? Removing the barrier could remove some overhead. Whatever direction we decide to go, it would be good to add a comment here explaining the decision and/or trade-offs.

Contributor Author

@fandreuz fandreuz Oct 18, 2025

Is it actually possible to remove entry barriers for any garbage collectable nmethod? How can we know an nmethod is not used anymore, even when it is made not entrant? is_cold() bails out when an nmethod does not support entry barriers:

// On platforms that don't support nmethod entry barriers, we can't
// trust the temporal aspect of the gc epochs. So we can't detect
// cold nmethods on such platforms.

So would removing entry barriers for native nmethods make the memory leak I'm trying to fix here effectively unfixable? Let me know if I'm missing something.

Member

If we mark them as not-entrant, then the is_not_entrant() check below will still catch them, right?

Contributor Author

@fandreuz fandreuz Oct 22, 2025

I see, I assumed an nmethod couldn't be marked as on-stack without entry barriers, but that doesn't seem to be the case.

But on second thought, do you agree with the fix I'm proposing in this PR? I think the following two work items could be implemented and reviewed in separate changesets:

  • Allow not-entrant nmethods to be collected during GC (I removed is_native_method() from L2599, so native nmethods are treated just like normal nmethods)
  • Evaluate the implications of removing entry barriers for native nmethods, thus letting GC reclaim them whenever !is_maybe_on_stack() && is_not_entrant(), but without the overhead of entry barriers.

I'm proposing this because I guess the latter will need more discussion and is technically not needed to fix the memory leak I address in this PR. Do you agree, @dean-long? I could create another ticket to handle the second item.

Member

Yes, I'm fine with it being a separate issue.

@fandreuz fandreuz marked this pull request as draft October 18, 2025 09:51
@openjdk openjdk bot removed the rfr Pull request is ready for review label Oct 18, 2025
@fandreuz fandreuz marked this pull request as ready for review October 18, 2025 10:03
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 18, 2025
Member

@shipilev shipilev left a comment

Looks reasonable, but the test needs a bit more work.

/reviewers 2


WB.enqueueMethodForCompilation(method, 1 /* compLevel */);
while (WB.isMethodQueuedForCompilation(method)) {
    Thread.onSpinWait();
}
Member

We are just waiting for compilation here. It is counter-productive to wait with a busy-loop. Insert a sleep for ~10...100ms instead. Same thing for the loop below.
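A minimal sketch of the suggested pattern, polling with a short sleep instead of spinning. The class name, helper name and the BooleanSupplier are hypothetical stand-ins so the snippet runs outside the JDK test tree; in the test, the supplier would presumably wrap WB.isMethodQueuedForCompilation(method).

```java
import java.util.function.BooleanSupplier;

// Sketch only: PollingSketch and awaitNotQueued are illustrative names,
// not part of the test under review.
final class PollingSketch {

    // Polls until stillQueued reports false, sleeping between checks
    // instead of busy-waiting. Returns the number of sleeps performed.
    static int awaitNotQueued(BooleanSupplier stillQueued, long sleepMillis) {
        int sleeps = 0;
        while (stillQueued.getAsBoolean()) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            sleeps++;
        }
        return sleeps;
    }

    public static void main(String[] args) {
        // Stand-in condition: reports "still queued" for the first three checks.
        int[] remaining = {3};
        int sleeps = awaitNotQueued(() -> remaining[0]-- > 0, 10);
        System.out.println("slept " + sleeps + " times"); // prints "slept 3 times"
    }
}
```

A real test would likely also want an upper bound on the total wait so that a stuck compilation queue fails the test instead of hanging it; that bound is left out here to keep the sketch short.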

Contributor Author

Thanks, that sounds reasonable. I got this pattern from another test, but it looks counterproductive here indeed.

5d0c705

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 24, 2025
@openjdk

openjdk bot commented Oct 24, 2025

@shipilev
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Oct 24, 2025
Labels

hotspot hotspot-dev@openjdk.org hotspot-compiler hotspot-compiler-dev@openjdk.org rfr Pull request is ready for review

5 participants