Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDK-8268088: Clarify Method::clear_jmethod_ids() related comments in ClassLoaderData::~ClassLoaderData() #4383

Closed
wants to merge 3 commits into from

Conversation

jianglizhou
Copy link
Contributor

@jianglizhou jianglizhou commented Jun 7, 2021

This PR clarifies Method::clear_jmethod_ids() related comments in ClassLoaderData::~ClassLoaderData(). It adds more explanations on why Method::clear_jmethod_ids only sets the jmethod_ids to NULL without releasing the memory for related JNIMethodBlocks and JNIMethodBlockNodes.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8268088: Clarify Method::clear_jmethod_ids() related comments in ClassLoaderData::~ClassLoaderData()

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4383/head:pull/4383
$ git checkout pull/4383

Update a local copy of the PR:
$ git checkout pull/4383
$ git pull https://git.openjdk.java.net/jdk pull/4383/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4383

View PR using the GUI difftool:
$ git pr show -t 4383

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4383.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 7, 2021

👋 Welcome back jiangli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 7, 2021
@openjdk
Copy link

openjdk bot commented Jun 7, 2021

@jianglizhou The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Jun 7, 2021
@mlbridge
Copy link

mlbridge bot commented Jun 7, 2021

Webrevs

// NULL after class unloading.
// Method::clear_jmethod_ids only sets the jmethod_ids to NULL without
// releasing the memory for related JNIMethodBlocks and JNIMethodBlockNodes.
// This is done intentionally because native code (e.g. JVMTI agent) obtains
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rewriting to "native code (...) holding jmethod_ids may access..." would be clearer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated as suggested. Thanks!

// releasing the memory for related JNIMethodBlocks and JNIMethodBlockNodes.
// This is done intentionally because native code (e.g. JVMTI agent) obtains
// jmethod_ids may access them after the associated classes and class loader
// are unloaded. The Java native Interface Specification says "method ID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"native" -> "Native"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@openjdk openjdk bot removed the rfr Pull request is ready for review label Jun 7, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 7, 2021
Copy link
Member

@eastig eastig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jianglizhou
Copy link
Contributor Author

Thanks for the review!

@jianglizhou
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Jun 7, 2021

@jianglizhou This PR has not yet been marked as ready for integration.

@jianglizhou
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Jun 7, 2021

@jianglizhou This PR has not yet been marked as ready for integration.

// does not prevent the VM from unloading the class from which the ID has
// been derived. After the class is unloaded, the method or field ID becomes
// invalid". In real world usages, the native code may rely on jmethod_ids
// being NULL after class unloading. Hence, it is unsafe to free the memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any use cases of what a real world application can do with a NULL'ed out jmethodID (other than JVM TI will give an JVMTI_ERROR_INVALID_METHODID for it)?
All or most of the JNI functions will crash with a NULL'ed out jmethodID. You can catch this error by using -Xcheck:jni. Maybe we could deprecate this feature of the JVMTI spec and release the memory?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks like an expanded version of the one I wrote a long time ago and correct. We were wondering if this is a feature that we actually need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any use cases of what a real world application can do with a NULL'ed out jmethodID (other than JVM TI will give an JVMTI_ERROR_INVALID_METHODID for it)?
All or most of the JNI functions will crash with a NULL'ed out jmethodID. You can catch this error by using -Xcheck:jni. Maybe we could deprecate this feature of the JVMTI spec and release the memory?

The JVMTI agent use case that ran into issues (with releasing memory) calls GetMethodDeclaringClass for a stale jmethod_id (with unloaded class and loader) and crashes in Method::checked_resolve_jmethod_id when it's dereferencing the 'mid'.

The agent native code does check for JVMTI_ERROR_NONE after calling GetMethodDeclaringClass in this case, but it's too late.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks like an expanded version of the one I wrote a long time ago and correct.

Thanks for confirming the correctness!

We were wondering if this is a feature that we actually need.

That seems to be the case. I learned this the hard way with crashes observed in production when attempting to release memory for stale jmethod_ids from VM side. Clarifying the comments hopefully will benefit others and avoid similar issues in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the issue is specific to JVMTI code only, the comment should reflect that. Currently it seems to suggest that all possible JNI code could use an invalid jmethodID. However, using an invalid jmethodID with JNI APIs such as CallStaticVoidMethod will lead to a crash. I don't think the JNI spec allows invalid jmethodID to be used.

Furthermore, we should consider freeing the JNIMethodBlock when JVMTI is not enabled, so we can avoid the memory leak.

We may even want to update the JVMTI spec to disallow invalid jmethodID, but will that introduce incompatibility with 3rd party code?

Lastly, do we have any existing test cases that cover the use of invalid jmethodIDs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added comments to both of these bugs JDK-8268088 and JDK-8268364.

There are places in the JNI spec that don't specify a "valid method ID" and some places just say "a method ID". We might be able to fix those. We will crash for invalid method ID in all the JNI entry points. There is -Xcheck:jni that does check the method ID though. Guess I'm repeating my comment above.

As far as the new comment goes, for now, it should include JNI until which time we clarify the spec. I don't see the harm in that.

I was trying to think of a way to check jmethodID with SafeFetch or something to to return JVMTI_ERROR_INVALID_METHODID without running into ABA problems. Maybe that would work and we can deallocate the jmethodID block.

That said, I never saw the PR 66ef04a and I'd want to review that carefully. My first quick click-through was not positive. I hid JMethodIDBlocks in method.cpp on purpose! Plus it seems like a huge change for JDK 11.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://docs.oracle.com/en/java/javase/16/docs/specs/jni/design.html

"A field or method ID does not prevent the VM from unloading the class from which the ID has been derived. After the class is unloaded, the method or field ID becomes invalid.
...
The programmer must not pass illegal pointers or arguments of the wrong type to JNI functions. Doing so could result in arbitrary consequences, including a corrupted system state or VM crash."

I had read the first quoted text in "Accessing Fields and Methods" when looking into the spec. It states that the method ID becomes invalid after class unloading, but the VM&JNI implementation has tolerated the invalid ID usages after unloading for long time. The second one that you quoted from "Reporting Programming Errors" has stronger language regarding the usages of illegal pointers. It would be good to clarify the language in "Accessing Fields and Methods" to not allow the usage of the invalid IDs after class unloading.

From above, we can conclude that both the JNI spec and implementation disallow the use of invalid jmethodIDs. No existing code 3rd party JNI code can run with invalid jmethodIDs without crashing.

Today's implementation can tolerate the 'invalid' jemthod_ids usages after unloading (by not releasing the memory).

In JDK-8268088, you described a JVMTI-based profiler that "captured stack traces and then symbolized methods using jmethod_ids at a later point during runtime execution, which could happen after unloading the associated classes and loaders."

This is unsafe and can crash with ZGC even if we don't free the JNIMethodBlocks, because checked_method could point to a deallocated Method. See more in JDK-8268364.

Agreed, that would become unsafe when we change to release memory for JNIMethodBlock and JNIMethodBlockNodes with unloaded classes. Thanks for filing JDK-8268364.

To handle this correctly, the profiler should ensure that the classes with the collected stacks cannot be GCed. This can be done by creating a GlobalRef on the classloader associated with the jmethodID:

  • GetMethodDeclaringClass -> GetClassLoader -> NewGlobalRef

The profiler can release the GlobalRef after it has processed the jmethodID.

I would be very cautious to recommend this practice. I added some comments in JDK-8268364:

That would keep more memory for longer time. The agent behavior would dictate the memory usages of the VM and prevent classes from being unloaded unnecessarily. If an agent only processes all jmethod_ids at the end, it would keep all loaders alive without being able to be unloaded. Also, a bug (fail to release any of the GoabalRef) in the agent might cause memory leaks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is -Xcheck:jni that does check the method ID though. Guess I'm repeating my comment above.

-Xcheck:jni is not enabled commonly during testing due to other issues. Catching the issue with -Xcheck:jni before production would help (this case does make it more important to enable -Xcheck:jni).

As far as the new comment goes, for now, it should include JNI until which time we clarify the spec. I don't see the harm in that.

Agreed.

That said, I never saw the PR 66ef04a and I'd want to review that carefully. My first quick click-through was not positive.

I opened PR 66ef04a mainly for reference purpose for now. Please don't spend much time reviewing yet :), since we need to address the invalid jmethod_id usages first.

I hid JMethodIDBlocks in method.cpp on purpose! Plus it seems like a huge change for JDK 11.

I moved JNIMethodBlock and JNIMethodBlockNodes class declarations to method.hpp to trigger destructor properly from classLoaderData.cpp in 66ef04a. I made the change as an internal fix and intended to only contribute it to JDK head if there was no issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From above, we can conclude that both the JNI spec and implementation disallow the use of invalid jmethodIDs. No existing code 3rd party JNI code can run with invalid jmethodIDs without crashing.

Today's implementation can tolerate the 'invalid' jemthod_ids usages after unloading (by not releasing the memory).

Today's JNI implementation in HotSpot does not in any way tolerate such invalid jmethodIDs. If an invalid jmethodID is passed to any APIs declared in jni.h, the VM will crash immediately. Please see JDK-8268406 for more details.

It's only JVMTI that tolerates invalid jmethodIDs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today's JNI implementation in HotSpot does not in any way tolerate such invalid jmethodIDs. If an invalid jmethodID is passed to any APIs declared in jni.h, the VM will crash immediately. Please see JDK-8268406 for more details.

It's only JVMTI that tolerates invalid jmethodIDs.

Thanks for checking into that! That narrows the issue, which is good.

I also created JDK-8268416 for clarifying the spec.

@jianglizhou
Copy link
Contributor Author

Hi Coleen,

Some more background context related to this:

Users/developers would be happy if we can release memory for jmethod_ids in unloaded cases and get rid of the memory leak. I was pinged recently for the leak after the earlier unsuccessful attempt to fix with 66ef04a change (on JDK 11, and was rollback due to agent crashes). I haven't found a general solution that would work well without the memory leak. So decided to clarify the comment for now and also open up the discussion in JDK-8268088.

The agent that ran into crashes with the attempted fix saved stacktraces and jmethod_ids from periodic sampling. It then tried to get the class and method information (name, etc) from the saved jmethod_ids at later points during the execution and crashed in Method::checked_resolve_jmethod_id with a stale jmethod_id. One possible solution that I thought of was to 'remove' the jmethod_ids in agent during unloading (described in https://bugs.openjdk.java.net/browse/JDK-8268088?focusedCommentId=14424635&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14424635). That could allow deallocating the memory related to those jmethod_ids within the VM. However, that's not a general and efficient solution because:

  • It would require a new call back being registered and implemented from the agent. An agent may not choose to do so. JVM cannot reliably free the memory without a potential SEGV risk with the approach.
  • It adds overhead to class unloading. The agent would need to iterate all stored stack traces and find the jmethod_ids related to the classes that are currently being unloaded. That can be expansive and would add noticeable overhead.
  • Also, agent may have bugs and fail to remove a jmethod_ids during the callback.

It would be good if we can find a workable solution for this (for long term).

Thanks!

@coleenp
Copy link
Contributor

coleenp commented Jun 8, 2021

Hi, I put a short comment in JDK-8268088 which I'll expand later. There's already a JVMTI callback for unloading classes that the user could use to walk the saved jmethodID stacks:

// notify the debugger
if (JvmtiExport::should_post_class_unload()) {
JvmtiExport::post_class_unload(ik);
}

Otherwise, don't change class unloading to add a callback because it could be sensitive to safepoints and unsafe places.

My comment in this bug echoes what Ioi said above. To prevent class unloading, the class mirror should be saved with the Method. This is the approach taken throughout the vm. I think hunting down applications that need to fix their profilers may be difficult, but we could add some deprecation messages and warnings (or even an option, which I thought was what you were going to do). I'm not sure why so many lines are changed in your above pull request.

For now the updated message seems fine to me for JDK 17.

Copy link
Member

@iklam iklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openjdk
Copy link

openjdk bot commented Jun 8, 2021

@jianglizhou This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8268088: Clarify Method::clear_jmethod_ids() related comments in ClassLoaderData::~ClassLoaderData()

Reviewed-by: iklam

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 67 new commits pushed to the master branch:

  • fafc4d9: 8268352: Rename javadoc Messager class to JavadocLog
  • b568e87: 8237388: serviceability/dcmd/framework/VMVersionTest.java fails with connection refused error.
  • c21cc93: 8248877: Document API contract for MetaspaceObj subtypes
  • 159cb6f: 8268083: JDK-8267706 breaks bin/idea.sh on a Mac
  • 8158b82: 8268327: Upstream: 8268169: The system lookup can not find stdio functions such as printf on Windows 10
  • 6843576: 8268227: java/foreign/TestUpcall.java still times out
  • 2717fcb: 8232948: javac -h should mangle the overload argument signature
  • 89da202: 8266159: macOS ARM + Metal pipeline shows artifacts on Swing Menu with Java L&F
  • 61ab4b9: 8267564: JDK-8252971 causes SPECjbb2015 socket exceptions on Windows when MKS is installed
  • 00c88f7: 8266918: merge_stack in check_code.c add NULL check
  • ... and 57 more: https://git.openjdk.java.net/jdk/compare/e2d5ff9d456dd339ccd21df2f75c4e34e5784d9a...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 8, 2021
@jianglizhou
Copy link
Contributor Author

Hi, I put a short comment in JDK-8268088 which I'll expand later. There's already a JVMTI callback for unloading classes that the user could use to walk the saved jmethodID stacks:

// notify the debugger
if (JvmtiExport::should_post_class_unload()) {
JvmtiExport::post_class_unload(ik);
}

Otherwise, don't change class unloading to add a callback because it could be sensitive to safepoints and unsafe places.

A new capability as Ioi suggested in https://bugs.openjdk.java.net/browse/JDK-8268364?focusedCommentId=14426310&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14426310 would be better than using the existing callback for unloading. The possible approach that I described in https://bugs.openjdk.java.net/browse/JDK-8268088?focusedCommentId=14424635&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14424635 also looks for the same direction (however there are still other issues as described in the bug comment). Also, we need a good strategy for existing code that expects the current behavior.

My comment in this bug echoes what Ioi said above. To prevent class unloading, the class mirror should be saved with the Method. This is the approach taken throughout the vm. I think hunting down applications that need to fix their profilers may be difficult, but we could add some deprecation messages and warnings (or even an option, which I thought was what you were going to do).

Prevent or defer class unloading would keep more memory for longer time. It could cause other unwanted side-effects.

I'm not sure why so many lines are changed in your above pull request.

Do you refer to 66ef04a? That's based on my internal change on JDK 11 (with other local changes). I restructured code by moving JNIMethodBlock and JNIMethodBlockNodes class declarations to the header file. That's needed to trigger destructor properly from classLoaderData.cpp. The original logic was kept the same, and only the code was being moved around.

I opened my internal change for reference purposes and assisting the discussions to address the memory leak.

@jianglizhou
Copy link
Contributor Author

LGTM

Thanks for the review!

@jianglizhou
Copy link
Contributor Author

/integrate

@openjdk openjdk bot closed this Jun 8, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated labels Jun 8, 2021
@openjdk openjdk bot removed the rfr Pull request is ready for review label Jun 8, 2021
@openjdk
Copy link

openjdk bot commented Jun 8, 2021

@jianglizhou Since your change was applied there have been 73 commits pushed to the master branch:

  • 5ad4a91: 8268127: Shenandoah: Heap size may be too small for region to align to large page size
  • 7a37816: 8264866: Remove unneeded WorkArounds.isAutomaticModule
  • 51e8201: 8267764: jpackage cannot handle window screensaver files when EXE renamed as SCR
  • f9b593d: 8266748: Move modifiers code to Signatures.java
  • 4dd0e7e: 8259806: Clean up terminology on the "All Classes" page
  • dc6c96b: 8263468: New page for "recent" new API
  • fafc4d9: 8268352: Rename javadoc Messager class to JavadocLog
  • b568e87: 8237388: serviceability/dcmd/framework/VMVersionTest.java fails with connection refused error.
  • c21cc93: 8248877: Document API contract for MetaspaceObj subtypes
  • 159cb6f: 8268083: JDK-8267706 breaks bin/idea.sh on a Mac
  • ... and 63 more: https://git.openjdk.java.net/jdk/compare/e2d5ff9d456dd339ccd21df2f75c4e34e5784d9a...master

Your commit was automatically rebased without conflicts.

Pushed as commit ae16052.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
4 participants