New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JDK-8268088: Clarify Method::clear_jmethod_ids() related comments in ClassLoaderData::~ClassLoaderData() #4383
Conversation
👋 Welcome back jiangli! A progress list of the required criteria for merging this PR into |
@jianglizhou The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
// NULL after class unloading. | ||
// Method::clear_jmethod_ids only sets the jmethod_ids to NULL without | ||
// releasing the memory for related JNIMethodBlocks and JNIMethodBlockNodes. | ||
// This is done intentionally because native code (e.g. JVMTI agent) obtains |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rewriting to "native code (...) holding jmethod_ids may access..." would be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated as suggested. Thanks!
// releasing the memory for related JNIMethodBlocks and JNIMethodBlockNodes. | ||
// This is done intentionally because native code (e.g. JVMTI agent) obtains | ||
// jmethod_ids may access them after the associated classes and class loader | ||
// are unloaded. The Java native Interface Specification says "method ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"native" -> "Native"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for the review! |
/integrate |
@jianglizhou This PR has not yet been marked as ready for integration. |
/integrate |
@jianglizhou This PR has not yet been marked as ready for integration. |
// does not prevent the VM from unloading the class from which the ID has | ||
// been derived. After the class is unloaded, the method or field ID becomes | ||
// invalid". In real world usages, the native code may rely on jmethod_ids | ||
// being NULL after class unloading. Hence, it is unsafe to free the memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any use cases of what a real world application can do with a NULL'ed out jmethodID (other than JVM TI will give an JVMTI_ERROR_INVALID_METHODID for it)?
All or most of the JNI functions will crash with a NULL'ed out jmethodID. You can catch this error by using -Xcheck:jni. Maybe we could deprecate this feature of the JVMTI spec and release the memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment looks like an expanded version of the one I wrote a long time ago and correct. We were wondering if this is a feature that we actually need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any use cases of what a real world application can do with a NULL'ed out jmethodID (other than JVM TI will give an JVMTI_ERROR_INVALID_METHODID for it)?
All or most of the JNI functions will crash with a NULL'ed out jmethodID. You can catch this error by using -Xcheck:jni. Maybe we could deprecate this feature of the JVMTI spec and release the memory?
The JVMTI agent use case that ran into issues (with releasing memory) calls GetMethodDeclaringClass for a stale jmethod_id (with unloaded class and loader) and crashes in Method::checked_resolve_jmethod_id when it's dereferencing the 'mid'.
The agent native code does check for JVMTI_ERROR_NONE after calling GetMethodDeclaringClass in this case, but it's too late.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment looks like an expanded version of the one I wrote a long time ago and correct.
Thanks for confirming the correctness!
We were wondering if this is a feature that we actually need.
That seems to be the case. I learned this the hard way with crashes observed in production when attempting to release memory for stale jmethod_ids from VM side. Clarifying the comments hopefully will benefit others and avoid similar issues in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the issue is specific to JVMTI code only, the comment should reflect that. Currently it seems to suggest that all possible JNI code could use an invalid jmethodID. However, using an invalid jmethodID with JNI APIs such as CallStaticVoidMethod will lead to a crash. I don't think the JNI spec allows invalid jmethodID to be used.
Furthermore, we should consider freeing the JNIMethodBlock when JVMTI is not enabled, so we can avoid the memory leak.
We may even want to update the JVMTI spec to disallow invalid jmethodID, but will that introduce incompatibility with 3rd party code?
Lastly, do we have any existing test cases that cover the use of invalid jmethodIDs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added comments to both of these bugs JDK-8268088 and JDK-8268364.
There are places in the JNI spec that don't specify a "valid method ID" and some places just say "a method ID". We might be able to fix those. We will crash for invalid method ID in all the JNI entry points. There is -Xcheck:jni that does check the method ID though. Guess I'm repeating my comment above.
As far as the new comment goes, for now, it should include JNI until which time we clarify the spec. I don't see the harm in that.
I was trying to think of a way to check jmethodID with SafeFetch or something to to return JVMTI_ERROR_INVALID_METHODID without running into ABA problems. Maybe that would work and we can deallocate the jmethodID block.
That said, I never saw the PR 66ef04a and I'd want to review that carefully. My first quick click-through was not positive. I hid JMethodIDBlocks in method.cpp on purpose! Plus it seems like a huge change for JDK 11.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://docs.oracle.com/en/java/javase/16/docs/specs/jni/design.html
"A field or method ID does not prevent the VM from unloading the class from which the ID has been derived. After the class is unloaded, the method or field ID becomes invalid.
...
The programmer must not pass illegal pointers or arguments of the wrong type to JNI functions. Doing so could result in arbitrary consequences, including a corrupted system state or VM crash."
I had read the first quoted text in "Accessing Fields and Methods" when looking into the spec. It states that the method ID becomes invalid after class unloading, but the VM&JNI implementation has tolerated the invalid ID usages after unloading for long time. The second one that you quoted from "Reporting Programming Errors" has stronger language regarding the usages of illegal pointers. It would be good to clarify the language in "Accessing Fields and Methods" to not allow the usage of the invalid IDs after class unloading.
From above, we can conclude that both the JNI spec and implementation disallow the use of invalid jmethodIDs. No existing code 3rd party JNI code can run with invalid jmethodIDs without crashing.
Today's implementation can tolerate the 'invalid' jemthod_ids usages after unloading (by not releasing the memory).
In JDK-8268088, you described a JVMTI-based profiler that "captured stack traces and then symbolized methods using jmethod_ids at a later point during runtime execution, which could happen after unloading the associated classes and loaders."
This is unsafe and can crash with ZGC even if we don't free the JNIMethodBlocks, because
checked_method
could point to a deallocatedMethod
. See more in JDK-8268364.
Agreed, that would become unsafe when we change to release memory for JNIMethodBlock and JNIMethodBlockNodes with unloaded classes. Thanks for filing JDK-8268364.
To handle this correctly, the profiler should ensure that the classes with the collected stacks cannot be GCed. This can be done by creating a GlobalRef on the classloader associated with the jmethodID:
- GetMethodDeclaringClass -> GetClassLoader -> NewGlobalRef
The profiler can release the GlobalRef after it has processed the jmethodID.
I would be very cautious to recommend this practice. I added some comments in JDK-8268364:
That would keep more memory for longer time. The agent behavior would dictate the memory usages of the VM and prevent classes from being unloaded unnecessarily. If an agent only processes all jmethod_ids at the end, it would keep all loaders alive without being able to be unloaded. Also, a bug (fail to release any of the GoabalRef) in the agent might cause memory leaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is -Xcheck:jni that does check the method ID though. Guess I'm repeating my comment above.
-Xcheck:jni is not enabled commonly during testing due to other issues. Catching the issue with -Xcheck:jni before production would help (this case does make it more important to enable -Xcheck:jni).
As far as the new comment goes, for now, it should include JNI until which time we clarify the spec. I don't see the harm in that.
Agreed.
That said, I never saw the PR 66ef04a and I'd want to review that carefully. My first quick click-through was not positive.
I opened PR 66ef04a mainly for reference purpose for now. Please don't spend much time reviewing yet :), since we need to address the invalid jmethod_id usages first.
I hid JMethodIDBlocks in method.cpp on purpose! Plus it seems like a huge change for JDK 11.
I moved JNIMethodBlock and JNIMethodBlockNodes class declarations to method.hpp to trigger destructor properly from classLoaderData.cpp in 66ef04a. I made the change as an internal fix and intended to only contribute it to JDK head if there was no issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From above, we can conclude that both the JNI spec and implementation disallow the use of invalid jmethodIDs. No existing code 3rd party JNI code can run with invalid jmethodIDs without crashing.
Today's implementation can tolerate the 'invalid' jemthod_ids usages after unloading (by not releasing the memory).
Today's JNI implementation in HotSpot does not in any way tolerate such invalid jmethodIDs. If an invalid jmethodID is passed to any APIs declared in jni.h, the VM will crash immediately. Please see JDK-8268406 for more details.
It's only JVMTI that tolerates invalid jmethodIDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today's JNI implementation in HotSpot does not in any way tolerate such invalid jmethodIDs. If an invalid jmethodID is passed to any APIs declared in jni.h, the VM will crash immediately. Please see JDK-8268406 for more details.
It's only JVMTI that tolerates invalid jmethodIDs.
Thanks for checking into that! That narrows the issue, which is good.
I also created JDK-8268416 for clarifying the spec.
Hi Coleen, Some more background context related to this: Users/developers would be happy if we can release memory for jmethod_ids in unloaded cases and get rid of the memory leak. I was pinged recently for the leak after the earlier unsuccessful attempt to fix with 66ef04a change (on JDK 11, and was rollback due to agent crashes). I haven't found a general solution that would work well without the memory leak. So decided to clarify the comment for now and also open up the discussion in JDK-8268088. The agent that ran into crashes with the attempted fix saved stacktraces and jmethod_ids from periodic sampling. It then tried to get the class and method information (name, etc) from the saved jmethod_ids at later points during the execution and crashed in Method::checked_resolve_jmethod_id with a stale jmethod_id. One possible solution that I thought of was to 'remove' the jmethod_ids in agent during unloading (described in https://bugs.openjdk.java.net/browse/JDK-8268088?focusedCommentId=14424635&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14424635). That could allow deallocating the memory related to those jmethod_ids within the VM. However, that's not a general and efficient solution because:
It would be good if we can find a workable solution for this (for long term). Thanks! |
Hi, I put a short comment in JDK-8268088 which I'll expand later. There's already a JVMTI callback for unloading classes that the user could use to walk the saved jmethodID stacks: // notify the debugger Otherwise, don't change class unloading to add a callback because it could be sensitive to safepoints and unsafe places. My comment in this bug echoes what Ioi said above. To prevent class unloading, the class mirror should be saved with the Method. This is the approach taken throughout the vm. I think hunting down applications that need to fix their profilers may be difficult, but we could add some deprecation messages and warnings (or even an option, which I thought was what you were going to do). I'm not sure why so many lines are changed in your above pull request. For now the updated message seems fine to me for JDK 17. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@jianglizhou This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 67 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
A new capability as Ioi suggested in https://bugs.openjdk.java.net/browse/JDK-8268364?focusedCommentId=14426310&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14426310 would be better than using the existing callback for unloading. The possible approach that I described in https://bugs.openjdk.java.net/browse/JDK-8268088?focusedCommentId=14424635&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14424635 also looks for the same direction (however there are still other issues as described in the bug comment). Also, we need a good strategy for existing code that expects the current behavior.
Prevent or defer class unloading would keep more memory for longer time. It could cause other unwanted side-effects.
Do you refer to 66ef04a? That's based on my internal change on JDK 11 (with other local changes). I restructured code by moving JNIMethodBlock and JNIMethodBlockNodes class declarations to the header file. That's needed to trigger destructor properly from classLoaderData.cpp. The original logic was kept the same, and only the code was being moved around. I opened my internal change for reference purposes and assisting the discussions to address the memory leak. |
Thanks for the review! |
/integrate |
@jianglizhou Since your change was applied there have been 73 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit ae16052. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
This PR clarifies Method::clear_jmethod_ids() related comments in ClassLoaderData::~ClassLoaderData(). It adds more explanations on why Method::clear_jmethod_ids only sets the jmethod_ids to NULL without releasing the memory for related JNIMethodBlocks and JNIMethodBlockNodes.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4383/head:pull/4383
$ git checkout pull/4383
Update a local copy of the PR:
$ git checkout pull/4383
$ git pull https://git.openjdk.java.net/jdk pull/4383/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 4383
View PR using the GUI difftool:
$ git pr show -t 4383
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4383.diff