Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8259070: Add jcmd option to dump CDS #2737

Closed
wants to merge 17 commits into from
Closed

Conversation

@yminqi
Copy link
Contributor

@yminqi yminqi commented Feb 26, 2021

Hi, Please review

Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with
java -XX:DumpLoadedClassList=<classlist> ....
to collect shareable class names and saved in file <classlist> , then
java -Xshare:dump -XX:SharedClassListFile=<classlist> -XX:SharedArchiveFile=<archivefile> ...
With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive.
The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved.
New added jcmd option:
jcmd <pid or AppName> VM.cds static_dump <filename>
or
jcmd <pid or AppName> VM.cds dynamic_dump <filename>
To dump dynamic archive, requires start app with newly added flag -XX:+RecordDynamicDumpInfo, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process.
The file name is optional, if the file name is not supplied, the file name will take format of java_pid<number>_static.jsa or java_pid<number>_dynamic.jsa for static and dynamic respectively. The <number> is the application process ID.

Tests: tier1,tier2,tier3,tier4

Thanks
Yumin


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737
$ git checkout pull/2737

Update a local copy of the PR:
$ git checkout pull/2737
$ git pull https://git.openjdk.java.net/jdk pull/2737/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 2737

View PR using the GUI difftool:
$ git pr show -t 2737

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/2737.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Feb 26, 2021

👋 Welcome back minqi! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Feb 26, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Feb 26, 2021

@yminqi The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot label Feb 26, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Feb 26, 2021

@yminqi
Copy link
Contributor Author

@yminqi yminqi commented Feb 26, 2021

/label add hotspot-runtime
/label add serviceability

@openjdk
Copy link

@openjdk openjdk bot commented Feb 26, 2021

@yminqi
The hotspot-runtime label was successfully added.

@openjdk
Copy link

@openjdk openjdk bot commented Feb 26, 2021

@yminqi
The serviceability label was successfully added.

Copy link
Member

@calvinccheung calvinccheung left a comment

Looks like a good usability enhancement to CDS.
Some comments below...

Thanks,
Calvin

char exec_path[JVM_MAXPATHLEN]; // $JAVA_HOME/bin/java or %JAVA_HOME%\\jre\bin\\java ...
char classlist_name[JVM_MAXPATHLEN];

os::snprintf(classlist_name, sizeof(classlist_name), "%s.classlist", file);

This comment has been minimized.

@calvinccheung

calvinccheung Feb 26, 2021
Member

I think the file contains the ".jsa" suffix. So the classlist_name would be .jsa.classlist.
Maybe only the prefix of file should be passed into cmd_dump_static() and cmd_dump_dynamic() and have the functions append the suffix for the names for the classlist and archive?

Copy link
Member

@calvinccheung calvinccheung left a comment

Below are my comments...

src/hotspot/share/memory/dynamicArchive.cpp Outdated Show resolved Hide resolved
}
if (!RecordDynamicDumpInfo) {
output->print_cr("Please run with -Xshare:auto -XX:+RecordDynamicDumpInfo dumping dynamic archive!");
return;

This comment has been minimized.

@iklam

iklam Feb 26, 2021
Member

There are several error conditions:

(1) CDS is not configured for this VM build. In this case, INCLUDE_CDS is false. The DumpSharedArchiveDCmd should be placed inside #if INCLUDE_CDS, so the user won't be able to issue jcmd VM.cds at all, and we will never come to here.

(2) CDS is configured, but the JVM process is not running with CDS enabled. This could have several causes:

  • The JVM does not have a built-in archive.
  • User has specified -XX:SharedArchiveFile=foo.jsa, but foo.jsa doesn't exist
  • The shared archive exists, but has failed to map
  • -Xshare:off is specified in the command-line.

I think all of the above can be checked inside metaspce.cpp. Note that if you have specified DynamicDumpSharedSpaces but the base archive fails to map, you will get a similar error.

if (RecordDynamicDumpInfo && !UseSharedSpaces) {
  vm_exit_during_initialization("RecordDynamicDumpInfo is unsupported when base CDS archive is not loaded
");
}

(3) jcmd VM.cds dynamic_dump is used on a JVM process without RecordDynamicDumpInfo:

We can do the check here, but I think we can change the error message to be more specific:

if (!RecordDynamicDumpInfo) {
  output->print_cr("Unable to dump dynamic shared archive. "
                   "Please restart the JVM with -XX:+RecordDynamicDumpInfo");
}

Note that the user can get to (3) with incorrect command-line options such as java -Xshare:off ....., but we don't need to list all those conditions here. Instead, if the user follows the suggestion of (3) and add -XX:+RecordDynamicDumpInfo to the command-line, they will then get the error message of (2) and will know how to proceed further.

output->print_cr("Static dump");
if (file_name ==nullptr) {
os::snprintf(filename, sizeof(filename), "java_pid%d_static.jsa", os::current_process_id());
file = filename;

This comment has been minimized.

@iklam

iklam Feb 26, 2021
Member

I think the above os::snprintf can be combined for both dynamic and static case:

int n = os::snprintf(filename, sizeof(filename), "java_pid%d_%s.jsa", os::current_process_id()
                     (is_static ? "static" : "dynamic");
assert(n < sizeof(filename), "should not truncate");

snprintf man page says "a return value of size or more means that the output was truncated."

src/hotspot/share/memory/metaspaceShared.cpp Outdated Show resolved Hide resolved
char exec_path[JVM_MAXPATHLEN]; // $JAVA_HOME/bin/java or %JAVA_HOME%\\jre\bin\\java ...
char classlist_name[JVM_MAXPATHLEN];

os::snprintf(classlist_name, sizeof(classlist_name), "%s.classlist", file);

This comment has been minimized.

@iklam

iklam Feb 26, 2021
Member

Need to check for truncation. We should also add a test case for a very long file names specified in "jcmd VM.cds ...."

src/hotspot/share/runtime/arguments.cpp Show resolved Hide resolved
src/hotspot/share/runtime/globals.hpp Outdated Show resolved Hide resolved
src/hotspot/share/memory/metaspaceShared.cpp Outdated Show resolved Hide resolved
src/hotspot/share/memory/metaspaceShared.cpp Outdated Show resolved Hide resolved
Copy link
Member

@tstuefe tstuefe left a comment

Hi Yumin,

This is a very useful addition.

My biggest concern with this patch though is the use of os::fork_and_exec() in regular, non-fatal situations. I had a look at that function and I think that is very unsafe.

  • it does not close file descriptors, so the child vm will inherit all file descriptors not opened with FD_CLOEXEC. In Runtime.exec, we do this whole dance around safely closing off all unused file descriptors, which is missing here. Note that even though we mostly open all fds with CLOEXEC, this does not matter, since user code may not do that.
  • It uses vfork "if available", which is probably always, but that may be okay since the child exec's right away. Still, vfork makes me nervous.
  • Weirdly enough, it always spawns the child program via one indirection using a shell; so there is always the shell between you and your spawned VM, and you probably wont get the return code of your child vm process back.

Until now, os::fork_and_exec() was only used in fatal situations where the VM was about to die anyway. And where it did not really matter if it worked or not.

If we now want to use it in regular situations, we need to give that thing an overhaul and make it a first class fork api, and also test it better that it is today. The file descriptor issue has to be addressed at the very least.

I really would consider rewriting the whole thing using posix_spawn. Since JDK15 I think posix_spawn is the default for Runtime.exec, so we know it works well. I would also do this in a separate RFE.

Alternatively, you could call into java and use Runtime.exec().

Further remarks inline.

Thanks, Thomas

src/hotspot/share/memory/metaspaceShared.cpp Outdated Show resolved Hide resolved
src/hotspot/share/memory/metaspaceShared.cpp Outdated Show resolved Hide resolved
char* buff_start = exec_path + strlen(exec_path);
snprintf(buff_start, sizeof(exec_path), " -cp %s %s", app_class_path, java_command);
output->print_cr("%s", exec_path);
os::fork_and_exec(exec_path);

This comment has been minimized.

@tstuefe

tstuefe Feb 27, 2021
Member

Apart from what I wrote above, how do you handle fork/exec errors? There are a number of things that can go wrong here. At the very least I would scan the return code. And the return code of the child VM.

@iklam
Copy link
Member

@iklam iklam commented Feb 27, 2021

I really would consider rewriting the whole thing using posix_spawn. Since JDK15 I think posix_spawn is the default for Runtime.exec, so we know it works well. I would also do this in a separate RFE.

Alternatively, you could call into java and use Runtime.exec().

I think we should call into Java and use Runtime.exec(). Running a subprocess is very complicated, so we shouldn't try to duplicate the code in the VM.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Feb 28, 2021

Mailing list message from Thomas St��fe on hotspot-dev:

Oh right, then it could get truncated, but should not overflow.

On Sat, Feb 27, 2021 at 7:15 PM Ioi Lam <iklam at openjdk.java.net> wrote:

@yminqi
Copy link
Contributor Author

@yminqi yminqi commented Mar 2, 2021

@calvinccheung @iklam @tstuefe Thanks for review! I will use CDS.java to implement the dumping for next update. This way, we deal with CDS related code in a central place. Also using Runtime.exec will clear your concern, plus the code will be more readable though it will add some bridge functions between java/vm.

@openjdk openjdk bot removed the rfr label Mar 9, 2021
Yumin Qi Yumin Qi
@openjdk openjdk bot added the rfr label Mar 9, 2021
@yminqi
Copy link
Contributor Author

@yminqi yminqi commented Mar 31, 2021

@iklam Thanks. Yes, the comment out for the deletion is unintentional for test only and forgot to revert. I will revert it. Also I will merge with upstream. Since this repo has no merge from upstream for long time, it may cause conflicts. If too many conflicts to resolve I would like to withdraw this one and resubmit as a new PR. Thanks.

@@ -300,6 +300,8 @@
template(jdk_internal_misc_CDS, "jdk/internal/misc/CDS") \
template(generateLambdaFormHolderClasses, "generateLambdaFormHolderClasses") \
template(generateLambdaFormHolderClasses_signature, "([Ljava/lang/String;)[Ljava/lang/Object;") \
template(dumpSharedArchive, "dumpSharedArchive") \
template(dumpSharedArchive_signature, "(ZLjava/lang/String;)V") \

This comment has been minimized.

@iklam

iklam Mar 31, 2021
Member

Need to align the "dumpSharedArchive" part with the previous line.

src/hotspot/share/prims/jvm.cpp Show resolved Hide resolved
if (Arguments::init_shared_archive_paths()) {
DynamicArchive::dump();
} else {
THROW_MSG(vmSymbols::java_lang_RuntimeException(),

This comment has been minimized.

@iklam

iklam Mar 31, 2021
Member

Need to set ArchiveClassesAtExit to NULL before throwing the exception, since dynamic dump may not work anymore after the failure.

JVM_ENTRY(void, JVM_DumpDynamicArchive(JNIEnv *env, jstring archiveName))
#if INCLUDE_CDS
assert(UseSharedSpaces && RecordDynamicDumpInfo, "already checked in arguments.cpp?");
if (DynamicArchive::has_been_dumped_once()) {

This comment has been minimized.

@iklam

iklam Mar 31, 2021
Member

Maybe add a comment like this:?

// During dynamic archive dumping, some of the data structures are overwritten so
// we cannot dump the dynamic archive again. TODO: this should be fixed.

public class LingeredTestApp extends LingeredApp {
// Do not use default test.class.path in class path.
public boolean useDefaultClasspath() { return false; }

This comment has been minimized.

@iklam

iklam Mar 31, 2021
Member

It's not obvious that you're changing the behavior of the base class by overriding a member function. It's better to have

public LingeredTestApp() {
   setUseDefaultClasspath(false);
}

Also, the name of LingeredTestApp is kind of generic. How about renaming it to JCmdTestLingeredApp?

is_static = JNI_FALSE;
output()->print_cr("Dynamic dump:");
if (!UseSharedSpaces) {
output()->print_cr("CDS is not available for the JDK");

This comment has been minimized.

@iklam

iklam Apr 1, 2021
Member

I think we should use a message similar to this:

$ java -Xshare:off -XX:ArchiveClassesAtExit=xxx -version
Error occurred during initialization of VM
DynamicDumpSharedSpaces is unsupported when base CDS archive is not loaded

How about "dynamic_dump is unsupported when base CDS archive is not loaded".

}
Symbol* cds_name = vmSymbols::jdk_internal_misc_CDS();
Klass* cds_klass = SystemDictionary::resolve_or_fail(cds_name, true /*throw error*/, CHECK);
JavaValue result(T_OBJECT);

This comment has been minimized.

@iklam

iklam Apr 1, 2021
Member

Should be result(T_VOID) to match the signature of CDS.dumpSharedArchive()

cds_klass,
vmSymbols::dumpSharedArchive(),
vmSymbols::dumpSharedArchive_signature(),
&args, THREAD);

This comment has been minimized.

@iklam

iklam Apr 1, 2021
Member

Should be &args, CHECK);. Recently, we have started to to avoid using THREAD if the callee function may throw an exception.


try {
PrintStream prt = new PrintStream(fileName);
prt.println("Command:");

This comment has been minimized.

@iklam

iklam Apr 1, 2021
Member

Try-with-resources should be used to make sure the streams are closed when the method exits (normally or on exception).

try (InputStreamReader isr = new InputStreamReader(stream);
     BufferedReader rdr = new BufferedReader(isr);
     PrintStream prt = new PrintStream(fileName);)
{
    prt.println("Command:"); ....
@iklam
iklam approved these changes Apr 14, 2021
Copy link
Member

@iklam iklam left a comment

The latest versions looks good to me.

@openjdk
Copy link

@openjdk openjdk bot commented Apr 14, 2021

@yminqi This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8259070: Add jcmd option to dump CDS

Reviewed-by: ccheung, iklam, mli

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 39 new commits pushed to the master branch:

  • 5931948: 8265246: Fix macos-Aarch64 build after JDK-8263709
  • 79bff21: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines
  • f4c3efd: 8265192: [macos_aarch64] configure script fails if GNU uname in PATH
  • e167577: 8265078: jpackage tests on Windows leave large temp files
  • 05f851e: 8265236: ProblemList java/foreign/TestUpcall.java on macosx-aarch64
  • d1b28e7: 8265231: (fc) ReadDirect and WriteDirect tests fail after fix for JDK-8264821
  • 57f86a0: 8265235: ProblemList java/foreign/TestIntrinsics.java on macosx-aarch64
  • 4c83d24: 8058176: [mlvm] tests should not allow code cache exhaustion
  • 9406744: 8264976: Minor numeric bug in AbstractSplittableWithBrineGenerator.makeSplitsSpliterator
  • 80026d8: 8265174: Update Class.getDeclaredMethods to discuss synthetic and bridge methods
  • ... and 29 more: https://git.openjdk.java.net/jdk/compare/008fc75a290f0286fb30c3b0a2f5abada441149c...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Apr 14, 2021
Copy link
Member

@calvinccheung calvinccheung left a comment

Looks good. Just have a few minor comments.

@Hamlin-Li
Copy link

@Hamlin-Li Hamlin-Li commented Apr 15, 2021

I looked back at the history, and find out why it calls from JVM into java then into JVM again, nice solution!

@yminqi
Copy link
Contributor Author

@yminqi yminqi commented Apr 15, 2021

@iklam @calvinccheung @Hamlin-Li Thanks for review!

@yminqi
Copy link
Contributor Author

@yminqi yminqi commented Apr 15, 2021

/integrate

@openjdk openjdk bot closed this Apr 15, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Apr 15, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Apr 15, 2021

@yminqi Since your change was applied there have been 39 commits pushed to the master branch:

  • 5931948: 8265246: Fix macos-Aarch64 build after JDK-8263709
  • 79bff21: 8263709: Cleanup THREAD/TRAPS/CHECK usage in JRT_ENTRY routines
  • f4c3efd: 8265192: [macos_aarch64] configure script fails if GNU uname in PATH
  • e167577: 8265078: jpackage tests on Windows leave large temp files
  • 05f851e: 8265236: ProblemList java/foreign/TestUpcall.java on macosx-aarch64
  • d1b28e7: 8265231: (fc) ReadDirect and WriteDirect tests fail after fix for JDK-8264821
  • 57f86a0: 8265235: ProblemList java/foreign/TestIntrinsics.java on macosx-aarch64
  • 4c83d24: 8058176: [mlvm] tests should not allow code cache exhaustion
  • 9406744: 8264976: Minor numeric bug in AbstractSplittableWithBrineGenerator.makeSplitsSpliterator
  • 80026d8: 8265174: Update Class.getDeclaredMethods to discuss synthetic and bridge methods
  • ... and 29 more: https://git.openjdk.java.net/jdk/compare/008fc75a290f0286fb30c3b0a2f5abada441149c...master

Your commit was automatically rebased without conflicts.

Pushed as commit e7cbeba.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@yminqi yminqi deleted the yminqi:jdk-8259070 branch Apr 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment