Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8263670: pmap and pstack in jhsdb do not work on debug server #3027

Closed
wants to merge 3 commits into from

Conversation

YaSuenag
Copy link
Member

@YaSuenag YaSuenag commented Mar 16, 2021

jhsdb supports pmap (jhsdb jmap) and pstack (jhsdb jstack --mixed), and they work fine if they attach to live process or to coredump, however they do not work on debug server as following:

$ jhsdb jmap --connect localhost
Attaching to remote server localhost, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0.10+9
remote configuration is not yet implemented

pmap and pstack depend on CDebugger in SA, however it would not be set in case of remote debugger client. We can avoid it if we can delegate the process to debug server.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8263670: pmap and pstack in jhsdb do not work on debug server

Reviewers

Download

To checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3027/head:pull/3027
$ git checkout pull/3027

To update a local copy of the PR:
$ git checkout pull/3027
$ git pull https://git.openjdk.java.net/jdk pull/3027/head

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 16, 2021

👋 Welcome back ysuenaga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Mar 16, 2021

@YaSuenag The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Mar 16, 2021
@YaSuenag YaSuenag marked this pull request as ready for review Mar 16, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 16, 2021
@mlbridge
Copy link

mlbridge bot commented Mar 16, 2021

Webrevs

@plummercj
Copy link
Contributor

plummercj commented Mar 17, 2021

If I understand correctly, this is very different than the way we normally implement remote debugging support. Normally pages are read in from the remote VM process and into the local SA client, where it can then implement SA support in much the same way as it normally does. For PStack and PMap, you've chosen to instead just execute the commands in the remote debugd server, and return the (text) result to the client. It's not clear to me why the normal model (of reading in the remote pages and leveraging existing PStack and PMAp supprt) can't instead be used so we can keep the remote debugging support consistent. How are PStack and PMap different from other SA support that does not require remote execution of commands.

@YaSuenag
Copy link
Member Author

YaSuenag commented Mar 17, 2021

If I understand correctly, this is very different than the way we normally implement remote debugging support. Normally pages are read in from the remote VM process and into the local SA client, where it can then implement SA support in much the same way as it normally does. For PStack and PMap, you've chosen to instead just execute the commands in the remote debugd server, and return the (text) result to the client.

Yes.

It's not clear to me why the normal model (of reading in the remote pages and leveraging existing PStack and PMAp supprt) can't instead be used so we can keep the remote debugging support consistent. How are PStack and PMap different from other SA support that does not require remote execution of commands.

SA would set various information when it attaches to debuggee, so it is difficult to catch up them when remote debugger attached because only debugd can attach to debuggee directly.
As I said in the description of this PR, both PMap and PStack depend on CDebugger. CDebugger holds low-level information such as library list and native thread list. They would be initialized when it attached - it is startup of debugd, and we cannot get them when we connect to debugd.

I thought to export CDebugger as RMI object, but it seems to complex. So I decided to run both pmap and pstack on debugd.

@plummercj
Copy link
Contributor

plummercj commented Mar 17, 2021

Do we have any issues with any other clhsdb commands that rely CDebugger. I haven't looked into the implementation, but I'm suspicious of thread, threads, findpc of an address in the executable, and findsym.

And speaking of CDebugger, the SA "debugger" hierarchy is too heavily abstracted. I counted about 18 classes and interfaces that have "Debugger" in their name, and some of the interfaces get inherited more than once. I really wonder if it has to be that complex. My first adventure into this mess was for JDK-8239062, a time when I knew close to nothing about SA. Was not fun and I eventually gave up. You might want to have a look at the CR, if not just to add some comments and maybe enlighten me on some things.

* @test
* @bug 8263670
* @requires vm.hasSA
* @requires os.family != "windows"
Copy link
Contributor

@plummercj plummercj Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You exclude windows but you check for ".dll" below.

if (SATestUtils.needsPrivileges()) {
// This tests has issues if you try adding privileges on OSX. The debugd process cannot
// be killed if you do this (because it is a root process and the test is not), so the destroy()
// call fails to do anything, and then waitFor() will time out. If you try to manually kill it with
// a "sudo kill" command, that seems to work, but then leaves the LingeredApp it was
// attached to in a stuck state for some unknown reason, causing the stopApp() call
// to timeout. For that reason we don't run this test when privileges are needed. Note
// it does appear to run fine as root, so we still allow it to run on OSX when privileges
// are not required.
throw new SkippedException("Cannot run this test on OSX if adding privileges is required.");
}
Copy link
Contributor

@plummercj plummercj Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pmap and pstack are not even supported on OSX with a live process. The automated github testing did not catch this because it passed due to the above check:

TEST RESULT: Passed. Skipped: jtreg.SkippedException: Cannot run this test on OSX if adding privileges is required.

If it passed the check, it would have failed below to find a .dylib. You could change the code below to something like what is in ClhsdbPmap.java:

            if (!withCore && Platform.isOSX()) {
                expStrMap.put("pmap", List.of("Not available for Mac OS X processes"));
            } else {
                expStrMap.put("pmap", List.of("jvm", "java", "jli", "jimage"));
            }

@YaSuenag
Copy link
Member Author

YaSuenag commented Mar 17, 2021

Do we have any issues with any other clhsdb commands that rely CDebugger. I haven't looked into the implementation, but I'm suspicious of thread, threads, findpc of an address in the executable, and findsym.

Yes, I confirmed findsim has same issue on remote debugger. It will be fixed if findsym use execCommandOnServer() in this PR, but it is not in this scope now.

And speaking of CDebugger, the SA "debugger" hierarchy is too heavily abstracted. I counted about 18 classes and interfaces that have "Debugger" in their name, and some of the interfaces get inherited more than once. I really wonder if it has to be that complex. My first adventure into this mess was for JDK-8239062, a time when I knew close to nothing about SA. Was not fun and I eventually gave up. You might want to have a look at the CR, if not just to add some comments and maybe enlighten me on some things.

I agree with you that Debugger (includes Debugger) and their inheritances are very complex, but it is very difficult to refactor IMHO - it could be the same as creating SA from scratch. It would be better if all commands run on client side of course, but it seems to be difficult, thus I decided to run some primitive commands (e.g. pmap, pstack) are run on server side (debugd).
If this PR is integrated, JDK-8239062 also can be fixed because it may be able to run on debugd.

OTOH this PR could be more complex to SA implementation for remote debugger, but I have no idea to fix this problem without refactoring (C)Debugger interfaces.

if (SATestUtils.needsPrivileges()) {
// This tests has issues if you try adding privileges on OSX. The debugd process cannot
// be killed if you do this (because it is a root process and the test is not), so the destroy()
// call fails to do anything, and then waitFor() will time out. If you try to manually kill it with
// a "sudo kill" command, that seems to work, but then leaves the LingeredApp it was
// attached to in a stuck state for some unknown reason, causing the stopApp() call
// to timeout. For that reason we don't run this test when privileges are needed. Note
// it does appear to run fine as root, so we still allow it to run on OSX when privileges
// are not required.
throw new SkippedException("Cannot run this test on OSX if adding privileges is required.");
}
Copy link
Contributor

@plummercj plummercj Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this code and the giant comment block will now be in 3 places. Maybe it should be moved to SATestUtils. Maybe something like validateSADebugDPrivileges().

@YaSuenag
Copy link
Member Author

YaSuenag commented Mar 18, 2021

Thanks @plummercj for the comment! I updated testcases.

Copy link
Contributor

@plummercj plummercj left a comment

The changes look good. Just curious though, why are all the debugd tests excluded on windows?

@openjdk
Copy link

openjdk bot commented Mar 19, 2021

@YaSuenag This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8263670: pmap and pstack in jhsdb do not work on debug server

Reviewed-by: cjplummer, sspitsyn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 108 new commits pushed to the master branch:

  • b23228d: 8263914: CDS fails to find the default shared archive on x86_32
  • a5e7a89: 8263904: compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java fails on x86_32
  • f62b100: 8263895: Test nsk/jvmti/GetThreadGroupChildren/getthrdgrpchld001/getthrdgrpchld001.cpp uses incorrect indices
  • f84b52b: 8263897: compiler/c2/aarch64/TestVolatilesSerial.java failed with "java.lang.RuntimeException: Wrong method"
  • f08bf4b: 8263891: Changes for 8076985 missed the fix.
  • b2df513: 8261785: Calling "main" method in anonymous nested class crashes the JVM
  • 840ab7b: 8263894: Convert defaultPrinter and printers fields to local variables
  • ba504fc: 8187450: JNI local refs exceeds capacity warning in NetworkInterface::getAll
  • 0abbfb2: 8263729: [test] divert spurious output away from stream under test in ProcessBuilder Basic test
  • 6c2220e: 8263861: Shenandoah: Remove unused member in ShenandoahGCStateResetter
  • ... and 98 more: https://git.openjdk.java.net/jdk/compare/c484d8904285652246c3af212a4211b9a8955149...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 19, 2021
@YaSuenag
Copy link
Member Author

YaSuenag commented Mar 20, 2021

@plummercj Thanks for your reivew!

The changes look good. Just curious though, why are all the debugd tests excluded on windows?

It is caused by JDK-8224252. I left a comment on it that we cannot destroy debugd process normally on Windows.

Copy link
Contributor

@sspitsyn sspitsyn left a comment

Hi Yasumasa,
The fix looks good. I've added one nit.
Thanks,
Serguei

@@ -77,7 +78,7 @@ public void run(PrintStream out, Debugger dbg) {
}
} else {
if (getDebugeeType() == DEBUGEE_REMOTE) {
out.println("remote configuration is not yet implemented");
out.print(((RemoteDebuggerClient)dbg).execCommandOnServer("pmap", null));
Copy link
Contributor

@sspitsyn sspitsyn Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Indent is incorrect now.

Copy link
Member Author

@YaSuenag YaSuenag Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I fixed it in new commit.

@YaSuenag
Copy link
Member Author

YaSuenag commented Mar 27, 2021

/integrate

@openjdk openjdk bot closed this Mar 27, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 27, 2021
@openjdk
Copy link

openjdk bot commented Mar 27, 2021

@YaSuenag Since your change was applied there have been 208 commits pushed to the master branch:

  • 38e0a58: 8264273: macOS: zero VM is broken due to no member named 'is_cpu_emulated' after JDK-8261966
  • c9d2d02: 8263632: Improve exception handling of APIs in classLoader.cpp
  • 59ed1fa: 8264087: Use the blessed modifier order in jdk.jconsole
  • 054e0a4: 8264017: Correctly report inlined frame in JFR sampling
  • d6bb153: 8264240: [macos_aarch64] enable appcds support after JDK-8263002
  • 7284f01: 8262110: DST starts from incorrect time in 2038
  • 3a28dc8: 8264178: Unused method Threads::nmethods_do
  • 33c94ff: 8263376: CTW (Shenandoah): assert(mems <= 1) failed: No node right after call if multiple mem projections
  • 4e74de4: 8264111: (fs) Leaking NativeBuffers in case of errors during UnixUserDefinedFileAttributeView.read/write
  • 57115fa: 8189198: Add "forRemoval = true" to Applet API deprecations
  • ... and 198 more: https://git.openjdk.java.net/jdk/compare/c484d8904285652246c3af212a4211b9a8955149...master

Your commit was automatically rebased without conflicts.

Pushed as commit a209ed0.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@YaSuenag YaSuenag deleted the JDK-8263670 branch Mar 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
3 participants