Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8324668: JDWP process management needs more efficient file descriptor handling #17588

Closed
wants to merge 10 commits into from

Conversation

jaikiran
Copy link
Member

@jaikiran jaikiran commented Jan 26, 2024

Can I please get a review of this change which proposes to address https://bugs.openjdk.org/browse/JDK-8324668?

This change proposes to fix the issue in jdwp where when launching a child process (for the launch=<cmd> option), it iterates over an extremely large number of file descriptors to close each one of those. Details about the issue and the proposed fixed are added in a subsequent comment in this PR #17588 (comment). tier1, tier2 and tier3 testing is currently in progress with this change.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8324668: JDWP process management needs more efficient file descriptor handling (Enhancement - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17588/head:pull/17588
$ git checkout pull/17588

Update a local copy of the PR:
$ git checkout pull/17588
$ git pull https://git.openjdk.org/jdk.git pull/17588/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17588

View PR using the GUI difftool:
$ git pr show -t 17588

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17588.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 26, 2024

👋 Welcome back jpai! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 26, 2024
@openjdk
Copy link

openjdk bot commented Jan 26, 2024

@jaikiran The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Jan 26, 2024
@mlbridge
Copy link

mlbridge bot commented Jan 26, 2024

@RogerRiggs
Copy link
Contributor

The PR description appears in every email sent for the PR. For PRs with extensive descriptions, perhaps they can be included as a separate comment.

@jaikiran
Copy link
Member Author

The JVM can be launched with the jdwp agent, like -agentlib:jdwp=transport=xxx,server=y. Among other options to the jdwp agentlib, one option is the launch option, which is of the form launch=<cmdline>. cmdline can be any arbitrary command.

When that option is set, jdwp during initialization will then launch that command. The implementation of launching the command (on *nix) involves fork() followed by execvp(). After the child process is forked, jdwp implementation, closes the file descriptors that were copied over (by fork()) from the parent (JVM) process into the (arbitrary) child process. Since jdwp doesn't know how many such file descriptors exist, it queries a system level configuration for that launched process to know max allowed file descriptors (OPEN_MAX). It then iterates over each of these file descriptor and calls close() on them sequentially one at a time, irrespective of whether or not those many file descriptors were indeed present or not.

Until recently, on macOS, for the JVM we used to limit the max file descriptors to 10240. The jdwp code above would then iterate over these 10240 file descriptors (I'm leaving out the part where the code skips the standard input/output/error file descriptors) and close them one at a time (even if there weren't that many). This apparently would finish in reasonable amount of time (likely slow, but wasn't noticed to cause any issues).

In https://bugs.openjdk.org/browse/JDK-8300088 we stopped setting that 10240 limit on macOS for the JVM. After that change, on macos, the max file descriptors for JVM turns out to be 2147483647. When jdwp then launches the child process from the JVM, it then has to now iterate almost 2147483647 times and close each of the file descriptors in that logic. This now became extremely slow and noticable and caused 3 tests (https://bugs.openjdk.org/browse/JDK-8324578) which were using the launch=<cmdline> option to start failing with a timeout - the jdwp process hadn't yet finished the iteration to close the file descriptors for the child process, by the time jtreg noticed the test had gone past the allowed timeout.

In theory, this entire code which deals with launching a child process from the JVM is what the native implementation of Java's java.lang.ProcessBuilder.start() might do. So l had a look at how it's handled there and it turns out, that the implementation there has an optimized way to close these copied over descriptors for the child process. That's done in the closeDescriptors() function here https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libjava/childproc.c#L74. What that code does is, it relies on a platform specific directory whose entires contain the file descriptors that are open for the current process. Within the launched child process, that code then lists that platform specific directory and for each returned entry it closes that file descriptor. The advantage of this is that it returns a more accurate and realistic number of file descriptors that have been actually copied over to the child process, instead of using the (extremely high) OPEN_MAX value as the upper bound. The code in libjava has an additional fallback which it uses if this logic of closing descriptor doesn't succeed. The fallback is same as what jdwp current does as the default i.e. using OPEN_MAX as the upper bound and iterating over each and closing each descriptor.

The closeDescriptors() in libjava isn't exported, so we can't just use it in jdwp. The commit in this PR copies over closeDescriptors() function from libjava into the jdwp code and then uses that as the default approach of closing the copied over file descriptors of the child process. If that fails, then we fallback to the current approach of using OPEN_MAX as the upper bound for the iteration.

Relatively minor changes have also been introduced in the copied over code - changes like:

- adding a JDI_ASSERT at a couple of places
- additional code comments at relevant places
- usage of errno in one place
- using rlim_t (from limits.h) instead of int for max_fd variable

I have tested this change with and without the changes done in https://bugs.openjdk.org/browse/JDK-8300088 and the com/sun/jdi/OnThrowTest.java, com/sun/jdi/JITDebug.java and com/sun/jdi/JdwpOnThrowTest.java tests now pass in both cases.

My knowledge of C is elementary, so there might be a few more things that need to be considered in this change. Looking forward to any such changes that might be necessary. For example, I'm not sure if <limits.h> in AIX defines rlim_t type. However, looking at one other place where rlim_t gets used in the native JDK code is in IOUtil.c https://github.com/openjdk/jdk/blob/master/src/java.base/unix/native/libnio/ch/IOUtil.c#L161 and looking at that code, I don't see any AIX specific overridden files or code for its usage. So I'm guessing the usage of rlim_t that this change introduces should be OK on AIX too.

Finally, the libjava code for launching child process has some more specifics, which I didn't want to introduce/copy over here. I decided to focus solely on the file descriptors issue in this PR. I haven't yet reviewed whether jdwp would benefit from doing those additional things that libjava does when launching the child process.

@jaikiran
Copy link
Member Author

Hello Roger,

The PR description appears in every email sent for the PR. For PRs with extensive descriptions, perhaps they can be included as a separate comment.

I had forgotten this. I've updated the description to be much shorter now and added a separate comment for the details.

@jaikiran
Copy link
Member Author

The failures in Github actions jobs are related to this change:

/home/runner/work/jdk/jdk/src/jdk.jdwp.agent/unix/native/libjdwp/exec_md.c:129:15: 
error: unknown type name 'rlim_t'

I will address them soon.

Copy link

@gerard-ziemski gerard-ziemski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered using fcntl(d, F_SETFD, 1) instead of the fancy closeDescriptors()?

I haven't tested it myself, but per the man close page:

Most of the descriptors can be rearranged with dup2(2) or deleted with close() before the execve is attempted, but if some of these descriptors will still be needed if the execve fails, it is necessary to arrange for them to be closed if the execve succeeds.  For this reason, the call “fcntl(d, F_SETFD, 1)” is provided, which arranges that a descriptor will be closed after a successful execve;

/* find max allowed file descriptors for a process
* and assume all were opened for the parent process and
* copied over to this child process. we close them all */
const rlim_t max_fd = sysconf(_SC_OPEN_MAX);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use int, like the original code, instead of rlim_t - as per man page close() expects int, not rlim_t, ex:

        const int max_fd = (int)sysconf(_SC_OPEN_MAX);
        JDI_ASSERT(max_fd != -1);
        int fd;
         /* leave out standard input/output/error file descriptors */
         for (fd = STDERR_FILENO + 1; fd < max_fd; fd++) {
             (void)close(fd);
         }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Gerard, my understanding is that the limit value configured may exceed the int range. I wanted to avoid the overflow by casting it to int in such cases. I had noticed close() takes an int, but I couldn't think of any other way of avoiding the overflow at this place.

In the JVM parent process we do however limit it to INT_MAX. So maybe I should assume that it will be an int cast it to int here, like you suggest, and add a code comment explaining this? Does that sound OK?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine, but in that case we should also do:

JDI_ASSERT(max_fd <= INT_MAX);

in case sysconf(_SC_OPEN_MAX) returns value greater than INT_MAX, since close() only accepts int.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've updated the PR to add this assert. Additionally I've also slightly changed the implementation in the newly introduced closeDescriptors() method to ensure that we don't pass values to close() that exceed INT_MAX.

close(fd);
}

closedir(dp);
Copy link

@gerard-ziemski gerard-ziemski Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be:

(void)close(fd);

and

(void)closedir(dp);

to show that we're ignoring the return value?

Since this area is pretty mysterious to deal with, should we consider checking whether closedir() actually returns am error, and at least be verbal and print a warning, as opposed to silently ignoring it?

Printing a warning, might help us diagnose any issue in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be:
(void)close(fd);
and
(void)closedir(dp);
to show that we're ignoring the return value?

Done. I've updated the PR to cast these calls to void.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this area is pretty mysterious to deal with, should we consider checking whether closedir() actually returns am error, and at least be verbal and print a warning, as opposed to silently ignoring it?
Printing a warning, might help us diagnose any issue in the future?

I think printing a warning if closedir() fails may not be useful and may be misleading too. But your point about logging a warning in this area sounds reasonable. I think what we could log as a warning is if we fallback to the slow sequential close() approach if/when this optimal closeDescriptor() implementation fails. I've updated the PR with such a log message. Do you think that would be enough?

@gerard-ziemski
Copy link

gerard-ziemski commented Jan 26, 2024

Looks good, thank you for debugging this and providing a fix!

I did have some questions and feedback, nothing major, just making sure we cover all our bases.

@jaikiran
Copy link
Member Author

Have you considered using fcntl(d, F_SETFD, 1) instead of the fancy closeDescriptors()?

I haven't tested it myself, but per the man close page:

Most of the descriptors can be rearranged with dup2(2) or deleted with close() before the execve is attempted, but if some of these descriptors will still be needed if the execve fails, it is necessary to arrange for them to be closed if the execve succeeds.  For this reason, the call “fcntl(d, F_SETFD, 1)” is provided, which arranges that a descriptor will be closed after a successful execve;

I hadn't considered that. Now that you mentioned it, I had a look at what man close says and man fcntl says. My understanding of it is that we use fcntl(fd, F_SETFD, FD_CLOEXEC) (the 1 in the example corresponds to FD_CLOEXEC) if we want to close a specific file descriptor only if a subsequent call to execvp (or similar exec function) succeeds. If the subsequent call to execvp fails, then the file descriptor won't be closed. From what I understand of the code in jdwp, I don't think we need that behaviour here. Plus using fcntl won't get rid of our closeDescriptors() function. We will still need it, it's just the call to close() within the loop that gets replaced by a call to fcntl.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on @jaikiran and for determining the underlying problem!

Seems a reasonable approach to copy. A few nits and a suggestion.

Thanks.

@@ -44,6 +48,104 @@ static char *skipNonWhitespace(char *p) {
return p;
}

int
isAsciiDigit(char c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard isdigit function from ctype.h does this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 71 to 75
// closes every file descriptor that is listed as a directory
// entry in "/proc/self/fd" (or its equivalent). standard
// input/output/error file descriptors will not be closed
// by this function. this function returns 0 on failure
// and 1 on success
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing nit: Please start sentences with a capital and end with a period. Thanks. This applies through the copied code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed the code comments to follow this suggestion. I've left single sentence comments as is. If there's a preference to change them too, do let me know and I'll update those too.

snprintf(aix_fd_dir, 32, "/proc/%d/fd", getpid());
#endif

if ((dp = opendir(FD_DIR)) == NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to log the failure here, where you can actually report the reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@jaikiran
Copy link
Member Author

jaikiran commented Jan 29, 2024

Thank you David for those inputs, I've updated the PR to implement those suggestions. CI tests continue to pass with these latest changes.

Copy link

@gerard-ziemski gerard-ziemski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, the changes make the code better already, I just want to make sure we take this opportunity to polish this code really well...

/* find max allowed file descriptors for a process
* and assume all were opened for the parent process and
* copied over to this child process. we close them all */
const rlim_t max_fd = sysconf(_SC_OPEN_MAX);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine, but in that case we should also do:

JDI_ASSERT(max_fd <= INT_MAX);

in case sysconf(_SC_OPEN_MAX) returns value greater than INT_MAX, since close() only accepts int.

" child process optimally, falling back to closing"
" %d file descriptors sequentially", (max_fd - i + 1)));
for (; i < max_fd; i++) {
(void)close(i);
Copy link

@gerard-ziemski gerard-ziemski Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well, I'd really like to check for close()for errors here and report any.

And if we find errors here, do we really want to proceed knowing that something went wrong already?

int fd;
if (isdigit(dirp->d_name[0]) &&
(fd = strtol(dirp->d_name, NULL, 10)) >= from_fd) {
(void)close(fd);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really, really like to check close() for errors and report any.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. If we are seeing issues with close(), probably it is best to fail. That way we'll get a bug report and can figure out exactly why it is failing rather than blindly proceeding without having properly closed all fds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to check close for errors then we will have to know for certain we are only trying to close open fd's, and we will have to deal with EINTR. I would think this is a "best effort" to close open FD's and not something we have to care about in detail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to check close for errors then we will have to know for certain we are only trying to close open fd's, and we will have to deal with EINTR. I would think this is a "best effort" to close open FD's and not something we have to care about in detail.

Right, EINTR would need to be ignored as it would be wrong to restart. I'm scratching my head as to why this code would even do with EIO or if it even possible here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at man close, EIO gets returned if:

[EIO] A previously-uncommitted write(2) encountered an input/output error.

If I understand this launch=<cmd> option correctly for the jdwp agent, then the arbitrary command gets launched very early in the lifecycle of the JVM. So in theory, I think there shouldn't be too many file descriptors that the JVM has opened so far and the chances of it having started writing to some of those opened file descriptors (other than standard out/err) perhaps is slim? But that's just a guess. Probably not a strong one too, given that if the JVM was launched with any other agent along with jdwp, then it's unknown if the other agent would have opened any file descriptors for write.

My understanding of these close() calls after fork() in this code, matches David's - I think it's a best effort attempt. So perhaps ignoring such failures, like we do now, is OK?

Copy link

@gerard-ziemski gerard-ziemski Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring errors just doesn't seem like a good strategy (especially in light of so many mysterious timeouts that are plaguing MACH5 tests mostly on macOS)

This fix right here that we are doing doesn't have to deal with making this code more robust, as long as we follow up in another issue. For that then, I filed JDK-8324979, and that frees us up to move on with this fix.

Comment on lines +126 to +129
if (closeDescriptors() == 0) { /* failed, close the old way */
/* Find max allowed file descriptors for a process
* and assume all were opened for the parent process and
* copied over to this child process. We close them all. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat leaning towards this just being a fatal error. Why would it ever fail? Do we think the version in childproc.c that also does this is doing so for a good reason, or perhaps just like jdwp it used to use the slow version and then kept it around as a backup when the faster version was introduced.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are sliding into " the perfect is the enemy of the good" here. If you want to investigate the resilience of closing FD's then I suggest you look at that in a different PR and let Jai integrate this optimization to avoid the timeouts that were seen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Chris, I had a look at the ProcessImpl implementation to see why that fallback is in place. Looking at the ProcessImpl implementation, it has "launch mechanisms" when launching a child process. It is configurable through a system property.
By default, when not configured, the ProcessImpl implementation uses posix_spawn() instead of fork(). So in its current form on macos (given the context of this PR), when the ProcessBuilder java API gets used to start child processes, the ProcessImpl's implementation ends up using posix_spawn() and thus this entire file descriptor closing in childproc.c never gets called - neither the closeDescriptors() nor the fallback of closing OPEN_MAX number of descriptors. (I won't go into implementation details of what ProcessImpl does when posix_spawn() is used, because after looking at it, I have more questions than answers and I think discussing those here is going to cause distraction).

As for this specific if block, the only time we enter here is if we can't opendir() the platform specific directory containing the open file descriptors. Whether or not it realistically happens, I'm unsure. I think having this fallback is perhaps a better option than just failing in such cases - this fallback when activated will certainly "slowdown" the launch of the arbitrary command but when no timeouts are in place, the arbritary command would indeed get launched.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates look good to me. Thanks.

@gerard-ziemski
Copy link

gerard-ziemski commented Jan 30, 2024

I filed https://bugs.openjdk.org/browse/JDK-8324979 as a followup to make sure that we can get this code as robust as possible.

Copy link

@gerard-ziemski gerard-ziemski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix looks very good, thank you @jaikiran !

Comment on lines 137 to 139
LOG_MISC(("warning: failed to close file descriptors of"
" child process optimally, falling back to closing"
" %d file descriptors sequentially", (max_fd - i + 1)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No one is ever going to see this. This logging is off by default, and when turned on goes to the logfile, not stdout or stderr. Perhaps ERROR_MESSAGE() would be better, which goes to stderr and also gives the option of aborting if the debug agent is launched with errorexit=y.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Chris for pointing to ERROR_MESSAGE(). I've updated the PR to use that.

Copy link
Contributor

@plummercj plummercj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks for taking this on.

@openjdk
Copy link

openjdk bot commented Jan 31, 2024

@jaikiran This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8324668: JDWP process management needs more efficient file descriptor handling

Reviewed-by: gziemski, dholmes, cjplummer

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 63 new commits pushed to the master branch:

  • 432756b: 8325024: java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java incorrect comment information
  • fe78c0f: 8325022: Incorrect error message on client authentication
  • 5b9b176: 8324174: assert(m->is_entered(current)) failed: invariant
  • 0cc8e5b: 8325042: remove unused JVMDITools test files
  • f292053: 8323621: JDK build should exclude snippet class in java.lang.foreign
  • 62c9530: 8324238: [macOS] java/awt/Frame/ShapeNotSetSometimes/ShapeNotSetSometimes.java fails with the shape has not been applied msg
  • 1f2922a: 8320342: Use PassFailJFrame for TruncatedPopupMenuTest.java
  • 2cd1ba6: 8324981: Shenandoah: Move commit and soft max heap changed methods into heap
  • 1733d2e: 8303374: Implement JEP 455: Primitive Types in Patterns, instanceof, and switch (Preview)
  • 6697160: 8320712: Rewrite BadFactoryTest in pure Java
  • ... and 53 more: https://git.openjdk.org/jdk/compare/10066cd4ef93db9d2bff3f7884d24a5c6e714775...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 31, 2024
@jaikiran
Copy link
Member Author

Thank you Chris, David and Gerard for the inputs and reviews. I'll integrate this in a day or two once I'm able to have a full CI run done.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor nits to fix before integrating but otherwise looks good. Thanks again for taking this on.

#include "sys.h"
#include "util.h"
#include "error_messages.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: to maintain include sort order this should have gone where log_messages.h was removed.

Copy link
Member Author

@jaikiran jaikiran Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello David, I had actually first put it in that line you noted, but that then lead to a compilation error:

In file included from /jdk/src/jdk.jdwp.agent/unix/native/libjdwp/exec_md.c:33:
/jdk/src/jdk.jdwp.agent/share/native/libjdwp/error_messages.h:35:20: error: unknown type name 'FILE'
void print_message(FILE *fp, const char *prefix,  const char *suffix,
                   ^
/jdk/src/jdk.jdwp.agent/share/native/libjdwp/error_messages.h:41:29: error: a parameter list without types is only allowed in a function definition
const char * jvmtiErrorText(jvmtiError);
                            ^
/jdk/src/jdk.jdwp.agent/share/native/libjdwp/error_messages.h:43:28: error: a parameter list without types is only allowed in a function definition
const char * jdwpErrorText(jdwpError);
                           ^
3 errors generated.

I think the reason for those compilation errors is that error_messages.h and a few other header files are not explicitly listing their necessary includes and are relying on transitive includes. So I decided to take the easy route out, by adding the error_messages.h after the util.h to rely on the transitive includes to get past that compilation error.

I think to properly fix that compile error, this following change (which I tested locally and works) would be needed:

diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/JDWP.h b/src/jdk.jdwp.agent/share/native/libjdwp/JDWP.h
index 92a9f457e8a..2683c6791a1 100644
--- a/src/jdk.jdwp.agent/share/native/libjdwp/JDWP.h
+++ b/src/jdk.jdwp.agent/share/native/libjdwp/JDWP.h
@@ -26,6 +26,8 @@
 #ifndef JDWP_JDWP_H
 #define JDWP_JDWP_H
 
+#include "jni_md.h"
+#include "jvmti.h"
 #include "JDWPCommands.h"
 
 /*
diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/error_messages.h b/src/jdk.jdwp.agent/share/native/libjdwp/error_messages.h
index 4126b76f226..810ab384f1a 100644
--- a/src/jdk.jdwp.agent/share/native/libjdwp/error_messages.h
+++ b/src/jdk.jdwp.agent/share/native/libjdwp/error_messages.h
@@ -26,6 +26,9 @@
 #ifndef JDWP_ERROR_MESSAGES_H
 #define JDWP_ERROR_MESSAGES_H
 
+#include <stdio.h>
+#include "JDWP.h"
+
 /* It is assumed that ALL strings are UTF-8 safe on entry */
 #define TTY_MESSAGE(args) ( tty_message args )
 #define ERROR_MESSAGE(args) ( \
diff --git a/src/jdk.jdwp.agent/unix/native/libjdwp/exec_md.c b/src/jdk.jdwp.agent/unix/native/libjdwp/exec_md.c
index b43d4de80fc..cf0d00db192 100644
--- a/src/jdk.jdwp.agent/unix/native/libjdwp/exec_md.c
+++ b/src/jdk.jdwp.agent/unix/native/libjdwp/exec_md.c
@@ -30,9 +30,9 @@
 #include <unistd.h>
 #include <string.h>
 #include <ctype.h>
+#include "error_messages.h"
 #include "sys.h"
 #include "util.h"
-#include "error_messages.h"
 
 static char *skipWhitespace(char *p) {
     while ((*p != '\0') && isspace(*p)) {

Do you think this should be fixed? I can file a JBS issue for it and address it as a separate PR.

(Edit: a more accurate patch/fix for missing includes)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please file a JBS issue, but no need for you to address it. Thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created https://bugs.openjdk.org/browse/JDK-8325094 to track this.

Comment on lines 99 to 100
ERROR_MESSAGE(("failed to open dir %s while determining"
" file descriptors to close for process %d", FD_DIR, getpid()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: indent of second line is now off - please align "

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. PR has been update to fix this.

Comment on lines 137 to 139
ERROR_MESSAGE(("failed to close file descriptors of"
" child process optimally, falling back to closing"
" %d file descriptors sequentially", (max_fd - i + 1)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please realign lines on "

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jaikiran
Copy link
Member Author

tier1, tier2 and tier3 testing of the latest state of this PR has passed both with and without the changes that were done in https://bugs.openjdk.org/browse/JDK-8300088.

@jaikiran
Copy link
Member Author

jaikiran commented Feb 1, 2024

/integrate

@openjdk
Copy link

openjdk bot commented Feb 1, 2024

Going to push as commit a663248.
Since your change was applied there have been 64 commits pushed to the master branch:

  • a2229b1: 8324838: test_nmt_locationprinting.cpp broken in the gcc windows build
  • 432756b: 8325024: java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java incorrect comment information
  • fe78c0f: 8325022: Incorrect error message on client authentication
  • 5b9b176: 8324174: assert(m->is_entered(current)) failed: invariant
  • 0cc8e5b: 8325042: remove unused JVMDITools test files
  • f292053: 8323621: JDK build should exclude snippet class in java.lang.foreign
  • 62c9530: 8324238: [macOS] java/awt/Frame/ShapeNotSetSometimes/ShapeNotSetSometimes.java fails with the shape has not been applied msg
  • 1f2922a: 8320342: Use PassFailJFrame for TruncatedPopupMenuTest.java
  • 2cd1ba6: 8324981: Shenandoah: Move commit and soft max heap changed methods into heap
  • 1733d2e: 8303374: Implement JEP 455: Primitive Types in Patterns, instanceof, and switch (Preview)
  • ... and 54 more: https://git.openjdk.org/jdk/compare/10066cd4ef93db9d2bff3f7884d24a5c6e714775...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 1, 2024
@openjdk openjdk bot closed this Feb 1, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 1, 2024
@openjdk
Copy link

openjdk bot commented Feb 1, 2024

@jaikiran Pushed as commit a663248.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@jaikiran jaikiran deleted the 8324668 branch February 1, 2024 05:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

6 participants