8379967: (process) Improve ProcessBuilder error reporting #30232

Open
tstuefe wants to merge 9 commits into openjdk:master from tstuefe:ProcessBuilder-better-error-infos

@tstuefe tstuefe commented Mar 13, 2026

While working on a patch for https://bugs.openjdk.org/browse/JDK-8377907, we saw that error analysis in this area can take an unreasonable amount of time, especially for cases that are not directly reproducible on a developer machine.

This is because if an error occurs in the child process between the fork and exec stages (FORK mode) or between the first and second exec stages (POSIX_SPAWN mode), the ways in which we can communicate the errors are quite limited: standard IO may not yet work, and there is no logging. All we have is a single 32-bit value that we send back to the parent.

Today, this 32-bit value contains errno or a handful of our own error codes (which even overlap with errno values).

Following an idea by @RogerRiggs, this patch enhances the returned code by encoding more information:

  1. An 8-bit step number: this shows us the exact step at which the child program encountered an error.
  2. An optional 8-bit errno: like today, but only set for errors that are directly associated with an API call.
  3. An optional 16-bit "hint": free-form information whose meaning depends on the step.

The system is clean and easily expandable with more detailed error information, should we need it. We can also bump the error code to 64-bit to get more encoding space, but I left it at 32-bit for now.
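
To make the three fields concrete, here is a minimal sketch of how a step, a hint, and an errno could be packed into one 32-bit value and unpacked again. The bit layout and the names (`ErrCodeSketch`, `encode`, `step`, `hint`, `errno`) are assumptions for illustration only; the actual layout used by the patch lives in the native sources and may differ.

```java
// Illustrative only: the field order within the 32-bit word is an assumption,
// not taken from the patch.
final class ErrCodeSketch {

    // Hypothetical layout: [step: 8 bits][hint: 16 bits][errno: 8 bits]
    static int encode(int step, int hint, int errno) {
        return ((step & 0xFF) << 24) | ((hint & 0xFFFF) << 8) | (errno & 0xFF);
    }

    // Extract the 8-bit step number from the top byte.
    static int step(int code) {
        return (code >>> 24) & 0xFF;
    }

    // Extract the 16-bit hint from the middle two bytes.
    static int hint(int code) {
        return (code >>> 8) & 0xFFFF;
    }

    // Extract the 8-bit errno from the lowest byte.
    static int errno(int code) {
        return code & 0xFF;
    }

    public static void main(String[] args) {
        int code = encode(3, 17, 9);
        System.out.println(step(code) + " " + hint(code) + " " + errno(code));
    }
}
```

With 8 bits for errno, only errno values up to 255 fit, which matches the description's remark that the code could later be widened to 64 bits if more space is needed.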


Error handling:

When an error occurs, we now attempt to send the error code back to the parent as before, but if that fails (e.g., because the fail pipe was not established), we also print the error code and exit the child process with an exit code corresponding to the step number.

Where we print the error code, we use the form (<step>-<hint>-<errno>). For example, "(2-0-0)" is a jspawnhelper version mismatch. "(3-17-9)" means that in step 3 (early jspawnhelper setup), we found that file descriptor 17 was invalid (9=EBADF). We only do this for internal errors that are meant to be read by us, not by the end user.
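
As a reading aid for the printed form, the following sketch formats and parses the `(<step>-<hint>-<errno>)` triple described above. The class and method names are hypothetical and not part of the patch; only the textual format comes from the description.

```java
// Illustrative helper for the "(<step>-<hint>-<errno>)" form; names are hypothetical.
final class ErrCodeFormatSketch {

    // Render the triple in the order used by the description: step, hint, errno.
    static String format(int step, int hint, int errno) {
        return "(" + step + "-" + hint + "-" + errno + ")";
    }

    // Parse "(3-17-9)" back into {step, hint, errno}.
    static int[] parse(String s) {
        if (s.length() < 2 || !s.startsWith("(") || !s.endsWith(")")) {
            throw new IllegalArgumentException("expected (step-hint-errno): " + s);
        }
        String[] parts = s.substring(1, s.length() - 1).split("-");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected three fields: " + s);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    public static void main(String[] args) {
        int[] fields = parse("(3-17-9)");
        System.out.println("step=" + fields[0] + " hint=" + fields[1] + " errno=" + fields[2]);
        System.out.println(format(2, 0, 0));
    }
}
```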

As a by-product of this patch, we now get higher error granularity for some user-facing errors. E.g., if the caller handed in an invalid working directory, the IOException message now reads "Invalid working directory" instead of the generic "Exec failed".
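
The user-facing case can be observed with a few lines of Java. This is a minimal sketch, assuming an `ls` command and a directory name that does not exist; the class name `InvalidWorkDirSketch` and the helper `startIn` are made up for illustration. The exact message text depends on the JDK build and launch mechanism, so the sketch only prints it.

```java
import java.io.File;
import java.io.IOException;

// Illustrative only: starts a process in a given directory and reports the
// IOException message if the start fails.
final class InvalidWorkDirSketch {

    // Returns the exception message if start() failed, or null if the process started.
    static String startIn(File dir) {
        ProcessBuilder pb = new ProcessBuilder("ls").directory(dir);
        try {
            Process p = pb.start();
            p.destroy(); // not expected for a missing directory; clean up anyway
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A path below the current directory that should not exist.
        File missing = new File(System.getProperty("user.dir"),
                                "does-not-exist-" + System.nanoTime());
        String msg = startIn(missing);
        System.out.println(msg == null
                ? "process started unexpectedly"
                : "IOException: " + msg);
    }
}
```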


Implementation notes:

  • The "CHILD_ALIVE" ping was changed to be a special errcode with a special step, ESTEP_CHILD_ALIVE; but it works the same.
  • Where functions were changed to return error codes in case of an error, I always followed the same scheme. See comment in childproc.c.
  • I started to adopt bool in functions I changed, since I think it reads cleaner. Apart from that, I kept changes as small as possible.

Testing:

  • I manually ran tests on Linux x64 (glibc and musl), macOS arm64, and AIX.
  • GitHub Actions (GHA) runs were successful.

Note: I intend to push this fix first, and then rebase #29939 to use it. That will make it easier for us to hunt down the remaining issues in #29939.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8379967: (process) Improve ProcessBuilder error reporting (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/30232/head:pull/30232
$ git checkout pull/30232

Update a local copy of the PR:
$ git checkout pull/30232
$ git pull https://git.openjdk.org/jdk.git pull/30232/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 30232

View PR using the GUI difftool:
$ git pr show -t 30232

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/30232.diff

Using Webrev

Link to Webrev Comment

bridgekeeper bot commented Mar 13, 2026

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

tstuefe commented Mar 13, 2026

Ping @RogerRiggs

openjdk bot commented Mar 13, 2026

@tstuefe This change is no longer ready for integration - check the PR body for details.

@openjdk openjdk bot added build build-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe The following labels will be automatically applied to this pull request:

  • build
  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@tstuefe tstuefe force-pushed the ProcessBuilder-better-error-infos branch from 1fd31f8 to 8f950b4 March 13, 2026 05:15
@tstuefe tstuefe force-pushed the ProcessBuilder-better-error-infos branch from 8f950b4 to 5afcf2b March 13, 2026 05:17
@openjdk openjdk bot added security security-dev@openjdk.org serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org client client-libs-dev@openjdk.org compiler compiler-dev@openjdk.org labels Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe client, compiler, hotspot, security, serviceability have been added to this pull request based on files touched in new commit(s).

@tstuefe tstuefe changed the title from "JDK-8379967: (process) Improve Processbuilder error reportingRefs/heads/process builder better error infos" to "JDK-8379967: (process) Improve Processbuilder error reporting" Mar 13, 2026
@openjdk openjdk bot changed the title from "JDK-8379967: (process) Improve Processbuilder error reporting" to "8379967: (process) Improve Processbuilder error reporting" Mar 13, 2026
@tstuefe tstuefe marked this pull request as ready for review March 13, 2026 05:53
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 13, 2026

mlbridge bot commented Mar 13, 2026

Webrevs

tstuefe commented Mar 13, 2026

/label remove hotspot
/label remove serviceability
/label remove security
/label remove compiler
/label remove client
/label remove build

@openjdk openjdk bot removed the hotspot hotspot-dev@openjdk.org label Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe
The hotspot label was successfully removed.

@openjdk openjdk bot removed the serviceability serviceability-dev@openjdk.org label Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe
The serviceability label was successfully removed.

@openjdk openjdk bot removed the security security-dev@openjdk.org label Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe
The security label was successfully removed.

@openjdk openjdk bot removed the compiler compiler-dev@openjdk.org label Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe
The compiler label was successfully removed.

@openjdk openjdk bot removed the client client-libs-dev@openjdk.org label Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe
The client label was successfully removed.

@openjdk openjdk bot removed the build build-dev@openjdk.org label Mar 13, 2026

openjdk bot commented Mar 13, 2026

@tstuefe
The build label was successfully removed.

@openjdk openjdk bot changed the title from "8379967: (process) Improve Processbuilder error reporting" to "8379967: (process) Improve ProcessBuilder error reporting" Mar 13, 2026

@RogerRiggs RogerRiggs left a comment

Looks good. A bit more detailed than I was thinking, but it should be worth it when things take an unexpected turn.

* @requires vm.flagless
* @library /test/lib
* @run main/othervm -Xmx64m -Djdk.lang.Process.launchMechanism=FORK InvalidWorkDir
* @summary Check that passing an invalid work dir yields a corresponding IOE text.

The tag order is usually @test, @bug, @summary, @library, @requires, @run.

public class InvalidWorkDir {

    public static void main(String[] args) {
        ProcessBuilder bld = new ProcessBuilder("ls").directory(new File("/doesnotexist"));

Using the / root directory seems unusual; the current directory (for tests) is safe, and it is cleared before running.

tstuefe commented Mar 13, 2026

Looks good. A bit more detailed than I was thinking, but it should be worth it when things take an unexpected turn.

I am just sick of outguessing the jspawnhelper logic :-)

@RogerRiggs RogerRiggs left a comment

Add bug IDs and you're good to go.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 13, 2026
tstuefe and others added 4 commits March 14, 2026 07:00
Co-authored-by: Roger Riggs <Roger.Riggs@Oracle.com>
Co-authored-by: Roger Riggs <Roger.Riggs@Oracle.com>
Co-authored-by: Roger Riggs <Roger.Riggs@Oracle.com>
Co-authored-by: Roger Riggs <Roger.Riggs@Oracle.com>

tstuefe commented Mar 14, 2026

Thank you, Roger!

Let's see if Skara lets me integrate without re-approval.

/integrate

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Mar 14, 2026

openjdk bot commented Mar 14, 2026

@tstuefe Your integration request cannot be fulfilled at this time, as the status check jcheck-openjdk/jdk-30232 has not been performed on commit 8c2b41e yet

tstuefe commented Mar 14, 2026

@RogerRiggs Need a re-review :-(

tstuefe commented Mar 14, 2026

/integrate

openjdk bot commented Mar 14, 2026

@tstuefe This pull request has not yet been marked as ready for integration.

Labels

core-libs core-libs-dev@openjdk.org rfr Pull request is ready for review
