Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8263729: [test] divert spurious output away from stream under test in ProcessBuilder Basic test #3049

Closed
wants to merge 4 commits into from

Conversation

RogerRiggs
Copy link
Contributor

@RogerRiggs RogerRiggs commented Mar 17, 2021

Intermittent failures on Windows in a test of destroying the child warrant extending the time the parent waits after starting the child before destroying the child.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8263729: [test] divert spurious output away from stream under test in ProcessBuilder Basic test

Reviewers

Download

To checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3049/head:pull/3049
$ git checkout pull/3049

To update a local copy of the PR:
$ git checkout pull/3049
$ git pull https://git.openjdk.java.net/jdk pull/3049/head

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Mar 17, 2021

👋 Welcome back rriggs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Mar 17, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Mar 17, 2021

@RogerRiggs The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs label Mar 17, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 17, 2021

Webrevs

@@ -2179,7 +2179,7 @@ public void run() {

thread.start();
latch.await();
Thread.sleep(10);
Thread.sleep(100L); // Wait for child initialization to settle

Copy link
Contributor

@plevart plevart Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Roger,
Shouldn't the code that follows this sleep already be enough to wait for child thread to get a lock on the stream?

                Thread.sleep(100L);     // Wait for child initialization to settle

                if (s instanceof BufferedInputStream) {
                    // Wait until after the s.read occurs in "thread" by
                    // checking when the input stream monitor is acquired
                    // (BufferedInputStream.read is synchronized)
                    while (!isLocked(s, 10)) {
                        Thread.sleep(100);
                    }
                }

From 1st glance I would conclude that the 1st sleep is unnecessary. Is there a platform where Input/Error stream is not a BufferedInputStream? Or is it calling Process.destory() immediately after the child thread enters BufferedInputStream.read synchronized method perhaps too early to trigger appropriate exception in the child thread? In that case the additional sleep should be added after the while(!isLocked... loop.

Copy link
Contributor Author

@RogerRiggs RogerRiggs Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That loop is checking that the Thread (in the parent) reading from the child is in the correct state (blocked). On Windows, it is a BufferedInputStream.

But it does not indicate anything about the state of the child process.
From the scant information from previous failures, it appears that the the creation of some thread (not a java thread) in the child has failed.
The additional time is to avoid destroying the child while it is still initializing.

Copy link
Member

@iklam iklam Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failures happened in tiers 6 and 8. The system may be overloaded so even 100ms may not be enough for the child process to start sleeping. From the error log, the child process tried to spawn a thread (probably one of those usually started during VM bootstrap) at around 118ms

[0.118s][warning][os,thread] Failed to start thread - _beginthreadex failed (EACCES) for attributes: stacksize: default, flags: 

The test runs 4 times. Each time it checks only STDOUT or STDERR, but not both. So I think we can use the other stream to signal to the main process that the child process is ready. That would be more reliable than an arbitrary wait time.

Copy link
Contributor Author

@RogerRiggs RogerRiggs Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That complicates the test and the child quite a bit for minimal gain.

Copy link
Member

@iklam iklam Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arbitrary time out has been a reliable source of intermittent failures.

Since we have spent a lot of time analyzing this failure, I think it's worthwhile to fix it properly, which doesn't seem that complicated. That's better than the same bug happening again a year later and a different set of people would spend hours to analyze it again.

Copy link
Contributor Author

@RogerRiggs RogerRiggs Mar 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test expects there to be zero output from the child (and it doesn't matter what state the child is in).
Can the logging from the VM be disabled or re-directed?

Copy link
Member

@tstuefe tstuefe Mar 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test expects there to be zero output from the child (and it doesn't matter what state the child is in).
Can the logging from the VM be disabled or re-directed?

Not to the extend that it would be guaranteed never to happen. Even if we control all output in the hotspot, there are other libraries. E.g. glibc writes a lengthy report to stderr in case of a heap corruption, which I believe does not result in a hs-err file.

One simple solution, simpler than using two threads, could be to use ProcessBuilder::redirectError(Redirect.INHERIT) if reading stdout resp. ProcessBuilder::redirectOutput(Redirect.INHERIT) if reading stderr. One line, takes care of the stream you don't read does not block, and, we can see the child output in the parent stdout/err.

Copy link
Contributor

@plevart plevart Mar 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That complicates the test and the child quite a bit for minimal gain.

Suppose that the child process writes 1 byte to STDOUT and 1 byte to STDERR. In the monitoring thread in the parent process, you could read 1 byte from either STDOUT or STDERR, then count-down the latch and then block in next read. The main thread could then just wait for the latch, wait for the monitoring thread to re-enter the synchronized read method and then destroy the process... It doesn't sound too complicated. WDYT?

Copy link
Contributor Author

@RogerRiggs RogerRiggs Mar 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redirecting the stream not being tested to INHERIT may be useful.
But the failing case is when the vm output is being directed to the stream being tested.
The command line arguments are:

    Arrays.asList(javaExe,
        "-XX:+DisplayVMOutputToStderr",
        "-classpath", absolutifyPath(classpath),
        "Basic$JavaChild");```
A combination of redirecting to inherited and sending the VM output to the same (untested) stream seems like a good option.

Copy link
Contributor Author

@RogerRiggs RogerRiggs Mar 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plevart, I prototyped using that technique but was concerned that it might change the test case to be different in some subtle way from the original. It adds a new subcommand to the JavaChild to add the specific behavior.
Not particularly complex but more code and dependencies to explain.

Copy link
Member

@tstuefe tstuefe left a comment

Hi Roger, thanks for taking my suggestion. Minor point below.

Cheers, Thomas

break;
case 1:
childArgs.set(1, "-XX:+DisplayVMOutputToStdout");
pb.redirectInput(INHERIT);
Copy link
Member

@tstuefe tstuefe Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not redirectOutput() ?

Copy link
Contributor Author

@RogerRiggs RogerRiggs Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. will fix.
(The Process streams are from the point of view of parent).

Copy link
Member

@tstuefe tstuefe Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have to look this up every time. Its not intuitive.

iklam
iklam approved these changes Mar 19, 2021
Copy link
Member

@iklam iklam left a comment

LGTM. Just a small nit.

List<String> childArgs = new ArrayList<String>(javaChildArgs);
List<String> childArgs = new ArrayList<>(javaChildArgs);
final ProcessBuilder pb = new ProcessBuilder(childArgs);
{
Copy link
Member

@iklam iklam Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little confused until I looked up the JavaDoc of ProcessBuilder.

Maybe add a comment like "ProcessBuilder does NOT make a copy of childArgs so it's OK to modify it below"?

Copy link
Member

@tstuefe tstuefe left a comment

Looks good to me, if the intent was to fix potential write blockages in the child process (in which case maybe reformulate the issue title).

I'm sorry I derailed your original wait idea, as well as Ioi's ideas of syncing with the child. Both may still be needed after this fix.

Cheers, Thomas

@RogerRiggs RogerRiggs changed the title 8263729: [test] Extend time to wait before destroying child in ProcssBuilder Basic test [test] divert spurious output away from stream under test in ProcessBuilder Basic test Mar 21, 2021
@openjdk openjdk bot removed the rfr label Mar 21, 2021
@RogerRiggs RogerRiggs changed the title [test] divert spurious output away from stream under test in ProcessBuilder Basic test 8263729: [test] divert spurious output away from stream under test in ProcessBuilder Basic test Mar 22, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Mar 22, 2021

@RogerRiggs This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8263729: [test] divert spurious output away from stream under test in ProcessBuilder Basic test

Reviewed-by: stuefe, iklam

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 68 new commits pushed to the master branch:

  • 6c2220e: 8263861: Shenandoah: Remove unused member in ShenandoahGCStateResetter
  • 5262d95: 8263855: Use the blessed modifier order in java.management/naming
  • 6f1bcb0: 8263593: Fix multiple typos in hsdis README
  • a9d2267: 8260589: Crash in JfrTraceIdLoadBarrier::load(_jclass*)
  • 42104e5: 8263488: Verify CWarningWindow works with metal rendering pipeline
  • 5a7f22a: 8263579: ZGC: Concurrent mark hangs with debug loglevel
  • 35cd945: 8263908: Build fails due to initialize_static_field_for_dump defined but not used after JDK-8263771
  • cd45538: 8263771: Refactor javaClasses initialization code to isolate dumping code
  • 118a49f: 8263846: Bad JNI lookup getFocusOwner in accessibility code on Mac OS X
  • cb742f9: 8255255: Update Apache Santuario (XML Signature) to version 2.2.1
  • ... and 58 more: https://git.openjdk.java.net/jdk/compare/000012a3b08d5f21d4cb2369d2122adfae1a3e9b...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added ready rfr labels Mar 22, 2021
@RogerRiggs
Copy link
Contributor Author

@RogerRiggs RogerRiggs commented Mar 22, 2021

Thanks for the reviews and the suggestion to redirect the output to the inherited streams.
/integrate

@openjdk openjdk bot closed this Mar 22, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Mar 22, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Mar 22, 2021

@RogerRiggs Since your change was applied there have been 68 commits pushed to the master branch:

  • 6c2220e: 8263861: Shenandoah: Remove unused member in ShenandoahGCStateResetter
  • 5262d95: 8263855: Use the blessed modifier order in java.management/naming
  • 6f1bcb0: 8263593: Fix multiple typos in hsdis README
  • a9d2267: 8260589: Crash in JfrTraceIdLoadBarrier::load(_jclass*)
  • 42104e5: 8263488: Verify CWarningWindow works with metal rendering pipeline
  • 5a7f22a: 8263579: ZGC: Concurrent mark hangs with debug loglevel
  • 35cd945: 8263908: Build fails due to initialize_static_field_for_dump defined but not used after JDK-8263771
  • cd45538: 8263771: Refactor javaClasses initialization code to isolate dumping code
  • 118a49f: 8263846: Bad JNI lookup getFocusOwner in accessibility code on Mac OS X
  • cb742f9: 8255255: Update Apache Santuario (XML Signature) to version 2.2.1
  • ... and 58 more: https://git.openjdk.java.net/jdk/compare/000012a3b08d5f21d4cb2369d2122adfae1a3e9b...master

Your commit was automatically rebased without conflicts.

Pushed as commit 0abbfb2.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@RogerRiggs RogerRiggs deleted the 8263729-basic-wait branch Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs integrated
4 participants