Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8271471: [IR Framework] Rare occurrence of "<!-- safepoint while printing -->" in PrintIdeal/PrintOptoAssembly can let tests fail #4932

Closed
wants to merge 6 commits into from

Conversation

chhagedorn
Copy link
Member

@chhagedorn chhagedorn commented Jul 29, 2021

A test VM used by the IR framework sometimes prints <!-- safepoint while printing --> in the middle of emitting a PrintIdeal or PrintOptoAssembly output which could lead to IR matching failures:

void ttyLocker::break_tty_lock_for_safepoint(intx holder) {
if (defaultStream::instance != NULL &&
defaultStream::instance->writer() == holder) {
if (xtty != NULL) {
xtty->print_cr("<!-- safepoint while printing -->");
}
defaultStream::instance->release(holder);
}
// (else there was no lock to break)
}

I thought about just bailing out of IR matching if this string is found after a failure but this issue also affects internal framework tests (I observed one case locally where this happened in the test TestIRMatching, letting it fail).

Handling <!-- safepoint while printing --> makes things more complicated for the IR framework tests. I'm not sure about the value of printing this message in the first place. But if nobody objects, I suggest to either remove it or at least guard it with Verbose, for example. I went with the latter solution in this PR.

Thanks,
Christian


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8271471: [IR Framework] Rare occurrence of "" in PrintIdeal/PrintOptoAssembly can let tests fail

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4932/head:pull/4932
$ git checkout pull/4932

Update a local copy of the PR:
$ git checkout pull/4932
$ git pull https://git.openjdk.java.net/jdk pull/4932/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4932

View PR using the GUI difftool:
$ git pr show -t 4932

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4932.diff

…ting -->" in PrintIdeal/PrintOptoAssembly can let tests fai
@bridgekeeper
Copy link

bridgekeeper bot commented Jul 29, 2021

👋 Welcome back chagedorn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title 8271471: [IR Framework] Rare occurrence of "<!-- safepoint while printing -->" in PrintIdeal/PrintOptoAssembly can let tests fai 8271471: [IR Framework] Rare occurrence of "<!-- safepoint while printing -->" in PrintIdeal/PrintOptoAssembly can let tests fail Jul 29, 2021
@openjdk
Copy link

openjdk bot commented Jul 29, 2021

@chhagedorn The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Jul 29, 2021
@chhagedorn
Copy link
Member Author

/label add hotspot-compiler

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jul 29, 2021
@openjdk
Copy link

openjdk bot commented Jul 29, 2021

@chhagedorn
The hotspot-compiler label was successfully added.

@chhagedorn chhagedorn marked this pull request as ready for review July 29, 2021 12:28
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 29, 2021
@mlbridge
Copy link

mlbridge bot commented Jul 29, 2021

Webrevs

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

@openjdk
Copy link

openjdk bot commented Jul 29, 2021

@chhagedorn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8271471: [IR Framework] Rare occurrence of "<!-- safepoint while printing -->" in PrintIdeal/PrintOptoAssembly can let tests fail

Reviewed-by: kvn, thartmann

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 65 new commits pushed to the master branch:

  • d06d0b9: 8272112: Arena code simplifications
  • 0c4be76: 8058704: Nimbus does not honor JTextPane background color
  • 020aec5: 8271366: [REDO] JDK-8266054 VectorAPI rotate operation optimization
  • 4d4ba5c: 8272116: Update PerfDisableSharedMem with FLAG_SET_ERGO in PerfMemory::create_memory_region
  • 09ab86b: 8269909: getStack method in hprof.parser.Reader should use try-with-resource
  • e4766ee: 8272391: Undeleted debug information
  • 428d516: 8140241: (fc) Data transfer from FileChannel to itself causes hang in case of overlap
  • 93cab7d: 8272315: Improve assert_different_registers
  • 9980b41: 8272107: Removal of Unsafe::defineAnonymousClass left a dangling C++ class
  • 464e874: 8048190: NoClassDefFoundError omits original ExceptionInInitializerError
  • ... and 55 more: https://git.openjdk.java.net/jdk/compare/0aca4f72ce48bc75f2c466d1cba4ad70d3c7b875...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 29, 2021
@dholmes-ora
Copy link
Member

I think the expectation is that a safepoint whilst printing and holding the tty lock should be a very rare thing, possibly indicative of an error, and so something we want to know about. If this is hidden by Verbose then it will effectively always be hidden and we won't spot this.

Do you understand why this is getting printed in this context, and that it doesn't indicate a problem?

Thanks,
David

@chhagedorn
Copy link
Member Author

I see your concern. Let me try to reproduce it and get some more information when it happens again. This might take some time. I'll get back to you.

@chhagedorn
Copy link
Member Author

I played around with it and interestingly this code path is hit quite often when just running java -Xbatch -XX:+PrintIdeal -XX:+PrintOptoAssembly --version.

I had a closer look at one of these occurrences. In this case, the reason we are executing break_tty_lock_for_safepoint() with a tty lock (i.e. at a safepoint while printing) was the following:
At some point, SharedRuntime::handle_wrong_method_ic_miss is called in thread X due to having cached the wrong method. This code then creates a new inline cache stub. In the meantime, the VM thread Y is waiting for a next VM operation in the loop. When the VM thread is calling VMThread::setup_periodic_safepoint_if_needed() it notices that a cleanup is needed (SafepointSynchronize::is_cleanup_needed() is true) due to the new IC stub. The cleanup operation is chosen as next VM operation. To execute this operation, it requires a safepoint. It sets the polling word of all threads, including the C2 compiler thread and SafepointMechanism::local_poll_armed() is now true. Unfortunately, the C2 compiler thread is currently printing with a tty lock acquired either at:

if (C->print_assembly()) {
ttyLocker ttyl; // keep the following output all in one block

or at
if (print_ideal()) {
ttyLocker ttyl; // keep the following output all in one block

Therefore, we break the lock inside ThreadInVMfromNative() at the next print call in the C2 compiler thread because SafepointMechanism::local_poll_armed() is true.

To me this seems okay and not a problem but just bad luck due to timing.

@dholmes-ora
Copy link
Member

Thanks @chhagedorn . I can see that this message is somewhat expected in this context. Is it really that difficult to ignore in the IR matching? This is XML-ouput and it is formatted as a comment, so shouldn't the matching logic ignore XML comments?

@chhagedorn
Copy link
Member Author

I can think of a few options:

  1. Keep matching code but if we catch an IRViolationException we additionally check if this tty message was found on the matched output. If that is the case, we bail out without an exception:
  • +: Easy and works well for normal user written IR tests. We only need to do this tty if there is actually an IR failure being reported. I'd say that it is unlikely that an IR test is written and is not executed a single time successfully without a bailout before integrating it.
  • -: Does not work well with framework internal tests which sometimes expect an exception (to check a few things on it) and sometimes not. Each test somehow needs to check if there was a bailout to reliably work. I guess it is okay to just ignore it for the "no exception expected" cases. We would catch possible errors in the framework the next time. For the "exception expected" tests, it's a little more difficult. The most simple thing to do is probably to just repeat a single test or a group of tests in an internal test file.
  1. Change the regex matching code to ignore the tty message:
  • +: No changes to internal tests.
  • -: Given that it occurs very rarely, it has quite an overhead.
  1. Strip a tty message when reading the PrintIdeal/PrintOptoAssembly entries from the log file when setting up the output to be parsed:
  • +: No changes to internal tests.
  • -: Given that it occurs very rarely, it has some overhead but probably less than the second option. We are reading each line anyways (I'm not actually parsing XML).

In terms of performance, it is probably best to bail out and do a one time effort to make the internal framework tests aware of this bailout scenario (option 1). While thinking about it for a second time, it might not be that complicated as I thought before. My original idea was just to first propose a removal of this tty message before going into fixing this rare case. But of course, given that there are concerns, I'm considering one of these options (my favorite is option 1).

@TobiHartmann
Copy link
Member

I would also go with option 1). For the framework internal tests that expect and exception, couldn't you simply add something like a hitSafepointWhilePrinting() method to IRViolationException that you can use to check if verification of the exception should be skipped?

@chhagedorn
Copy link
Member Author

chhagedorn commented Aug 10, 2021

I would also go with option 1). For the framework internal tests that expect and exception, couldn't you simply add something like a hitSafepointWhilePrinting() method to IRViolationException that you can use to check if verification of the exception should be skipped?

Thanks Tobias, I therefore went with option 1 and reverted the HotSpot change. The problem is when no IRViolationException is thrown due to the bailout in the IRMatcher. I directly made a check before throwing it (which is probably the same as your suggestion with hitSafepointWhilePrinting()). I made the internal framework tests aware the bailout by replacing "shouldNotReach" checks by shouldHaveCaughtException methods which check if we had this tty message somewhere. If not then the test fails. Normal user written IR tests just pass when an IRViolationException would have been thrown but the matched input contained the tty message.

import compiler.lib.ir_framework.driver.TestVMProcess;

public class Utils {
public static void shouldHaveCaughtException() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these methods be renamed to shouldHaveThrownException? Because that method could also be used in a context where we don't catch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that's a better name!

if (!hitSafePointWhilePrinting) {
throw new ShouldHaveCaughtException();
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we simply do:

    for (Scenario s : scenarios) {
        shouldHaveCaughtException(s.getTestVMOutput());
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed that code, see comment below.

@chhagedorn
Copy link
Member Author

I did a renaming and changed the way TestScenarios throws exceptions (was wrong how I had it before). TestScenarios should actually do some more finegrained checks and depending on which scenarios bailed out do different things. I added some comments for them.

Copy link
Member

@TobiHartmann TobiHartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'm just wondering if we need all that notAllBailedOut/noneBailedOut complexity for handling scenarios. Given that safepoints during printing are rare, couldn't we simply skip verification of the exception message if one scenario hit a bailout?

return !Arrays.stream(scenarios).allMatch(s -> s.getTestVMOutput().contains(IRMatcher.SAFEPOINT_WHILE_PRINTING_MESSAGE));
}

private static void shouldHaveThrownException(String testVMOutput) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this method is only used by shouldHaveThrownException(). I would remove it.

// Not all scenarios had a bailout which means that at least one exception should have been thrown.
Asserts.fail("Should have thrown an exception");
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newline can be removed

@chhagedorn
Copy link
Member Author

Looks good to me. I'm just wondering if we need all that notAllBailedOut/noneBailedOut complexity for handling scenarios.

I think we need the notAllBailedOut() function to verify that there was not a scenario without a bailout (i.e. missed to throw an exception). The code of notAllBailedOut() could be simplified to use anyMatch instead.

Given that safepoints during printing are rare, couldn't we simply skip verification of the exception message if one scenario hit a bailout?

Yes, that's probably easier. We can also first check if the expected message was even mismatched in TestScenarios before actually doing the tty message check. I changed that by turning noneBailedOut into anyBailedOut which uses anyMatch.

But yes, I agree that it's some complexity for a rather rare case. I pushed an updated with these changes.

Copy link
Member

@TobiHartmann TobiHartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for making these changes!

@chhagedorn
Copy link
Member Author

Thank you Tobias for your careful review!

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good

@chhagedorn
Copy link
Member Author

Thanks Vladimir for your review!

@chhagedorn
Copy link
Member Author

/integrate

@openjdk
Copy link

openjdk bot commented Aug 16, 2021

Going to push as commit 3677734.
Since your change was applied there have been 80 commits pushed to the master branch:

  • 0a03481: 8272231: G1: Refactor G1CardSet::get_card_set to return G1CardSetHashTableValue*
  • 83d0e12: 8267833: Improve G1CardSetInlinePtr::add()
  • 69cc588: 8272235: G1: update outdated code root fixup
  • 5db36ce: 8272158: SoftReference related bugs under memory pressure
  • 7a5b37b: 8272439: G1: add documentation to G1CardSetInlinePtr
  • 0209d9f: 8272461: G1: remove empty declaration of cleanup_after_scan_heap_roots
  • 36e2dda: 8272348: Update CDS tests in anticipation of JDK-8270489
  • 3f38a50: 8271203: C2: assert(iff->Opcode() == Op_If || iff->Opcode() == Op_CountedLoopEnd || iff->Opcode() == Op_RangeCheck) failed: Check this code when new subtype is added
  • 6a5241c: 8272491: Problem list javax/swing/JFrame/NSTexturedJFrame/NSTexturedJFrame.java on macos
  • 17b9350: 8266079: Lanai: AlphaComposite shows differences on Metal compared to OpenGL
  • ... and 70 more: https://git.openjdk.java.net/jdk/compare/0aca4f72ce48bc75f2c466d1cba4ad70d3c7b875...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Aug 16, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 16, 2021
@openjdk
Copy link

openjdk bot commented Aug 16, 2021

@chhagedorn Pushed as commit 3677734.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@chhagedorn chhagedorn deleted the JDK-8271471 branch October 1, 2021 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
4 participants