8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() #376

Closed

@pchilano pchilano commented Sep 28, 2020

Hi all,

Please review the following patch. The current ThreadCrashProtection() implementation uses static members, which requires Thread::muxAcquire() so that only one user at a time can enter. We can avoid this synchronization requirement if each thread has its own ThreadCrashProtection *data.
I tested that it builds on Linux, macOS and Windows. Since the JfrThreadSampler is the only user, I ran all the tests in test/jdk/jdk/jfr/. I also ran some tests with JFR enabled while forcing a crash in OSThreadSampler::protected_task(), and the tests passed with several "Thread method sampler crashed" UL messages in the output. I also ran tiers 1-3.

Thanks,
Patricio
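
The per-thread idea described above can be illustrated with a minimal sketch. It is not the patch itself: `ThreadStub`, `PerThreadProtection` and `is_protected` are hypothetical names invented for illustration, and the real HotSpot `Thread` and `ThreadCrashProtection` classes carry far more state. The point is only that when the protection pointer lives in the thread object, two threads never share a slot, so no lock is needed.

```cpp
#include <cassert>

struct PerThreadProtection;  // hypothetical stand-in for ThreadCrashProtection

// Hypothetical stand-in for HotSpot's Thread: each thread owns its own slot.
struct ThreadStub {
  PerThreadProtection* crash_protection = nullptr;
};

// RAII guard: installs itself in the owning thread's slot and clears the
// slot again on scope exit. Nothing here is static, so nothing is shared.
struct PerThreadProtection {
  explicit PerThreadProtection(ThreadStub* owner) : _owner(owner) {
    assert(_owner->crash_protection == nullptr);  // no nesting in this sketch
    _owner->crash_protection = this;
  }
  ~PerThreadProtection() { _owner->crash_protection = nullptr; }

  ThreadStub* _owner;
};

// True while a guard is active for the given thread.
bool is_protected(const ThreadStub& t) {
  return t.crash_protection != nullptr;
}
```

The cost of this layout is one extra pointer in every thread object, whether or not that thread ever uses crash protection.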


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8253694: Remove Thread::muxAcquire() from ThreadCrashProtection()

Download

$ git fetch https://git.openjdk.java.net/jdk pull/376/head:pull/376
$ git checkout pull/376

@openjdk openjdk bot commented Sep 28, 2020

@pchilano The following label will be automatically applied to this pull request: hotspot-runtime.

When this pull request is ready to be reviewed, an RFR email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label (add|remove) "label" command.

@pchilano pchilano commented Sep 28, 2020

/label add hotspot-jfr

@openjdk openjdk bot added the hotspot-jfr label Sep 28, 2020

@openjdk openjdk bot commented Sep 28, 2020

@pchilano
The hotspot-jfr label was successfully added.

@pchilano pchilano marked this pull request as ready for review Sep 28, 2020
@openjdk openjdk bot added the rfr label Sep 28, 2020

}
}
#endif

@coleenp coleenp Sep 28, 2020

Could this be pushed down into osThread ?

@pchilano pchilano Sep 28, 2020

Yes, it can, but I don't think it's that clean given the osthread indirections. Here's how it looks: pchilano@73a41ad

Another alternative would be to remove the "#ifndef _WINDOWS" clause and define an empty check_crash_protection() method in os_windows.hpp.
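
A rough sketch of that second alternative, under stated assumptions: `PosixOsSketch`, `WindowsOsSketch` and `shared_signal_path` are hypothetical names, and the `int sig` parameter is illustrative rather than the real HotSpot signature. The sketch only shows the shape of the idea, where the platform without crash protection supplies a no-op so the shared caller needs no preprocessor guard.

```cpp
#include <cassert>

// Hypothetical POSIX-side implementation: counts calls so the effect is
// observable in this sketch. The real method does actual checking work.
struct PosixOsSketch {
  static int checks_performed;
  static void check_crash_protection(int /*sig*/) { ++checks_performed; }
};
int PosixOsSketch::checks_performed = 0;

// Hypothetical Windows-side implementation: an empty method, so callers
// compile unchanged and the call is a no-op.
struct WindowsOsSketch {
  static void check_crash_protection(int /*sig*/) {}
};

// Shared code path with no "#ifndef _WINDOWS" around the call. A template
// parameter stands in for the build-time choice of platform header.
template <class Os>
void shared_signal_path(int sig) {
  Os::check_crash_protection(sig);
}
```

In a real build the platform would be selected by which os_*.hpp header is included, not by a template argument; the template is used here only so both variants can be exercised in one program.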

@coleenp coleenp Oct 5, 2020

Ok, yes, I saw the ifdef WINDOWS and thought it should be more os platform specific, but if you had an empty method in os_windows.hpp that would be better imo. It's not as nice in osthread. Thank you for considering it.

@dholmes-ora dholmes-ora left a comment

Hi Patricio,

This seems fine, but I'm wondering what the motivation for this change was? Adding more per-thread state is arguably just adding to the clutter of per-thread state. I don't know if this approach was considered when @robehn fixed JDK-8183925.

Thanks,
David

@robehn robehn commented Sep 30, 2020

I don't think so.

I have not seen crash protection catching anything since pre-JDK 9. (we did a lot of fixes to the stack-walking code)
I would remove it completely instead :) Not sure what JFR team says...

@mgronlun mgronlun commented Sep 30, 2020

Like David, I would also like to know more about the motivation. Is the feature expected to be used by a larger number of threads? If so, there might be scalability concerns that were not considered initially.

I agree that we have seen less, and for a long time almost no, asserts related to thread sampling in our testing with fastdebug builds (only product builds run with the protection). At the same time, I am not sure how representative that is considering all the code that is out there. We should also keep in mind that we have upcoming features that will have slightly different stack layouts which will affect how stackwalking is achieved, so I would recommend keeping the established safety mechanism.

@pchilano pchilano commented Sep 30, 2020

I looked at the users of Thread::muxAcquire/muxRelease and this was one of the two places where it is used. If we are going to have a crash protection mechanism for general use then the fields should not be static. Now, if we know only the JfrThreadSampler uses it and we want to optimize away that pointer in the thread object then that makes sense, but then we should remove _crash_mux.

@pchilano pchilano commented Oct 5, 2020

Please see the update. Since there was concern about adding a new pointer in the Thread class I kept the fields as static and just removed _crash_mux. I also fixed the comment about the class being used by the JfrSampler instead of the Watcher thread.
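
A minimal sketch of the updated approach, assuming hypothetical names (`SamplerStub`, `StaticProtection`, the `is_sampler` flag) in place of HotSpot's real `Thread` and `ThreadCrashProtection`: the field stays static, and an assertion that only the sampler thread may install protection takes the place of the lock.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's Thread; is_sampler models
// Thread::is_JfrSampler_thread().
struct SamplerStub {
  bool is_sampler;
};

class StaticProtection {
 public:
  explicit StaticProtection(SamplerStub* current) {
    // With a single permitted client there is nothing to serialize,
    // so these assertions replace the mutex.
    assert(current->is_sampler && "should be the sampler thread");
    assert(_protected_thread == nullptr && "protection is not reentrant");
    _protected_thread = current;
  }
  ~StaticProtection() { _protected_thread = nullptr; }

  // True only for the thread that currently holds the protection.
  static bool is_crash_protected(const SamplerStub* t) {
    return t != nullptr && _protected_thread == t;
  }

 private:
  static SamplerStub* _protected_thread;  // shared, but has only one writer
};
SamplerStub* StaticProtection::_protected_thread = nullptr;
```

This keeps the thread object free of an extra pointer, at the price of restricting the mechanism to a single designated thread.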

@dcubed-ojdk dcubed-ojdk left a comment

Looks good. I only have minor suggested changes, but it's your call
on whether to make those changes.

assert(Thread::current()->is_JfrSampler_thread(), "should be JFRSampler");
_protected_thread = Thread::current();

@dcubed-ojdk dcubed-ojdk Oct 5, 2020

Perhaps this:
_protected_thread = Thread::current();
assert(_protected_thread->is_JfrSampler_thread(), "should be JFRSampler");
would be a little more clean.

@openjdk openjdk bot commented Oct 5, 2020

@pchilano This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for more details.

After integration, the commit message for the final commit will be:

8253694: Remove Thread::muxAcquire() from ThreadCrashProtection()

Reviewed-by: dholmes, dcubed, coleenp

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 116 new commits pushed to the master branch:

  • 4fe68f5: 8253426: jpackage is unable to generate working EXE for add-launcher configurations
  • c9d0407: 8253794: TestAbortVMOnSafepointTimeout never timeouts
  • f2f77f7: 8253761: Wrong URI syntax printed by jar --describe-module
  • b29e108: 8253944: Certain method references to VarHandle methods should fail
  • 88d75c9: 8156071: List.of: reduce array copying during creation
  • ea27a54: 8224509: Incorrect alignment in CDS related allocation code on 32-bit platforms
  • 4d29116: 8253433: Remove -XX:+Debugging product option
  • 81dae70: 8253948: Memory leak in ImageFileReader
  • 65cab55: 8253971: ZGC: Flush mark stacks after processing concurrent roots
  • 19219a9: 8253960: Memory leak in Java_java_lang_ClassLoader_defineClass0()
  • ... and 106 more: https://git.openjdk.java.net/jdk/compare/f014854ac71a82b307667ba017f01b13eed54330...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Oct 5, 2020

@pchilano pchilano commented Oct 5, 2020

> Looks good. I only have minor suggested changes, but it's your call
> on whether to make those changes.

Thanks Dan! Updated.

@dcubed-ojdk dcubed-ojdk left a comment

Thumbs up.

coleenp approved these changes Oct 5, 2020

@coleenp coleenp left a comment

LGTM!

@dholmes-ora dholmes-ora left a comment

So let me see if I've got this straight. Prior to JDK-8183925 CrashProtection was exclusively for the WatcherThread. JDK-8183925 generalized that to allow any(?) thread to use it. Now as the only client is the JfrSampler thread we are making crash protection exclusively only available to it.

@pchilano pchilano commented Oct 6, 2020

> So let me see if I've got this straight. Prior to JDK-8183925 CrashProtection was exclusively for the WatcherThread. JDK-8183925 generalized that to allow any(?) thread to use it. Now as the only client is the JfrSampler thread we are making crash protection exclusively only available to it.

Exactly.

@pchilano pchilano commented Oct 6, 2020

Thanks @dcubed-ojdk, @coleenp and @dholmes-ora for the reviews.

@pchilano pchilano commented Oct 6, 2020

/integrate

@openjdk openjdk bot closed this Oct 6, 2020
@openjdk openjdk bot added the integrated label Oct 6, 2020
@openjdk openjdk bot removed ready rfr labels Oct 6, 2020

@openjdk openjdk bot commented Oct 6, 2020

@pchilano Since your change was applied there have been 124 commits pushed to the master branch:

  • d2b1dc6: 8254054: Pre-submit testing using GitHub Actions should not use the deprecated set-env command
  • a34f48b: 8253832: CharsetDecoder : decode() mentioning CoderMalfunctionError behavior not as per spec
  • f397b60: 8251123: doclint warnings about missing javadoc tags and comments
  • c9d1dcc: 8253902: G1: Starting a new marking cycle before the conc mark thread fully completed causes assertion failure
  • 9199783: 8253565: PPC64: Fix duplicate if condition in vm_version_ppc.cpp
  • 1728547: 8254010: GrowableArrayView::print fails to compile
  • 6e61861: 8254046: Remove double semicolon introduced by JDK-8235521
  • 5d84e95: 8204256: improve jlink error message to report unsupported class file format
  • 4fe68f5: 8253426: jpackage is unable to generate working EXE for add-launcher configurations
  • c9d0407: 8253794: TestAbortVMOnSafepointTimeout never timeouts
  • ... and 114 more: https://git.openjdk.java.net/jdk/compare/f014854ac71a82b307667ba017f01b13eed54330...master

Your commit was automatically rebased without conflicts.

Pushed as commit 57493c1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@pchilano pchilano deleted the 8253694-threadcrashprotection branch Nov 16, 2020
6 participants