-
Notifications
You must be signed in to change notification settings - Fork 706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix halted thread being allowed to continue execution v2 #15227
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Jenkins test sanity xLinux jdk11 |
amicic
force-pushed
the
fix_halt_inspection
branch
from
June 4, 2022 15:52
58abb3a
to
9805ab1
Compare
This PR is a variant of eclipse-openj9#15201 with handling of saveObjects NULL case Primary issue: It was possible for a thread which is halted for inspection to resume execution after a GC, resulting in the stack walk crashing in the inspecting thread. This was introduced in eclipse-openj9#12257 . Secondary issue: If instrumentable object allocate is hooked, it was possible to HCR at JIT object allocation points. New restriction: The JVMTI extension event for instrumentable object allocate may now only be acquired at startup, not during late attach. As far as I know, there are no active users of this event. Note that NOT_AT_SAFE_POINT has two distinct if related meanings: - Normal exclusive VM access has priority over safe point Setting NOT_AT_SAFE_POINT around all GCing operations accomplishes this. - JIT optimization requires that there be no possibility of HCR at object allocation points Extending the range of NOT_AT_SAFE_POINT to cover the entire allocation path and disabling safe point if the instumentable object allocate event is hooked ensures this. This fix has several parts: - Pause after GC if halt has been requested After any possibly-GCing path, check to see if the thread should halt before resuming mutation. This fixes the reported problem of inspected threads continuing to run. This also requires that NOT_AT_SAFE_POINT be set across the entirety of the allocation path to prevent safe point from being acquired by the new halting code. - Fix object allocation event reporting If the instrumentable object allocation extension event was hooked, there was a timing hole where HCR could occur at an object allocation from the JIT, which is specificially what safe point HCR is meant to avoid. This is fixed by marking the thread NOT_AT_SAFE_POINT for the duration of the allocate. If the instrumentable object allocate event is hooked, disable safe point HCR as there is no way to safely report the event. - Mark all possibly GCing paths NOT_AT_SAFE_POINT This ensures that the GC has priority over safe point HCR. haltThreadForInspection also now marks the inspecting thread NOT_SAFE for the duration of the halt (see timing below). Signed-off-by: Aleksandar Micic <amicic@ca.ibm.com>
amicic
force-pushed
the
fix_halt_inspection
branch
from
June 4, 2022 15:53
9805ab1
to
9a795f1
Compare
tajila
approved these changes
Jun 4, 2022
Jenkins test sanity,extended plinux jdk8 |
Thanks, was just signing in to do exactly this. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR is a variant of
#15201 with
handling of saveObjects NULL case
Primary issue: It was possible for a thread which is halted for
inspection to resume execution after a GC, resulting in the stack walk
crashing in the inspecting thread. This was introduced in
#12257 .
Secondary issue: If instrumentable object allocate is hooked, it was
possible to HCR at JIT object allocation points.
New restriction: The JVMTI extension event for instrumentable object
allocate may now only be acquired at startup, not during late attach. As
far as I know, there are no active users of this event.
Note that NOT_AT_SAFE_POINT has two distinct if related meanings:
Setting NOT_AT_SAFE_POINT around all GCing operations accomplishes this.
object allocation points
Extending the range of NOT_AT_SAFE_POINT to cover the entire allocation
path and disabling safe point if the instumentable object allocate event
is hooked ensures this.
This fix has several parts:
After any possibly-GCing path, check to see if the thread should halt
before resuming mutation. This fixes the reported problem of inspected
threads continuing to run. This also requires that NOT_AT_SAFE_POINT be
set across the entirety of the allocation path to prevent safe point
from being acquired by the new halting code.
If the instrumentable object allocation extension event was hooked,
there was a timing hole where HCR could occur at an object allocation
from the JIT, which is specificially what safe point HCR is meant to
avoid.
This is fixed by marking the thread NOT_AT_SAFE_POINT for the duration
of the allocate. If the instrumentable object allocate event is hooked,
disable safe point HCR as there is no way to safely report the event.
This ensures that the GC has priority over safe point HCR.
haltThreadForInspection also now marks the inspecting thread NOT_SAFE
for the duration of the halt (see timing below).
Signed-off-by: Aleksandar Micic amicic@ca.ibm.com