New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8256811: Delayed/missed jdwp class unloading events #635
Conversation
|
This backport pull request has now been updated with issue from the original commit. |
Webrevs
|
This fix results a couple of followup CRs: JDK-8290908 and JDK-8291456. JDK-8291456 has yet been resolved, probably this backport should wait till JDK-8291456 is resolved. |
@RealCLanger The settings for my repo look like they already enable actions. Under Actions/General/Actions Permissions the radio button is selected for Allow all actions and reusable workflows. Do I need to set something else? What do you mean by 'trigger a run via the workflow-dispatch event'? |
@zhengyu123 Thanks for the heads-up regarding JDK-8291456. I was not aware there was another dependent issue. |
@RealCLanger Ah, ok, I think I see now what you mean. I merged master and pushed and I can see that workflows are now running for the updated branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backport mostly looks clean. The changes in JvmtiExport::post_object_free
are appropriate, given the additions in JDK 19+ for virtual threads that are not present in 17.
I'm not 100% on the copyright additions which are not part of the original patch, as they may interfere with other backports. However, I guess they are needed at some point.
Most curious to me is this addition:
+- JvmtiThreadState *state = thread->as_Java_thread()->jvmti_thread_state();
++ JvmtiThreadState *state = ((JavaThread*)thread)->jvmti_thread_state();
Is there a reason for altering this line? It doesn't seem related to any of the other changes, and, while the direct cast may be more performant than the method call, the existing usage with the method call seems safer.
No, there is no reason for it other that that it was in the patch Zhengyu originally started working on. Of course, the cast is safe because of the guard in the preceding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the rest looks good.
Note that JDK-8291456 seems to mainly be about the test and has spawned two related test fixes, JDK-8292880 and JDK-8291650, which may be worth a look.
@adinn This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 130 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
|
Next step is to label the bug with |
Thanks for the review @gnu-andrew
Yes, I am aware of both of those follow-up issues. Although they both try to fix the test and, indeed, both appear to have lowered the chance of a failure, neither of them was 100% successful. So, I have been waiting for a proper test fix upstream (which is pending but seems not to be imminent). I was also hoping that I could maybe help out with an upstream fix but have not had the time to do so. We could continue on that course or we could go ahead and push both these patches as is and ignore any failure in the test. Do you have any opinion on how to proceed? |
My impression from the bugs was that the code change itself is an improvement, and fewer events are missed, but it is difficult to get a reliable test which triggers class unloading and picks up all the events. I'm tending towards going ahead with backporting the four existing patches, but I expect you have a better idea of the risk than I do. If we do this now, it'll be targeted for the January release so we still have some time to add further follow-up fixes or even revert in 17u. I don't get the impression that the main patch is flawed enough to warrant being reverted in trunk, but if we want to be extra safe, we could wait until some time after OpenJDK 20 is released next March. |
The upstream work on this issue is still in progress as fixes up to now have improved but neither eradicated the problem nor even diagnosed root cause - in fact the fixes actually stumbled into a secondary issue on Windows. So, I have been hesitating before pushing something that might not remove or workaround the problem. It looks like Chris Plummer has identified the source of the Windows issue and provided a workaround for it (see https://bugs.openjdk.org/browse/JDK-8292879 and associated PR openjdk/jdk#10519). So, I think it is probably worth proceeding with all the fixes up to and including this one. |
I think we are now ready to queue and push this backport to jdk17u (JDK-8256811) as long as it is followed up by backports for these follow-up JIRAs. I propose the following sequence:
I already have a dependent PR queued for JDK-8290908 and am preparing further dependent PRs for the next 4 JIRAs. A few comments: JDK-8291456 is required to ensure that events are not missed but it is not enough in itself to fix the test Two of the three following fixes to the test code (JDK-8291650, JDK-JDK-8292879) also do not guarantee to remove the problem since all they do is add delays to skew races in favour of the desired outcome. However, absent any proper fix to these timing problems this is the best option we have to remedy the failure. Strictly, JDK-8292880 is not needed. However, it should not cause any harm and it will be very useful for diagnosis if we run into problems later on which demand a proper fix. |
Correction: JDK-8291456 needs to be added after JDK-8291650 as it removes the delay added by JDK-8291650. |
Aargh! These were all pushed out of order.
|
@adinn, It seems they all miss the Fix request comment in the JBS issue. |
Fix Request The failure to correctly report class unloading events both in the OpenJDK tests and in actual deployments is causing spurious failures for the eclipse team's testing regime. This initial patch and the follow up patches mentioned above were adopted as remedies in upstream releases. Although they are still only an ad hoc fix, do in practice avoid the OpenJDK test failures and should produce the same results for other users. Some of the upstream patches were superseded by later ones in order to arrive at a satisfactory solution. However, rather than create a new patch to arrive at the same end result it seems best to backport them all in sequence. They all apply cleanly apart from the first one. One patch merely enables debug output in a relevant test. That has been retained, 1) to maintain patch compatibility with upstream, 2) to ensure follow-on patches apply cleanly, 3) because it will be useful if the ad hoc nature of the current fixes means further problems are encountered and 4) because its effect is minimal, being local only to the test and only resulting in a small amount more info in the test output file. The overall risk is low because these patches only modify the class unloading code. The changes have been tested by running the relevant class unloading unit tests. This does not guarantee the problem is fixed (because it is intermittent). However, it does indicate that the patches are not flawed. |
@GoeLin Apologies for forgetting to include Fix request comments. See above for a comment that covers the full set of patches. |
Thanks, the reasoning is fine, but it usually goes into the JBS issue. |
Thanks both. I've approved 8256811. I see @GoeLin already approved the others, so this should all be good to go. As they are clean, it should be possible to integrate the others once testing on the PR completes successfully. |
@GoeLin Apologies for my all too obviously rusty knowledge of the backport process and many thanks for fixing up my mistakes. Clearly, I have been spending too much time working on the GraalVM project. @gnu-andrew Thanks for checking all the PRs. The test failures on this PR don't seem to be anything to do with the changes in the patch. The Windows one appears to be a (transient?) download failure during the gate run. The Mac failures are all timeouts (in TestUnsignedByteCompare1, TestTrichotomyExpressions and TestLinearScanOrderMain, none of which will can anything to do with this change). Is another run needed? If so is there a simple way to force that to happen? |
I agree with your analysis and I don't think another run is necessary. I was thinking more of the follow-on PRs, as, being clean, they don't need to wait on a review. I now see you've already filed these, and, given that the tests on those pass (https://github.com/adinn/jdk17u-dev/actions/) with this change included, that seems to be enough. For reference, you should be able to re-run the job from the button on the right-hand side of https://github.com/adinn/jdk17u-dev/actions/runs/3006655779 |
@gnu-andrew @GoeLin Ok, so does that mean I now proceed to integrate this PR and the dependent ones? |
Yes, feel free! |
/integrate |
Going to push as commit b2061e0.
Your commit was automatically rebased without conflicts. |
Backport of patch to fix class unloading event notification.
This backport was originally developed by @zhengyu123 and involved some minor changes to the original patch.
n.b. the original patch required a follow up fix (cf JDK-8290908) which also needs backporting after this has been included.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk17u-dev pull/635/head:pull/635
$ git checkout pull/635
Update a local copy of the PR:
$ git checkout pull/635
$ git pull https://git.openjdk.org/jdk17u-dev pull/635/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 635
View PR using the GUI difftool:
$ git pr show -t 635
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk17u-dev/pull/635.diff