-
Notifications
You must be signed in to change notification settings - Fork 5.8k
4200096: OffScreenImageSource.removeConsumer NullPointerException #13408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Merge openjdk/jdk into mickleness/jdk
Merge openjdk/jdk
Merge from openjdk/jdk
Updating mickleness/jdk from openjdk/jdk
Removing a consumer (which can happen implicitly if an ImageObserve returns false) could cause a NPE in OffScreenImageSource. This NPE was caught, but it was printed to System.err. This came to my attention when a particular set of steps flooded our log with NPEs that were actually harmless/meaningless. The new attached test case includes two related tests: one mimics the NPE I observed, and the other is more aligned with the original wording in ticket JDK-4200096.
Code cleanup: fixing error condition, remove redundant 'this'.
Rewriting this resolution. I think the previous approach would be too big a change for such a low priority bug.
👋 Welcome back mickleness! A progress list of the required criteria for merging this PR into |
@mickleness The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
So the repeated checks for null make it look like you are dealing with some other thread setting it to null If that were so, I don't think this approach is correct because the repeated null checks aren't a guarantee. But I see that add/removeConsumer are synchronized methods So it seems impossible for some other thread to do this. I tracked down that the catch of null and printStackTrace were added under this bug There it is implied that the callbacks such as The person who suggested the fix (not the person who implemented the fix) It also points out that you're unlikely to need the repeated checks since it can only be nulled out If you think what you are doing is correct (I think the architect of the code supposed otherwise Thinking about this a bit more, I think I understand why you are doing what you are doing.
I can see both sides of it. However I suggest that if the null check finds it is null on entry that there be an ImageConsumer localConsumer = theConsumer; |
public void println(Object x) { | ||
super.println(x); | ||
if (x instanceof Throwable) | ||
System.exit(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests should not use the System.exit() as it might affect the execution of other tests. It is better to set some flag and check it at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks; this is fixed.
In code review mrserb pointed out we shouldn't use System.exit(1) as an error condition. openjdk#13408 (comment)
In code review prrace pointed out: > I'd capture "theConsumer" into a local var in each of the methods sendPixel() and produce() and check and use the local var openjdk#13408 (comment) This is actually my previous/original draft for JDK-4200096 ( 49a49ee ), but I shied way from this approach because I was afraid code reviewers would point out its changes are excessively invasive for the original complaint. (For ex: now OSIS supports multithreading, which nobody asked for. Should I re-add the `synchronized` method modifiers? (although they are theoretically no longer necessary, keeping them reduces the risk of unintended consequences...) Or if multithreaded support is worth keeping: should I at least add a unit test for it? These questions seemed like they're straying off-topic from the original bug.)
Re-adding the 'synchronized' keyword. We can probably do without it now, but I'm just trying to minimize possible unintended side-effects of this PR. Nobody asked for improved multithreaded support. And this class is used to iterate over BufferedImages (which are already in memory), so (as long as the ImageConsumer isn't taking too long -- or blocking) this should be very fast.
Adding createAbstractImage() with a little documentation to clarify why we're creating an image the way we are.
Changing `runImageConsumerTest` to make sure we're explicitly testing a `OffScreenImageSource`. By contrast: the `runImageDimensionTest` is more of an integration-style test that *happens* to test the `OffScreenImageSource`.
Make sure OffScreenImageSource#addConsumer doesn't throw a NPE if the argument is null. The rationale here is: The preexisting implementation wouldn't throw a NPE, so we shouldn't change that now. (Or more specifically: the preexisting implementation *would* throw a NPE, but it would also catch it and print it to System.err. The caller wouldn't need to anticipate a NPE.)
Thanks for the thorough review. To recap: Yes, the original problem (as I understand it) has to do with listeners that detach mid-production. I apologize if I failed to explain this up-front. I tried to describe the problem I set out to "solve" in the comments preceding the first unit test:
In the master openjdk branch this test fails as follows:
So in this test: I'm not explicitly adding or removing an ImageConsumer or ImageObserver. My implementation of I just pushed a few revisions that, among other things, pass in the ImageConsumer as an argument as you suggested. (But they also still constantly check |
mrserb recommended against this in a separate PR openjdk#13408 (comment)
mrserb recommended against this in a separate PR openjdk#13408 (comment)
mrserb recommended against this in a separate PR openjdk#13408 (comment)
I think I am starting to get the overall picture here. The fix I cited (4905411) was essentially a fix for 4200096, since it prevented NPEs breaking the app So
|
Also I'm breaking out the unit tests (now there are 4 of them) into their own directory. The new LegitimateNullPointerTest confirms that we DO want to see NullPointerExceptions printed to System.err IF they come from the ImageConsumer itself. In this ticket most of our focus has been on the NPE's that stem from removing an ImageConsumer from OSIS mid-production (so the NPE was when OSIS tried to interact with its `theConsumer` field). This test tries to add other possible NPE's to our consideration. This new test passed in the master branch before this branch. I want to preserve this existing behavior if this proposal is accepted.
I don't like the idea of removing I'll add a new (4th) unit test for this condition: I DO expect a NPE from the ImageConsumer to be printed to System.err. I think what I'm hearing (please correct me if I'm wrong) is: constantly confirming the ImageConsumer is still attached to the OffScreenImageSource hurts readability. So I'll propose a new approach for your consideration shortly. |
Yes, that's a good point. Just check if consumer is null and if it isn't, then the NPE can't be from de-referemcing Also note my suggestion from yesterday that the NPE could still be reported if a system property is set. If you prefer to have more discussion to agreement/clarification before coding up something just to find we revise it again , or if you prefer to code it up and ask "what do you think about this ? ", so there's clarity on what you mean, either is fine. |
I don't like that we're *expecting* a NullPointerException to get thrown in the course of an allowed use case, but that's probably just a preferences on my part. More importantly: this branch passes all new unit tests. Plus it's readable and easy to code review. This is related to the latest round of PR feedback here: openjdk#13408 (comment)
I just pushed an update that uses the null-check for Regarding the System property / debugger message: If we established that removing the listener before production is complete is an OK thing to do, then do you still want this debugger output? (Or to put that another way: do you think that will really get used?) I don't have a lot of experience with debugger output from sun classes, so if you think it's helpful I'll take your word for it. |
@@ -175,7 +175,7 @@ else if (cm instanceof DirectColorModel) { | |||
scanline[x] = image.getRGB(x, y); | |||
} | |||
theConsumer.setPixels(0, y, width, 1, newcm, scanline, 0, | |||
width); | |||
width); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we revert the white space changes ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean before pushing.
|
@mickleness This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 510 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@prrace, @mrserb) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
prrace pointed out this branch introduced some pointless whitespace changes. openjdk#13408 (comment)
/integrate |
@mickleness |
e.printStackTrace(); | ||
// If theConsumer is null and we throw a NPE when interacting with it: | ||
// That's OK. That is an expected use case that can happen when an | ||
// ImageConsumer detaches itself from this ImageProducer mid-production. | ||
|
||
if (theConsumer != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need here and a few lines above save the theConsumer to the local, then check to null, then call imageComplete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is:
This ticket focuses on the use case where an ImageConsumer detaches itself from this OffScreenImageSource mid-production by calling removeConsumer(ImageConsumer)
. So in this situation: I'd argue no, the ImageConsumer should not get an imageComplete(int)
notification. Because it opted to remove itself.
Does this answer your question, or are you considering a different use case?
(Note: addConsumer
is synchronized, so it's usually reasonable to assume the field OffScreeImageSource#theConsumer
acts mostly like a local variable. I could contrive a new failing test case if one ImageConsumer called addConsumer(newDifferentConsumer)
while receiving a notification, but that's a separate discussion. It's beyond the scope of this ticket, and (to my knowledge) it's a contrived edge case that nobody has ever actually complained about in the real world. And it would fail with or without the changes in this PR...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this answer your question, or are you considering a different use case?
Yes, if it is safe to assume that it is not possible to change the theConsumer
by a different thread.
/sponsor |
Going to push as commit 63cd0a3.
Your commit was automatically rebased without conflicts. |
@prrace @mickleness Pushed as commit 63cd0a3. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
This resolves a 25 year old P4 ticket: a NullPointerException is printed to System.err needlessly.
This resolution involves confirming that an ImageConsumer is still registered before every notification.
I'll understand if this is rejected as unimportant, but I stumbled across this in the real world the other day and thought this was a simple enough bug to practice on.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13408/head:pull/13408
$ git checkout pull/13408
Update a local copy of the PR:
$ git checkout pull/13408
$ git pull https://git.openjdk.org/jdk.git pull/13408/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 13408
View PR using the GUI difftool:
$ git pr show -t 13408
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13408.diff
Webrev
Link to Webrev Comment