Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8088198: Exception thrown from snapshot if dimensions are larger than max texture size #68

Closed
wants to merge 7 commits into from

Conversation

@fthevenet
Copy link
Contributor

fthevenet commented Dec 17, 2019

This PR aims to address the following issue: JDK-8088198 Exception thrown from snapshot if dimensions are larger than max texture size

In order to do that, it simply captures snapshots in multiple tiles of maxTextureSize^2 dimensions (or less, as needed), and then recomposes all the tiles into a a single image.
Other than that, the logic used to do the actual snapshot is unchanged.

Tests using the existing SnapshotCommon class have been added in a new file named Snapshot3Test under SystemTest/test/javafx/scene.
These tests pass with the proposed fix, and fail without, throwing " java.lang.IllegalArgumentException: Unrecognized image loader: null"

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

JDK-8088198: Exception thrown from snapshot if dimensions are larger than max texture size

Approvers

  • Ambarish Rapte (arapte - Reviewer)
  • Kevin Rushforth (kcr - Reviewer)
… max texture size
@bridgekeeper bridgekeeper bot added the oca label Dec 17, 2019
@bridgekeeper
Copy link

bridgekeeper bot commented Dec 17, 2019

Hi fthevenet, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user fthevenet" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

@fthevenet
Copy link
Contributor Author

fthevenet commented Dec 17, 2019

/signed

@bridgekeeper bridgekeeper bot added the oca-verify label Dec 17, 2019
@bridgekeeper
Copy link

bridgekeeper bot commented Dec 17, 2019

Thank you! Please allow for up to two weeks to process your OCA, although it is usually done within one to two business days. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

@fthevenet fthevenet changed the title 8088198: Exception thrown from snapshot if dimensions are larger than max texture size 8088198: Exception thrown from snapshot if dimensions are larger than max texture size Dec 19, 2019
@openjdk openjdk bot added the rfr label Dec 19, 2019
@mlbridge
Copy link

mlbridge bot commented Dec 19, 2019

Webrevs

@kevinrushforth
Copy link
Member

kevinrushforth commented Dec 20, 2019

This will need two reviewers. I want to review it, and I request @arapte to also review.

I won't have time to do a detailed review until the new year. One quick comment: in addition to the new tests you have provided, there are 4 @Ignored tests in Snapshot2Test.java that can likely be re-enabled. Look for TODO: Re-enable this test when RT-22073 is fixed (RT-22073 was mapped to JDK-8088198).

@fthevenet
Copy link
Contributor Author

fthevenet commented Dec 20, 2019

This will need two reviewers. I want to review it, and I request @arapte to also review.

I won't have time to do a detailed review until the new year. One quick comment: in addition to the new tests you have provided, there are 4 @Ignored tests in Snapshot2Test.java that can likely be re-enabled. Look for TODO: Re-enable this test when RT-22073 is fixed (RT-22073 was mapped to JDK-8088198).

I hadn't noticed these tests before, but at a glance it does indeed look like they make the ones I added redundant.

@fthevenet
Copy link
Contributor Author

fthevenet commented Dec 24, 2019

Upon closer inspection, I believe that the tests I added in Snapshot3Test are indeed redundant and less complete than the 4 ignored test in Snapshot2Test.
I therefore propose to remove Snapshot3Test.java entirely and to re-enable the 4 testSnapshotBig* test instead.

Copy link

arapte left a comment

The fix itself looks good and seems safe for 14.
I do have few minor changes in test file and suggestions in fix code, please take a look.
Also verified that large sized snapshots are created correctly.

@arapte
Copy link

arapte commented Jan 16, 2020

Upon closer inspection, I believe that the tests I added in Snapshot3Test are indeed redundant and less complete than the 4 ignored test in Snapshot2Test.
I therefore propose to remove Snapshot3Test.java entirely and to re-enable the 4 testSnapshotBig* test instead.

I did miss this comment, The test in Snapshot2Test look sufficient and you can get rid of Snapshot3Test.java and so all my comments :)

fthevenet added 2 commits Jan 16, 2020
…est.java
- Fixed a typo in "the screenshot is to(o) big"
- Moved x offset calculation outside of innermost loop.
- renamed verticalTileNb & horizontalTileNb to numVerticalTiles & numHorizontalTiles
Copy link
Contributor

nlisker left a comment

I tested this fix against the repro code in javafxports/openjdk-jfx#433 (which is JDK-8222238), but there is still an NPE. I'm not certain that this fix is supposed to solve that bug, but according to the comments, the root cause is the same as JDK-8189082, which is related to this one. It's worth to take a look to see if something was missed.

int numVerticalTiles = (int) Math.ceil(height / (double) maxTextureSize);
int numHorizontalTiles = (int) Math.ceil(width / (double) maxTextureSize);
for (int i = 0; i < numHorizontalTiles; i++) {
int xOffset = i * maxTextureSize;
int tileWidth = Math.min(maxTextureSize, width - xOffset);
for (int j = 0; j < numVerticalTiles; j++) {
int yOffset = j * maxTextureSize;
int tileHeight = Math.min(maxTextureSize, height - yOffset);
WritableImage tile = doSnapshotTile(scene, xMin + xOffset, yMin + yOffset, tileWidth, tileHeight, root, transform, depthBuffer, fill, camera, null);
wimg.getPixelWriter().setPixels(xOffset, yOffset, tileWidth, tileHeight, tile.getPixelReader(), 0, 0);
}
}
Comment on lines 1305 to 1316

This comment has been minimized.

Copy link
@nlisker

nlisker Jan 16, 2020

Contributor

I would extract this code into its own method similar to doSnapshotTile:

assemble(scene, xMin, yMin, width, height, root, transform, depthBuffer, fill, camera, wimg, maxTextureSize);

(assemble is a bad name, I didn't think about a better one).

The method can return he resulting WritableImage, but it is not needed since it is manipulated via "side-effects". I would, however, bring it line with the else clause - either both use wimg = methodName(..., wimg, ...); or just methodName(..., wimg, ...);. This is fine since the input WritableImage is never null. From a readability point of view, using return values seems better.

This comment has been minimized.

Copy link
@fthevenet

fthevenet Jan 17, 2020

Author Contributor

I'm not 100% convinced this would really add much to the readability of the code; I extracted the code from doSnapshotTile in its own method because it is called twice (on both sides of the if (height > maxTextureSize || width > maxTextureSize) condition, actually), but this isn't the case here.
I've got no strong feeling against it either, so I don't know; anybody else care to comment?

This comment has been minimized.

Copy link
@kevinrushforth

kevinrushforth Jan 17, 2020

Member

I also don't have a strong opinion, so I'm OK with you leaving it as-is.

This comment has been minimized.

Copy link
@nlisker

nlisker Jan 17, 2020

Contributor

Fine to leave as-is.

@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 17, 2020

I tested this fix against the repro code in javafxports/openjdk-jfx#433 (which is JDK-8222238), but there is still an NPE. I'm not certain that this fix is supposed to solve that bug, but according to the comments, the root cause is the same as JDK-8189082, which is related to this one. It's worth to take a look to see if something was missed.

Although JDK-8189082 (and potentially others) are indeed likely to share the same root cause, the fix proposed here won't have an effect on anything else than snapshots, since the tiling is done at the Scene level rather than within QuantumToolkit.
I purposefully choose to take the "easy" way out to limit the potential side-effects of the change, but I agree that on the other hand supporting tiling when needed directly in QuantumToolkit::renderToImage would have the potential to solve a whole category of issues.
Also, from a very pragmatic angle, because of the increased complexity and scope and the risks that come with it, I'm not very optimistic that this change could realistically make it into jfx14 and to be honest I'm quite eager to get the screenshot feature in my app working properly ASAP.

I'd be willing to try and work on a more generic fix under the umbrella of JDK-8189082, that'd be targeted at 15, while still having this fix included as part 14; but that means this should (ideally) be reverted once it becomes useless.
I don't know if the idea of having a "temporary fix" approach seems acceptable in that context.

@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 17, 2020

Wel,l upon closer inspection it appears QuantumToolkit::renderToImage is only really used by Scene::doSnapshot anyway. Rendering of a the content of an NGSubScene (the underlying issue in JDK-8189082) is handled in a completely different code path, and I suspect this is also true for Canvas, where I hear a similar kind of issue exists.
So from my (admittedly limited) understanding, fixing all problems at once would requires handling the tiling transparently within RTTexture (or more exactly inside all of its GraphicsPipeline specifics implementations) which seems quite a bit more complicated and risky.

@kevinrushforth
Copy link
Member

kevinrushforth commented Jan 17, 2020

upon closer inspection it appears QuantumToolkit::renderToImage is only really used by Scene::doSnapshot anyway.

Correct.

So from my (admittedly limited) understanding, fixing all problems at once would requires handling the tiling transparently within RTTexture (or more exactly inside all of its GraphicsPipeline specifics implementations) which seems quite a bit more complicated and risky.

That would be one approach (extending RTT to handle tiling automatically), but I'm not at all sure that it would be the best one. It would as you note, be more complicated and risky.

I have no problem with your current proposed approach. I agree with @arapte that this seems safe for openjfx14. To that end, can you retarget this PR to the jfx14 branch? Since there are no post-jfx14 commits from master in your local branch, this should be as simple as editing the PR and changing the target.

Then we can complete the review, hopefully next week.

@fthevenet fthevenet changed the base branch from master to jfx14 Jan 18, 2020
@arapte
arapte approved these changes Jan 20, 2020
Copy link

arapte left a comment

Looks good to me.
Below is just an observation about time taken by the API,
Platform: Windows10, maxTextureSize: 4096
For a snapshot of (4096 * n, 4096 * n): each call to doSnapshotTile() takes ~100 ms, and each call to setPixels() takes ~30 ms.

Please wait for one more approval before integrating.

@kevinrushforth
Copy link
Member

kevinrushforth commented Jan 27, 2020

I would be very cautious of using multi-threading here. In any case, I think that the issues around absolute performance could be handled separately.

Having said that, given my review comments, along with the concerns over performance regressions for those cases that will now be tiled, but formerly weren't, I no longer think this is a good candidate for openjfx14. This PR should be retargeted to the master branch for openjfx15.

@kevinrushforth
Copy link
Member

kevinrushforth commented Jan 27, 2020

I thought of one possibility that might be worth looking into for a short term fix (i.e., could still go into openjfx14). Rather than relying on PrismSettings.maxTextureSize you could instead try to render the entire image (as is done today) in a try / catch block, and only fall back to tiling if there is an exception. That way there would be no concern over any possible performance regression, since the only time we would use tiling would be in the case where it fials today.

What do you think?

@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 27, 2020

I have tested using a recipient WritableImage with an IntARGB pixel format (so that is aligned with that of the tile), that I construct by supplying a PixelBuffer, and as expected that performance of the tiled code is much more in line with that of the non-tiled version. One overhead left is the allocation of the extra buffer, but it is far less significant.
The problem with this approach is that the "best" pixel format (as in similar to that of RRT) isn't the same depending on the rendering pipeline (e.g. d3d is intARGB while es2 is byteBGRA) so that would require adding a fair amount of code to determine the best possible scenario depending on all the constraints (keeping in mind the caller can also provide its own WritableImage...) that in turn would need thorough testing.
All in all, I agree the risk of regression is probably too great for this to make it into openjfx14 with so little time left.

Instead, I will prototype Kevin's proposal to only use tiling when it would fail with the current code and use what I've learned here to propose a more robust fix targeted at openjfx15

@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 28, 2020

I've made the change suggested by Kevin, and it works as expected.
It is not very elegant but it does provide relief with regard to the core issue while avoiding risks of performance regressions.

Now with regard to a follow-up PR targeted at the next release, I assume a new issue needs first be filed into the JBS first; should I just file a new bug via https://bugreport.java.com/bugreport (I don't have access to https://bugs.openjdk.java.net)?

@kevinrushforth
Copy link
Member

kevinrushforth commented Jan 28, 2020

The change looks like what I would expect, so I'll finish my review in the next day or so. @arapte will need to re-review it as well.

Yes, we will need a new JBS bug ID for the follow-on.

Copy link

arapte left a comment

I need to re-review the updated PR.

@kevinrushforth
Copy link
Member

kevinrushforth commented Jan 29, 2020

The code changes look good to me. As long as you are willing to address the follow-on issues for openjfx15, I'll approve this PR for openjfx14, once I run my last test.

I did a fair bit of testing of the functionality, which reinforces the decision to only try tiling as a fallback when an exception occurs, since there could otherwise be functional regressions in addition to performance regressions.

With the latest update to the PR, any case that works today will continue to behave exactly as it does today. The worst that will happen is that the tiling might produce very slightly different results for interpolated colors, or possibly fail with a different exception (e.g., if a 4K x 4K texture is still too big for some hypothetical device, or if the user passes in a WritableImage created with a PixelBuffer (see below)).

I added a verbose println to snapshot to validate my testing:

  1. Snapshot with image size = 5k x 5k (should still fit on Windows D3D with no tiling)
  2. Snapshot with image size = 20k x 3k (should tile in X)
  3. Snapshot with image size = 3K x 20k (should tile in Y)
  4. Snapshot with image size = 20K x 20k (should tile in both X and Y)

RESULT: All of the above work as expected

  1. Modify the jfx runtime to force tiling, set the tile size to 128 bytes, and run various tests (e.g., a stress test of tiling)
    RESULT: Everything looks visually correct, but four of the tests in test.robot.test3d.Snapshot3DTest fail with a color mismatch [can be addressed in a follow-up bug]

  2. Pass in a WritableImage created with a PixelBuffer object (NIO buffer)
    RESULT: It works without tiling and throws an exception with tiling [can be addressed in a follow-up bug]

  3. Pass in a larger than needed WritableImage (e.g., using a viewport that is large enough to cause tiling, but smaller than the passed in image) into snapshot when using tiling. Is the entire image cleared?
    RESULT: I still need to run this test, but even if it doesn't work correctly when tiling, it wouldn't be a regression, and could be addressed in a follow-up bug.

Issues to file for follow-on bug(s):

  1. Improve performance of tiled snapshot rendering
    • As part of this, we need to get max texture size from the toolkit
    • Consider moving the tiled rendering into QuantumRenderer thread
    • Reuse a single temporary buffer rather than creating a new one for each tile
  2. Passing in a WritableImage created with a PixelBuffer object (NIO buffer) fails if tiling is needed. I'll file this one.
  3. Pixel accuracy issues with tiled image; this can be seen in Snapshot3DTest by instrumenting the code to force tiling with a tile size of 128. It looks OK visually, but fails the color comparison test. I'll file this one.
  4. Presuming that the above test 7 produces incorrect results, file a follow-up bug to fix it.
@nlisker
Copy link
Contributor

nlisker commented Jan 29, 2020

Improve performance of tiled snapshot rendering

I'll have a closer look at improving IntTo4ByteSameConverter::doConvert.

@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 29, 2020

Yes, I'm fine with working further on this and the related issues for openjfx15, and I have fairly good idea on how to approach it.

Copy link
Member

kevinrushforth left a comment

I ran the additional test (test case 7 as described above) and it behaves the same with or without tiling, so no follow-up issue is needed.

Approved.

@arapte
arapte approved these changes Jan 30, 2020
Copy link

arapte left a comment

With my basic testing, the change looks good, scaled up and scaled down snapshots seem correct visually.

After this change the tiled snapshot will be limited by dimensions of the WritableImage.
We can not create a WritableImage of dimension (8192 * 3, 8192 * 3) or greater.
new WritableImage(8192 * 3, 8192 * 3) causes an exception.

java.lang.IllegalArgumentException: capacity < 0: (-1879048192 < 0)
	at java.base/java.nio.Buffer.createCapacityException(Buffer.java:257)

This is an existing behavior of WritableImage.
May be we should consider to wrap and re-throw the exception and update the API JavaDoc.
Anyway not a stopper for this PR.
Approving.

@openjdk openjdk bot added the ready label Jan 30, 2020
@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 31, 2020

Since @kevinrushforth and @arapte have completed their review, is this ready to integrate?
I'm a little confused by the fact this has both rfr and ready labels attached; is this an expected behaviour?

@kevinrushforth
Copy link
Member

kevinrushforth commented Jan 31, 2020

Yes, this is ready for you to integrate. The rfr label is intentionally left there, since there might be review comments or questions that are still being addressed. In this case, there are no outstanding questions, so go ahead and integrate it. Either @arapte or I will sponsor it.

@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 31, 2020

Ok, thank you for the clarification.

@fthevenet
Copy link
Contributor Author

fthevenet commented Jan 31, 2020

/integrate

@openjdk openjdk bot added the sponsor label Jan 31, 2020
@openjdk
Copy link

openjdk bot commented Jan 31, 2020

@fthevenet
Your change (at version fce986e) is now ready to be sponsored by a Committer.

@kevinrushforth
Copy link
Member

kevinrushforth commented Jan 31, 2020

/sponsor

@openjdk openjdk bot closed this Jan 31, 2020
@openjdk openjdk bot added integrated and removed sponsor ready labels Jan 31, 2020
@openjdk
Copy link

openjdk bot commented Jan 31, 2020

@kevinrushforth @fthevenet The following commits have been pushed to jfx14 since your change was applied:

  • 5a0e71b: 8237372: NullPointerException in TabPaneSkin.stopDrag
  • 79fc0d0: 8232824: Removing TabPane with strong referenced content causes memory leak from weak one
  • aa6f3a4: 8236912: NullPointerException when clicking in WebView with Button 4 or Button 5
  • da99e24: 8237823: Mark TextTest.testTabSize as unstable
  • 9ae37f1: 8236753: Animations do not play backwards after being stopped
  • 16cea41: Merge
  • f907026: 8236808: javafx_iio can not be used in static environment
  • e6587ff: 8236448: Remove unused and repair broken Android/Dalvik code
  • c9519b6: 8232589: Remove CoreAudio Utility Classes
  • df8b3c5: 8233798: Ctrl-L character mistakenly removed from gstreamer.md
  • e0b45f8: 8236648: javadoc warning on Text::tabSizeProperty method
  • 587f195: 8234474: [macos 10.15] Crash in file dialog in sandbox mode
  • f1108b0: 8233747: JVM crash in com.sun.webkit.dom.DocumentImpl.createAttribute
  • 63520a0: Merge
  • 3c4d68d: 8236626: Update copyright header for files modified in 2019
  • 580a2a9: 8087980: Add property to disable Monocle cursor
  • 3bbcbfb: 8225571: Port DND source to use GTK instead of GDK
  • 1952606: 8232811: Dialog's preferred size no longer accommodates multi-line strings
  • 4c6ebfb: 8236484: Compile error in monocle dispman
  • 8367e1a: 8130738: Add tabSize property to Text and TextFlow
  • 69e4ef3: 8235627: Blank stages when running JavaFX app in a macOS virtual machine
  • 5e0fb91: 8235364: Update copyright header for files modified in 2019
  • 8a5344c: Merge
  • 935e99d: 8207957: TableSkinUtils should not contain actual code implementation
  • 4e005e4: 8227808: Make GTK3 libraries mandatory for building on Linux
  • 1140d34: 8196587: Remove use of deprecated finalize method from JPEGImageLoader
  • d2d44b4: 8207759: VK_ENTER not consumed by TextField when default Button exists
  • 2e0d01c: Merge
  • a96704e: Merge
  • 71fa9af: Merge
  • a3711e2: Merge
  • 2d5d7e0: Merge
  • 962bdd1: 8232214: Improved internal validations
  • 2d2b824: 8232128: Better formatting for numbers
  • 81f7738: 8232121: Better numbering system
  • 8e55294: Merge
  • 1325f11: Merge
  • f759dd9: Merge
  • fed185a: Merge
  • 592d745: Merge
  • a0b4b14: 8227473: Improve gstreamer media support

Your commit was automatically rebased without conflicts.

Pushed as commit 1823f6e.

@openjdk openjdk bot removed the rfr label Jan 31, 2020
@mlbridge
Copy link

mlbridge bot commented Jan 31, 2020

Mailing list message from Kevin Rushforth on openjfx-dev:

Changeset: 1823f6e
Author: Frederic Thevenet <thevenet.fred at free.fr>
Committer: Kevin Rushforth <kcr at openjdk.org>
Date: 2020-01-31 14:11:47 +0000
URL: https://git.openjdk.java.net/jfx/commit/1823f6ec

8088198: Exception thrown from snapshot if dimensions are larger than max texture size

Reviewed-by: arapte, kcr

! modules/javafx.graphics/src/main/java/javafx/scene/Scene.java
! tests/system/src/test/java/test/javafx/scene/Snapshot2Test.java

@fthevenet fthevenet deleted the fthevenet:8088198 branch Jan 31, 2020
@mlbridge
Copy link

mlbridge bot commented Jan 31, 2020

Mailing list message from Kevin Rushforth on openjfx-dev:

Changeset: 1823f6e
Author: Frederic Thevenet <thevenet.fred at free.fr>
Committer: Kevin Rushforth <kcr at openjdk.org>
Date: 2020-01-31 14:11:47 +0000
URL: https://git.openjdk.java.net/jfx/commit/1823f6ec

8088198: Exception thrown from snapshot if dimensions are larger than max texture size

Reviewed-by: arapte, kcr

! modules/javafx.graphics/src/main/java/javafx/scene/Scene.java
! tests/system/src/test/java/test/javafx/scene/Snapshot2Test.java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.