Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8274939: Incorrect size of the pixel storage is used by the robot on macOS #5864

Closed
wants to merge 3 commits into from

Conversation

mrserb
Copy link
Member

@mrserb mrserb commented Oct 8, 2021

In JDK 9 the native code for the robot class was reworked to get an access to the HiDPI quality screenshots. So we allocate the data storage for the HiDPI quality and then request the best quality from the macOS.

It works fine if the user request the screenshot of some area, since we properly scale this area. Unfortunately it does not work well if the user request only one pixel, in this case we allocate the array of one element and does not multiply the size by the scale, so if the system scale is 2 then the macOS returns the 2x2 pixels, which does not fit properly to the array of one element. This can be checked by the Xcheck:jni option which produce fatal error in this case.

Solution is to allocate the storage of the proper size 1 * scale * 1 * scale


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8274939: Incorrect size of the pixel storage is used by the robot on macOS

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5864/head:pull/5864
$ git checkout pull/5864

Update a local copy of the PR:
$ git checkout pull/5864
$ git pull https://git.openjdk.java.net/jdk pull/5864/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5864

View PR using the GUI difftool:
$ git pr show -t 5864

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5864.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 8, 2021

👋 Welcome back serb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 8, 2021

@mrserb The following label will be automatically applied to this pull request:

  • client

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the client client-libs-dev@openjdk.org label Oct 8, 2021
@mrserb mrserb marked this pull request as ready for review October 8, 2021 10:26
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 8, 2021
@mlbridge
Copy link

mlbridge bot commented Oct 8, 2021

Webrevs

@prrace
Copy link
Contributor

prrace commented Oct 13, 2021

Whilst more in keeping with the code that grabs an area It's not easy to know if this really makes things better in any practical way.
It definitely does not fix a single one of the tests described in https://bugs.openjdk.java.net/browse/JDK-8274106
(after of course un-doing the problem listing and work arounds of setting) as tested on both imacs on which they were previously observed to fail. As noted in that bug report, it seems like there may be an Apple bug.
Meanwhile I will run more tests (not just those previously failing ones) to make sure this change doesn't introduce any new issues.

@mrserb
Copy link
Member Author

mrserb commented Oct 13, 2021

It definitely does not fix a single one of the tests described in https://bugs.openjdk.java.net/browse/JDK-8274106

As I mentioned in that bug, the scaleUI option does not affect the macOS API we use, and if the option solves the tests failures mean that we still have some other bugs somewhere.

@mrserb
Copy link
Member Author

mrserb commented Oct 29, 2021

Does anybody have some comments or suggestions? I would like to fix this before adding more changes to the robot code.

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 26, 2021

@mrserb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@mrserb
Copy link
Member Author

mrserb commented Dec 1, 2021

Any volunteers?

@prsadhuk
Copy link
Contributor

prsadhuk commented Dec 1, 2021

It seems CheckCommonColors still fails intermittently even with this fix in macos aarch64 along with others
open/test/jdk/java/awt/Robot/HiDPIScreenCapture/ScreenCaptureTest.java,
open/test/jdk/java/awt/ColorClass/AlphaColorTest.java,
open/test/jdk/java/awt/AlphaComposite/WindowAlphaCompositeTest.java

CheckCommonColors fails with this log
---------System.out:(3/175)----------
color = java.awt.Color[r=255,g=255,b=255], useRect = true
color = java.awt.Color[r=255,g=255,b=255], useRect = false
color = java.awt.Color[r=192,g=192,b=192], useRect = true
----------System.err:(17/897)----------
Expected: java.awt.Color[r=192,g=192,b=192]
Actual: java.awt.Color[r=199,g=199,b=199]
Point: java.awt.Point[x=960,y=600]
Dump screen to: ScreenCapture.png

@mrserb
Copy link
Member Author

mrserb commented Dec 1, 2021

It seems CheckCommonColors still fails intermittently even with this fix in macos aarch64 along with others open/test/jdk/java/awt/Robot/HiDPIScreenCapture/ScreenCaptureTest.java, open/test/jdk/java/awt/ColorClass/AlphaColorTest.java, open/test/jdk/java/awt/AlphaComposite/WindowAlphaCompositeTest.java

They could fail because of cursor location, or some other bug in our code or in the macOS. It will be good to debug it on that system where it fail.

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 29, 2021

@mrserb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@mrserb
Copy link
Member Author

mrserb commented Jan 3, 2022

Any volunteers?

@mrserb
Copy link
Member Author

mrserb commented Jan 25, 2022

Any volunteers to look at this buffer overrun?

Copy link
Member

@aivanov-jdk aivanov-jdk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think increasing the buffer according to the scale is correct as the scale × scale image where scale != 1 can't fit into the buffer of size 1.

It may not fix other bugs or test failures; buffer overrun is not a good thing and it's fixed.

@openjdk
Copy link

openjdk bot commented Jan 26, 2022

@mrserb This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8274939: Incorrect size of the pixel storage is used by the robot on macOS

Reviewed-by: aivanov, prr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 99 new commits pushed to the master branch:

  • fd8a3dc: 8280820: Clean up bug8033699 and bug8075609.java tests: regtesthelpers aren't used
  • 178b962: 8265765: DomainKeyStore may stop enumerating aliases if a constituting KeyStore is empty
  • c5c8c06: 8279822: CI: Constant pool entries in error state are not supported
  • f823bed: 8280007: Enable Neoverse N1 optimizations for Arm Neoverse V1 & N2
  • 8b384b9: 8281470: tools/jar/CreateMissingParentDirectories.java fails with "Should have failed creating jar file"
  • bb2e10c: 8281274: deal with ActiveProcessorCount in os::Linux::print_container_info
  • 69e390a: 8262721: Add Tests to verify single iteration loops are properly optimized
  • f092bab: 8281195: Mistakenly used logging causes significant overhead in interpreter
  • f924e50: 8281440: AWT: Conversion from string literal loses const qualifier
  • 072e7b4: 8272807: Permit use of memory concurrent with pretouch
  • ... and 89 more: https://git.openjdk.java.net/jdk/compare/0e70d4504c267174738485c7da82a2ac0ef09770...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 26, 2022
@aivanov-jdk
Copy link
Member

I wonder if a similar problem exist on Windows and Linux. The pixelArray is allocated as new int[bounds.width*bounds.height] in both. And for macOS too: CRobot.getRGBPixels allocates the array of size bounds.width*bounds.height, shouldn't it be twice as large if uiScale=2.0 is in effect?

@prrace
Copy link
Contributor

prrace commented Jan 26, 2022

I noted I was intending to run tests with this patch applied .. I think I did that but of course I've lost them now. I'll try again.

@prrace
Copy link
Contributor

prrace commented Jan 27, 2022

Why doesn't this need updating too ? Seems like it must be OK else we'd have massive over-runs ..

@OverRide
public int [] getRGBPixels(final Rectangle bounds) {
int[] c = new int[bounds.width * bounds.height];
getScreenPixels(bounds, c);
return c;
}

@prrace
Copy link
Contributor

prrace commented Jan 27, 2022

Looking at what happens in native it seems like it should be possible to add a check that the
Java array has enough capacity to satisfy what the call to retrieve pixels expects.
I think it would be reasonable to throw InternalError if it does not !

@prrace
Copy link
Contributor

prrace commented Jan 27, 2022

My testing looks good - but waiting on answers to the other questions

@mrserb
Copy link
Member Author

mrserb commented Jan 27, 2022

I wonder if a similar problem exist on Windows and Linux. The pixelArray is allocated as new int[bounds.widthbounds.height] in both. And for macOS too: CRobot.getRGBPixels allocates the array of size bounds.widthbounds.height, shouldn't it be twice as large if uiScale=2.0 is in effect?

On Linux and Windows we pass the size of the area to the system in the device space, so we store an image of the size we just allocated. The macOS is different we pass the coordinates in the users space -> when we pass 1x1 we may get 4x4 on the HiDPI screen, this is what the current fix changed.

Why doesn't this need updating too ? Seems like it must be OK else we'd have massive over-runs ..

We do not need to change other getRGBPixels() method because its coordinates already converted to the pixels(device space), we do this in the shared code. Note that the method which changed by this fix however use the "1" as a width and height in the users space - this is a bug.

Looking at what happens in native it seems like it should be possible to add a check that the
Java array has enough capacity to satisfy what the call to retrieve pixels expects.
I think it would be reasonable to throw InternalError if it does not !

The logic we use is this:
1 We prepared the pixel storage using device space coordinates
2 We capture the 1x1 unit in the users space, usually this is 1x1 pixel on lowdpi and 4x4 on hidpi screen, actually macOS may return any size
3 If the macOS return some other size we down/up scale it to the size we calculated at step 1

@prrace
Copy link
Contributor

prrace commented Jan 27, 2022

I get that getPixelColor() / getRGBPixel() takes only a coordinate but when looking at CRobotPeer it is far from obvious why there is this inconsistency. If we can't make the shared code consistent - maybe add a scale parameter ? Then at least we should have some comments.
But why are X11 and Windows fine with no scale ?

X11:
@OverRide
public int getRGBPixel(int x, int y) {
int[] pixelArray = new int[1];
getRGBPixelsImpl(xgc, x, y, 1, 1, pixelArray, useGtk);
return pixelArray[0];
}

Windows :
@OverRide
public int getRGBPixel(int x, int y) {
// See 7002846: that's ineffective, but works correctly with non-opaque windows
return getRGBPixels(new Rectangle(x, y, 1, 1))[0];
}

And in native what I was suggesting was a safeguard that would have detected this problem - which I still think would be a good idea.

BTW "usually this is 1x1 pixel on lowdpi and 4x4 on hidpi screen, actually macOS may return any size"

When might it return "any size" ?
My experiments show scale==2 no matter what scale I use in the macos Display settings for the built-in retina

@mrserb
Copy link
Member Author

mrserb commented Jan 27, 2022

But why are X11 and Windows fine with no scale ?

Since the coordinates are in the device space, and we pass this coordinates to the OS as is, and the OS returns the images of the requested size we do not need to use a scale there. Unlike macOS where we should have coordinates in the user space and device space at the same time. So instead of the second coordinates we pass the scale and calculate others coordinate.

BTW "usually this is 1x1 pixel on lowdpi and 4x4 on hidpi screen, actually macOS may return any size"

When might it return "any size" ? My experiments show scale==2 no matter what scale I use in the macos Display settings for the built-in retina

Per the specification of the method we use: kCGWindowImageBestResolution. So when we request the area in the user space, that method may return low/hi/something in between/ image, then we convert/scale that image to fit the array we allocated.

@prrace
Copy link
Contributor

prrace commented Jan 28, 2022

  • So this is crying out for comments in the source code
  • I still see a need for a safety check in native code

@mrserb
Copy link
Member Author

mrserb commented Jan 28, 2022

* I still see a need for a safety check in native code

I can add some check but which one? In the native we should use the bounds we passed from java side, the problem is that we pass "1 * scale" = 'scale", but allocate the array as "new int[1]" so this is an issue on that java side in the changed method.

@prrace
Copy link
Contributor

prrace commented Jan 28, 2022

* I still see a need for a safety check in native code

I can add some check but which one? In the native we should use the bounds we passed from java side, the problem is that we pass "1 * scale" = 'scale", but allocate the array as "new int[1]" so this is an issue on that java side in the changed method.
Isn't the over-run supposed to be here :

JNIEXPORT void JNICALL
Java_sun_lwawt_macosx_CRobot_nativeGetScreenPixels
(JNIEnv *env, jobject peer,
jint x, jint y, jint width, jint height, jdouble scale, jintArray pixels)

void *jPixelData = (*env)->GetPrimitiveArrayCritical(env, pixels, 0);

CGContextRef jPicContextRef = CGBitmapContextCreate(
jPixelData,
picWidth, picHeight,
8, picWidth * sizeof(jint),
picColorSpace,
kCGBitmapByteOrder32Host |
kCGImageAlphaPremultipliedFirst)

And then the apple docs at https://developer.apple.com/documentation/coregraphics/1455939-cgbitmapcontextcreate/ say
about the 1st parameter :
Data

A pointer to the destination in memory where the drawing is to be rendered. The size of this memory block should be at least (bytesPerRow*height) bytes.

and picWidth * sizeof(jint), is bytes per row.
So if it the Java array pixels is just one int (4 bytes) and we have a scale of 2 when it needs to be 4 ints (16 bytes) we'd have the over-run ?
Then why can't we just make sure (*env)->GetArrayLength(env, pixels) >= picWidth * picHeight ??

@mrserb
Copy link
Member Author

mrserb commented Jan 28, 2022

A pointer to the destination in memory where the drawing is to be rendered. The size of this memory block should be at least (bytesPerRow*height) bytes. and picWidth * sizeof(jint), is bytes per row.
So if it the Java array pixels is just one int (4 bytes) and we have a scale of 2 when it needs to be 4 ints (16 bytes) we'd have the over-run ?

If the scale is 2 then we will need 4 ints overall, but only 2 ints per row, since we will have 2 rows.

Then why can't we just make sure (*env)->GetArrayLength(env, pixels) >= picWidth * picHeight ??

We can for sure, I can update the fix.

@mkartashev
Copy link
Member

FWIW there's a related problem on Linux where the size of the captured area is scaled down in the native code, which in case of capturing just one pixel might end up being 0 for scales 300%+. I just filed JDK-8280861 for that and the issue with coordinates scaling on Linux.

@mrserb
Copy link
Member Author

mrserb commented Jan 28, 2022

I just filed JDK-8280861 for that and the issue with coordinates scaling on Linux.

It looks similar to what is done on macOS where two types of coordinates are needed: the user's space to pass to the GTK and the device space to create storage for the pixels. Be careful to not desync them as we did on macOS here.

@mrserb
Copy link
Member Author

mrserb commented Jan 28, 2022

You are absolutely right! in case of linux we have the same bug when gtk is used. =(

@mrserb
Copy link
Member Author

mrserb commented Feb 1, 2022

I have added an additional check for the parameters of the native method.

Copy link
Contributor

@prrace prrace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, missed this update.

@mrserb
Copy link
Member Author

mrserb commented Feb 12, 2022

/integrate

@openjdk
Copy link

openjdk bot commented Feb 12, 2022

Going to push as commit eff5daf.
Since your change was applied there have been 131 commits pushed to the master branch:

  • 8acfbc2: 8281675: VMDeprecatedOptions test fails after JDK-8278423
  • 67077a0: 8278423: ExtendedDTraceProbes should be deprecated
  • 58dae60: 8274524: SSLSocket.close() hangs if it is called during the ssl handshake
  • aa918a6: 8281033: Improve ImageCheckboxTest to test all available LaF
  • 6fdfe04: 8281674: tools/javac/annotations/typeAnnotations/classfile/AnonymousExtendsTest.java fails with AssertionError
  • c3179a8: 8281462: Annotation toString output for enum not reusable for source input
  • 4032fe7: 8281238: TYPE_USE annotations not printed in correct position in toString output
  • 83ffbd2: 8277175: Add a parallel multiply method to BigInteger
  • 0786ddb: 8281535: Create a regression test for JDK-4670051
  • c5ff6e4: 8223077: module path support for dynamic CDS archive
  • ... and 121 more: https://git.openjdk.java.net/jdk/compare/0e70d4504c267174738485c7da82a2ac0ef09770...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 12, 2022
@openjdk openjdk bot closed this Feb 12, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 12, 2022
@openjdk
Copy link

openjdk bot commented Feb 12, 2022

@mrserb Pushed as commit eff5daf.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@mrserb mrserb deleted the JDK-8274939 branch February 12, 2022 22:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client client-libs-dev@openjdk.org integrated Pull request has been integrated
5 participants