Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8263482: Make access to the ICC color profiles data multithread-friendly #2957

Closed
wants to merge 4 commits into from

Conversation

mrserb
Copy link
Member

@mrserb mrserb commented Mar 12, 2021

FYI: probably is better/simpler to review it via webrev.

After migration to the lcms from the kcms the performance of some operations was regressed. One possible workaround was to split the operation into multiple threads. But unfortunately, we have too many bottlenecks which prevent using multithreading. This is the request to remove/minimize such bottlenecks(at least some of them), but it does not affect the single-threaded performance it should be handled separately.

The main code pattern optimized here is this:

    activate();
    byte[] theHeader = getData(cmmProfile, icSigHead);
    ---->  CMSManager.getModule().getTagData(p, tagSignature);

Notes about the code above:

  1. Before the change "activate()" method checked that the "cmmProfile" field was not null. After that we usually used the "cmmProfile" as the parameter to some other method. This included two volatile reads, and also required to check when we need to call the "activate()" method before usage of the "cmmProfile" field.
    Solution: "activate()" renamed to the "cmmProfile()" which became an accessor for the field, so we will get one volatile read and can easily monitor the usage of the field itself(it is used directly only in this method).

  2. The synchronized static method "CMSManager.getModule()" reimplemented to have only one volatile read.

  3. The usage of locking in the "getTagData()" is changed. Instead of the synchronized instance methods, we now use the mix of "ConcurrentHashMap" and StampedLock.

See some comments inline.

Some numbers(small numbers are better):

  1. Performance of ((ICC_ProfileRGB) ICC_Profile.getInstance(ColorSpace.CS_sRGB)).getMatrix();
jdk 15.0.2
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetMatrix       avgt    5   19,624 ±    0,059  us/op
CMMPerf.testGetMatrix                  avgt    5    0,154 ±    0,001  us/op

jdk - before the fix
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetMatrix       avgt    5   12,935 ±    0,042  us/op
CMMPerf.testGetMatrix                  avgt    5    0,127 ±    0,007  us/op

jdk - after the fix
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetMatrix       avgt    5    0,561 ±    0,005  us/op
CMMPerf.testGetMatrix                  avgt    5    0,092 ±    0,001  us/op
  1. Part of performance gain in jdk17 is from some other fixes, for example
    Performance of ICC_Profile.getInstance(ColorSpace.CS_sRGB); and ColorSpace.getInstance(ColorSpace.CS_sRGB);
jdk 15.0.2
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetSRGBProfile  avgt    5    2,299 ±    0,032  us/op
CMMPerf.ThreadsMAX.testGetSRGBSpace    avgt    5    2,210 ±    0,051  us/op
CMMPerf.testGetSRGBProfile             avgt    5    0,019 ±    0,001  us/op
CMMPerf.testGetSRGBSpace               avgt    5    0,018 ±    0,001  us/op

jdk - same before/ after the fix
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetSRGBProfile  avgt    5    0,005 ±    0,001  us/op
CMMPerf.ThreadsMAX.testGetSRGBSpace    avgt    5    0,005 ±    0,001  us/op
CMMPerf.testGetSRGBProfile             avgt    5    0,005 ±    0,001  us/op
CMMPerf.testGetSRGBSpace               avgt    5    0,005 ±    0,001  us/op

note "ThreadsMAX" is 32 threads.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8263482: Make access to the ICC color profiles data multithread-friendly

Reviewers

Download

To checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/2957/head:pull/2957
$ git checkout pull/2957

To update a local copy of the PR:
$ git checkout pull/2957
$ git pull https://git.openjdk.java.net/jdk pull/2957/head

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 12, 2021

👋 Welcome back serb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Mar 12, 2021

@mrserb The following label will be automatically applied to this pull request:

  • 2d

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the 2d client-libs-dev@openjdk.org label Mar 12, 2021
return LCMS.getTagNative(getNativePtr(), key);
});
} finally {
lock.unlockRead(stamp);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments about the change in this class and the changes in the "getTag()" method.

  1. I have removed the TagData and TagCache classes and move the tag cache to the profile class directly.
  2. I have moved synchronization logic from the LCMS.java to this class, so now we can use more "granular" synchronizations.

The logic behind this change:

  1. The StampedLock guards the access to the native lcms code, so when we change the tags via "LCMS.setTagDataNative" nobody will call "LCMS.getTagNative" and "LCMS.getProfileDataNative". I tried the "ReentrantReadWriteLock", but it is a little bit slower.
  2. The cache itself is maintained by the ConcurrentHashMap, and the usage of "byte[] t = tags.get(sig);" w/o synchronization is a key for the performance.

Note that it is possible to use "tags.computeIfAbsent" instead of "tags.get(sig)" and take care of "native access" synchronization in the "LCMS.getTagNative", but unfortuntly for some workload the "get()" is x10 times faster than computeIfAbsent() if the key is already exists in the map(common situation for the cache).

activate();
return getData(cmmProfile, tagSignature);
byte[] t = getData(cmmProfile(), tagSignature);
return t != null ? t.clone() : null;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the clone operation to the public "ICC_Profile.getData(int tagSignature)" method, so we do not clone it again again and again when we use this data internally.


@Override
public synchronized void setTagData(Profile p, int tagSignature, byte[] data) {
getLcmsProfile(p).setTag(tagSignature, data);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this method is synchronized, I think it is not needed to be, because a long time ago(before the LCMSProfile class was implemented ) all method in this class was synchronized to guard information about tags. But later we changed all such synchronization methods to the per-profile locks. And only this method remains synchronized.

But I leave it as is for now, since it might affect transforms which I am not touching in this fix. WIll remove that later.

@mrserb mrserb marked this pull request as ready for review March 12, 2021 08:35
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 12, 2021
@mlbridge
Copy link

mlbridge bot commented Mar 12, 2021

Webrevs

@openjdk
Copy link

openjdk bot commented Mar 17, 2021

@mrserb This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8263482: Make access to the ICC color profiles data multithread-friendly

Reviewed-by: azvegint

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 97 new commits pushed to the master branch:

  • 5d5813a: 8263757: Remove serviceability/sa/ClhsdClasses.java from ZGC problem list
  • 50ff0d4: 8263756: Fix ZGC ProblemList entry for serviceability/sa/ClhsdbSymbol.java
  • 99b39aa: 8262807: Note assumptions of core reflection modeling and parameter handling
  • 26234b5: 8254979: Class.getSimpleName() returns non-empty for lambda and method
  • 83a49ef: 8263753: two new tests from JDK-8261671 fail with "Error. can not find ClassFileInstaller in test directory or libraries"
  • 24afa36: 8263726: divideToIntegralValue typo on BigDecimal documentation
  • cdf78e4: 8262298: G1BarrierSetC2::step_over_gc_barrier fails with assert "bad barrier shape"
  • 7674da4: 8262398: Shenandoah: Disable nmethod barrier and stack watermark when running with passive mode
  • 4f4ca0e: 8261671: X86 I2L conversion can be skipped for certain masked positive values
  • 5d87a21: 8263361: Incorrect arraycopy stub selected by C2 for SATB collectors
  • ... and 87 more: https://git.openjdk.java.net/jdk/compare/f3bd801a8681db778122714e1df5ea12e7ef5824...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 17, 2021
@mrserb
Copy link
Member Author

mrserb commented Mar 19, 2021

/integrate

@openjdk openjdk bot closed this Mar 19, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 19, 2021
@openjdk
Copy link

openjdk bot commented Mar 19, 2021

@mrserb Since your change was applied there have been 121 commits pushed to the master branch:

  • d185655: 8263832: Shenandoah: Fixing parallel thread iteration in final mark task
  • 434a399: 8260274: Cipher.init(int, key) does not use highest priority provider for random bytes
  • 6aa28b3: 8263827: Suspend "missing" javadoc doclint checks for smartcardio
  • ed1e25d: 8263833: Stop disabling warnings for sunFont.c with gcc
  • 788e30c: 8263320: [test] Add Object Stream Formatter to work with test utility HexPrinter
  • fa0f161: 8263742: (bf) MappedByteBuffer.force() should use the capacity as its upper bound
  • c82a673: 8262001: java/lang/management/ThreadMXBean/ResetPeakThreadCount.java failed with "RuntimeException: Current Peak = 14 Expected to be == previous peak = 7 + 8"
  • 01ddf3d: 8263622: The java.awt.color.ICC_Profile#setData invert the order of bytes for the "head" tag
  • e34f766: 8252723: Run stack016.java also with C2-only
  • 2173fed: 8263439: getSupportedAttributeValues() throws NPE for Finishings attribute
  • ... and 111 more: https://git.openjdk.java.net/jdk/compare/f3bd801a8681db778122714e1df5ea12e7ef5824...master

Your commit was automatically rebased without conflicts.

Pushed as commit 1a21f77.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2d client-libs-dev@openjdk.org integrated Pull request has been integrated
2 participants