Skip to content

8337505: Footprint and startup regressions up to 20% in GUI apps #21748

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

prrace
Copy link
Contributor

@prrace prrace commented Oct 28, 2024

https://bugs.openjdk.org/browse/JDK-8338677 already improved things for this so that's good.

This fix adds to it lazy initialisation of VarHandles in StrikeCache at the cost of some extra code.
Since these VarHandles get used more or less immediately on Linux this new fix won't further improve matters there
But should help on Mac where they aren't usually needed at startup
And Windows is somewhere in between.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8337505: Footprint and startup regressions up to 20% in GUI apps (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21748/head:pull/21748
$ git checkout pull/21748

Update a local copy of the PR:
$ git checkout pull/21748
$ git pull https://git.openjdk.org/jdk.git pull/21748/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21748

View PR using the GUI difftool:
$ git pr show -t 21748

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21748.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 28, 2024

👋 Welcome back prr! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 28, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title 8337505 8337505: Footprint and startup regressions up to 20% in GUI apps Oct 28, 2024
@openjdk
Copy link

openjdk bot commented Oct 28, 2024

@prrace The following label will be automatically applied to this pull request:

  • client

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the client client-libs-dev@openjdk.org label Oct 28, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 28, 2024
@mlbridge
Copy link

mlbridge bot commented Oct 28, 2024

Webrevs

private static VarHandle cellInfoHandle;
private static VarHandle imageHandle;

private static VarHandle getXAdvanceHandle() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question: is there no concern with multi-threading / race conditions for these lazy creation methods? (e.g. requiring double-checked locking or similar)

This looks like a good use case for StableValue, if it were ready!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There won't be multiple threads except in some really unusual case so I don't think it is a problem

Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect dropping static final from these VarHandle-s would degrade performance, as some internal checks in VHs would not constant-fold.

Do you need these in isolation, or can you lazily-initialize them all at once? You can use "holder class" pattern like:

  static class VHHolder {
    static final VarHandle xAdvanceHandle = ...
  }

@prrace
Copy link
Contributor Author

prrace commented Oct 29, 2024

I suspect dropping static final from these VarHandle-s would degrade performance, as some internal checks in VHs would not constant-fold.

Do you need these in isolation, or can you lazily-initialize them all at once? You can use "holder class" pattern like:

  static class VHHolder {
    static final VarHandle xAdvanceHandle = ...
  }

I suspect dropping static final from these VarHandle-s would degrade performance, as some internal checks in VHs would not constant-fold.

Do you need these in isolation, or can you lazily-initialize them all at once? You can use "holder class" pattern like:

  static class VHHolder {
    static final VarHandle xAdvanceHandle = ...
  }

I've been told about this static final optimisation but I've never observed any measurable benefit in other code where I've tried hard to see it.
Performance (runtime) isn't a huge concern for this code. Won't be noticed
Start up is noticed. And on Windows some are used early but not all which is why I chose this approach

@bourgesl
Copy link
Contributor

bourgesl commented Oct 30, 2024

@prrace even if not perf critical, every cpu cycle count, do not waste if possible!

I do recomment using the lazy holder singleton pattern:

protected final class VHHolder {
    private final static vh...

    // END:
    private final static VHHolder INSTANCE = new VHHolder();
}

private static VHHolder holder = null;

static getVHHolder() {
    // lazy pattern (synchronized if needed)
   if (holder == null) holder = VHHolder.INSTANCE;
   return holder;
}

Or something like that...

My 2 cents

@prrace
Copy link
Contributor Author

prrace commented Oct 30, 2024

@prrace even if not perf critical, every cpu cycle count, do not waste if possible!

I do recomment using the lazy holder singleton pattern:

protected final class VHHolder {
    private final static vh...

    // END:
    private final static VHHolder INSTANCE = new VHHolder();
}

private static VHHolder holder = null;

static getVHHolder() {
    // lazy pattern (synchronized if needed)
   if (holder == null) holder = VHHolder.INSTANCE;
   return holder;
}

Or something like that...

My 2 cents

I would have to create a class per-varhandle. I really don't want to do that.

@bourgesl
Copy link
Contributor

Why ? You can group vh in holder class... if you need several, why not?

@prrace
Copy link
Contributor Author

prrace commented Nov 1, 2024

Why ? You can group vh in holder class... if you need several, why not?

I am not baking in the specifics of the grouping that "happens" to be what I see today on whatever tests I run on Windows.
Which might be completely different than some other cases or platforms. So no.

@shipilev
Copy link
Member

shipilev commented Nov 4, 2024

I am not baking in the specifics of the grouping that "happens" to be what I see today on whatever tests I run on Windows.
Which might be completely different than some other cases or platforms.

The similar argument can be applied to the expectation that non-foldable VHs do not affect performance, right? Since we are dealing with a performance regression already, it would be nice not to introduce another one.

I think holder classes are a fair compromise here until Stable Values arrive. Look at this variant (untested): 8337505-v1.patch. This is also arguably more straight-forward than lazy initialization with runtime null-checks and extra (mutable) fields, I think.

@prrace
Copy link
Contributor Author

prrace commented Nov 4, 2024

I am not baking in the specifics of the grouping that "happens" to be what I see today on whatever tests I run on Windows.
Which might be completely different than some other cases or platforms.

The similar argument can be applied to the expectation that non-foldable VHs do not affect performance, right? Since we are dealing with a performance regression already, it would be nice not to introduce another one.

In a previous fix I spent ages trying to see any improvement in perf of this over literally 100 of thousands of invocations
Never saw it. What I did see there and the issue here is start up costs.
And the startup cost is probably thousands of times more visible than any runtime cost here.
This code could be slower at runtime and it would still be way less important than the startup cost.

I think holder classes are a fair compromise here until Stable Values arrive. Look at this variant (untested): 8337505-v1.patch. This is also arguably more straight-forward than lazy initialization with runtime null-checks and extra (mutable) fields, I think.

In the case where they are all needed then this fix surely makes it worse.
The worse case was Linux and SCAICS, now we have to initialise a class for each one as well.
I can't see how this helps start up.

@merykitty
Copy link
Member

Is @Stable usable here, if so you can annotate all these with it and there should be no runtime concern.

@prrace
Copy link
Contributor Author

prrace commented Nov 6, 2024

Is @Stable usable here, if so you can annotate all these with it and there should be no runtime concern.

I don't know and don't see much precedent.
Outside of java.base, only the incubating vector module uses @stable.
But I don't think it would help. The problem is the cost of initialisation itself. Later VM optimisations don't matter.

@shipilev
Copy link
Member

shipilev commented Nov 7, 2024

Is @Stable usable here, if so you can annotate all these with it and there should be no runtime concern.

I don't know and don't see much precedent. Outside of java.base, only the incubating vector module uses @stable. But I don't think it would help. The problem is the cost of initialisation itself. Later VM optimisations don't matter.

What I think @merykitty means here is taking your lazy-initialization patch, and then stamp @Stable over the affected static fields. This way, we get lazy initialization for good startup, and also we get the ability to constant-fold the VarHandles once they are initialized.

@prrace
Copy link
Contributor Author

prrace commented Dec 4, 2024

Thanks for the suggestion, but it isn't a path I want to go. The patch is still fine as it is.

@shipilev
Copy link
Member

shipilev commented Dec 5, 2024

Thanks for the suggestion, but it isn't a path I want to go. The patch is still fine as it is.

Well, the point of the review process is to get at least some reviewers to agree. I do not see anyone agreeing so far.

My problem with current version is that exposing non-constant-foldable Method/VarHandle-s is a performance anti-pattern. I poked around the font code, and I have not been able to convince myself these accesses are never on the hot-path. I see this is also a cache, so I would think there is an expectation that most, if not all things in it work as fast as reasonably possible. Therefore, I cannot put my approval on current version, sorry.

I do not quite understand what is the problem with stamping @Stable over the now-non-final fields in your current PR, like below? Is this only about the conceptual uncleanliness of exposing @Stable to java.desktop, or is it about something else?

diff --git a/src/java.base/share/classes/module-info.java b/src/java.base/share/classes/module-info.java
index c3bcbed4e3b..66ffff5723a 100644
--- a/src/java.base/share/classes/module-info.java
+++ b/src/java.base/share/classes/module-info.java
@@ -261,6 +261,7 @@
         jdk.internal.vm.ci,
         jdk.jfr;
     exports jdk.internal.vm.annotation to
+        java.desktop,
         java.instrument,
         jdk.internal.vm.ci,
         jdk.incubator.vector,
diff --git a/src/java.desktop/share/classes/sun/font/StrikeCache.java b/src/java.desktop/share/classes/sun/font/StrikeCache.java
index e71235d54c0..5271afbe611 100644
--- a/src/java.desktop/share/classes/sun/font/StrikeCache.java
+++ b/src/java.desktop/share/classes/sun/font/StrikeCache.java
@@ -43,6 +43,8 @@
 import java.lang.ref.WeakReference;
 import java.util.*;
 
+import jdk.internal.vm.annotation.Stable;
+
 import sun.java2d.Disposer;
 import sun.java2d.pipe.BufferedContext;
 import sun.java2d.pipe.RenderQueue;
@@ -136,16 +138,16 @@ private static VarHandle getVarHandle(StructLayout struct, String name) {
         return MethodHandles.insertCoordinates(h, 1, 0L).withInvokeExactBehavior();
     }
 
-    private static VarHandle xAdvanceHandle;
-    private static VarHandle yAdvanceHandle;
-    private static VarHandle widthHandle;
-    private static VarHandle heightHandle;
-    private static VarHandle rowBytesHandle;
-    private static VarHandle managedHandle;
-    private static VarHandle topLeftXHandle;
-    private static VarHandle topLeftYHandle;
-    private static VarHandle cellInfoHandle;
-    private static VarHandle imageHandle;
+    @Stable private static VarHandle xAdvanceHandle;
+    @Stable private static VarHandle yAdvanceHandle;
+    @Stable private static VarHandle widthHandle;
+    @Stable private static VarHandle heightHandle;
+    @Stable private static VarHandle rowBytesHandle;
+    @Stable private static VarHandle managedHandle;
+    @Stable private static VarHandle topLeftXHandle;
+    @Stable private static VarHandle topLeftYHandle;
+    @Stable private static VarHandle cellInfoHandle;
+    @Stable private static VarHandle imageHandle;
 
     private static VarHandle getXAdvanceHandle() {
         if (xAdvanceHandle == null) {

@prrace
Copy link
Contributor Author

prrace commented Dec 5, 2024

A dependency on the internals of the base module is another reason not to do this.
And as I have already said runtime performance is a complete non-issue here.
This is about startup (not even warm-up, just startup). There's absolutely no (good) reason to worry about runtime here.

@liach
Copy link
Member

liach commented Dec 6, 2024

Are you sure all these var handles are rarely used (not on hot code path)? This current patch can slow down the use of these var handles by many times. If this is startup-only, we can probably just create the vh once, invoke it, and discard it, or look into FFM API and see if we have better one-time invocation solutions.

@prrace
Copy link
Contributor Author

prrace commented Dec 6, 2024

Are you sure all these var handles are rarely used (not on hot code path)? This current patch can slow down the use of these var handles by many times. If this is startup-only, we can probably just create the vh once, invoke it, and discard it, or look into FFM API and see if we have better one-time invocation solutions.

I don't know what you are thinking. Or perhaps you are overthinking it.
You invented the reference to "rarely". Discarding would be silly.
The FFM angle is understood by the Panama team.
The patch stands. I've not seen a single suggestion to change it that makes sense for the problem.

@liach
Copy link
Member

liach commented Dec 6, 2024

From the FFM API pov, we might be able to make the VarHandle lazy, that it contains symbolic information and is fully expanded upon first use, effectively the same as the laziness here. Don't know if this works for FFM, but this is how java lang invoke ones work if they have to initialize classes. If such laziness is implemented, we can always revisit the existing uses later. This patch is fine as long as there is no bad side effects.

@prrace
Copy link
Contributor Author

prrace commented Dec 6, 2024

From the FFM API pov, we might be able to make the VarHandle lazy, that it contains symbolic information and is fully expanded upon first use, effectively the same as the laziness here. Don't know if this works for FFM, but this is how java lang invoke ones work if they have to initialize classes. If such laziness is implemented, we can always revisit the existing uses later. This patch is fine as long as there is no bad side effects.

So some day this patch may be obsolete ? So no reason to make it more complex.
Bad side effects ? I think that is improbable.

@mbreinhold
Copy link
Member

Phil, I don’t understand your resistance to using the @Stable annotation, which is already exported by the java.base module to five other JDK modules. Yes, it’s marked internal, but that means “internal to the JDK,” not “internal to the base module.” There’s no conceptual uncleanliness here.

You could well be correct in saying that run-time performance is a complete non-issue, for all users at all times, but why not apply @Stable just in case it turns out to be an issue for some user at some time? Seems like cheap insurance.

@liach
Copy link
Member

liach commented Dec 6, 2024

Stable annotation requires the class to be loaded by the boot loader. Need to verify that for java.desktop first.

@mbreinhold
Copy link
Member

java.desktop is loaded by the bootstrap class loader:

java.desktop \

@liach
Copy link
Member

liach commented Dec 6, 2024

Thanks for the verification Mark.

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 3, 2025

@prrace This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@mrserb
Copy link
Member

mrserb commented Jan 13, 2025

Do we have a chance to reproduce the 20% numbers, and how it is comparable to the initial unsafe implementation? As a mentioned in the initial PR the new ffm implementation may add up to 100ms on startup(I tested it in the cmm). So taking that into account and 20% on some perftests(which one?) I assume migration back to unsafe could be an option as well.

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 10, 2025

@prrace This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 11, 2025

@prrace This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Mar 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client client-libs-dev@openjdk.org rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

8 participants