8301958: Reduce Arrays.copyOf/-Range overheads #12453
Conversation
…disambiguate from generic utility methods, tune microbenchmark
👋 Welcome back redestad! A progress list of the required criteria for merging this PR into |
Webrevs
|
@@ -4535,9 +4541,12 @@ void getBytes(byte[] dst, int srcPos, int dstBegin, byte coder, int length) {
  String(AbstractStringBuilder asb, Void sig) {
      byte[] val = asb.getValue();
      int length = asb.length();
+     // To avoid surprises due to data races (which would either truncate or throw an exception)
+     // we should check that length <= val.length up front
+     checkOffset(length, val.length);
I agree a check is needed here but I assume using checkOffset means that SB::toString could fail with SIOOBE. I wonder if Math.min(length, val.length) would be better here.
Ok. That keeps behavior consistent for most cases and removes a path where we can fail with SIOOBE in the existing code (down StringUTF16::compress).
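The clamping idea discussed above can be sketched as follows. This is only an illustration, not the code from the PR: the class `ClampSketch` and the method `snapshot` are hypothetical names, and the real constructor also deals with coders and compression, which are left out here.

```java
import java.util.Arrays;

// Hypothetical stand-in for the copy step in String(AbstractStringBuilder, Void).
final class ClampSketch {
    // 'length' is assumed to be read separately from 'val' and may be stale
    // if another thread mutates the builder concurrently.
    static byte[] snapshot(byte[] val, int length) {
        // Clamp instead of throwing: if the racy length exceeds the array,
        // copy only the bytes that actually exist.
        int safeLength = Math.min(length, val.length);
        return Arrays.copyOfRange(val, 0, safeLength);
    }
}
```

With clamping, a stale `length` yields a truncated result rather than a StringIndexOutOfBoundsException; a negative `length` would still be rejected by `Arrays.copyOfRange`.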
Latest version looks good. I assume you've bumped the copyright date on the files that still have 2022 as their last edit.
@cl4es This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 85 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@@ -760,8 +760,7 @@ public static String newString(byte[] val, int index, int len) {
      if (len == 0) {
          return "";
      }
-     return new String(Arrays.copyOfRange(val, index, index + len),
-                       LATIN1);
+     return new String(String.copyBytes(val, index, len), LATIN1);
It might be worthwhile putting a comment at the top of this method saying that the caller is required to bounds-check the arguments. It's called from several other classes in this package, so it would be good to document this package-internal contract. I checked the callers and they seem fine.
Our source code is a reference implementation, and people will look at this change as evidence that On the other hand, |
Thanks @cl4es for looking into this!
@@ -695,6 +695,12 @@ private String(Charset charset, byte[] bytes, int offset, int length) {
      }
  }

  static byte[] copyBytes(byte[] bytes, int offset, int length) {
Given that the stub generated for array copy seems highly dependent on the call-site constraints, did you try adding a check for offset == 0 and/or length == bytes.length?
if (offset == 0 && bytes.length == length) {
    System.arraycopy(bytes, 0, dst, 0, bytes.length);
} // etc. for the other combinations
This should produce different generated stubs with much smaller ASM depending on the enforced constraints (and shouldn't badly affect the code size of the method, given that the stub won't be inlined AFAIK).
To be clear, as noted by others, I'm not suggesting that's the way to fix this, but it would be interesting to check how much performance we leave on the table due to this supposedly "inefficient" stub generation (if that's the issue).
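A self-contained version of the suggestion above might look like the sketch below. The class name `CopyBytesSketch` is hypothetical, and the method deliberately performs no bounds checks of its own, mirroring the package-internal contract discussed earlier. Whether the JIT actually selects a cheaper arraycopy stub for the whole-array branch is the hypothesis under discussion, not something this sketch demonstrates.

```java
// Hypothetical sketch: both branches copy the same bytes, but the first one
// exposes constant offsets and a length tied to bytes.length at the call site.
final class CopyBytesSketch {
    // Caller is assumed to have bounds-checked offset and length.
    static byte[] copyBytes(byte[] bytes, int offset, int length) {
        byte[] dst = new byte[length];
        if (offset == 0 && length == bytes.length) {
            // Whole-array copy: source offset, destination offset and length
            // are all trivially in range.
            System.arraycopy(bytes, 0, dst, 0, bytes.length);
        } else {
            // General case: arbitrary offset and length.
            System.arraycopy(bytes, offset, dst, 0, length);
        }
        return dst;
    }
}
```

The two branches are functionally identical, so any difference would show up only in generated code and benchmark scores.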
I did some quick experiments but saw no clear win from doing anything like this here. Feel free to experiment and see if there's some particular configuration that comes out ahead.
FTR I did not intend for this RFE to solve https://bugs.openjdk.org/browse/JDK-8295496 completely, but provide a small, partial win that might possibly clear a path to solving that likely orthogonal issue.
I've created a separate benchmark for this (named the same as yours by accident, given that I used yours as a blueprint):
https://gist.github.com/franz1981/658c2bf6796aab4ae04a84bef1ef34b6
results are
Benchmark                             (offset)  (size)  Mode  Cnt   Score    Error  Units
StringConstructor.arrayCopy                  0       7  avgt   10   9.519 ±  0.131  ns/op
StringConstructor.arrayCopy                  1       7  avgt   10   9.194 ±  0.232  ns/op
StringConstructor.copyOf                     0       7  avgt   10  11.548 ±  0.133  ns/op
StringConstructor.copyOf                     1       7  avgt   10   9.812 ±  0.018  ns/op
StringConstructor.optimizedArrayCopy         0       7  avgt   10   6.854 ±  0.355  ns/op  <---- THAT'S COOL
StringConstructor.optimizedArrayCopy         1       7  avgt   10   9.088 ±  0.049  ns/op
The optimized array copy is helping C2 with stub generation.
I haven't yet checked whether this applies to the String case, and I haven't created a long enough dataset array to check the effects of the newly introduced conditions on the branch predictor, but in terms of the generated stub there's a difference.
It might be that the redundant checks in But yes, I will add some commentary to the effect that this should ideally be handled by our JIT, along with comments that the method deliberately avoids safety checks. |
I'm wondering if another contributing factor to the complexity of this code is the continued support of the non-compact-String codepaths. This means there are actually three code paths through every string computation. Do we need to continue to support the non-compact-string code paths? I'm concerned about maintainability too. |
I think most apps have sufficient ASCII/latin1-encodable data to make compact strings a net win, especially with recent improvements to key intrinsics that have narrowed the gap. I still think turning off compact strings might be beneficial in locales where most strings are UTF-16, but as you say there might be wins to maintainability and code complexity by ripping out |
It is at least possible that splitting By splitting, in this case, I mean moving the first three lines into a helper routine called It’s sad and true, until we get a better inliner, which of course will disrupt the ecosystem because there is no unique best answer to an inlining problem (if it is complex enough, and they are). So for now we pretend to be good O-O programmers and code separate concerns in separate methods. |
Rather than splitting right down the middle isn't it more effective to factor out code that would typically not be executed, such as the exception creation + formatting? That additionally allows the JIT to outline such code altogether, allowing more aggressive inlining of the non-exceptional path(s). |
You could split it that way as well. It pushes the inliner a little harder, but doesn’t make it fall over. |
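The second splitting strategy, factoring the rarely executed exception creation and formatting out of the hot method, can be sketched like this. The names `RangeCheckSketch`, `checkRange` and `outOfBounds` are hypothetical and not taken from the JDK sources.

```java
// Hypothetical sketch of keeping the hot path small by outlining the cold path.
final class RangeCheckSketch {
    // Hot path: a few comparisons and a call, small enough to inline easily.
    static void checkRange(int from, int to, int length) {
        if (from < 0 || from > to || to > length) {
            throw outOfBounds(from, to, length);
        }
    }

    // Cold path: string concatenation and exception construction live in a
    // separate method so their bytecode does not count against the caller.
    private static IndexOutOfBoundsException outOfBounds(int from, int to, int length) {
        return new IndexOutOfBoundsException(
                "range [" + from + ", " + to + ") out of bounds for length " + length);
    }
}
```

The helper returns the exception rather than throwing it, so the `throw` stays visible in the caller and the compiler still sees that the branch does not fall through.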
… out range checks and helping JIT pick the best arraycopy adapter
@franz1981 idea seems to apply nicely here, and going back and applying it to
We might still benefit for some cases to specialize a |
LGTM
@@ -3582,6 +3590,9 @@ public static short[] copyOf(short[] original, int newLength) {
   * @since 1.6
   */
  public static int[] copyOf(int[] original, int newLength) {
+     if (newLength == original.length) {
+         return original.clone();
I am curious about the use of clone for some primitive array types (short[], int[], long[], char[], float[]) and copyOf using System.arraycopy in other types (byte[], double[]). Do these types optimize differently or hit different intrinsics depending on primitive type? Is there a difference in array zeroing?
From a quick JMH benchmark, System.arraycopy seems slightly better.
I went back and forth on this and also saw a small win using arraycopy, but the PR ended up in an inconsistent state with some using one and some using the other. While this discrepancy seems like something we should treat as a bug, I've arranged to use the copyOf helper consistently for now.
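For reference, the whole-array special case under discussion can be sketched as below. This is a simplified illustration with a hypothetical class name `CopyOfSketch`; it uses `clone()` for the equal-length case as in the diff hunk shown above, and the merged JDK code may be structured differently.

```java
// Hypothetical sketch of a copyOf with a fast path for whole-array copies.
final class CopyOfSketch {
    static int[] copyOf(int[] original, int newLength) {
        if (newLength == original.length) {
            // Exactly the whole array: no truncation, no zero padding needed.
            return original.clone();
        }
        // General case: allocate (zero-filled) and copy the overlapping prefix.
        // A negative newLength fails here with NegativeArraySizeException.
        int[] copy = new int[newLength];
        System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
        return copy;
    }
}
```

Replacing `original.clone()` with an allocation plus `System.arraycopy(original, 0, copy, 0, original.length)` gives the alternative variant the reviewers compare against.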
I'm also curious if returning the new length from checkLength would be worthwhile.
The nudge from John and others to move this to Arrays.copyOf is much better. It looks good.
We may have to come back to the SB::toString issue from the original proposal.
Thanks for reviewing! I've filed https://bugs.openjdk.org/browse/JDK-8302315 to investigate the clone/arraycopy performance discrepancy. Ideally we should be able to just do /integrate |
Going to push as commit 1f9c110.
Your commit was automatically rebased without conflicts. |
This patch adds special cases to Arrays.copyOf and Arrays.copyOfRange to copy arrays more efficiently when exactly the whole input array is to be copied. This helps eliminate range checks and has been verified to help various String operations. Example:
Baseline
Patch:
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/12453/head:pull/12453
$ git checkout pull/12453
Update a local copy of the PR:
$ git checkout pull/12453
$ git pull https://git.openjdk.org/jdk pull/12453/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12453
View PR using the GUI difftool:
$ git pr show -t 12453
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12453.diff