Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8273100: Improve AbstractStringBuilder.append(String) when using CompactStrings #5291

Closed
wants to merge 3 commits into from

Conversation

cl4es
Copy link
Member

@cl4es cl4es commented Aug 28, 2021

Refactor to improve inlining, which helps some microbenchmarks exer StringBuilder.append(String)


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8273100: Improve AbstractStringBuilder.append(String) when using CompactStrings

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5291/head:pull/5291
$ git checkout pull/5291

Update a local copy of the PR:
$ git checkout pull/5291
$ git pull https://git.openjdk.java.net/jdk pull/5291/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5291

View PR using the GUI difftool:
$ git pr show -t 5291

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5291.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 28, 2021

👋 Welcome back redestad! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 28, 2021
@openjdk
Copy link

openjdk bot commented Aug 28, 2021

@cl4es The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Aug 28, 2021
@mlbridge
Copy link

mlbridge bot commented Aug 28, 2021

Webrevs

@cl4es
Copy link
Member Author

cl4es commented Aug 28, 2021

Microbenchmark results:

Baseline

Benchmark                    Mode  Cnt    Score   Error  Units
# +CompactStrings
StringBuilders.appendLoop16  avgt   25  749.861 ± 6.717  ns/op
StringBuilders.appendLoop8   avgt   25  382.491 ± 4.603  ns/op

# -CompactStrings
StringBuilders.appendLoop16  avgt   25  716.850 ± 7.857  ns/op
StringBuilders.appendLoop8   avgt   25  374.279 ± 5.793  ns/op

# -Xint +CompactStrings:
StringBuilders.appendLoop16  avgt    5  23479.176 ± 135.382  ns/op
StringBuilders.appendLoop8   avgt    5  12075.346 ±  80.429  ns/op

# -Xint -CompactStrings:
Benchmark                    Mode  Cnt      Score     Error  Units
StringBuilders.appendLoop16  avgt    5  22166.303 ± 295.892  ns/op
StringBuilders.appendLoop8   avgt    5  11480.187 ± 710.690  ns/op

Patched:

Benchmark                    Mode  Cnt    Score   Error  Units
# +CompactStrings
StringBuilders.appendLoop16  avgt   25  724.803 ± 9.613  ns/op
StringBuilders.appendLoop8   avgt   25  374.400 ± 6.368  ns/op

# -CompactStrings
StringBuilders.appendLoop16  avgt   25  715.314 ± 11.787  ns/op
StringBuilders.appendLoop8   avgt   25  377.415 ±  6.089  ns/op

# -Xint +CompactStrings
StringBuilders.appendLoop16  avgt    5  19794.280 ± 227.033  ns/op
StringBuilders.appendLoop8   avgt    5  10200.439 ± 221.899  ns/op

# -Xint -CompactStrings
StringBuilders.appendLoop16  avgt    5  16855.894 ± 127.499  ns/op
StringBuilders.appendLoop8   avgt    5   9004.396 ± 111.427  ns/op

@cl4es
Copy link
Member Author

cl4es commented Aug 30, 2021

Simplified further after realizing putStringAt(int, String) can use String.getBytes(String, int, byte), improving performance for both modes:

# +CompactStrings
Benchmark                    Mode  Cnt    Score    Error  Units
StringBuilders.appendLoop16  avgt   15  681.778 ± 14.978  ns/op
StringBuilders.appendLoop8   avgt   15  358.008 ±  9.230  ns/op

# -CompactStrings
Benchmark                    Mode  Cnt    Score    Error  Units
StringBuilders.appendLoop16  avgt   15  664.591 ± 10.505  ns/op
StringBuilders.appendLoop8   avgt   15  351.175 ± 13.282  ns/op

@stsypanov
Copy link
Contributor

Hi, just curious how have you found out that the code should be extracted into a separate methods? Profiler?

@cl4es
Copy link
Member Author

cl4es commented Aug 30, 2021

Hi, just curious how have you found out that the code should be extracted into a separate methods? Profiler?

I saw that String::length calls appeared more than once with async-profiler, then did some experiments to see if manually inlining so that only one call was needed helped (and it did).

}

private void putStringAt(int index, String str, int off, int end) {
inflateIfNeededFor(str);
str.getBytes(value, off, index, coder, end - off);
}

private void putStringAt(int index, String str) {
Copy link
Contributor

@stsypanov stsypanov Aug 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace all the calls to this method with calls to previous method as putStringAt(index, str, 0, str.length()) taking into account that in all usecases str.length() is already calculated into a local var?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think so. The only use of this I can find is at line 1298 which effectively adds a substring: putStringAt(dstOffset, (String) s, start, end);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about lines 582, 1003 and 1175? E.g. 582

public AbstractStringBuilder append(String str) {
    if (str == null) {
        return appendNull();
    }
    int len = str.length();
    ensureCapacityInternal(count + len);
    putStringAt(count, str);             // couldn't it be putStringAt(count, str, 0, len);
    count += len;
    return this;
}

Doing this here and in other places allows to rid private void putStringAt(int index, String str) completely and reduce one nested method call, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've got the wrong idea: We want to use putStringAt(int, String) now since it can call into String.getBytes(String, int, byte), which has a simpler and more efficient implementation than String.getBytes(String, int, int, byte, int), since it avoids a couple of << coder operations. This makes up for most of the improvement between my initial and the current version of this patch.

(There's also no nested delegation from putStringAt(int, String) to putStringAt(int, String, int, int) in my proposed patch.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I've applied your patch locally incompletely and misunderstood it, my bad :)

private void putStringAt(int index, String str, int off, int end) {
if (getCoder() != str.coder()) {
private void inflateIfNeededFor(String input) {
if (COMPACT_STRINGS && (coder != input.coder())) {
Copy link
Contributor

@stsypanov stsypanov Aug 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not completely sure whether it's a good idea in terms of maintainability, but I think this can be simplified a bit more. Currently in both String and ASB we have implementations of coder() very much alike:

// ASB
final byte getCoder() {
    return COMPACT_STRINGS ? coder : UTF16;
}

//String
byte getCoder() {
    return COMPACT_STRINGS ? coder : UTF16;
}

Here we have this condition

if (COMPACT_STRINGS && (coder != input.getCoder())) {}

where the right operand of && is evaluated only when COMPACT_STRINGS is true and hence it always returns the value of coder field. This means we can reference it directly as

if (COMPACT_STRINGS && (coder != input.coder)) {}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this'd give us enough to motivate the refactor, especially since we'd have to widen the visibility of String.coder. Maybe startup could be helped a little. Either way it feels out of scope for this change, don't you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openjdk
Copy link

openjdk bot commented Aug 30, 2021

@cl4es This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8273100: Improve AbstractStringBuilder.append(String) when using CompactStrings

Reviewed-by: rriggs, alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 12 new commits pushed to the master branch:

  • fecefb8: 8271302: Regex Test Refresh
  • f18c0fa: 8271560: sun/security/ssl/DHKeyExchange/LegacyDHEKeyExchange.java still fails due to "An established connection was aborted by the software in your host machine"
  • 5aaa20f: 8272861: Add a micro benchmark for vector api
  • 7a01ba6: 8272093: Extract evacuation failure injection from G1CollectedHeap
  • 98b9d98: 8272797: Mutex with rank safepoint_check_never imply allow_vm_block
  • f11e099: 8272651: G1 heap region info print order changed by JDK-8269914
  • fbffa54: 8270438: "Cores to use" output in configure is misleading
  • 5185dbd: 8273098: Unnecessary Vector usage in java.naming
  • 276b07b: 8271490: [ppc] [s390]: Crash in JavaThread::pd_get_top_frame_for_profiling
  • bb7aa1c: 8272161: Make evacuation failure data structures local to collection
  • ... and 2 more: https://git.openjdk.java.net/jdk/compare/f55d5ab5177b6e08e8499abc181a320a98b28a5f...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 30, 2021
@cl4es
Copy link
Member Author

cl4es commented Aug 31, 2021

Thanks for reviews, everyone!

/integrate

@openjdk
Copy link

openjdk bot commented Aug 31, 2021

Going to push as commit 98fa533.
Since your change was applied there have been 16 commits pushed to the master branch:

  • 9732fbe: 8273153: Consolidate file_exists into os:file_exists
  • 0609421: 8272347: ObjectMethods::bootstrap should specify NPE if any argument except lookup is null
  • 7fc8540: 8260265: UTF-8 by Default
  • 3204853: 8272343: Remove MetaspaceClosure::FLAG_MASK
  • fecefb8: 8271302: Regex Test Refresh
  • f18c0fa: 8271560: sun/security/ssl/DHKeyExchange/LegacyDHEKeyExchange.java still fails due to "An established connection was aborted by the software in your host machine"
  • 5aaa20f: 8272861: Add a micro benchmark for vector api
  • 7a01ba6: 8272093: Extract evacuation failure injection from G1CollectedHeap
  • 98b9d98: 8272797: Mutex with rank safepoint_check_never imply allow_vm_block
  • f11e099: 8272651: G1 heap region info print order changed by JDK-8269914
  • ... and 6 more: https://git.openjdk.java.net/jdk/compare/f55d5ab5177b6e08e8499abc181a320a98b28a5f...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Aug 31, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 31, 2021
@openjdk
Copy link

openjdk bot commented Aug 31, 2021

@cl4es Pushed as commit 98fa533.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@cl4es cl4es deleted the asb_inlining branch August 31, 2021 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
4 participants