8341141: Optimize DirectCodeBuilder #21243

Closed

wenshao
Contributor

@wenshao wenshao commented Sep 29, 2024

Several DirectCodeBuilder-related optimizations to improve startup and runtime performance:

  1. Merge calls: combine writeU1 and writeU2 into writeU3
  2. Merge calls: combine the writeU1 and writeIndex operations
  3. Use writeU1 directly instead of writeBytecode
  4. Rewrite the implementation of load and store
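
To illustrate what merging adjacent writes can buy, here is a minimal sketch under stated assumptions. `CoalescingBuf`, `reserve`, and `writeU1U2` are hypothetical names for this example only (modelled loosely on the `writeU1U1` naming seen in the review snippets); this is not the JDK's actual buffer implementation. The point is that the merged call performs one capacity check and one offset update for three bytes instead of two of each.

```java
import java.util.Arrays;

/** Hypothetical, simplified byte buffer for illustration; not the JDK's implementation. */
final class CoalescingBuf {
    private byte[] elems = new byte[8];
    private int offset;

    /** Grows the backing array if fewer than 'extra' bytes remain. */
    private void reserve(int extra) {
        if (offset + extra > elems.length) {
            elems = Arrays.copyOf(elems, Math.max(elems.length * 2, offset + extra));
        }
    }

    /** Writes one unsigned byte. */
    void writeU1(int x) {
        reserve(1);
        elems[offset++] = (byte) x;
    }

    /** Writes an unsigned 16-bit value, big-endian. */
    void writeU2(int x) {
        reserve(2);
        elems[offset++] = (byte) (x >> 8);
        elems[offset++] = (byte) x;
    }

    /** Coalesced form: a single capacity check and offset update for all three bytes. */
    void writeU1U2(int x1, int x2) {
        reserve(3);
        byte[] e = elems;
        int o = offset;
        e[o] = (byte) x1;
        e[o + 1] = (byte) (x2 >> 8);
        e[o + 2] = (byte) x2;
        offset = o + 3;
    }

    /** Returns a copy of the bytes written so far. */
    byte[] toByteArray() {
        return Arrays.copyOf(elems, offset);
    }
}
```

Calling `writeU1U2(op, index)` produces the same bytes as `writeU1(op)` followed by `writeU2(index)`; only the bookkeeping differs.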

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8341141: Optimize DirectCodeBuilder (Sub-task - P4)

Reviewers

Contributors

  • Claes Redestad <redestad@openjdk.org>
  • Chen Liang <liach@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21243/head:pull/21243
$ git checkout pull/21243

Update a local copy of the PR:
$ git checkout pull/21243
$ git pull https://git.openjdk.org/jdk.git pull/21243/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21243

View PR using the GUI difftool:
$ git pr show -t 21243

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21243.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Sep 29, 2024

👋 Welcome back swen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Sep 29, 2024

@wenshao This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8341141: Optimize DirectCodeBuilder

Co-authored-by: Claes Redestad <redestad@openjdk.org>
Co-authored-by: Chen Liang <liach@openjdk.org>
Reviewed-by: liach, redestad

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 35 new commits pushed to the master branch:

  • 62acc9c: 8341548: More concise use of classfile API
  • 7312eea: 8341131: Some jdk/jfr/event/compiler tests shouldn't be executed with Xcomp
  • 966eb72: 8341447: Open source closed frame tests # 5
  • b9db74a: 8341378: Open source few TrayIcon tests - Set8
  • 6546353: 8340203: Link color is hard to distinguish from text color in API documentation
  • 580eb62: 8320500: [vectorapi] RISC-V: Optimize vector math operations with SLEEF
  • 4a12f5b: 8341643: G1: Merged cards counter skewed by merge cards cache
  • 6e48618: 8341644: Compile error in cgroup coding when using toolchain clang
  • 7a1e832: 8336843: Deprecate java.util.zip.ZipError for removal
  • f62dba3: 8341597: ZipFileInflaterInputStream input buffer size uses uncompressed size
  • ... and 25 more: https://git.openjdk.org/jdk/compare/1c3e56c3e45be3626afec0461d4ae8059b0b577f...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk

openjdk bot commented Sep 29, 2024

@wenshao The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs (core-libs-dev@openjdk.org) label Sep 29, 2024
@wenshao wenshao changed the title from "Optimize DirectCodeBuilder" to "8341141: Optimize DirectCodeBuilder" Sep 29, 2024
@openjdk openjdk bot added the rfr (Pull request is ready for review) label Sep 29, 2024
wenshao and others added 16 commits September 30, 2024 08:57
@wenshao wenshao requested a review from liach October 5, 2024 15:35
@cl4es
Member

cl4es commented Oct 7, 2024

I took this for a spin and the results look promising, with even a tentative win on the Write.jdkTree microbenchmark. The number of changes is a bit staggering, so it will take a while to review thoroughly. I might be able to get through it all tomorrow.

Member

@liach liach left a comment

Everything else looks good.

        return this;
    }

    @Override
    public CodeBuilder aload(int slot) {
        writeLocalVar(BytecodeHelpers.aload(slot), slot);
        if (slot >= 0 && slot <= 3) {
Member

@liach liach Oct 7, 2024

Should we use if ((slot & ~3) != 0) for shorter bytecode? #21367

Member

While fun, I wonder if such bit-fiddling optimizations obfuscate the code more than they help performance. It would be good to have some supporting evidence that this 1) helps interpreted performance and 2) is recognized and optimized well by all JITs, without surprises.
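
For readers weighing the two forms discussed above, the following sketch checks that a mask test matches the comparison test for `int` slots. `SlotRangeCheck` and its method names are hypothetical, introduced only for this example. Note that it is `(slot & ~3) == 0` that corresponds to `slot >= 0 && slot <= 3`; the `!= 0` form selects the complementary (out-of-range) case, so the branches would need to be swapped to use it. The sketch says nothing about performance, which is the open question in this thread.

```java
/** Hypothetical helper comparing the two range-test forms; illustrative only. */
final class SlotRangeCheck {
    /** Range test written with comparisons. */
    static boolean inRangeCompare(int slot) {
        return slot >= 0 && slot <= 3;
    }

    /** Mask form: true only when no bit above the low two is set (negative values have the sign bit set). */
    static boolean inRangeMask(int slot) {
        return (slot & ~3) == 0;
    }

    public static void main(String[] args) {
        // Check a window around zero plus the int extremes.
        for (int slot = -1024; slot <= 1024; slot++) {
            if (inRangeCompare(slot) != inRangeMask(slot)) {
                throw new AssertionError("mismatch at " + slot);
            }
        }
        if (inRangeMask(Integer.MIN_VALUE) || inRangeMask(Integer.MAX_VALUE)) {
            throw new AssertionError("extremes must be out of range");
        }
        System.out.println("mask and comparison forms agree");
    }
}
```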

Comment on lines +503 to 507
        if (slot < 256) {
            bytecodesBufWriter.writeU1U1(bytecode, slot);
        } else {
            bytecodesBufWriter.writeU1U1U2(WIDE, bytecode, slot);
        }
Member

Can we do:

        if ((slot & ~0xFF) == 0)
            bytecodesBufWriter.writeU1U1(bytecode, slot);
        else if ((slot & ~0xFFFF) == 0)
            bytecodesBufWriter.writeU1U1U2(WIDE, bytecode, slot);
        else
            throw BytecodeHelpers.slotOutOfBounds(slot);
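
The three-way check suggested above can be tried in isolation with the sketch below. `LocalVarEncoder` and `encode` are hypothetical names for this example, and an `IllegalArgumentException` stands in for `BytecodeHelpers.slotOutOfBounds`, whose exact behavior is not shown in this thread. The opcode values for `aload` (0x19) and `wide` (0xC4) come from the JVM specification.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical encoder for a local-variable instruction; illustrative only. */
final class LocalVarEncoder {
    static final int ALOAD = 0x19; // JVMS opcode for aload
    static final int WIDE = 0xC4;  // JVMS opcode for the wide prefix

    /** Emits the short form for slots 0..255, the wide form for 256..65535, and rejects anything else. */
    static byte[] encode(int bytecode, int slot) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if ((slot & ~0xFF) == 0) {
            out.write(bytecode);
            out.write(slot);
        } else if ((slot & ~0xFFFF) == 0) {
            out.write(WIDE);
            out.write(bytecode);
            out.write(slot >> 8); // high byte of the 16-bit slot index
            out.write(slot);      // low byte (write(int) keeps only the low 8 bits)
        } else {
            // Negative slots also land here because their high bits are set.
            throw new IllegalArgumentException("slot out of bounds: " + slot);
        }
        return out.toByteArray();
    }
}
```

With this shape, `encode(ALOAD, 5)` yields two bytes and `encode(ALOAD, 300)` yields four, while a negative or oversized slot fails fast instead of being silently truncated.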

@wenshao
Contributor Author

wenshao commented Oct 8, 2024

/contributor add redestad

@wenshao
Contributor Author

wenshao commented Oct 8, 2024

/contributor add liach

@openjdk

openjdk bot commented Oct 8, 2024

@wenshao
Contributor Claes Redestad <redestad@openjdk.org> successfully added.

@openjdk

openjdk bot commented Oct 8, 2024

@wenshao
Contributor Chen Liang <liach@openjdk.org> successfully added.

@openjdk openjdk bot added the ready (Pull request is ready to be integrated) label Oct 8, 2024
Member

@cl4es cl4es left a comment

LGTM.

I spotted places outside of the DirectCodeBuilder paths that could benefit from these new coalescing writers, but the write-only cases are a good focus point for now.

        } else {
            bytecodesBufWriter.writeU1(value);
            bytecodesBufWriter.writeU1U1(opcode.bytecode(), value);
        }
    }

    public void writeLoadConstant(Opcode opcode, LoadableConstantEntry value) {
        // Make sure Long and Double have LDC2_W and
        // rewrite to _W if index is > 256
Member

@cl4es cl4es Oct 8, 2024

Pre-existing..

  • Comment should say >= 256.
  • As we're cloning from a pre-existing pool I assume there's a (perhaps unlikely) possibility we go from a wide to a lower index? In that case the opcode could profitably be "rewritten" to Opcode.LDC in an else clause. (I assume LDC_W with an index in the 0-255 range works fine functionally; it just wastes a byte.)
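
The two points above amount to picking the opcode from the final pool index rather than trusting the incoming one. A minimal sketch of that selection follows, assuming a hypothetical `LdcSelector.select` helper that is not part of the patch. The opcode values (`ldc` 0x12, `ldc_w` 0x13, `ldc2_w` 0x14) and the rule that long and double constants require `ldc2_w` come from the JVM specification.

```java
/** Hypothetical opcode selection for loading a constant; illustrative only. */
final class LdcSelector {
    static final int LDC = 0x12;    // one-byte constant pool index
    static final int LDC_W = 0x13;  // two-byte constant pool index
    static final int LDC2_W = 0x14; // long/double constants, two-byte index

    /**
     * @param longOrDouble true when the constant is a long or a double
     * @param cpIndex      constant pool index after any pool adaptation
     */
    static int select(boolean longOrDouble, int cpIndex) {
        if (cpIndex <= 0 || cpIndex > 0xFFFF) {
            throw new IllegalArgumentException("invalid constant pool index: " + cpIndex);
        }
        if (longOrDouble) {
            return LDC2_W;
        }
        // The boundary is >= 256: index 255 still fits in one byte, 256 does not.
        return cpIndex < 256 ? LDC : LDC_W;
    }
}
```

Selecting from the index this way covers both directions: an index that grew past 255 gets `ldc_w`, and one that shrank below 256 gets the shorter `ldc`.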

Member

Indeed, the opcode here is pointless if we have a pool adaptation; in that case we should just use ldc(LoadableConstantEntry).

Contributor Author

Here in writeLoadConstant, some branches have curly braces, and some don't. The style is inconsistent. Should I change it?

Member

I think we can do it in a later cleanup that fixes writeLoadConstant. This patch is already huge.

@wenshao
Contributor Author

wenshao commented Oct 8, 2024

> LGTM.
>
> I spotted places outside of the DirectCodeBuilder paths that could benefit from these new coalescing writers, but the write-only cases are a good focus point for now.

Are you talking about merging multiple writeU2 calls in places like AbstractAttributeMapper? I plan to open a new PR for that work, because this PR already has too many changes.

@cl4es
Member

cl4es commented Oct 8, 2024

> Are you talking about merging multiple writeU2 calls in places like AbstractAttributeMapper? I plan to open a new PR for that work, because this PR already has too many changes.

Good. I would have suggested the same.

@wenshao
Contributor Author

wenshao commented Oct 9, 2024

/integrate

@openjdk

openjdk bot commented Oct 9, 2024

Going to push as commit 047c2d7.
Since your change was applied there have been 41 commits pushed to the master branch:

  • d636e0d: 8341688: Aarch64: Generate comments in -XX:+PrintInterpreter to link to source code
  • d3f3c6a: 8330157: C2: Add a stress flag for bailouts
  • d809bc0: 8341658: RISC-V: Test DateFormatProviderTest.java run timeouted
  • de90204: 8341588: Remove CollectionUsageThreshold.java from ProblemList-Xcomp for debugging
  • f276f58: 8341803: ProblemList containers/docker/TestJcmdWithSideCar.java on linux-x64
  • 7eab0a5: 8337066: Repeated call of StringBuffer.reverse with double byte string returns wrong result
  • 62acc9c: 8341548: More concise use of classfile API
  • 7312eea: 8341131: Some jdk/jfr/event/compiler tests shouldn't be executed with Xcomp
  • 966eb72: 8341447: Open source closed frame tests # 5
  • b9db74a: 8341378: Open source few TrayIcon tests - Set8
  • ... and 31 more: https://git.openjdk.org/jdk/compare/1c3e56c3e45be3626afec0461d4ae8059b0b577f...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated (Pull request has been integrated) label Oct 9, 2024
@openjdk openjdk bot closed this Oct 9, 2024
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Oct 9, 2024
@openjdk

openjdk bot commented Oct 9, 2024

@wenshao Pushed as commit 047c2d7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
