8257837: Performance regression in heap byte buffer views #1733

Closed

@mcimadamore mcimadamore commented Dec 10, 2020

As a result of the recent integration of the foreign memory access API, some of the buffer implementations now use ScopedMemoryAccess instead of Unsafe. While this generally works well, there are situations where profile pollution arises, which results in a considerable slowdown. The profile pollution occurs because the same ScopedMemoryAccess method (e.g. getIntUnaligned) is called with two different buffer kinds (e.g. an off-heap buffer where base == null, and an on-heap buffer where base is a byte[]). Because of that, the unsafe access cannot be optimized, since C2 can't guess what the base of the unsafe access is.

In reality, this problem was already known (and solved) elsewhere: the sun.misc.Unsafe wrapper does basically the same thing that ScopedMemoryAccess does. To make sure that profile pollution does not occur in those cases, argument profiling is enabled for sun.misc.Unsafe as well. This patch adds yet another case for ScopedMemoryAccess.
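To make the pollution argument concrete, here is a toy model of the difference between one profile shared inside the accessor method and one profile per call site. Everything in it (`BaseKind`, `ArgProfile`, and both helper functions) is hypothetical and invented for illustration; it does not mirror HotSpot's actual profiling data structures.

```cpp
#include <set>

// Toy model only: hypothetical types, not HotSpot's MethodData.
enum class BaseKind { Null, ByteArray };

// Records which kinds of `base` argument have been observed.
struct ArgProfile {
    std::set<BaseKind> seen;
    void record(BaseKind kind) { seen.insert(kind); }
    // A compiler can only speculate on the base when one kind was seen.
    bool monomorphic() const { return seen.size() == 1; }
};

// One profile inside the shared accessor: every caller records into it,
// so a direct buffer (null base) and a heap buffer (byte[] base) mix.
inline ArgProfile profile_in_shared_method() {
    ArgProfile shared;
    shared.record(BaseKind::Null);
    shared.record(BaseKind::ByteArray);
    return shared;
}

// One profile per call site: each caller records only its own base kind.
inline ArgProfile profile_at_call_site(BaseKind kind) {
    ArgProfile site;
    site.record(kind);
    return site;
}
```

In this sketch the shared profile ends up with two base kinds and is no longer monomorphic, while each per-call-site profile stays monomorphic. That is the property argument profiling is meant to restore for callers of ScopedMemoryAccess.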

Here are the benchmark results:

Before:

Benchmark                                            Mode  Cnt  Score   Error  Units
LoopOverPollutedBuffer.direct_byte_buffer_get_float  avgt   30  0.612 ± 0.005  ms/op
LoopOverPollutedBuffer.heap_byte_buffer_get_int      avgt   30  2.740 ± 0.039  ms/op
LoopOverPollutedBuffer.unsafe_get_float              avgt   30  0.504 ± 0.020  ms/op

After:

Benchmark                                            Mode  Cnt  Score   Error  Units
LoopOverPollutedBuffer.direct_byte_buffer_get_float  avgt   30  0.613 ± 0.007  ms/op
LoopOverPollutedBuffer.heap_byte_buffer_get_int      avgt   30  0.304 ± 0.002  ms/op
LoopOverPollutedBuffer.unsafe_get_float              avgt   30  0.491 ± 0.004  ms/op

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8257837: Performance regression in heap byte buffer views

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1733/head:pull/1733
$ git checkout pull/1733


bridgekeeper bot commented Dec 10, 2020

👋 Welcome back mcimadamore! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 10, 2020

openjdk bot commented Dec 10, 2020

@mcimadamore The following labels will be automatically applied to this pull request:

  • core-libs
  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Dec 10, 2020

mlbridge bot commented Dec 10, 2020

Webrevs


openjdk bot commented Dec 10, 2020

@mcimadamore This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8257837: Performance regression in heap byte buffer views

Reviewed-by: chegar, roland

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 10, 2020
@mcimadamore
Contributor Author

/integrate

@openjdk openjdk bot closed this Dec 10, 2020
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 10, 2020

openjdk bot commented Dec 10, 2020

@mcimadamore Since your change was applied there has been 1 commit pushed to the master branch:

  • 0890620: 8258005: JDK build fails with incorrect fixpath script

Your commit was automatically rebased without conflicts.

Pushed as commit 37043b0.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@@ -1586,7 +1586,8 @@ bool MethodData::profile_unsafe(const methodHandle& m, int bci) {
   Bytecode_invoke inv(m , bci);
   if (inv.is_invokevirtual()) {
     if (inv.klass() == vmSymbols::jdk_internal_misc_Unsafe() ||
-        inv.klass() == vmSymbols::sun_misc_Unsafe()) {
+        inv.klass() == vmSymbols::sun_misc_Unsafe() ||
+        inv.klass() == vmSymbols::jdk_internal_misc_ScopedMemoryAccess()) {
       ResourceMark rm;
       char* name = inv.name()->as_C_string();
       if (!strncmp(name, "get", 3) || !strncmp(name, "put", 3)) {

Pre-existing, but !strncmp(name, "get", 3) seems a very roundabout way of writing inv.name()->starts_with("get"), which shouldn't need a ResourceMark either. Another observation is that inv.klass() isn't inlined (it is defined in bytecode.cpp), so introducing a local for inv.klass() avoids multiple calls. How about this:

  if (inv.is_invokevirtual()) {
    Symbol* klass = inv.klass();
    if (klass == vmSymbols::jdk_internal_misc_Unsafe() ||
        klass == vmSymbols::sun_misc_Unsafe() ||
        klass == vmSymbols::jdk_internal_misc_ScopedMemoryAccess()) {
      Symbol* name = inv.name();
      if (name->starts_with("get") || name->starts_with("put")) {
        return true;
      }
    }
  }
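As a standalone check that the two spellings agree, the sketch below compares the `strncmp` idiom with a plain prefix test. It uses `std::string` as a stand-in for HotSpot's `Symbol`, and the helper names (`is_accessor_strncmp`, `starts_with`, `is_accessor_prefix`) are hypothetical, chosen only for this illustration.

```cpp
#include <cstring>
#include <string>

// Original style: strncmp returns 0 when the first 3 characters match,
// so the negation reads as "name begins with get or put".
inline bool is_accessor_strncmp(const char* name) {
    return !std::strncmp(name, "get", 3) || !std::strncmp(name, "put", 3);
}

// Prefix test on std::string, standing in for a starts_with method.
inline bool starts_with(const std::string& s, const std::string& prefix) {
    // compare() clamps the length to the string size, so a name shorter
    // than the prefix simply compares unequal.
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Suggested style: the intent is visible without decoding strncmp's result.
inline bool is_accessor_prefix(const std::string& name) {
    return starts_with(name, "get") || starts_with(name, "put");
}
```

Both predicates accept names such as getIntUnaligned and putInt and reject names that are shorter than the prefix or start with something else. The prefix form also needs no temporary C string, which is why the suggestion can drop the ResourceMark.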

Labels
core-libs core-libs-dev@openjdk.org hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
4 participants