JDK-8324523: Lilliput: if +UseCOH, always use the archive's encoding base and shift #124


tstuefe
Member

@tstuefe tstuefe commented Jan 23, 2024

We have two ways to initialize narrow Klass encoding: either we let the JVM choose base and shift freely, or we dictate base and shift. The former gives the JVM more leeway, e.g. to go with unscaled encoding. The latter, however, is required if we load a CDS archive and that archive contains precomputed narrow Klass IDs.
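To make "base and shift" concrete, here is a minimal sketch of the arithmetic behind a narrow Klass ID. The struct and function names are hypothetical and are not HotSpot's actual API; the sketch only assumes the usual scheme where an ID is the offset from the base, scaled down by the shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the encoding parameters (not the HotSpot type).
struct NarrowKlassEncoding {
  uint64_t base;   // encoding base address
  unsigned shift;  // encoding shift
};

// Encode: take the offset from the base, then scale it down by the shift.
inline uint32_t encode_klass(const NarrowKlassEncoding& e, uint64_t klass_addr) {
  return static_cast<uint32_t>((klass_addr - e.base) >> e.shift);
}

// Decode: scale the ID back up, then add the base.
inline uint64_t decode_klass(const NarrowKlassEncoding& e, uint32_t narrow_klass) {
  return e.base + (static_cast<uint64_t>(narrow_klass) << e.shift);
}
```

With base 0 and shift 0 (unscaled encoding), the ID is simply the Klass address itself, which is why the JVM may prefer that layout when it is free to choose.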

In the legacy VM, this can only happen if the archive contains heap objects. In Lilliput, the markword carries the nKlass, and therefore the prototype header baked into archived Klass structures carries it too. We must therefore always choose the strict initialization when running with +UseCOH.
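The rule described above can be summarized as a small predicate. This is only an illustration of the decision, not the code in the patch; `ArchiveState` and `must_use_archive_encoding` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical summary of the CDS archive state relevant to this decision.
struct ArchiveState {
  bool loaded;            // a CDS archive is mapped at runtime
  bool has_heap_objects;  // the archive contains archived heap objects
};

// Returns true if the JVM must take base and shift from the archive
// (strict initialization) instead of choosing them freely.
inline bool must_use_archive_encoding(const ArchiveState& archive, bool use_coh) {
  if (!archive.loaded) {
    return false;  // no archive, so no precomputed narrow Klass IDs to honor
  }
  // Legacy: only archived heap objects carry narrow Klass IDs.
  // With +UseCOH: the prototype headers in archived Klass structures carry them too.
  return use_coh || archive.has_heap_objects;
}
```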


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Issue

  • JDK-8324523: Lilliput: if +UseCOH, always use the archive's encoding base and shift (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/lilliput.git pull/124/head:pull/124
$ git checkout pull/124

Update a local copy of the PR:
$ git checkout pull/124
$ git pull https://git.openjdk.org/lilliput.git pull/124/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 124

View PR using the GUI difftool:
$ git pr show -t 124

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/lilliput/pull/124.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jan 23, 2024

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@tstuefe
Member Author

tstuefe commented Jan 23, 2024

x86 problem unrelated.

@tstuefe tstuefe marked this pull request as ready for review January 23, 2024 15:19
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 23, 2024
@mlbridge

mlbridge bot commented Jan 23, 2024

Webrevs

Collaborator

@rkennke rkennke left a comment

Change looks ok.

Questions, though: what is the impact of this? Is it a bug? Does it improve or regress performance? Should it be backported?

@openjdk

openjdk bot commented Jan 23, 2024

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8324523: Lilliput: if +UseCOH, always use the archive's encoding base and shift

Reviewed-by: rkennke

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 23, 2024
@tstuefe
Member Author

tstuefe commented Jan 23, 2024

> Change looks ok.
>
> Questions, though: what is the impact of this? Is it a bug? Does it improve or regress performance? Should it be backported?

It is a bug, but I saw it only on Windows with my new class pointer patch. I could not reproduce it with the standard JVM, but I did not try very hard. The bug could lead to crashes. The performance impact is nil: if we actually run with an encoding scheme mismatch, we won't live very long, and if we don't, it does not matter.

@tstuefe
Member Author

tstuefe commented Jan 24, 2024

@rkennke :

> Questions, though: what is the impact of this? Is it a bug?

Okay, this bugged me, and I just had to know for sure. This bug is confirmed. In the traditional Lilliput VM this leads to early crashes in rare cases if a couple of conditions hold:

  • we run with an archive that was created using +UseCOH (my Smaller Classpointers patch generates such archives)
  • we don't use CDS heap archiving (on Windows, or when the JVM was built without G1 support)
  • when reserving the class space, we optimize for zero-based or unscaled encoding even though we run with CDS. This is highly CPU-specific; on some platforms, we do that today (e.g. PPC). On x64, it will be the case once JDK-8323497 ("On x64, use 32-bit immediate moves for narrow klass base if possible", jdk#17340) is merged.

If all these conditions hold, we generate the archive with precomputed narrow Klass IDs in the prototype headers, based on the assumption that the runtime encoding base will be the start of the Klass range. But at runtime, we set the encoding base to zero, so the narrow Klass IDs no longer match up, and we get a crash right away:

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/shared/projects/openjdk/lilliput/source/src/hotspot/share/classfile/javaClasses.cpp:1274), pid=192787, tid=192788
#  assert(is_instance(java_class)) failed: must be a Class object
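The mismatch described above can be reproduced with plain arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the simple "offset from base, scaled by shift" encoding rather than quoting HotSpot code.

```cpp
#include <cassert>
#include <cstdint>

// Encode a Klass address relative to the given base and shift.
inline uint32_t encode_with(uint64_t base, unsigned shift, uint64_t klass_addr) {
  return static_cast<uint32_t>((klass_addr - base) >> shift);
}

// Decode a narrow Klass ID using the given base and shift.
inline uint64_t decode_with(uint64_t base, unsigned shift, uint32_t narrow_klass) {
  return base + (static_cast<uint64_t>(narrow_klass) << shift);
}

// True if an ID precomputed with dump_base still decodes to the same
// Klass address when the runtime uses run_base.
inline bool survives_round_trip(uint64_t dump_base, uint64_t run_base,
                                unsigned shift, uint64_t klass_addr) {
  uint32_t narrow_klass = encode_with(dump_base, shift, klass_addr);
  return decode_with(run_base, shift, narrow_klass) == klass_addr;
}
```

For example, with a made-up Klass range starting at `0x800000000`, an ID precomputed against that base decodes correctly only if the runtime uses the same base. With a runtime base of zero, the decoded address points somewhere else entirely, which is consistent with the early assertion failure shown above.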

> Does it improve or regress performance?

Neither. If the bug hits, we crash. If it does not, we already do the right thing today, and nothing changes with this patch.

Another answer is that it can improve performance: it makes it possible to run +UseCOH with a CDS archive, which improves startup.

> Should it be backported?

Yes, but this is based on the big class space rework patch from last summer (JDK-8312018) and a bunch of other patches. If those have been backported too, this should be backported as well.

@rkennke
Collaborator

rkennke commented Jan 24, 2024

Ok, thank you very much for the clarification, this is very useful! Patch is good to go!

@tstuefe
Member Author

tstuefe commented Jan 24, 2024

x86 error unrelated (and I think already fixed upstream).

Thanks @rkennke

/integrate

@openjdk

openjdk bot commented Jan 24, 2024

Going to push as commit e735376.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 24, 2024
@openjdk openjdk bot closed this Jan 24, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 24, 2024
@openjdk

openjdk bot commented Jan 24, 2024

@tstuefe Pushed as commit e735376.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
integrated Pull request has been integrated
2 participants