
8330027: Identity hashes of archived objects must be based on a reproducible random seed #18735

Conversation

tstuefe
Member

@tstuefe tstuefe commented Apr 11, 2024

The CDS archive contains archived objects with identity hashes.

These hashes are deliberately preserved or even generated during dumping. They are generated from a seed that is initialized randomly on a per-thread basis. These generations precede CDS dump initialization, so they are not affected by the init_random call there, nor would they be affected by JDK-8323900.

A random seed will not work for dumping archives, since it prevents reproducible archive generation. Therefore, when dumping, these seeds must be initialized in a reproducible way.

--- Update

After discussions with Ioi, and several redos, we settled on:

  • Make sure the CDS dump only ever calls ihash generation from one thread. That means we disable the explicit hash generation we had been doing before, since it was called from the VM thread, not from the single Java thread.
  • Start out with a constant seed for all threads. This is fine and does not cause collisions between threads, since - see above - we only call ihash generation from a single thread.
  • We also assert that we only use a single thread
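The scheme above (constant seed, single generating thread) can be sketched with a Marsaglia shift-xor stream of the kind HotSpot uses per thread. This is an illustrative sketch only; field names and seed constants are not the actual HotSpot code:

```cpp
#include <cstdint>

// Minimal sketch of a Marsaglia shift-xor (xorshift128-style) stream, the
// kind of per-thread generator HotSpot uses for identity hashes. Field
// names and seed constants are illustrative, not the actual HotSpot code.
struct HashStream {
  uint32_t x, y, z, w;
  explicit HashStream(uint32_t seed)
      : x(seed), y(842502087u), z(3579807591u), w(273326509u) {}
  uint32_t next() {
    uint32_t t = x ^ (x << 11);
    x = y; y = z; z = w;
    w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
    return w;
  }
};

// With a constant seed and all generation funneled through one thread, the
// i-th hash handed out is identical from run to run, which is exactly what
// a reproducible archive needs.
inline bool runs_agree(uint32_t seed, int n) {
  HashStream run1(seed), run2(seed);
  for (int i = 0; i < n; i++) {
    if (run1.next() != run2.next()) return false;
  }
  return true;
}
```

Because the generator is fully deterministic, identical seeds yield identical hash sequences across dump runs; randomness only enters through the seed.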

--- Update Update

The final version I plan to push does not have the above-mentioned assert, since it turned out to be too tricky and complex to get right, and not worth the trouble.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8330027: Identity hashes of archived objects must be based on a reproducible random seed (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18735/head:pull/18735
$ git checkout pull/18735

Update a local copy of the PR:
$ git checkout pull/18735
$ git pull https://git.openjdk.org/jdk.git pull/18735/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 18735

View PR using the GUI difftool:
$ git pr show -t 18735

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18735.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Apr 11, 2024

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Apr 11, 2024

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8330027: Identity hashes of archived objects must be based on a reproducible random seed

Reviewed-by: ccheung, iklam

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 48 new commits pushed to the master branch:

  • d654124: 8331932: Startup regressions in 23-b13
  • 7db6a3f: 8331905: Fix direct includes of g1_globals.hpp
  • f47fc86: 8331908: Simplify log code in vectorintrinsics.cpp
  • b9a142a: 8226990: GTK & Nimbus LAF: Tabbed pane's background color is not expected one when change the opaque checkbox.
  • d2d37c9: 8331942: On Linux aarch64, CDS archives should be using 64K alignment by default
  • a706ca4: 8329418: Replace pointers to tables with offsets in relocation bitmap
  • a643d6c: 8331862: Remove split relocation info implementation
  • d47a4e9: 8332008: Enable issuestitle check
  • 0bf7282: 8331231: containers/docker/TestContainerInfo.java fails
  • ffbdfff: 8331999: BasicDirectoryModel/LoaderThreadCount.java frequently fails on Windows in CI
  • ... and 38 more: https://git.openjdk.org/jdk/compare/f308e107ce8b993641ee3d0a0d5d52bf5cd3b94e...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title JDK-8330027: Identity hashes of archived objects must be based on a reproducable random seed 8330027: Identity hashes of archived objects must be based on a reproducable random seed Apr 11, 2024
@openjdk

openjdk bot commented Apr 11, 2024

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Apr 11, 2024
@tstuefe
Member Author

tstuefe commented Apr 11, 2024

The macOS aarch64 build error is unrelated (infra problem). I built and tested this patch locally on an M1 Mac, so it is tested.

@tstuefe
Member Author

tstuefe commented Apr 11, 2024

Ping @iklam, @calvinccheung

@tstuefe tstuefe marked this pull request as ready for review April 11, 2024 13:06
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 11, 2024
@mlbridge

mlbridge bot commented Apr 11, 2024

Webrevs

@rkennke
Contributor

rkennke commented Apr 11, 2024

Do we need to generate I-hash on archive generation at all?

@tstuefe
Member Author

tstuefe commented Apr 11, 2024

Do we need to generate I-hash on archive generation at all?

Yes, see

// We need to retain the identity_hash, because it may have been used by some hashtables

@rkennke
Contributor

rkennke commented Apr 11, 2024

Do we need to generate I-hash on archive generation at all?

Yes, see

// We need to retain the identity_hash, because it may have been used by some hashtables

I get that we need to preserve I-hashed. But why do we ever have to generate any new ones?

@tstuefe
Member Author

tstuefe commented Apr 11, 2024

Do we need to generate I-hash on archive generation at all?

Yes, see

// We need to retain the identity_hash, because it may have been used by some hashtables

I get that we need to preserve I-hashed. But why do we ever have to generate any new ones?

That is an excellent question, and here is what I understand:

  • The comment states we want to minimize the chance of runtime writes to the archived objects. I assume we do this to prevent COW on that page, such that we have the maximum effect of memory sharing between processes using the same archive.
  • But this relies on archive heap region mapping not needing relocation. If we relocate, we touch every object anyway. You only avoid relocation if you manage to map at the same address.
  • The loading code [1] seems to indicate that the CDS heap region is always allocated at the end of the heap (note that we ignore the "preferred_addr" argument). Therefore, we only get the same address iff the heap has the same size and starts at the same address as at dump time. The chances of that are there, but not great, since default heap size depends on context at runtime.
  • All of that is moot anyway since, with JDK-8294323, we started forcing relocation due to security reasons.

Therefore I think we don't have to actively generate ihashes.

@iklam What do you think?

[1]

HeapWord* G1CollectedHeap::alloc_archive_region(size_t word_size, HeapWord* preferred_addr) {

@dcubed-ojdk
Member

I fixed a typo in the bug's synopsis. The easiest way to update is with: /issue JDK-8330027.

@tstuefe
Member Author

tstuefe commented Apr 11, 2024

I fixed a typo in the bug's synopsis. The easiest way to update is with: /issue JDK-8330027.

Thanks Dan :)

/issue JDK-8330027

@openjdk openjdk bot changed the title 8330027: Identity hashes of archived objects must be based on a reproducable random seed 8330027: Identity hashes of archived objects must be based on a reproducible random seed Apr 11, 2024
@openjdk

openjdk bot commented Apr 11, 2024

@tstuefe This issue is referenced in the PR title - it will now be updated.

Member

@calvinccheung calvinccheung left a comment

Looks good.
nit: please update copyright header in thread.hpp.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 16, 2024
@tstuefe
Member Author

tstuefe commented Apr 18, 2024

@iklam @calvinccheung could you take another look please?

I rewrote this patch to be both minimally invasive and as bulletproof against concurrent activity as possible.

My thoughts on this:

  • Just initializing the global seed of os::random with a constant makes the ihash vulnerable to concurrent calls to os::random in unrelated threads. At the minimum, it makes us vulnerable to the order and number of thread starts, since each thread constructor called os::random to init its ihash seed.
  • My first patch version constified the ihash seed in the Thread constructor, but that still leaves us somewhat vulnerable to the same problem.
  • This version - also the simplest in my opinion - makes ihash seed generation lazy, on-demand only, the first time a thread generates an ihash. That is the most robust version: when dumping, only two threads ever generate ihashes - the initial Java thread and the VM thread. Since both run sequentially, not concurrently, the order of ihash generations is deterministic, and since we restrict seed initialization to the threads that actually generate ihashes, we can be reasonably sure of getting the same random numbers.
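The lazy-seeding idea in the last bullet can be sketched as follows. `GlobalRandom`, `ThreadSketch`, and `ihash_seed` are hypothetical stand-ins for the global os::random state and the per-thread hash state, not the PR's actual code:

```cpp
#include <cstdint>

// Hypothetical sketch of lazy per-thread seeding: the seed is drawn from
// the shared random stream only the first time a thread generates an
// identity hash, so threads that never hash never perturb that stream.
struct GlobalRandom {               // stand-in for os::random's global state
  uint32_t state = 12345u;
  uint32_t next() { state = state * 1664525u + 1013904223u; return state; }
};

struct ThreadSketch {
  uint32_t hash_seed = 0;           // 0 means "not yet seeded"
  uint32_t ihash_seed(GlobalRandom& rng) {
    if (hash_seed == 0) {
      hash_seed = rng.next() | 1u;  // lazy init on first use, kept nonzero
    }
    return hash_seed;
  }
};
```

The point of the design: a GC thread or any other early-started thread that never computes an identity hash never consumes a value from the shared stream, so the dumping threads see the same seeds regardless of how many unrelated threads were started before them.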

@iklam
Member

iklam commented Apr 18, 2024

@iklam @calvinccheung could you take another look please?

I rewrote this patch to be both minimally invasive and as bulletproof against concurrent activity as possible.

My thoughts on this:

  • just initializing the global seed of os::random with a constant makes the ihash vulnerable against concurrent calls to os::random in unrelated threads. At the minimum, it makes us vulnerable against the order of thread start and number of thread starts since each thread constructor did call os::random to init its ihash seed
  • My first patch version constified the ihash seed in the Thread constructor, but that still leaves us somewhat vulnerable against the same problem
  • This version - also the simplest in my opinion - makes ihash seed generation lazy, on-demand only, the first time a thread generates an ihash. That is the most robust version, since when dumping, only two threads ever generate ihashes - the initial java thread, and the VM thread. Since both run sequentially, not concurrently, the order of ihash generations is deterministic, and since we restrict the seed initialization to those threads that actually do generate ihashes, we can be reasonably safe of getting the same random numbers.

This PR doesn't help CDS in terms of making the contents of archived heap objects deterministic.

The dumping of archived heap objects is very sensitive to (Java thread) context switching: if two concurrent Java threads call System.identityHashCode() on the same object, then the hashcode inside the header of this object will have non-deterministic values. We cannot easily recover from this with post-processing, as the shapes of the archived hashtables are influenced by the hashcode, and the CDS code doesn't know how to repack the hashtables in Java.

As a result, during java -Xshare:dump, we disable the launching of all new Java threads, so there's only a single (main) Java thread running the whole time.

if (CDSConfig::is_dumping_static_archive()) {
    // During java -Xshare:dump, if we allow multiple Java threads to
    // execute in parallel, symbols and classes may be loaded in
    // random orders which will make the resulting CDS archive
    // non-deterministic.
    //
    // Luckily, during java -Xshare:dump, it's important to run only
    // the code in the main Java thread (which is NOT started here) that
    // creates the module graph, etc. It's safe to not start the other
    // threads which are launched by class static initializers
    // (ReferenceHandler, FinalizerThread and CleanerImpl).
    if (log_is_enabled(Info, cds)) {
      ResourceMark rm;
      oop t = JNIHandles::resolve_non_null(jthread);
      log_info(cds)("JVM_StartThread() ignored: %s", t->klass()->external_name());
    }

Even if we apply this PR, we still cannot run more than one Java thread. Conversely, if we stick to a single Java thread, then the complexity in this PR is not needed.

@tstuefe
Member Author

tstuefe commented Apr 19, 2024

@iklam @calvinccheung could you take another look please?
I rewrote this patch to be both minimally invasive and as bulletproof against concurrent activity as possible.
My thoughts on this:

  • just initializing the global seed of os::random with a constant makes the ihash vulnerable against concurrent calls to os::random in unrelated threads. At the minimum, it makes us vulnerable against the order of thread start and number of thread starts since each thread constructor did call os::random to init its ihash seed
  • My first patch version constified the ihash seed in the Thread constructor, but that still leaves us somewhat vulnerable against the same problem
  • This version - also the simplest in my opinion - makes ihash seed generation lazy, on-demand only, the first time a thread generates an ihash. That is the most robust version, since when dumping, only two threads ever generate ihashes - the initial java thread, and the VM thread. Since both run sequentially, not concurrently, the order of ihash generations is deterministic, and since we restrict the seed initialization to those threads that actually do generate ihashes, we can be reasonably safe of getting the same random numbers.

This PR doesn't help CDS in terms of making the contents of archived heap objects deterministic.

It does though, demonstrably. (comment removed) Once that is fixed, archives are non-deterministic for the reason stated in the PR description. And this PR makes the archive deterministic again. One could argue whether this is the simplest solution (and we do that below), but saying it does not help is just wrong, sorry.

The dumping of archived heap objects is very sensitive to (Java thread) context switching: if two concurrent Java threads call System.identityHashCode() on the same object, then the hashcode inside the header of this object will have non-deterministic values. We cannot easily recover from this with post-processing, as the shapes of the archived hashtables are influenced by the hashcode, and the CDS code doesn't know how to repack the hashtables in Java.

As a result, during java -Xshare:dump, we disable the launching of all new Java threads, so there's only a single (main) Java thread running the whole time

if (CDSConfig::is_dumping_static_archive()) {
    // During java -Xshare:dump, if we allow multiple Java threads to
    // execute in parallel, symbols and classes may be loaded in
    // random orders which will make the resulting CDS archive
    // non-deterministic.
    //
    // Luckily, during java -Xshare:dump, it's important to run only
    // the code in the main Java thread (which is NOT started here) that
    // creates the module graph, etc. It's safe to not start the other
    // threads which are launched by class static initializers
    // (ReferenceHandler, FinalizerThread and CleanerImpl).
    if (log_is_enabled(Info, cds)) {
      ResourceMark rm;
      oop t = JNIHandles::resolve_non_null(jthread);
      log_info(cds)("JVM_StartThread() ignored: %s", t->klass()->external_name());
    }

Even if we apply this PR, we still cannot run more than one Java thread. Conversely, if we stick to a single Java thread, then the complexity in this PR is not needed.

I get all that.

But the problem is not restricted to the one java thread.

We create ihashes from the single one java thread, as well as the VM thread. They don't run concurrently, so their order of ihash creation calls is fixed for two subsequent runs of the same JVM.

Okay.

We seed each thread's ihash RNG in Thread::Thread(). We seed it with os::random. os::random works with a global seed.

Okay, we can make that global seed constant too.

But we are betting that the number of os::random calls happening before the start of the Java thread and the VM thread is constant from run to run. It often is, but that is not guaranteed by any means. If it differs, the VM thread and the one Java thread get different seeds and therefore will generate different ihashes.

The order of os::random() calls can change due to multiple reasons:

  • Any variation in runtime conditions can lead to different code paths on which os::random is or is not called.
  • Concurrent threads could call os::random. There are concurrent threads, albeit no Java threads.
  • Since we call os::random in Thread::Thread(), any fluctuation that changes the number of threads started before the VM thread/Java thread will add to the number of prior os::random calls, thus shifting the RNG sequence.

I get that the chance for this happening is remote, but hunting sources of entropy is frustrating work, and the patch is really very simple. So, why not fix it? I don't share the opinion that this is added complexity.
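The entropy problem described in the bullets above can be illustrated with a stand-in linear congruential generator (not HotSpot's actual os::random implementation): a single extra draw by an unrelated caller shifts every subsequent value of a shared seeded stream.

```cpp
#include <cstdint>

// Stand-in LCG for a shared, globally seeded random stream (illustrative,
// not HotSpot's actual os::random).
struct SharedRandom {
  uint32_t state;
  uint32_t next() { state = state * 1664525u + 1013904223u; return state; }
};

// Simulates two dump runs from the same constant seed; in the second run an
// unrelated caller draws once before the dumping thread does. Returns true
// if the value handed to the dumping thread then differs between the runs.
inline bool extra_draw_changes_dump(uint32_t seed) {
  SharedRandom run1{seed}, run2{seed};
  (void)run2.next();            // run 2: one extra unrelated os::random call
  return run1.next() != run2.next();
}
```

This is why a constant global seed alone is not enough: every caller of the shared stream, not just the hashing threads, becomes part of the reproducibility contract.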

@iklam
Member

iklam commented Apr 22, 2024

I get that the chance for this happening is remote, but hunting sources of entropy is frustrating work, and the patch is really very simple. So, why not fix it? I don't share the opinion that this is added complexity.

Why not do it inside Thread::Thread()

// thread-specific hashCode stream generator state - Marsaglia shift-xor form
  if (CDSConfig::is_dumping_static_archive()) {
     _hashStateX = 0;
  } else {
     _hashStateX = os::random();
  }  

@tstuefe
Member Author

tstuefe commented Apr 22, 2024

I get that the chance for this happening is remote, but hunting sources of entropy is frustrating work, and the patch is really very simple. So, why not fix it? I don't share the opinion that this is added complexity.

Why not do it inside Thread::Thread()

// thread-specific hashCode stream generator state - Marsaglia shift-xor form
  if (CDSConfig::is_dumping_static_archive()) {
     _hashStateX = 0;
  } else {
     _hashStateX = os::random();
  }  

Because then it would inject os::random into the startup of every thread, not just of every thread that generates iHashes. So it would also fire for GC threads and other threads started before "our" threads. That would make our random sequence vulnerable to the order and number of threads started.

@tstuefe
Member Author

tstuefe commented Apr 23, 2024

I get that the chance for this happening is remote, but hunting sources of entropy is frustrating work, and the patch is really very simple. So, why not fix it? I don't share the opinion that this is added complexity.

Why not do it inside Thread::Thread()

// thread-specific hashCode stream generator state - Marsaglia shift-xor form
  if (CDSConfig::is_dumping_static_archive()) {
     _hashStateX = 0;
  } else {
     _hashStateX = os::random();
  }  

Because then it would inject os::random into the startup of every thread, not just of every thread that generates iHashes. So it would also fire for GC threads and other threads started before "our" threads. That would make our random sequence vulnerable to the order and number of threads started.

My last answer was rubbish, sorry, did not read your comment carefully enough.

Yes, your approach would also work, but it would lead to the two threads involved in dumping the archive - the VM thread and the one Java thread - using the same seed, hence generating the same sequence of ihashes. That, in turn, can lead to different archived objects carrying the same ihash, which may negatively impact performance later when the archive is used.
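The collision concern can be sketched with a stand-in xorshift32 stream (not HotSpot's actual generator): if both dumping threads started from the same constant seed, every hash handed out by one would be duplicated by the other, so distinct archived objects would share identity hashes.

```cpp
#include <cstdint>
#include <set>

// Stand-in xorshift32 stream (illustrative, not HotSpot's generator).
struct Stream {
  uint32_t s;
  uint32_t next() { s ^= s << 13; s ^= s >> 17; s ^= s << 5; return s; }
};

// Counts how many of the n hashes assigned by the second "thread" collide
// with a hash already assigned by the first, when both use the same seed.
inline int collisions_with_same_seed(uint32_t seed, int n) {
  Stream java_thread{seed}, vm_thread{seed};
  std::set<uint32_t> seen;
  int duplicates = 0;
  for (int i = 0; i < n; i++) {
    seen.insert(java_thread.next());
    if (!seen.insert(vm_thread.next()).second) duplicates++;
  }
  return duplicates;
}
```

Since both streams are identical, the i-th hash from the VM thread always equals the i-th hash from the Java thread: every single one collides.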

@iklam
Member

iklam commented Apr 23, 2024

I get that the chance for this happening is remote, but hunting sources of entropy is frustrating work, and the patch is really very simple. So, why not fix it? I don't share the opinion that this is added complexity.

Why not do it inside Thread::Thread()

// thread-specific hashCode stream generator state - Marsaglia shift-xor form
  if (CDSConfig::is_dumping_static_archive()) {
     _hashStateX = 0;
  } else {
     _hashStateX = os::random();
  }  

Because then it would inject os::random into the startup of every thread, not just of every thread that generates iHashes. So it would also fire for GC threads and other threads started before "our" threads. That would make our random sequence vulnerable to the order and number of threads started.

My last answer was rubbish, sorry, did not read your comment carefully enough.

Yes, your approach would also work, but it would lead to the two threads involved in dumping the archive - VMthread and the one Java thread - using the same seed, hence generating the same sequence of ihashes. That, in turn, can lead to different archived objects carrying the same ihash, which may negatively impact performance later when the archive is used.

I think it's better to just not compute the identity hash inside the VM thread. Here's what I tried

iklam@ad95e2e

We thought that forcing the identity hash computation would increase sharing across processes, as it would mean fewer updates of the object headers during run time. However, most of the heap objects in the CDS archive are not accessible by the application (they are part of the archived module graph, etc). Also the archive contains a large number of Strings, which are unlikely to need the identity hash (String has its own hashcode() method).

Since the reason is rather dubious, I think it's better to remove it and simplify the system.

@tstuefe
Member Author

tstuefe commented Apr 24, 2024

I get that the chance for this happening is remote, but hunting sources of entropy is frustrating work, and the patch is really very simple. So, why not fix it? I don't share the opinion that this is added complexity.

Why not do it inside Thread::Thread()

// thread-specific hashCode stream generator state - Marsaglia shift-xor form
  if (CDSConfig::is_dumping_static_archive()) {
     _hashStateX = 0;
  } else {
     _hashStateX = os::random();
  }  

Because then it would inject os::random into the startup of every thread, not just of every thread that generates iHashes. So it would also fire for GC threads and other threads started before "our" threads. That would make our random sequence vulnerable to the order and number of threads started.

My last answer was rubbish, sorry, did not read your comment carefully enough.
Yes, your approach would also work, but it would lead to the two threads involved in dumping the archive - VMthread and the one Java thread - using the same seed, hence generating the same sequence of ihashes. That, in turn, can lead to different archived objects carrying the same ihash, which may negatively impact performance later when the archive is used.

I think it's better to just not compute the identity hash inside the VM thread. Here's what I tried

iklam@ad95e2e

We thought that forcing the identity hash computation would increase sharing across processes, as it would mean fewer updates of the object headers during run time. However, most of the heap objects in the CDS archive are not accessible by the application (they are part of the archived module graph, etc). Also the archive contains a large number of Strings, which are unlikely to need the identity hash (String has its own hashcode() method).

Since the reason is rather dubious, I think it's better to remove it and simplify the system.

I like that, that is simpler. Okay, then we will only call ihash from a single thread, so a global constant seed should be fine. I should be able to assert that, right?

@iklam
Member

iklam commented Apr 24, 2024

I think it's better to just not compute the identity hash inside the VM thread. Here's what I tried
iklam@ad95e2e
We thought that forcing the identity hash computation would increase sharing across processes, as it would mean fewer updates of the object headers during run time. However, most of the heap objects in the CDS archive are not accessible by the application (they are part of the archived module graph, etc). Also the archive contains a large number of Strings, which are unlikely to need the identity hash (String has its own hashcode() method).
Since the reason is rather dubious, I think it's better to remove it and simplify the system.

I like that, that is simpler. Okay, then we will only call ihash from a single thread, so a global constant seed should be fine. I should be able to assert that, right?

I think an assert can be added, since we don't allow any Java threads to be launched. So even test cases that run arbitrary Java code during -Xshare:dump (using Java agents or -XX:ArchiveHeapTestClass) will not be able to run any Java code outside of the main Java thread.

@tstuefe tstuefe force-pushed the JDK-JDK-8330027-cds-ihash-reproducability branch from 45ee254 to ddfb218 Compare May 1, 2024 11:32
@openjdk

openjdk bot commented May 1, 2024

@tstuefe Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@tstuefe
Member Author

tstuefe commented May 1, 2024

@iklam Sorry, had to force push because I messed up the PR branch somehow.

The new version contains the change proposed by you - getting rid of explicit ihash generation when dumping - as well as a simplified version of my original patch. We are now back at generating the ihash seed in Thread::Thread(), with a constant when CDS dumps, and on ihash generation we check that it is only ever called from a single thread.

@iklam
Member

iklam commented May 1, 2024

@iklam Sorry, had to force push because I messed up the PR branch somehow.

I think it's possible to just discard your local branch, and check out a new version from the PR, then make changes on top of it. That way you can avoid forced pushes.

The new version contains the change proposed by you - getting rid of explicit ihash generation when dumping - as well as a simplified version of my original patch. We now are back at generating the ihash seed in Thread::Thread(), with a constant if CDS dumps, and on ihash generation we check that we ever only call it with a single thread.

This version looks good. I am running our tiers 1-4 just to be sure.

Thanks

@tstuefe
Member Author

tstuefe commented May 2, 2024

@iklam Sorry, had to force push because I messed up the PR branch somehow.

I think it's possible to just discard your local branch, and check out a new version from the PR, then make changes on top of it. That way you can avoid forced pushes.

It's possible; I do this a lot when I mess up locally. But in this case, I had already accidentally pushed a broken merge to my GH fork, and then saw it touched >5000 files and had >300 commits. No idea what went wrong, but the only way to fix it that I found was to locally redo the patch and switch the branch.

The new version contains the change proposed by you - getting rid of explicit ihash generation when dumping - as well as a simplified version of my original patch. We now are back at generating the ihash seed in Thread::Thread(), with a constant if CDS dumps, and on ihash generation we check that we ever only call it with a single thread.

This version looks good. I am running our tiers 1-4 just to be sure.

Cool. I like the new version more, it's simpler. Thank you!

Thanks

@tstuefe
Member Author

tstuefe commented May 2, 2024

@calvinccheung could you re-review this PR, please? It is simpler than the old version, but completely redone. Thank you!

// Verify that during CDS dumping, only a single thread
// ever calls ihash
assert(!CDSConfig::is_dumping_archive() || runs_on_one_thread_only(),
"Only one thread should generate ihash during CDS dumps");
Member

It should check for CDSConfig::is_dumping_static_archive() instead (here and in Thread::Thread()), to match the check in JVM_StartThread

JVM_ENTRY(void, JVM_StartThread(JNIEnv* env, jobject jthread))
#if INCLUDE_CDS
if (CDSConfig::is_dumping_static_archive()) {
)

Thread* const cur = Thread::current();
if (t == nullptr) {
Atomic::cmpxchg(&cds_dump_java_thread, t, cur);
return true;
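The fragment above reads as a compare-and-swap "first caller wins" check. A self-contained sketch of the same idea, with std::atomic standing in for HotSpot's Atomic and all names hypothetical, not the PR's actual code:

```cpp
#include <atomic>
#include <thread>

// Sketch of an "only one thread ever generates ihashes" check: the first
// caller publishes its thread id via compare-and-swap; later callers must
// match it. std::atomic stands in for HotSpot's Atomic; names are
// hypothetical.
static std::atomic<std::thread::id> g_first_caller{};

inline bool runs_on_one_thread_only() {
  std::thread::id expected{};                 // empty id: nobody called yet
  std::thread::id cur = std::this_thread::get_id();
  // The first caller publishes itself; later callers just read the winner.
  g_first_caller.compare_exchange_strong(expected, cur);
  return g_first_caller.load() == cur;
}
```

Repeated calls from the publishing thread keep succeeding; a call from any other thread fails the check.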
Member

I think this is not thread safe, as I got a few intermittent asserts.

Member Author

Hmm, it should not really need to be thread safe, since it should only ever be called from a single thread. Do the asserts always happen in the one and only java thread?

Member

I found the problem. A new JavaThread is created during VM exit. This thread is attached to the main OS thread:

Thread 2 "java" hit Breakpoint 3, JavaThread::JavaThread (this=0x7ffff002c4e0, is_attaching_via_jni=true) at javaThread.cpp:532
532	JavaThread::JavaThread(bool is_attaching_via_jni) : JavaThread() {
(gdb) where
#0  JavaThread::JavaThread (this=0x7ffff002c4e0, is_attaching_via_jni=true)
#1  attach_current_thread (vm=0x7ffff7c283d8 <main_vm>, penv=0x7ffff5358d50, _args=0x7ffff5358d60, daemon=false)
#2  jni_AttachCurrentThread (vm=0x7ffff7c283d8 <main_vm>, penv=0x7ffff5358d50, _args=0x7ffff5358d60)
#3  JavaVM_::AttachCurrentThread (this=0x7ffff7c283d8 <main_vm>, penv=0x7ffff5358d50, args=0x7ffff5358d60)
#4  jni_DestroyJavaVM_inner (vm=0x7ffff7c283d8 <main_vm>)
#5  jni_DestroyJavaVM (vm=0x7ffff7c283d8 <main_vm>)
#6  JavaMain ()
#7  ThreadJavaMain ()
#8  start_thread (arg=<optimized out>)
#9  clone3 ()

This thread can execute Java code (shutdown hooks) which can call Object::identityHashcode().

@tstuefe
Member Author

tstuefe commented May 7, 2024

@iklam I decided to remove the assert. I briefly thought about adding a time-based condition (don't assert after the dump has finished) but decided against more complexity.

Member

@iklam iklam left a comment

LGTM.

@tstuefe tstuefe requested a review from calvinccheung May 9, 2024 10:04
@tstuefe
Member Author

tstuefe commented May 9, 2024

I'll wait for @calvinccheung 's re-approval, then push.

Member

@calvinccheung calvinccheung left a comment

The latest version looks good. Thanks!


tstuefe commented May 10, 2024

Cool, thanks @calvinccheung !

/integrate


openjdk bot commented May 10, 2024

Going to push as commit 9f43ce5.
Since your change was applied there have been 48 commits pushed to the master branch:

  • d654124: 8331932: Startup regressions in 23-b13
  • 7db6a3f: 8331905: Fix direct includes of g1_globals.hpp
  • f47fc86: 8331908: Simplify log code in vectorintrinsics.cpp
  • b9a142a: 8226990: GTK & Nimbus LAF: Tabbed pane's background color is not expected one when change the opaque checkbox.
  • d2d37c9: 8331942: On Linux aarch64, CDS archives should be using 64K alignment by default
  • a706ca4: 8329418: Replace pointers to tables with offsets in relocation bitmap
  • a643d6c: 8331862: Remove split relocation info implementation
  • d47a4e9: 8332008: Enable issuestitle check
  • 0bf7282: 8331231: containers/docker/TestContainerInfo.java fails
  • ffbdfff: 8331999: BasicDirectoryModel/LoaderThreadCount.java frequently fails on Windows in CI
  • ... and 38 more: https://git.openjdk.org/jdk/compare/f308e107ce8b993641ee3d0a0d5d52bf5cd3b94e...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated label May 10, 2024
@openjdk openjdk bot closed this May 10, 2024
@openjdk openjdk bot removed the ready and rfr labels May 10, 2024

openjdk bot commented May 10, 2024

@tstuefe Pushed as commit 9f43ce5.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.


tstuefe commented May 10, 2024

/backport jdk22u


tstuefe commented May 10, 2024

/backport jdk21u-dev


openjdk bot commented May 10, 2024

@tstuefe Could not automatically backport 9f43ce5a to openjdk/jdk22u due to conflicts in the following files:

  • src/hotspot/share/cds/heapShared.cpp

Please fetch the appropriate branch/commit and manually resolve these conflicts by using the following commands in your personal fork of openjdk/jdk22u. Note: these commands are just some suggestions and you can use other equivalent commands you know.

# Fetch the up-to-date version of the target branch
$ git fetch --no-tags https://git.openjdk.org/jdk22u.git master:master

# Check out the target branch and create your own branch to backport
$ git checkout master
$ git checkout -b backport-tstuefe-9f43ce5a

# Fetch the commit you want to backport
$ git fetch --no-tags https://git.openjdk.org/jdk.git 9f43ce5a725b212cec0f3cd17491c4bada953676

# Backport the commit
$ git cherry-pick --no-commit 9f43ce5a725b212cec0f3cd17491c4bada953676
# Resolve conflicts now

# Commit the files you have modified
$ git add files/with/resolved/conflicts
$ git commit -m 'Backport 9f43ce5a725b212cec0f3cd17491c4bada953676'

Once you have resolved the conflicts as explained above continue with creating a pull request towards the openjdk/jdk22u with the title Backport 9f43ce5a725b212cec0f3cd17491c4bada953676.

Below you can find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit 9f43ce5a from the openjdk/jdk repository.

The commit being backported was authored by Thomas Stuefe on 10 May 2024 and was reviewed by Calvin Cheung and Ioi Lam.

Thanks!


openjdk bot commented May 10, 2024

@tstuefe Could not automatically backport 9f43ce5a to openjdk/jdk21u-dev due to conflicts in the following files:

  • src/hotspot/share/cds/archiveHeapWriter.cpp
  • src/hotspot/share/cds/heapShared.cpp

Please fetch the appropriate branch/commit and manually resolve these conflicts by using the following commands in your personal fork of openjdk/jdk21u-dev. Note: these commands are just some suggestions and you can use other equivalent commands you know.

# Fetch the up-to-date version of the target branch
$ git fetch --no-tags https://git.openjdk.org/jdk21u-dev.git master:master

# Check out the target branch and create your own branch to backport
$ git checkout master
$ git checkout -b backport-tstuefe-9f43ce5a

# Fetch the commit you want to backport
$ git fetch --no-tags https://git.openjdk.org/jdk.git 9f43ce5a725b212cec0f3cd17491c4bada953676

# Backport the commit
$ git cherry-pick --no-commit 9f43ce5a725b212cec0f3cd17491c4bada953676
# Resolve conflicts now

# Commit the files you have modified
$ git add files/with/resolved/conflicts
$ git commit -m 'Backport 9f43ce5a725b212cec0f3cd17491c4bada953676'

Once you have resolved the conflicts as explained above continue with creating a pull request towards the openjdk/jdk21u-dev with the title Backport 9f43ce5a725b212cec0f3cd17491c4bada953676.

Below you can find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit 9f43ce5a from the openjdk/jdk repository.

The commit being backported was authored by Thomas Stuefe on 10 May 2024 and was reviewed by Calvin Cheung and Ioi Lam.

Thanks!

Labels: hotspot-runtime, integrated
5 participants