8301106: Allow archived Java strings to be moved by GC #12607
src/hotspot/share/cds/heapShared.cpp
Outdated
void HeapShared::archive_strings() {
  oop shared_strings_array = StringTable::init_shared_table(_dumped_interned_strings);
  bool success = archive_reachable_objects_from(1, _default_subgraph_info, shared_strings_array, /*is_closed_archive=*/ false);
  guarantee(success, "shared strings array should not point to any unachivable objects");
typo: unachivable
Fixed.
Still working through this. A few comments.
src/hotspot/share/cds/heapShared.cpp
Outdated
void HeapShared::archive_strings() {
  oop shared_strings_array = StringTable::init_shared_table(_dumped_interned_strings);
  bool success = archive_reachable_objects_from(1, _default_subgraph_info, shared_strings_array, /*is_closed_archive=*/ false);
  guarantee(success, "shared strings array must point to only archivable objects");
What could cause this to fail? Do we need a more graceful bailout in release builds?
It should never fail. I could've used an assert but made it a guarantee since this is rather new code and I am paranoid.
So what stops us from hitting this:
// Don't archive a subgraph root that's too big. For archives static fields, that's OK
// as the Java code will take care of initializing this field dynamically.
return false;
in HeapShared::archive_reachable_objects_from?
I have updated the code to make it structurally impossible to reach this point, and added asserts to check for that. Explanations are here:
void HeapShared::archive_strings() {
  oop shared_strings_array = StringTable::init_shared_table(_dumped_interned_strings);
  bool success = archive_reachable_objects_from(1, _default_subgraph_info, shared_strings_array, /*is_closed_archive=*/ false);
  // We must succeed because:
  // - _dumped_interned_strings do not contain any large strings.
  // - StringTable::init_shared_table() doesn't create any large arrays.
  assert(success, "shared strings array must not point to arrays or strings that are too large to archive");
  StringTable::set_shared_strings_array_index(append_root(shared_strings_array));
}
There are other asserts in the handling of _dumped_interned_strings and shared_strings_array to check their object sizes.
@@ -73,7 +73,7 @@ const size_t REHASH_LEN = 100;
 const double CLEAN_DEAD_HIGH_WATER_MARK = 0.5;

 #if INCLUDE_CDS_JAVA_HEAP
-bool StringTable::_two_dimensional_shared_strings_array = false;
+bool StringTable::_is_two_dimensional_shared_strings_array = false;
Name seems excessive, but you're the one who had to keep typing it in.
Mailing list message from Ioi Lam on hotspot-runtime-dev: On 2/20/2023 5:16 PM, David Holmes wrote:
I moved the code block to a new function. Thanks
//
// [bits 31 .. 14][ bits 13 .. 0 ]
//  primary_index  secondary_index
const static int _secondary_array_index_bits = 14;
I am sure the secondary array bits are selected such that the secondary array is smaller than ArchiveHeapWriter::MIN_GC_REGION_ALIGNMENT.
But I believe it would make the code more robust if we calculate the secondary array length dynamically by determining the appropriate length of the array for a given size.
For instance, by querying objArrayOopDesc for the appropriate length of the array for an object of ArchiveHeapWriter::MIN_GC_REGION_ALIGNMENT bytes, and then setting _secondary_array_index_bits according to the result.
It would also alleviate the problem of how to best divide the 32 bits between secondary and primary array.
Does that make sense?
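One way to read this suggestion, as a rough sketch only: given a byte budget plus the array's header and element sizes, pick the largest power-of-two length that still fits and use its exponent as the number of index bits. The helper name max_index_bits and its parameters are hypothetical and are not HotSpot APIs; real code would query objArrayOopDesc, and the sizes shown in the usage note are illustrative, not actual HotSpot values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a HotSpot API): returns the largest `bits` such that
// an array of (1 << bits) elements, plus its header, fits in `budget_bytes`.
// Returns -1 if not even a one-element array fits.
inline int max_index_bits(std::size_t budget_bytes,
                          std::size_t header_bytes,
                          std::size_t element_bytes) {
  if (element_bytes == 0 || budget_bytes < header_bytes + element_bytes) {
    return -1;
  }
  const std::size_t max_elements = (budget_bytes - header_bytes) / element_bytes;
  int bits = 0;
  // Keep doubling the length while it still fits; cap at 30 so that
  // (1 << bits) remains a valid positive int.
  while (bits < 30 && (std::size_t{1} << (bits + 1)) <= max_elements) {
    ++bits;
  }
  return bits;
}
```

For example, with an assumed 1 MiB budget, a 16-byte header, and 8-byte elements, max_index_bits(1048576, 16, 8) yields 16, because 2^17 elements would exceed the budget once the header is counted.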
I wanted _secondary_array_index_bits to be a compile-time constant so that the related code can be optimized by the C++ compiler. However, calculating it at compile time would require the use of constexpr, but objArrayOopDesc::object_size(int) can't easily be made constexpr because it eventually reads the runtime constant heapOopSize.
Instead, I added a function StringTable::verify_secondary_array_index_bits() to check that its value isn't too big.
Note that I used 14 as that will be the eventual value when we support Shenandoah. Also, using a larger value means the SharedStringsStress test needs a larger data set, so I'm sticking with 14.
This will still support up to 28M strings so it should be OK.
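The trade-off described here (a compile-time constant for the bit count, plus a check once the sizes are known at run time) might look roughly like the sketch below. The constant names follow the diff but without the leading underscore; secondary_array_fits and its parameters are hypothetical and do not reproduce the body of StringTable::verify_secondary_array_index_bits().

```cpp
#include <cassert>
#include <cstddef>

// Compile-time constants, so shifts and masks fold into immediates.
constexpr int secondary_array_index_bits = 14;
constexpr int secondary_array_max_length = 1 << secondary_array_index_bits;  // 16384
constexpr int secondary_array_index_mask = secondary_array_max_length - 1;   // 0x3fff

// Hypothetical check (not the HotSpot function body): header and element sizes
// are only known at run time, so confirm then that a full secondary array fits
// within the given byte budget.
inline bool secondary_array_fits(std::size_t budget_bytes,
                                 std::size_t header_bytes,
                                 std::size_t element_bytes) {
  const std::size_t needed =
      header_bytes +
      static_cast<std::size_t>(secondary_array_max_length) * element_bytes;
  return needed <= budget_bytes;
}

// With both levels capped at 2^14 entries, the table can address
// 2^14 * 2^14 = 2^28 = 268,435,456 strings.
constexpr long long max_archived_strings =
    static_cast<long long>(secondary_array_max_length) * secondary_array_max_length;
static_assert(max_archived_strings == 268435456LL, "2^14 * 2^14 must equal 2^28");
```

Keeping the bit count constant lets the compiler fold the mask and shift, while the run-time check catches a configuration in which a full secondary array would be too large.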
@iklam thanks for adding those comments in StringTable::verify_secondary_array_index_bits.
…ng for large objects; added StringTable::verify_secondary_array_index_bits()
lgtm!
// refer to more than 16384 * 16384 = 26M interned strings! Not a practical concern
// but bail out for safety.
log_error(cds)("Too many strings to be archived: " SIZE_FORMAT, _items_count);
os::_exit(1);
Hmm, I hadn't really noticed that the CDS code has introduced its own direct exit path using os::_exit in a number of places. Why does it do this instead of using the existing VM exit paths?
The reason is explained in metaspaceShared.cpp, but perhaps we should consolidate all these direct exit calls (in a separate PR) into a utility function so it's easy to track (and explain) all the calls.
// There may be pending VM operations. We have changed some global states
// (such as vmClasses::_klasses) that may cause these VM operations
// to fail. For safety, forget these operations and exit the VM directly.
os::_exit(0);
vm_direct_exit would seem to be okay in that case.
Nothing further from me. Changes look good. Thanks.
const static int _secondary_array_index_mask = _secondary_array_max_length - 1;

// make sure _secondary_array_index_bits is not too big
static void verify_secondary_array_index_bits() PRODUCT_RETURN;
This needs to be NOT_DEBUG_RETURN, as the function definition is DEBUG only.
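To illustrate why the macro choice matters, here is a small stand-alone sketch. The SKETCH_ macro mirrors the general shape of HotSpot's NOT_DEBUG_RETURN (an empty body when asserts are off, a bare declaration when they are on), but the names SKETCH_ASSERT, SKETCH_NOT_DEBUG_RETURN, Table, and run_verify are hypothetical. The point is that the declaration and the definition must be guarded by the same condition; otherwise a build flavor that is neither product nor debug ends up with a declaration that has no definition and fails to link.

```cpp
#include <cassert>

// Stand-ins that mirror the shape of the HotSpot macros; all SKETCH_ names are
// hypothetical. Define SKETCH_ASSERT to simulate a debug build. Left undefined,
// this models any build with asserts turned off.
#ifdef SKETCH_ASSERT
#define SKETCH_NOT_DEBUG_RETURN /* next token must be ; */
#else
#define SKETCH_NOT_DEBUG_RETURN {}
#endif

static int verify_calls = 0;

struct Table {
  // A declaration only in debug builds; an empty inline body otherwise.
  static void verify_bits() SKETCH_NOT_DEBUG_RETURN;
};

#ifdef SKETCH_ASSERT
// The real definition exists only in debug builds, under the same guard
// that controls the declaration above.
void Table::verify_bits() { ++verify_calls; }
#endif

inline int run_verify() {
  Table::verify_bits();  // resolves in every build flavor
  return verify_calls;
}
```

Compiled without SKETCH_ASSERT, run_verify() calls the empty body and returns 0; compiled with -DSKETCH_ASSERT, it calls the real definition and returns 1.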
Thanks @ashu-mehra and @dholmes-ora for the review. I merged with the latest repo and passed tiers 1~4.
Going to push as commit b524a74.
Your commit was automatically rebased without conflicts.
Background:
Currently, the archived Java strings are mapped in the G1 "closed archive" region. This essentially pins all the strings in memory.
As a prerequisite for JDK-8296263, this PR removes the requirement of pinning the archived strings. This will allow the CDS archive heap to be mapped in garbage collectors that do not support object pinning.
Code changes:
- The archived strings are referenced from an array (_shared_strings_array) to keep them alive. As a result, it's no longer necessary to pin them.
- _shared_table in stringTable.cpp is modified to store a 32-bit index for each archived string. This index is used to retrieve the archived string from _shared_strings_array at runtime.
Note that CDS has a limit on the size of archived objArrays. When there's a large number of strings, we use a two-level table. See the comments around _shared_strings_array in the header file.
Testing
Tiers 1 - 4
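The two-level table mentioned under "Code changes" can be pictured with a small sketch. It assumes the 14-bit secondary index shown in the review thread; TwoLevelStrings, add, get, and the k-prefixed constants are hypothetical names, and std::vector and std::string stand in for the archived objArrays and Java strings. Unlike the real code, which switches to two levels only when there are many strings, this sketch always uses two levels.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

constexpr int kSecondaryBits = 14;                        // value used in the PR
constexpr uint32_t kSecondaryLen = 1u << kSecondaryBits;  // 16384
constexpr uint32_t kSecondaryMask = kSecondaryLen - 1;

// Hypothetical stand-in for the archived root array: a primary array of
// secondary arrays, each holding at most kSecondaryLen strings.
struct TwoLevelStrings {
  std::vector<std::vector<std::string>> primary;

  // Appends a string and returns the 32-bit index that locates it again.
  uint32_t add(const std::string& s) {
    if (primary.empty() || primary.back().size() == kSecondaryLen) {
      primary.emplace_back();  // start a new secondary array
    }
    primary.back().push_back(s);
    const uint32_t p = static_cast<uint32_t>(primary.size() - 1);
    const uint32_t q = static_cast<uint32_t>(primary.back().size() - 1);
    return (p << kSecondaryBits) | q;
  }

  // Splits the index into [primary | secondary] parts and fetches the string.
  const std::string& get(uint32_t index) const {
    return primary[index >> kSecondaryBits][index & kSecondaryMask];
  }
};
```

Because each secondary array is capped at a fixed length, no single array grows past the size limit, and the hash table only needs to store one 32-bit index per string.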
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12607/head:pull/12607
$ git checkout pull/12607
Update a local copy of the PR:
$ git checkout pull/12607
$ git pull https://git.openjdk.org/jdk.git pull/12607/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12607
View PR using the GUI difftool:
$ git pr show -t 12607
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12607.diff