Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8293980: Resolve CONSTANT_FieldRef at CDS dump time #19355

Closed
wants to merge 12 commits into from

Conversation

iklam
Copy link
Member

@iklam iklam commented May 22, 2024

Overview

This PR archives CONSTANT_FieldRef entries in the resolved state when it's safe to do so.

I.e., when a CONSTANT_FieldRef constant pool entry in class A refers to a non-static field B.F,

  • B is the same class as A; or
  • B is a supertype of A; or
  • B is one of the vmClasses, and A is loaded by the boot class loader.

Under these conditions, it's guaranteed that whenever A tries to use this entry at runtime, B is guaranteed to have already been resolved in A's system dictionary, to the same value as resolved during dump time.

Therefore, we can safely archive the ResolvedFieldEntry in class A that refers to B.F.

(Note that we do not archive the CONSTANT_FieldRef entries for static fields, as the resolution of such entries can lead to class initialization at runtime. We plan to handle them in a future RFE.)

Static CDS Archive

This feature is implemented in three steps for static CDS archive dump:

  1. At the end of the training run, ClassListWriter iterates over all loaded classes and writes the indices of their resolved Class and FieldRef constant pool entries into the classlist file, with the @cp prefix. E.g., the following means that the constant pool entries at indices 2, 19 and 106 were resolved during the training run:
@cp java/util/Objects 2 19 106
  1. When creating the static CDS archive from the classlist file, ClassListParser processes the @cp entries and resolves all the indicated entries.

  2. Inside the ArchiveBuilder::make_klasses_shareable() function, we iterate over all entries in all archived ConstantPools. When we see a resolved entry that does not satisfy the safety requirements as stated in Overview, we revert it back to the unresolved state.

Dynamic CDS Archive

When dumping the dynamic CDS archive, ClassListWriter and ClassListParser are not used, so steps 1 and 2 are skipped. We only perform step 3 when the archive is being written.

Limitations

  • For safety, we limit this optimization to only classes loaded by the boot, platform, and app class loaders. This may be relaxed in the future.
  • We archive only the constant pool entries that are actually resolved during the training run. We don't speculatively resolve other entries, as doing so may cause C2 to unnecessarily generate code for paths that are never taken by the application.

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8293980: Resolve CONSTANT_FieldRef at CDS dump time (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19355/head:pull/19355
$ git checkout pull/19355

Update a local copy of the PR:
$ git checkout pull/19355
$ git pull https://git.openjdk.org/jdk.git pull/19355/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19355

View PR using the GUI difftool:
$ git pr show -t 19355

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19355.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 22, 2024

👋 Welcome back iklam! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 22, 2024

@iklam This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8293980: Resolve CONSTANT_FieldRef at CDS dump time

Reviewed-by: erikj, matsaave, heidinga

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 22, 2024
@openjdk
Copy link

openjdk bot commented May 22, 2024

@iklam The following labels will be automatically applied to this pull request:

  • build
  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org build build-dev@openjdk.org labels May 22, 2024
@mlbridge
Copy link

mlbridge bot commented May 22, 2024

@iklam iklam marked this pull request as draft May 23, 2024 00:22
@openjdk openjdk bot removed the rfr Pull request is ready for review label May 23, 2024
@iklam iklam marked this pull request as ready for review May 23, 2024 03:30
@openjdk openjdk bot added the rfr Pull request is ready for review label May 23, 2024
@iklam
Copy link
Member Author

iklam commented May 23, 2024

I pressed the wrong button and sent out the RFR mail too soon ....

I have finished updating the PR description text. It's ready for review now.

Copy link
Member

@erikj79 erikj79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build change looks good.

/reviewers 2

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 23, 2024
@openjdk
Copy link

openjdk bot commented May 23, 2024

@erikj79
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label May 23, 2024
Copy link
Contributor

@matias9927 matias9927 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After making a quick first pass over this, I have some comments about the constant pool and cpcache code.

_cpool_index = other._cpool_index;
_tos_state = other._tos_state;
_flags = other._flags;
_get_code = other._get_code;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fields _get_code and _put_code are normally set atomically, does this need to be the case when copying as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done inside while the ResolvedFieldEntries are being prepared during class rewriting. All access is single threaded so there's no need for atomic operations.

@@ -298,20 +298,22 @@ objArrayOop ConstantPool::prepare_resolved_references_for_archiving() {

objArrayOop rr = resolved_references();
if (rr != nullptr) {
ConstantPool* orig_pool = ArchiveBuilder::current()->get_source_addr(this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the changes below necessary? I think the original was fine but I may be missing the point of this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just for consistency. "source" is the terminology used in the comments in archiveBuilder.cpp.

ResolvedFieldEntry* rfi = field_entries->adr_at(i);
int cp_index = rfi->constant_pool_index();
bool archived = false;
bool resolved = rfi->is_resolved(Bytecodes::_putfield) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is one of these meant to be is_resolved(Bytecodes::get_field) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

break;
}
}

if (cache() != nullptr) {
// cache() is null if this class is not yet linked.
cache()->remove_unshareable_info();
remove_resolved_field_entries_if_non_deterministic();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods look like they can belong to the constant pool cache instead. Can cpCache call the ClassLinker code instead so this can be part of cache()->remove_unshareable_info()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved remove_resolved_field_entries_if_non_deterministic() to cpCache as you suggested. I removed the functions for indy and method, as dumptime resolution for those types of entries is not yet implemented.

Copy link
Contributor

@matias9927 matias9927 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, thanks!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 28, 2024
# These are needed for deterministic classlist:
# - The classlist can be influenced by locale. Always set it to en/US.
# - Run with -Xint, as the compiler can speculatively resolve constant pool entries.
# - ForkJoinPool parallelism can cause constant pool resolution to be non-dererministic.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo

Suggested change
# - ForkJoinPool parallelism can cause constant pool resolution to be non-dererministic.
# - ForkJoinPool parallelism can cause constant pool resolution to be non-deterministic.

Comment on lines 2558 to 2560
// The ConstantPool is cleaned in a separate pass in ArchiveBuilder::make_klasses_shareable(),
// so no need to do it here.
//constants()->remove_unshareable_info();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be deleted rather than commented out?

Comment on lines +843 to +848
if (preresolve_class) {
ClassPrelinker::preresolve_class_cp_entries(THREAD, ik, &preresolve_list);
}
if (preresolve_fmi) {
ClassPrelinker::preresolve_field_and_method_cp_entries(THREAD, ik, &preresolve_list);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify the approach here?

As I read the code, ClassPrelinker::preresolve_class_cp_entries will walk the whole constant pool looking for unresolved class entries that match and then resolve them. ClassPrelinker::preresolve_field_and_method_cp_entries walks all methods bytecode by bytecode to resolve them.

Doesn't the preresolve_list already tell us which CP entries need to be resolved and the cp tag tell us the type of resolution to do? Can we not do this in a single pass over the cp rather than walking method bytecodes?

Is the reason for this approach to avoid always resolving FieldMethodRefs for both get and put and only do them if there's a corresponding bytecode?

Copy link
Member Author

@iklam iklam May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preresolve_list has the original CP indices (E.g., putfield #123 as stored in the classfile), but in HotSpot, after bytecode rewriting, the u2 following the bytecode is changed to an index into the cpcache()->_resolved_field_entries array, so it becomes something like putfield #45. So we need to know how to convert the 123 index to the 45 index.

We could walk _resolved_field_entries to find the ResolvedFieldEntry whose _cpool_index is 123. However, before the ResolvedFieldEntry is resolved, we don't know which bytecode is used to resolve it, so we don't know whether it's for a static field or non-static field. Since we want to filter out the static fields in the PR, we need to:

  • walk the bytecodes to find only getfield/putfield bytecodes
  • these bytecodes will give us an index to the _resolved_field_entries array
  • from there, we discover the original CP index
  • then we see if this index is set to true in preresolve_list

There's also a compatibility issue -- it's possible to have static and non-static field access using the same CP index:

class Hack {
static int myField;
int foo(boolean flag) {
    try {
        if (flag) {
            // throw IncompatibleClassChangeError
            return /* pseudo code*/ getfield this.myField;
        } else {
            // OK
            return /* pseudo code*/ getstatic Hack.myField;
        } 
    } catch (Throwable) {
        return 5678;
    }
}

So we must call InterpreterRuntime::resolve_get_put() which performs all the checks for access rights, static-vs-non-static, etc. This call requires a Method parameter, so we must walk all the Methods to find an appropriate one.

Perhaps it's possible to refactor the InterpreterRuntime code to avoid passing in a Method, but I am hesitant to do that with code that deals with access right checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could walk _resolved_field_entries to find the ResolvedFieldEntry whose _cpool_index is 123. However, before the ResolvedFieldEntry is resolved, we don't know which bytecode is used to resolve it, so we don't know whether it's for a static field or non-static field. Since we want to filter out the static fields in the PR, we need to:

  • walk the bytecodes to find only getfield/putfield bytecodes
  • these bytecodes will give us an index to the _resolved_field_entries array
  • from there, we discover the original CP index
  • then we see if this index is set to true in preresolve_list

Something's been bothering me about this explanation and I think I've put my finger on it. As you show, the same CP entry can be referenced by both getstatic & getfield bytecodes though only one will successfully resolve. Walking the bytecodes doesn't actually tell us anything - the resolution status should be different for instance vs static fields which means we're should always be safe to attempt the resolution of fields as instance fields provided we ignore errors.

So we must call InterpreterRuntime::resolve_get_put() which performs all the checks for access rights, static-vs-non-static, etc. This call requires a Method parameter, so we must walk all the Methods to find an appropriate one.

The Method parameter is necessary for puts to final fields - either <clinit> for static finals or an <init> method for instance finals. In either case, the we don't actually resolve the field for puts so it doesn't matter if we pass the "correct" method or not during pre resolution as it will never successfully complete. I think we'd be OK to send any method we want to that call when doing preresolution provided we ignore the errors

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the version in the Leyden repo, there are many different types of references that are handled in ClassPrelinker::maybe_resolve_fmi_ref

https://github.com/openjdk/leyden/blob/4faa72029abb86b55cb33b00acf9f3a18ade4b77/src/hotspot/share/cds/classPrelinker.cpp#L307

My goal is to defer all the safety checks to InterpreterRuntime::resolve_xxx so that we don't need to think about what is safe to pre-resolve, and what is not. Some of the checks are very complex (see linkResolver.cpp as well) and may change as the language evolve.

Copy link
Contributor

@DanHeidinga DanHeidinga May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current algorithm says:

for each bytecode in each method:
  switch(bytecode) {
    case getfield:
    case outfield:
      InterpreterRuntime::resolve_get_put(bc, raw_index, mh, cp, false /*initialize_holder*/, CHECK);
      break;
   ....
  }

What I'm proposing is:

for each ResolvedFieldEntry
   bool success = InterpreterRuntime::resolve_get_put(getfield, raw_index, nullptr /* mh */, cp, false /*initialize_holder*/, CHECK);
   if (success) {
     // also resolve for put
     InterpreterRuntime::resolve_get_put(putfield, raw_index, nullptr /* mh */, cp, false /*initialize_holder*/, CHECK);
   }

The method parameter is not critical as the "current" algorithm attempts resolution with multiple methods - once for each method that references the ResolvedFieldEntry. The resolution logic already has to handle dealing with different rules for different types of methods (ie <clinit> & <init>) for normal execution and "knows" not to resolve entries (like puts of field fields) regardless of the method as they need to do additional runtime checks on every access.

The same will apply to invoke bytecodes later.... it feels safer to do only what the bytecodes in some method have asked for but the runtime already has to be robust against different kinds of gets/puts or invokes targeting the same cp entries. By eagerly resolving we're not giving up any safety.

If you really want to only resolve the exact cases (ie: gets not puts, etc) that were resolved in the training run, then we need to write out as part of the classlist more explicitly what needs to be resolved:
ie:

@cp_resolved_gets 4, 7 8
@cp_resolved_puts 7 8 10

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. I will try to prototype it in the Leyden repo and then update this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried skipping the methodHandle parameter to InterpreterRuntime::resolve_get_put but it's more complicated than I thought.

  1. The fieldDescriptor::has_initialized_final_update() will return true IFF the class has putfield bytecodes to a final field outside of <clinit> methods. See here
  2. When InterpreterRuntime::resolve_get_put is called for a putfield, it adds putfield to the ResolvedFieldEntry only if fieldDescriptor::has_initialized_final_update() is false. See here
  3. InterpreterRuntime::resolve_get_putcalls LinkResolver::resolve_field(), which always checks if the methodHandle is <init> or not, without consulting fieldDescriptor::has_initialized_final_update(). See here

(2) is an optimization -- if a method sets final fields only inside <clinit> methods, we should fully resolve the putfield bytecodes. Otherwise every such putfield will result in a VM call.

(3) is for correctness -- make sure that only <init> can modify final fields.

I am pretty sure (2) and (3) are equivalent. I.e., we should check against the method in (3) only if fieldDescriptor::has_initialized_final_update() is true. However, (3) is security related code, so I don't want to change it inside an optimization PR like this one. Without fixing that, I cannot call InterpreterRuntime::resolve_get_put with a null methodHandle, as it will hit the assert.

This goes back to my original point -- I'd rather do something stupid but correct (call the existing APIs and live with the existing behavior), rather than trying to analyze the resolution code and see if we can skip certain checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying... and the fieldDescriptor::has_initialized_final_update() is a nice optimization that we don't want to mess up. Getting the straightforward code in makes sense as we can optimize it later given it runs at CDS archive build which isn't performance critical.

Down the line, we can add new entry points into the LinkResolver for use by CDS that refuse to resolve final fields and avoid this kind of issue but that's a future problem.

iklam added a commit to openjdk/leyden that referenced this pull request Jun 4, 2024
@iklam
Copy link
Member Author

iklam commented Jun 14, 2024

Thanks @erikj79 @matias9927 @DanHeidinga for the review
/integrate

@openjdk
Copy link

openjdk bot commented Jun 14, 2024

Going to push as commit b818679.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 14, 2024
@openjdk openjdk bot closed this Jun 14, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 14, 2024
@openjdk
Copy link

openjdk bot commented Jun 14, 2024

@iklam Pushed as commit b818679.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build build-dev@openjdk.org hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
4 participants