Native Image Committer and Community Meeting 2023-12-14 #8012

christianwimmer · 2023-12-12T22:26:51Z

List of all past and upcoming meetings: #3933

New and Noteworthy

Release branches for GraalVM for JDK 22 will be created early next week. After that, the master branch will track JDK 23 promoted builds, and the release branch will track JDK 22 promoted builds.
Shadow heap:
[GR-49410] Do not mark AnalysisType reachable when creating AnalysisField. #8011
[GR-50236] Seal shadow heap before heap layout. #7963
[GR-42996] Refactor ImageHeapConstant unwrapping. #7875
[GR-42996] Ensure no SubstrateObjectConstant in Graal graphs. #7847
It is no longer possible to run the image generator on the class path (which we supported for a while to allow frameworks to transition to the new module path mode):
[GR-30433] Disallow the deprecated environment variable USE_NATIVE_IMAGE_JAVA_PLATFORM_MODULE_SYSTEM=false. #7952
Foreign Function and Memory API support (OpenJDK Project Panama): Since support in native image is still under development, it needs to be enabled explicitly at image build time (and then some things will work, and some things will not work).
[GR-49655] Option for experimental Foreign Function and Memory API support. #7980
Static analysis:
[GR-50203] Track primitive values in the points-to analysis #7852 This is the first step to track predicates and primitive values by default in the static analysis, to improve analysis precision and usability. Since it does not give benefits without also tracking predicates, it is not enabled by default yet.
Compatibility
[GR-50267] Deprecate options that we plan to remove in a future GraalVM release. #7902
[GR-47376] Support exception handler entries reachable from normal control flow. #7832
[GR-49807] Modify substitutions for the SecurityManager to behave the same as the JDK when 'java.security.manager' is not set. #7805 System.setSecurityManager is no longer a fatal error. Treating it as fatal is no longer necessary since SecurityManager is deprecated in the JDK.
[GR-50426] Store signature of AnalysisMethod as resolved types. #7953 Having multiple classes with the same name at image build time still had some problems, because method signatures were still name-based and could get messed up.
[GR-50194] Disable MethodHandle inlining blocking. #7960 Some method handles created by JDK bootstrap methods have a "profiling mode", which is not useful for AOT compilation. This is a pretty crude fix because it disables the profiling also in the image builder, but solves some problems.
[GR-45806] Execute bootstrap methods at run time #7834 is merged now; see the November Deep Dive for details.
Monitoring / tools:
Debugging: we are working on a "Python helper script" for the new DWARF information, to simplify debugging with gdb. We had such helpers for the old enterprise-only debug information, and not having them made debugging unnecessarily difficult.
[GR-47007] Support Async Sampling-based Profiler on Darwin. #7897

Deep Dive: Shadow heap

At the end of the image build process, we need to write out the "image heap", i.e., the starting point of the application heap at run time. Most of that image heap is built already by the static analysis. We call that the "shadow heap". The shadow heap is a graph of ImageHeapConstant objects that store field/array values and reference each other.

Root pointers of the shadow heap:

Static fields that the analysis found as reachable: A TypeData object per class, referenced from AnalysisType.typeData.
Constants embedded in Graal compiler graphs of reachable methods. All such constants are collected in a single place to avoid iterating over all graphs repeatedly: AnalysisUniverse.embeddedRoots.
- After compilation, the embedded constant roots are recomputed from the compilation results to account for all constants created late, e.g., during snippet lowering.
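The two kinds of roots described above can be pictured with a small toy model. This is only a sketch under stated assumptions: ToyConstant and ToyShadowHeap are hypothetical stand-ins, not the real ImageHeapConstant, AnalysisType.typeData, or AnalysisUniverse.embeddedRoots, and the traversal is a plain worklist walk rather than the actual analysis.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for ImageHeapConstant: a node that references other nodes.
final class ToyConstant {
    final String label;
    final List<ToyConstant> references = new ArrayList<>();

    ToyConstant(String label) {
        this.label = label;
    }
}

// Hypothetical toy shadow heap with the two kinds of root pointers.
final class ToyShadowHeap {
    // Roots from static fields of classes that the analysis found reachable.
    final List<ToyConstant> staticFieldRoots = new ArrayList<>();
    // Roots from constants embedded in compiler graphs, collected in one place.
    final List<ToyConstant> embeddedRoots = new ArrayList<>();

    // Walks the graph from both kinds of roots and returns every constant found.
    Set<ToyConstant> collectReachable() {
        Set<ToyConstant> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<ToyConstant> worklist = new ArrayDeque<>();
        worklist.addAll(staticFieldRoots);
        worklist.addAll(embeddedRoots);
        while (!worklist.isEmpty()) {
            ToyConstant current = worklist.poll();
            if (seen.add(current)) {
                worklist.addAll(current.references);
            }
        }
        return seen;
    }
}
```

Keeping the embedded constants in their own list mirrors the point made above: the walk never has to iterate over the compiler graphs themselves.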

Graal compiler graphs only reference ImageHeapConstant. The old SubstrateObjectConstant will go away at some point (more precisely, it will only be used during Truffle JIT compilation).

AOT compilation (for the purpose of constant folding) and image heap writing only access the shadow heap, i.e., there is no more "backdoor" in any place that reads values directly from the HotSpot heap without going through the shadow heap.

Life cycle of an `ImageHeapConstant`

Shadow heap objects that have a hostedObject, i.e., an underlying object in the HotSpot heap:

Creating an ImageHeapConstant is side effect free. Even though the AnalysisType for its type is eagerly created, it may not be marked as reachable yet, and the object itself is not marked as reachable yet. But it is important that there is a unique mapping from a hosted object to a shadow heap object, maintained in ImageHeap.objectsCache.
Even before reading the hosted fields, an object can be referenced from a Graal compiler graph.
Reading of the hosted fields: All hosted fields are read in one batch, regardless of the field's individual reachability. That avoids inconsistencies when the hosting object is modified. Reading the hosted fields requires the type of the constant to be marked as reachable. The object itself is not yet reachable.
Reading the hosted fields invokes the registered FieldValueTransformer instances, i.e., the hosted fieldValues are already transformed values.
Field values of not-yet-reachable objects can be constant folded without making the object reachable.
Marking an object as reachable adds the object into the shadow heap graph, i.e., stores the value in the corresponding type set in ImageHeap.reachableObjects. The values of all fields marked as read are also processed and marked as reachable. The array elements are immediately processed and marked as reachable. This is done in parallel with the static analysis: a new object being reachable can make a new type reachable, which can make new methods reachable, which can make more objects reachable, ...
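The life cycle steps above (side-effect-free creation with a unique mapping, batch reading of fields, and transitive marking as reachable) can be sketched in a toy model. All names here (HostedNode, ShadowNode, ToyImageHeap) are hypothetical. The sketch ignores types, field value transformers, and parallelism, and it treats every field as read.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical object in the image builder's (HotSpot) heap.
final class HostedNode {
    String name;
    HostedNode next;
}

// Hypothetical stand-in for an ImageHeapConstant that has a hostedObject.
final class ShadowNode {
    final HostedNode hostedObject;
    boolean fieldsRead;
    String name;      // snapshot of hostedObject.name
    ShadowNode next;  // snapshot of hostedObject.next, as a shadow heap object
    boolean reachable;

    ShadowNode(HostedNode hostedObject) {
        this.hostedObject = hostedObject;
    }
}

final class ToyImageHeap {
    // Unique, identity-based mapping from hosted object to shadow heap object.
    private final Map<HostedNode, ShadowNode> objectsCache = new IdentityHashMap<>();

    // Creation only adds a cache entry; nothing is marked as reachable.
    ShadowNode getOrCreate(HostedNode hosted) {
        return objectsCache.computeIfAbsent(hosted, ShadowNode::new);
    }

    // Reads all hosted fields in one batch; later calls reuse the snapshot.
    void readFields(ShadowNode node) {
        if (!node.fieldsRead) {
            node.name = node.hostedObject.name;
            HostedNode hostedNext = node.hostedObject.next;
            node.next = hostedNext == null ? null : getOrCreate(hostedNext);
            node.fieldsRead = true;
        }
    }

    // Marks the object and everything it references as reachable.
    // Returns how many objects became reachable in this call.
    int markReachable(ShadowNode root) {
        int newlyReachable = 0;
        Deque<ShadowNode> worklist = new ArrayDeque<>();
        worklist.add(root);
        while (!worklist.isEmpty()) {
            ShadowNode current = worklist.poll();
            if (current.reachable) {
                continue;
            }
            current.reachable = true;
            newlyReachable++;
            readFields(current);
            if (current.next != null) {
                worklist.add(current.next);
            }
        }
        return newlyReachable;
    }
}
```

In this model, calling readFields on a node and then modifying the hosted object leaves the snapshot unchanged, and the node stays unreachable until markReachable is called. That corresponds to constant folding a field value without making the object reachable.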

Shadow heap objects without a hostedObject (currently only created by the simulation of class initializers):

Creating the ImageHeapConstant is side effect free. Field values can be freely modified as long as the object is not reachable.
No field value transformers or object replacers are invoked for such constants (because there are no hosted objects that could be passed to them).
Marking an object as reachable adds the object into the shadow heap graph; there is no difference here compared to objects with an underlying hosted object.
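A minimal sketch of the "freely modifiable until reachable" rule for objects without a hosted counterpart follows. SimulatedConstant is a hypothetical class, and rejecting writes after the object is reachable is an assumption made for illustration; the notes above only state that modification is safe while the object is not reachable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a shadow heap object that has no hostedObject,
// e.g. one produced by simulating a class initializer.
final class SimulatedConstant {
    private final Map<String, Object> fieldValues = new HashMap<>();
    private boolean reachable;

    // Field values may change freely while the object is not reachable.
    void setField(String name, Object value) {
        if (reachable) {
            // Assumption of this sketch: writes after publication are rejected.
            throw new IllegalStateException("object is already reachable: " + name);
        }
        fieldValues.put(name, value);
    }

    Object getField(String name) {
        return fieldValues.get(name);
    }

    // From here on the object is part of the shadow heap graph.
    void markReachable() {
        reachable = true;
    }

    boolean isReachable() {
        return reachable;
    }
}
```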

Field value transformations

The shadow heap reflects the values of fields as they will be when the application starts up. Several fields need to be transformed as part of the snapshotting process:

Resetting of fields to the default value so that unwanted image generator state does not leak into the image
Eager initialization of fields: do work once at build time instead of during every application run

We have multiple ways to transform a field value:

FieldValueTransformer is part of the supported API, and the preferred way of transformation. The API for registering transformers is Feature.BeforeAnalysisAccess#registerFieldValueTransformer.
ComputedValueField registered via the @Alias @RecomputeFieldValue annotation is the legacy way of registering a transformer. It has a couple of unwanted side effects, e.g., the declaring class is then also always initialized at image build time.
@UnknownObjectField and @UnknownPrimitiveField are internal annotations for fields whose value is only available after static analysis.

Transformers are also registered automatically by the com.oracle.svm.hosted.substitute.UnsafeAutomaticSubstitutionSupport, to intercept field and array offsets stored in static final fields of classes that are initialized at image build time.
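The two uses of a transformer named above, resetting a field and eagerly initializing it, can be sketched as follows. To stay self-contained, the sketch declares its own ToyFieldValueTransformer interface instead of importing the GraalVM SDK. Its transform(receiver, originalValue) shape is an assumption modeled on the SDK's FieldValueTransformer and should be checked against the SDK documentation. ToyCacheHolder and ToyTransformers are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the SDK interface; the real one is not imported in this sketch.
@FunctionalInterface
interface ToyFieldValueTransformer {
    Object transform(Object receiver, Object originalValue);
}

// Hypothetical class whose fields need transformation when snapshotted.
final class ToyCacheHolder {
    // Image generator state that should not leak into the image.
    List<String> lazyCache = new ArrayList<>();
    // Value that can be computed once at build time.
    int[] lookupTable;
}

final class ToyTransformers {
    // Reset: drop whatever the builder accumulated and use the default value.
    static final ToyFieldValueTransformer RESET_TO_NULL =
            (receiver, originalValue) -> null;

    // Eager initialization: compute the table at build time if it is missing.
    static final ToyFieldValueTransformer EAGER_TABLE =
            (receiver, originalValue) -> originalValue != null ? originalValue : buildTable(4);

    // Hypothetical build-time computation: a table of squares.
    static int[] buildTable(int size) {
        int[] table = new int[size];
        for (int i = 0; i < size; i++) {
            table[i] = i * i;
        }
        return table;
    }
}
```

In a real Feature, such transformers would be registered per field through Feature.BeforeAnalysisAccess#registerFieldValueTransformer, as stated above.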

Availability of field values

Some transformed field values are not available yet during static analysis, because they depend on analysis results. For example, the instance field offset (accessible via Unsafe) can only be known once we know which fields are reachable.

The non-API FieldValueTransformerWithAvailability allows providing field values only after static analysis. For @UnknownObjectField and @UnknownPrimitiveField annotated fields, the hosted value is only read based on the specified availability. The user is responsible for ensuring that this does not introduce new types, methods, or fields as reachable that the static analysis did not see. It is easy to make mistakes, so this is not API on purpose.
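The idea of a value that only becomes available after analysis can be illustrated with a toy model of the field offset example. Everything here is hypothetical: ToyPhase, ToyTransformerWithAvailability, ToyOffsetTransformer, and ToyFieldReader do not mirror the internal FieldValueTransformerWithAvailability interface, and the "8 bytes per reachable field" layout is invented for the example.

```java
import java.util.List;

// Hypothetical build phases, in order.
enum ToyPhase { DURING_ANALYSIS, AFTER_ANALYSIS }

// Hypothetical transformer that declares when its value can be computed.
interface ToyTransformerWithAvailability {
    ToyPhase availableFrom();

    Object transform(Object receiver, Object originalValue);
}

// Computes a field offset, which depends on the set of reachable fields.
final class ToyOffsetTransformer implements ToyTransformerWithAvailability {
    private final String fieldName;
    private final List<String> reachableFields; // filled in by the toy "analysis"

    ToyOffsetTransformer(String fieldName, List<String> reachableFields) {
        this.fieldName = fieldName;
        this.reachableFields = reachableFields;
    }

    @Override
    public ToyPhase availableFrom() {
        return ToyPhase.AFTER_ANALYSIS;
    }

    @Override
    public Object transform(Object receiver, Object originalValue) {
        // Invented layout: 8 bytes per reachable field, in discovery order.
        return reachableFields.indexOf(fieldName) * 8;
    }
}

final class ToyFieldReader {
    // Marker for "value cannot be read in the current phase".
    static final Object NOT_AVAILABLE = new Object();

    // Only invokes the transformer once its declared availability is reached.
    static Object read(ToyTransformerWithAvailability transformer, ToyPhase current,
                       Object receiver, Object originalValue) {
        if (current.compareTo(transformer.availableFrom()) < 0) {
            return NOT_AVAILABLE;
        }
        return transformer.transform(receiver, originalValue);
    }
}
```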

Object replacer

Part of the supported API, registered via Feature.DuringSetupAccess.registerObjectReplacer. Every object replacer gets invoked for every object, before the ImageHeapConstant is created. If an object replacer maps two distinct hosted objects to the same replacement, there is only one ImageHeapConstant.

Object replacers are often misused as an "object reachable" mechanism too. But keep in mind that an object replacer might see an object that ends up as not reachable in the end. We do not have a proper reachability handler for objects yet, but we want to add that in the future, to avoid such corner cases. And considering issues such as quarkusio/quarkus#37498 we might need to do it sooner rather than later, i.e., even for the upcoming release.
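The replacement behavior described in this section (every replacer sees every object before the constant is created, and two hosted objects with the same replacement share one constant) can be sketched with a toy model. ToyReplacerHeap and ToyReplacedConstant are hypothetical, and applying the replacers in registration order is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the ImageHeapConstant of a (replaced) hosted object.
final class ToyReplacedConstant {
    final Object hostedObject;

    ToyReplacedConstant(Object hostedObject) {
        this.hostedObject = hostedObject;
    }
}

final class ToyReplacerHeap {
    private final List<Function<Object, Object>> replacers = new ArrayList<>();
    // Keyed by the identity of the replacement, not of the original object.
    private final Map<Object, ToyReplacedConstant> constants = new IdentityHashMap<>();

    void registerObjectReplacer(Function<Object, Object> replacer) {
        replacers.add(replacer);
    }

    // Runs every replacer first, then creates or reuses the constant.
    // Note that no reachability decision is involved at this point.
    ToyReplacedConstant getOrCreate(Object hosted) {
        Object replaced = hosted;
        for (Function<Object, Object> replacer : replacers) {
            replaced = replacer.apply(replaced);
        }
        return constants.computeIfAbsent(replaced, ToyReplacedConstant::new);
    }
}
```

Because getOrCreate runs the replacers without any reachability check, the sketch also shows why a replacer is a poor "object reachable" hook: it fires for objects that may never become reachable.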

"Verification" of the shadow heap

Ideally, once a hosted value is read, that value never changes in the hosted HotSpot heap (if you know a field is, e.g., a lazily initialized cache, either reset or eagerly initialize the field using a field value transformer).

But in reality, we are still missing many places where a field value transformer would be necessary. So for now, we repeatedly re-read hosted values and check for changes:

During analysis: when the analysis reaches a fixed point, all values are re-read. This can trigger another analysis iteration. All such new values are correctly reflected in the shadow heap.
After analysis: we also re-read the hosted values several times after analysis. Any changes to the shadow heap here are dangerous, because they must not make new elements reachable. Future work is to gradually remove these re-reads: first print warnings, then print errors, and then when all known problems are fixed remove the re-reads.
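The re-read-and-compare step can be sketched as a small snapshot verifier. This is a toy under stated assumptions: ToySnapshotVerifier is hypothetical, hosted fields are modeled as suppliers, and values are compared with Objects.equals, which is not necessarily how the real verification compares them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

final class ToySnapshotVerifier {
    // One reader per tracked field; each reads the current hosted value.
    private final Map<String, Supplier<Object>> hostedReaders = new HashMap<>();
    // The values as they were last recorded in the (toy) shadow heap.
    private final Map<String, Object> snapshot = new HashMap<>();

    // Starts tracking a field and records its current value.
    void track(String field, Supplier<Object> reader) {
        hostedReaders.put(field, reader);
        snapshot.put(field, reader.get());
    }

    // Re-reads all hosted values, updates the snapshot, and returns the
    // names of the fields whose value changed since the last read.
    List<String> rescan() {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Supplier<Object>> entry : hostedReaders.entrySet()) {
            Object current = entry.getValue().get();
            Object previous = snapshot.put(entry.getKey(), current);
            if (!Objects.equals(previous, current)) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

During analysis, a non-empty result would correspond to triggering another analysis iteration. After analysis, the same result is where a warning (and later an error) would be reported.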

When are types and fields marked as reachable

Before #8011, merely creating an AnalysisField already made the declaring class reachable. That is bad because creating metadata elements should really be side-effect free, so fixing that was an important priority. Now we can create AnalysisType, AnalysisField, AnalysisMethod, and ImageHeapConstant instances without side effects.

Marking a field as reachable also makes the type reachable.
Marking a shadow heap object as reachable also makes the type reachable.
Constant folding a field makes the field reachable. That rule makes sense, but can lead to surprises: fields are constant folded already during bytecode parsing, and methods can be parsed even though they end up as not reachable in the end (like when inlining before analysis proactively parses and attempts method inlining).
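The three rules above can be written down as a toy model. ToyReachability is hypothetical and represents types and fields as plain strings. It is only meant to show which action has which side effect, including the surprising one: folding a field during parsing marks it reachable even if the parsed method is later discarded.

```java
import java.util.HashSet;
import java.util.Set;

final class ToyReachability {
    final Set<String> reachableTypes = new HashSet<>();
    final Set<String> reachableFields = new HashSet<>();

    // Creating metadata is side-effect free: nothing is recorded here.
    static String createField(String declaringType, String name) {
        return declaringType + "." + name;
    }

    // Marking a field as reachable also makes its declaring type reachable.
    void markFieldReachable(String declaringType, String name) {
        reachableFields.add(createField(declaringType, name));
        reachableTypes.add(declaringType);
    }

    // Marking a shadow heap object as reachable also makes its type reachable.
    void markObjectReachable(String type) {
        reachableTypes.add(type);
    }

    // Constant folding returns the value and marks the field reachable,
    // regardless of whether the method being parsed ends up reachable.
    Object constantFold(String declaringType, String name, Object value) {
        markFieldReachable(declaringType, name);
        return value;
    }
}
```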
