Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8264774: Implementation of Foreign Function and Memory API (Incubator) #3699

Closed
wants to merge 38 commits into from

Conversation

mcimadamore
Copy link
Contributor

@mcimadamore mcimadamore commented Apr 26, 2021

This PR contains the API and implementation changes for JEP-412 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment.

[1] - https://openjdk.java.net/jeps/412


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8264774: Implementation of Foreign Function and Memory API (Incubator)

Reviewers

Contributors

  • Paul Sandoz <psandoz@openjdk.org>
  • Jorn Vernee <jvernee@openjdk.org>
  • Vladimir Ivanov <vlivanov@openjdk.org>
  • Athijegannathan Sundararajan <sundar@openjdk.org>
  • Chris Hegarty <chegar@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3699/head:pull/3699
$ git checkout pull/3699

Update a local copy of the PR:
$ git checkout pull/3699
$ git pull https://git.openjdk.java.net/jdk pull/3699/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3699

View PR using the GUI difftool:
$ git pr show -t 3699

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3699.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 26, 2021

👋 Welcome back mcimadamore! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 26, 2021

@mcimadamore The following labels will be automatically applied to this pull request:

  • core-libs
  • hotspot
  • nio
  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added security security-dev@openjdk.org hotspot hotspot-dev@openjdk.org nio nio-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Apr 26, 2021
@mcimadamore
Copy link
Contributor Author

mcimadamore commented Apr 26, 2021

Here we list the main changes introduced in this PR. As usual, a big thank to all who helped along the way: @ChrisHegarty, @iwanowww, @JornVernee, @PaulSandoz and @sundararajana.

Managing memory segments: ResourceScope

This PR introduces a new abstraction (first discussed here), namely ResourceScope which is used to manage the lifecycle of resources associated with off-heap memory (such as MemorySegment, VaList, etc). This is probably the biggest change in the API, as MemorySegment is no longer AutoCloseable: it instead features a scope accessor which can be used to access the memory segment's ResourceScope; the ResourceScope is the new AutoCloseable. In other words, code like this:

try (MemorySegment segment = MemorySegment.allocateNative(100)) {
   ...
}   

Now becomes like this:

try (ResourceScope scope = ResourceScope.ofConfinedScope()) {
   MemorySegment segment = MemorySegment.allocateNative(100, scope);
   ...
}   

While simple cases where only one segment is allocated become a little more verbose, this new API idiom obviously scales much better when multiple segments are created with the same lifecycle. Another important fact, which is captured by the name of the ResourceScope factory in the above snippet, is that segments no longer feature dynamic ownership changes. These were cool, but ultimately too expensive to support in the shared case. Instead, the API now requires clients to make a choice upfront (confined, shared or implicit - where the latter means GC-managed, like direct buffers).

Implementation-wise, ResourceScope is implemented by a bunch of internal classes: ResourceScopeImpl, ConfinedScope and SharedScope. A resource scope impl has a so called resource list which can be also shared or confined. This is where cleanup actions are added; the resource list can be attached to a Cleaner to get implicit deallocation. There is a new test TestResourceScope to stress test the behavior of resource scopes, as well as a couple of microbenchmarks to assess the cost of creating/closing scopes (ResourceScopeClose) and acquiring/releasing them (BulkMismatchAcquire).

IO operation on shared buffer views

In the previous iteration of the Memory Access API we have introduced the concept of shared segments. Shared segments are as easy to use as confined ones, and they are as fast. One problem with shared segments was that it wasn't clear how to support IO operations on byte buffers derived from such segments: since the segment memory could be released at any time, there was simply no way to guarantee that a shared segment could not be closed in the middle of a (possibly async) IO operation.

In this iteration, shared segments are just segments backed by a shared resource scope. The new API introduces way to manage the new complexity, in the form of two methods ResourceScope::acquire and ResourceScope::release, respectively, which can be used to acquire a resource scope. When a resource scope is in the acquired state, it cannot be closed (you can think of it as some slightly better and asymmetric form of an atomic reference counter).

This means we are now finally in a position to add support for IO operations on all byte buffers, including those derived from shared segments. A big thank to @ChrisHegarty who lead the effort here. More info on this work are included in his writeup.

Most of the implementation for this feature occurs in the internal NIO packages; a new method on Buffer has been added to facilitate acquiring from NIO code - most of the logic associated with acquiring is in the IOUtil class. @ChrisHegarty has added many good tests for scoped IO operations under the foreign/channels folder, to check for all possible buffer/scope flavors.

Allocating at speed: SegmentAllocator

Another abstraction introduced in this JEP is that of SegmentAllocator. A segment allocator is a functional interface which can be used to tell other APIs (and, crucially, the CLinker API) how segments should be allocated, if the need arise. For instance, think about some code which turns a Java string into a C string. Such code will invariably:

  1. allocate a memory segment off heap
  2. bulk copy (where possible) the content of the Java string into the off-heap segment
  3. add a NULL terminator

So, in (1) such string conversion routine need to allocate a new off-heap segment; how is that done? Is that a call to malloc? Or something else? In the previous iteration of the Foreign Linker API, the method CLinker::toCString had two overloads: a simple version, only taking a Java string parameter; and a more advanced version taking a NativeScope. A NativeScope was, at its core, a custom segment allocator - but the allocation scheme was fixed in NativeScope as that class always acted as an arena-style allocator.

SegmentAllocator is like NativeScope in spirit, in that it helps programs allocating segments - but it does so in a more general way than NativeScope, since a SegmentAllocator is not tied to any specific allocation strategy: in fact the strategy is left there to be defined by the user. As before, SegmentAllocator does provide some common factories, e.g. to create an arena allocator similar to NativeScope - but the CLinker API is now free to work with any implementations of the SegmentAllocator interface. This generalization is crucial, given that, when operating with off-heap memory, allocation performance is often the bottleneck.

Not only is SegmentAllocator accepted by all methods in the CLinker API which need to allocate memory: even the behavior of downcall method handle can now be affected by segment allocators: when linking a native function which returns a struct by value, the CLinker API will in fact need to dynamically allocate a segment to hold the result. In such cases, the method handle generated by CLinker will now accept an additional prefix parameter of type SegmentAllocator which tells CLinker how should memory be allocated for the result value. For instance, now clients can tell CLinker to return structs by value in heap segments, by using a SegmentAllocator which allocates memory on the heap; this might be useful if the segment is quickly discarded after use.

There's not much implementation for SegmentAllocator as most of it is defined in terms of default methods in the interface itself. However we do have implementation classes for the arena allocation scheme (ArenaAllocator.java). We support confined allocation and shared allocation. The shared allocation achieves lock-free by using a ThreadLocal of confined arena allocators. TestSegmentAllocators is the test which checks most of the arena allocation flavors.

MemoryAddress as scoped entities

A natural consequence of introducing the ResourceScope abstraction is that now not only MemorySegment are associated with a scope, but even instances of MemoryAddress can be. This means extra safety, because passing addresses which are associated with a closed scope to a native function will issue an exception. As before, it is possible to have memory addresses which the runtime knows nothing about (those returned by native calls, or those created via MemoryAddress::ofLong); these addresses are simply associated with the so called global scope - meaning that they are not actively managed by the user and are considered to be "always alive" by the runtime (as before).

Implementation-wise, you will now see that MemoryAddressImpl is no longer a pair of Object/long. It is now a pair of MemorySegment/long. The MemorySegment, if present, tells us which segment this address has been obtained from (and hence which scope is associated with the address). If null, if means that the address has no segment, and therefore is associated with the global scope. The long part acts as an offset into the segment (if segment is non-null), or as an absolute address. A new test SafeFunctionAccessTest attempts to call native functions with (closed) scoped addresses to see if exceptions are thrown.

Virtual downcall method handles

There are cases where the address of a downcall handle cannot be specified when a downcall method handle is linked, but can only be known subsequently, by doing more native calls. To better support these use cases, CLinker now provides a factory for downcall method handles which does not require any function entry point. Instead, such entry point will be provided dynamically, via an additional prefix parameter (of type MemoryAddress). Many thanks to @JornVernee who implemented this improvement.

The implementation changes for this range from tweaking the Java ABI support (to make sure that the prefix argument is handled as expected), to low-level hotspot changes to parameterize the generated compiled stub to use the address (dynamic) parameter. Note that regular downcall method handles (the ones that are bound to an address) are now simply obtained by getting a "virtual" method handle, and inserting a MemoryAddress coordinate in the first argument position. TestVirtualCalls has been written explicitly to test dynamic passing of address parameters but, in reality, all existing downcall tests are stressing the new implementation logic (since, as said before, the old logic is expressed as an adaptation of the new virtual method handles). The benchmark we used to test downcall performances CallOverhead has now been split into two: CallOverheadConstant vs. CallOverheadVirtual.

Optimized upcall support

The previous iteration of the Foreign Linker API supported intrinsification of downcall handles, which allows calls to downcall method handles to perform as fast as a regular JNI call. The dual case, calling Java code from native code (upcalls) was left unoptimized. In this iteration, @JornVernee has added intrinsics support for upcalls as well as downcalls, based on some prior work from @iwanowww. As for downcalls, a lot of the adaptation now happens in Java code, before we jump into the target method handle. As for the code which calls such target handle, changes have been made so that the native code can jump to the optimized entry point (if one exists) for such method handle more directly. The performance improvements with this new approach are rather nice, with CLinker upcalls being 3x-4x faster compared with regular upcalls via JNI.

Again, here we have changes in the guts of the Java ABI support, as we needed to adjust the method handle specialization logic to be able to work in two directions (both from Java to native and from native to Java). On the Hotspot front, the optimization changes are in universalUpcallHandler_x86_64.cpp.

Accessing restricted methods

It is still the case that some of the methods in the API are "restricted" and access to these methods is disabled by default. In previous iterations, access to such methods was granted by setting a JDK read-only runtime property: -Dforeign.restricted=permit. In this iteration we have refined the story for accessing restricted methods (thanks @sundararajana ), by introducing a new experimental command line option in the Java launcher, namely --enable-native-access=<module list>. This options accepts a list of modules (separated by commas), where a module name can also be ALL-UNNAMED (for the unnamed module). Adding this command line flag to the launcher has the effect of allowing access to restricted methods to a given set of modules (the list of modules specified in the command line option). Access to restricted methods from any other module not in the list is disallowed and will result in an IllegalAccessException.

When implementing this flag we considered two options: adding some resolution-time checks in the JVM (e.g. in linkResolver); or use @CallerSensitive methods. In the end we opted for the latter given that @CallerSensitive are generally well understood and optimized, and the general feeling was that inventing another form of callsite-dependent check might have been unnecessarily risky, given that the same checks can be implemented in Java using @CallerSensitive. We plan (not in 17) to add javadoc support by means of an annotation (like we do for preview API methods) so that the text that is currently copied and pasted in all restricted methods can be inferred automagically by javadoc.

GitHub testing status

Most platforms build and tests pass. There are a bunch of additional Linux platforms which do not yet work correctly:

  • ppc
  • s390

These are slightly harder to fix, as this patch moves some static functions (e.g. long_move, float_move) up to SharedRuntime; unfortunately, while most platforms use the same signatures for these function, on ppc and s390 that's not the case and function with same name, but incompatible signatures are defined there, leading to build issues. I filed an issue for this:

https://bugs.openjdk.java.net/browse/JDK-8266257

And I plan to fix this after integration; I already have the code working - see:

https://github.com/mcimadamore/jdk/actions/runs/793241141

But, after discussing with @iwanowww, we decided it would be best to wait for integration, as moving these routines from SharedRuntime to MacroAssembler will create noise in the hotspot code for this PR.

@mcimadamore
Copy link
Contributor Author

mcimadamore commented Apr 27, 2021

/contributor add @PaulSandoz

@openjdk
Copy link

openjdk bot commented Apr 27, 2021

@mcimadamore
Contributor Paul Sandoz <psandoz@openjdk.org> successfully added.

@mcimadamore
Copy link
Contributor Author

/csr

@mcimadamore
Copy link
Contributor Author

/contributor add @JornVernee

@mcimadamore
Copy link
Contributor Author

/contributor add @iwanowww

@mcimadamore
Copy link
Contributor Author

/contributor add @sundararajana

@mcimadamore
Copy link
Contributor Author

/contributor add @ChrisHegarty

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Apr 27, 2021
@openjdk
Copy link

openjdk bot commented Apr 27, 2021

@mcimadamore this pull request will not be integrated until the CSR request JDK-8264781 for issue JDK-8264774 has been approved.

@openjdk
Copy link

openjdk bot commented Apr 27, 2021

@mcimadamore
Contributor Jorn Vernee <jvernee@openjdk.org> successfully added.

@openjdk
Copy link

openjdk bot commented Apr 27, 2021

@mcimadamore
Contributor Vladimir Ivanov <vlivanov@openjdk.org> successfully added.

@openjdk
Copy link

openjdk bot commented Apr 27, 2021

@mcimadamore
Contributor Athijegannathan Sundararajan <sundar@openjdk.org> successfully added.

@openjdk
Copy link

openjdk bot commented Apr 27, 2021

@mcimadamore
Contributor Chris Hegarty <chegar@openjdk.org> successfully added.

@mcimadamore mcimadamore marked this pull request as ready for review April 27, 2021 17:54
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 27, 2021
@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 13, 2021
@openjdk
Copy link

openjdk bot commented May 18, 2021

⚠️ @mcimadamore This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

* @implSpec
* Implementations of this interface are immutable, thread-safe and <a href="{@docRoot}/java.base/java/lang/doc-files/ValueBased.html">value-based</a>.
*/
public interface MemoryLayout extends Constable {
public sealed interface MemoryLayout extends Constable permits AbstractLayout, SequenceLayout, GroupLayout, PaddingLayout, ValueLayout {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle we could just permit AbstractLayout and be done. In practice, the javadoc comes out better if we list all the concrete API interfaces which extend MemoryLayout, as the permit list will be displayed in the javadoc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least the internal class name is hidden in the javadoc:
image

* Downgrade native methods in ProgrammableUpcallHandler to package-private
@openjdk
Copy link

openjdk bot commented May 27, 2021

@mcimadamore this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JEP-412
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated labels May 27, 2021
@openjdk openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels May 27, 2021
@mcimadamore
Copy link
Contributor Author

/integrate

@openjdk openjdk bot closed this Jun 2, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 2, 2021
@openjdk
Copy link

openjdk bot commented Jun 2, 2021

@mcimadamore Since your change was applied there have been 23 commits pushed to the master branch:

  • 71425dd: 8267118: OutOfMemoryError cannot be caught as a Throwable
  • de6472c: 8267459: Pasting Unicode characters into JShell does not work.
  • 9247630: 8265270: Type.getEnclosingType() may fail with CompletionFailure
  • 2d494bf: 8267836: Separate eager reclaim remembered set threshold from G1RSetSparseRegionEntries
  • bba3728: 8267726: ZGC: array_copy_requires_gc_barriers too strict
  • d47a77d: 8267773: PhaseStringOpts::int_stringSize doesn't handle min_jint correctly
  • 496fb90: 8267969: Add vectorized implementation for VectorMask.eq()
  • 1cea6ca: 8260360: IGV: Short name of combined nodes is hidden by background color
  • 7530c00: 8266559: XPathEvaluationResult.XPathResultType.NODESET maps to incorrect type
  • b98e52a: 8267570: The comment of the class JavacParser is not appropriate
  • ... and 13 more: https://git.openjdk.java.net/jdk/compare/36dc268abea2522596efe830365ba4bbe6e2696c...master

Your commit was automatically rebased without conflicts.

Pushed as commit a223189.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated nio nio-dev@openjdk.org security security-dev@openjdk.org
8 participants