Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8274912: Eagerly generate native invokers #593

Closed

Conversation

JornVernee
Copy link
Member

@JornVernee JornVernee commented Oct 7, 2021

Hi,

This patch is a refactoring of VM code pertaining to downcalls. As outlined in the JBS issue, this patch makes 'native invoker' generation currently done by C2 eager.

Most of the changes in this patch are moving around existing code:

  • The native invoker generation code is moved from sharedRuntime*.cpp to universalNativeInvoker*.cpp, to match what is done for upcalls.
  • C2 & nmethod scaffolding for registering native invokers has been removed (lifetime is now managed in Java code).
  • Now, C2 will only intrinsify trivial downcalls, which means some of the code was simplified there.
  • To reduce code duplication, several utilities have been added:
    • ArgumentShuffle (and impls of CallConvClosure): a utility class for computing argument shuffles and generating shuffling code. The shuffling code is heavily based on what is done today for optimized upcalls on x64.
    • To support the above, I've made a copy of the ComputeMoveOrder utility from SharedRuntime code (which was only implemented on x64), and adapted it to our use case.
    • RegSpillFill: similarly, a utility for computing and generating register spill and fill code (given a list of registers).
    • oop_cast: this was a pre-existing utility that was renamed (from 'cast') and moved into a separate file.
  • The linkToNative method handle stub now just loads the invoker address from the appendix argument and jumps to it, instead of jumping to the fallback method handle.

The implementation passes all the jdk_foreign tests on Windows/x64, Linux/x64 (WSL), and Linux/AArch64. Though, there were some pre-exsiting test failures on AArch64 wrt variadic functions.

Thanks,
Jorn


Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/panama-foreign pull/593/head:pull/593
$ git checkout pull/593

Update a local copy of the PR:
$ git checkout pull/593
$ git pull https://git.openjdk.java.net/panama-foreign pull/593/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 593

View PR using the GUI difftool:
$ git pr show -t 593

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/panama-foreign/pull/593.diff

/*
* @test id=scope
* @requires ((os.arch == "amd64" | os.arch == "x86_64") & sun.arch.data.model == "64") | os.arch == "aarch64"
* @modules jdk.incubator.foreign/jdk.internal.foreign
* @build NativeTestHelper CallGeneratorHelper TestUpcall
*
* @run testng/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies
* --enable-native-access=ALL-UNNAMED -Dgenerator.sample.factor=17
* -DUPCALL_TEST_TYPE=SCOPE
* TestUpcall
*/

/*
* @test id=no_scope
* @requires ((os.arch == "amd64" | os.arch == "x86_64") & sun.arch.data.model == "64") | os.arch == "aarch64"
* @modules jdk.incubator.foreign/jdk.internal.foreign
* @build NativeTestHelper CallGeneratorHelper TestUpcall
*
* @run testng/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies
* --enable-native-access=ALL-UNNAMED -Dgenerator.sample.factor=17
* -DUPCALL_TEST_TYPE=NO_SCOPE
* TestUpcall
*/

Copy link
Member Author

@JornVernee JornVernee Oct 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that not all the upcall test configurations were being run, so I've added them back here. (these are separate jtreg tests, so shouldn't run into any problems with timeouts).

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Oct 7, 2021

👋 Welcome back jvernee! A progress list of the required criteria for merging this PR into foreign-memaccess+abi will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

else if (rOop == j_rarg1)
oop_slot = 1;
else if (rOop == j_rarg2)
oop_slot = 2;
else if (rOop == j_rarg3)
oop_slot = 3;
else if (rOop == j_rarg4)
oop_slot = 4;
else if (rOop == j_rarg5)
oop_slot = 5;
else if (rOop == j_rarg6)
oop_slot = 6;
else {
assert(rOop == j_rarg7, "wrong register");
oop_slot = 7;
}

oop_slot = oop_slot * VMRegImpl::slots_per_word + oop_handle_offset;
int offset = oop_slot*VMRegImpl::stack_slot_size;

map->set_oop(VMRegImpl::stack2reg(oop_slot));
// Store oop in handle area, may be NULL
str(rOop, Address(sp, offset));
if (is_receiver) {
*receiver_offset = offset;
}

cmp(rOop, zr);
lea(rHandle, Address(sp, offset));
// conditionally move a NULL
csel(rHandle, zr, rHandle, Assembler::EQ);
}

// If arg is on the stack then place it otherwise it is already in correct reg.
if (dst.first()->is_stack()) {
str(rHandle, Address(sp, reg2offset_out(dst.first())));
}
}

// A float arg may have to do float reg int reg conversion
void MacroAssembler::float_move(VMRegPair src, VMRegPair dst) {
assert(src.first()->is_stack() && dst.first()->is_stack() ||
src.first()->is_reg() && dst.first()->is_reg(), "Unexpected error");
if (src.first()->is_stack()) {
if (dst.first()->is_stack()) {
ldrw(rscratch1, Address(rfp, reg2offset_in(src.first())));
strw(rscratch1, Address(sp, reg2offset_out(dst.first())));
} else {
ShouldNotReachHere();
}
} else if (src.first() != dst.first()) {
if (src.is_single_phys_reg() && dst.is_single_phys_reg())
fmovs(dst.first()->as_FloatRegister(), src.first()->as_FloatRegister());
else
ShouldNotReachHere();
}
}

// A long move
void MacroAssembler::long_move(VMRegPair src, VMRegPair dst) {
if (src.first()->is_stack()) {
if (dst.first()->is_stack()) {
// stack to stack
ldr(rscratch1, Address(rfp, reg2offset_in(src.first())));
str(rscratch1, Address(sp, reg2offset_out(dst.first())));
} else {
// stack to reg
ldr(dst.first()->as_Register(), Address(rfp, reg2offset_in(src.first())));
}
} else if (dst.first()->is_stack()) {
// reg to stack
// Do we really have to sign extend???
// __ movslq(src.first()->as_Register(), src.first()->as_Register());
str(src.first()->as_Register(), Address(sp, reg2offset_out(dst.first())));
} else {
if (dst.first() != src.first()) {
mov(dst.first()->as_Register(), src.first()->as_Register());
}
}
}


// A double move
void MacroAssembler::double_move(VMRegPair src, VMRegPair dst) {
assert(src.first()->is_stack() && dst.first()->is_stack() ||
src.first()->is_reg() && dst.first()->is_reg(), "Unexpected error");
if (src.first()->is_stack()) {
if (dst.first()->is_stack()) {
ldr(rscratch1, Address(rfp, reg2offset_in(src.first())));
str(rscratch1, Address(sp, reg2offset_out(dst.first())));
} else {
ShouldNotReachHere();
}
} else if (src.first() != dst.first()) {
if (src.is_single_phys_reg() && dst.is_single_phys_reg())
fmovd(dst.first()->as_FloatRegister(), src.first()->as_FloatRegister());
else
ShouldNotReachHere();
}
}
Copy link
Member Author

@JornVernee JornVernee Oct 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to what was done for x64, I've moved the aarch64 shuffling helpers to macroassembler

movq(Address(rsp, reg2offset_out(dst.first())), src.first()->as_Register());
}
} else if (dst.is_single_phys_reg()) {
assert(src.is_single_reg(), "not a stack pair");
movq(dst.first()->as_Register(), Address(rbp, reg2offset_out(src.first())));
movq(dst.first()->as_Register(), Address(rbp, reg2offset_in(src.first())));
Copy link
Member Author

@JornVernee JornVernee Oct 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pre-existing bug found when testing (same below). Loads from rbp should always be paired with reg2offset_in

@openjdk openjdk bot added the rfr label Oct 7, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Oct 7, 2021

Webrevs

@@ -29,3 +29,17 @@ address ProgrammableInvoker::generate_adapter(jobject jabi, jobject jlayout) {
Unimplemented();
return nullptr;
}

RuntimeStub* ProgrammableInvoker::make_native_invoker(BasicType* signature,
Copy link
Collaborator

@mcimadamore mcimadamore Oct 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe crazy idea - can we make it so that, by default make_native_invoker will just throw Unimplemented() in the ProgrammableInvoker class - and if clients want to provide implementation for it, they have to subclass and define an implementation? This might avoid need for stubbing things out.

Copy link
Member Author

@JornVernee JornVernee Oct 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess in that case each platform would still need to define a factory method for the platform specific impl, so that platform agnostic code can reach it, and that would then need to be stubbed on unsupported platforms.

But, either way, I think it's better to stick with the conventions used in other VM code, which is to stub things out.

@JornVernee
Copy link
Member Author

@JornVernee JornVernee commented Oct 7, 2021

@nick-arm if you want to, could you take a look at the AArch64 bits?

@nick-arm
Copy link
Member

@nick-arm nick-arm commented Oct 11, 2021

@nick-arm if you want to, could you take a look at the AArch64 bits?

Sure, will do.

Copy link
Member

@nick-arm nick-arm left a comment

AArch64 changes look fine, just a couple of trivial naming comments. I'm also seeing the varargs test failure now on the foreign-memaccess+abi branch.

src/hotspot/cpu/aarch64/universalNativeInvoker_aarch64.cpp Outdated Show resolved Hide resolved
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp Outdated Show resolved Hide resolved
@JornVernee
Copy link
Member Author

@JornVernee JornVernee commented Oct 11, 2021

@nick-arm Thanks for the good suggestions, I've applied them.

@nick-arm
Copy link
Member

@nick-arm nick-arm commented Oct 13, 2021

The implementation passes all the jdk_foreign tests on Windows/x64, Linux/x64 (WSL), and Linux/AArch64. Though, there were some pre-exsiting test failures on AArch64 wrt variadic functions.

Looks like the varargs failures are caused by this change in #576:

https://github.com/openjdk/panama-foreign/pull/576/files#diff-9fbf245dc20c824d5e9f3b1864a18284f110ae808cf11782457f8462e8ad8ce0L211

The STACK_VARARGS_ATTRIBUTE_NAME attribute used to be set only on macOS and was used to force all variadic arguments onto the stack, but that now happens unconditionally on Linux as well.

@JornVernee
Copy link
Member Author

@JornVernee JornVernee commented Oct 14, 2021

The STACK_VARARGS_ATTRIBUTE_NAME attribute used to be set only on macOS and was used to force all variadic arguments onto the stack, but that now happens unconditionally on Linux as well.

Ah, nice find!

I think we can just add a check to see if we are on the Mac ABI and only then do the adjustment for varargs?

@JornVernee
Copy link
Member Author

@JornVernee JornVernee commented Oct 15, 2021

I've filed an issue here: https://bugs.openjdk.java.net/browse/JDK-8275332

Rather than just checking whether we are running on MacOS before adjusting for var args, I think it would be better to have it as a configuration flag passed to CallArranger, or perhaps to split the AArch64 CallArranger into a base class and 2 sub-classes for each platform, that way we can add CallArranger tests for each platform that will still run on all platforms.

Copy link
Collaborator

@mcimadamore mcimadamore left a comment

Looks very good.

@openjdk
Copy link

@openjdk openjdk bot commented Oct 15, 2021

@JornVernee This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8274912: Eagerly generate native invokers

Reviewed-by: ngasson, mcimadamore

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 266 new commits pushed to the foreign-memaccess+abi branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the foreign-memaccess+abi branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Oct 15, 2021
@JornVernee
Copy link
Member Author

@JornVernee JornVernee commented Oct 15, 2021

/integrate

@openjdk
Copy link

@openjdk openjdk bot commented Oct 15, 2021

Going to push as commit 1aa1c29.
Since your change was applied there have been 266 commits pushed to the foreign-memaccess+abi branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Oct 15, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Oct 15, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Oct 15, 2021

@JornVernee Pushed as commit 1aa1c29.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@JornVernee JornVernee deleted the Eager_Invoker_RB branch Nov 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated
3 participants