Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support macOS Aarch64 ABI for compiled wrappers. #3

Conversation

gonzalolarralde
Copy link

@gonzalolarralde gonzalolarralde commented Nov 4, 2020

This PR adds support for Apple's aarch64 C ABI, that requires stack arguments to be sized according to its data type, and considering padding properly.

Reference: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms

/cc @AntonKozlov

Ref: JDK-8253818


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/aarch64-port.git pull/3/head:pull/3
$ git checkout pull/3

Update a local copy of the PR:
$ git checkout pull/3
$ git pull https://git.openjdk.org/aarch64-port.git pull/3/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3

View PR using the GUI difftool:
$ git pr show -t 3

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/aarch64-port/pull/3.diff

This PR adds support for Apple's aarch64 C ABI, that requires
stack arguments to be sized according to its data type, and
considering padding properly.

https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms
@bridgekeeper
Copy link

bridgekeeper bot commented Nov 4, 2020

👋 Welcome back gonzalolarralde! A progress list of the required criteria for merging this PR into jdk-macos will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 4, 2020

@gonzalolarralde The label @AntonKozlov is not a valid label. These labels are valid:

@gonzalolarralde gonzalolarralde changed the title Add Apple's ABI support to NativeNMethod compilation. Support macOS Aarch64 ABI for compiled wrappers. Nov 4, 2020
Copy link

@VladimirKempik VladimirKempik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello.
I think you need to get rid of macros

handle_padding(bytes_used, 1);
advance_offset(bytes_used, 1);

you just put platform dependent size of type on stack in generic code

handle_padding can be replaced with openjdk's align_up() function
and advance_offset is simple bytes_used += size of type on stack;

size_of_type_on_stack can be made under #ifdefs in the beginning of the file

check example in our interp patch - 368f06e

    _stack_offset = align_up(_stack_offset, nativeDoubleSpace);
    __ ldr(r0, src);
    __ str(r0, Address(to(), _stack_offset));
    _stack_offset += nativeDoubleSpace;

Copy link
Member

@AntonKozlov AntonKozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I think there is a correctness issue that I would like to discuss

// abi-dependant move for required types
#ifdef __APPLE__
# define abi_dep_word_move(masm, src, dst, size) move_merge_and_shift(masm, src, dst, size);
# define abi_dep_float_move(masm, src, dst, size) move_merge_and_shift(masm, src, dst, size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move_merge_and_shift is unable to load to a float register (i.e. in stack->reg move), only to a general purpose one. I wonder why this was not caught by tests, but it looks like a bug

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I overlooked that. it seems that test cases might not be causing a displacement of float parameters, therefore never making this evaluate true: (src.first() != dst.first()). I will try to force the error and add a test if I confirm this to be the case.

// On type size dependant ABIs types smaller than 8 bytes will use the second word
// to store the position relative to the doubleword where the value needs to be
// written. This allows padding to be pre-calculated. As a result of this it is
// to be expected that some contiguous arguments share the same stack offset.
int SharedRuntime::c_calling_convention(const BasicType *sig_bt,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c_calling_convention is used by the shared VM code, which may be confused by non-VMRegs in regs. Although I assume shared code asks convention for short signatures only, some form of assert will be useful anyway, so we'll be sure shared code won't produce unexpected result.

For example, it is possible to make a private c_calling_convention, which will be called only from the aarch64 code, which can deal with unaliged offsets on stack. And to create another "public" c_calling_convention that does not try to encode anything unusual in regs. As a side benefit, the first "private" function can encode unaligned offsets in a more straightforward way, like a a separate bitmap or array argument.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense, will address this as you're describing. thanks!

@@ -856,7 +906,10 @@ int SharedRuntime::c_calling_convention(const BasicType *sig_bt,
}
}

return stk_args;
// Ensure padding when finished parsing arguments
bytes_used += bytes_used % 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

align_up will be a bit more straightforward

// Ensure padding when finished parsing arguments
bytes_used += bytes_used % 8;

return bytes_used / 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider using VMRegImpl::stack_slot_size (4, a fixed size slot equal sizeof(jint)), wordSize (8, native word size) and VMRegImpl::slots_per_word (wordSize / stack_slot_size)

if (src.first()->is_stack()) {
__ ldr(rscratch1, Address(rfp, reg2offset_in(src.first())));
} else {
__ mov(rscratch1, src.first()->as_Register());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, it's better operate directly on the src.first()->as_Register()

Comment on lines +963 to +980
__ andr(rscratch1, rscratch1, (1L << bit_size) - 1);
__ lsl(rscratch1, rscratch1, bit_offset);

// load destination register
__ ldr(rscratch2, Address(sp, reg2offset_out(dst.first())));

// clear bits to be overwritten by shift
if (bit_offset > 0) {
__ ror(rscratch2, rscratch2, bit_offset);
}
__ lsr(rscratch2, rscratch2, bit_size);
__ ror(rscratch2, rscratch2, 64 - bit_size - bit_offset);

// combine
__ orr(rscratch1, rscratch1, rscratch2);

// save
__ str(rscratch1, Address(sp, reg2offset_out(dst.first())));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be replaced with just lsr and strw/strh/strb?

Comment on lines 1826 to 1828
default:
move32_64(masm, in_regs[i], out_regs[c_arg]);
int_args++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we have processed all necessary types above, shouldn't the default case be ShouldNotReachHere()?

@gonzalolarralde
Copy link
Author

Thanks for the comments @AntonKozlov, I'll try to rebuild using the latest changes on the branch that add support for MAP_JIT and get back with the feedback addressed. Due to the lack of a workaround my build + test runs were blocked, but now that those patches are public that should unblock me.

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 5, 2023

@gonzalolarralde This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

openjdk bot pushed a commit that referenced this pull request Sep 24, 2024
Reviewed-by: honkar, prr
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants