-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8314949: linux PPC64 Big Endian: Implementation of Foreign Function & Memory API #15417
8314949: linux PPC64 Big Endian: Implementation of Foreign Function & Memory API #15417
Conversation
👋 Welcome back mdoerr! A progress list of the required criteria for merging this PR into |
@TheRealMDoerr The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
/** | ||
* PPC64 CallArranger specialized for ABI v1. | ||
*/ | ||
public class ABIv1CallArranger extends CallArranger { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be more natural for CallArranger to have an abstract method (or even a kind() accessor for the different kinds of ABI supported) and then have these specialized subclasses return the correct kind? It seems to me that setting the useXYZAbi
flag using an instanceof test is a little dirty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had something like that, but another reviewer didn't like it, either. Originally, I had thought that the v1 and v2 CallArrangers would get more content, but they're still empty. Would it be better to remove these special CallArrangers and distinguish in the base class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that what you are doing is similar to what was done for aarch64, which was dealt with using very simple subclasses:
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/linux/LinuxAArch64CallArranger.java
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/macos/MacOsAArch64CallArranger.java
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/aarch64/windows/WindowsAArch64CallArranger.java
In your case there's less difference, but I think we should follow the same idiom for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I've changed it with the 2nd commit.
* Positive [shiftAmount] converts to long if needed and shifts left. | ||
* Negative [shiftAmount] shifts right and converts to int if needed. | ||
*/ | ||
record ShiftLeft(int shiftAmount, Class<?> type) implements Binding { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the situation you are facing, perhaps adding the new binding here is unavoidable. Let's wait to hear from @JornVernee. In the meantime, can you point me to a document which explains this behavior? I'm curious and I'd like to know more :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm starting to see it - it's not a special rule, as much as it is a consequence of the endianness. E.g. if you have a struct that is 64 + 32 bytes, you can store the first 64 bytes as a long. Then, there's an issue as we have to fill another long, but we have only 32 bits of value. Is it the problem that if we just copy the value into the long word "as is" it will be stored in the "wrong" 32 bits? So the shift takes care of that, I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If my assumption above is correct, then maybe another way to solve the problem, would be to, instead of adding a new shift binding, to generalize the VM store binding we have to allow writing a smaller value into a bigger storage, with an offset. Correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ABI says: "An aggregate or union smaller than one doubleword in size is padded so that it appears in the least significant bits of the doubleword. All others are padded, if necessary, at their tail." [https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#PARAM-PASS].
I have written examples which pass 9 and 15 Bytes.
In the first case, we need to get 0x0001020304050607 in the first argument and 0x08XXXXXXXXXXXXXX into the second argument (X is "don't care"). Shift amount is 7.
In the second case, we need to get 0x0001020304050607 in the first argument and 0x08090a0b0c0d0eXX into the second argument. Shift amount is 1.
In other words, we need shift amounts between 1 and 7. Stack slots and registers are always 64 bit on PPC64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it - I found these representations:
https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.7.html#BYTEORDER
Very helpful. So you have e.g. a short
value (loaded from somewhere) and you have to store it on a double-word. Now, if you just stored it at offset 0, you will write the bits 0-15, which are the "most" significant bits in big-endian representation. So, it's backwards. I believe FFM will take care of endianness, so that the bytes 0-7 and 8-15 will be "swapped" when writing into the double-word (right?) but their base offset (0) is still off, as they should really start at offset 48. Hence the shift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking for a while, I think having the new binding operator is needed. e.g. for stores: we load a 32 bit value from the end of a struct, then the low-order bits of the value needs to be padded with zeros to get a 64 bit register value, leaving the original 32 bit value in the high order bits. This can't be handled by the current cast operator. e.g. if we had an int -> long cast conversion, then in the resulting value the low-order bits would be occupied by the 32 bit value, which is incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall these changes look good - as commented I'd like to learn a bit more of the underlying ABI, to get a sense of whether adding a new binding is ok. But overall it's great to see support for a big-endian ABI - apart from the linker, I am pleased to see that you did not encounter too many issues in the memory-side of the FFM API.
@Override | ||
public void interpret(Deque<Object> stack, StoreFunc storeFunc, | ||
LoadFunc loadFunc, SegmentAllocator allocator) { | ||
if (shiftAmount > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we assume we can only deal with ints or longs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have inserted casts into public Binding.Builder shiftLeft(int shiftAmount, Class<?> type)
(similar to other bindings). The VM handles integral types smaller than int
like int
and uses 4 Bytes for arithmetic operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see that now - it's done the binding "builder".
@mcimadamore: Thanks for your feedback! Jorn and I had resolved the other issues already when we have worked on the linux little endian part. It already contains some ABIv1 code. Note that we already have one big endian platform: s390. But that one doesn't pass structs >8 Bytes in registers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, I've been on vacation.
src/java.base/share/classes/jdk/internal/foreign/abi/Binding.java
Outdated
Show resolved
Hide resolved
src/java.base/share/classes/jdk/internal/foreign/abi/BindingSpecializer.java
Outdated
Show resolved
Hide resolved
src/java.base/share/classes/jdk/internal/foreign/abi/Binding.java
Outdated
Show resolved
Hide resolved
if (shiftAmount > 0 && isSubIntType(type)) { | ||
bindings.add(Binding.cast(type, int.class)); | ||
type = int.class; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the casts are handled here with explicit cast bindings, but the widening from int -> long, and narrowing from long -> int are handled implicitly as part of the ShiftLeft implementation. I'd much prefer if all the type conversions are handled with explicit cast bindings. This would also semantically simplify the shift operator, since it would just handle the actual shifting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we would need to add additional bindings for that? Is is worth adding more just for a big endian corner case? Or can that be done with the existing ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A Cast
case for int -> long and long -> int would need to be added. Given the existing setup, that should only be a few lines of code for each. (See e.g. for int -> long master...JornVernee:jdk:I2L). I don't think the cost is that high.
Is is worth adding more just for a big endian corner case?
I think it's worth it in order to have a cleaner contract for the shift ops, should we want to use them for anything else in the future, but also just to make them easier to understand for future readers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth it in order to have a cleaner contract for the shift ops, should we want to use them for anything else in the future, but also just to make them easier to understand for future readers.
I agree that having a cleaner contract for the shift binding would prove useful in the long run. If we do that, we can also simplify the binding itself, as it would no longer need an input type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as it would no longer need an input type?
Yes. Then both shift ops would always operate on long
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it. Note that I need many more conversions because buffer load/store also use subtypes of int
. Please take a look at my updated version (after commit number 5).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, your changes look great.
src/java.base/share/classes/jdk/internal/foreign/abi/Binding.java
Outdated
Show resolved
Hide resolved
* Positive [shiftAmount] converts to long if needed and shifts left. | ||
* Negative [shiftAmount] shifts right and converts to int if needed. | ||
*/ | ||
record ShiftLeft(int shiftAmount, Class<?> type) implements Binding { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking for a while, I think having the new binding operator is needed. e.g. for stores: we load a 32 bit value from the end of a struct, then the low-order bits of the value needs to be padded with zeros to get a 64 bit register value, leaving the original 32 bit value in the high order bits. This can't be handled by the current cast operator. e.g. if we had an int -> long cast conversion, then in the resulting value the low-order bits would be occupied by the 32 bit value, which is incorrect.
No problem. Hope you had a good time! Thanks for your feedback. |
src/java.base/share/classes/jdk/internal/foreign/abi/BindingSpecializer.java
Outdated
Show resolved
Hide resolved
@TheRealMDoerr We've been discussing the shifts in order to wrap our heads around it, and we've ended up with this diagram in order to try and visualize what happens: Let's say we have a struct with 3 ints:
If this struct is passed as an argument, then the load of the second 'half' of the struct would look like this:
So, the 'Result' is padded at the tail with zeros. Does that seem right? Does it seem useful to add this diagram as a comment somewhere, for us when we come back to this code a year from now? Thanks |
It would perhaps be cleaner if in the MSB/LSB comments we said: LSBs are zzz...z (e.g. avoid to refer to MSBs in the first, since those bytes are not exactly zero, they are the padding bytes) |
Hmm. Do you see a good place for such a comment? Maybe it would be better to use a different size for the last chunk. Maybe three or five Bytes. That's even less straight-forward. |
PPC CallArranger seems like a good place to me. We have a similar explanation comment in the AArch64 CallArranger.
Does the size matter that much? It just changes the shift amount right? Could use |
type = int.class; | ||
if (type == int.class || isSubIntType(type)) { | ||
bindings.add(Binding.cast(type, long.class)); | ||
type = long.class; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line seems redundant now (type
is not used below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Removed.
if (shiftAmount > 0 && isSubIntType(type)) { | ||
bindings.add(Binding.cast(type, int.class)); | ||
type = int.class; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, your changes look great.
Thank you! I have added an example to the CallArranger. Please take a look. |
Latest version looks great! I've started a CI job as well. Will approve when that comes back green. |
Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some known failures in CI (1). So, this is good to go from my perspective.
@TheRealMDoerr This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 129 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@mcimadamore: May I ask you or somebody else from the Panama team to provide a 2nd review? This PR requires Panama knowledge, not really PPC knowledge. |
Sorry - I have already reviewed it - but didn't approve as I was waiting for @JornVernee to chime in. Now he has, and I will add my approval as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great - thanks!
Thanks for your outstanding support! I'm planning to integrate tomorrow. |
/integrate |
Going to push as commit f6c203e.
Your commit was automatically rebased without conflicts. |
@TheRealMDoerr Pushed as commit f6c203e. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
I've found a way to solve the remaining FFI problem on linux PPC64 Big Endian. Large structs (>8 Bytes) which are passed in registers or on stack require shifting the Bytes in the last slot if the size is not a multiple of 8. This PR adds the required functionality to the Java code.
Please review and provide feedback. There may be better ways to implement it. I just found one which works and makes the tests pass:
Note: This PR should be considered as preparation work for AIX which also uses ABIv1.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15417/head:pull/15417
$ git checkout pull/15417
Update a local copy of the PR:
$ git checkout pull/15417
$ git pull https://git.openjdk.org/jdk.git pull/15417/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 15417
View PR using the GUI difftool:
$ git pr show -t 15417
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15417.diff
Webrev
Link to Webrev Comment