-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8280476: [macOS] : hotspot arm64 bug exposed by latest clang #7270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back dcubed! A progress list of the required criteria for merging this PR into |
|
/label add hotspot-runtime It would be great to hear from @theRealAph on this review thread. |
|
@dcubed-ojdk The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
|
@dcubed-ojdk |
|
Since this fix involved undefined behavior, I'd also like to hear |
|
@dcubed-ojdk The |
Webrevs
|
theRealAph
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird. We can't ever have hit it, or chaos would have ensued.
I'm going to leave it to Andrew Dinn to review this one.
|
@theRealAph - Thanks for chiming in on the review thread. This is definitely I look forward to getting a review from @adinn. |
| // would result in undefined behavior. | ||
| if (nbits == 64) { | ||
| assert(count <= 1, "must be"); | ||
| return bits; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with the < 64 case, shouldn't this be return (count == 0) ? 0 : bits; ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, but that's an algorithmic question we'll need @adinn to answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kimbarrett - Thanks for the review! Good catch! I believe you are correct,
but let's wait for @adinn to chime in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning 0 if count == 0 and bits otherwise is consistent with the old code,
assuming the problematic shift doesn't cause UB data corruption or something
otherwise strange. It might do the "mathematically correct" thing of shifting
out all the bits (so zeroing the value). Another possibility is that the
implemented shift amount for N is N mod word-size (some platforms do that, but
I haven't looked up aarch64), which in this case is 64 mod 64, or 0. I don't
think there are any other non-strange non-UB-data-corruption possibilities.
result is initially 0.
with the old code:
- if count == 0, there are no iterations, and final result is still 0.
- if count == 1, there is one iteration.
result <<= nbits
-- if UB corruption, all bets are off
-- if shifts by 64, result is unchanged (== 0)
-- if shifts by 64 mod 64 (== 0), result is unchanged (== 0)
result |= (bits & mask)
-- result == bits
- if count > 1, for additional iterations
result <<= nbits
-- if UB corruption, all bets are off
-- if shifts by 64, result == 0
-- if shifts by 64 mod 64 (== 0), result is unchanged (== bits)
result |= (bits & mask)
-- result == bits
So with old code, for count > 0, result == bits (or UB corrupted data).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First thing to note is that replicate is never actually called with count == 0. It's questionable whether it ever should be (especially as this function is really only meant to be used locally to this code) but logically I think the result ought to be 0 if it ever does get called that way.
What replicate is meant to do is replicate the bottom nbits bits of the uint64_t passed via argument bits across successive subfields of result up to (count * bits) worth of bits (sorry ... the names are somewhat inconveniently chosen as far as this discussion is concerned). So, anyway, a count of zero ought to insert 0 bits giving a zero result.
As to how it is used, replicate is only called from method expandLogicalImmediate with two alternative calling patterns. It is sometimes called like this
for (int i = 0; i < 6; i++) {
int nbits = 1 << i;
. . .
replicate(and_bit, 1, nbits)
n.b. argument count is being supplied as nbits and argument nbits is supplied as 1 (yeah, naming is hard). These calls serve to insert anywhere between 1 and 32 copies of a single bit into the rightmost portion of the result.
It is also called like this:
for (int i = 0; i < 6; i++) {
int nbits = 1 << i;
. . .
replicate(and_bits_top, 2 * nbits, 32 / nbits)
This is used to replicate a bit pattern taken from the lower end of the first input across the whole of a uint64_t result. For these calls the input arguments count and nbits satisfy the following invariants:
nbits | 64
count | 64
nbits * count == 64
2 <= nbits <= 32
32 >= count >= 2
I am not actually clear why the caller is calling replicate in the way it does. The algorithm is way more complicated than it actually needs to be. The basic idea is to insert imms 1s into a 2^k bit field, right rotate them by immr and then replicate the result across the full 64 bits. There's a few wrinkles which are explained here for those who are interested:
expandLogicalImmediate(uint32_t immN, uint32_t immr, uint32_t imms, uint64_t& bimm)
-
outputs a bit pattern of width 64 by replicating a bit field of width 2^k and returns 1 or fails and returns 0
-
when immN is 1 then k is 6 and immr/imms are masked to 6 bit integers.
-
when immN is 0 then k is the count of the first 0 bit in imms and immr/imms are masked to k-bit integers (i.e. leading 1s in imms determine dead bits of imms and immr)
A k value of 0 (i.e. immN == 0 and imms = all 1s) fails and returns 0
After masking
An imms value of all 1s (i.e. 2^k - 1) fails and returns 0 (note given the previous check we only get here when immN == 1)
The bit field can now be constructed as follows:
- imms + 1 specifies the number of 1s to insert into the 2^k-field (note that imms + 1 < 2^k)
- immr counts how far to right rotate the 2^k-field
The resulting 2^k-field is then replicated across the full 64-bit result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do agree with one of your points:
replicate is never actually called with count == 0
so that means that this:
assert(count <= 1, "must be");is a little too loose and should change to:
assert(count == 1, "must be");if we're going to keep it at all. I think we should keep it, but
I can be convinced otherwise since the original code had no similar assert().
That is entirely reasonable given that replicate is only meant to be used locally and none of these local use cases will ever try to replicate 0 copies of a bit field.
We could actually add one or both of these more general asserts at entry to replicate since they apply to all cases, not just when nbits == 64:
assert(count > 0, "must be")
assert(count * nbits <= 64, "must be")
In other words it is an error to try to replicate zero copies of a bit field and it is an error to try to replicate more copies than can fit into 64 bits.
Note that the second assert removes the need for the current assert and combining it with the first assert also enforces your recommended alternative.
We could also add another general assert
assert(nbits > 0, "must be")
i.e. it is an error to try to replicate an empty bit field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kimbarrett - Thanks for confirming that we have UB here.
@adinn - I'll take a look and updating the assert() calls and retest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First thing to note is that
replicateis never actually called withcount == 0.
This is crucial: it means we have no UB.
Could I ask you to take the opportunity add a little of this commentary to the rather opaque code? Pretty please? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could I ask you to take the opportunity add a little of this commentary to the rather opaque code? Pretty please? :-)
Sure,this stuff really needs to be better documented in the code itself. I'll do that as a follow-up to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First thing to note is that replicate is never actually called with count == 0.
This is crucial: it means we have no UB.
The UB occurs when count == 1 and nbits == 64.
|
The v01 changes have passed Mach5 Tier[126]. |
kimbarrett
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
|
@dcubed-ojdk This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 94 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
|
@adinn or @theRealAph - I need one more reviewer to approve so I can integrate this. |
|
@adinn - Thanks for the review. As you mentioned above, I'll leave it up to /integrate |
|
Going to push as commit f5d6fdd.
Your commit was automatically rebased without conflicts. |
|
@dcubed-ojdk Pushed as commit f5d6fdd. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
A trivial fix to solve undefined behavior in src/hotspot/cpu/aarch64/immediate_aarch64.cpp:
replicate().
I was not able to reproduce the reported failure using:
Xcode: Version 13.2.1 (13C100) which includes clang Apple LLVM 13.0.0 (clang-1300.0.29.30)
so I'm moving forward with the proposed fix from a code inspection
point of view.
I've tested this fix with Mach5 Tier[1-6]. Tier1 and Tier2 have completed with
no failures. Tier[3-6] are still running.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7270/head:pull/7270$ git checkout pull/7270Update a local copy of the PR:
$ git checkout pull/7270$ git pull https://git.openjdk.java.net/jdk pull/7270/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 7270View PR using the GUI difftool:
$ git pr show -t 7270Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7270.diff