8332119: Incorrect IllegalArgumentException for C2 compiled permute kernel #19442
👋 Welcome back jbhateja! A progress list of the required criteria for merging this PR into |
|
@jatin-bhateja This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 141 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
|
@jatin-bhateja The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
if (is_vector_shuffle(vbox_klass_to)) {
  op = wrap_indexes(op, num_elem_to, elem_bt_to);
}
The wrap_indexes is needed only for two vector rearrange. It looks to me that doing a wrap_indexes here at convert would force it for single vector rearrange (or selectFrom) and thereby reduce the performance for that case as well. Please note that the single vector rearrange throws "IndexOutOfBoundsException" and doesn't need to do a wrap.
Please ignore the above comment. I verified that each index is partially wrapped as part of toShuffle(). We should rename wrap_indexes() to partially_wrap_indexes() for clarity.
sviswa7
left a comment
Other than these two minor comments, the PR looks good to me.
for (int i = 0; i < res.length; i++) {
    float expected = Float.NaN;
    // Exceptional index.
    if (shuf[i] < 0 || shuf[i] >= FSP.length()) {
To better match the specs, this could be:
if ( (int)shuf[i] < 0 || (int)shuf[i] >= FSP.length()) {
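For readers following the validation logic, here is a minimal scalar sketch of what such a check computes. It assumes the documented semantics of the two-vector rearrange(shuffle, v): an in-range index selects from the first source, and an exceptional index selects from the second source at the wrapped lane. The class and method names are hypothetical, this is not the PR's test code, and the lane count is assumed to be a power of two.

```java
final class TwoVectorRearrangeSketch {
    /**
     * Scalar model of a two-vector rearrange over one vector's worth of lanes.
     * shuf holds raw (unwrapped) lane indexes stored as floats, as in the test.
     */
    static float[] expected(float[] shuf, float[] src1, float[] src2) {
        int lanes = shuf.length;              // assumed to be a power of two
        float[] res = new float[lanes];
        for (int i = 0; i < lanes; i++) {
            int idx = (int) shuf[i];          // convert to int before range checks
            if (idx < 0 || idx >= lanes) {
                // Exceptional index: wrap into [0, lanes) and read the second source.
                res[i] = src2[idx & (lanes - 1)];
            } else {
                // In-range index: read the first source.
                res[i] = src1[idx];
            }
        }
        return res;
    }
}
```

With four lanes, raw indexes {0, 3, 4, -1} would select src1[0], src1[3], src2[0] and src2[3] respectively.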
sviswa7
left a comment
Looks good to me.
Hi @vnkozlov / @TobiHartmann / @iwanowww / @eme64, please let me know if it's good for integration.
Hi @TobiHartmann, @vnkozlov, please let me know if it's good to integrate.
Please wait for our review and testing.
vnkozlov
left a comment
I have some comments.
const TypeVect * vt = TypeVect::make(elem_bt, num_elem);
const Type * type_bt = Type::get_const_basic_type(elem_bt);
Please remove space between a type and *.
  return true;
}

Node* LibraryCallKit::partially_wrap_indexes(Node* index_vec, int num_elem, BasicType elem_bt) {
Can you add a comment with pseudocode to show what this method does?
    !arch_supports_vector(Op_VectorMaskCmp, num_elem_to, elem_bt_to, VecMaskNotUsed) ||
    !arch_supports_vector(Op_AndV, num_elem_to, elem_bt_to, VecMaskNotUsed) ||
    !arch_supports_vector(Op_Replicate, num_elem_to, elem_bt_to, VecMaskNotUsed))) {
  return false;
Please add log_if_needed(" here too.
res = gvn().transform(VectorNode::make(Op_AndV, res, bcast_mod, vt));
Node * biased_val = gvn().transform(VectorNode::make(Op_SubVB, res, bcast_lane_cnt, vt));
res = gvn().transform(new VectorBlendNode(biased_val, res, mask));
res = partially_wrap_indexes(res, num_elem, elem_bt);
The original spacing here was correct. The one at line 621 is wrong and has to be fixed.
// Note: Unsigned greater than comparison treat both <0 and >VEC_LENGTH indices as out-of-bound
// indexes.
Node* LibraryCallKit::partially_wrap_indexes(Node* index_vec, int num_elem, BasicType elem_bt) {
  assert(elem_bt == T_BYTE, "");
Write message in assert: why is it limited to byte?
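The note in the snippet above about the unsigned comparison can be illustrated in scalar Java. This is only a sketch with hypothetical names, using int rather than the byte lanes the intrinsic operates on: a single unsigned "greater than" against the last valid lane flags both negative and too-large indexes, because a negative value reinterpreted as unsigned is larger than any valid lane index.

```java
final class UnsignedBoundsSketch {
    /**
     * Returns true when index lies outside [0, laneCount - 1].
     * One unsigned comparison replaces the pair (index < 0 || index >= laneCount).
     */
    static boolean isOutOfBounds(int index, int laneCount) {
        return Integer.compareUnsigned(index, laneCount - 1) > 0;
    }
}
```

For an 8-lane vector, indexes 0 and 7 are in bounds, while -1 and 8 are both reported as out of bounds by the same comparison.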
const Type* type_bt = Type::get_const_basic_type(elem_bt);

Node* mod_val = gvn().makecon(TypeInt::make(num_elem-1));
Node* bcast_mod = gvn().transform(VectorNode::scalar2vector(mod_val, num_elem, type_bt));
Naming issue: this is not the result of the mod, so "mod" is a bit misleading. I would use mask, as it is used as a mask in the AndV below.
Also: it seems to me that you are duplicating these 4 lines above from its call-site. I wonder if this means that you are slicing the boundary of your new method right, or if maybe the whole if-else block from the call-site should be a new method?
> Also: it seems to me that you are duplicating these 4 lines above from its call-site. I wonder if this means that you are slicing the boundary of your new method right, or if maybe the whole if-else block from the call-site should be a new method?
The duplication you are pointing out in the code may not translate into duplicated IR, since GVN implicitly shares nodes based on their hash value, which is a function of a node's opcode and inputs.
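To make the sharing argument concrete, here is a toy hash-consing table in Java. It is a sketch of the general value-numbering idea only, not HotSpot's GVN implementation; the class, method and opcode strings are all hypothetical. Requesting a node with the same opcode and the same inputs a second time returns the existing node id instead of creating a new one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ValueNumberingSketch {
    // Key = [opcode, inputs]; List provides structural equals/hashCode.
    private final Map<List<Object>, Integer> table = new HashMap<>();
    private int nextId = 0;

    /** Returns the id of an existing identical node, or registers a new one. */
    int transform(String opcode, Integer... inputs) {
        List<Object> key = List.of(opcode, List.of(inputs));
        Integer existing = table.get(key);
        if (existing != null) {
            return existing;          // identical node already present: share it
        }
        int id = nextId++;
        table.put(key, id);
        return id;
    }

    int nodeCount() {
        return table.size();
    }
}
```

Building "constant 7, then replicate" twice through this table yields the same two ids both times, so the textual duplication does not create additional nodes.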
 * @bug 8332119
 * @summary Incorrect IllegalArgumentException for C2 compiled permute kernel
 * @modules jdk.incubator.vector
 * @requires vm.compiler2.enabled
Is it necessary to restrict this to C2? Maybe this test tickles something for other compilers as well.
 * @requires vm.compiler2.enabled
 * @library /test/lib /
 * @run main/othervm -XX:+UnlockDiagnosticVMOptions -Xbatch -XX:-TieredCompilation -XX:CompileOnly=TestTwoVectorPermute::micro compiler.vectorapi.TestTwoVectorPermute
 * @run main/othervm -XX:+UnlockDiagnosticVMOptions -Xbatch -XX:-TieredCompilation compiler.vectorapi.TestTwoVectorPermute
I would also add a run without -XX:-TieredCompilation, that could lead to different compilation patterns, and increase our test coverage.
public class TestTwoVectorPermute {
    public static final VectorSpecies<Float> FSP = FloatVector.SPECIES_256;

    public static void validate(float [] res, float [] shuf, float [] src1, float [] src2) {
Suggested change:
-    public static void validate(float [] res, float [] shuf, float [] src1, float [] src2) {
+    public static void validate(float[] res, float[] shuf, float[] src1, float[] src2) {
Similar issues below.
vnkozlov
left a comment
Good. Tobias ran testing for v01 and it passed.
|
Please, answer Emanuel's questions/suggestions before integration. |
|
/integrate |
|
Going to push as commit 4c09d9f.
Your commit was automatically rebased without conflicts. |
|
@jatin-bhateja Pushed as commit 4c09d9f. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Currently, the inline expansion of the vector-to-shuffle conversion simply type-casts the vector holding the indexes to a byte vector [1], whereas the fallback implementation [2] also wraps the indexes into the valid index range [0, VEC_LEN-1] or generates a negative index for exceptional / OOB indices.
This patch extends the conversion inline expander to match the fallback implementation. This imposes around a 20% performance tax on the Vector.toShuffle() intrinsic but fixes this functional bug.
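As an illustration of the difference described above, the following scalar Java sketch contrasts a plain cast with partial wrapping. It is not the JDK code: the names are hypothetical, and it assumes a power-of-two lane count and the documented shuffle convention that out-of-range indexes are normalized into the exceptional range [-VEC_LEN, -1].

```java
final class PartialWrapSketch {
    /** Models the old inline expansion: the index is only narrowed to a byte. */
    static byte plainCast(int index) {
        return (byte) index;
    }

    /** Models the fallback behaviour: keep valid indexes, bias the rest negative. */
    static byte partiallyWrap(int index, int laneCount) {
        if (index >= 0 && index < laneCount) {
            return (byte) index;                       // already a valid lane index
        }
        int wrapped = index & (laneCount - 1);         // wrap into [0, laneCount)
        return (byte) (wrapped - laneCount);           // bias into [-laneCount, -1]
    }
}
```

For an 8-lane vector and a raw index of 9, the plain cast leaves 9, which is neither a valid nor an exceptional index, while partial wrapping produces -7.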
Kindly review and share your feedback.
Best Regards,
Jatin
PS: Patch also fixes an incorrectness issue reported with JDK-8332118
[1] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/FloatVector.java#L2352
[2] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/AbstractShuffle.java#L58
Progress
Issue
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19442/head:pull/19442
$ git checkout pull/19442
Update a local copy of the PR:
$ git checkout pull/19442
$ git pull https://git.openjdk.org/jdk.git pull/19442/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 19442
View PR using the GUI difftool:
$ git pr show -t 19442
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19442.diff
Webrev
Link to Webrev Comment