8266720: Wrong implementation in LibraryCallKit::inline_vector_shuffle_iota #3933

Closed


@Wanghuang-Huawei Wanghuang-Huawei commented May 8, 2021

Dear all,
Here is the patch for JDK-8266720. Could you please review it?

  • Reproduce:
    • cherry-pick JDK-8265956
    • run the patch's TestVectorShuffleIotaByteWrongImpl.java
    • That said, the bug is evident from reading the code.
  • Reason:
  1. In the interpreter:
static int partiallyWrapIndex(int index, int laneCount) {
    return checkIndex0(index, laneCount, (byte)-1);
}

@ForceInline
static int checkIndex0(int index, int laneCount, byte mode) {
    int wrapped = VectorIntrinsics.wrapToRange(index, laneCount);
    if (mode == 0 || wrapped == index) { // NOTE here
        return wrapped;
    }
    if (mode < 0) {
        return wrapped - laneCount;  // special mode for internal storage
    }
    throw checkIndexFailed(index, laneCount);
}

@ForceInline
static int wrapToRange(int index, int size) {
    if ((size & (size - 1)) == 0) {
        // Size is zero or a power of two, so we got this.
        return index & (size - 1);
    } else {
        return wrapToRangeNPOT(index, size);
    }
}
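To make the interpreter behaviour concrete, here is a minimal standalone sketch of the code quoted above. It is not the JDK source: the class name `PartialWrapDemo` is hypothetical, only the `mode < 0` path of `checkIndex0` is modelled, and the non-power-of-two branch (`wrapToRangeNPOT`) is deliberately left out.

```java
// Standalone model of the interpreter path quoted above; not the JDK source.
final class PartialWrapDemo {
    // Power-of-two-only stand-in for VectorIntrinsics.wrapToRange
    // (assumption: the non-power-of-two branch is omitted from this sketch).
    static int wrapToRange(int index, int size) {
        if (size <= 0 || (size & (size - 1)) != 0) {
            throw new IllegalArgumentException("sketch handles positive power-of-two sizes only");
        }
        return index & (size - 1);
    }

    // Mirrors checkIndex0 with mode == -1, i.e. partiallyWrapIndex.
    static int partiallyWrapIndex(int index, int laneCount) {
        int wrapped = wrapToRange(index, laneCount);
        if (wrapped == index) {
            return wrapped;            // index was already in [0, laneCount)
        }
        return wrapped - laneCount;    // out of range: stored as a negative value
    }

    public static void main(String[] args) {
        int laneCount = 8;
        for (int index : new int[] {7, 8, 9}) {
            System.out.println(index + " -> " + partiallyWrapIndex(index, laneCount));
        }
    }
}
```

The case that matters for this bug is `index == laneCount`: the wrapped value differs from the index, so the interpreter treats it as out of range and returns `wrapped - laneCount`.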
  2. However, the corresponding intrinsic in
    src/hotspot/share/opto/vectorIntrinsics.cpp [jdk/jdk] is:
  } else {
    ConINode* pred_node = (ConINode*)gvn().makecon(TypeInt::make(1)); // 1 == BoolTest::gt
    Node * lane_cnt  = gvn().makecon(TypeInt::make(num_elem));
    Node * bcast_lane_cnt = gvn().transform(VectorNode::scalar2vector(lane_cnt, num_elem, type_bt));
    // The node's predicate is BoolTest::ge, but pred_node holds 1 (BoolTest::gt): the two disagree.
    Node* mask = gvn().transform(new VectorMaskCmpNode(BoolTest::ge, bcast_lane_cnt, res, pred_node, vt));
  3. In the aarch64 NEON backend, the generated code follows the node's predicate, BoolTest::ge:
// cond is ignored here
instruct vcmge8B(vecD dst, vecD src1, vecD src2, immI cond)
%{
  predicate(n->as_Vector()->length() == 8 &&
            n->as_VectorMaskCmp()->get_predicate() == BoolTest::ge &&
            n->in(1)->in(1)->bottom_type()->is_vect()->element_basic_type() == T_BYTE);
  match(Set dst (VectorMaskCmp (Binary src1 src2) cond));
  format %{ "cmge  $dst, T8B, $src1, $src2\t# vector cmp (8B)" %}
  ins_cost(INSN_COST);
  ins_encode %{
    __ cmge(as_FloatRegister($dst$$reg), __ T8B,
            as_FloatRegister($src1$$reg), as_FloatRegister($src2$$reg));
  %}
  ins_pipe(vdop64);
%}

On x86, however, the backend uses cond (= 1, i.e. BoolTest::gt), so x86 produces the correct result on jdk/jdk:

instruct vcmp(legVec dst, legVec src1, legVec src2, immI8 cond, rRegP scratch) %{
  predicate(vector_length_in_bytes(n->in(1)->in(1)) >=  8 && // src1
            vector_length_in_bytes(n->in(1)->in(1)) <= 32 && // src1
            is_integral_type(vector_element_basic_type(n->in(1)->in(1)))); // src1
  match(Set dst (VectorMaskCmp (Binary src1 src2) cond));
  effect(TEMP scratch);
  format %{ "vector_compare $dst,$src1,$src2,$cond\t! using $scratch as TEMP" %}
  ins_encode %{
    int vlen_enc = vector_length_encoding(this, $src1);
    Assembler::ComparisonPredicate cmp = booltest_pred_to_comparison_pred($cond$$constant);
    Assembler::Width ww = widthForType(vector_element_basic_type(this, $src1));
    __ vpcmpCCW($dst$$XMMRegister, $src1$$XMMRegister, $src2$$XMMRegister, cmp, ww, vlen_enc, $scratch$$Register);
  %}
  ins_pipe( pipe_slow );
%}
  4. In the panama-vector repo, both backends are wrong, because there the IR uses BoolTest::ge for both the node predicate and the constant operand:
  } else {
    ConINode* pred_node = (ConINode*)gvn().makecon(TypeInt::make(BoolTest::ge)); // WRONG here
    Node * lane_cnt  = gvn().makecon(TypeInt::make(num_elem));
    Node * bcast_lane_cnt = gvn().transform(VectorNode::scalar2vector(lane_cnt, num_elem, type_bt));
    Node* mask = gvn().transform(new VectorMaskCmpNode(BoolTest::ge, bcast_lane_cnt, res, pred_node, vt));
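The effect of the two predicates can be illustrated with a scalar, lane-by-lane model. This is only a sketch under stated assumptions, not HotSpot code: the class `IotaPredicateDemo` and its methods are hypothetical, the blend polarity (keep `wrapped` where the mask is set, otherwise `wrapped - laneCount`) is inferred from the interpreter semantics above, and only non-negative indices with a power-of-two lane count are considered.

```java
// Scalar, lane-by-lane model of the mask-and-blend step; not HotSpot code.
final class IotaPredicateDemo {
    // Reference result, matching partiallyWrapIndex for a power-of-two laneCount.
    static int expected(int index, int laneCount) {
        int wrapped = index & (laneCount - 1);
        return wrapped == index ? wrapped : wrapped - laneCount;
    }

    // Assumed model: a lane keeps `wrapped` where the mask is set and takes
    // `wrapped - laneCount` elsewhere. useGe selects laneCount >= index
    // (BoolTest::ge) instead of laneCount > index (BoolTest::gt).
    static int blend(int index, int laneCount, boolean useGe) {
        int wrapped = index & (laneCount - 1);
        boolean mask = useGe ? laneCount >= index : laneCount > index;
        return mask ? wrapped : wrapped - laneCount;
    }

    public static void main(String[] args) {
        int laneCount = 8;
        for (int index = 0; index < 2 * laneCount; index++) {
            int want = expected(index, laneCount);
            int ge = blend(index, laneCount, true);
            if (ge != want) {
                System.out.println("ge differs at index " + index
                        + ": got " + ge + ", expected " + want);
            }
        }
    }
}
```

In this model the gt variant agrees with the reference for every index in the tested range, while the ge variant diverges only at `index == laneCount`, where it yields 0 instead of `-laneCount`.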

Yours,
Wang Huang


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8266720: Wrong implementation in LibraryCallKit::inline_vector_shuffle_iota

Contributors

  • Wang Huang <whuang@openjdk.org>
  • Ai Jiaming <aijiaming1@huawei.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3933/head:pull/3933
$ git checkout pull/3933

Update a local copy of the PR:
$ git checkout pull/3933
$ git pull https://git.openjdk.java.net/jdk pull/3933/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3933

View PR using the GUI difftool:
$ git pr show -t 3933

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3933.diff

@bridgekeeper

bridgekeeper bot commented May 8, 2021

👋 Welcome back whuang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@Wanghuang-Huawei
Author

/contributor add Wang Huang whuang@openjdk.org
/contributor add Ai Jiaming aijiaming1@huawei.com

@openjdk openjdk bot added the rfr Pull request is ready for review label May 8, 2021
@openjdk

openjdk bot commented May 8, 2021

@Wanghuang-Huawei
Contributor Wang Huang <whuang@openjdk.org> successfully added.

@openjdk

openjdk bot commented May 8, 2021

@Wanghuang-Huawei
Contributor Ai Jiaming <aijiaming1@huawei.com> successfully added.

@openjdk

openjdk bot commented May 8, 2021

@Wanghuang-Huawei The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label May 8, 2021
@mlbridge

mlbridge bot commented May 8, 2021

Webrevs

@Wanghuang-Huawei
Author

I am closing this pull request and will fix the issue in panama-vector instead, since #3803 has not been merged.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org rfr Pull request is ready for review
1 participant