8303161: [vectorapi] VectorMask.cast narrow operation returns incorrect value with SVE #12901

Closed
wants to merge 2 commits into from

Conversation

Bhavana-Kilambi
Contributor

@Bhavana-Kilambi Bhavana-Kilambi commented Mar 7, 2023

The cast operation for VectorMask from a wider type to a narrower type returns an incorrect result from trueCount() on the resultant mask with SVE (on some SVE machines toLong() also returns incorrect values). An example narrow operation that produces incorrect toLong() and trueCount() values is shown below for a 128-bit -> 64-bit conversion. The same problem extends to other narrow operations where the source mask in bytes is 4x or 8x the size of the result mask in bytes:

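import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.LongVector;
import jdk.incubator.vector.VectorMask;

public class TestMaskCast {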

    static final boolean [] mask_arr = {true, true, false, true};

    public static long narrow_long() {
        VectorMask<Long> lmask128 = VectorMask.fromArray(LongVector.SPECIES_128, mask_arr, 0);
        return lmask128.cast(IntVector.SPECIES_64).toLong();
    }

    public static void main(String[] args) {
        long r = 0L;
        for (int ic = 0; ic < 50000; ic++) {
            r = narrow_long();
        }
        System.out.println("toLong() :  " + r);
    }
}

C2 compilation result :
java --add-modules jdk.incubator.vector TestMaskCast
toLong() :  15

Interpreter result (for verification) :
java --add-modules jdk.incubator.vector -Xint TestMaskCast
toLong() :  3

The incorrect results with toLong() have been observed only on 128-bit and 256-bit SVE machines and are not reproducible on a 512-bit machine. However, trueCount() also returns incorrect values, and these are reproducible on all SVE machines, so trueCount() is the more reliable way to expose the problem in the current implementation of the mask cast narrow operation for SVE.

Replacing the call to toLong() with trueCount() in the above example:

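import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.LongVector;
import jdk.incubator.vector.VectorMask;

public class TestMaskCast {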

    static final boolean [] mask_arr = {true, true, false, true};

    public static int narrow_long() {
        VectorMask<Long> lmask128 = VectorMask.fromArray(LongVector.SPECIES_128, mask_arr, 0);
        return lmask128.cast(IntVector.SPECIES_64).trueCount();
    }

    public static void main(String[] args) {
        int r = 0;
        for (int ic = 0; ic < 50000; ic++) {
            r = narrow_long();
        }
        System.out.println("trueCount() :  " + r);
    }
}


C2 compilation result:
java --add-modules jdk.incubator.vector TestMaskCast
trueCount() :  4

Interpreter result:
java --add-modules jdk.incubator.vector -Xint TestMaskCast
trueCount() :  2

In this example the source mask is 2x the size of the result mask in bytes, so trueCount() returns 2x the number of true elements in the source mask. It would return 4x or 8x the number of true elements if the source mask were 4x or 8x the size of the result mask.

The returned values are incorrect because the higher-order bits of the result are not cleared after narrowing, and trueCount() and toLong() also take those higher-order bits of the vector register into account. For the 128-bit to 64-bit conversion with the mask "TT", the current implementation of the mask cast narrow operation leaves the same mask in both the lower and upper halves of the 128-bit register, that is "TTTT". This yields a long value of 15 (instead of 3, "FFTT", for the 64-bit Integer mask) and a true-element count of 4 (instead of 2).
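The arithmetic in the paragraph above can be reproduced with a small plain-Java model. This is only an illustrative sketch, not the C2 or SVE code touched by the patch: the class and method names are hypothetical, lane flags are assumed to be packed into a long with bit i standing for lane i, and the "replicas" parameter is an assumption that models the source pattern being repeated in the wider register.

```java
public class MaskNarrowModel {

    // Hypothetical model of the stale register contents: the source lane
    // pattern is repeated "replicas" times (2 for the 128-bit -> 64-bit case).
    static long narrowWithoutClearing(long srcBits, int lanes, int replicas) {
        long result = 0L;
        for (int r = 0; r < replicas; r++) {
            result |= srcBits << (r * lanes);
        }
        return result;
    }

    // Keep only the bits that belong to the lanes of the narrowed species.
    static long clearUpperLanes(long bits, int lanes) {
        return bits & ((1L << lanes) - 1);
    }

    public static void main(String[] args) {
        long src = 0b11L; // "TT": both lanes of the source mask are set
        int lanes = 2;    // the 64-bit int species also has two lanes

        long stale = narrowWithoutClearing(src, lanes, 2);
        long fixed = clearUpperLanes(stale, lanes);

        // Prints 15 and 4 for the stale pattern, 3 and 2 after clearing.
        System.out.println("stale toLong = " + stale + ", trueCount = " + Long.bitCount(stale));
        System.out.println("fixed toLong = " + fixed + ", trueCount = " + Long.bitCount(fixed));
    }
}
```

Passing 4 or 8 as the replica count in this model gives the 4x and 8x inflation of trueCount() mentioned earlier, as long as the total number of modelled lanes stays below 64.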

This patch proposes a fix for this problem. The existing JTREG IR test "test/hotspot/jtreg/compiler/vectorapi/VectorMaskCastTest.java" has been modified to also call trueCount(), since toString() alone cannot reproduce the incorrect values in this bug. The test passes on 128-bit, 256-bit and 512-bit SVE machines. Because the IR test has changed, it has also been run successfully on other platforms such as x86 and AArch64 NEON machines to ensure the changes have not introduced any new errors.
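For readers who want to check the interpreter values by hand, the following is a minimal scalar reference, assuming the usual convention that bit i of toLong() corresponds to lane i. It is not the code in VectorMaskCastTest.java; the class and helper names are hypothetical, and it deliberately avoids the Vector API so it runs without --add-modules.

```java
public class MaskReference {

    // Scalar reference for toLong(): bit i is set when lane i is true.
    static long expectedToLong(boolean[] flags, int offset, int lanes) {
        long bits = 0L;
        for (int i = 0; i < lanes; i++) {
            if (flags[offset + i]) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Scalar reference for trueCount(): number of true lanes.
    static int expectedTrueCount(boolean[] flags, int offset, int lanes) {
        int count = 0;
        for (int i = 0; i < lanes; i++) {
            if (flags[offset + i]) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        boolean[] maskArr = {true, true, false, true};
        int lanes = 2; // the 128-bit long species and the 64-bit int species both have two lanes

        // Prints 3 and 2, matching the -Xint results reported above.
        System.out.println("expected toLong    = " + expectedToLong(maskArr, 0, lanes));
        System.out.println("expected trueCount = " + expectedTrueCount(maskArr, 0, lanes));
    }
}
```

A check along these lines could compare the reference values against the results of the cast mask, which catches stale upper bits that a toString() comparison does not.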


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8303161: [vectorapi] VectorMask.cast narrow operation returns incorrect value with SVE

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12901/head:pull/12901
$ git checkout pull/12901

Update a local copy of the PR:
$ git checkout pull/12901
$ git pull https://git.openjdk.org/jdk.git pull/12901/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 12901

View PR using the GUI difftool:
$ git pr show -t 12901

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12901.diff

@bridgekeeper

bridgekeeper bot commented Mar 7, 2023

👋 Welcome back bkilambi! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 7, 2023
@openjdk

openjdk bot commented Mar 7, 2023

@Bhavana-Kilambi The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Mar 7, 2023
@mlbridge

mlbridge bot commented Mar 7, 2023

Webrevs

Member

@e1iu e1iu left a comment

Already reviewed internally.


@XiaohongGong XiaohongGong left a comment

Looks good to me!

@Bhavana-Kilambi
Contributor Author

/integrate

@openjdk

openjdk bot commented Mar 23, 2023

@Bhavana-Kilambi This pull request has not yet been marked as ready for integration.

@openjdk

openjdk bot commented Mar 29, 2023

@Bhavana-Kilambi This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8303161: [vectorapi] VectorMask.cast narrow operation returns incorrect value with SVE

Reviewed-by: eliu, xgong, ngasson

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 19 new commits pushed to the master branch:

  • e56bcb0: 8305095: Update java/net/httpclient/CustomRequestPublisher.java to use new HttpTestServer factory methods
  • 0985288: 8304681: compiler/sharedstubs/SharedStubToInterpTest.java fails after JDK-8304387
  • ff368d5: 8304867: Explicitly disable dtrace for ppc builds
  • 96fa275: 8305112: RISC-V: Typo fix for RVC description
  • 7239150: 8305094: typo (missing *) in doc comment
  • 3fbbfd1: 8301995: Move invokedynamic resolution information out of ConstantPoolCacheEntry
  • 50a995f: 8304927: Update java/net/httpclient/BasicAuthTest.java to check basic auth over HTTP/2
  • ca745cb: 8291598: Matcher.appendReplacement should not create new StringBuilder instances
  • 1683a63: 8305098: [Backout] JDK-8303912 Clean up JavadocTokenizer
  • fab2357: 8304498: JShell does not switch to raw mode when there is no /bin/test
  • ... and 9 more: https://git.openjdk.org/jdk/compare/cddaf686e16424e9543be50a48b1c02337e79cf1...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@theRealELiu, @XiaohongGong, @nick-arm) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 29, 2023
@Bhavana-Kilambi
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Mar 29, 2023
@openjdk

openjdk bot commented Mar 29, 2023

@Bhavana-Kilambi
Your change (at version ccb23e2) is now ready to be sponsored by a Committer.

@nick-arm
Contributor

/sponsor

@openjdk

openjdk bot commented Mar 29, 2023

Going to push as commit 6727490.
Since your change was applied there have been 22 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 29, 2023
@openjdk openjdk bot closed this Mar 29, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Mar 29, 2023
@openjdk

openjdk bot commented Mar 29, 2023

@nick-arm @Bhavana-Kilambi Pushed as commit 6727490.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
4 participants