Skip to content

8283598: [vectorapi] Add new vector operation for compress and expand bits #184

Closed
smita-kamath wants to merge 8 commits intoopenjdk:vectorIntrinsicsfrom
smita-kamath:compressbits
Closed

8283598: [vectorapi] Add new vector operation for compress and expand bits #184
smita-kamath wants to merge 8 commits intoopenjdk:vectorIntrinsicsfrom
smita-kamath:compressbits

Conversation

@smita-kamath
Copy link
Contributor

@smita-kamath smita-kamath commented Mar 24, 2022

Hi,

I've added support for new vector operations for compressing bits of integral vector types(Byte/Short/Integer/Long).
The implementation is based on Compress or Generalized Extract mentioned in Hackers Delight by Henry S. Warren, Jr.
The implementation does the following: given a mask and the number to be compressed, the bits of the number corresponding to the set mask bit are selected and compressed.

Currently, this PR addresses only Java changes for compress bits operation. I've also updated the test framework.
Do review and share feedback.


Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed

Issue

  • JDK-8283598: [vectorapi] Add new vector operation for compress and expand bits

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/panama-vector pull/184/head:pull/184
$ git checkout pull/184

Update a local copy of the PR:
$ git checkout pull/184
$ git pull https://git.openjdk.java.net/panama-vector pull/184/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 184

View PR using the GUI difftool:
$ git pr show -t 184

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/panama-vector/pull/184.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 24, 2022

👋 Welcome back svkamath! A progress list of the required criteria for merging this PR into vectorIntrinsics will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@smita-kamath smita-kamath marked this pull request as ready for review March 24, 2022 06:35
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 24, 2022
@mlbridge
Copy link

mlbridge bot commented Mar 24, 2022

Comment on lines 452 to 470
a = (byte) (a & bitmask); // Clear irrelevant bits
byte count_mask = (byte) (~bitmask << 1); // Count 0's to right

// Prefix mask identifies bits of bitmask that have odd number of 0's to the right
// Move mask identifies the bits to be moved
// temp identifies the bits of the given number to be moved
byte prefix_mask, move_mask, temp;
int iters = 3;

for (int i = 0; i < iters; i++) {
prefix_mask = (byte) (count_mask ^ (count_mask << 1)); // Parallel prefix
prefix_mask = (byte) (prefix_mask ^ (prefix_mask << 2));
prefix_mask = (byte) (prefix_mask ^ (prefix_mask << 4));
move_mask = (byte) (prefix_mask & bitmask); // Bits to move
bitmask = (byte)(bitmask ^ move_mask | (move_mask >> (1 << i))); // Compress bitmask
temp = (byte) (a & move_mask); // Bits of the number a to be moved.
a = (byte) (a ^ temp | (temp >> (1 << i))); // Compress a
count_mask = (byte) (count_mask & ~prefix_mask); // adjust count_mask by identifying bits that have 0 to the right
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @smita-kamath , can we simply count the bits number and then do a left shift on the result, e.g:

1) b =  a & bitmask
2) b = popcount(b)
3) result = (0x1 << b) - 1

Please correct me if I misunderstood something. Thanks so much!

Copy link
Member

@merykitty merykitty Mar 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XiaohongGong The compression is based on the bits of bitmask, not on the bits of b, so bitcompress(0b1001, 0b1100) = 0b10, bitcompress(0b0101, 0b1011) = 0b001 (The number of bits in the result I write to be equal to the number of set bits in the mask for easier visualisation).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, thanks for the explanation!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think it's better to comment an example to the scalar method or API definition.

@merykitty
Copy link
Member

Hi, what do you think about adding this operation for scalar first, and then adding it to vector api later. We could add several other bit manipulation operations such as bitCount and numberOfLeadingZeros for Byte and Short also.
Thanks.

@smita-kamath
Copy link
Contributor Author

Hi @merykitty, That is a good idea. However, the effort to bring it to scalar is much larger as we need to file a JEP and follow the required processes. This is why we plan to introduce it in Vector API first.

/* Implementation note: The implementation is based on Compress or Generalized Extract mentioned in
* Henry S. Warren, Jr's Hackers Delight, Addison Wesley, 2002.
*/
static byte compressBits(byte a, byte bitmask) {
Copy link
Member

@e1iu e1iu Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the 4th incubation(https://bugs.openjdk.java.net/browse/JDK-8280173), the compress bit should be similar to vector mask compress(CompressM). As per my understanding, the second argument bitmask should be the same as a. If so, it also can be removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think i mistook the intended implementation when writing the JEP.

Since there is no scalar equivalent the operation is currently under specified, and in this case I think the expand operation also makes sense so compress(expand(x, m), m) = x

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just talked with @sviswa7. I will update the JEP. Also, i will follow up on scalar integral implementations for compress/expand bits, where the Byte/Short implementations defer to that on Integer. Intrinsic scalar implementations can be implemented afterwards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @PaulSandoz, do you think this implementation of compress bits is fine? I can plan to create a separate PR for expand bits operation. Do let me know. Thanks.

Copy link
Member

@e1iu e1iu Mar 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think i mistook the intended implementation when writing the JEP.

Since there is no scalar equivalent the operation is currently under specified, and in this case I think the expand operation also makes sense so compress(expand(x, m), m) = x

If the true count of m is less than the significant bits of x , I think x will lost bits so that this equation doesn't work. E.g., m = 0b00000001, x = 0b11111111. It's 0b00000001 after expand(x, m) , and if compressed it back with the same bitmask m, it will get 0b00000001.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks, it only works for certain cases, i did not think it through carefully.

// CompressBits placeholder node
class CompressBitsNode : public Node {
public:
CompressBitsNode( Node *in1, Node *in2 ) : Node(0,in1,in2) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: the spaces are not needed after "(" and before ")". And spaces are needed between the arguments. And "*" is better to be next close to the type. So suggest to modify as:
CompressBitsNode(Node* in1, Node* in2) : Node(0, in1, in2) {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Will fix this.

#end[long]
#end[intOrLong]
move_mask = ($type$) (prefix_mask & bitmask); // Bits to move
bitmask = ($type$)(bitmask ^ move_mask | (move_mask >> (1 << i))); // Compress bitmask
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: space between ($type$)(bitmask ^ move_mask | ...)

Comment on lines 472 to 478
#if[intOrLong]
#if[int]
int iters = 5;
#else[int]
int iters = 6;
#end[int]
#end[intOrLong]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not simply with:

#if[int]
        int iters = 5;
#end[int]
#if[long]
        int iters = 6;
#end[long]

#end[long]
#end[intOrLong]
move_mask = ($type$) (prefix_mask & b);
b = ($type$)(b ^ move_mask | (move_mask >> (1 << i)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: add one space between ($type$)(b ^ move_mask ...)

$type$ prefix_mask, move_mask, temp;
a = ($type$) (a & b);
$type$ count_mask = ($type$) (~b << 1);
$type$ mp, mv, t;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are not used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. Will fix it.

Comment on lines 534 to 538
case T_BYTE: // Returning Op_CompressBits for
case T_SHORT:// all types temporarily.
case T_INT:
case T_LONG: return Op_CompressBits;
default: fatal("COMPRESS_BITS: %s", type2name(bt));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The identity issue "three spaces before "case"". Seems the above bitcount and reverse have the same issue. So please fix them together. Thanks!

@PaulSandoz
Copy link
Member

Here's the issue for scalar compress and expand bits JDK-8283892

@e1iu
Copy link
Member

e1iu commented Mar 30, 2022

Here's the issue for scalar compress and expand bits JDK-8283892

Thanks for the update. That makes more sense to me now.

Copy link
Member

@PaulSandoz PaulSandoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend moving the scalar compress implementations out of the template and into say a new class CompressExpand, that will make it easier to replace later on. For the test code do something similar, place them in a separate class duplicating that in the vector API package (since the tests do not depend on internals of the API).

I started work on the scalar implementations here based on what you did, but tried to make the code more idiomatic, so maybe copy those? (Note these have yet to be be tested.)

@smita-kamath
Copy link
Contributor Author

@PaulSandoz, sounds good. Thank you.

2) Added expand bits operation for int and long 3) Addressed review comments about code style 4) updated tests
@sviswa7
Copy link
Collaborator

sviswa7 commented Apr 12, 2022

@smita-kamath Please update the summary at the top of the PR to reflect the latest:

  1. Both compress and expand are implemented
  2. Support for only int/long vectors
    Also it will be good to update the title of JDK-8283598 to reflect that expand is also implemented as part of this PR.

@jatin-bhateja
Copy link
Member

@smita-kamath Please update the summary at the top of the PR to reflect the latest:

  1. Both compress and expand are implemented
  2. Support for only int/long vectors
    Also it will be good to update the title of JDK-8283598 to reflect that expand is also implemented as part of this PR.

@sviswa7 , any specific reason to drop byte/short versions for these APIs

@smita-kamath smita-kamath changed the title 8283598: [vectorapi] Add new vector operation for compress bits 8283598: [vectorapi] Add new vector operation for compress and expand bits Apr 12, 2022
@openjdk
Copy link

openjdk bot commented Apr 12, 2022

@smita-kamath this pull request can not be integrated into vectorIntrinsics due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout compressbits
git fetch https://git.openjdk.java.net/panama-vector vectorIntrinsics
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge vectorIntrinsics"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Apr 12, 2022
@sviswa7
Copy link
Collaborator

sviswa7 commented Apr 12, 2022

@jatin-bhateja byte/short support for compress/expand were dropped from Vector API on advice from Paul Sandoz.
@smita-kamath Please bring in code from Paul's PR https://git.openjdk.java.net/jdk/pull/8115 and use Integer and Long compress expand.

@@ -0,0 +1,8647 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should not be part of the commit, nor should perf_scalar_tests.template and perf_tests.template

/* Implementation note: The implementation is based on Compress or Generalized Extract mentioned in
* Henry S. Warren, Jr's Hackers Delight, Addison Wesley, 2002.
*/
static $type$ compressBits($type$ a, $type$ mask) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to make these part of the template. Copy from openjdk/jdk#8115 as Sandhya indicated and place them in a package private separate class, such as CompressExpand. Do the same for the tests and use directly. Then the code will be easier to replace.

@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Apr 13, 2022
Copy link
Collaborator

@sviswa7 sviswa7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest of the patch looks good to me.

return ($type$)((((($type$)a) & $Boxtype$.toUnsignedInt(($type$)-1)) >>> (n & $Boxtype$.SIZE-1)) | (((($type$)a) & $Boxtype$.toUnsignedInt(($type$)-1)) << ($Boxtype$.SIZE - (n & $Boxtype$.SIZE-1))));
#end[intOrLong]
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra empty line could be removed from template.

#end[intOrLong]
}


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra empty line could be removed from template.

@openjdk
Copy link

openjdk bot commented Apr 14, 2022

@smita-kamath This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8283598: [vectorapi] Add new vector operation for compress and expand bits

Reviewed-by: psandoz, sviswanathan

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the vectorIntrinsics branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@PaulSandoz, @sviswa7) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 14, 2022
Comment on lines 406 to 407
gen_binary_alu_op "COMPRESS_BITS" "CompressExpandTest.compress(a,b)" "intOrLong"
gen_binary_alu_op "EXPAND_BITS" "CompressExpandTest.expand(a,b)" "intOrLong"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
gen_binary_alu_op "COMPRESS_BITS" "CompressExpandTest.compress(a,b)" "intOrLong"
gen_binary_alu_op "EXPAND_BITS" "CompressExpandTest.expand(a,b)" "intOrLong"
gen_binary_alu_op "COMPRESS_BITS" "CompressExpandTest.compress(a, b)" "intOrLong"
gen_binary_alu_op "EXPAND_BITS" "CompressExpandTest.expand(a, b)" "intOrLong"

@smita-kamath
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Apr 14, 2022
@openjdk
Copy link

openjdk bot commented Apr 14, 2022

@smita-kamath
Your change (at version 7008750) is now ready to be sponsored by a Committer.

@sviswa7
Copy link
Collaborator

sviswa7 commented Apr 14, 2022

/sponsor

@openjdk
Copy link

openjdk bot commented Apr 14, 2022

Going to push as commit 5fe7992.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 14, 2022
@openjdk openjdk bot closed this Apr 14, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Apr 14, 2022
@openjdk
Copy link

openjdk bot commented Apr 14, 2022

@sviswa7 @smita-kamath Pushed as commit 5fe7992.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

7 participants