Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8283892: Compress and expand bits #8115

Closed
wants to merge 21 commits into from

Conversation

PaulSandoz
Copy link
Member

@PaulSandoz PaulSandoz commented Apr 5, 2022

Add support to compress bits and expand bits of int and long values, see Hacker's Delight (2nd edition), section 7.4.

Compressing or expanding bits of an int or long value can be composed to enable general permutations, and the "sheep and goats" operation (SAG) see Hacker's Delight (2nd edition), section 7.7. SAG can be used to perform a stable binary radix sort.

The compress and expand functionality maps efficiently to hardware instructions, such as PEXT and PDEP on x86 hardware. Thus the implementations can be very efficient on supporting hardware. Intrinsification will occur in a separate PR.

This paper investigates the beneficial performance impact of the PDEP instruction, and by extension the expand method, when applied to the implementation of a bit-vector select operation in succinct data structures (for example select(r) returns the position of the rth 1).

Testing-wise the approach take is three fold:

  1. Tests compared against simple implementations that are easy to read and verify against the JDK implementations (which later will also be made intrinsic). To compensate all tests are also run flipping the test methods and the methods under test.
  2. Tests composed of compress and expand and vice versa.
  3. Tests with known mask patterns, whose expected values are easily derived from the inputs.

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed
  • Change requires a CSR request to be approved

Issues

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/8115/head:pull/8115
$ git checkout pull/8115

Update a local copy of the PR:
$ git checkout pull/8115
$ git pull https://git.openjdk.java.net/jdk pull/8115/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 8115

View PR using the GUI difftool:
$ git pr show -t 8115

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/8115.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 5, 2022

👋 Welcome back psandoz! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 5, 2022
@openjdk
Copy link

openjdk bot commented Apr 5, 2022

@PaulSandoz The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Apr 5, 2022
@mlbridge
Copy link

mlbridge bot commented Apr 5, 2022

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Apr 5, 2022
* specified {@code int} value, {@code i}, in accordance with
* the specified bit mask.
* <p>
* For each one-bit value of the mask, {@code mb} say, from least
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor comments is that "For each one-bit value of the mask mb ...." might be a bit better, otherwise I think these methods and their javadoc looks good. If it comes up then these methods could include an example in the javadoc as they aren't hard once you see an example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change to "For each one-bit value {@code mb} of the mask ..."

test/jdk/java/lang/AbstractCompressExpandTest.java Outdated Show resolved Hide resolved

abstract long expectedExpand(long i, long mask);

static int SIZE = 1024;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like SIZE should be a constructor parameter, to make it easier to run ad hoc stress tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's tricky because the test is TestNG based, launched via jtreg, it would need to be a system property.

* value contiguously starting from the least significant bit location.
* All the upper remaining bits of the compressed value are set
* to zero.
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following Alan's comment, I suggest the following examples:

  compress(0x9ABCDEF, 0x0F0F0FF) == 0xACEF
  compress(x, 1<<n) == (x>>n & 1)
  compress(x, -1<<n) == x >>> n
  compress(m, m) == (m==-1||m==0)? m : (1<<bitCount(m))-1
  compress(x, m) == compress(x & m, m)
  compress(expand(x, m), m) == x & compress(m, m)
  (compress(x, m) >>> n) & 1 == /*the bit of x corresponding to the nth set bit in m*/

…In two places. Similarly, examples for expand:

  expand(0x9ABCDEF, 0x0F0F0FF) == 0xC0D0EF
  expand(x, 1<<n) == (x&1) << n
  expand(x, -1<<n) == x << n
  expand(-1, m) == m
  expand(x, m) == expand(x, m) & m
  expand(compress(x, m), m) == x & m
  expand(1<<n, m) == /*the nth set bit in m, as a mask in place; cf. highest/lowestOneBit*/

(Please double check these examples!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examples added in latest commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you've added a comment on each example too, I think this really helpers readers to see how they can be used.

The class description of both Integer and Long have an implementation note (they pre-date @implNote) referencing Hacker's Delight. We could potentially expand the list of methods mentioned but it's not strictly necessary are they are just examples.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also saw that on the class doc and considered the new methods fit neatly under the category of "bit twiddling" methods.

@cl4es
Copy link
Member

cl4es commented Apr 7, 2022

I experimented with this and drafted a few microbenchmarks along and a micro-optimization to the expand methods (see #8146). Up to you, but I think it makes sense to consider such optimizations to the plain java implementation given that 1) not all platforms have specialized instructions where an intrinsic will make sense and 2) it appears to be a boost to warmup.

@PaulSandoz
Copy link
Member Author

@cl4es thanks, your changes to expand looks reasonable, given the array allocation cost.

@openjdk
Copy link

openjdk bot commented Apr 7, 2022

⚠️ @PaulSandoz This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The insertion of "hexadecimal" makes it clearer, good.

@openjdk openjdk bot removed the csr Pull request needs approved CSR before integration label Apr 14, 2022
@openjdk
Copy link

openjdk bot commented Apr 14, 2022

@PaulSandoz This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8283892: Compress and expand bits

Reviewed-by: alanb, redestad

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 171 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 14, 2022
@PaulSandoz
Copy link
Member Author

/integrate

@openjdk
Copy link

openjdk bot commented Apr 14, 2022

Going to push as commit fbb0916.
Since your change was applied there have been 171 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 14, 2022
@openjdk openjdk bot closed this Apr 14, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 14, 2022
@openjdk
Copy link

openjdk bot commented Apr 14, 2022

@PaulSandoz Pushed as commit fbb0916.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@PaulSandoz PaulSandoz deleted the compress-expand-bits branch April 14, 2022 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
4 participants