Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8188044: We need Math.unsignedMultiplyHigh #4644

Closed
wants to merge 3 commits into from

Conversation

@bplb
Copy link
Member

@bplb bplb commented Jun 30, 2021

Please consider this proposal to add a method unsignedMultiplyHigh(long,long) to each of java.lang.Math and java.lang.StrictMath.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4644/head:pull/4644
$ git checkout pull/4644

Update a local copy of the PR:
$ git checkout pull/4644
$ git pull https://git.openjdk.java.net/jdk pull/4644/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4644

View PR using the GUI difftool:
$ git pr show -t 4644

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4644.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jun 30, 2021

👋 Welcome back bpb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Jun 30, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jun 30, 2021

@bplb The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs label Jun 30, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jun 30, 2021

@bplb
Copy link
Member Author

@bplb bplb commented Jun 30, 2021

This PR does not address intrinsics for the proposed method; that aspect can be handled subsequently.

@bplb
Copy link
Member Author

@bplb bplb commented Jun 30, 2021

/csr

@openjdk openjdk bot added the csr label Jun 30, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jun 30, 2021

@bplb this pull request will not be integrated until the CSR request JDK-8269705 for issue JDK-8188044 has been approved.

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Looks good.

@eamonnmcmanus
Copy link
Member

@eamonnmcmanus eamonnmcmanus commented Jul 1, 2021

Long already has some unsigned arithmetic methods: divideUnsigned, compareUnsigned, toUnsignedString. Math doesn't. Would it make more sense to add the new method to Long? That would also avoid the quirk of having distinct methods in Math and StringMath when the numbers in involved are integers.

@bplb
Copy link
Member Author

@bplb bplb commented Jul 1, 2021

Yes, I am aware of those methods on Long. We do however already have Math.multiplyHigh() so adding the new method to Math would be consistent.

@eamonnmcmanus
Copy link
Member

@eamonnmcmanus eamonnmcmanus commented Jul 1, 2021

Oh right. Then obviously this is the right place for the new method. Sorry for the noise.

long x0 = x & 0xffffffffL;
long x1 = x >>> 32;
long y0 = y & 0xffffffffL;
long y1 = y >>> 32;

long t = x1 * y0 + ((x0 * y0) >>> 32);
long z0 = x0 * y1 + (t & 0xffffffffL);
long z1 = t >>> 32;

return x1 * y1 + z1 + (z0 >>> 32);
Copy link
Contributor

@theRealAph theRealAph Jul 2, 2021

Suggested change
long x0 = x & 0xffffffffL;
long x1 = x >>> 32;
long y0 = y & 0xffffffffL;
long y1 = y >>> 32;
long t = x1 * y0 + ((x0 * y0) >>> 32);
long z0 = x0 * y1 + (t & 0xffffffffL);
long z1 = t >>> 32;
return x1 * y1 + z1 + (z0 >>> 32);
long result = Math.multiplyHigh(x, y);
if (x < 0) result += y;
if (y < 0) result += x;
return result;

Copy link
Contributor

@theRealAph theRealAph Jul 2, 2021

This is just subtracting the 2's-complement offset. I guess the idea, longer term, is that this be an intrinsic anyway, but if you do unsignedMultiplyHigh this way you'll utilize the existing multiplyHigh intrinsic on all platforms that have it.

Copy link
Contributor

@adinn adinn Jul 2, 2021

You can also do that branchlessly which might prove better

     long result = Math.multiplyHigh(x, y);
     result += (y & (x >> 63));
     result += (x & (y >> 63));
     return result;

Copy link
Contributor

@theRealAph theRealAph Jul 2, 2021

You can also do that branchlessly which might prove better

     long result = Math.multiplyHigh(x, y);
     result += (y & (x >> 63));
     result += (x & (y >> 63));
     return result;

I doubt very much that it would be better, because these days branch prediction is excellent, and we also have conditional select instructions. Exposing the condition helps C2 to eliminate it if the range of args is known. The if code is easier to understand.

Benchmark results, with one of the operands changing signs every iteration, 1000 iterations:

Benchmark                  Mode  Cnt     Score    Error  Units
MulHiTest.mulHiTest1   (aph)     avgt    3  1570.587 ± 16.602  ns/op
MulHiTest.mulHiTest2   (adinn)   avgt    3  2237.637 ±  4.740  ns/op

In any case, note that with this optimization the unsigned mulHi is in the nanosecond range, so Good Enough. IMO.

Copy link
Contributor

@theRealAph theRealAph Jul 2, 2021

But weirdly, it's the other way around on AArch64, but there's little in it:

Benchmark             Mode  Cnt     Score   Error  Units
MulHiTest.mulHiTest1  avgt    3  1492.108 ± 0.301  ns/op
MulHiTest.mulHiTest2  avgt    3  1219.521 ± 1.516  ns/op

but this is only in the case where we have unpredictable branches. Go with simple and easy to understand; it doesn't much matter.

@bplb bplb force-pushed the Math-unsignedMultiplyHigh-8188044 branch from 2ad5e39 to a499b2e Jul 2, 2021
@openjdk openjdk bot removed the csr label Jul 2, 2021
jddarcy
jddarcy approved these changes Jul 2, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jul 2, 2021

@bplb This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8188044: We need Math.unsignedMultiplyHigh

Reviewed-by: rriggs, aph, darcy

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 174 new commits pushed to the master branch:

  • 3d84398: 8268364: jmethod clearing should be done during unloading
  • 53ad903: 8269135: TestDifferentProtectionDomains runs into timeout in client VM
  • f8bcbf0: 8269596: Snapshot soft ref policy before marking/copying
  • 4107dcf: 8269466: Factor out the common code for initializing and starting internal VM JavaThreads
  • 2baf498: 8269743: test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/stress/jni/nativeAndMH/Test.java crash with small heap (-Xmx50m)
  • 589f084: 8269110: ZGC: Remove dead code in zBarrier
  • b0e1867: Merge
  • a4d2a9a: 8269745: [JVMCI] restore original qualified exports to Graal
  • e377397: 8268566: java/foreign/TestResourceScope.java timed out
  • 6c76e77: 8260684: vmTestbase/gc/gctests/PhantomReference/phantom002/TestDescription.java timed out
  • ... and 164 more: https://git.openjdk.java.net/jdk/compare/cd20c01942dd8559a31e51ef2a595c6eba44b8ad...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jul 2, 2021
@bplb
Copy link
Member Author

@bplb bplb commented Jul 2, 2021

/integrate

@openjdk
Copy link

@openjdk openjdk bot commented Jul 2, 2021

Going to push as commit ca4bea4.
Since your change was applied there have been 174 commits pushed to the master branch:

  • 3d84398: 8268364: jmethod clearing should be done during unloading
  • 53ad903: 8269135: TestDifferentProtectionDomains runs into timeout in client VM
  • f8bcbf0: 8269596: Snapshot soft ref policy before marking/copying
  • 4107dcf: 8269466: Factor out the common code for initializing and starting internal VM JavaThreads
  • 2baf498: 8269743: test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/stress/jni/nativeAndMH/Test.java crash with small heap (-Xmx50m)
  • 589f084: 8269110: ZGC: Remove dead code in zBarrier
  • b0e1867: Merge
  • a4d2a9a: 8269745: [JVMCI] restore original qualified exports to Graal
  • e377397: 8268566: java/foreign/TestResourceScope.java timed out
  • 6c76e77: 8260684: vmTestbase/gc/gctests/PhantomReference/phantom002/TestDescription.java timed out
  • ... and 164 more: https://git.openjdk.java.net/jdk/compare/cd20c01942dd8559a31e51ef2a595c6eba44b8ad...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Jul 2, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jul 2, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jul 2, 2021

@bplb Pushed as commit ca4bea4.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@bplb bplb deleted the Math-unsignedMultiplyHigh-8188044 branch Jul 2, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Raffaello Giulietti on core-libs-dev:

... or even as a one liner, like in the test

return Math\.multiplyHigh\(x\, y\) \+ \(\(x >> \(Long\.SIZE \- 1\)\) \& y\) \+ \(\(y >> 

(Long.SIZE - 1)) & x);

On 2021-07-02 13:08, Andrew Dinn wrote:

2 similar comments
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Raffaello Giulietti on core-libs-dev:

... or even as a one liner, like in the test

return Math\.multiplyHigh\(x\, y\) \+ \(\(x >> \(Long\.SIZE \- 1\)\) \& y\) \+ \(\(y >> 

(Long.SIZE - 1)) & x);

On 2021-07-02 13:08, Andrew Dinn wrote:

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Raffaello Giulietti on core-libs-dev:

... or even as a one liner, like in the test

return Math\.multiplyHigh\(x\, y\) \+ \(\(x >> \(Long\.SIZE \- 1\)\) \& y\) \+ \(\(y >> 

(Long.SIZE - 1)) & x);

On 2021-07-02 13:08, Andrew Dinn wrote:

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Haley on core-libs-dev:

On 7/2/21 12:26 PM, Raffaello Giulietti wrote:

... or even as a one liner, like in the test

return Math.multiplyHigh(x, y) + ((x >> (Long.SIZE - 1)) & y) + ((y >>
(Long.SIZE - 1)) & x);

It was hard to write, so it should be hard to read too! :-)

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Raffaello Giulietti on core-libs-dev:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

Greetings
Raffaello

On 2021-07-02 15:50, Andrew Haley wrote:

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Haley on core-libs-dev:

On 7/2/21 12:26 PM, Raffaello Giulietti wrote:

... or even as a one liner, like in the test

return Math.multiplyHigh(x, y) + ((x >> (Long.SIZE - 1)) & y) + ((y >>
(Long.SIZE - 1)) & x);

It was hard to write, so it should be hard to read too! :-)

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Raffaello Giulietti on core-libs-dev:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

Greetings
Raffaello

On 2021-07-02 15:50, Andrew Haley wrote:

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Haley on core-libs-dev:

On 7/2/21 12:26 PM, Raffaello Giulietti wrote:

... or even as a one liner, like in the test

return Math.multiplyHigh(x, y) + ((x >> (Long.SIZE - 1)) & y) + ((y >>
(Long.SIZE - 1)) & x);

It was hard to write, so it should be hard to read too! :-)

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Dinn on core-libs-dev:

On 02/07/2021 16:30, Raffaello Giulietti wrote:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

That was indeed the point of the change. However, I doubt the difference
in timing is going to be significant given Andrew's micro-benchmark results.

regards,

Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

1 similar comment
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Dinn on core-libs-dev:

On 02/07/2021 16:30, Raffaello Giulietti wrote:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

That was indeed the point of the change. However, I doubt the difference
in timing is going to be significant given Andrew's micro-benchmark results.

regards,

Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Raffaello Giulietti on core-libs-dev:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

Greetings
Raffaello

On 2021-07-02 15:50, Andrew Haley wrote:

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Dinn on core-libs-dev:

On 02/07/2021 16:30, Raffaello Giulietti wrote:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

That was indeed the point of the change. However, I doubt the difference
in timing is going to be significant given Andrew's micro-benchmark results.

regards,

Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Haley on core-libs-dev:

On 7/2/21 4:30 PM, Raffaello Giulietti wrote:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

I doubt it, because HotSpot generates conditional select instructions for
the popular systems, at least for C2.

I guess it might on a C1-only or pure interpreter system. That certainly
might be an argument for using the shift-and-add magic. With a comment
that would be fine, as the difference in performance doesn't seem to be
worth worrying about. We're looking at sub-nanosecond throughput.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

2 similar comments
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Haley on core-libs-dev:

On 7/2/21 4:30 PM, Raffaello Giulietti wrote:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

I doubt it, because HotSpot generates conditional select instructions for
the popular systems, at least for C2.

I guess it might on a C1-only or pure interpreter system. That certainly
might be an argument for using the shift-and-add magic. With a comment
that would be fine, as the difference in performance doesn't seem to be
worth worrying about. We're looking at sub-nanosecond throughput.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 8, 2021

Mailing list message from Andrew Haley on core-libs-dev:

On 7/2/21 4:30 PM, Raffaello Giulietti wrote:

FWIW, adinn's branchless code together with
PR https://git.openjdk.java.net/jdk/pull/4660
make both methods less vulnerable to timing attacks.

I doubt it, because HotSpot generates conditional select instructions for
the popular systems, at least for C2.

I guess it might on a C1-only or pure interpreter system. That certainly
might be an argument for using the shift-and-add magic. With a comment
that would be fine, as the difference in performance doesn't seem to be
worth worrying about. We're looking at sub-nanosecond throughput.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
6 participants