-
Notifications
You must be signed in to change notification settings - Fork 5.9k
8338694: x86_64 intrinsic for tanh using libm #20657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back vamsi-parasa! A progress list of the required criteria for merging this PR into |
@vamsi-parasa This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 122 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@asgibbons, @sviswa7, @vnkozlov, @jatin-bhateja) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
@vamsi-parasa The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
This PR doesn't include any additional tests. It is often appropriate to add more regression testing when introducing a new implementation of a method. |
__ movdqu(xmm3, ExternalAddress(pv + 16), r11 /*rscratch*/); | ||
__ mulpd(xmm1, xmm1); | ||
__ movdqu(xmm4, ExternalAddress(pv + 32), r11 /*rscratch*/); | ||
__ mulpd(xmm2, xmm1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would encourage either you add detailed comments or give meaningful names to the registers to ease the review process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this is all rather obscure. Ideally the same names that are used in wherever this comes from.
Where does the algorithm come from? What are its accuracy guarantees?
In addition, given the rarity of hyperbolic tangents in Java applications, do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@theRealAph, this implementation is based on Intel libm math library and meets the accuracy requirements. The algorithm is provided in the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a copy of this information? Should it be in the commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vamsi-parasa don't hesitate in adding as much and explicit information about the original source from where the algorithm has been picked up, even though the PR explicitly mentions libm. Adding the link to source references is a good practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jatin-bhateja This is based on Intel internal LIBM sources and so there is no public link available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a copy of this information? Should it be in the commit?
@theRealAph The accuracy of standard (non fast mode) LIBM functions ensures errors of < 1 ulp. LIBM is part of Intel C++ compiler. The documentation can be found here: https://www.intel.com/content/www/us/en/docs/cpp-compiler/developer-guide-reference/2021-8/programming-tradeoffs-floating-point-applications.html.
Thank You Joe for the suggestion. Will add more tests. (This PR passes the tier-1 tanh tests in the HyperbolicTests.Java) |
Yes @vamsi-parasa ; running that test is a good backstop and it is written to be applicable to any implementation of {sinh, cosh, tanh} that meet the general quality-of-implementation criteria for java.lang.Math. To be explicit, the WorstCaseTests.java file, and for good measure all the java.lang.Math tests, should also be run too for a change like this. For a hypothetical example, if an intrinsic used different polynomials for different ranges of the input, it would be a reasonable regression tests for that implementation to probe around the boundary of the transition between the polynomials to make sure the monotonicity requirements were being met. That kind of check could be written to be generally applicable and be suitable for a regression tests in java/lang/Math or could be suitable for a regression test in the HotSpot area. HTH |
Mailing list message from Andrew Haley on core-libs-dev: On 8/27/24 12:13, Jatin Bhateja wrote:
If I had to guess, that's because there are no infinities and no NaNs Given that this "normal" execution is the expected use of tanh(), it |
Hi Joe(@jddarcy) and Andrew (@theRealAph) , Please see the updates below:
Added 1500 regression tests in HyperbolicTests.java which compare the accuracy of the Math.tanh intrinsic by using StrictMath.tanh (which calls FdLibm.Tanh.compute) as a reference. The tests are passing within 2.5 ulps of the expected result. The tests are fairly exhaustive and also cover the boundary transitions.
Ran the WorstCaseTests.java and all the tests in java.lang.Math and they're passing on my local machine.
Added new tests in HyperbolicTests.java which probe around the various boundaries of transition. 1500 testcases and they passed within 2.5ulps of the reference StrictMath.tanh
Please let me know if anything more needs to be added. |
static int testTanhIntrinsicWithReference() { | ||
double b1 = 0.02; | ||
double b2 = 5.1; | ||
double b3 = 55 * Math.log(2)/2; // ~19.062 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better to use StrictMath.log here or, better use, precompute the value as a constant and document its conceptual origin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the updated code which uses the precomputed value of b3
.
|
||
for(int i = 0; i < testCases.length; i++) { | ||
double testCase = testCases[i]; | ||
failures += testTanhWithReferenceUlpDiff(testCase, StrictMath.tanh(testCase), 2.5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The allowable worst-case error is 2.5 ulp, although at many arguments FDLIBM has a smaller error.
For a general Math vs StrictMath test with an allowable 2.5 ulp error, without knowing how accurate FDLIBM is for that function and argument, a large error of approx. 2X the nominal error should be allowed (in case FDLIBM errors in one direction and the Math method errors in the other direction).
If the test is going to use randomness, then its jtreg tags should include
@key randomness
and it is preferable to use jdk.test.lib.RandomFactory to get and Random object since that handles printing out a key so the random sequence can be replicated if the test fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the test is going to use randomness, then its jtreg tags should include
@key randomness
and it is preferable to use jdk.test.lib.RandomFactory to get and Random object since that handles printing out a key so the random sequence can be replicated if the test fails.
Please see the test updated to use @key randomness
and jdk.test.lib.RandomFactory
to get and Random object.
The allowable worst-case error is 2.5 ulp, although at many arguments FDLIBM has a smaller error.
For a general Math vs StrictMath test with an allowable 2.5 ulp error, without knowing how accurate FDLIBM is for that function and argument, a large error of approx. 2X the nominal error should be allowed (in case FDLIBM errors in one direction and the Math method errors in the other direction).
So far the tests haven't failed with error of 2.5ulp. Would it be better to make it 5ulp? Please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far, this will be the only intrinsic implementation of tanh. Therefore, at the moment it is just checking the consistency of the intrinsic implementation with StrictMath/FDLIBM tanh. If the intrinsic has a ~1 ulp accuracy, it would be expected to often be within 2.5 ulps of FDLIBM tanh. However, as written the regression test would not necessarily pass against any allowable Math.tanh implementation, which is the usual criteria for java.lang.Math tests that aren't otherwise constrained (such as by being limited to a given subset of platforms).
If there was a correctly rounded tanh to compare against, then this style of testing would be valid.
Are there any plan to intrinsify sinh or cosh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think instead of random we should generate offline additional correctly rounded fixed test points to cater to new algorithm using high precision arithmetic library and then simply extend the HyperbolicTests.java with these new fixed test points using existing ulp testing mechanism in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Sandhya(@sviswa7) for the suggestion! Will update the existing HyperbolicTests.java with new fixed point tests with quad precision reference values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vamsi-parasa In my thoughts the best way to do this is add the additional tests points to HyperbolicTests.java itself in the testcases array of testTanh() method. We should remove all the other changes from HyperbolicTests.java. Also no need for separate TanhTests.java file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Sandhya(@sviswa7), please see the updated code in HyperbolicTests.java
which removes the previous random based tests with the new fixed point tests. Also removed the TanhTests.java
.
@@ -807,7 +807,7 @@ void LIRGenerator::do_MathIntrinsic(Intrinsic* x) { | |||
if (x->id() == vmIntrinsics::_dexp || x->id() == vmIntrinsics::_dlog || | |||
x->id() == vmIntrinsics::_dpow || x->id() == vmIntrinsics::_dcos || | |||
x->id() == vmIntrinsics::_dsin || x->id() == vmIntrinsics::_dtan || | |||
x->id() == vmIntrinsics::_dlog10) { | |||
x->id() == vmIntrinsics::_dlog10 || x->id() == vmIntrinsics::_dtanh) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to have the tanh under #Ifdef _LP64 as we are generating stub only for 64 bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the newly added #ifdef
in the updated code.
Co-authored-by: Andrey Turbanov <turbanoff@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hand-verified the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks good to me.
|
Hello Vladimir (@vnkozlov), Could you please run the tests for this PR and let us know? Thanks, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I have only one nitpick. I will start testing.
src/hotspot/share/c1/c1_Compiler.cpp
Outdated
@@ -167,6 +167,9 @@ bool Compiler::is_intrinsic_supported(vmIntrinsics::ID id) { | |||
case vmIntrinsics::_dsin: | |||
case vmIntrinsics::_dcos: | |||
case vmIntrinsics::_dtan: | |||
#if defined(X86) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use #ifdef AMD64
for x64 only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Vladimir! Please see the code updated with #ifdef AMD64
.
Thank you Vladimir! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My testing passed.
Thank You Vladimir! |
/integrate |
@vamsi-parasa |
/sponsor |
Going to push as commit 212e329.
Your commit was automatically rebased without conflicts. |
@jatin-bhateja @vamsi-parasa Pushed as commit 212e329. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
// | ||
// Special cases: | ||
// tanh(NaN) = quiet NaN, and raise invalid exception | ||
// tanh(INF) = that INF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be
tanh(POSITIVE_INFINITY) = +1.0
tanh(NEGATIVE_INFINITY) = -1.0
The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.tanh() using libm
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20657/head:pull/20657
$ git checkout pull/20657
Update a local copy of the PR:
$ git checkout pull/20657
$ git pull https://git.openjdk.org/jdk.git pull/20657/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 20657
View PR using the GUI difftool:
$ git pr show -t 20657
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20657.diff
Webrev
Link to Webrev Comment