JDK-8316708: Augment WorstCaseTests with more cases #15879
Since the error bars for sinh and cosh are larger, I didn't include those cases. FDLIBM pow has a bug where the actual error is outside of the spec'ed error bounds, so that case is not included in this update.
@@ -296,6 +320,9 @@ private static int testWorstTan() {
{+0x1.D696BFA988DB9p-2, +0x1.FAC71CD34EEA6p-2},
{+0x1.46AC372243536p-1, +0x1.7BA49F739829Ep-1},
{+0x0.A3561B9121A9Bp+0, +0x0.BDD24FB9CC14Fp+0},

// Worst-case observed error
{0x1.3f9605aaeb51bp+21, -0x1.9678ee5d64935p-1},
The given expected value seems to contradict the introductory comment.
The exact value y satisfies -0x1.9678ee5d64935p-1 < y < -0x1.9678ee5d64934p-1 and is closer to -0x1.9678ee5d64935p-1.
Thus, the rounded value of y is -0x1.9678ee5d64935p-1, but its truncated value is -0x1.9678ee5d64934p-1.
The truncated value should be the expected value, but then the test fails.
I don't think that the test logic can support errors > 1 ulp, as is the case here.
Perhaps, rather than a single expected value, there should be explicit lower and upper bounds.
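The idea of explicit lower and upper bounds could be sketched roughly as follows. This is only an illustration, not the test's actual logic: the class `BoundsCheck` and the method `withinBounds` are hypothetical names, and widening the bracket by one ulp on each side is an assumption made for the example.

```java
public class BoundsCheck {
    // Hypothetical helper: true if lower <= result <= upper.
    // Any comparison with NaN is false, so NaN never passes.
    static boolean withinBounds(double result, double lower, double upper) {
        return lower <= result && result <= upper;
    }

    public static void main(String[] args) {
        // Values quoted in the review comment above.
        double rounded = -0x1.9678ee5d64935p-1;
        double truncated = Math.nextAfter(rounded, 0.0); // one step toward zero

        // The two adjacent doubles bracket the exact value y.
        System.out.println(truncated == -0x1.9678ee5d64934p-1);

        // Assumption: allow one extra ulp on each side of the bracket.
        double lower = Math.nextDown(rounded);
        double upper = Math.nextUp(truncated);
        double result = StrictMath.tan(0x1.3f9605aaeb51bp+21);
        System.out.println(withinBounds(result, lower, upper));
    }
}
```

With a pair of bounds per input, the acceptance window no longer depends on which endpoint happens to be stored as the single expected value.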
For FDLIBM tan, the stated error in the Math.tan spec is 1 ulp and the FDLIBM sources say tan is "nearly rounded," which could reasonably be interpreted as meaning within 1 ulp. However, the error reported by the paper is 1.02 ulps.
As you note here, the current testing methodology can only really deal with at most a 1 ulp error, assuming the expected value is at the lower endpoint of the range.
To avoid any lurking errors in the FDLIBM port to Java, I generated the expected numbers running jshell on JDK 20, which uses the mostly C-based FDLIBM. For a number of cases I had to decrement the expected value for the test to pass against Math and StrictMath. The decremented value and its successor may or may not bracket the exact value; I didn't verify that.
In other words, the test may be detecting other bugs in one or both math libraries.
So far, I've only tried running the test locally; this would need a cross-platform run before being pushed, to cover all the intrinsics that may be in use on the full set of supported platforms.
Subsequently, the allowable error bound for tan was bumped up to 1.25 ulps to cover the observed 1.02 ulp error under JDK-8326530: Widen allowable error bound of Math.tan.
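To make figures such as "1.02 ulps" or a "1.25 ulp" bound concrete, here is a minimal sketch of how an error in ulps might be measured against a high-precision reference. The class `UlpError` and method `errorInUlps` are hypothetical, the reference value must be supplied by the caller, and the ulp is taken at the computed result rather than at the exact value, which is a simplification.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class UlpError {
    // Hypothetical helper: |computed - exact| expressed in units of ulp(computed).
    // Assumes computed is finite (new BigDecimal(double) rejects NaN and infinities).
    static double errorInUlps(double computed, BigDecimal exact) {
        BigDecimal diff = new BigDecimal(computed).subtract(exact).abs();
        BigDecimal ulp = new BigDecimal(Math.ulp(computed));
        return diff.divide(ulp, MathContext.DECIMAL64).doubleValue();
    }

    public static void main(String[] args) {
        // Illustrative reference: exactly halfway between 1.0 and the next double.
        BigDecimal exact = BigDecimal.ONE.add(new BigDecimal(Math.scalb(1.0, -53)));
        double err = errorInUlps(1.0, exact);
        System.out.println(err + " ulps, within 1.25: " + (err <= 1.25));
    }
}
```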
@@ -52,6 +52,12 @@
* JDK implementation complies with a 1 ulp bound on the worst-case
* values. Therefore, no addition leeway is afforded when testing
* sinh and cosh.
*
* Additional worst-case observed error inputs for the FDLIBM-dervied
Suggested change:
- * Additional worst-case observed error inputs for the FDLIBM-dervied
+ * Additional worst-case observed error inputs for the FDLIBM-derived
@@ -243,6 +261,9 @@ private static int testWorstCos() {
{+0x1.7CB7648526F99p-1, +0x1.78DAF01036D0Cp-1},
{+0x1.C65A170474549p-1, +0x1.434A3645BE208p-1},
{+0x1.6B8A6273D7C21p+0, +0x1.337FC5B072C52p-3},

// Worst-case observed error
{-0x1.34e729fd08086p+21, +0x1.6a6a0d6a17f0fp-1},
Both Math.cos and StrictMath.cos produce the correctly rounded result here.
I don't know why the paper says otherwise. Perhaps OpenLibm is not exactly fdlibm.
I've looked a bit over the OpenLibm changelog. They've added a few special cases for exp and pow at least. OpenLibm itself is a direct derivative of one of the BSD math libraries, which in turn was derived from FDLIBM.
The BSDs look to have changed/improved the kernel cos computation in their fork.
@jddarcy This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Keep alive.
Still keep alive.
@jddarcy This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the
ping...
/open
@rgiulietti Only the pull request author can set the pull request state to "open"
/open
@jddarcy This pull request is now open
Upon further reflection, I think it is worthwhile to include the worst-case tests from more of the other math library implementations to better test potential intrinsifications of the various methods.
To improve the "fingerprinting" coverage of the StrictMath tests, I've added test cases where the worst-case error of a non-FDLIBM library is larger than the FDLIBM worst case. Assuming reasonable methodology of the paper, the output of the other math library must differ from FDLIBM at such a point.
In a subsequent push, I'll fill those in for the functions not already so updated (cbrt, expm1, log10, log1p, etc.)
// Empirical worst-case points in other libraries with
// larger worst-case errors than FDLIBM
{0x0.00000000039a2p-1022, 0x0.0000000000001p-1022, 0x0.00000000039a2p-1022},
I can't find this one in the paper.
Suggested change:
- {0x0.00000000039a2p-1022, 0x0.0000000000001p-1022, 0x0.00000000039a2p-1022},
+ {-0x0.fffffffffffffp-1022, 0x0.0000000000001p-1022, 0x0.fffffffffffffp-1022},
Good catch; must have been a cut and paste error on my part.
{-0x1.2bf32aaf122e2p-2, -0x1.62ee0a3a4baf9p-2},
{-0x1.8000000000000p-53, -0x1.8000000000001p-53},
{-0x1.2e496d25897ecp-2, -0x1.663d81cb08f56p-2},
{-0x1.ffffffbaefe27p-2, -0x1.62e42faa93817p-1},
};
I think there's another case:
Suggested change:
- };
+ {-0x1.5efad5491a79bp-1022, -0x1.5efad5491a79bp-1022},
+ };
// Empirical worst-case points
{0x1.0ffea3878db6bp+0, 0x1.f07a0cca521fp-5},
{0x1.490af72a25a81p-1, -0x1.c4bf7ae48f078p-2},
{0x1.69e7aa6da2df5p-1, -0x1.634508c9adfp-2},
There are no libraries that have worse errors than OpenLibm, so I'm wondering what these values are good for?
When I was working on updating the worst-case tests for Math, I would check the input values in Math.foo() and StrictMath.foo() and, if they differed, pick the smaller one as the reference value. For those inputs that differ, the input is a fingerprint marker for FDLIBM, and I added cases to the corresponding StrictMath test.
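The comparison described above might look something like the following sketch. The class `Fingerprint` and its methods are hypothetical names, and reading "the smaller one" as "the value with the smaller magnitude" is an assumption, not something the comment states.

```java
import java.util.function.DoubleUnaryOperator;

public class Fingerprint {
    // True if the two implementations return different bit patterns for x.
    static boolean differs(DoubleUnaryOperator f, DoubleUnaryOperator g, double x) {
        return Double.doubleToLongBits(f.applyAsDouble(x))
            != Double.doubleToLongBits(g.applyAsDouble(x));
    }

    // Assumption: "smaller" is interpreted as smaller in magnitude.
    static double closerToZero(double a, double b) {
        return Math.abs(a) <= Math.abs(b) ? a : b;
    }

    public static void main(String[] args) {
        // Input taken from the testWorstCos hunk earlier in this thread.
        double x = -0x1.34e729fd08086p+21;
        if (differs(Math::cos, StrictMath::cos, x)) {
            double ref = closerToZero(Math.cos(x), StrictMath.cos(x));
            System.out.println("fingerprint input, reference " + Double.toHexString(ref));
        } else {
            System.out.println("Math and StrictMath agree at " + Double.toHexString(x));
        }
    }
}
```

Whether the two methods actually differ at a given input depends on the platform and on which intrinsics are active, so the sketch prints whichever branch applies.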
{-0x1.f8b791cafcdefp+4, -0x1.073ca87470dfap-3},
{-0x1.0e16eb809a35dp+944, 0x1.b5e361ed01dadp-2},
{-0x1.842d8ec8f752fp+21, -0x1.6ce864edeaffep-1},
{-0x1.1c49ad613ff3bp+19, -0x1.fffe203cfabe1p-2},
Some of these cases are for libraries that have better worst case errors than OpenLibm.
Are they needed here?
Using high-precision results from http://www.apfloat.org/calculator/, the bounds in the worst-case test were adjusted to the bracketing floating-point value closer to zero. The updated tests run successfully on the Oracle cross-platform build and test system.
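As a rough illustration of picking "the bracketing floating-point value closer to zero" from a high-precision result, the sketch below converts a BigDecimal to the nearby double and steps toward zero if the conversion overshot in magnitude. The class `TowardZero` and method `boundCloserToZero` are hypothetical, and the input is assumed to be within the finite double range.

```java
import java.math.BigDecimal;

public class TowardZero {
    // Returns the double adjacent to exact whose magnitude does not exceed |exact|.
    // Assumes exact is within the finite range of double.
    static double boundCloserToZero(BigDecimal exact) {
        double d = exact.doubleValue();
        // new BigDecimal(double) is an exact conversion, so this comparison is exact.
        if (new BigDecimal(d).abs().compareTo(exact.abs()) > 0) {
            d = Math.nextAfter(d, 0.0); // conversion rounded away from zero; step back
        }
        return d;
    }

    public static void main(String[] args) {
        // Illustrative value three quarters of an ulp above 1.0.
        BigDecimal exact = BigDecimal.ONE.add(new BigDecimal(Math.scalb(3.0, -54)));
        System.out.println(Double.toHexString(boundCloserToZero(exact)));
        System.out.println(Double.toHexString(boundCloserToZero(exact.negate())));
    }
}
```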
Tests.java L:111 should read
|
* Use "Table Maker's Dilemma" results from Jean-Michel Muller and
* Vincent Lefèvre, to test the math library. See
* http://perso.ens-lyon.fr/jean-michel.muller/TMD.html for original
* This test containst two distinct kinds of worst-case inputs:
Suggested change:
- * This test containst two distinct kinds of worst-case inputs:
+ * This test contains two distinct kinds of worst-case inputs:
*
* 1) Exact numerical results that are nearly half-way between
* representable numbers or very close to a representable
* number. (Half-way caess are hardest for round to nearest even;
Suggested change:
- * number. (Half-way caess are hardest for round to nearest even;
+ * number. (Half-way cases are hardest for round to nearest even;
* close to a representable number cases are hard for directed
* roundings.)
*
* 2) Worst-case errors as observed emprically across different
Suggested change:
- * 2) Worst-case errors as observed emprically across different
+ * 2) Worst-case errors as observed empirically across different
* 2) Worst-case errors as observed emprically across different
* implementations that are not correctly rounded.
*
* For the first categpory, the "Table Maker's Dilemma" results from
Suggested change:
- * For the first categpory, the "Table Maker's Dilemma" results from
+ * For the first category, the "Table Maker's Dilemma" results from
*
* For the first categpory, the "Table Maker's Dilemma" results from
* Jean-Michel Muller and Vincent Lefèvre, are used.
* See http://perso.ens-lyon.fr/jean-michel.muller/TMD.html for original
Suggested change:
- * See http://perso.ens-lyon.fr/jean-michel.muller/TMD.html for original
+ * See https://perso.ens-lyon.fr/jean-michel.muller/TMD.html for original
*
* For the first categpory, the "Table Maker's Dilemma" results from
* Jean-Michel Muller and Vincent Lefèvre, are used.
* See http://perso.ens-lyon.fr/jean-michel.muller/TMD.html for original
* test vectors from 2000 and see
* http://perso.ens-lyon.fr/jean-michel.muller/TMDworstcases.pdf with
Suggested change:
- * http://perso.ens-lyon.fr/jean-michel.muller/TMDworstcases.pdf with
+ * https://perso.ens-lyon.fr/jean-michel.muller/TMDworstcases.pdf with
{-0x1.85e624577c23ep-1, -0x1.614ac15b6df5ap-1},
{-0x1.842d8ec8f752fp+21, -0x1.6ce864edeaffdp-1},
{-0x1.07e4c92b5349dp+4, +0x1.6a096375ffb23p-1},
// {-0x1.13a5ccd87c9bbp+1008, -0x1.27b3964185d8dp-1}, // check -- need +/-
High-precision computations confirm -0x1.27b3964185d8dp-1
@jddarcy This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 41 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
/integrate
Going to push as commit 7f02f07.
Your commit was automatically rebased without conflicts.
A new paper
"Accuracy of Mathematical Functions in Single, Double, Double Extended, and Quadruple Precision"
by Brian Gladman, Vincenzo Innocente and Paul Zimmermann
https://members.loria.fr/PZimmermann/papers/accuracy.pdf
details the inputs that generate the worst-case observed errors in different math library implementations. The FDLIBM-related worst cases should be added to the test suite.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15879/head:pull/15879
$ git checkout pull/15879
Update a local copy of the PR:
$ git checkout pull/15879
$ git pull https://git.openjdk.org/jdk.git pull/15879/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 15879
View PR using the GUI difftool:
$ git pr show -t 15879
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15879.diff