Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue 21601 - std.math : pow(float/double, -2) produces sometimes wrong result #7783

Merged
merged 2 commits into from Mar 1, 2021

Conversation

berni44
Copy link
Contributor

@berni44 berni44 commented Feb 2, 2021

Recently I compared several algorithms for calculating pow(b, e) with b being a floating point and e an integer. I checked the current implementation, the implementation suggested in PR #7297, the implementation suggested in issue 16200 and the original algorithm from stepanov quoted in issue 16200 with the intend to replace the current algorithm with the one mentioned in issue 16200.

First, I tested the algorithms for correctness, by comparing several random generated calls with the exact result calculated with the linux tool 'bc' and rounded appropriately. The results showed, that the current algorithm has a design flaw for exponent -2 and PR #7297 has further design flaws for exponent -3 and 3.

float current PR #7297 issue 16200 stepanov
wrong 4098 (1.64%) 9411 (3.76%) 0 0
difference (avg.) 466.22 858.19 0.00 0.00
difference (max.) 2,011,150 2,011,150 0 0
double current PR #7297 issue 16200 stepanov
wrong 1386 (1.37%) 3145 (3.15%) 194 (0.19%) 194 (0.19%)
difference (avg.) 28,323,033,265.48 51,480,639,103.68 0.00 0.00
difference (max.) 631,739,613,622,540 782,593,570,062,451 1 1
x87 real current PR #7297 issue 16200 stepanov
wrong 5155 (68.73%) 5133 (68.44%) 5169 (68.92%) 5169 (68.92%)
difference (avg.) 7.24 7.24 7.25 7.25
difference (max.) 238 238 238 238

250000 tests for float, 100000 tests for double and 7500 tests for real. Tests did not include results, that reached a fixpoint (NaN, +/-inf, 0.0); difference has been messured in the number of 'nextUp' calls that where needed to get from the smaller number to the larger.

IMHO, the result clearly rules PR #7297 out.

The issue with the current implementation can easily be fixed by removing the special treatment for -2 (this is, what this PR does - the suggested speed optimization is more difficult and left for an other PR).

I also replaced the switch statement by an if statement. With that, the hack around a bug in the switch statement (see PR #7294) is not needed anymore and I removed this too.

Furthermore I had to fix some unittests for quantize: They used == for comparing floating point numbers which luckily worked with the old version, but does not with the new one.

@berni44 berni44 requested a review from ibuclaw as a code owner February 2, 2021 10:31
@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @berni44! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
21601 normal std.math : pow(float/double, -2) produces sometimes wrong result

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + phobos#7783"

@berni44 berni44 changed the title Fix Issue 21601 - std.math : pow(float/double, -2) produces sometimes wrong result WIP: Fix Issue 21601 - std.math : pow(float/double, -2) produces sometimes wrong result Feb 3, 2021
@berni44
Copy link
Contributor Author

berni44 commented Feb 3, 2021

I noticed that this PR introduces another bug:

byte b = -128; pow(2.0,b); will give a wrong result now. Exponent int.min and long.min is still broken in the current implementation. For byte and short this has been incidentally fixed by the hack I removed.

I'm not sure how I want to continue here.

@berni44 berni44 changed the title WIP: Fix Issue 21601 - std.math : pow(float/double, -2) produces sometimes wrong result Fix Issue 21601 - std.math : pow(float/double, -2) produces sometimes wrong result Feb 3, 2021
@berni44
Copy link
Contributor Author

berni44 commented Feb 3, 2021

Sorry, ignore my last comment - byte.min etc all work well due to the cast to m which is the unsigned version.

@berni44
Copy link
Contributor Author

berni44 commented Feb 18, 2021

@ibuclaw , @RazvanN7 Any comments here?

@RazvanN7
Copy link
Collaborator

Furthermore I had to fix some unittests for quantize: They used == for comparing floating point numbers which luckily worked with the old version, but does not with the new one.

Doesn't this mean that this change might break user code?

@berni44
Copy link
Contributor Author

berni44 commented Feb 22, 2021

Doesn't this mean that this change might break user code?

Yes, that is possible. But it will only affect code, that is bad designed (like the unittests I had to correct) - that is, code, that is based on an implied guarantee, that does not exist.

With floating point numbers, there is always the difficulty to decide how to cope with rounding errors. IEEE gives a clear standard, what the result of FPU instructions has to be. One can rely on this. And one knows ahead, that the error is the most minimal possible (taking rounding mode into account).

This guarantee cannot be given for code using the FPU: With every instruction used, a certain amount of error will occur. If you are lucky, they cancel out, if not they sum up. This means, that the only guarantee, such a function can give, is, that the result will be in a certain interval around the correct result.

For the pow-function this heavily depends on the size of the exponent. With the exponent -2 the guarantee is, that the result is +/-2 nextUp/nextDown from the correct result - there are two operations needed: Taking the reciprocal and squaring, and each of the two may move the result one step off.

With the proposed algorithm, the result is always in this interval, while the current algorithm breaks in some cases - see the issue, this PR addresses.

@RazvanN7
Copy link
Collaborator

Yes, that is possible. But it will only affect code, that is bad designed (like the unittests I had to correct) - that is, code, that is based on an implied guarantee, that does not exist.

The fact that code is wrongly designed doesn't change the fact that users might be depending on that faulty behavior. There might be projects out there that used this as a feature and rely on this behavior. As such, we cannot simply silently change the semantics of the code (even if we are actually fixing it). We need to somehow issue a deprecation and instruct the user to use the new, improved, version. If the behavior is changed only for the -2 exponent then we can simply issue a deprecation message that warns the user that the behavior is buggy and offer an alternative.

@berni44
Copy link
Contributor Author

berni44 commented Feb 24, 2021

We need to somehow issue a deprecation and instruct the user to use the new, improved, version. If the behavior is changed only for the -2 exponent then we can simply issue a deprecation message that warns the user that the behavior is buggy and offer an alternative.

I disagree here. Think of the impact, this would have: Because -2 is a runtime parameter, we would have to add a deprecation message to the whole pow function (which is underneath used by ^^, as far as I know). That means, all users, that use pow or ^^ will get deprecation messages for about two years, although their code is completely correct (*). Only a tiny fraction of affected users will use the exponent -2 at all. And further, most of them will not care about the least significant bit of the result or, if they care, the will have reasonably code, that will not break with the change. So for the sake of avoiding a code breaking for about 0.0001% of all users we get on the nerves (or even hinder) the other 99,9999%. Seems to me doing more harm than good. (And yet I did not even touch the impact of asking the users to use newpow instead of pow or something like this.)

Because I think that neither not fixing, nor using a deprecation cycle is a good solution, maybe we could do something in between? I think of adding a release note, something like: "We noticed that the implementation of pow(f,-2) with f being a floating point value, was buggy for some values of f. Unfortunately the fix implies small changes for other values for f (with exponent -2) too: The least significant bits of the result might differ from the current implementation. (This is due to the peculiarities of floating point numbers and cannot be avoided with reasonable means.) To avoid problems, make sure, that your algorithms do not rely on the least significant bits of floating point calculations, for example by using isClose instead of ==." (Maybe just as an announcement in 2.96.0 and as a note in 2.97.0 (where the change happens) or so.)

What do you think? Would this be OK?

(*) As a side note: This resembles somewhat the deprecation message one gets when using transposed, warning about save being wrong. In my application I use transposed as an input range, so nothing to warn about. Still I get several hundreds of warnings, when compiling. (Of course I use -d meanwhile, but that means, that I'll miss other deprecation warnings...)

@RazvanN7
Copy link
Collaborator

RazvanN7 commented Feb 24, 2021

You make a good point. I think that adding a changelog entry for this should suffice. Please add a changelog entry.

@RazvanN7 RazvanN7 added the 72h no objection -> merge The PR will be merged if there are no objections raised. label Feb 24, 2021
@ibuclaw
Copy link
Member

ibuclaw commented Feb 25, 2021

I'm not convinced by people relying on buggy math being an issue. Buggy math algorithms could cost companies trillions of dollars (just ask @donc).

@RazvanN7 RazvanN7 added auto-merge-squash and removed 72h no objection -> merge The PR will be merged if there are no objections raised. labels Mar 1, 2021
@dlang-bot dlang-bot merged commit a38908c into dlang:master Mar 1, 2021
@berni44 berni44 deleted the issue_21601 branch April 28, 2021 17:39
kinke added a commit to kinke/phobos that referenced this pull request Jun 2, 2021
kinke added a commit to ldc-developers/phobos that referenced this pull request Jun 2, 2021
dlang-bot pushed a commit that referenced this pull request Jun 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants