Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-45876: Correctly rounded stdev() and pstdev() for the Decimal case #29828

Merged
merged 37 commits into from Dec 1, 2021

Conversation

rhettinger
Copy link
Contributor

@rhettinger rhettinger commented Nov 29, 2021

@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 1a6c58d 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 29, 2021
@rhettinger rhettinger added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 29, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 152ed3f 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 29, 2021
denominator = 1 << -q
return numerator / denominator # Convert to float


def _decimal_sqrt_of_frac(n: int, m: int) -> Decimal:
"""Square root of n/m as a float, correctly rounded."""
# Premise: For decimal, computing (n/m).sqrt() can be off by 1 ulp.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, while the corresponding result is true for binary floating-point, it's not true for decimal floating-point: for decimal, the error can be more than 1 ulp.

Example using 3-digit precision:

>>> from decimal import Decimal, getcontext
>>> n, m = 209, 20
>>> getcontext().prec = 3
>>> (Decimal(n) / Decimal(m)).sqrt()
Decimal('3.22')
>>> getcontext().prec = 10
>>> (Decimal(n) / Decimal(m)).sqrt()
Decimal('3.232645975')

The correctly-rounded value here is around 1.265 ulps away, at 3.232645975...

However, I believe that one can show that the worst case error is at most 1.3ulp (actually 0.5 + sqrt(10)/4 = 1.290569... ulps), in which case it remains true that the correctly-rounded result will always be one of root, root.next_plus() or root.next_minus().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-correction: I guess "off by <= 1ulp" is accurate, so long as it's clear that we're talking about the maximum difference between the computed value and the correctly-rounded value. I was thinking in terms of the max difference between the computed value and the true mathematical value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the comment.

"""Square root of n/m as a float, correctly rounded."""
# Premise: For decimal, computing (n/m).sqrt() can be off by 1 ulp.
# Method: Check the result, moving up or down a step if needed.
if n <= 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be checking m here rather than n? It looks as though we multiply through by m below, so we need m to be positive for the inequalities to work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case where m is zero is handled before the inequality test. The Decimal(n) / Decimal(m) step raises DivisionByZero which is a subclass of ZeroDivisionError. There is a test for this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was really more worried about the case m negative, not m zero. It seems more natural to have the normalization step n, m = -n, -m ensure that the denominator is positive than that the numerator is positive: right now, we're effectively converting -2 / 5 to 2 / (-5), and what we care about below for the maths to work is that m is positive, not that n is positive: you're implicitly converting the inequality n / m > ((root + plus) / 2) ** 2 to the inequality n > ((root + plus) / 2)**2 * m, and that conversion is only valid if m is positive.

It doesn't matter, because if just one of n and m is negative then the (Decimal(n) / Decimal(m)).sqrt() step will fail (except in the case of really extreme m where the division rounds from something tiny and negative to negative zero); it just seems like a surprising choice and for me at least it made the code harder to read and reason about.

denominator = 1 << -q
return numerator / denominator # Convert to float


def _decimal_sqrt_of_frac(n: int, m: int) -> Decimal:
"""Square root of n/m as a float, correctly rounded."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copypasta: "as a float"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@rhettinger rhettinger added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 30, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 309cb0a 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 30, 2021
@rhettinger rhettinger merged commit a39f46a into python:main Dec 1, 2021
@rhettinger rhettinger deleted the stdev_deci_sqrt branch December 1, 2021 00:20
shihai1991 added a commit to shihai1991/cpython that referenced this pull request Dec 1, 2021
* main: (21 commits)
  bpo-45876:  Have stdev() also use decimal specific square root. (pythonGH-29869)
  bpo-45876:  Correctly rounded stdev() and pstdev() for the Decimal case (pythonGH-29828)
  bpo-45711: Change exc_info related APIs to derive type and traceback from the exception instance (pythonGH-29780)
  bpo-30533:Add function inspect.getmembers_static that does not call properties or dynamic properties. (python#20911)
  bpo-45476: Disallow using asdl_seq_GET() as l-value (pythonGH-29866)
  bpo-45476: Add _Py_RVALUE() macro (pythonGH-29860)
  bpo-33381: [doc] strftime's %f option may pad zeros on the left or the right (pythonGH-29801)
  Fix EncodingWarning in Tools/freeze/test/freeze.py (pythonGH-29742)
  no-issue: remove unused import from test_graphlib.py (pythonGH-29853)
  bpo-45931: Prevent Directory.Build.props/targets from leaking from directories above the repo when building on Windows (pythonGH-29854)
  bpo-45653: fix test_embed on windows (pythonGH-29814)
  bpo-45917: Add math.exp2() method - return 2 raised to the power of x (pythonGH-29829)
  bpo-43905: Expand dataclasses.astuple() and asdict() docs (pythonGH-26154)
  bpo-44391: Remove unused argument from a varargs call. (pythonGH-29843)
  bpo-45881: configure --with-freeze-module --with-build-python (pythonGH-29835)
  bpo-45847: PY_STDLIB_MOD_SIMPLE now checks py_stdlib_not_available (pythonGH-29844)
  bpo-45828: Use unraisable exceptions within sqlite3 callbacks (FH-29591)
  bpo-40280: Emscripten systems use .wasm suffix by default (pythonGH-29842)
  bpo-45723: Sort the grand AC_CHECK_HEADERS check (pythonGH-29846)
  bpo-45847: Make socket module conditional (pythonGH-29769)
  ...
shihai1991 added a commit to shihai1991/cpython that referenced this pull request Dec 1, 2021
* main: (21 commits)
  bpo-45876:  Have stdev() also use decimal specific square root. (pythonGH-29869)
  bpo-45876:  Correctly rounded stdev() and pstdev() for the Decimal case (pythonGH-29828)
  bpo-45711: Change exc_info related APIs to derive type and traceback from the exception instance (pythonGH-29780)
  bpo-30533:Add function inspect.getmembers_static that does not call properties or dynamic properties. (python#20911)
  bpo-45476: Disallow using asdl_seq_GET() as l-value (pythonGH-29866)
  bpo-45476: Add _Py_RVALUE() macro (pythonGH-29860)
  bpo-33381: [doc] strftime's %f option may pad zeros on the left or the right (pythonGH-29801)
  Fix EncodingWarning in Tools/freeze/test/freeze.py (pythonGH-29742)
  no-issue: remove unused import from test_graphlib.py (pythonGH-29853)
  bpo-45931: Prevent Directory.Build.props/targets from leaking from directories above the repo when building on Windows (pythonGH-29854)
  bpo-45653: fix test_embed on windows (pythonGH-29814)
  bpo-45917: Add math.exp2() method - return 2 raised to the power of x (pythonGH-29829)
  bpo-43905: Expand dataclasses.astuple() and asdict() docs (pythonGH-26154)
  bpo-44391: Remove unused argument from a varargs call. (pythonGH-29843)
  bpo-45881: configure --with-freeze-module --with-build-python (pythonGH-29835)
  bpo-45847: PY_STDLIB_MOD_SIMPLE now checks py_stdlib_not_available (pythonGH-29844)
  bpo-45828: Use unraisable exceptions within sqlite3 callbacks (FH-29591)
  bpo-40280: Emscripten systems use .wasm suffix by default (pythonGH-29842)
  bpo-45723: Sort the grand AC_CHECK_HEADERS check (pythonGH-29846)
  bpo-45847: Make socket module conditional (pythonGH-29769)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants