Pathological performance of decimals() with very large exponents or precision #838

Zac-HD · 2017-09-07T01:46:26Z

After merging #789, which fixed several bugs affecting previous uses of the decimals() strategy, I started poking around at the behaviour for very large exponents.

Decimals with very large exponents (up to 10**18 - 1) are perfectly valid inputs to the decimals() strategy... but casting the bounds to integers or Fractions will consume a great deal of memory, hang for a very long time, and even if that works the Conjecture buffer is then too small to produce an example.
We can take advantage of the floating-point nature of Decimals: cancel out the exponents of the bounding values before we construct the underlying strategy, and adjust the drawn value accordingly. This is fiddly but doable, if we are careful about the precision context and the places argument.
The decimal precision context can be set so high that operations simply never complete - as in, would require 10**18 - 1 digits. Try Context(prec=MAX_PREC).divide(1, 3) as an example, but be prepared to kill the process 😉
I think it is reasonable to limit the precision of the bounding values to 10**4 digits each. If more is desired, the error can direct users to the fractions() strategy for truly unlimited-precision arithmetic. Internally, we can then clip precision to at most 10**4 digits without risking bounds errors, if we also choose a rounding mode that can't take us outside of the bounds (also fiddly but possible).
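A minimal sketch of the exponent-cancelling idea, using only the standard library. The helper names `with_exponent_shift` and `cancel_exponents` are hypothetical and are not part of Hypothesis; the sketch assumes finite bounds whose exponents are close together, and it ignores the precision context and the places argument entirely.

```python
from decimal import Decimal
from fractions import Fraction


def with_exponent_shift(value: Decimal, shift: int) -> Decimal:
    """Return value * 10**shift, rebuilt exactly from the digit tuple.

    Building from the tuple avoids context rounding and never expands the
    coefficient. Only valid for finite values (specials have a string exponent).
    """
    sign, digits, exponent = value.as_tuple()
    return Decimal((sign, digits, exponent + shift))


def cancel_exponents(low: Decimal, high: Decimal):
    """Hypothetical helper: rescale both bounds so the smaller exponent is 0."""
    shift = min(low.as_tuple().exponent, high.as_tuple().exponent)
    return with_exponent_shift(low, -shift), with_exponent_shift(high, -shift), shift


original_low = Decimal("10E9999999999999998")
original_high = Decimal("10E9999999999999999")
low, high, shift = cancel_exponents(original_low, original_high)

# The rescaled bounds are small, so converting them to Fractions is cheap.
width = Fraction(high) - Fraction(low)

# Stand-in for a value drawn between the rescaled bounds.
drawn = Decimal(55)

# Re-apply the cancelled exponent to get a value between the original bounds.
restored = with_exponent_shift(drawn, shift)
print(restored)
```

This only helps when the two bounds have similar exponents; bounds that are many orders of magnitude apart would need a different approach.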

Motivating example: decimals('10E9999999999999998', '10E9999999999999999').example() hangs forever.

When casting string arguments to Decimal, we should use a context that traps InvalidOperation and reraises it as InvalidArgument with a useful message. (See "Inconsistent acceptance of strings as bounds in hypothesis.strategies", #814, for discussion.)
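One way the conversion and the proposed digit limit might fit together, as a sketch rather than the actual Hypothesis implementation. `parse_bound` and `MAX_BOUND_DIGITS` are hypothetical names, and the `InvalidArgument` class here is a local stand-in for `hypothesis.errors.InvalidArgument` so the snippet runs without Hypothesis installed.

```python
from decimal import Context, Decimal, InvalidOperation, localcontext


class InvalidArgument(ValueError):
    """Local stand-in for hypothesis.errors.InvalidArgument (assumption)."""


# Limit on the digits of each bound, as proposed in this issue.
MAX_BOUND_DIGITS = 10**4


def parse_bound(value, name):
    """Hypothetical helper: convert a bound to Decimal with clear errors."""
    # Trap InvalidOperation so a malformed string raises instead of
    # silently becoming NaN. Decimal(...) itself converts exactly.
    with localcontext(Context(traps=[InvalidOperation])):
        try:
            result = Decimal(value)
        except InvalidOperation:
            raise InvalidArgument(
                f"Cannot convert {name}={value!r} to a Decimal"
            ) from None
    if result.is_finite() and len(result.as_tuple().digits) > MAX_BOUND_DIGITS:
        raise InvalidArgument(
            f"{name} has more than {MAX_BOUND_DIGITS} digits; "
            "consider the fractions() strategy for unlimited precision"
        )
    return result


print(parse_bound("1.5", "min_value"))
```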


lmshk · 2017-09-12T13:02:02Z

This also includes decimals with very small exponents. On my machine,

from decimal import Decimal
from hypothesis.strategies import decimals

decimals(max_value=Decimal(0).next_minus()).example()

takes minutes (on Python 3.5.2, hypothesis 3.25.0) and returns a value with (literally) a million decimal places. (I had expected getcontext().prec many.)

(I am trying to generate decimals from half-open intervals and thought that this way I might get around filtering the bounds.)

Zac-HD · 2017-09-12T13:24:07Z

Yeesh, that's the bug alright 😞. Thanks for the report and I'm sorry you ran into it!

The immediate cause is that Decimal(0).next_minus() gives Decimal('-1E-1000026'), which certainly surprised me. FWIW I think of 'negative one-and-a-bit million' as a large but negative number, and the correct behaviour (adjust the exponent, take advantage of the fact that it's a floating-point format) is the same as above.
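The surprising exponent follows from the context: the smallest representable step is 10 to the power Etiny, where Etiny = Emin - prec + 1. A small stdlib-only check, with the default context values (prec=28, Emin=-999999) written out explicitly so the result does not depend on the ambient context:

```python
from decimal import Context, Decimal

# Same prec and exponent limits as Python's default context, stated explicitly.
ctx = Context(prec=28, Emin=-999999, Emax=999999)

# Etiny is the exponent of the smallest subnormal: Emin - prec + 1.
etiny = ctx.Etiny()

# The largest representable value below zero has exactly that exponent.
tiny = Decimal(0).next_minus(context=ctx)

print(etiny, tiny.as_tuple().exponent)
```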

As a workaround, I suggest picking a small bound and using that, e.g. Decimal(10) ** -30, optionally with the places= argument if you have a view on that.
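A sketch of that workaround for strictly negative decimals, assuming a reasonably recent Hypothesis is installed. The lower bound of -1000 and places=30 are arbitrary choices for illustration, not recommendations from the thread.

```python
from decimal import Decimal

from hypothesis import given, settings
from hypothesis.strategies import decimals

# A modest bound just below zero, instead of Decimal(0).next_minus().
UPPER = -(Decimal(10) ** -30)


@settings(max_examples=50, deadline=None, database=None)
@given(decimals(min_value=-1000, max_value=UPPER, places=30))
def check_strictly_negative(value):
    # Every generated value stays within the bounds and below zero.
    assert -1000 <= value <= UPPER
    assert value < 0


check_strictly_negative()
```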

DRMacIver · 2017-09-12T13:33:21Z

Filtering the bounds is also a perfectly reasonable thing to do I think - filter is only really problematic if the event is common, and hitting exact bounds shouldn't be.
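For a half-open interval such as [0, 1), filtering out the excluded endpoint could look like the following sketch. It assumes Hypothesis is installed, and the interval is only an example.

```python
from decimal import Decimal

from hypothesis import given, settings
from hypothesis.strategies import decimals

# Closed interval [0, 1], then reject the single excluded endpoint.
half_open = decimals(
    min_value=0, max_value=1, allow_nan=False, allow_infinity=False
).filter(lambda d: d != 1)


@settings(max_examples=50, deadline=None, database=None)
@given(half_open)
def check_half_open(value):
    assert isinstance(value, Decimal)
    assert 0 <= value < 1


check_half_open()
```

The filter rejects only the exact value 1, so it should rarely discard an example.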

Zac-HD · 2024-03-12T08:26:00Z

...since nobody seems to have needed this in the, uh, six-and-a-half years since I opened the issue, I'm going to close it again 😅

Zac-HD added the bug something is clearly wrong here label Sep 7, 2017

Zac-HD self-assigned this Oct 7, 2017

Zac-HD mentioned this issue Oct 29, 2017

[WIP] Handle pathological inputs to decimals() #955

Closed

Zac-HD added the performance go faster! use less memory! label Feb 25, 2018

Zac-HD added this to the 3.x milestone Mar 1, 2018

Zac-HD removed this from the 3.x milestone Mar 16, 2018

Zac-HD removed their assignment Mar 17, 2018

Zac-HD removed the bug something is clearly wrong here label Apr 30, 2019

Zac-HD closed this as completed Mar 12, 2024
