Pathological performance of decimals() with very large exponents or precision #838

Closed
3 tasks
Zac-HD opened this issue Sep 7, 2017 · 4 comments
Labels
performance go faster! use less memory!

Comments

@Zac-HD
Member

Zac-HD commented Sep 7, 2017

After merging #789, which fixed several bugs affecting previous uses of the decimals() strategy, I started poking around at the behaviour for very large exponents.

  • Decimals with very large exponents (up to 10**18 - 1) are perfectly valid inputs to the decimals() strategy... but casting the bounds to integers or Fractions consumes a great deal of memory and hangs for a very long time, and even if the cast completes, the Conjecture buffer is then too small to produce an example.

  • We can take advantage of the floating-point nature of Decimals: cancel out exponents of the bounding values before we construct the underlying strategy, and adjust the drawn value accordingly. This is fiddly but doable, if we are careful about the precision context and the places argument.

  • The decimal precision context can be set so high that operations simply never complete - as in, would require 10**18 - 1 digits. Try Context(prec=MAX_PREC).divide(1, 3) as an example, but be prepared to kill the process 😉

  • I think it is reasonable to limit the precision of the bounding values to 10**4 digits each. If more is desired, the error can direct users to the fractions() strategy for truly unlimited-precision arithmetic. Internally, we can then clip precision to at most 10**4 digits without risking bounds errors, if we also choose a rounding mode that can't take us outside of the bounds (also fiddly but possible).
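To make the last point concrete, here is a minimal sketch of clipping the bounds to a fixed number of digits without leaving the original interval. The helper name clip_bounds is hypothetical and not part of Hypothesis; the idea, assumed here, is simply to round the lower bound up and the upper bound down, using only standard-library decimal calls.

```python
from decimal import Context, Decimal, ROUND_CEILING, ROUND_FLOOR


def clip_bounds(lo, hi, max_digits):
    """Hypothetical helper: round finite bounds inward to max_digits digits.

    The lower bound is rounded towards +infinity and the upper bound towards
    -infinity, so the clipped interval is always inside the original one.
    """
    # Context.plus applies the context's precision and rounding mode.
    lo_clipped = Context(prec=max_digits, rounding=ROUND_CEILING).plus(lo)
    hi_clipped = Context(prec=max_digits, rounding=ROUND_FLOOR).plus(hi)
    if lo_clipped > hi_clipped:
        # The interval is too narrow to contain any value with this few digits.
        raise ValueError("no value with %d digits lies within the bounds" % max_digits)
    return lo_clipped, hi_clipped


clipped = clip_bounds(Decimal("0.123456"), Decimal("0.987654"), 3)
```

A real implementation would use something like the 10**4 digit limit suggested above; three digits are used here only to keep the result easy to read.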

Motivating example: decimals('10E9999999999999998', '10E9999999999999999').example() hangs forever.
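For bounds of this shape, the exponent-cancelling idea could look roughly like the sketch below. The names rebase_bounds and restore_exponent are hypothetical, the sketch only handles finite bounds, and it uses much smaller exponents than the motivating example so that it runs on any platform. It works on the (sign, digits, exponent) tuples directly, since constructing a Decimal from a tuple is exact and does not consult the precision context.

```python
from decimal import Decimal


def rebase_bounds(lo, hi):
    """Hypothetical helper: subtract the smaller exponent from both finite bounds."""
    shift = min(lo.as_tuple().exponent, hi.as_tuple().exponent)

    def rebase(d):
        sign, digits, exponent = d.as_tuple()
        return Decimal((sign, digits, exponent - shift))

    return rebase(lo), rebase(hi), shift


def restore_exponent(value, shift):
    """Hypothetical helper: add the removed exponent back onto a drawn value."""
    sign, digits, exponent = value.as_tuple()
    return Decimal((sign, digits, exponent + shift))


lo, hi, shift = rebase_bounds(Decimal("1E+99999998"), Decimal("1E+99999999"))
# The rebased bounds are small enough to convert to integers cheaply.
small_lo, small_hi = int(lo), int(hi)
# Pretend the underlying strategy drew 5 from [small_lo, small_hi].
drawn = restore_exponent(Decimal(5), shift)
```

This leaves out the fiddly parts mentioned above: the places argument, the precision context, and bounds whose exponents differ by a large amount.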

@Zac-HD Zac-HD added the bug something is clearly wrong here label Sep 7, 2017
@lmshk

lmshk commented Sep 12, 2017

This also includes decimals with very small exponents. On my machine,

from decimal import Decimal
from hypothesis.strategies import decimals

decimals(max_value=Decimal(0).next_minus()).example()

takes minutes (on Python 3.5.2, Hypothesis 3.25.0) and returns a value with (literally) a million decimal places. (I had expected getcontext().prec decimal places.)

(I am trying to generate decimals from half-open intervals and thought that this way I might get around filtering the bounds.)

@Zac-HD
Member Author

Zac-HD commented Sep 12, 2017

Yeesh, that's the bug alright 😞. Thanks for the report and I'm sorry you ran into it!

The immediate cause is that Decimal(0).next_minus() gives Decimal('-1E-1000026'), which certainly surprised me. FWIW I think of 'negative one-and-a-bit million' as a large but negative number, and the correct behaviour (adjust the exponent, take advantage of the fact that it's a floating-point format) is the same as above.

As a workaround, I suggest picking a small bound and using that, eg Decimal(10) ** -30, optionally with the places= argument if you have a view on that.
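For anyone wondering where the million-digit value comes from, here is a small standard-library sketch. It assumes the usual default context settings (prec=28, Emin=-999999) and builds an explicit Context with those values rather than relying on the global one. The variable names are only for illustration.

```python
from decimal import Context, Decimal

# Assumption: these settings mirror the default decimal context.
ctx = Context(prec=28, Emin=-999999, Emax=999999)

# The smallest exponent available to subnormal values is Emin - prec + 1.
etiny = ctx.Etiny()

# The next representable value below zero sits at that exponent.
tiny = ctx.next_minus(Decimal(0))

# The suggested workaround bound: a small negative number with a modest exponent.
workaround_bound = -(Decimal(10) ** -30)
```

Here tiny has an adjusted exponent of about minus one million, while workaround_bound has an adjusted exponent of -30, so passing the latter as max_value avoids the enormous exponent entirely.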

@DRMacIver
Member

Filtering the bounds is also a perfectly reasonable thing to do, I think. A filter is only really problematic if the filtered-out event is common, and hitting the exact bounds shouldn't be.

@Zac-HD Zac-HD self-assigned this Oct 7, 2017
@Zac-HD Zac-HD added the performance go faster! use less memory! label Feb 25, 2018
@Zac-HD Zac-HD added this to the 3.x milestone Mar 1, 2018
@Zac-HD Zac-HD removed this from the 3.x milestone Mar 16, 2018
@Zac-HD Zac-HD removed their assignment Mar 17, 2018
@Zac-HD Zac-HD removed the bug something is clearly wrong here label Apr 30, 2019
@Zac-HD
Member Author

Zac-HD commented Mar 12, 2024

...since nobody seems to have needed this in the, uh, six-and-a-half years since I opened the issue, I'm going to close it again 😅

@Zac-HD Zac-HD closed this as completed Mar 12, 2024
3 participants