
Conversation


@lschoe lschoe commented Mar 30, 2024

For n a power of two, 50% reduction of random bits use.

@bedevere-app

bedevere-app bot commented Mar 30, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.


@rhettinger rhettinger self-assigned this Mar 30, 2024
@rhettinger
Contributor

We've previously decided NOT to do this. The method isn't broken; it just isn't optimal for a power of two input. Also, we don't change the output of existing random number sequences without a very good reason (reproducible science is a priority). Also, computing n - 1 instead of n slows down a method that is used inside tight loops.

@rhettinger rhettinger closed this Mar 30, 2024
@eendebakpt
Contributor

For reference, the previous discussion is here: #81181

@lschoe
Author

lschoe commented Mar 30, 2024

I've seen discussions about this method in the past, and in fact I've held off on this PR for quite some time (years).
However, with Python 3.13 now in alpha, I think it's a good time to reconsider this n-1 version, for the following reasons:

  • since Python 3.9, the call r = getrandbits(k) is allowed for k=0, in which case it correctly returns r=0
  • since Python 3.11, the method is specified to work for n > 0 (the case n == 0 is excluded), so bit_length() is never applied to a negative integer
  • the 50% reduction in random bits for the power-of-two case is substantial, especially when the bits are the costlier ones generated via the secrets module (see the sketch after this list)
  • given the optimizations since Python 3.11 continuing into 3.13, the extra cost of n-1 may now be affordable even in tight(er) loops
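
To make the power-of-two point concrete, here is a small sketch (illustrative only, not part of this PR) of the bit widths requested by the current choice k = n.bit_length() versus the proposed k = (n - 1).bit_length() when n = 2**m:

    # Illustrative only: for n = 2**m the current choice requests m + 1 bits,
    # so only half of the draws fall below n and the rejection loop retries
    # about once on average; the n-1 choice requests exactly m bits, so every
    # draw is accepted.
    for m in (1, 4, 10):
        n = 2 ** m
        print(n, n.bit_length(), (n - 1).bit_length())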

Reproducibility across Python 3.12 and 3.13 is affected; I don't know how bad that will be.

@rhettinger
Contributor

rhettinger commented Mar 30, 2024

FWIW, if it is a power of two you care about, just use getrandbits(). It is much faster.
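
For example (a minimal sketch, not from this PR):

    from random import getrandbits, randrange

    x = getrandbits(10)     # uniform over range(2**10); one call, no rejection loop
    y = randrange(2 ** 10)  # same distribution, but goes through the _randbelow helper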

Reproducibility across Python 3.12 and 3.13 is affected; I don't know how bad that will be.

It is always bad for some users. We really don't want to do this. In my former field, audit and public accounting, reproducibility was a fundamental, immovable requirement. In scientific fields, reproducible sequences are also becoming the norm.

the use of n-1 may now be affordable even in tight(er) loops ...

That is speculative. A quick timeit run shows that n - 1 is slower than just n. To see how much people care about this, consider that the entire point of the __init_subclass__ code was to save just a single comparison.

Also, timeit shows that getrandbits() is itself very cheap (27 ns on my machine), so there isn't much saving to be had.
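
A rough way to reproduce those measurements (a sketch; the absolute numbers vary by machine):

    from timeit import timeit

    n = 1000
    # each call runs its statement 1,000,000 times and reports total seconds
    print(timeit("getrandbits(8)", "from random import getrandbits"))
    print(timeit("n.bit_length()", globals={"n": n}))        # current: k = n.bit_length()
    print(timeit("(n - 1).bit_length()", globals={"n": n}))  # proposed: extra subtraction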

I've held off on this PR for quite some time (years)

That doesn't make sense to me. For anyone with the level of sophistication to want this micro-optimization, which improves one corner case at the expense of all the others, it is trivially easy to implement it yourself, or just call getrandbits(), or use numpy, which is designed for higher performance.

Further note that _randbelow is private. It is only a helper function for randrange(), shuffle(), and sample(). The latter two make many calls to _randbelow at many different sizes that are not a power of two, so they would always be worse off. Only randrange() could potentially benefit, but even then the other code in randrange() and _randbelow adds substantial overhead, so the percentage gain in the power-of-two case will be small and unlikely to be noticeable.
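
To see the kinds of bounds shuffle() actually feeds to the helper, one can instrument it (a quick sketch that relies on the private method, so illustrative only):

    import random

    class CountingRandom(random.Random):
        """Record every bound n passed to the private _randbelow helper."""
        def __init__(self, *args):
            super().__init__(*args)
            self.bounds = []
        def _randbelow(self, n):
            self.bounds.append(n)
            return super()._randbelow(n)

    rng = CountingRandom(42)
    rng.shuffle(list(range(16)))
    print(rng.bounds)  # 16, 15, ..., 2 -- mostly not powers of two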

For n a power of two, 50% reduction of random bits use.

For every other case, where n is not a power of two, it slows down the code and damages reproducibility. These are fatal flaws in the proposal.

@lschoe
Author

lschoe commented Apr 1, 2024

OK, I would agree that reproducibility is the main issue here, but the question is whether it's worthwhile to make a change like this anyway for a new Python version like 3.13.

The n-1 takes some extra time, but if we drop the getrandbits = self.getrandbits caching from the existing code, the time gained seems to more than compensate for the time needed for the n-1, like this:

    def _randbelow_with_getrandbits(self, n):
        "Return a random int in the range [0,n).  Defined for n > 0."

        k = (n - 1).bit_length()  # 2**(k-1) < n <= 2**k with k >= 0
        while (r := self.getrandbits(k)) >= n:
            pass
        return r

While the existing code has n = 2**k as its worst case, this code turns n = 2**k into its best case.
That best case halves the number of getrandbits calls, which matters even more when the random bits are the costlier ones generated via the secrets module:

> py -3.13 -m timeit -s "from random import getrandbits" "getrandbits(8)"
10000000 loops, best of 5: 27.7 nsec per loop
> py -3.13 -m timeit -s "from secrets import randbits" "randbits(8)"
1000000 loops, best of 5: 345 nsec per loop

So, with the above code, there is no slowdown anymore and the best case is restored.
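
As a sanity check of the 50% claim (a sketch, not part of the patch), one can count the average number of getrandbits calls per draw for n = 2**8 under both choices of k:

    from random import getrandbits

    def draws(n, k, trials=100_000):
        """Average number of getrandbits(k) calls a rejection loop needs per draw."""
        total = 0
        for _ in range(trials):
            calls = 1
            while getrandbits(k) >= n:  # rejection loop, as in _randbelow
                calls += 1
            total += calls
        return total / trials

    n = 2 ** 8
    print(draws(n, n.bit_length()))        # existing k: about 2.0 calls on average
    print(draws(n, (n - 1).bit_length()))  # proposed k: exactly 1.0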

Of course, I know I can apply these patches myself and write extra code to get the same effect. But the point is to have this "in the language", implemented by default.

Reproducibility across Python 3.12 and 3.13 will be affected. I don't know how bad that is (only two test cases needed a fix to make all tests pass with the new code).

@rhettinger
Contributor

Thank you for the suggestion, but I will decline.
