
Conversation


@lschoe lschoe commented Mar 30, 2024

For n a power of two, 50% reduction of random bits use.

@bedevere-app

bedevere-app bot commented Mar 30, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.


@rhettinger rhettinger self-assigned this Mar 30, 2024
@rhettinger
Contributor

We've previously decided NOT to do this. The method isn't broken; it just isn't optimal for a power of two input. Also, we don't change the output of existing random number sequences without a very good reason (reproducible science is a priority). Also, computing n - 1 instead of n slows down a method that is used inside tight loops.

@rhettinger rhettinger closed this Mar 30, 2024
@eendebakpt
Contributor

For reference, the previous discussion is here: #81181

@lschoe
Author

lschoe commented Mar 30, 2024

I've seen discussions about this method in the past, and in fact I've held off on this PR for quite some time (years).
However, with Python 3.13 now in alpha, I think it's a good time to reconsider this n-1 version, for the following reasons:

  • since Python 3.9, the call r = getrandbits(k) is allowed for k=0, in which case it correctly returns r=0
  • since Python 3.11, the method is specified to work for n > 0 (the case n == 0 is excluded), so bit_length() is never applied to a negative integer
  • the 50% reduction in random bits for the power-of-two case is substantial, especially when the bits are the costlier ones generated via the secrets module (see the sketch after this list)
  • given the optimizations since Python 3.11 continuing into 3.13, the extra cost of n-1 may now be affordable even in tight(er) loops
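
To make the power-of-two point concrete, here is a small sketch (illustrative only, not part of this PR) of the bit widths requested by the current choice k = n.bit_length() versus the proposed k = (n - 1).bit_length() when n = 2**m:

    # Illustrative only: for n = 2**m the current choice requests m + 1 bits,
    # so only half of the draws fall below n and the rejection loop retries
    # about once on average; the n-1 choice requests exactly m bits, so every
    # draw is accepted.
    for m in (1, 4, 10):
        n = 2 ** m
        print(n, n.bit_length(), (n - 1).bit_length())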

Reproducibility across Python 3.12 and 3.13 is affected; I don't know how bad that will be.

@rhettinger
Contributor

rhettinger commented Mar 30, 2024

FWIW, if it is a power of two you care about, just use getrandbits(). It is much faster.
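
For example (a minimal sketch, not from this PR):

    from random import getrandbits, randrange

    x = getrandbits(10)     # uniform over range(2**10); one call, no rejection loop
    y = randrange(2 ** 10)  # same distribution, but goes through the _randbelow helper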

Reproducibility across Python 3.12 and 3.13 is affected; I don't know how bad that will be.

It is always bad for some users. We really don't want to do this. In my former field, audit and public accounting, reproducibility was a fundamental, immovable requirement. In scientific fields, reproducible sequences are also becoming the norm.

the use of n-1 may now be affordable even in tight(er) loops ...

That is speculative. A quick timeit run shows that n - 1 is slower than just n. To see how much people care about this, consider that the entire point of the __init_subclass__ code was to save just a single comparison.

Also, timeit shows that getrandbits() is itself very cheap (27 ns on my machine), so there isn't much saving to be had.
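
A rough way to reproduce those measurements (a sketch; the absolute numbers vary by machine):

    from timeit import timeit

    n = 1000
    # each call runs its statement 1,000,000 times and reports total seconds
    print(timeit("getrandbits(8)", "from random import getrandbits"))
    print(timeit("n.bit_length()", globals={"n": n}))        # current: k = n.bit_length()
    print(timeit("(n - 1).bit_length()", globals={"n": n}))  # proposed: extra subtraction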

I've held off on this PR for quite some time (years)

That doesn't make sense to me. For anyone with the level of sophistication to want this micro-optimization, which improves one corner case at the expense of all the others, it is trivially easy to implement it yourself, or just call getrandbits(), or use numpy, which is designed for higher performance.

Further note that _randbelow is private. It is only a helper function for randrange(), shuffle(), and sample(). The latter two make many calls to _randbelow at many different sizes that are not a power of two, so they would always be worse off. Only randrange() could potentially benefit, but even then the other code in randrange() and _randbelow adds substantial overhead, so the percentage gain in the power-of-two case will be small and unlikely to be noticeable.
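
To see the kinds of bounds shuffle() actually feeds to the helper, one can instrument it (a quick sketch that relies on the private method, so illustrative only):

    import random

    class CountingRandom(random.Random):
        """Record every bound n passed to the private _randbelow helper."""
        def __init__(self, *args):
            super().__init__(*args)
            self.bounds = []
        def _randbelow(self, n):
            self.bounds.append(n)
            return super()._randbelow(n)

    rng = CountingRandom(42)
    rng.shuffle(list(range(16)))
    print(rng.bounds)  # 16, 15, ..., 2 -- mostly not powers of two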

For n a power of two, 50% reduction of random bits use.

For every other case, where n is not a power of two, it slows down the code and damages reproducibility. These are fatal flaws in the proposal.

@lschoe
Author

lschoe commented Apr 1, 2024

OK, I would agree that reproducibility is the main issue here, but the question is whether it's worthwhile to make a change like this anyway for a new Python version like 3.13.

The n-1 takes some extra time, but if we drop the getrandbits = self.getrandbits caching from the existing code, the time gained seems to more than compensate for the time needed for the n-1, like this:

    def _randbelow_with_getrandbits(self, n):
        "Return a random int in the range [0,n).  Defined for n > 0."

        k = (n - 1).bit_length()  # 2**(k-1) < n <= 2**k with k >= 0
        while (r := self.getrandbits(k)) >= n:
            pass
        return r

While the existing code has n = 2**k as its worst case, this code turns n = 2**k into its best case.
That best case halves the number of getrandbits calls, which matters even more when the random bits are the costlier ones generated via the secrets module:

> py -3.13 -m timeit -s "from random import getrandbits" "getrandbits(8)"
10000000 loops, best of 5: 27.7 nsec per loop
> py -3.13 -m timeit -s "from secrets import randbits" "randbits(8)"
1000000 loops, best of 5: 345 nsec per loop

So, with the above code, there is no slowdown anymore and the best case is restored.
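
As a sanity check of the 50% claim (a sketch, not part of the patch), one can count the average number of getrandbits calls per draw for n = 2**8 under both choices of k:

    from random import getrandbits

    def draws(n, k, trials=100_000):
        """Average number of getrandbits(k) calls a rejection loop needs per draw."""
        total = 0
        for _ in range(trials):
            calls = 1
            while getrandbits(k) >= n:  # rejection loop, as in _randbelow
                calls += 1
            total += calls
        return total / trials

    n = 2 ** 8
    print(draws(n, n.bit_length()))        # existing k: about 2.0 calls on average
    print(draws(n, (n - 1).bit_length()))  # proposed k: exactly 1.0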

Of course, I know I can apply these patches myself and write extra code to get the same effect. But the point is to have this "in the language", implemented by default.

Reproducibility across Python 3.12 and 3.13 will be affected. I don't know how bad that is (only two test cases needed a fix to make all tests pass with the new code).

@rhettinger
Contributor

Thank you for the suggestion, but I will decline.
