Fix _randbelow_with_getrandbits(). #117388
For n a power of two, this reduces the use of random bits by 50%.
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
We've previously decided NOT to do this. The method isn't broken; it just isn't optimal for a power-of-two input. Also, we don't change the output of existing random number sequences without a very good reason (reproducible science is a priority). Also, the
For reference, previous discussion is here #81181 |
I've seen discussions about this method in the past, and in fact I've held off on this PR for quite some time (years).
Reproducibility across Python 3.12 and 3.13 is affected; I don't know how bad this will be.
FWIW, if it is a power of two you care about, just use
It is always bad for some users. We really don't want to do this. In my former field, audit and public accounting, reproducibility was a fundamental, immovable requirement. In scientific fields, reproducible sequences are also becoming the norm.
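As a small illustration of the reproducibility requirement described here: a seeded generator is expected to return the same sequence every time it is run. The helper name `sample` and the seed value below are illustrative assumptions, not anything from the PR.

```python
import random

def sample(seed, count=5, n=1000):
    """Draw `count` values below `n` from a freshly seeded generator."""
    rng = random.Random(seed)
    return [rng.randrange(n) for _ in range(count)]

# Two runs with the same seed (an arbitrary, illustrative value)
# produce identical sequences within one Python version.
first = sample(12345)
second = sample(12345)
print(first == second)  # True
```

Any change to how `randrange()` consumes random bits can change these values between Python versions, which is the concern raised in this comment.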
That is speculative. A quick timeit run shows that Also timeit shows that
That doesn't make sense to me. For anyone with the level of sophistication to want this micro-optimization that improves one corner case at the expense of all the others, it is trivially easy to implement this yourself or just call Further note that
For every other case, not a power of two, it slows down the code and it damages reproducibility. These are fatal flaws in the proposal. |
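For readers who do want the power-of-two fast path without changing the standard library, a user-side helper could look roughly like the sketch below. The name `randbelow_pow2` is hypothetical, and this is only a guess at what "implement this yourself" might mean in practice, not necessarily what the commenter had in mind.

```python
import random

def randbelow_pow2(rng, n):
    """Return a random int in [0, n) for n a power of two, using one draw.

    Hypothetical helper: it relies only on Random.getrandbits().
    """
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    # For n == 2**m, exactly m random bits cover the range [0, n).
    # (For n == 1 this asks for 0 bits, which Python 3.9+ accepts.)
    return rng.getrandbits(n.bit_length() - 1)

rng = random.Random(2024)  # arbitrary seed for the demonstration
values = [randbelow_pow2(rng, 16) for _ in range(1000)]
print(min(values) >= 0 and max(values) < 16)  # True
```

Because the helper never rejects a draw, each call consumes a single `getrandbits()` result, and the standard library's existing sequences are left untouched.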
OK, I would agree that reproducibility is the main issue here, but the question is if it's worthwhile to make a change like this anyway for a new Python version like 3.13. The

    def _randbelow_with_getrandbits(self, n):
        "Return a random int in the range [0,n).  Defined for n > 0."
        k = (n - 1).bit_length()  # 2**(k-1) < n <= 2**k with k >= 0
        while (r := self.getrandbits(k)) >= n:
            pass
        return r

While the existing code has
So, with the above code, there is no slowdown anymore and the best case is restored. Of course, I know I can make these patches myself, and write extra code to get the same effect. But the point is to have this "in the language" implemented by default. Reproducibility across Python 3.12 and 3.13 will be affected. I don't know how bad that is (only 2 test cases needed a fix to make all tests pass with the new code). |
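The difference in random-bit consumption can be checked with a rough experiment. The sketch below counts `getrandbits()` calls for both variants. `randbelow_proposed` mirrors the code in this comment. `randbelow_current` is an assumption about the shape of the existing method (it uses `n.bit_length()` for `k`), since the thread does not show it in full. `CountingRandom` and `draws_used` are hypothetical helpers.

```python
import random

class CountingRandom(random.Random):
    """Random subclass that counts getrandbits() calls (illustrative only)."""

    def __init__(self, x=None):
        super().__init__(x)
        self.calls = 0

    def getrandbits(self, k):
        self.calls += 1
        return super().getrandbits(k)

def randbelow_current(rng, n):
    # Assumed shape of the existing method: k = n.bit_length().
    k = n.bit_length()
    r = rng.getrandbits(k)
    while r >= n:
        r = rng.getrandbits(k)
    return r

def randbelow_proposed(rng, n):
    # Code from this comment: k = (n - 1).bit_length().
    k = (n - 1).bit_length()
    while (r := rng.getrandbits(k)) >= n:
        pass
    return r

def draws_used(randbelow, n, trials=10_000, seed=1):
    """Count getrandbits() calls needed for `trials` results below `n`."""
    rng = CountingRandom(seed)
    for _ in range(trials):
        randbelow(rng, n)
    return rng.calls

for n in (1000, 1024):
    print(n, draws_used(randbelow_current, n), draws_used(randbelow_proposed, n))
```

Under these assumptions, the proposed version needs exactly one draw per result when n is a power of two, while the assumed current version rejects about half of its draws for such n. For n = 1000 both versions compute the same `k` and so use the same number of draws.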
Thank you for the suggestion, but I will decline. |