Random objects twice as big as necessary on 64-bit builds #67676
Comments
Modules/_randommodule.c implements the 32-bit version of the Mersenne Twister, and its struct uses (unsigned long) for each of the 624 elements of the state vector. On a 32-bit build, the unsigned longs are 4 bytes each. However, on a 64-bit build, they are 8 bytes each even though only the bottom 32 bits are used. This makes the random object twice as big as necessary: sys.getsizeof(_random.Random()) reports 5016 bytes. This wastes memory, grinds the cache, and slows performance. The (unsigned long) declaration should probably be replaced with (uint32_t). |
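A rough way to check the arithmetic behind the reported size, sketched with ctypes rather than the actual C struct: the 624-word state array is laid out once with C `unsigned long` elements and once with 32-bit elements. The helper name `state_bytes` is hypothetical, and the `unsigned long` figure depends on the platform (typically 8 bytes per element on 64-bit Linux and macOS builds, 4 on Windows and 32-bit builds).

```python
import ctypes

N = 624  # number of words in the Mersenne Twister state vector

def state_bytes(element_type):
    """Size in bytes of an N-element C array of element_type (hypothetical helper)."""
    return ctypes.sizeof(element_type * N)

as_ulong = state_bytes(ctypes.c_ulong)    # platform-dependent: 4 or 8 bytes per element
as_uint32 = state_bytes(ctypes.c_uint32)  # always 4 bytes per element

print("unsigned long state:", as_ulong, "bytes")
print("uint32_t state:     ", as_uint32, "bytes")
```

Where `unsigned long` is 8 bytes, the first figure is 4992 bytes, which accounts for most of the 5016 bytes reported above.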
Would it be possible to benchmark this change, to ensure that it doesn't hurt performance? A quick micro-benchmark using timeit should be enough ;) I agree with the change. I already noticed the unused bits a long time ago, when I took a look at the Mersenne Twister implementation. |
Oh, by the way, using 32-bit unsigned integers would make all the "& 0xffffffff" masks unnecessary. |
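To illustrate the masking point: when a 32-bit value is held in a wider integer, a shift can spill into the upper bits and has to be masked back, whereas 32-bit unsigned storage truncates on its own. The sketch below mimics this in Python with `ctypes.c_uint32`. The constant and the shift amount are only illustrative and are not taken from the patch.

```python
import ctypes

MASK32 = 0xFFFFFFFF
x = 0x9908B0DF  # arbitrary 32-bit test value

# Wider integer: the shift overflows 32 bits, so an explicit mask is needed.
wide = (x << 7) & MASK32

# 32-bit unsigned storage: the value is truncated to 32 bits automatically.
narrow = ctypes.c_uint32(x << 7).value

print(hex(wide), hex(narrow))
```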
Yes, I noticed this when reimplementing the random module in Numba. (note Numpy also uses C longs internally)
There is no way it can kill performance. |
Here is a patch. It also optimizes getrandbits() and seed() as was originally proposed in bpo-16496. |
Where?! BTW, we might want to use PY_UINT32_T rather than uint32_t directly. |
Oh, sorry, here it is. |
Some microbenchmark results on 32-bit Linux:
$ ./python -m timeit -s "from random import getrandbits" -- "getrandbits(64)"
Before: 1000000 loops, best of 3: 1.41 usec per loop
After: 1000000 loops, best of 3: 1.34 usec per loop
$ ./python -m timeit -s "from random import getrandbits" -- "getrandbits(2048)"
Before: 100000 loops, best of 3: 5.84 usec per loop
After: 100000 loops, best of 3: 5.61 usec per loop
$ ./python -m timeit -s "from random import getrandbits" -- "getrandbits(65536)"
Before: 10000 loops, best of 3: 145 usec per loop
After: 10000 loops, best of 3: 137 usec per loop |
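For anyone wanting to repeat a measurement of this kind from within Python instead of the command line, here is a minimal sketch using the `timeit` module. The helper name `bench_getrandbits` and the loop counts are assumptions chosen for illustration. Absolute timings will differ by machine and build.

```python
import timeit

def bench_getrandbits(k, number=1000, repeat=3):
    """Best per-call time in microseconds for random.getrandbits(k) (hypothetical helper)."""
    timer = timeit.Timer(f"getrandbits({k})", setup="from random import getrandbits")
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6

# Same bit counts as in the command-line runs above.
results = {k: bench_getrandbits(k) for k in (64, 2048, 65536)}
for k, usec in results.items():
    print(f"getrandbits({k}): {usec:.3f} usec per loop")
```

Running it once on a build without the patch and once on a build with it gives a before/after comparison of the same form.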
The patch looks good to me. For uint32_t, see my old issue bpo-17884: "Try to reuse stdint.h types like int32_t". |
... and you can also re-read my explanations in that issue about why simply using uint32_t and int32_t doesn't work! We need something like PY_UINT32_T (and co) for portability. The only part of bpo-17884 that's still valid is that it may well make sense to insist that exact-width 32-bit and 64-bit signed and unsigned integer types are available when building Python. |
The updated patch addresses some of Victor's comments. |
Ping. |
You should fix the comment as mentioned in the review, otherwise looks good to me. |
This is good to go. |
This would have gone quicker if the size bug-fix hadn't been commingled with the optimization. |
New changeset 4b5461dcd190 by Serhiy Storchaka in branch 'default': |
New changeset 16d0e3dda31c by Zachary Ware in branch 'default': |
Thanks Zachary for fixing this. |
Should this be backported? IMO, it is a bug. |