-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generating random unique integers (i.e. non-repeating, random permutations of an integer type) #270
Comments
I believe Xoroshiro is also capable of generating such distributions (see also what this post says about Xoroshiro). We thought a lot about error handling, and reporting via both panic and Since these generators are not ideal general-purpose random number generators I would prefer not adding them directly to the For now I would suggest implementing a stand-alone crate and implementing |
Thanks @cipriancraciun for all the links! Generating random non-repeating integers reminds me a little of dhardy#82 (comment). We seem to have two options:
Working out that last point shouldn't be too hard (I could give pointers). I am not sure how to expose it in a nice API, but do think it would be a useful addition for
It could, but not easily. It works best with integer sizes. Otherwise we have to mask out bits between at least all three xor-shift steps. And every different number of bits needs three different shift constants. They are not easy to calculate, so we would have to include a list of constants for every number of bits we want to generate. |
A few of the points you mentioned would be possible, but quite a bit more complex:
|
The property you are looking for is usually called equidistribution. You could use a linear congruential generator with a full period, the Mersenne Twister or xorshift/xoroshiro. You might have to adapt the standard implementations to have the period you want as suggested in the other comments. |
For anyone for whom this might help, I've recently published a rand-unique crate to generate random sequences of unique integers. It's based on phreshing's unique prng as linked by OP, from which I have built and used a C++ implementation of successfully in the past. This Rust library is using the same logic with support for the common unsigned integer types (u8, u16, u32, u64, and usize). This is a very simple crate making use of hardcoded primes, quadratic residue, and xor to provide the pseudo-randomness. My current thought is that a "random unique sequence" is conceptually different enough from a traditional prng that the interfaces should be slightly different. There is naturally an index and an end to a sequence, where there isn't with a prng. The sequence can wrap of course, but often the user might prefer the sequence to terminate. This is hopefully a decent stab at the beginnings of a random unique sequence. I'm happy to extend the functionality as desired, and any feedback on the interface or algorithm would be welcome. I'm open to doing the work to make rand-unique suitable for merging into the rand crate too if that's a desirable outcome. I'm currently looking at handling arbitrary length sequences (eg. of length 5 or 1000, instead of only u8/256, u16/65536 etc.). This would be useful for generating permutations of arbitrarily sized collections/slices, or building permuting indexes/views into them. It's taking a bit more thought to make the algorithm suitable for small sequences though. Again, suggestions welcome! |
Another approach (but definitely not O(1) and probably only for small numbers of values) is to use choose_multiple. Thanks all for the thoughts, but I don't think this is anything we want to add to rand, so closing. |
Sometimes one needs a source of "random looking" non-repeating integers (non repeating until all the possible values have been generated). In concrete terms the simplest such generator (without the "random looking" property) is a simple counter which generates the values
min .. max
. Another concrete example would be generating a random permutation of themin .. max
range and iterating through that. (However the first is not "random looking" , and the second is not efficient.)What would be the use-case for such a generator?
generating integer "handles" for various objects; although one could use plain counters, they have a few weaknesses: they are extremely "deterministic" and thus guessable, and they are not easily "seen" in logs (imagine trying to visually look through a log which displays many such handles and you're searching for handle "272132", but it appears between other handles in the range
272090 .. 272180
; moreover using afind
functionality you can easily miss-type272123
and you'll find the wrong handle);generating small unique identifiers to be used as keys in a map-like structure, identifiers that disperse well and which guarantee that you don't need to check if that key already exists; for example say that we need to generate files in a folder tree with the pattern
01/23/4567
(where0..7
are the bytes in au64
value); using an "standard" random generator we'll have to loop in a retry; (or anINSERT
in an SQL database with a small primary key, where an error would abort the transaction;)I did find some implementations for such a generator:
I have even re-implemented the last one in Rust (but only for
u32
):However it would be lovely if such a generator would exist in the
rand
library for the integer types (at leastu16
,u32
,u64
andu128
).The only requirements for such a generator would be:
As optional requirements or observations:
I would provide an implementation but I don't have the "mathematical" knowledge to implement or prove them. However if someone points me to such algorithms I could provide the implementation in Rust.
Hope this "feature request" is useful for someone!
The text was updated successfully, but these errors were encountered: