Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move PRNG algorithms out of Rand #431

Closed
5 of 8 tasks
pitdicker opened this issue May 6, 2018 · 47 comments
Closed
5 of 8 tasks

Move PRNG algorithms out of Rand #431

pitdicker opened this issue May 6, 2018 · 47 comments
Labels
B-API Breakage: API P-high Priority: high T-RNG Topic: generators
Milestone

Comments

@pitdicker
Copy link
Contributor

pitdicker commented May 6, 2018

Edit by dhardy: adding a tracker for the planned tasks:

Also: we should consider whether we want to rename PRNGs when moving or adding them.


As decided in dhardy#58.

The idea is to only keep the StdRng and SmallRng wrappers in Rand, and to not expose PRNG algorithms directly.

The field of PRNG algorithms is not static, it seems like a good idea to cut Rand partly loose from the developements to improve our backward-compatability and usability story. Advantages (from the original issue)

  • rand provide easy to reason generator(s) to satisfy the bulk of the users
  • rand provides low-friction upgrades and keeps the optimal-ish algorithms readily available
  • rand will not accumulate and/or deprecate specific algorithms throughout time
  • Specific rand Rng names won't bleed out to the world, like in forum posts, stackoverflow, etc..

Instead we want to provide an official companion crate, rand_rngs, with a couple of blessed implementations.

Whether Rand should depend on rand_rngs or copy the two algorithms used by StdRng and SmallRng is not fully decided. It might reduce code size if a crate depends on both StdRng and its algorithm (currently Hc128Rng) directly, but that seems rare and not worth much. It has the disadvantage that both implementations should remain in sync. Maybe the CI can check this?

If Rand depends on rand_rngs, it would restrict rand_rngs to only depend on rand_core. But that is what rand_core is for. I am for having Rand depend on rand_rngs.

rand_rngs

rand_rngs should provide a good selection of PRNGs.

Including multiple PRNGs in one crate has the advantage of having a single place to provide guidance, as it helps comparing different algorithms with each other. Also it gives one place to offer consistent benchmarks, to run PRNG test suites, to keep up a common quality level, and possibly to develop functionality that is useful for more than one PRNG (e.g. jumping).

Which PRNGs to include has been the topic of endless discussions. A few are basically decided. I fully expect the number of PRNGs to grow over time. But we also should be careful not to expose too many, as there are hundreds. Every PRNG should have one thing that gives it a clear advantage over others, otherwise it is not worth the inclusion.

Normal PRNGs

dhardy#60 explored normal PRNGs. The following 5 I feel comfortable about for an initial version of rand_rngs:

  • PCG-XSH-RR 64/32 (LCG)
  • PCG-XSL-RR 128/64 (MCG)
  • SFC (32-bit variant)
  • SFC (64-bit variant)
  • Xoroshiro128+

The two PCG variants offer good quality and reasonable performance. SFC provides high performance and a chaotic PRNG (not a fixed period). Xoroshiro128+ has acceptable quality but high performance.

An PRNG put together by me, XoroshiroMT, is also good quality and sits between PCG and Xoroshiro128+ qua performance. But now that there is just a new Xoshiro PRNG, it may be worth evaluating that one first, as it is said to also be good quality.

CSPRNGs

For CSPRNGs we currently have two good implementations in Rand

  • ChaCha20
  • HC-128

ChaCha20 offers reasonably good performance and uses little memory, while HC-128 brings high performance at the cost of using much memory.

I would also like to see some implementation of AES-CTR DRBG eventually, as it is commonly used, and has hardware support on modern desktop processors.

Other / deprecated PRNGs

We currently have the ISAAC, ISAAC-64 and Xorshift128/32 PRNGs in rand. They have no real advantages over the PRNGs metioned before. It is better to use HC-128 instead of ISAAC, and Xoroshiro128+ or PCG instead of Xorshift. We can include them in something like a deprecated module, but I propose to move them to stand-alone crates outside the Rand repository.

Steps to take

Does this sound about right?

@vks
Copy link
Collaborator

vks commented May 6, 2018 via email

@pitdicker
Copy link
Contributor Author

See the first post and the linked issue.

@dhardy
Copy link
Member

dhardy commented May 6, 2018

I actually see no reason that we shouldn't have one crate per RNG (or perhaps family of RNGs), but I think it's worth having these in the Rand repo and the crates owned by the Rand team (currently me and the libs team). Or we could try organising more and have a rust-rand project and set of repositories on GH maybe.

@TheIronBorn
Copy link
Collaborator

TheIronBorn commented May 6, 2018

Both the new xoshiro and xoroshiro** variants I think are worth consideration.

Vigna claims xoroshiro256** has a higher degree polynomial than PCG 128 variants https://www.reddit.com/r/programming/comments/8gx2d3/new_line_of_fast_prngs_released_by_the_author_of/dygkkng/, so xoshiro variants are probably even stronger.

Edit: poor wording

@burdges
Copy link
Contributor

burdges commented May 6, 2018

Is the separate repository is merely a transitional detail? It improves code size if any code that gets used actually gets exposed somewhere, even if only in another crate. It would improve tests if all this lives in the rand repository, right?

@dhardy
Copy link
Member

dhardy commented May 7, 2018

@TheIronBorn this issue isn't about what algorithms we include, it's just about organisation and presentation.

@dhardy
Copy link
Member

dhardy commented May 7, 2018

@burdges I'm not sure what you mean about transitional; no. The easiest option is I think just to have sub-projects in this repo (like rand_core).

I'm not even sure if it does help with testing having everything in the same repo; in one way it makes it worse since CI must run all unit-tests on any change (although with API-breaking changes this may be necessary).

Once we have stable core APIs it may make more sense to use separate repos since most changes (docs, new features, tweaks to other features) will not affect other crates. I.e. once rand_core is stable then changes to things like the distribution code will have no effect on the RNG implementations, so separate repos do make some sense.

@pitdicker
Copy link
Contributor Author

In this issue and the previous one I have given arguments that it would really benefit users to have one crate, as it helps comparing and contrasting them. And real but smaller benefits for us. Are those reasons wrong?

Maybe it is good to also collect the reasons multiple crates can be a good idea.

Two related thoughts:

  • most non-cryptographic PRNGs only need 5-10 lines for the algorithm. About 30 additional lines to implement the necessary traits. Then a handful of tests, documentation, license header, and benchmarks. Seems a bit small for a crate to me, but that is not unique.
  • how do we show separate crates are similar enough to compare directly (benchmarks, quality, support)? Anyone is free to create a rand_* crate.

CI must run all unit-tests on any change.

The unit tests we have now for PRNGs run in a fraction of a second, easily a 100 times less than the set-up time for Travis.

@vks
Copy link
Collaborator

vks commented May 7, 2018 via email

@pitdicker pitdicker added B-API Breakage: API T-RNG Topic: generators labels May 11, 2018
@dhardy
Copy link
Member

dhardy commented May 16, 2018

So sounds like

  • we should put new crates within the same repo, at least for now (like rand_core is)
  • everything in the prng module should move
  • still to decide exact crates
  • probably we should make rand depend on whichever crates are necessary for the StdRng and SmallRng algorithms

How many crates do we want? Only 1-3 (e.g. crypto, non-crypto and deprecated)? One crate per family of RNGs? One crate per RNG?

  • having many crates means we don't have to deprecate generators; we can just update the doc and ignore the relevant crate(s) (potentially move to a new repo)
  • having many crates may make comprehensive testing and benchmarking painful (extra scripts required; lots of binaries to compile); having a "master" crate may alleviate this
  • each crate has extra boilerplate (cargo.toml, readme, possibly tests and benchmarks)
  • each crate has to be published independently

@newpavlov do you have anything to say about the many-small-crates design of RustCrypto's hashes (and other projects)?

@newpavlov
Copy link
Member

newpavlov commented May 16, 2018

In my opinion small crates work pretty well (but not too small, i.e. crates for family of algorithms, not for each variation), yes there is some headache with coordinated upgrades to the new trait version, but if rand_core will be relatively stable, I don't think it will be a big problem.

having many crates may make comprehensive testing and benchmarking painful

cargo test --all and cargo bench --all more or less solve this problem. There is an issue with features when you work with virtual manifest, but I don't think that RNG crates will suffer from it.

each crate has extra boilerplate (cargo.toml, readme, possibly tests and benchmarks)

RustCrpyto partially solves this problem by having feature gated dev module in trait crates, which define functions and macros to make it easier to write tests and benchmarks. Though I am not sure if RNG crates should follow this approach.

I would like suggest to move CryptoRng crates to the RustCrypto/CSRNGs repo. (of course with full access granted to @dhardy and lang team) Pros:

  • CSPRNGs usually tightly coupled with stream ciphers, so they could share underlying implementations, which will make easier to work on optimizations.
  • I plan to have other CSRNGs in addition to algorithms from rand crate, e.g. rdrand (@nagisa gave his consent) and aesrng developed by @vks

Con is of course is the split of RNG crates between two organizations.

@dhardy
Copy link
Member

dhardy commented May 16, 2018

Actually we already stopped using --all for most tests because of this issue. It's a shame it's not been solved.

@burdges
Copy link
Contributor

burdges commented May 16, 2018

It'd be nice to save code size by using stream cipher code for CSPRNGs of course, but not that pressing. At first blush, I kinda think moving CSPRNG crates into another project, especially one less blessed by Mozilla, sounds premature and distracting. At minimum, all the BlockRng business will make reading the code annoying.

@dhardy
Copy link
Member

dhardy commented May 17, 2018

With our current PRNGs and currently-proposed additions, a crate-per-family approach looks something like:

  • rand_rng_xorshift
  • rand_rng_xoroshiro (or could be with xorshift; also can include xoshiro)
  • rand_rng_pcg
  • rand_rng_sfc
  • rand_rng_isaac
  • rand_rng_chacha
  • rand_rng_hc (could potentially also house HC-256, if we have any use for it)

Lets not worry about the generators for now (there will probably be more); this is several crates already, and some of them will be very small (pcg unless we add more algorithms, sfc, xorshift unless combined with xoroshiro; also hc and chacha are only 500 lines).

There is nothing inherently wrong with this, but it is a significant amount extra boilerplate just to save pulling unused PRNGs into a project (the entire rand::prng module is only 2600 lines, albeit because we have avoided expanding this much until now).

Are there any alternatives worth discussing?

@vks
Copy link
Collaborator

vks commented May 17, 2018

I would probably use shorter names and drop the rng_, but other than that I think one crate per family is reasonable. I think this is nicer for other crates, because they can depend on the specific algorithm without having to worry whether it is currently used by Rand or not, and we can keep Rand minimal at the same time.

@pitdicker
Copy link
Contributor Author

Just to be sure, the only real argument for having multiple crates is that at some point is becomes 'cleaner' for us to deprecate an RNG?

And it adds the problem of boilerplate, making the PRNGs harder to compare, more difficult to guide users, and anyone is free to make a crate with a seemingly official name.

@pitdicker
Copy link
Contributor Author

I would like suggest to move CryptoRng crates to the RustCrypto/CSRNGs repo. (of course with full access granted to @dhardy and lang team) Pros:

  • CSPRNGs usually tightly coupled with stream ciphers, so they could share underlying implementations, which will make easier to work on optimizations.
  • I plan to have other CSRNGs in addition to algorithms from rand crate, e.g. rdrand (@nagisa gave his consent) and aesrng developed by @vks

Con is of course is the split of RNG crates between two organizations.

I am afraid this would only cause more work, not less. While algorithms may be (mostly) the same, RustCrypto and Rand can have different goals. What happens when one decides some algorithm is fit for inclusion, but the other does not. Or that the trade-off for performance vs security should be different? We would have to have a very clear sense of direction and and idea of what both projects see in such a crate.

Now I am not saying we can't work together, and work together well. But at this point I think it is better for us if we keep this only in mind as an option, and rethink it next year or something like that. Rand has changed a lot in the past couple of months, and it seems like a good idea to me to remain flexible and get our story straight first. Especially the amount of security features to provide is something we have quite a few open issues for, and seems a bit fuzzy.

@newpavlov
Copy link
Member

What happens when one decides some algorithm is fit for inclusion, but the other does not.

if it's officially published algorithm (i.e. not home-brew crypto) designed as CSRNG I will be happy to see it in the repo, so RustCrypto/CSRNGs is intended to be more inclusive than rand.

Or that the trade-off for performance vs security should be different?

Ehm, for CryptoRng implementations priority should be security, otherwise it should not implement this trait.

@pitdicker
Copy link
Contributor Author

pitdicker commented May 17, 2018

Ehm, for CryptoRng implementations priority should be security, otherwise it should not implement this trait.

😄 Maybe I should write a bit clearer about which trade-offs I had in mind. Nothing terrible I hope.

As an example, the BlockRng wrapper (used by ChaChaRng and Hc128Rng) could choose to zero out a value after it was read from the buffer. That brings down the performance of the RNG by 10-20%. It would make it more difficult for someone peeking at the memory to recover previous values. But in my opinion, when someone can inspect the memory of a process, you have already lost. So we currently don't zero after reading a random number from a block of buffered results.

Another one I have in mind is fork protection. I hope we can add that to the ReseedingRng wrapper soon. It means checking a global static and reseeding if it has changed. Should that happen on every iteration, or only when a new block of values has to be generated? Again that is the choice between 1-2% overhead, or 10-20%. I would say having something like 15 duplicated values after a fork (which is not supposed to happen) is an acceptable trade-off, especially because it means users have no good reason to turn off the protection.

@burdges
Copy link
Contributor

burdges commented May 17, 2018

Zeroing ChaChaRng output does nothing. If you can hammer the offset then you can hammer the block number, well they even live in the same struct.

@dhardy
Copy link
Member

dhardy commented May 18, 2018

Just to be sure, the only real argument for having multiple crates is that at some point is becomes 'cleaner' for us to deprecate an RNG?

What's the alternative — maintain a crate that may grow to a hundred or more PRNGs, or be extremely selective about which we include? Perhaps we won't include enough PRNGs for size to be an issue, but in that case what's the motivation for using an external crate in the first place?

The only other reason I can see for separate crates is to allow an app to use rand_core + some PRNG without importing much extra code, i.e. to minimise what needs to be reviewed.

@pitdicker
Copy link
Contributor Author

What's the alternative — maintain a crate that may grow to a hundred or more PRNGs, or be extremely selective about which we include?

PractRand contains 200+ RNGs, and that is already a selection. In my opinion having that many is not useful. Even the opposite, too many choices distract and complicate, so having many choice would be bad.

What I had in mind (with the first post, and the relevant issues in your repro), was to have a small, well chosen selection. Not all that different from how we work now. Users that want more than StdRng or SmallRng should just be able to go to one crate and have a clear overview of good choices. With a selection of PRNGs that come from different families, have different performance, quality, memory usage, security and features. I don't see that become more than 20 choices.

My idea was not to have some super-crate or organization to implement every (possibly even obsolete) PRNG design under the sun. But to have a good resource for users.

And other crates can still have a role in providing more variants or features, like the current PCG crate.

@vks
Copy link
Collaborator

vks commented May 18, 2018

@pitdicker What you are suggesting would be possible by reexporting the smaller crates in a "curation" crate. I would prefer this over duplicating code.

Note that a small selection might become not so small as algorithms become obsolete and get replaced but have to be kept for backwards compability.

@dhardy
Copy link
Member

dhardy commented May 18, 2018

In this case we have three options:

  • Don't add individual crates. In this case we could still potentially add them later.
  • Use individual crates only. Add a section in the Rand documentation giving an overview of good generators across these crates.
  • Add individual crates and a "curation" crate. The only advantages over the above are somewhere more specific to house overview documentation and a convenient place for users to import from. I don't see much advantage here however.

If we keep the number of choices low (e.g. 20 algorithms) then the first option is a good choice, though as @vks says this may grow over time (removing algorithms is not a good idea except perhaps if a CSPRNG is found to be insecure, because it causes problems for any libraries needing reproducibility with old algorithms and requires extra maintenance).

Also, no one has answered my other question yet: if we don't have per-family crates, is there any point at all moving the algorithms out of the main Rand crate?

@dhardy dhardy added the P-high Priority: high label Jul 7, 2018
@dhardy
Copy link
Member

dhardy commented Jul 7, 2018

Well, nothing has happened in the last month, yet I feel this issue is quite important.

I suggest we move forward as follows:

  1. We replace SmallRng's implementation with something else (do not discuss which algorithm here!), keeping the implementation hidden. This gives us a better SmallRng without dependence on any other crate or public PRNG in Rand.
  2. We deprecate XorshiftRng. If we can, we point them to another crate with a compatible implementation.

We should move forward with some small(er) crates implementing families of PRNGs. On a case-by-case basis we decide whether to host these externally or within the Rand repository (which may or may not be considered to imply a higher level of review). (Alternatively we could start a "Rust Rand" organisation on GitHub; not sure that it's worth it though.)

The following PRNG crates already exist:

Related, external RNGs:

Quality and maintenance:

  • Of the above, only @vks's crate uses rand_core 0.2; all others use rand 0.3.x or 0.4.x if they depend on Rand at all. I don't know if this says we should be doing more to help other authors or that most of the above are basically abandoned; either way we can't really promote external RNG crates which don't support the latest Rand.
  • Code in the Rand repo may get better quality review than external crates. For non-crypto PRNGs this is not such a big deal so long as the implementations test against published test vectors.

Therefore I think we should slightly prefer internal implementations of PRNGs in our documentation and be open to pull-requests to add PRNGs here, but not require it.

Promoting RNGs: I'm not convinced at this time it's worth having a "curation crate", though we could do so. A better option might be to add a document to Rand reviewing various RNG crates (there are a few other things which could go in a Rand "book").

Organisation:

Edit: we should probably just use the crate names without subdirectory.

Suggestion: we add an rngs directory, then any sub-crates in this repo can go under here (e.g. REPO_ROOT/rngs/xoshiro)

"Crypto" RNGs

We currently have only two implementations, Hc128Rng and ChaChaRng. For now I suggest we leave them in Rand's main crate, though we could move to rngs/csprng (rand_csprng crate) later.

Since the ISAAC generators are no longer used by ThreadRng, lets move them out to rand_isaac crate/subdir).

Naming

This would be a good time to rename PRNGs, if appropriate.

Currently all internal PRNGs have a Rng suffix, e.g. XorShiftRng. Some other implementations follow this convention; some don't. To me it seems redundant to use within a crate of RNGs. Therefore we could rename e.g. IsaacRng to Isaac32 or just Isaac.

Crate names: I suggest that all crates created as part of the Rand project use the prefix rand_ (following the convention of rand_core). I'm not sure whether we should make a rename when adopting an external crate into the Rand project; possibly we should do so but also with a compatibility release under the old name.


Does this sound like a good plan?

@vks
Copy link
Collaborator

vks commented Jul 7, 2018

I agree with your plan, it is almost exactly what I would suggest as well.

A better option might be to add a document to Rand reviewing various RNG crates (there are a few other things which could go in a Rand "book").

In addition, I would recommend to use specific RNGs and their crates for reproducibility. At the same time, we could explicitly weaken the reproducibility guarantees for SmallRng and StdRng, so we can replace the algorithms in a feature release.

About the pcg_rand crate: I recently tried to port it to rand_core 0.2, which was surprisingly tricky due the heavy use of generics (similar to the reference C++ implementation). I ran into a compiler bug and gave up, but it might be possible to redesign the library to work around the bug (see robojeb/pcg_rand#8).

@dhardy
Copy link
Member

dhardy commented Jul 8, 2018

The reproducibility limitations already apply to SmallRng and StdRng (maybe doc could be improved).

I also thought the use of generics in that crate is a bit over the top (the same criticisms apply to @imneme's libraries — most users want one relatively simple PRNG; not metacode for building hundreds of variants of generators). So I'd rather start from @pitdicker's code.

@dhardy
Copy link
Member

dhardy commented Jul 10, 2018

I added a tracker at the top of this issue.

@vks do you want to keep your xoshiro crate external or move it into this repo? I really don't care either way; externally you get more freedom but also the expectation of maintenance 😉

Also, what should we do with Xorshift? It's not especially good, but we should probably keep it around somewhere in case anyone wants to reproduce results with it. @astocko has an xorshift crate but it needs updating, and doesn't have the same generator.


On another point: do we want to continue with only one repo or have multiple? Already we have some confusion of branch & tag names with 0.5 referring to the rand version but also hosting rand_core 0.2.x.

Separate repos would give us more git and CI overhead but smaller history and less CI work to do on each change.

@vks
Copy link
Collaborator

vks commented Jul 11, 2018

@dhardy In principle, I'm fine with either way. It depends on what we decide about the fate of PRNGs in Rand: If Rand somehow depends on a PRNG, I think the crate should be in the Rand repository. If Rand just recommends the crate, either way is probably fine. Or should we be worried that a crate we recommend gets compromised? After all, Rand is the third-most popular crate. However, I anticipate the recommended crates will be used by few users, making them a smaller target.

If we decide to give the recommended crates a rand_ prefix, they should probably live in the Rand repository. This would probably the the easiest way to move xorshift and isaac out of Rand, for which crates already exist. Similarily, we could add rand_pcg, rand_xoshiro etc. (Unfortunately there already is a pcg_rand crate, which might be confusing.)

About splitting repos: I'm not sure. It worked well for num, but there the crates are more orthogonal. I would expect there are few issues that only apply to rand_core.

@dhardy
Copy link
Member

dhardy commented Jul 11, 2018

For non-crypto PRNGs we only need one in Rand (SmallRng) and there isn't a lot of code involved, so we could just copy the code. For CSPRNGs, yes, we should only depend on Rand crates (or perhaps a RustCrypto crate).

I think we should use the rand_ prefix for Rand sub-projects but not for external crates, and if we adopt an external crate into Rand we can rename the crate then (it's not hard for users to switch this, and avoids stepping on the toes of inactive authors).

True, at the moment most issues are strongly linked to the main Rand project even if about some other part such as a rand_core trait or another distribution implementation. Maybe we should reconsider later, but for now continue with one repo.

@droundy
Copy link

droundy commented Jul 13, 2018

Would a pull request for a Xorshift1024 crate be accepted at this stage? I'm wanting to have xorshift1024* for a monte carlo simulation I'm writing, and it looks like moving forward having it hosted in this repository would be ideal (and match with the plans outlined here). I'd open a separate issue, but it seems slightly subsumed by this one.

@TheIronBorn
Copy link
Collaborator

TheIronBorn commented Jul 13, 2018

You mean this xorshift1024*φ algorithm?

Given that there are a couple generations of Vigna's generators which more or less obsolete the algorithm, I'd say it's unlikely it would be included.
We are considering adding this xorshift crate which has xorshift1024* generators.

xoshiro512 or xoroshiro1024 would be more likely. The xoshiro crate has xoshiro512 and xoroshiro1024 generators could be added. (xoroshiro1024 is described only in the Scrambled Linear Pseudorandom Number Generators paper)

It's also possible that other generators with similarly large periods might be added.

@TheIronBorn
Copy link
Collaborator

TheIronBorn commented Jul 13, 2018

Note: xorshift4096* also exists. Probably impractical but it's curious. (not sure why it's no longer listed)

@dhardy
Copy link
Member

dhardy commented Jul 14, 2018

Well, I'll try again: @astocko are you interested in maintaining your Xorshift crate?

But given the lack of commits in the last year it seems unlikely. The code is CC0 so it may be sensible at this point to pull out what we want and make a new rand_xorshift crate.

@vks what do you think should be the divide between Xorshift, Xoroshiro and Xoshiro generators, since you already have the latter two in your crate?

@iddm
Copy link

iddm commented Jul 14, 2018

I see you mentioned me here as an author of randomorg crate. I am the one and I just wanted to remind you that this crate unfortunately is not going to have Rng traits implementations due to its nature of remote data source.

@vks
Copy link
Collaborator

vks commented Jul 14, 2018 via email

@dhardy
Copy link
Member

dhardy commented Jul 14, 2018

We could just throw it all under rand_vigna, but a separate crate for Xorshift is also fine. His recommendations have changed over time, so putting this years' recommendations under vigna2018 doesn't seem sensible for what we hope will become a stable crate.

@vks
Copy link
Collaborator

vks commented Jul 14, 2018

Well, stability is why I decided to create a new crate, so the old one can still be used for reproducibility. One of the advantages of moving the algorithms out of Rand it that we can just add and drop crates from the recommendations without breaking anyone's code.

@afnanenayet
Copy link

Hey, author of the Pcg32 crate here. The other PcgRand crate seems much more full-fledged than mine, but I'd be happy to continue maintaining Pcg32.

I personally would advocate for some sort of vetting/benchmarking with the rand crates. Even something like dieharder scores would be nice to see.

@TheIronBorn
Copy link
Collaborator

We currently have quality comparisons here: https://docs.rs/rand/0.5.4/rand/prng/index.html#basic-pseudo-random-number-generators-prngs

I don't know if we've thought about listing actual test data though (probably TestU01 or PractRand if anything)

@dhardy
Copy link
Member

dhardy commented Jul 31, 2018

@droundy if you're still interested, we'd accept Xorshift1024 in the new rand_xorshift sub-crate (see #557).

@dhardy
Copy link
Member

dhardy commented Jul 31, 2018

@pitdicker I think it's time to turn your small-rngs repo into a real crate 😄

As you say, it is more useful having a small crate with a few high-quality PRNGs than to have a huge collection, so how about we create rand_small_rngs with the following guide-lines:

  • non-crypto only
  • keep it small (5-20 RNGs)
  • try to avoid overlap with other high-quality crates — which probably means Xorshift and Xoshiro should be omitted from this crate, though (for now at least) some PCG variants included

Potentially we can add a separate rand_rngs sub-project for comparison purposes (benchmarks, quality testing). These should be separate: the former is a small collection of high-quality RNGs not hosted elsewhere (and should remain small); the latter is a comparison across many RNGs (and could host the generators.rs benchmarks later).

@droundy
Copy link

droundy commented Jul 31, 2018

I've just gone ahead and implements a xoroshiro256+ in my own code, which ended up seeming simplest, after I decided that compatibility with my old C code didn't matter. Maybe I'll use a "standard" xoroshiro implementation at some later point, but for now it seemed easiest to just translate the C code myself. That way I didn't have to mess with serde feature flags, and I could easily get proper seeding with a u64 as recommended by Vigna.

@peteroupc
Copy link

peteroupc commented Aug 22, 2018

If it helps you decide which PRNGs to include where, I have written an article on random number generators and categorized "good" RNGs into two categories. Essentially:

  • A cryptographic PRNG is designed such that prior unseen outputs are hard to guess, and has a state length of at least 128 bits.
  • A statistical PRNG "must either satisfy the one-way property or be significantly more likely than not to pass all tests (other than MatrixRank and LinearComp) of BigCrush", and has a state length of at least 64 bits.

Any RNG other than a cryptographic or statistical RNG, as defined in my article, should be considered low-quality, I think.

@afnanenayet
Copy link

I updated my pcg crate to implement the RngCore/Rng traits

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-API Breakage: API P-high Priority: high T-RNG Topic: generators
Projects
None yet
Development

No branches or pull requests

10 participants