Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem space size vs seed size #363

Closed
vlsi opened this issue Jun 19, 2022 · 7 comments
Closed

Problem space size vs seed size #363

vlsi opened this issue Jun 19, 2022 · 7 comments

Comments

@vlsi
Copy link
Contributor

vlsi commented Jun 19, 2022

Testing Problem

Problem space might exceed 64bits quite fast, so some of the cases are inherently impossible to represent and catch with jqwik.

For instance, if I want to verify a + b == b + a where a and b are long variables, then the problem space is 128bits, and 64bit seed is not enough. It will miss "a lot" of cases, and it will never be able to generate them no matter how much time we allocate for the testing.
Of course, "edge case" injection solves that to a degree.

Discussion

Do you know if using a bigger random state is known to affect property-test quality?

/cc @inponomarev

@jlink
Copy link
Collaborator

jlink commented Jun 19, 2022

As far as I know, more than 64 bits means that RNG algorithms get measurably slower since some ops can no longer be done with a single machine op. If that will result in a measurably slower generation of values probably depends on the number of required generation steps. In the standard cases I'd doubt that it has an effect.

As for test quality, I'd assume that it won't effect it since value distribution should more or less be the same for reasonable pseudo-random generation algorithms.

@vlsi
Copy link
Contributor Author

vlsi commented Jun 19, 2022

See https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/random/package-summary.html#algorithms, https://docs.oracle.com/en/java/javase/17/core/pseudorandom-number-generators.html#GUID-6946463F-C64B-4361-AB53-0D52FB279DE0

Large Permutations
For applications that generate large permutations, consider a generator whose period is much larger than the total number of possible permutations; otherwise, it will be impossible to generate some of the intended permutations. For example, if the goal is to shuffle a deck of 52 cards, the number of possible permutations is 52! (52 factorial), which is approximately 2^225.58, so it may be best to use a generator whose period is roughly 2^256 or larger, such as L64X256MixRandom, L64X1024MixRandom, L128X256MixRandom, or L128X1024MixRandom.

The current net.jqwik.engine.SourceOfRandomness#newRandom(long) is able to generate only 2.287... × 10^-49 == 2^64/52! of all card shuffles.

In other words, the current RNG is able to produce "only" 18446744073709551616 == 2^64 permutations (the permutation is pre-determined by the input seed), while the deck of 52 cards can have 80658175170943878571660636856403766975289505440883277824000000000000 = 52! permutations.

Judging by https://issues.apache.org/jira/browse/RNG-168?focusedCommentId=17520192&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17520192, the difference in performance is not really dramatic.

I'm inclined that something like L64X1024MixRandom or L128X1024MixRandom might be valuable for jqwik: it should not be that hard to maintain, it should have pretty much the same performance, and it would allow users to test more permutations than the current random generator.

PS By the way, commons-rng looks like a really nice, well-thought, and well-tested dependency. It is the second time I bump into @aherbert's analysis, and I do enjoy reading it.

@jlink
Copy link
Collaborator

jlink commented Jun 27, 2022

At least commons-rng has no further runtime dependencies :-)

What I think would be the best way to go about that is to:

  1. Extract random generation into its own abstraction
  2. Allow users to plug-in their own implementation
  3. Provide a few implementations to choose from

For me that looks like another component of jqwik 2.0.

@vlsi Do you have a concrete problem domain where 64 bit of entropy do not suffice? It could be a driver…

@lmartelli
Copy link

This does not look like a seed size issue to me. If your random source is N bits, you can still generate random values of any M>N bits by concatenating M/N+1 N bits values. Having a random source with a greater greater range just improves performances when you need that range or more.

@jlink
Copy link
Collaborator

jlink commented Nov 25, 2022

This does not look like a seed size issue to me. If your random source is N bits, you can still generate random values of any M>N bits by concatenating M/N+1 N bits values. Having a random source with a greater greater range just improves performances when you need that range or more.

You can generate any number of bits > N, but the sequence of bits is determined by the initial seed. If this seed has only N bits of entropy, you cannot generate more different sequences than 2^N.

So I think @vlsi is right in stating that with a seed of N bits you'll never cover the full range of N^2 possibilities - given you use the same deterministic pseudo-random generator each time.

@vlsi
Copy link
Contributor Author

vlsi commented Dec 10, 2022

I played with commons-rng a bit, and it turns out their interfaces are really involved, and exposing commons-rng generators in jqwik api seems bad.

It looks like the approach for jqwik should be adding interfaces like JqwikRandom, JqwikRandomState.

Then nextSeed=random.nextLong() could be replaced with next=random.split()

vlsi pushed a commit to vlsi/jqwik that referenced this issue Dec 10, 2022
vlsi pushed a commit to vlsi/jqwik that referenced this issue Dec 11, 2022
vlsi pushed a commit to vlsi/jqwik that referenced this issue Dec 11, 2022
vlsi pushed a commit to vlsi/jqwik that referenced this issue Dec 11, 2022
vlsi pushed a commit to vlsi/jqwik that referenced this issue Dec 27, 2022
vlsi pushed a commit to vlsi/jqwik that referenced this issue Dec 27, 2022
@jlink
Copy link
Collaborator

jlink commented May 14, 2024

This will be supported by jqwik 2. So I close the feature request for jqwik 1.

@jlink jlink closed this as completed May 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants