Arb.set with a range hangs the test if the given gen inside the set cannot produce enough values for the range #1931

ataronet · 2020-12-21T22:57:39Z

Found in Kotest version 4.3.2

Example

  "test set of enum with less than 3 elements hangs" {
                checkAll(Arb.set(Arb.enum<Enum>(), 1..3)) {
                    it shouldContain Enum.SingleValue
                }
            }

enum class Enum {
        SingleValue // I also tried this with 2 elements in the enum and it hangs. 3 elements would not hang.
    }

The text was updated successfully, but these errors were encountered:

sksamuel · 2020-12-22T08:51:06Z

The documentation for Arb.set describes this behavior:

/**

Returns an [Arb] whose of values are a set of values generated by the given element generator.
The size of each set is determined randomly within the specified [range].
Note: This may fail to terminate if the element generator cannot produce a large enough number of
unique items to satify the required set size
*/

Unfortunately I don't know how easy it is to make this a runtime check.

ataronet · 2020-12-22T09:42:31Z

So it's a known issue. I'll let you know if I can think of a solution in run time. I'd prefer to get an exception on runtime and not hang. But it might be impossible.

sksamuel · 2020-12-22T09:53:59Z

The best I can think of is that we add uniqueValues: Int to each generator (or max size)
Then things like Arb.set can ask the consistent generators what their unique/max size is and throw if it's not big enough.
The drawback is it wouldn't work with things like filter, as the compiler would not known how many values would be filtered.
@myuwono any thoughts ?

myuwono · 2020-12-24T07:10:28Z

hmm, this is almost (although not quite) similar to the Arb.distinct() problem that we had quite recently. Under the hood Arb.set tries to conjure a set of some target cardinality specified by repeatedly taking the next sample from an arb. This is Arb.set:

/**
 * Returns an [Arb] whose of values are a set of values generated by the given element generator.
 * The size of each set is determined randomly within the specified [range].
 *
 * Note: This may fail to terminate if the element generator cannot produce a large enough number of
 * unique items to satify the required set size
 */
fun <A> Arb.Companion.set(gen: Gen<A>, range: IntRange = 0..100): Arb<Set<A>> {
   check(!range.isEmpty())
   check(range.first >= 0)
   return arbitrary {
      val genIter = gen.generate(it).iterator()
      val targetSize = it.random.nextInt(range)
      val set = mutableSetOf<A>()
      while (set.size < targetSize && genIter.hasNext()) {
         set.add(genIter.next().value)
      }
      check(set.size == targetSize)
      set
   }
}

as seen above, that while loop will keep on sampling until the targetSize is reached. Unfortunately sometimes that requirement is simply impossible due to the insufficient cardinality of the underlying arb. Consider the sample code below:

data class Foo(val value: String)

val arbFoo: Arb<Foo> = Arb.of("foo", "bar").map { Foo(it) } 

val setOfThreeOrFour: Set<Foo> = Arb.set(arbFoo, 3..4).single() // this will never resolve

enum class Pet {
    Dog, Cat;
}

val arbDog: Arb<Pet> = Arb.enum<Pet>().filter { it == Pet.Dog }

val twoSetsOfPets: Set<Pet> = Arb.set(arbDog, 2..3).single() // this will never resolve either

Arb<T> is a population of values of type T, type information doesn't provide us any reliable info. There isn't a deterministic way to pre-determine the number of distinct output of any Arbs but to sample them. This is what set is already doing - which i believe is proper. Having said that, there's multiple alternatives to solve that.

Thinking out loud:

@sksamuel's proposition of adding uniqueValues: Set<T> to each generator is interesting. (Note: The int value can be determined when we have more information gathered in this set as the arb continuously being sampled.) This works and it will even have the potential to resurrect distinct but that also means we have to track the samples / pre-sample everything and/or keep a shared mutable state of those samples which can eat up the memory really quickly and potentially slow down generation possibly exponentially (my gut feel).
Another straightforward alternative is to give the while loop in Arb.set some stop condition to give up, e.g.

if (iterations > 1000 && targetSizeIsNotReached) {
   throw IllegalStateException("the target size requirement of $targetSize could not be satisfied after 1000 consecutive samples")
}

@sksamuel what do you reckon?

sksamuel · 2020-12-24T17:30:02Z

I think the 2nd solution will stop the tests hanging which is probably good enough. Generators are used in test code, so we don't need extra special error handling - throwing is sufficient imo.

…

On Thu, 24 Dec 2020 at 01:10, Mitchell Yuwono ***@***.***> wrote: hmm, this is almost (although not quite) similar to the Arb.distinct() problem that we had quite recently. Under the hood Arb.set tries to conjure a set until it reaches the target cardinality specified by repeatedly taking the next sample from an arb. This is Arb.set: /** * Returns an [Arb] whose of values are a set of values generated by the given element generator. * The size of each set is determined randomly within the specified [range]. * * Note: This may fail to terminate if the element generator cannot produce a large enough number of * unique items to satify the required set size */fun <A> Arb.Companion.set(gen: Gen<A>, range: IntRange = 0..100): Arb<Set<A>> { check(!range.isEmpty()) check(range.first >= 0) return arbitrary { val genIter = gen.generate(it).iterator() val targetSize = it.random.nextInt(range) val set = mutableSetOf<A>() while (set.size < targetSize && genIter.hasNext()) { set.add(genIter.next().value) } check(set.size == targetSize) set } } as seen above, that while loop will keep on sampling until the targetSize is reached. Unfortunately sometimes that requirement is simply impossible due to the insufficient cardinality of the underlying arb. Consider the sample code below: data class Foo(val value: String) val arbFoo: Arb<Foo> = Arb.of("foo", "bar").map { Foo(it) } val setOfThreeOrFour: Set<Foo> = Arb.set(arbFoo, 3..4).single() // this will never resolve enum class Pet { Dog, Cat; } val arbDog: Arb<Pet> = Arb.enum<Pet>().filter { it == Pet.Dog } val twoSetsOfPets: Set<Pet> = Arb.set(arbDog, 2..3).single() // this will never resolve either Arb<T> is a population of values of type T, type information doesn't provide us any reliable info. There isn't a deterministic way to pre-determine the number of distinct output of any Arbs but to sample them. This is what set is already doing - which i believe is proper. Having said that, there's multiple alternatives to solve that. Thinking out loud: - @sksamuel <https://github.com/sksamuel>'s proposition of adding uniqueValues: Set<T> to each generator is interesting. (Note: The int value can be determined when we have more information gathered in this set as the arb continuously being sampled.) This works and it will even have the potential to resurrect distinct but that also means we have to track the samples / pre-sample everything and/or keep a shared mutable state of those samples which can eat up the memory really quickly and potentially slow down generation possibly exponentially (my gut feel). - Another straightforward alternative is to give the while loop in Arb.set some stop condition to give up, e.g. if (iterations > 1000 && targetSizeIsNotReached) { throw IllegalStateException("the target size requirement of $targetSize could not be satisfied after 1000 consecutive samples") } @sksamuel <https://github.com/sksamuel> what do you reckon? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1931 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAFVSGQFW73RCCSZBGLLJQTSWLSPBANCNFSM4VE2YTEQ> .

ataronet · 2020-12-24T20:19:13Z

+1 for the second opinion from me too. This is a kind of wrongly using the Arbs so I'd understand much quicker what's going on with a exception.
Similarly there is an exception when you have too many edge cases with too few iterations for your test. To me it is the same thing - just the other way around, too few options for the test to run.
Thank you both.

sksamuel · 2021-01-04T03:16:42Z

@myuwono Do you want to add a PR for this? If not I will try to find time before I release 4.4 in the next two days.

myuwono · 2021-01-04T12:03:49Z

@sksamuel sorry was in holiday mood! 🌴

sksamuel · 2021-01-04T12:27:24Z

:)

sksamuel added bug 🐛 Issues that report a problem or error in the code. property-testing 📏 Related to the property testing mechanisms within the testing framework. labels Dec 25, 2020

myuwono mentioned this issue Jan 4, 2021

fix Arb.set hanging when underlying arb cardinality was insufficient #1964

Merged

sksamuel closed this as completed in #1964 Jan 4, 2021

sksamuel added this to the 4.4 milestone Jan 4, 2021

sksamuel mentioned this issue Jan 4, 2021

Roadmap 4.4 #1755

Closed

43 tasks

shichimitogarashi mentioned this issue Feb 6, 2021

Generation of larger sets via Arb.set now throws an exception #2051

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Arb.set with a range hangs the test if the given gen inside the set cannot produce enough values for the range #1931

Arb.set with a range hangs the test if the given gen inside the set cannot produce enough values for the range #1931

ataronet commented Dec 21, 2020

sksamuel commented Dec 22, 2020

ataronet commented Dec 22, 2020

sksamuel commented Dec 22, 2020

myuwono commented Dec 24, 2020 •

edited

sksamuel commented Dec 24, 2020 via email

ataronet commented Dec 24, 2020

sksamuel commented Jan 4, 2021

myuwono commented Jan 4, 2021

sksamuel commented Jan 4, 2021

Arb.set with a range hangs the test if the given gen inside the set cannot produce enough values for the range #1931

Arb.set with a range hangs the test if the given gen inside the set cannot produce enough values for the range #1931

Comments

ataronet commented Dec 21, 2020

sksamuel commented Dec 22, 2020

ataronet commented Dec 22, 2020

sksamuel commented Dec 22, 2020

myuwono commented Dec 24, 2020 • edited

sksamuel commented Dec 24, 2020 via email

ataronet commented Dec 24, 2020

sksamuel commented Jan 4, 2021

myuwono commented Jan 4, 2021

sksamuel commented Jan 4, 2021

myuwono commented Dec 24, 2020 •

edited