Property Test Discussion #1121

Closed
sksamuel opened this issue Dec 13, 2019 · 49 comments
Labels
enhancement ✨ Suggestions for adding new features or improving existing ones. property-testing 📏 Related to the property testing mechanisms within the testing framework.
@sksamuel
Member

sksamuel commented Dec 13, 2019

I am working on overhauling our property testing as part of the upcoming 4.0 release. To this end, I have gathered the requirements (based on existing tickets in this tracker) and come up with a basic design. I would like feedback on this design before I fully implement it. There are also some questions that I don't yet have an answer to and would like to discuss. At this stage everything is open to change.

Property Test Requirements

Deterministic Re-runs

If a test fails, it is useful to be able to re-run it with the same values, especially in cases where shrinking is not available. Therefore, the test functions accept a seed value which is used to create the Random instance used by the tests. This seed can then be programmatically set to re-run the tests with the same random instance.

By default the seed is null, which means the seed changes each time the test is run.
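For example, a failing run's seed can be plugged back in to reproduce the exact failure. A minimal sketch, assuming a seed argument on forAll (the parameter shape here is illustrative, not final API):

// re-run the property deterministically with the seed reported by a failing run
forAll(Gen.int(), seed = 112549833451L) { a ->
   a + 0 == a
}
// passing seed = null (the default) creates a fresh Random each run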

Exhaustive

The generators are passed an Exhaustivity enum value which determines how the test values are generated.

  • Random - all values should be randomly generated
  • Exhaustive - every value should be generated at least once. If, for the given iteration count, all values cannot be generated, then the test should fail immediately.
  • Auto - Exhaustive if possible and supported, otherwise random.

By default Auto mode is used.

Question - do we want to be able to specify exhaustivity per parameter?

In #1101 @EarthCitizen talks about Series vs Gen. I do like the nomenclature, but we would need another abstraction (Arbitrary?) on top which would then be able to provide a Gen or Series as required based on the exhaustive flag.

Question - do we want to implement this way as opposed to the way I have outlined in the code below

Min and Max Successes

These values determine bounds on how many tests should pass. Typically min and max success would be equal to the iteration count, which gives the forAll behavior.
For forNone behavior, min and max would both be zero. Other values can be used to mimic behavior like forSome, forExactly(n) and so on.

By default, min and max success are set to the iteration count.
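As a sketch, the familiar styles fall out of these two knobs (the iterations/minSuccess/maxSuccess parameter names are illustrative):

// forAll:       minSuccess = maxSuccess = iterations
// forNone:      minSuccess = maxSuccess = 0
// forSome-like: anything in between
forAll(
   Gen.int(),
   iterations = 1000,
   minSuccess = 1000,
   maxSuccess = 1000
) { a -> a * 2 == a + a }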

Distribution

It is quite common to want to generate values across a large number space, but with a bias towards certain values. For example, when testing a function that handles emails, it might be more useful to generate shorter strings more often than longer ones; most emails are probably under 50 characters, for example.

The distribution mode can be used to bias values by setting the bound from which each test value is generated.

  • Uniform - values are distributed evenly across the space. For an integer generator of values from 1 to 1000 with 10 runs, a random value would be generated from 1..100, another from 101..200, and so on.
  • Pareto - values are biased towards the lower end on a roughly 80/20 rule.

By default the uniform distribution is used.

The distribution mode may be ignored by a generator if it has no meaning for the types produced by that generator.

The distribution mode has no effect if the generator is acting in exhaustive mode.
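As an illustration, a biased run over a large integer space might look like this, using the Gen and Distribution types sketched below (the exact call shape is an assumption):

// Pareto mode: most generated ints cluster near the lower bound, mimicking
// the "most emails are short" bias described above
Gen.int(1, 1_000_000).generate(
   random = Random(0),
   iterations = 100,
   exhaustivity = Exhaustivity.Random,
   distribution = Distribution.Pareto
).forEach { (value, _) -> println(value) }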

Question - it would be nice to be able to specify specific "biases" when using specific generators. For example, a generator of A-Z chars may choose to bias towards vowels. How to specify this when distribution is a sealed type? Use an interface and allow generators to create their own implementations?

Shrinking Mode

The ShrinkingMode determines how failing values are shrunk.

  • Off - Shrinking is disabled for this generator
  • Unbounded - shrinking will continue until the minimum case is reached as determined by the generator
  • Bounded(n) - the number of shrink steps is capped at n. After this, the shrinking process will halt, even if the minimum case has not been reached. This mode is useful to avoid long running tests.

By default shrinking is set to Bounded(1000).
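A sketch of how the mode might be supplied to a test (the parameter placement is illustrative; later comments in this thread move it around):

// cap shrinking at 100 steps to keep a slow-shrinking failure bounded
forAll(Gen.string(), shrinking = ShrinkingMode.Bounded(100)) { s ->
   (s + s).length == 2 * s.length
}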

Question 1 - do we want to be able to control shrinking per parameter? Turn it off for some parameters and not others?

When mapping over a generator, shrinking becomes tricky.
If you have a mapper from Gen<T> to Gen<U> and a value u fails, you need to turn that u back into a t so you can feed the t into the original shrinker. So you can either keep the association between the original value and the mapped value, or precompute (lazily?) the shrinks along with the value.

Question 2 - which is the best approach?
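As a sketch of the first option, keeping the source value around so the original shrinker still applies (assuming a Shrinker<T> exposing shrinks(t: T): List<T>, which is not defined in this proposal):

// option 1: retain the pre-map value alongside the mapped value
data class Sample<T, U>(val source: T, val value: U)

// when a mapped value fails, shrink the source and re-map each candidate
fun <T, U> shrinkMapped(failed: Sample<T, U>, shrinker: Shrinker<T>, f: (T) -> U): Sequence<U> =
   shrinker.shrinks(failed.source).asSequence().map(f)

The map sketch further down takes the second option instead: shrinks are precomputed lazily and carried along with each value.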

Gen design

The gens accept the Random instance used for this run.
They accept an iterations parameter so they know the sample space when calculating values based on a distribution.
They accept the exhaustivity mode and the distribution mode.
Question - move the iteration count into the distribution parameter itself?

Note that the gens no longer specify a shrinker, but should provide the shrinks along with the value (see shrinker section for discussion).

/**
 * A Generator, or [Gen], is responsible for generating data to be used in property testing.
 * Each generator will generate data for a specific type <T>.
 *
 * The idea behind property testing is the testing framework will automatically test a range
 * of different values, including edge cases and random values.
 *
 * There are two types of values to consider.
 *
 * The first are values that should usually be included on every test run: the edge case values
 * which are common sources of bugs. For example, a function using [Int]s is more likely to fail
 * for common edge cases like zero, minus 1, positive 1, [Int.MAX_VALUE] and [Int.MIN_VALUE]
 * rather than random values like 159878234.
 *
 * The second set of values are random values, which are used to give us a greater breadth to the
 * test cases. In the case of a function using [Int]s, these random values could be from across
 * the entire integer number line.
 */
interface Gen<T> {

   /**
    * Returns the values that are considered common edge cases for this type.
    *
    * For example, for [String] this may include the empty string, a string with white space,
    * a string with unicode, and a string with non-printable characters.
    *
    * The result can be empty if for type T there are no common edge cases.
    *
    * @return the common edge cases for type T.
    */
   fun edgecases(): Iterable<T>

   /**
    * Returns a sequence of values to be used for testing. Each value should be provided together
    * with a [Shrinker] to be used if the given value fails the test.
    *
    * @param random the [Random] instance to be used for random values. This random instance is
    * seeded using the seed provided to the test framework so that tests can be deterministically rerun.
    *
    * @param iterations the number of values that will be required for a successful test run.
    * This parameter is provided so generators know the sample space that will be required and can thus
    * distribute values accordingly.
    *
    * @param exhaustivity specifies the [Exhaustivity] mode for this generator.
    *
    * @param distribution specifies the [Distribution] to use when generating values.
    *
    * @return the test values as a lazy sequence.
    */
   fun generate(
      random: Random,
      iterations: Int,
      exhaustivity: Exhaustivity = Exhaustivity.Auto,
      distribution: Distribution
   ): Sequence<Pair<T, Shrinker<T>>>

   companion object
}

fun Gen.Companion.int(lower: Int, upper: Int) = object : Gen<Int> {
   private val literals = listOf(Int.MIN_VALUE, Int.MAX_VALUE, 0)
   override fun edgecases(): Iterable<Int> = literals
   override fun generate(
      random: Random,
      iterations: Int,
      exhaustivity: Exhaustivity,
      distribution: Distribution
   ): Sequence<Pair<Int, Shrinker<Int>>> {

      val randomized = infiniteSequence { k ->
         val range = distribution.get(k, iterations, lower.toLong()..upper.toLong())
         random.nextLong(range).toInt()
      }

      val exhaustive = generateInfiniteSequence {
         // exhaustive mode needs enough iterations to emit every value in the range at least once
         require(iterations >= upper - lower + 1)
         (lower..upper).iterator().asSequence()
      }.flatten()

      val seq = when (exhaustivity) {
         Exhaustivity.Auto -> when {
            iterations >= upper - lower + 1 -> exhaustive
            else -> randomized
         }
         Exhaustivity.Random -> randomized
         Exhaustivity.Exhaustive -> exhaustive
      }
      return seq.map { Pair(it, IntShrinker) }
   }
}

// Note: this sketch illustrates the "precompute the shrinks" option from Question 2 above -
// the mapped gen carries shrunk values directly, which is why its generate signature differs
// from the Gen interface as written (a Sequence<U> of shrinks instead of a Shrinker<U>).
fun <T, U> Gen<T>.map(f: (T) -> U): Gen<U> {
   val outer = this
   return object : Gen<U> {
      override fun edgecases(): Iterable<U> = outer.edgecases().map(f)
      override fun generate(
         random: Random,
         iterations: Int,
         exhaustivity: Exhaustivity,
         distribution: Distribution
      ): Sequence<Pair<U, Sequence<U>>> =
         outer.generate(random, iterations, exhaustivity, distribution)
            .map { (value, shrinks) ->
               Pair(f(value), shrinks.map { f(it) }.asSequence())
            }
   }
}

sealed class Distribution {

   abstract fun get(k: Int, iterations: Int, range: LongRange): LongRange

   /**
    * Splits the range into discrete "blocks" to ensure that random values are distributed
    * across the entire range in a uniform manner.
    */
   object Uniform : Distribution() {
      override fun get(k: Int, iterations: Int, range: LongRange): LongRange {
         val step = (range.last - range.first) / iterations
         return (range.first + step * k)..(range.first + step * (k + 1))
      }
   }

   /**
    * Values are distributed according to the Pareto distribution.
    * See https://en.wikipedia.org/wiki/Pareto_distribution
    * Sometimes referred to as the 80-20 rule
    *
    * tl;dr - more values are produced at the lower bound than the upper bound.
    */
   object Pareto : Distribution() {
      override fun get(k: Int, iterations: Int, range: LongRange): LongRange {
         // this isn't really the pareto distribution so either implement it properly, or rename this implementation
         val step = (range.last - range.first) / iterations
         return range.first..(range.first + step * k + 1)
      }
   }
}

sealed class Exhaustivity {

   /**
    * Uses [Exhaustive] where possible, otherwise defaults to [Random].
    */
   object Auto : Exhaustivity()

   /**
    * Forces random generation of values.
    */
   object Random : Exhaustivity()

   /**
    * Forces exhaustive mode.
    */
   object Exhaustive : Exhaustivity()
}

sealed class ShrinkingMode {

   /**
    * Shrinking disabled
    */
   object Off : ShrinkingMode()

   /**
    * Shrinks until no smaller value can be found. May result in an infinite loop if shrinkers are not coded properly.
    */
   object Unbounded : ShrinkingMode()

   /**
    * Shrink a maximum number of times
    */
   data class Bounded(val bound: Int) : ShrinkingMode()
}
@sksamuel sksamuel added discussion enhancement ✨ Suggestions for adding new features or improving existing ones. property-testing 📏 Related to the property testing mechanisms within the testing framework. labels Dec 13, 2019
@sksamuel sksamuel added this to the 4.0 milestone Dec 13, 2019
@LeoColman
Member

Deterministic Re-runs
I have a suggestion for the deterministic re-runs: Allow for passing multiple seeds. I don't really know how this would execute, but my idea is the following:

I run a property test and it fails. I know the seed of the failure, so I want to execute the values from that seed as I know they'll fail. I then make the test pass.

After the test passes, I'll always use the same seed, and that's not exactly very good, as I won't have enough randomness from then on (I'll always execute on the exact same 1000 values).

For this reason, I think the seed should generate, for instance, 1000 cases, but we should still have randomness in the test, so maybe an additional 1000 random cases should be generated as well.

And at some point I could have something like Gen.withSeeds(10, 20, 30, 40, 50, RANDOMS).assertAll { } and I would use all the seeds to generate the values randomly.

As property testing should be used mostly in unit tests, adding more test cases shouldn't be bad, and this could be optional anyway.

I'm just not sure about the syntax.

@LeoColman
Member

LeoColman commented Dec 14, 2019

Exhaustive

Auto - Exhaustive if possible and supported, otherwise random.

How would we know if it's supported? Could we split between interface ExhaustiveGen and interface RandomGen? Or maybe something in between?

I don't really like doing magic based on a parameter. For example, without looking at the code, I need to know whether Gen.int() is exhaustible or not. But how could we do that?

@sksamuel
Member Author

sksamuel commented Dec 14, 2019 via email

@xgouchet

@Kerooker regarding the deterministic re-runs, I think what should be done is by default leave the seed null (changing every time), and if a test fails, print to System.err the seed that was used for the failing test.

You can then apply that same seed to reproduce the failing case, fix your code so the tests pass, and then remove the seed again to resume using random seeds.
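Roughly this shape, as a hedged sketch of the failure path (userSeed and runProperty are hypothetical names):

// resolve the seed: use the user-supplied one, otherwise pick one at random
val seed = userSeed ?: Random.nextLong()
try {
   runProperty(Random(seed)) // hypothetical test-execution entry point
} catch (e: AssertionError) {
   System.err.println("Property failed. Re-run with seed = $seed")
   throw e
}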

@LeoColman
Member

@xgouchet
To fix the code, I should create a test that catches that regression if it ever happens again. And for that, I will probably have to create more example-based tests.

That's better than having more than one seed, probably

@LeoColman
Member

LeoColman commented Dec 14, 2019

Distribution

How to specify this when distribution is a sealed type? Use an interface and allow generators to create their own implementations?

I think so, yes.

sealed class Distribution {

    object Uniform : Distribution()
    object Pareto : Distribution()
    abstract class Custom : Distribution()
}

class MyCustomDistribution : Distribution.Custom() { ... }

@sksamuel
Member Author

Distribution

How to specify this when distribution is a sealed type? Use an interface and allow generators to create their own implementations?

I think so, yes.

sealed class Distribution {

    object Uniform : Distribution()
    object Pareto : Distribution()
    abstract class Custom : Distribution()
}

class MyCustomDistribution : Distribution.Custom() { ... }

In that case I'm not sure of the value of the "built in" ones. Could just be an interface and each gen provides zero or more implementations.

@sksamuel
Member Author

@xgouchet
To fix the code, I should create a test that catches that regression if it ever happens again. And for that, I will probably have to create more example-based tests.

That's better than having more than one seed, probably

Yes I think so. I'm not a fan of multiple seeds.

@LeoColman
Member

The generator itself would know if it's supported and return the
appropriate sequence.

How would I know it's supported? For example, I want to test all integers. How can I know if it supports exhaustiveness until I try to execute it?

@sksamuel
Member Author

sksamuel commented Dec 15, 2019 via email

@sksamuel
Member Author

sksamuel commented Dec 15, 2019 via email

@LeoColman
Member

LeoColman commented Dec 15, 2019

Take a look at

I would like a way to know this without having to look at the code. When dealing with a big code base, having to check whether a Gen is exhaustive or not might be a little troublesome. And it's even more problematic if the Gen is very complex, such as one generating far too many business-rule combinations for an example-based test to be enough.

Perhaps splitting exhaustivity and randomness into different interfaces might be better for this purpose. But I don't have a clue on how to make that syntax clear when using the gen.

@sksamuel
Member Author

sksamuel commented Dec 15, 2019 via email

@LeoColman
Member

A point I'd like to raise is Gen merge. I faced some cases like this:

data class MyData(val string: String, val int: Int, val long: Long)

object MyDataGen : Gen<MyData> {

     fun random(seed: Long?) = generateInfiniteSequence {
        MyData(
           Gen.string().random(seed).first(),
           Gen.choose(100, 999).random(seed).first(),
           Gen.choose(100L, 999L).random(seed).first()
        )
    }
}

This seems very long and not really reliable, as I'm skipping constants for example. I was trying to figure out a way to improve this kind of generation, but I couldn't come up with anything.
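For comparison, a bind-style combinator could thread a single random source through all the components instead of re-seeding each gen per field. A hedged sketch against the pre-4.0 style API, where Gen<T> exposes random(): Sequence<T> (the bind function itself is hypothetical):

// hypothetical bind: compose three gens into one infinite sequence of MyData
fun <A, B, C, T> bind(genA: Gen<A>, genB: Gen<B>, genC: Gen<C>, f: (A, B, C) -> T): Sequence<T> {
   val a = genA.random().iterator()
   val b = genB.random().iterator()
   val c = genC.random().iterator()
   return generateSequence { f(a.next(), b.next(), c.next()) }
}

val myDataGen = bind(Gen.string(), Gen.choose(100, 999), Gen.choose(100L, 999L), ::MyData)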

@LeoColman
Member

It might work but we need to see a concrete syntax suggestion.

I'll try to think about that for some time. Perhaps @EarthCitizen has some ideas in that regard.

@sksamuel
Member Author

sksamuel commented Dec 15, 2019 via email

@EarthCitizen
Contributor

EarthCitizen commented Dec 15, 2019

OK. So, here is a rough idea of how the sealed classes idea could work. This is not intended to be complete, but rather a POC on the sealed classes approach.

Goals:

  • Separate these 3 concerns into separate classes:
    • Randomly generated values
    • Scaled values (this implementation does not meet the requirement of different distributions, but that could easily be fixed)
    • Series (finite exhaustive sets of values)
  • Allow repeatable tests (as we do now)

What is not included here (but not incompatible):

  • Shrinking
    • Arbitrary - should be similar to how it is done now
    • Scaled - simply reduce the scale until a passing value is found
    • Series - N/A
  • Rules for how many times to iterate when there is a mixture of Gen types used (assertAll)
    • This could be a little tricky at first, but I believe a sane set of rules can be worked out by thinking outside of how we currently do this (based solely on iteration count)
      • Perhaps things like:
        • Iterations stop when Series runs out?
        • Or Series wraps when iteration count exceeds it?
sealed class Gen<T>

abstract class Arbitrary<T> : Gen<T>() {
    fun asIterable(random: Random = Random) = Iterable {
        object : Iterator<T> {
            override fun hasNext(): Boolean = true
            override fun next(): T = generate(random)
        }
    }

    fun asSequence(random: Random = Random) = Sequence { asIterable(random).iterator() }

    abstract fun generate(r: Random): T

    companion object {
        fun ints() = object : Arbitrary<Int>() {
            override fun generate(r: Random): Int = r.nextInt()
        }
        fun longs() = object : Arbitrary<Long>() {
            override fun generate(r: Random): Long = r.nextLong()
        }
    }
}

abstract class Scaled<T> : Gen<T>() {
    abstract fun scale(s: Int): T
    companion object {
        const val SCALE_MIN = 0
        const val SCALE_MAX = 99
        const val SCALE_STEPS = 100

        fun positiveInt() = object : Scaled<Int>() {
            override fun scale(s: Int): Int = when(s) {
                SCALE_MIN -> 0
                SCALE_MAX -> Int.MAX_VALUE
                else -> (Int.MAX_VALUE / SCALE_STEPS) * s
            }
        }
    }
}

// Constructor private in order to enforce finite iterable
class Series<T> private constructor(private val i: Iterable<T>) : Gen<T>(), Iterable<T> {
    override fun iterator(): Iterator<T> = i.iterator()
    companion object {
        fun <T> of(s: Set<T>) = Series(s)
        fun of(r: CharRange) = Series(r)
        fun of(r: IntRange) = Series(r)
    }
}
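Hypothetical usage of the three POC types above, assuming they compile as sketched:

// spot-check with Arbitrary, walk the scale, and exhaust a Series
val someInts = Arbitrary.ints().asSequence(Random(42)).take(5).toList()
val scaledInts = (Scaled.SCALE_MIN..Scaled.SCALE_MAX).map { Scaled.positiveInt().scale(it) }
val vowels = Series.of(setOf('a', 'e', 'i', 'o', 'u')).toList()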

@sksamuel
Member Author

sksamuel commented Dec 15, 2019

I do like the nomenclature of Series and Arbitrary as both subclasses of Gen (even though I think Arbitrary means something different in QuickCheck-style property test frameworks).

I tend to drive my API design from a user's perspective and not from an implementation one. So let's consider how this would work.

If we have Series and Arbitrary as subtypes of Gen, then if we want to use a Series we need to be explicit (we can't just derive the generator from the type parameter).

forAll(Series.int(100), Arbitrary.strings(20)) { ... }

In this example, the first gen is a 100-element int series - every int from 1 to 100 inclusive. You don't need to specify iterations because it's implicit in the fact that it's a series. The second gen is 20 random strings from across the string continuum.

If we say that the iteration count is the product of the gen counts, then this could work. We have 100 x 20 iterations (2000 iterations).

This means it would work if we have, say, a series for an enum.

forAll(Series.enum<Season>(), Arbitrary.strings(250)) { ... }

In this example, the four seasons as the first param x 250 random strings = 1000 iterations.

I'm not sure we need this Scaled thing at all. If you want to "scale" whatever that is, then it can be a parameter to the int gen. Gen.int(some bound).

In this design, the exhaustive parameter goes away because it's now explicit in the generator type. We've taken care of the iteration count. Seed could still be passed in, but only handed off to generators and not series.

If we still want to allow implied generators, we can add backwards compatible forAlls that derive gens from the type parameter using reflection like now.

@EarthCitizen
Contributor

Yeah, Arbitrary in QuickCheck is a typeclass. In this case I was looking for a synonym of random.

In the case of Scaled (not explained well in my previous post), the idea behind this was, for example, when you have an exponential scale (or another type of distribution curve) from 0 to INT_MAX and you don't want to check every value; shrinking back along the same increments then becomes very easy. The implementation I gave above is only linear (I believe). In a real implementation, the Scale instance would not be the one controlling the curve/distribution. Scaling is also very useful when you are generating a nested tree structure, like an AST, and you can a) limit the depth of the tree from outside of the generator and b) shrink back the depth of the tree looking for a pass.

@sksamuel
Member Author

I'm not clear on why you need Scaled as a type, when "scale" (or distribution) can be a parameter to those gens that support the concept (like number-based ones).

@EarthCitizen
Contributor

Yeah, I don't disagree. I was just explaining what I was trying to explain originally.

@sksamuel
Member Author

Ok, do you think it's worth pursuing that idea, or going down the line of using distribution with the Arbitraries?

@EarthCitizen
Contributor

EarthCitizen commented Dec 16, 2019

Just thinking out loud, defining things with my own words to ensure I am not confused:

  • Arbitrary - spot checking

    • distribution - means that over time the generated values show weighted preference toward certain places in the spectrum over others
    • scale - means that there is an upper limit placed on the magnitude/depth of the value generated; this can be optionally ignored by the Arbitrary instance
  • Series - exhaustive

    • numeric range distribution
      • linear distribution - every value in the range
      • exponential distribution (for example) - if the values are to stay unique in the series, then the number of items would have to be reduced; no change in the number of elements if they do not stay unique

Please correct me on anything that I appear to be confused on.

@sksamuel
Member Author

So my take on it is slightly different.

Series is for an exhaustive list of values. Maybe series is the wrong term.
Series.int(0, 1000) would generate all ints between 0 and 1000.

Arbitrary is for "random" values.
Arbitrary.int(1000) would generate 1000 random ints
Arbitrary.int(1000, 0..10000) would generate 1000 ints randomly distributed between 0 and 10000.
Arbitrary.int(1000, 0..10000, Distribute.Pareto) would generate 1000 ints with a distribution (more lower numbers).

I'm not sure scale is needed as you can pass scale directly into the constructors of the arbitraries that support it.

@EarthCitizen
Contributor

EarthCitizen commented Dec 16, 2019

Yeah, I think we are saying the same thing.

Scale (as a parameter) would be equivalent to the Size in Hedgehog. It is a way the gens can be globally scaled down by the testing framework, as a global parameter at the point of test execution, abstracting some of the bounding and shrinking to a higher level. It allows you to say "I want to scale down all of the tests in this particular suite", for example. You could use Arbitrary.ints() everywhere, then specify scaling dynamically or choose scaling for particular areas.

But, I concede that this could be extra functionality that is not truly needed.

@sksamuel
Member Author

I see. I think it's probably more complicated than useful. You probably know the ranges when you're specifying the generators.
I'll have another go at an implementation for a further round of feedback.

@sksamuel
Member Author

Here is another implementation proposal based on feedback received so far.

tl;dr - there are two types of Gen. Arbitrary for random values and edge cases and Progression for an exhaustive set of values (maybe progression could be renamed to Exhaustive?). Therefore, no need for an exhaustive parameter anymore. The distribution parameter is now part of the arbitrary constructors so each arbitrary is free to implement whatever distributions make sense for itself.

The arbitrary instances generate values that include a function for returning shrunk values. These shrunk values themselves then contain a function for further shrunk values and so on. This mechanism allows us to map and filter on gens and preserve the shrinking functionality.

The forAll method provides for minSuccess, maxFailure, a seed for the random instance, and a shrinking mode which can be used to disable shrinking, or bound the number of shrinks before stopping.

The iteration count for the property test is taken as the combination of the lengths of the sequences generated from the Gen instances. So for progressions, that's the number of values in the closed set. For arbitraries you must pass in how many you want. This is better I believe because if you want to test 1000 random numbers in a function that also accepts a boolean, you can now have 1000 x true and 1000 x false, rather than 1000 x (random int, random boolean).

Code follows:

/**
 * A Generator, or [Gen] is responsible for generating data to be used in property testing.
 * Each generator will generate data for a specific type <T>.
 *
 * There are two supported types of generators - [Arbitrary] and [Progression] - defined
 * as sub interfaces of this interface.
 *
 * An arbitrary is used when you need random values across a large space.
 * A progression is useful when you want all values in a smaller closed space.
 *
 * Both types of generators can be mixed and matched in property tests. For example,
 * you could test a function with 1000 random positive integers (arbitrary) and every
 * even number from 0 to 200 (progression).
 */
interface Gen<T> {
   fun generate(random: Random): Sequence<PropertyInput<T>>
}
/**
 * An [Arbitrary] is a type of [Gen] which generates two types of values: edge cases and random values.
 *
 * Edge cases are values that should usually be included on every test run: those edge cases
 * which are common sources of bugs. For example, a function using ints is more likely to fail
 * for common edge cases like zero, minus 1, positive 1, [Int.MAX_VALUE] and [Int.MIN_VALUE]
 * rather than random values like 965489.
 *
 * The random values are used to give us a greater breadth to the test cases. In the case of a
 * function using ints, these random values could be from across the entire integer number line,
 * or could be limited to a subset of the integer space.
 */
interface Arbitrary<T> : Gen<T> {

   /**
    * Returns the values that are considered common edge cases for the type.
    *
    * For example, for [String] this may include the empty string, a string with white space,
    * a string with unicode, and a string with non-printable characters.
    *
    * The result can be empty if for type T there are no common edge cases.
    *
    * @return the common edge cases for type T.
    */
   fun edgecases(): Iterable<T>

   /**
    * Returns a sequence of random [PropertyInput] values to be used for testing.
    *
    * @param random the [Random] instance to be used for generating values. This random instance is
    * seeded using the seed provided to the test framework so that tests can be deterministically re-run.
    * Implementations should honour the random provider whenever possible.
    *
    * @return the random test values as instances of [PropertyInput].
    */
   fun randoms(random: Random): Sequence<PropertyInput<T>>

   override fun generate(random: Random): Sequence<PropertyInput<T>> =
      edgecases().map { PropertyInput(it) }.asSequence() + randoms(random)

   companion object
}
fun <T, U> Arbitrary<T>.map(f: (T) -> U): Arbitrary<U> = object : Arbitrary<U> {
   override fun edgecases(): Iterable<U> = this@map.edgecases().map(f)
   override fun randoms(random: Random): Sequence<PropertyInput<U>> =
      this@map.randoms(random).map { it.map(f) }
}
fun <T> Arbitrary<T>.filter(predicate: (T) -> Boolean): Arbitrary<T> = object : Arbitrary<T> {
   override fun edgecases(): Iterable<T> = this@filter.edgecases().filter(predicate)
   override fun randoms(random: Random): Sequence<PropertyInput<T>> =
      this@filter.randoms(random).filter { predicate(it.value) }
}
/**
 * A [Progression] is a type of [Gen] which generates a deterministic, repeatable, and finite set
 * of values for a type T.
 *
 * An example of a progression is the range of integers from 0 to 100.
 * Another example is all strings of two characters.
 *
 * A progression is useful when you want to generate an exhaustive set of values from a given
 * sample space, rather than random values from that space. For example, if you were testing a
 * function that used an enum, you might prefer to guarantee that every enum value is used, rather
 * than selecting randomly from amongst the enum values (with possible duplicates and gaps).
 *
 * Progressions do not shrink their values. There is no need to find a smaller failing case, because
 * the smaller values will themselves naturally be included in the tested values.
 */
interface Progression<T> : Gen<T> {

   /**
    * @return the values for this progression as a lazy list.
    */
   fun values(): Sequence<T>

   override fun generate(random: Random): Sequence<PropertyInput<T>> = values().map { PropertyInput(it) }

   companion object
}

Some sample progressions:

fun Progression.Companion.int(range: IntRange) = object : Progression<Int> {
   override fun values(): Sequence<Int> = range.asSequence()
}

fun Progression.Companion.azstring(range: IntRange) = object : Progression<String> {
   private fun az() = ('a'..'z').map { it.toString() }
   // for each length in [range], emit every a-z string of that length (the full cartesian product)
   override fun values(): Sequence<String> = range.asSequence().flatMap { size ->
      List(size) { az() }
         .fold(listOf("")) { acc, letters -> acc.flatMap { prefix -> letters.map { ch -> prefix + ch } } }
         .asSequence()
   }
}

/**
 * Returns a [Progression] of the two possible boolean values - true and false.
 */
fun Progression.Companion.bools() = object : Progression<Boolean> {
   override fun values(): Sequence<Boolean> = sequenceOf(true, false)
}

Some sample arbitraries:

fun Arbitrary.Companion.int(
   iterations: Int,
   range: IntRange = Int.MIN_VALUE..Int.MAX_VALUE,
   distribution: IntDistribution = IntDistribution.Uniform
) = object : Arbitrary<Int> {
   override fun edgecases(): Iterable<Int> = listOf(Int.MIN_VALUE, Int.MAX_VALUE, 0)
   override fun randoms(random: Random): Sequence<PropertyInput<Int>> {
      return sequence {
         for (k in 0 until iterations) {
            val block = distribution.get(k, iterations, range.first.toLong()..range.last.toLong())
            val next = random.nextLong(block).toInt()
            val input = PropertyInput(next, IntShrinker)
            yield(input)
         }
      }
   }
}

sealed class IntDistribution {

   abstract fun get(k: Int, iterations: Int, range: LongRange): LongRange

   /**
    * Splits the range into discrete "blocks" to ensure that random values are distributed
    * across the entire range in a uniform manner.
    */
   object Uniform : IntDistribution() {
      override fun get(k: Int, iterations: Int, range: LongRange): LongRange {
         val step = (range.last - range.first) / iterations
         return (range.first + step * k)..(range.first + step * (k + 1))
      }
   }

   /**
    * Values are distributed according to the Pareto distribution.
    * See https://en.wikipedia.org/wiki/Pareto_distribution
    * Sometimes referred to as the 80-20 rule
    *
    * tl;dr - more values are produced at the lower bound than the upper bound.
    */
   object Pareto : IntDistribution() {
      override fun get(k: Int, iterations: Int, range: LongRange): LongRange {
         // this isn't really the pareto distribution so either implement it properly, or rename this implementation
         val step = (range.last - range.first) / iterations
         return range.first..(range.first + step * k + 1)
      }
   }
}

fun Arbitrary.Companion.long(
   iterations: Int,
   range: LongRange = Long.MIN_VALUE..Long.MAX_VALUE
) = object : Arbitrary<Long> {
   override fun edgecases(): Iterable<Long> = listOf(Long.MIN_VALUE, Long.MAX_VALUE, 0)
   override fun randoms(random: Random): Sequence<PropertyInput<Long>> {
      return sequence {
         for (k in 0 until iterations) {
            val next = random.nextLong(range)
            val input = PropertyInput(next, LongShrinker)
            yield(input)
         }
      }
   }
}

/**
 * Returns a stream of values where each value is a randomly
 * chosen Double.
 */
fun Arbitrary.Companion.double(): Arbitrary<Double> = object : Arbitrary<Double> {

   val literals = listOf(
      0.0,
      1.0,
      -1.0,
      1e300,
      Double.MIN_VALUE,
      Double.MAX_VALUE,
      Double.NEGATIVE_INFINITY,
      Double.NaN,
      Double.POSITIVE_INFINITY
   )

   override fun edgecases(): Iterable<Double> = literals

   override fun randoms(random: Random): Sequence<PropertyInput<Double>> {
      return generateSequence {
         val d = random.nextDouble()
         PropertyInput(d, DoubleShrinker)
      }
   }
}

fun Arbitrary.Companion.positiveDoubles(): Arbitrary<Double> = double().filter { it > 0.0 }
fun Arbitrary.Companion.negativeDoubles(): Arbitrary<Double> = double().filter { it < 0.0 }

/**
 * Returns an [Arbitrary] which is the same as [Arbitrary.double] but does not include +INFINITY, -INFINITY or NaN.
 *
 * This will only generate numbers ranging from [from] (inclusive) to [to] (inclusive)
 */
fun Arbitrary.Companion.numericDoubles(
   from: Double = Double.MIN_VALUE,
   to: Double = Double.MAX_VALUE
): Arbitrary<Double> = object : Arbitrary<Double> {
   val literals = listOf(0.0, 1.0, -1.0, 1e300, Double.MIN_VALUE, Double.MAX_VALUE).filter { it in (from..to) }
   override fun edgecases(): Iterable<Double> = literals
   override fun randoms(random: Random): Sequence<PropertyInput<Double>> {
      return generateSequence {
         // honour the requested bounds rather than nextDouble()'s default 0..1
         val d = random.nextDouble(from, to)
         PropertyInput(d, DoubleShrinker)
      }
   }
}

And some example tests you might write:

fun main() {

   // tests 1000 random ints with all integers 0 to 100
   forAll(
      Arbitrary.int(1000),
      Progression.int(0..100)
   ) { a, b ->
      a + b == b + a
   }

   // tests 10 random longs in the range 11 to MaxLong with all combinations of a-z strings from 0 to 10 characters
   forAll(
      Arbitrary.long(10, 11..Long.MAX_VALUE),
      Progression.azstring(0..10)
   ) { a, b ->
      b.length < a
   }

   // convenience functions which mimics the existing style
   // tests random longs, using reflection to pick up the Arbitrary instances
   // each arb will have the same number of iterations
   forAll<Long, Long>(1000) { a, b -> a + b == b + a }
}

You can see all this in the io.kotest.property package in module kotest-assertions.

@breandan
Contributor

breandan commented Jan 8, 2020

Thanks for the link @sksamuel! Skimming through the comments, just some initial thoughts.

Not sure if I understand how #1135 addresses #472 yet. One way is to strengthen compositionality, but I'm not sure if you want to support fully monadic binding as things start to become complicated pretty quickly. For example, to support composition with generator tuplings like we implemented in #506, if two generators accept a common input, shrinking becomes a multi-objective optimization problem in which there is not only one, but a set of solutions along the Pareto frontier. In general, shrinking becomes a lot harder the more compositional things are. Letting the user manually implement the shrinker might be less elegant, but seems more practical and aligns with QuickCheck's implementation.

I like the idea of encapsulating the value and shrinker in PropertyInput instead of Pair<T, Shrinker<T>>. I also like the idea of having a more configurable Generator, although I wonder if it would be better to have default arguments for all these parameters (random, iterations, and distribution) so as not to overwhelm new users. What are your goals around backwards API compatibility with these changes?

@LeoColman
Member

I think most cases have some standard default values. Random is a random seed, iterations is 1000 IIRC

@sksamuel
Member Author

sksamuel commented Jan 8, 2020

@breandan

First of all, everything here is open to change.

For #472 the idea is that the PropertyInput keeps the previous input before it was mapped, so it can use that in the "bare" shrinker to get the next shrunk value.

Of course, if you have bind and it has 8 inputs, you could potentially be shrinking 8 ways, so it's better to write something custom. We will support that (in the current rework, you would implement your own Arb and include the shrinker in that, but perhaps we de-couple them).

Some parameters are passed to the forAll method (or the forNone, forSome, whatever). And some parameters are passed to the generators themselves.

  • The random (seed) is passed to forAll, and forAll takes care of passing that to each gen in turn as it's needed.
  • The distribution is unique to some types of Arb, and so you pass that when you create the arb.
  • The iteration count is set on each Arb. This is somewhat unusual as other prop test frameworks (and KT currently) set it across all gens at once. But by doing it this way, you can have 100 randoms from genA cross-joined with 10 values from genB, giving you 1000 combinations. The advantage is this - say you have an enum with 10 possible states. You might want to test 100 random ints with every possible enum state, rather than 1000 random combinations where the enums are randomly picked each time. It basically combines SmallCheck and QuickCheck in one superset. I'm really keen on this, but the downside is that the burden is on the user a bit more and the syntax is more involved.

A possible example of a prop test in the current design with all the knobs.

forAll(
  Arb.int(1000, Distribution.Pareto), // 1000 randoms from the entire int range, biased towards 0
  Arb.double(10, 1..100.0), // 10 random doubles in the range 1..100.0
  Prog.enum<CustomerType>(), // every value from the enum CustomerType
  args = PropTestArgs(
    seed = 112549833451L, // setting the random seed
    shrinking = ShrinkingMode.Bounded(1000)  // no more than 1000 shrinks per arg
  )
) { a,b,c -> ... }

This would give you 1000 x 10 x |CustomerType| combinations.

I propose to also keep convenience methods, like this:

forAll<Int, Double, CustomerType>(1000) { a,b,c -> ... }

Which will take 333 from each parameter to get you back to ~1000.

@DannyRyman

DannyRyman commented Jan 14, 2020

One impact of the new API for us is that we currently use the generators outside of property-based testing, simply to generate random data for traditional TDD-style tests. I don't know whether others are also doing this, but I find it a nice capability to be able to use the same generators outside of property-based testing.

Because the new API needs to be passed the random, and because of the PropertyInput wrapper, it is more involved to do this:

Arbitrary.structure().samples(Random.Default).take(1).map { it.value }.first()

Previously:

Gen.structure()

I guess we can write our own extension functions to hide the details; but if others use the generators in this way, maybe this should be considered in the updated API design?
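For instance, something along these lines could restore the old one-liner (a hedged sketch against the proposed API; next is not part of the proposal, and Arbitrary.structure() is @DannyRyman's hypothetical generator):

// hypothetical convenience: pull a single value out of an Arbitrary
fun <T> Arbitrary<T>.next(random: Random = Random.Default): T =
   randoms(random).first().value

val structure = Arbitrary.structure().next()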

@LeoColman
Member

LeoColman commented Jan 14, 2020

I agree with @DannyRyman.

I use
UserGen.next() a lot, for example, when I just need a random user.

I believe we intend to support this, and if we don't ATM, we surely will before a release candidate is ready.

It also fits property testing concepts, as I'm escaping example-based testing by just testing that my function works for any given user (even when checking it only once)

@sksamuel
Member Author

sksamuel commented Jan 14, 2020 via email

@LeoColman
Member

I guess we can write our own extension functions to hide the details;

I believe we can probably add these extensions to Kotest itself :P

@LeoColman
Member

It's actually what we do with next and take

@sksamuel
Member Author

sksamuel commented Jan 14, 2020

We could have an extension method on Arbitrary called next which does the plumbing for you - putting in a default random instance and taking the first value. But it does mean that Arb needs a consistent way to set the size; it can't be in the constructor.

@LeoColman
Member

LeoColman commented Jan 14, 2020 via email

@LeoColman
Member

LeoColman commented Jan 14, 2020 via email

@sksamuel
Member Author

sksamuel commented Jan 14, 2020

Try writing next() for an Arb. You can't because Arbs are created with a size parameter, so there's nothing stopping you doing:

Arb.int(0).next() // create an arb of size 0 and ask for next

which would throw an error.

So for this to work the size parameter has to move out of Arb's constructors and into the interface method. So generate takes a size.

This means that you would need to do forAll like this:

forAll(1000, Arb.int(), Arb.double()) { a, b -> ... }

Like we do now.

But then, if you do

forAll(2, Arb.int(), Prog.enum<Season>()) { a, b -> ... }

There are 4 seasons and you've only asked for 2 iterations, so it's not exhaustive. So to enforce exhaustivity we need to introduce a parameter to choose whether we care about exhaustivity or not. And what happens if you ask for 10 iterations on a progression of size 3 - do you just loop back around again?

We could throw an exception if you ask for iterations < |max_progression_size|. Or we could introduce a new interface, so you start off with an Arb, which you pass a size into to get a Thing, and the Thing generates the sequence.

forAll(Arb.int().generate(100), Arb.double().generate(200), Prog.int(0, 10)) { a, b, c -> ... }

Then you could do everything we need, but the usability is dropping each time we increase complexity.

@LeoColman
Member

If the default size for Arbitrary is 1000, for example, then Arbitrary.int().next() would work just fine, wouldn't it?

@LeoColman
Member

Then we can throw an IllegalArgumentException if you try to do Arbitrary.int(0).next()

@sksamuel
Member Author

sksamuel commented Jan 14, 2020 via email

@sksamuel
Member Author

Just throwing another brain storm out here.

Rename Progression to Exhaustive, move iterations to the forAll methods, and then do this:

forAll(
  1000,
  Arbitrary.ints(),
  Arbitrary.doubles(),
  Exhaustive.int(0, 100)
) { a, b, c -> .... }

This generates 1000 tests pulling from the two Arbs. The Exhaustive provides 100 values and then cycles. If the iteration count is less than the max of any exhaustives, an exception is thrown. In other words, an exhaustive generator will throw an error if you try to pull fewer iterations than it generates.

Pros - less setup for user, can use a gen to provide single values.
Cons - Can't do |a| x |b| x |c| iterations.
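A hedged sketch of the cycling rule described above, assuming the renamed Exhaustive keeps the values() accessor from the earlier Progression sketch:

// repeat the exhaustive set to fill the requested iterations; refuse if the
// iteration count cannot cover the whole set at least once
fun <T> Exhaustive<T>.cycle(iterations: Int): Sequence<T> {
   val count = values().count()
   require(iterations >= count) { "iterations ($iterations) < exhaustive size ($count)" }
   return generateSequence { values() }.flatten().take(iterations)
}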

@LeoColman
Member

can use a gen to provide single values

It's not necessary to move the iterations to forAll to allow this, as we discussed above.

@sksamuel
Member Author

sksamuel commented Jan 22, 2020

Ok everyone, another iteration of the prop test proposals, and I think we're getting closer to a final implementation (if everyone agrees!).

Code here: #1174

What's changed?

  • Progressions are now called Exhaustives. I think it better describes that the values are a complete closed set, not necessarily a sequence.

  • Gen has been restored as a top level construct. A Gen provides a single value of T per invocation, as well as an optional shrinker and edgecases. The burden on a user implementing their own gen is much less now.

  • Arbitraries (Arbs) are simply created from a Gen using the take function specifying the iteration count. This means that we can use Gens outside of prop testing when we just want a value (as @DannyRyman pointed out). It means we can do other things with Gens in the future too. The complete number of tests is the product of the counts of each parameter, like before, but the take operation makes it clearer imo.

  • Shrinking control has moved inside the take method, so if you want to turn shrinking off, you can do so on a per-parameter basis, like

forAll(Gen.int().take(100, ShrinkingMode.Off)) { a -> .... }
  • All prop test methods are now suspendable, rather than inline, so they compile much quicker. The downside is you need to execute them in a coroutine, but KT is fully enabled for coroutines in every test scope transparently to the user, so I don't foresee this being an issue.

The end result is code like this:

forAll(
   Gen.int(0..500).take(100),
   Gen.positiveInts().take(10),
   Exhaustive.enum<SomeEnum>()
) { a, b, c ->
   // test
}

And we'll support convenience functions like:

Gen.int().forAll(1000) { a -> ... }

and

forAll<Int, Int>(1000) { a, b -> ... }

Please give feedback as quickly as you like. Once the property test changes are finalised (or thereabouts) we can move 4.0 to a beta release.

@LeoColman
Member

All prop test methods are now suspendable, rather than inline, so they compile much quicker. The downside is you need to execute them in a coroutine, but KT is fully enabled for coroutines in every test scope transparently to the user, so I don't foresee this being an issue

The only issue I see with this is if people want to use the module outside of the KotlinTest environment. However, Kotlin itself is very easy to integrate with coroutines, so this won't be much more than an annoyance.

And we'll support convenience functions like:
Gen.int().forAll(1000) { a -> ... }

I suppose these values will have defaults. I think this is great!

However...

Gen has been restored as a top level construct

Are we going to have Gen, Exhaustive AND Arbitrary?
Is Exhaustive a Gen?

I think this will be a little bit confusing. At least it is confusing for me right now, maybe because I'm already in the mindset of Arb and Prog. I'll wait for an API proposal to comment further

@sksamuel
Member Author

sksamuel commented Jan 23, 2020 via email

@sksamuel
Member Author

sksamuel commented Jan 23, 2020 via email

@sksamuel
Member Author

sksamuel commented Feb 7, 2020

All the recent work has been merged to master. I think it's stable enough to be used.

@sksamuel sksamuel closed this as completed Feb 7, 2020