
Add Arb.distinct that will terminate #2262

Closed
Tracked by #2226
sksamuel opened this issue May 19, 2021 · 5 comments · Fixed by #2304
Labels: enhancement ✨ (Suggestions for adding new features or improving existing ones), property-testing 📏 (Related to the property testing mechanisms within the testing framework)

sksamuel (Member) commented May 19, 2021

The current distinct is likely to hang if it cannot generate enough unique values, which is why we deprecated it.
But the functionality is still useful.
I propose we add an alternative that makes those caveats crystal clear.

arb.distinct(maxAttempts = 1000, throwOnLimit = true)

Those parameters would have no defaults, so it would be obvious what you're opting into.
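The termination logic behind such an operator could look something like this. This is a sketch in plain Kotlin over `Sequence`, not Kotest's actual `Arb` type; the name `distinctWithLimit` and the "reset the counter on each success" policy are my own assumptions:

```kotlin
/**
 * Sketch: yield only values not seen before, giving up after
 * [maxAttempts] consecutive duplicates instead of hanging forever.
 */
fun <T> Sequence<T>.distinctWithLimit(maxAttempts: Int, throwOnLimit: Boolean): Sequence<T> = sequence {
    val seen = mutableSetOf<T>()
    var attempts = 0
    for (value in this@distinctWithLimit) {
        if (seen.add(value)) {
            // fresh value: yield it and reset the duplicate counter
            attempts = 0
            yield(value)
        } else if (++attempts >= maxAttempts) {
            // too many consecutive duplicates: fail loudly or stop quietly
            if (throwOnLimit) error("Could not generate a distinct value after $maxAttempts attempts")
            return@sequence
        }
    }
}
```

With `throwOnLimit = false` the sequence simply ends early once the limit is hit, so a `take(n)` downstream terminates even when the underlying generator cannot produce n unique values.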

@myuwono ?

sksamuel added this to the 5.0 milestone May 19, 2021
sksamuel mentioned this issue May 19, 2021
aSemy (Contributor) commented Jun 8, 2021

I've been thinking about generating distinct values recently. It's really important to be able to put constraints on Arbs, but also to prevent those constraints from hanging (if the Arb can't generate enough distinct values), while keeping the Arbs themselves stateless.

Let's say I write a test that generates short Strings. I'm testing something stateful, so I want to make sure the generated values are distinct. I would be satisfied if any non-distinct generated values were skipped by Kotest, and marked as such in any report.

How about shifting the distinct check into the test? Just as there's TestCaseConfig.enabledIf, the property test would only be enabled if the generated values conform to any given constraint. If a generated value doesn't conform, the test is skipped and marked as such in the report.

Here's how I think this could work:

import dev.adamko.warehouse.kotest.ArbConstraint.Companion.addConstraint
import dev.adamko.warehouse.kotest.DistinctArbConstraint.Companion.distinctConstraint
import io.kotest.core.spec.style.FunSpec
import io.kotest.matchers.shouldBe
import io.kotest.matchers.string.shouldHaveMinLength
import io.kotest.property.Arb
import io.kotest.property.Exhaustive
import io.kotest.property.Gen
import io.kotest.property.RandomSource
import io.kotest.property.arbitrary.az
import io.kotest.property.arbitrary.string
import io.kotest.property.checkAll


/**
 * New Arb, with constraints.
 *
 * The field 'constraints' could also be added to [Gen].
 *
 * Alternatively, make [Gen], [Arb], and [Exhaustive] interfaces, so we can use delegation.
 * This would be my preference - it's much more flexible.
 */
class ConstrainedArb<T>(
    private val delegatedArb: Arb<T>,
    constraint: ArbConstraint<T>
) : Arb<T>() {

  // store constraints
  val constraints: MutableSet<ArbConstraint<T>> = mutableSetOf()

  fun getConstraintViolations(arbVal: T) =
      constraints.filter { !it.isValid(arbVal) }

  init {
    constraints.add(constraint)
  }

  // Delegate to the actual arb
  override fun edgecase(rs: RandomSource) = delegatedArb.edgecase(rs)
  override fun sample(rs: RandomSource) = delegatedArb.sample(rs)
  override fun values(rs: RandomSource) = delegatedArb.values(rs)

  companion object {
    fun <T> ConstrainedArb<T>.addConstraint(constraint: ArbConstraint<T>) =
        apply { constraints.add(constraint) }
  }
}

/** Define a constraint with this interface */
fun interface ArbConstraint<T> {
  fun isValid(arbVal: T): Boolean

  companion object {
    fun <T> Arb<T>.addConstraint(constraint: ArbConstraint<T>) =
        ConstrainedArb(this, constraint)
  }
}

/** Default 'Distinct' constraint - other constraints can also be included */
class DistinctArbConstraint<T> : ArbConstraint<T> {

  /** Store previously seen values */
  private val previous: MutableSet<T> = mutableSetOf()
  override fun isValid(arbVal: T): Boolean = previous.add(arbVal)

  companion object {
    /** helper method for adding this constraint */
    fun <T> Arb<T>.distinctConstraint() = addConstraint(DistinctArbConstraint())
  }
}

/** Example [ConstrainedArb] usage */
class ConArb : FunSpec({

  context("usernames") {

    val propContext = checkAll(
        Arb.string(1, Arb.az())
            .distinctConstraint() // use a pre-made one
            .addConstraint { it.isNotBlank() } // or use the fun-interface
    ) { username ->
      // hardcoded for demo - this check should be done behind the scenes
      if (!check.isValid(username)) {
        classify("constraint-violation")
        return@checkAll
      }
      test("testing username $username") {
        println("testing username $username")
        username shouldHaveMinLength 1
        database.add(username) shouldBe true
      }
    }

    // I'd like to see some sort of output about skipped tests,
    // ideally not with a green checkmark.
    // Here's a rough demo for now.
    val skippedTestCount = propContext.classifications()["constraint-violation"]
    test("Tests: ${propContext.attempts()}, skipped: $skippedTestCount") {
      println("skipped $skippedTestCount")
    }

  }
})

// hardcoded for demonstration
val check = DistinctArbConstraint<String>()
val database = mutableSetOf<String>()

Result: (screenshots of the test run output omitted)

I'm fuzzy on how to skip tests in the actual Kotest framework code, but here's something that I hope shows the intent (not my suggested implementation). It would probably be better to share the enabledIf logic between PropTestConfig and TestCaseConfig.

suspend fun <A> proptest(
    iterations: Int,
    genA: Gen<A>,
    config: PropTestConfig,
    property: suspend PropertyContext.(A) -> Unit
): PropertyContext {

  require(iterations >= genA.minIterations()) {
    "Require at least ${genA.minIterations()} iterations to cover requirements"
  }

  val context = PropertyContext()
  val random = config.seed?.random() ?: RandomSource.default()

  when (genA) {
    is Arb        -> {
      genA.generate(random, config.edgeConfig)
          .take(iterations)
          .forEach { a ->
            val shrinkfn = shrinkfn(a, property, config.shrinkingMode)
            config.listeners.forEach { it.beforeTest() }
            test(context, config, shrinkfn, listOf(a.value), random.seed) {

              if (genA is ConstrainedArb) {
                val violations = genA.getConstraintViolations(a.value)
                if (violations.isNotEmpty()) {
                  // todo skip the test with a message
                  return@test
                }
              }
              context.property(a.value)
            }
            config.listeners.forEach { it.afterTest() }
          }
    }
    is Exhaustive -> {
      genA.values.forEach { a ->
        config.listeners.forEach { it.beforeTest() }
        test(context, config, { emptyList() }, listOf(a), random.seed) {
          context.property(a)
        }
        config.listeners.forEach { it.afterTest() }
      }
    }
  }

  context.checkMaxSuccess(config, random.seed)
  return context
}

myuwono (Contributor) commented Jun 8, 2021

@aSemy thank you! Quite honestly I was also thinking along similar lines, because, as you've rightly highlighted, it's important to keep these arbs stateless. Maybe we can introduce such a constraint as an additional parameter to checkAll.

In a nutshell..

test("my distinct") {
  val myset: MutableSet<B> = mutableSetOf()
  fun isDistinct(a: A, b: B): Boolean = ...
  checkAll(arbA, arbB, constraint = ::isDistinct) { a, b ->
    ...
    
  }
}
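The filtering such a constraint parameter implies could be sketched in plain Kotlin, independent of Kotest's real checkAll signature. The names `checkAllWithConstraint` and `RunStats` are hypothetical, and inputs are modelled as a pre-zipped sequence of pairs for simplicity:

```kotlin
/** Hypothetical counters a property run could report. */
data class RunStats(var attempts: Int = 0, var skipped: Int = 0)

/**
 * Sketch: evaluate a constraint predicate per generated pair and
 * skip (rather than fail) when it is violated.
 */
fun <A, B> checkAllWithConstraint(
    inputs: Sequence<Pair<A, B>>,
    constraint: (A, B) -> Boolean,
    property: (A, B) -> Unit,
): RunStats {
    val stats = RunStats()
    for ((a, b) in inputs) {
        stats.attempts++
        if (!constraint(a, b)) {
            // constraint violated: record the skip, don't run the property
            stats.skipped++
            continue
        }
        property(a, b)
    }
    return stats
}
```

A stateful `isDistinct` as in the snippet above would just be a `constraint` lambda closing over a `MutableSet`, which keeps the state in the test rather than in the Arb.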

I think @aSemy 's idea makes sense.

We have to think about ergonomics but this actually solves a lot of problems, not only distinct.

Wdyt @sksamuel ?

aSemy (Contributor) commented Jun 10, 2021

@myuwono Thanks! Happy to be discussing this.

I like your suggestion too - multi-value checking is important as well. It would be nice to define this inside the checkAll scope - then it's more flexible with regard to a variable number of arguments.

I think what this issue is dancing around is the need for a proper 'skip test' capability, for all contexts. I think the way to do this is to throw a custom Exception. It's not ideal, but then any 'skip' checks can be performed inside the context, so any check can be flexible with regard to multi-arg checking.

Using the property test context as an example:

  /** todo - handle this in the test runner, mark the test as skipped so it's visible to the user */
  class SkipTestException(context: PropertyContext, skipReason: String) : Exception(skipReason)

  /** Immediately stop this test */
  fun PropertyContext.skip(skipReason: String): Nothing =
      throw SkipTestException(this, skipReason)

  context("test usernames and ids") {

    val previousUsernames = mutableSetOf<String>()

    checkAll<String, List<Long>> { username, ids ->

      // custom skip tests
      if (!previousUsernames.add(username)) {
        skip("repeated username $username")
      }
      beUnique<Long>().test(ids).let {
        if (!it.passed()) {
          skip(it.failureMessage())
        }
      }

      // test continues...
    }
  }

This approach could be used for every context, and would sync up nicely with TestCaseConfig.enabledIf.

And once it's in place, there could be some more helper functions to smooth it out - like a boolean option on PropTestConfig (default false) that automatically stores all previously seen values.
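The skip-via-exception mechanism described above can be sketched outside the framework. This is a minimal model of a runner loop, assuming a hypothetical `runProperty` helper; it is not Kotest's actual implementation:

```kotlin
/** A dedicated exception type the runner treats as "skip", not "failure". */
class SkipTestException(reason: String) : Exception(reason)

/** Immediately stop the current property invocation. */
fun skip(reason: String): Nothing = throw SkipTestException(reason)

/**
 * Sketch of a runner loop: invocations that throw SkipTestException
 * are counted separately instead of failing the run.
 */
fun <T> runProperty(values: List<T>, body: (T) -> Unit): Map<String, Int> {
    var passed = 0
    var skipped = 0
    for (v in values) {
        try {
            body(v)
            passed++
        } catch (e: SkipTestException) {
            // recorded for reporting, not treated as an error
            skipped++
        }
    }
    return mapOf("passed" to passed, "skipped" to skipped)
}
```

A distinct-only run then becomes a one-liner inside the property body: skip whenever a set of previously seen values rejects the new one.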

sksamuel (Member, Author) commented:

We can easily add a skip exception, but the sticking point in the past has been the JUnit platform - once you've started a test you cannot mark it as ignored, so the best we can do is mark it as passed. That's why I was a bit reluctant - would a green "skipped" test be an issue? I'm sure we can do something if that's what we want.

I'm not too clear on what the proposal is here. That certain generated values are skipped, and marked as such? For example repeated values, or, in the earlier example, blank strings? If we add a constraint that x > 10, why would we want to report that we generated 152 values with x < 10 and skipped them?

sksamuel (Member, Author) commented Aug 1, 2021

Any thoughts on the above?

sksamuel added a commit that referenced this issue Aug 19, 2021
* Add Arb.distinct that will terminate #2262

* Added delicate kotest annotation
sksamuel added the enhancement ✨ and property-testing 📏 labels Aug 19, 2021
This issue was closed.