Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Per-leaf test isolation #379

Closed
ajalt opened this issue Jul 15, 2018 · 12 comments
Closed

Per-leaf test isolation #379

ajalt opened this issue Jul 15, 2018 · 12 comments
Labels
enhancement ✨ Suggestions for adding new features or improving existing ones.
Milestone

Comments

@ajalt
Copy link
Contributor

ajalt commented Jul 15, 2018

I've been thinking about possible improvements to kotlintest's runner structure for the next major version of the library.

Currently, kotlintest makes it easy to hierarchically structure test cases, but setting up test state requires listeners, which are more cumbersome. Since the entire spec is executed in a single pass, you can't set up test state inside a spec without it being shared between each test.

Listeners work, and are reusable. But in my experience, sharing setup and teardown code between test classes is uncommon outside of a few junit rules like creating temporary files. Tests with the same setup are usually all in the same class.

My favorite hierarchical test runner is Catch2 for C++. Tests in Catch2 are structured similarly to kotlintest, with top-level test cases that contain nested sections that form a tree of tests. The big difference from kotlintest is in how the control flow works. Quoting the Catch2 docs:

Sections can be nested to an arbitrary depth (limited only by your stack size). Each leaf section (i.e. a section that contains no nested sections) will be executed exactly once, on a separate path of execution from any other leaf section (so no leaf section can interfere with another). A failure in a parent section will prevent nested sections from running - but then that's the idea.

Example

Let's say we want to do an integration test of a database model where each test is run inside a transaction that is rolled back after the test ends:

class UserTableTest: KotlintestSpec({
    val db = Database.connectToDatabase()
    db.transaction { transaction ->
        section("an existing user") {
            val user = db.insertNewUser()
            section("a new user is added") {
                db.insertNewUser()
                db.fetchUsers().size shouldBe 2
            }
            section("the user is deleted") {
                db.delete(user)
                db.fetchUsers().size shouldBe 0
            }
        }
        
        section("two related users") {
            // etc ...
        }

        transaction.rollback()
    }
    db.close()
})

All of the setup and teardown is declared naturally with zero boilerplate. Since the entire spec is run once for each leaf section, we don't have to worry about test cases interfering with each other. This, in my opinion, a big improvement over JUnit rules, test listeners, etc.

Implementation

So how would we implement it? Catch2 takes advantage of macros and templates to set up its magic. We don't have those tools in Kotlin, so we would probably have to implement a runtime solution.

Here's an algorithm that we could use: https://gist.github.com/ajalt/df82eb9a32383c720f22081f449781f7

This algorithm allows sections to be nested arbitrarily. An advantage of this is that all the different test styles (like given/when/then, feature/scenario/on/it, String.() -> Unit etc.), could be implemented as aliases for section, obviating the need for the many current spec classes.

Fixtures

With this design, running code before and after each test is easy, as shown above. But we still need to consider how to run code before and after the entire spec. In the example above, if the database connection is expensive to set up, it would be nice to only connect once for the entire spec, and reuse the connection for each test. Here are two possible APIs for this:

Spec({
    val db = runOnce { Database.connectToDatabase() }
    section("...") {
        // ...
    }
    teardown {
        db.close()
    }
})
Spec({
    val db = runOnce(
        setup = { Database.connectToDatabase() }
        teardown = { db -> db.close() }
    ) 
    section("...") {
        // ...
    }
})

I feel that that second option is less error-prone. Both options could be implemented be caching the object and teardown block on the spec object.

Drawbacks

The runtime implementation does bring some challenges.

Excluding individual sections or tests is easy, but specifying a single test (or group of tagged tests) to run is more challenging. Since we don't know whether any given section contains the specified test until we run it, we wouldn't be able to support adding "f:" to the start of test names. Additionally, if you add a "!" to the name of a leaf section, we don't know not to run the outer sections that contain the excluded test. You would have to specify the entire path to the section that you want to exclude/focus on before the spec is run. This is easy to do for IDE plugins, but is probably cumbersome from the command line.

Let me know what you think. Is this viable? It seem like it's the direction that Kotlintest and Spek are going, but without the gotchas of Spek or the boilerplate of test listeners. It does make the common case of running a single test more difficult, however.

@sksamuel
Copy link
Member

I think what you're asking for is similar (or the same) as what we already have with one instance per test.

So an example from the tests:

class OneInstancePerTestTest : FunSpec() {

  override fun isInstancePerTest(): Boolean = true

  init {
    var count = 0
    test("be 0") {
      count shouldBe 0
      count = 100
    }
    test("be 0 part 2") {
      count shouldBe 0
      count = 100
    }
    test("be 0 part 3") {
      count shouldBe 0
      count = 100
    }
    test("still be 0") {
      count shouldBe 0
    }
  }
}

The implementation is similar to your gist. See
https://github.com/kotlintest/kotlintest/blob/master/kotlintest-runner/kotlintest-runner-jvm/src/main/kotlin/io/kotlintest/runner/jvm/InstancePerTestSpecRunner.kt

@sksamuel
Copy link
Member

Actually this test is much better example.

class FreeSpecInstancePerTestTest : FreeSpec() {

  override fun isInstancePerTest(): Boolean = true

  val count = AtomicInteger(0)

  init {
    "1" - {
      count.incrementAndGet().shouldBe(1)
      "1.1" {
        count.incrementAndGet().shouldBe(2)
      }
      "1.2" - {
        count.incrementAndGet().shouldBe(2)
        "1.2.1" {
          count.incrementAndGet().shouldBe(3)
        }
        "1.2.2" - {
          count.incrementAndGet().shouldBe(3)
          "1.2.2.1" {
            count.incrementAndGet().shouldBe(4)
          }
          "1.2.2.2" {
            count.incrementAndGet().shouldBe(4)
          }
        }
      }
    }
    "2" - {
      count.incrementAndGet().shouldBe(1)
      "2.1" - {
        count.incrementAndGet().shouldBe(2)
        "2.1.1" {
          count.incrementAndGet().shouldBe(3)
        }
        "2.1.2" {
          count.incrementAndGet().shouldBe(3)
        }
      }
    }
  }
}

@ajalt
Copy link
Contributor Author

ajalt commented Jul 17, 2018

Cool, isInstancePerTest is the right idea. There are two thing that make it currently unsuitable for my use case:

  1. A new instance is created for each node in the tree, not just leaf nodes. This means that expensive setup and teardown will be run more than required.
  2. More critial is the fact that new instances are created before previous instances have cleaned up, so global state can be dirty if cleanup is required.

Do you think that the current behavior could be changed, or is it necessary for other features?

@sksamuel
Copy link
Member

  1. The problem is, you don't know you have a leaf node until you execute it and no new tests are added. The way it works is that when a "test" is executed, any invocations to the test builder methods (like should or whatever is the keyword in the spec you are using) add a new "test" to the stack. So in order to know you have a leaf, you must execute the test, detect that no new child tests were pushed to the stack, but by this point you've executed it.

  2. I'm not sure this is the case, as a full "test" is completed before the next one is started. Can you explain more ?

@sksamuel
Copy link
Member

sksamuel commented Jul 17, 2018

Actually I will back peddle on 1. I added TestType enum precisely to distinguish between leaves and branches. There are two instances of SpecRunner, and it would be possible to add a third to do what you want I think. Each time a container is found, just continue to execute it in the same context as the parent, but when a leaf is found, push it to the stack for future execution (like the InstancePerTestSpecRunner does).

We'd need to deprecate isInstancePerTest and replace it with executionType (but a better name), which returns an enum, so it supports all three modes. And if it's undefined, then default back to the deprecated isInstancePerTest.

If it works I'd be happy for this to be 3.2.0. I don't think it warrants a jump to 4.0.0 IMHO.

I'm still confused by point 2 though.

@ajalt
Copy link
Contributor Author

ajalt commented Jul 17, 2018

  1. The TestType enum is probably the right way to solve the problem. The algorithm I proposed works without an enum, though, by using the fact that every intermediate node needs to be executed on the way to the leaves anyway, so the first time you enter a node, you can assume it's either a leaf, or it part of the execution path to a leaf.

  2. Maybe I'm the one confused here, but with this test:

val count = AtomicInteger()
class FreeSpecTest : FreeSpec() {
  val i = count.incrementAndGet()
  override fun isInstancePerTest() = true
  init {
    "1" - {
      System.err.println("$i: start")
      "1.1" - {
        System.err.println("$i: 1.1")
        "1.1.1" {
          System.err.println("$i: 1.1.1")
        }
      }
      System.err.println("$i: end")
    }
  }
}

I would expect the following output (given that a new instance is created for each node):

1: start
1: 1.1
1: end
2: start
2: 1.1
2: 1.1.1
2: end

But instead the actual output is:

2: start
3: start
3: 1.1
4: start
4: 1.1
4: 1.1.1
4: end
3: end
2: end

So it looks like, when a new node is found, the current execution is suspended until all children have completed. That means that if I had some cleanup where I print "end", it wouldn't run until all tests had finished, rather than after each test. Am I misunderstanding how it works?

@sksamuel
Copy link
Member

Ok so I've refreshed my memory on how I implemented this, and you're right, what I do is each time a discovered test is found, we execute it immediately. I do recall trying to make it work as you wanted but for some reason I abandoned that approach. However it's probably worth while me trying again as it would be better and the current approach is possibly a bit of a gotcha.

As for part 1, with intermediate tests not executed twice. I think it's worth having an extra mode. Instance per spec, instance per node, instance per leaf.

sksamuel added a commit that referenced this issue Jul 28, 2018
… a node to completion before starting the next test #379
@sksamuel
Copy link
Member

Ok so I've pushed some preliminary work on this.

I've introduced a new enum on the Spec interface: SpecIsolationMode which we can use to switch between three instances of the spec runner - SharedInstance (single instance for all tests), InstancePerNode (what instance per test is now), and then InstancePerLeaf which is what you want.

At the moment the last two just default to the same implementation.

Also I've introduced two new methods in the test listener: beforeSpecStarted and afterSpecCompleted - I don't like the names, so I would very much like suggestions - these callbacks are only ever executed once, regardless of how many spec instances are created. I think that might be useful for setting up databases or http servers etc.

@sksamuel sksamuel added this to the 3.2 milestone Jul 28, 2018
@sksamuel sksamuel added the enhancement ✨ Suggestions for adding new features or improving existing ones. label Jul 28, 2018
@ajalt
Copy link
Contributor Author

ajalt commented Jul 30, 2018

As far as the new method names, I would suggest that the beforeSpec/afterSpec methods are changed to behave like beforeSpecStarted/afterSpecCompleted rather than adding new methods at all. It seems like beforeSpec behaves the same way as beforeTest when isolation is enabled. When isolation is disabled, then beforeSpec behaves the same way as beforeSpecStarted. This is probably confusing for users. Making beforeSpec run exactly once whether or not isolation is enabled is a better API in my opinion.

I think that the use of setup and teardown methods is almost exclusively for creating and cleaning up expensive test resources. Doing this in methods is a fairly boilerplate-heavy API.

For example:

class MyTest: FreeSpec() {
    private lateinit var db: SqlDatabase

    override fun beforeSpecStarted(description: Description, spec: Spec) {
        db = Database.create()
    }

    override fun afterSpecCompleted(description: Description, spec: Spec) {
        db.close()
    }

    init {
        "test" - {
            // ...
        }
    }
}

It works, but it's a lot of typing. It also places the resource (db) far from where it's used.

I think a more ergonomic solution would be to create the resource inline:

class MyTest: FreeSpec({
    val db = memoized(
        setup = { Database.create() },
        teardown = { it.close() }
    )
    "test" - {
        // ...
    }
})

This is a lot less typing, and seems more intuitive to me. You could implement it by caching the value keyed by the name of the setup lambda class.

For example:

private val propertyCache = HashMap<String, Any?>()
private val teardownBlocks = HashMap<String, (Any?) -> Unit>()

fun <T> memoized(setup: () -> T): T {
  return propertyCache.getOrPut(setup.javaClass.name, setup) as T
}

fun <T> memoized(setup: () -> T, teardown: (T) -> Unit):T {
  teardownBlocks[setup.javaClass.name] = (teardown as (Any?) -> Unit)
  return memoized(setup)
}

You would need to pass the mutable maps to any new instances you create so that they would get the previously created value, even if they were registered before the value was created.

Then the teardown blocks could be run after all instances are complete:

for ((k, block) in teardownBlocks) {
    block.invoke(propertyCache[k])
}

I think this works especially well with per-leaf isolation, since beforeTest is already unnecessary in that mode (since any code in the top level of the spec is already executed before each test).

@ajalt ajalt changed the title Kotlintest 4.x test runner discussion Per-leaf test isolation Jul 30, 2018
@sksamuel
Copy link
Member

I don't want to change the functionality of existing listener methods as versions 3.0 and 3.1 both had a reasonable amount of breaking changes and I want 3.2 to be an upgrade that requires zero changes from the user if coming from 3.1.

Deprecation is fine though, so I think beforeSpec and afterSpec can both stay as "each time a spec is instantiated" but I would be open to renaming them to "beforeSpecCreated" and "afterSpecDestroyed" or something - deprecating the previous names, and adding a forward from them to the new methods. Then maybe it's a bit more clear on what they are doing.

I like your memoized idea. It's similar to the subject concept that some libraries use. We could add memoized even without any other changes elsewhere as that can just hook into the before and after spec methods we already have.

@sksamuel
Copy link
Member

I've added an InstancePerLeafSpecRunner which executes each leaf node in turn, in it's own instance. As part of implementing this, and looking at your gist, I realised what you want is different to what I thought you wanted.

So in previous versions of KotlinTest we've had two modes:

  • Run each test as it registered, in order, all in a single instance of the spec class (1)
  • Run each test as it registered, in order, with the path to the newly registered test + the test running in a clean instance of the spec class. (2)

(I'm not sure if there's a term for this, but I will call a path from the root to an edge (leaf) a terminal path.)

And now I've implemented

  • Run each terminal path, in isolation, with each terminal path executing in a clean instance of the spec. (3)

I think what you wanted was

  • Run each terminal path, in isolation, in the same instance of the spec. (4)

Which of course I can implement as well - would be simpler than the one I've just done. I also think mode (2) is probably pretty useless. If you want a fresh instance of the test class (which is what junit does and hence why it's popular), then you probably want an entire path (either terminal or not) to complete before a new path is started. And mode 4 is nice too (your requirement).

So now, the newly minted isolation mode has three values:

enum class SpecIsolationMode {
  SharedInstance,
  InstancePerNode,
  InstancePerLeaf
}

But it needs to cover the new case. And perhaps all need to be renamed.

@ajalt
Copy link
Contributor Author

ajalt commented Jul 31, 2018

Awesome work! I agree with you that mode (2) probably doesn't have any use-cases, and could be deprecated/removed. You're correct that I had originally envisioned case (4), but looking at it now, I think that (3) and (4) would both work equally well, so I don't think there's any reason to implement both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement ✨ Suggestions for adding new features or improving existing ones.
Projects
None yet
Development

No branches or pull requests

2 participants