Skip to content

auto-deletion of snapshots on devtools::test() has false positives on test failures #1393

@mjskay

Description

@mjskay

(not sure if this is a testthat or vdiffr issue, but I am 95% sure it is testthat, so posting here but tagging @lionel-)

I encountered a snag in my workflow while trying out the new workflow for vdiffr that is based on snapshots, and I am wondering if I've missed a more sensible workflow or if devtools::test() is being a bit too trigger-happy on deleting unused snapshots.

Basically, while fixing some bugs in {ggdist} (which has a large number of vdiffr test cases), I've discovered that if there is a bug in a test case that causes a snapshot to fail (e.g. a bug that causes vdiffr::expect_doppelganger() to error), if you run devtools::test() it will think that the corresponding snapshot is no longer used and delete it (I provide an example below). I think this is somewhat problematic because once you fix the bug causing the error, the snapshot to test against is no longer there!

You can kind of work around this by filtering to specific files (which seems not to delete unused test cases), but this is still cumbersome because: (1) often (if you're like me) you'll discover the bug by running devtools::test(), so the test case is already deleted and (2) I'd much prefer to be able to have fast enough test cases that I can reflexively run the full suite while fixing stuff, especially if there are lots of test cases or problems in multiple parts of the package introduced by a change in an upstream dep.

An example

Create a package, run usethis::use_testthat(), then add this test file:

library(ggplot2)

test_that("simple plot", {
  vdiffr::expect_doppelganger("simple plot", ggplot(data.frame(), aes(1,1)) + geom_point())
})

Running devtools::test() the first time creates the snapshot:

i Loading testpkg
i Testing testpkg
√ |  OK F W S | Context
√ |   1   1   | test [0.1 s]                                                                              
----------------------------------------------------------------------------------------------------------
Warning (test-test.R:4:3): simple plot
Adding new file snapshot: '_snaps/test/simple-plot.svg'
----------------------------------------------------------------------------------------------------------

Second run of devtools::test() passes:

i Loading testpkg
i Testing testpkg
√ |  OK F W S | Context
√ |   1       | test                                                                                      

== Results ===============================================================================================
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ]

Now imagine the above test started erroring. Simulate that by changing the test case to use stop():

library(ggplot2)

test_that("simple plot", {
  vdiffr::expect_doppelganger("simple plot", stop())
})

Run devtools::test() again, and the snapshot is deleted:

i Loading testpkg
i Testing testpkg
√ |  OK F W S | Context
x |   0 1     | test                                                                                      
----------------------------------------------------------------------------------------------------------
Error (test-test.R:4:3): simple plot
Error: 
Backtrace:
 1. vdiffr::expect_doppelganger("simple plot", stop()) test-test.R:4:2
 2. vdiffr:::writer(fig, testcase, title)
 3. vdiffr:::print_plot(plot, title)
----------------------------------------------------------------------------------------------------------

== Results ===============================================================================================
[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]
Deleting unused snapshots:
* _snaps/test/simple-plot.svg

Now if I fix that test case, I won't have the old snapshot to compare against. I can restore it from git, but this is cumbersome (especially if this happens with several test cases).

Expected behavior

I'm not sure the best behavior, but I don't recall encountering problems like this with the old devtools::test() followed by vdiffr::manage_cases() workflow, which seemed to correctly identify orphaned test cases without ever accidentally deleting stuff.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions