Convert tests from testthat to testit #625
Replace the testthat test infrastructure with testit:
- test_that() -> assert()
- expect_equal() -> isTRUE(all.equal())
- expect_identical() -> %==%
- expect_true/false() -> (expr) / (!(expr))
- expect_error/warning() -> has_error/has_warning()
- Snapshot tests (as_gt, as_rtf) converted to testit's .md format
- Test runner: tests/test-all.R using test_pkg()
- DESCRIPTION: testthat dependency replaced with testit

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Upstream added as.data.frame() wrapping in test comparisons; resolved by applying the same change in testit syntax. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use the new `message` parameter in has_error() to verify specific error messages, matching the original testthat tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Since testit's assert() checks whether the value is TRUE, all.equal() already returns TRUE on success, making isTRUE() redundant. Replace (isTRUE(all.equal(a, b))) with (all.equal(a, b)). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ion 2 testthat edition 2's expect_equal() passes if EITHER the relative OR absolute difference is within tolerance. Add an all_equal() helper that replicates this behavior, and use it across all test files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
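A helper of this kind could look roughly like the sketch below. It is a base R illustration of the "relative OR absolute" rule described in the commit message, not the helper committed in this PR: the function body, the default tolerance, and the `NA` handling are all assumptions.

```r
# Hypothetical sketch of an all_equal() helper; not the PR's actual code.
# It passes if all.equal() passes OR every element-wise absolute
# difference is below `tolerance`.
all_equal <- function(x, y, tolerance = sqrt(.Machine$double.eps), ...) {
  res <- all.equal(x, y, tolerance = tolerance, ...)
  if (isTRUE(res)) return(TRUE)
  if (is.numeric(x) && is.numeric(y) && length(x) == length(y)) {
    both_na <- is.na(x) & is.na(y)
    near <- !is.na(x) & !is.na(y) & abs(x - y) < tolerance
    if (all(both_na | near)) return(TRUE)
  }
  res  # the descriptive string from all.equal() on failure
}

# Relative difference is 0.05, absolute difference is 5e-4.
relative_only <- all.equal(0.01, 0.0105, tolerance = 1e-3)
either <- all_equal(0.01, 0.0105, tolerance = 1e-3)

# Both the relative and the absolute difference exceed the tolerance.
far_apart <- all_equal(1, 1.1, tolerance = 1e-3)
```

Here `relative_only` is a diagnostic string because the relative check alone fails, `either` is `TRUE` because the absolute check passes, and `far_apart` is again a diagnostic string.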
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jdblischak @LittleBeannie This PR is ready for review.
For tests comparing exact values (integers, rounded data frames, NULL, lists), use testit's %==% operator instead of all_equal(). This gives better failure diagnostics via str() output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jdblischak
left a comment
I confirmed that the test coverage reported locally by covr::package_coverage() before and after is identical (93.12%) 🎉
A few ergonomic questions:
- Why does {testit} need to be loaded in order to run `test_pkg()`? I couldn't find this in the docs.

      packageVersion("testit")
      ## [1] ‘0.18.5’
      testit::test_pkg()
      ## Error from sys.source2(r, envir = env, top.env = ns)
      ## Error in assert("unstratified population, compared with old version", :
      ##   could not find function "assert"

- Is it possible to suppress the many error messages printed to the R console? I assume these are from tests that use `has_error()`.

      library("testit")
      test_pkg()
      # Lots of error messages printed to the console, eg
      ## Error: missing value where TRUE/FALSE needed
      ## Error: missing value where TRUE/FALSE needed
      ## Error: `times` (c(1, 2, 1)) must be positive and strictly increasing!
      ## Error: `survival` (c("0.5", "NA")) must be positive!
      ## Error: `survival` must be of same length as `times`
      ## Error: `survival` (c(0.5, -0.1)) must be positive!
      ## Error: `survival` must be non-increasing
      ## Error: `survival` must be non-increasing
      # But they all passed
      .Last.value
      ## NULL

- I observed that {testit} stops on the first error. Is there any plan to enable {testit} to collect test errors to spot patterns, or is this out of scope?
- For quick feedback, I often use the argument `filter` of `devtools::test()`, e.g. `devtools::test(filter = "npe")`. Is it possible to do something similar with {testit}?
- Tighten some all.equal() tests to use %==% for exact equality
- Extract lengthy LHS/RHS expressions of %==% and all.equal() into named variables (res, expected) for clarity
- Raise () assertions from inside loops to top-level using vapply()
- Break excessively long lines (>120 chars) into multiple lines
- Add parentheses for operator precedence with %==% (e.g., 3L * 5L)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
53a54f1 to 4d9628c
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8f9ffd3 to 98e7a7c
a856dc8 to 49baf85
…ackages Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
49baf85 to 29c5d05
… comparisons Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove explicit tolerance where default (~1.5e-8) suffices; use scale = 1 for probability comparisons; set minimal explicit tolerance only where cross-implementation differences require it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jdblischak All ergonomic issues have been addressed in testit. Many thanks for all your great suggestions! I think they improved the usability of this package tenfold.
The test "s2pwe fails to identify infinity value" used `times2` (c(1, NA) from a previous test block) instead of `times3` (c(1, Inf)). It appeared to pass because `s2pwe(times = c(1, NA), ...)` does error, but the intent was to test Inf handling. With the correct variable, s2pwe(times = c(1, Inf), ...) returns a valid result (Inf is numeric and positive), so the test is invalid — there is no Inf validation in s2pwe. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Faithfully translate expect_error(expr, "message") from the original testthat tests to has_error(expr, "message") in testit, preserving the original message strings where the function still produces them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jdblischak
left a comment
Given the recent updates to {testit} (thanks @yihui!), I am supportive of this migration of the testing framework.
I observed a slight reduction in the code coverage locally via covr::package_coverage() (93.12% to 92.91%), but according to Codecov the coverage is unchanged at 92.90%.
I would like to delay merging until 1) we switch to tagged versions of yihui/actions, and 2) the updated version of {testit} is uploaded to CRAN.
218e0be to 9022c8a
9022c8a to eb40106
@jdblischak Actions have been tagged. The CRAN release of testit can be made at any time if you don't have further suggestions or requests.
yihui
left a comment
This PR is ready for review. I'll release testit v1.0 to CRAN later today and switch to that version accordingly once it lands on CRAN.
To help reviewers understand the changes better, I need to clarify one extra thing I did in this PR, which I should have saved for another PR but I was not sure if it'd be worth torturing you one more time :) The extra thing was that I tightened the equality tests (i.e. expect_equal()) as much as possible for greater rigor. The logic is the following:
- Switch to `identical()` (or equivalently, `%==%` from testit) whenever possible. If the two objects are strictly identical to each other, we use `%==%` instead of testing for approximate equality.
- When we test things like `nrow()` or `ncol()` that are integers, we also try identical testing by changing the target to an integer, e.g., ``expect_equal(nrow(z1$`_footnotes`), 1)`` is changed to ``(nrow(z1$`_footnotes`) %==% 1L)`` (note the change from double `1` to integer `1L`); this is because `1` is not identical to `1L` although they are "equal".
- Then for the rest of equality tests, try `all.equal()` with default tolerance (i.e. `sqrt(.Machine$double.eps)`). If that works, we just use the default `all.equal()`.
- If the default tolerance is too tight, we raise it to the nearly smallest tolerance possible to make the test pass, e.g., if the actual difference is 0.00085, we use 0.001. In our original tests, tolerances (if explicitly provided) are often too high, e.g., when 0.003 works but we used 0.01.
- When comparing probabilities, we usually use the argument `scale = 1` for testing the absolute difference between two probabilities. This gives us a crystal clear idea about how much the two probabilities differ.
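The rules above can be illustrated with a few lines of base R. This is only a sketch: the probabilities `p_old` and `p_new` are made-up values, not numbers taken from the package's tests.

```r
# A double and an integer are "equal" but not identical, which is why
# integer targets such as 1L are needed for identical()-style tests.
same_type <- identical(1, 1L)          # FALSE
approx_same <- isTRUE(all.equal(1, 1L))

# Default tolerance used by all.equal() for numeric comparison.
default_tol <- sqrt(.Machine$double.eps)

# Made-up probabilities that differ by 1e-4 in absolute terms.
p_old <- 0.025
p_new <- 0.0251

# Without `scale`, the difference is measured relative to p_old (0.004).
relative_check <- all.equal(p_old, p_new, tolerance = 1e-3)

# With scale = 1, the tolerance applies to the absolute difference (1e-4).
absolute_check <- all.equal(p_old, p_new, tolerance = 1e-3, scale = 1)
```

With the same tolerance of `1e-3`, `relative_check` is a diagnostic string while `absolute_check` is `TRUE`, which shows how `scale = 1` changes the meaning of the tolerance.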
BTW, if test coverage is a concern, I think we can easily address it in the next PR.
I'll also add a skill to give AI models instructions on how to write tests with testit in future.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jdblischak
left a comment
The improvements to the tolerances are very welcome! These have long been a source of frustration for me.
Let's merge once the latest {testit} is available from CRAN.
…he tolerance for `ubuntu-latest (release)` to pass
Just noticed this NOTE. Could you please add
Sure. Done.
Reminder to please squash and merge this PR
I'm not sure who the admins of this repo are, but they can go to https://github.com/Merck/gsDesign2/settings and disable "merge commits" and "rebase merging", which is what I do for all my repositories since I rarely need the full commit history of a PR and I always squash and merge:

Summary
- Snapshot tests (`as_gt`, `as_rtf`) converted to testit's markdown-based `.md` format with output embedded inline
- `testthat (>= 3.0.0)` replaced with `testit`
- `test_that()` → `assert()`, `expect_equal()` → `all.equal()`, `expect_identical()` → `%==%`, `expect_error()` → `has_error()`, `expect_true()` → `(expr)`

Motivation
Switching from testthat to testit
Part 1: Migration Guide
Test file structure
| testthat | testit |
| --- | --- |
| `tests/testthat.R` | `tests/*.R` (any name) |
| `tests/testthat/test-*.R` | `tests/testit/test-*.R` |
| `tests/testthat/helper-*.R` | `tests/testit/helper*.R` |
| `tests/testthat/_snaps/*.md` | `tests/testit/test-*.md` |

R runs all `.R` scripts in `tests/` during `R CMD check`. The filename does not matter: `tests/testthat.R` is merely a convention that testthat's tooling creates. testit likewise does not require any specific filename, and you can have multiple runner scripts. For example:

You can also split tests into multiple runner scripts, each calling `test_pkg()` with a different directory:

This provides a natural way to conditionally skip entire groups of tests (the testit equivalent of testthat's `skip_on_cran()`): simply guard the `test_pkg()` call with a condition.

Core pattern
testthat:
testit:
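A minimal sketch of the testit side of this pattern, assuming the testit package is installed. The fact string and the values being checked are made up for illustration, and the commented testthat form is shown only for comparison.

```r
library(testit)

# testthat form, for comparison only (not run):
# test_that("mean() of 1:10 is 5.5", {
#   expect_true(is.numeric(mean(1:10)))
#   expect_equal(mean(1:10), 5.5)
# })

# testit form: ordinary R code inside the braces, with each checked
# condition wrapped in parentheses.
assert("mean() of 1:10 is 5.5", {
  res <- mean(1:10)
  (is.numeric(res))
  (all.equal(res, 5.5))
  (length(res) %==% 1L)
})
```

Only the parenthesized expressions are treated as assertions; the assignment to `res` is plain setup code.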
In testit, any expression wrapped in `()` inside `assert()` is checked: if it evaluates to `TRUE` or a vector of `TRUE` values, it passes; anything else is a failure. The expression can be any R code: `(x > 0)`, `(is.data.frame(df))`, `(nrow(x) == 10)`, etc. For approximate numeric comparison, you can use `(all.equal(a, b))`, which returns `TRUE` on success or a descriptive string on failure, both of which testit handles correctly.

Assertion mappings
| testthat | testit |
| --- | --- |
| `expect_true(x)` | `(x)` |
| `expect_false(x)` | `(!x)` |
| `expect_equal(a, b)` | `(all.equal(a, b))` |
| `expect_equal(a, b, tolerance = t)` | `(all.equal(a, b, tolerance = t))` |
| `expect_identical(a, b)` | `(identical(a, b))` |
| `expect_null(x)` | `(is.null(x))` |
| `expect_length(x, n)` | `(length(x) == n)` |
| `expect_s3_class(x, "cls")` | `(inherits(x, "cls"))` |
| `expect_gt(a, b)` | `(a > b)` |
| `expect_gte(a, b)` | `(a >= b)` |
| `expect_lt(a, b)` | `(a < b)` |
| `expect_lte(a, b)` | `(a <= b)` |
| `expect_named(x, nms)` | `(identical(names(x), nms))` |
| `expect_match(x, pat)` | `(grepl(pat, x))` |
| `expect_no_match(x, pat)` | `(!grepl(pat, x))` |
| `expect_error(expr)` | `(has_error(expr))` |
| `expect_error(expr, "msg")` | `(has_error(expr, "msg"))` |
| `expect_warning(expr)` | `(has_warning(expr))` |
| `expect_warning(expr, "msg")` | `(has_warning(expr, "msg"))` |
| `expect_message(expr)` | `(has_message(expr))` |
| `expect_no_error(expr)` | `(!has_error(expr))` |
| `expect_no_warning(expr)` | `(!has_warning(expr))` |
| `expect_no_message(expr)` | `(!has_message(expr))` |
| `expect_type(x, "t")` | `(typeof(x) == "t")` |
| `expect_vector(x, ptype, size)` | `(vctrs::vec_is(x, ptype, size))` |
| `expect_setequal(a, b)` | `(setequal(a, b))` |
| `expect_in(x, table)` | `(x %in% table)` |
| `expect_contains(x, expected)` | `(expected %in% x)` |
| `expect_mapequal(a, b)` | `(identical(a[order(names(a))], b[order(names(b))]))` |
| `expect_s4_class(x, "cls")` | `(is(x, "cls"))` |
| `expect_output(expr, pat)` | `(grepl(pat, paste(capture.output(expr), collapse = "\n")))` |

The `%==%` operator

testit provides `%==%` as an alias of `identical()`. The advantage over calling `identical()` directly is that when the assertion fails inside `assert()`, it prints `str()` for both the LHS and RHS, so you can immediately spot the differences:

If `(x %==% y)` fails, you'll see output like:

Tolerance handling
testthat edition 2's `expect_equal(a, b, tolerance = t)` has subtle semantics: it passes if either the relative comparison via `all.equal()` passes OR an element-wise absolute check `abs(a - b) < tolerance` passes. If your package relies on this dual behavior, define a helper like:

For most testit tests, plain `all.equal()` with an appropriate tolerance is sufficient.

Snapshot tests
testthat stores snapshots in `tests/testthat/_snaps/test-name/test-description.md`. testit uses a simpler approach: a markdown file `tests/testit/test-name.md` alongside the `.R` file.

Format:
testit runs the R code block and compares its output to the following text block. If they differ, the test fails and shows a diff.
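The run-and-compare idea can be sketched in base R as follows. This is an illustration of the mechanism only, not testit's implementation; `snapshot_matches()` and the example code string are hypothetical.

```r
# Hypothetical helper: evaluate a code string, capture what it prints,
# and compare the captured lines with the recorded output.
snapshot_matches <- function(code, expected) {
  actual <- capture.output(eval(parse(text = code), envir = new.env()))
  identical(actual, expected)
}

code <- 'cat(sprintf("total: %d\\n", sum(1:10)))'

# The recorded output matches what the code prints now.
up_to_date <- snapshot_matches(code, "total: 55")

# A stale recorded output no longer matches, so the check fails.
stale <- snapshot_matches(code, "total: 54")
```

A real snapshot test would additionally report a diff between the recorded and actual output, as described above.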
To initialize a snapshot test, you can omit the output block and only include the R source code. When you run the tests (execute the `.R` scripts under `tests/`, instead of running `R CMD check`), testit will automatically fill in the output, so there is no need to copy and paste results manually.

Conditional test execution
testthat:

    skip_on_cran()
    skip_if_not_installed("pkg")

testit offers three levels of conditional execution:

Skip an entire test directory: guard the `test_pkg()` call in your runner script:

Skip a single assertion: wrap `assert()` in a condition:

Skip the rest of a test file: use an early `return()`:

Since testit files are sourced top-to-bottom, `return()` skips the rest of the file.

Setup and teardown
testthat's `setup()` and `teardown()` functions are superseded. The current recommended approach uses `setup.R` with `withr::defer(..., teardown_env())`:

testit: just use normal R patterns:

Or place shared setup in `helper.R` (sourced before test files).

For file cleanup, `test_pkg()` automatically removes any newly generated files under the test directory after testing completes (controlled by `options(testit.cleanup = TRUE)`, which is the default). This means your `tests/` directory stays clean without manual teardown. Have you ever been annoyed by the stray `Rplots.pdf` in your test folder? You won't suffer from this problem with testit.

DESCRIPTION changes
Remove `Config/testthat/edition: *` if present.

Part 2: Why testit over testthat
Advantages
1. Radical simplicity
testit is ~500 lines of pure R code in total (including comments and blank lines). testthat is ~15,000 lines of R plus some C. testit has five core functions (`assert()`, `test_pkg()`, `has_error()`, `has_warning()`, `has_message()`) and one operator (`%==%`). There is no hidden machinery: you can read the entire source in a few minutes and understand exactly what happens when you run your tests.

2. Tests are just R
Every assertion in testit is a plain R expression. `(x > 0)` means exactly what it says. There is no DSL to learn, no `expect_*` vocabulary to memorize, no argument-order confusion between `expect_equal(object, expected)` vs `all.equal(target, current)`. If you know R, you know testit.

3. No hidden tolerance semantics
testthat has gone through multiple editions with changing comparison behavior (edition 2 uses `all.equal()`, edition 3 uses `waldo::compare()`). The tolerance semantics differ between editions in subtle ways (relative vs. absolute, element-wise vs. mean). With testit, you call `all.equal()` directly with explicit arguments: what you write is what you get. No surprises when upgrading the test framework.

4. Fast installation and CI
testthat pulls in a dependency tree: rlang, waldo, cli, withr, lifecycle, praise, brio, desc, pkgload, ps, processx, callr, R6, evaluate, fansi, magrittr, glue, digest... testit has zero non-base dependencies. This means faster CI installs, fewer breakage vectors, and no transitive dependency conflicts.
5. Transparent failure messages
When a testit assertion fails, it prints the expression verbatim and its result. When `(all.equal(a, b))` fails, you see `"Mean relative difference: 0.05"`, which is the actual return value of `all.equal()`. No formatter stands between you and the diagnostic.

6. Stable across R versions
testit uses only base R features that have been stable for decades. It will not break when R changes something, because it uses almost nothing beyond `tryCatch()`, `withCallingHandlers()`, and `eval()`.

7. Snapshot tests are plain markdown
testit's `.md` snapshots are human-readable documents: a heading (optional), an R code block, and an output block. They can be reviewed in any markdown viewer, diffed with standard tools, and understood without any framework knowledge. testthat's snapshot infrastructure requires `expect_snapshot()`, `snapshot_review()`, `snapshot_accept()`, and produces files that only make sense within testthat's workflow.

Features testthat has that testit doesn't, and why
Mocking (`local_mock()`, `with_mock()`)

Mocking is not a testing framework concern. If you need to substitute function behavior, use `mockr`, inject dependencies through function arguments, or use `trace()`/`untrace()`. A test framework asserting conditions and a system for intercepting function calls are orthogonal responsibilities.

Reporters (progress bars, JUnit XML, etc.)
A test either passes or fails. testit prints failures. If you need CI integration, the exit code (0 = pass, non-zero = fail) is the universal interface. Elaborate progress bars are nice during interactive development but irrelevant for correctness.
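The exit-code convention can be checked with a short base R sketch. It runs plain `stopifnot()` calls in child `Rscript` processes rather than testit itself, and `run_script()` is a hypothetical helper introduced only for this illustration.

```r
# Path to the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Hypothetical helper: run a one-line script in a child process and
# return its exit status, discarding anything it prints.
run_script <- function(code) {
  system2(rscript, c("-e", shQuote(code)), stdout = FALSE, stderr = FALSE)
}

pass_status <- run_script("stopifnot(1 + 1 == 2)")  # assertion holds
fail_status <- run_script("stopifnot(1 + 1 == 3)")  # assertion fails
```

A CI system only needs to distinguish a zero `pass_status` from a non-zero `fail_status`; no reporter is involved.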
`skip_on_cran()`, `skip_on_os()`, `skip_if_not_installed()`

An `if (...) assert()` does the same thing with zero framework overhead. The `skip_*` family is syntactic sugar: useful sugar, but not worth the extra dependencies.

`withr` integration (`local_options()`, `local_envvar()`, etc.)

`withr` is a fine standalone package. You can use it with testit just as easily. But base R's `on.exit()` has done this job since R 1.0. `old <- options(x = y); on.exit(options(old))` is one line, has no dependencies, and is immediately understandable.

`expect_snapshot_file()` for binary/file output

testit's `.md` snapshot mechanism handles text output. For file-based comparisons, read the file into a string and compare it in the `.md` block (as demonstrated with the RTF tests in this conversion). For binary files, `(identical(readBin(f1, "raw", file.size(f1)), readBin(f2, "raw", file.size(f2))))` is explicit and obvious.

`expect_no_error()`, `expect_no_warning()`, `expect_no_condition()`

If code errors, the test already fails: the error propagates and `assert()` catches it. "Expect no error" is the default state. You only need `has_error()` when you want to assert that something does error.

Auto-generated test skeletons, `use_test()`, etc.

These are IDE/usethis conveniences, not framework features. A test file is a plain R script. Create it however you create R scripts.
Summary
testit embodies a philosophy: a test framework should assert conditions and get out of the way. Everything else — mocking, parallelism, reporting, environment management — belongs in separate, purpose-built tools or in base R itself. The result is a testing system that is trivial to understand, impossible to misconfigure, and stable indefinitely.