Chunk Testing

Ben Bond-Lamberty edited this page Sep 24, 2017 · 6 revisions

The new GCAM Data System includes extensive, automatic testing of its chunks. Testing occurs at several different times and places:

The data system driver tests chunks as it runs them, verifying that:

  • Each data product is produced by one and only one chunk.
  • All required inputs are available.
  • Chunks use exactly the input data they promise.
  • Chunks return exactly what output data they promise.
  • There are no circular dependencies.
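Three of the checks above can be illustrated with a minimal sketch in base R. This is not the driver's actual implementation: the `inputs` and `outputs` tables, the chunk and object names, and all three helper functions are hypothetical, and the sketch assumes every input is produced by some chunk (it ignores inputs read from files).

```r
# Hypothetical declarations: one row per (chunk, data object) pair
outputs <- data.frame(
  chunk  = c("chunk_A", "chunk_B", "chunk_C"),
  object = c("L100.x", "L101.y", "L102.z"),
  stringsAsFactors = FALSE
)
inputs <- data.frame(
  chunk  = c("chunk_B", "chunk_C", "chunk_C"),
  object = c("L100.x", "L100.x", "L101.y"),
  stringsAsFactors = FALSE
)

# Check 1: each data product is produced by one and only one chunk.
# Returns the objects that are declared as output more than once.
duplicated_products <- function(outputs) {
  unique(outputs$object[duplicated(outputs$object)])
}

# Check 2: all required inputs are available.
# Returns the inputs that no chunk declares as an output.
missing_inputs <- function(inputs, outputs) {
  setdiff(inputs$object, outputs$object)
}

# Check 3: no circular dependencies.
# Repeatedly "run" every chunk whose inputs are all available. Chunks that
# can never run are returned; they are in a cycle, depend on one, or are
# waiting on a missing input.
unrunnable_chunks <- function(inputs, outputs) {
  remaining <- unique(c(inputs$chunk, outputs$chunk))
  available <- character(0)
  repeat {
    ready <- Filter(function(ch) {
      all(inputs$object[inputs$chunk == ch] %in% available)
    }, remaining)
    if (length(ready) == 0) break
    available <- c(available, outputs$object[outputs$chunk %in% ready])
    remaining <- setdiff(remaining, ready)
  }
  remaining
}
```

With the declarations above, all three helpers return an empty character vector. Adding a row that makes `chunk_A` depend on `L102.z` closes a loop, and `unrunnable_chunks()` then reports all three chunks.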

The testing framework tests that chunks:

  • Correctly handle malformed input data.
  • Correctly respond to messages.
  • Use correctly-formatted globals.
  • Produce outputs that are identical to the old (current) data system. This is discussed in more detail here.
  • NEW (September 2017): the oldnew test now also verifies that the GCAM_DATA_MAP object, an internal data object bundled with the gcamdata package, is up to date, i.e. that it reflects the current dependency structure of the package. (More detail can be found in PR 751.) This is only tested on Travis, so if your PR passes all tests on your machine but fails oldnew on Travis, you probably need to re-run data-raw/generate-package-data.R, which automatically updates and saves GCAM_DATA_MAP.
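The idea behind the "identical outputs" check can be sketched in base R as follows. This is only an illustration, not the oldnew test itself: the function `same_data()`, its numeric tolerance, and the choice to ignore row and column order are all assumptions made for the example.

```r
# Hypothetical helper: TRUE if `new` matches the reference table `old`,
# ignoring row order and column order, and comparing numbers to a tolerance
same_data <- function(new, old, tolerance = 1e-6) {
  if (!setequal(names(new), names(old)) || nrow(new) != nrow(old)) {
    return(FALSE)
  }
  cols <- sort(names(old))
  # Put columns in a common order, then sort rows by every column
  new <- new[do.call(order, unname(as.list(new[cols]))), cols, drop = FALSE]
  old <- old[do.call(order, unname(as.list(old[cols]))), cols, drop = FALSE]
  rownames(new) <- NULL
  rownames(old) <- NULL
  isTRUE(all.equal(new, old, tolerance = tolerance))
}

# Made-up reference output and a reordered copy of it
old <- data.frame(
  region = c("USA", "China"),
  year   = c(2010, 2010),
  value  = c(1.5, 2.5),
  stringsAsFactors = FALSE
)
new <- old[c(2, 1), c("value", "region", "year")]

# The same table with one value changed
changed <- old
changed$value[1] <- 1.6
```

Here `same_data(new, old)` is `TRUE` because only the row and column order differ, while `same_data(changed, old)` is `FALSE`.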

HINT: Some code chunks, such as timeshift (as of 5/20/17), are slow, and you can temporarily remove them from testing as follows:

To remove timeshift from testing, insert a new line 12 into tests/testthat/test_timeshift.R

test_that("chunks handle timeshift", {
    skip("skip")
    hy <- HISTORICAL_YEARS

Note that you will need to revert this change before you make your pull request.

HINT: As the data system outputs get larger, it takes more and more time for RStudio to Build and Reload, because it's saving and restoring your R workspace each time. One slick way around this is to save only the outputs you need to run the chunk you're working on:

driver() -> x
x <- x[my_chunk("DECLARE_INPUTS")]
# From here on, proceed with my_chunk("MAKE", x) as normal
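To make the subsetting step concrete, here is a toy version that runs without the data system. The list `all_data` stands in for the result of driver() and `needed` stands in for the names returned by my_chunk("DECLARE_INPUTS"). Both are made up for illustration, and the up-front check for missing names is an addition, not part of the hint above.

```r
# Stand-in for the driver's result: a named list, one element per data object
all_data <- list(
  L100.x = data.frame(v = 1),
  L101.y = data.frame(v = 2),
  L102.z = data.frame(v = 3)
)

# Stand-in for the input names a chunk declares
needed <- c("L100.x", "L102.z")

# Fail early if a declared input is absent; subsetting a list by a
# missing name would otherwise silently produce a NULL element
absent <- setdiff(needed, names(all_data))
if (length(absent) > 0) {
  stop("Missing inputs: ", paste(absent, collapse = ", "))
}

# Keep only what the chunk needs, then drop the full list so it is
# not saved with the workspace
x <- all_data[needed]
rm(all_data)
```

After this, `x` holds only the two needed objects, so there is far less for RStudio to save and restore.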

No code is accepted into the new data system unless it passes all tests.

You can invoke all the tests on your machine, before submitting a PR:

  • The driver is run from the R console, e.g. driver().
  • The testing suite is started using shift-command-T (in RStudio) or during R CMD check.
  • Additional tests are run when you check the gcamdata package (shift-command-E).

Our "continuous integration" service builds and tests every pull request. In addition to the above tests, it runs a wide range of tests using R CMD check, many of which are listed here.