## Adding Unit Tests to R Packages

In this final chapter, you will learn how to add tests to your package so that you can check your code still runs as expected whenever the package is updated or changed. You will look at how to test that functions produce expected values, and also how to test other aspects of functionality such as expected errors. Once you've written tests for your functions, you'll learn how to run them and what to do when a test fails.

### Setting up the test structure
You can set up a test framework in a package using the function use_testthat().

This will create a tests directory that contains:

A script testthat.R.

A directory testthat.

You save your tests in the tests/testthat/ directory in files with filenames beginning with test-. So, for example, the simutils package has tests named:

test-na_counter.R

test-sample_from_data.R

There are no other strict rules governing the filenames, but you may find it easier to keep track of which functions you are testing if you name your test files after the functions, as in the examples above. Alternatively, you can name your test files after areas of package functionality, for example, putting tests for multiple summary functions in test-summaries.R.

In [3]:
# do not run
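# Set up the test framework
# (older devtools versions accept the package path as shown here; the current
# usethis::use_testthat() takes no path and acts on the active project)
use_testthat("datasummary")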

# Look at the contents of the package root directory
dir("datasummary")

# Look at the contents of the new folder which has been created 
dir("datasummary/tests")
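To make the layout concrete, here is a minimal sketch that builds the same structure by hand in a temporary directory using only base R. It is an illustration, not what use_testthat() literally runs: the contents written to testthat.R are the typical template, and the file test-data_summary.R is a hypothetical empty placeholder.

```r
# Build the skeleton in a temporary directory so nothing real is touched
pkg <- file.path(tempdir(), "datasummary")
dir.create(file.path(pkg, "tests", "testthat"),
           recursive = TRUE, showWarnings = FALSE)

# tests/testthat.R: the script that runs the whole suite during R CMD check
writeLines(
  c("library(testthat)",
    "library(datasummary)",
    "",
    "test_check(\"datasummary\")"),
  file.path(pkg, "tests", "testthat.R")
)

# tests/testthat/: one empty test file, named with the test- prefix
file.create(file.path(pkg, "tests", "testthat", "test-data_summary.R"))

# List everything below tests/
found <- list.files(file.path(pkg, "tests"), recursive = TRUE)
found
```

The listing should contain testthat.R and testthat/test-data_summary.R, matching the description above.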

### Writing an individual test
You create individual tests within your test files using functions named with the pattern expect_*. To make your code easier to read, you may want to create the object to be tested (and/or the expected value, if there is one) before you call the expect_* function.

Here is one of the tests from the simutils package.

air_expected <- c(Ozone = 37, Solar.R = 7, Wind = 0, Temp = 0, Month = 0, Day = 0)

expect_equal(na_counter(airquality), air_expected)

The expect_* functions differ in the number of parameters, but the first parameter is always the object being tested.

When you run your tests, you might notice that there is no output. You will only see an output message if the test has failed.

In [None]:
## do not run
# Create a summary of the iris dataset using your data_summary() function
iris_summary <- data_summary(iris)

# Count how many rows are returned
summary_rows <- nrow(iris_summary)

# Use expect_equal to test that calling data_summary() on iris returns 4 rows
expect_equal(summary_rows, 4)
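The simutils test shown earlier can be tried without the package. The sketch below assumes a stand-in na_counter() that counts missing values per column; the real simutils implementation is not shown in this notebook, so treat the function body as hypothetical. The expected values are the ones given above.

```r
library(testthat)

# Hypothetical stand-in for simutils::na_counter(): NA count per column
na_counter <- function(df) {
  colSums(is.na(df))
}

# Build the expected value first to keep the expectation easy to read
air_expected <- c(Ozone = 37, Solar.R = 7, Wind = 0, Temp = 0, Month = 0, Day = 0)

# Passes silently: the object under test is always the first argument
expect_equal(na_counter(airquality), air_expected)
```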

### Testing for equality
You can use expect_equal(), expect_equivalent() and expect_identical() in order to test whether the output of a function is as expected.

These three functions all have slightly different functionality:

expect_identical() checks that the values, attributes, and type of both objects are the same.

expect_equal() checks that the values and attributes of both objects are the same. You can adjust how strict expect_equal() is by adjusting the tolerance parameter.

expect_equivalent() checks that the values of both objects are the same, ignoring attributes.
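The differences are easiest to see side by side. In the sketch below, passes() is a small helper written for this illustration (it is not part of testthat) that turns a failing expectation into FALSE instead of an error. Note that the third edition of testthat deprecates expect_equivalent() in favour of expect_equal(ignore_attr = TRUE); the older form is used here to match the text.

```r
library(testthat)

# Illustration-only helper: TRUE if the expectation passes, FALSE if it fails
passes <- function(expr) {
  tryCatch({
    force(expr)
    TRUE
  }, error = function(e) FALSE)
}

# Same value, different type (integer vs double)
identical_types  <- passes(expect_identical(1L, 1))
equal_types      <- passes(expect_equal(1L, 1))

# Same value, different attributes (one vector is named)
equal_attrs      <- passes(expect_equal(c(a = 1), 1))
equivalent_attrs <- passes(expect_equivalent(c(a = 1), 1))

# Slightly different values, with and without a looser tolerance
equal_default    <- passes(expect_equal(1, 1.05))
equal_tolerant   <- passes(expect_equal(1, 1.05, tolerance = 0.1))

c(identical_types = identical_types, equal_types = equal_types,
  equal_attrs = equal_attrs, equivalent_attrs = equivalent_attrs,
  equal_default = equal_default, equal_tolerant = equal_tolerant)
```

Under these assumptions, expect_identical() rejects the type mismatch that expect_equal() accepts, expect_equal() rejects the attribute mismatch that expect_equivalent() accepts, and the tolerance argument decides whether the 5% difference counts as equal.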

In [32]:
# install.packages("testthat")
library(testthat)

numeric_summary <- function(x, na.rm) {

    # Only numeric input can be summarised
    if (!is.numeric(x)) {
        stop("data must be numeric")
    }

    # Warn when missing values will propagate into the results
    if (!na.rm && any(is.na(x))) {
        warning("Data contains NA values!")
    }

    data.frame(min = min(x, na.rm = na.rm),
               median = median(x, na.rm = na.rm),
               sd = sd(x, na.rm = na.rm),
               max = max(x, na.rm = na.rm))
}


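data <- runif(20)

result <- numeric_summary(data, TRUE)

# Update this test so it passes. As written it fails, because result$sd is a
# single value while the expected vector has two elements (see the error below)
expect_equal(result$sd, c(0.23, 0.27), tolerance = 0.2)

# Note: a literal such as 0.01L is not a valid integer, so plain numerics are used
expected_result <- list(
    min = c(0.01, 0.02),
    median = c(0.2, 0.5),
    sd = c(0.2, 0.4),
    max = c(0.8, 0.9)
)

# Write a passing test that compares expected_result to result
# (not reached here, because the failing expectation above stops the cell)
expect_equivalent(result, expected_result)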


ERROR: Error: result$sd not equal to c(0.23, 0.27).
Lengths differ: 1 is not 2
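One way to get passing versions of these tests is to use deterministic input, so the expected values are known in advance. The sketch below is an assumption-laden rewrite rather than the exercise's official solution: it swaps the random data for 1:10 and uses the summary values printed for that vector later in this notebook (min 1, median 5.5, sd 3.02765, max 10).

```r
library(testthat)

# Same function as in the cell above, repeated so this block runs on its own
numeric_summary <- function(x, na.rm) {
  if (!is.numeric(x)) {
    stop("data must be numeric")
  }
  if (!na.rm && any(is.na(x))) {
    warning("Data contains NA values!")
  }
  data.frame(min = min(x, na.rm = na.rm),
             median = median(x, na.rm = na.rm),
             sd = sd(x, na.rm = na.rm),
             max = max(x, na.rm = na.rm))
}

# Deterministic input instead of runif(20)
result <- numeric_summary(1:10, na.rm = TRUE)

# result$sd is a single number, so compare it with a single number;
# a 1% tolerance absorbs the rounding in 3.03
expect_equal(result$sd, 3.03, tolerance = 0.01)

# Compare the whole data frame; the tolerance covers the rounded sd
expected_result <- data.frame(min = 1L, median = 5.5, sd = 3.02765, max = 10L)
expect_equal(result, expected_result, tolerance = 1e-5)
```

Both expectations should pass silently. The key change is that the expected value has the same length and structure as the object being tested.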


### Testing errors
You can use expect_error() to test whether running a function throws an error. If the function throws an error, the test passes; otherwise, it fails. You can optionally supply the error message that should be produced, to ensure that you are testing for the correct error.

In [33]:
# Create a vector containing the numbers 1 through 10
my_vector <- 1:10

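# numeric_summary() accepts a numeric vector, so this call succeeds
numeric_summary(my_vector, TRUE)

# data_summary() works on data frames, so calling it on this vector should return an error.
# If data_summary() is not loaded, the test still passes, but only because the function cannot be found
expect_error(data_summary(my_vector))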

min,median,sd,max
1,5.5,3.02765,10
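Supplying the expected message guards against a test that passes for the wrong reason. The sketch below uses a trimmed copy of numeric_summary() (input check plus two statistics only) and an illustration-only helper, passes(), that is not part of testthat. The name not_a_real_function is deliberately undefined.

```r
library(testthat)

# Trimmed copy of numeric_summary(): just enough to trigger the error
numeric_summary <- function(x, na.rm) {
  if (!is.numeric(x)) {
    stop("data must be numeric")
  }
  data.frame(min = min(x, na.rm = na.rm), max = max(x, na.rm = na.rm))
}

# Passes: the call fails and the message matches the regular expression
expect_error(numeric_summary("a", na.rm = TRUE), "data must be numeric")

# Illustration-only helper: TRUE if the expectation passes, FALSE if it fails
passes <- function(expr) {
  tryCatch({
    force(expr)
    TRUE
  }, error = function(e) FALSE)
}

# Without a message, *any* error satisfies the test, even a missing function
wrong_reason <- passes(expect_error(not_a_real_function(1:10)))

# With a message, the "could not find function" error no longer counts
right_reason <- passes(expect_error(not_a_real_function(1:10), "data must be numeric"))

c(wrong_reason = wrong_reason, right_reason = right_reason)
```

The first check passes even though nothing useful was tested, while the second correctly fails.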


### Testing warnings
You can use expect_warning() to test whether running a function produces a warning. If the function produces a warning, the test passes; otherwise, it fails. You can optionally supply the warning message that should be produced, to ensure that you are testing for the correct warning.

The numeric_summary() function defined above issues a warning if na.rm is set to FALSE and the data contains missing values.

In [36]:
my_vector_NA <- c(1, 4, 3, 6, 7, NA, 9, 10)
# Run numeric_summary() on a vector containing NA, with na.rm set to FALSE
numeric_summary(my_vector_NA, FALSE)

# Use expect_warning to formally test this
expect_warning(numeric_summary(my_vector_NA, na.rm = FALSE))

"Data contains NA values!"

min,median,sd,max
NA,NA,NA,NA
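As with errors, you can match the warning text, and you can also check the opposite case, where no warning should appear. The sketch below uses a trimmed copy of numeric_summary() that keeps only the warning branch and two statistics.

```r
library(testthat)

# Trimmed copy of numeric_summary(): only the NA warning and two statistics
numeric_summary <- function(x, na.rm) {
  if (!na.rm && any(is.na(x))) {
    warning("Data contains NA values!")
  }
  data.frame(min = min(x, na.rm = na.rm), max = max(x, na.rm = na.rm))
}

my_vector_NA <- c(1, 4, 3, 6, 7, NA, 9, 10)

# Passes: a warning is raised and its text matches the pattern
expect_warning(numeric_summary(my_vector_NA, na.rm = FALSE), "contains NA")

# Passes: with na.rm = TRUE the call produces no warnings, messages or output
expect_silent(numeric_summary(my_vector_NA, na.rm = TRUE))

# The statistics themselves are NA when missing values are kept
with_na <- suppressWarnings(numeric_summary(my_vector_NA, na.rm = FALSE))
without_na <- numeric_summary(my_vector_NA, na.rm = TRUE)
```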


### Testing non-exported functions
Only exported functions are available once a package is attached with library(), so to call a non-exported function from outside the package you refer to it using the package name, followed by three colons, and then the function name (datasummary:::numeric_summary).

In [None]:
# do not run 
# Expected result
expected <- data.frame(min = 14L, median = 19L, sd = 3.65148371670111, max = 24L)

# Create the variable result by calling numeric_summary() on the Temp column of the weather dataset
result <- datasummary:::numeric_summary(weather$Temp, na.rm = TRUE)

# Test that the value returned matches the expected value
expect_equivalent(result, expected)
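Because the datasummary package is not available in this notebook, the three-colon syntax can be illustrated with a package that ships with R. The sketch below uses t.test.default(), an S3 method defined in stats that is not exported; it stands in for datasummary:::numeric_summary and is chosen only as a convenient example.

```r
# t.test.default() lives in the stats namespace but is not exported
is_exported <- "t.test.default" %in% getNamespaceExports("stats")

# It is still reachable with three colons
fit <- stats:::t.test.default(c(1, 2, 3, 4, 5))

is_exported
fit$estimate
```

The same pattern applies to your own package: the function exists in the namespace, and ::: reaches it even though it is not exported.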

### Grouping tests
So far, you've been using expect_*() functions to create individual tests. To run tests in packages you need to group these individual tests together. You do this using the test_that() function, which groups together expectations that test a specific piece of functionality.

You can use context() to collect these groups together. You usually have one context per file, which makes it easier to work out where failing tests are located. Note that context() is deprecated in the third edition of testthat, where the file name serves as the context instead.

In [None]:
# do not run 

# Use context() and test_that() to group the tests below together

context("Test data_summary()")

test_that("data_summary() handles errors correctly", {

  # Create a vector
  my_vector <- 1:10

  # Use expect_error()
  expect_error(data_summary(my_vector))

  # Use expect_warning()
  expect_warning(data_summary(airquality, na.rm = FALSE))

})
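Here is a runnable sketch of grouped tests. Since the package's data_summary() is not defined in this notebook, the block assumes a hypothetical stand-in that summarises each numeric column of a data frame, stops on non-data-frame input, and warns about missing values when na.rm is FALSE. context() is left out because it is deprecated in the third edition of testthat.

```r
library(testthat)

# Hypothetical stand-in for the package's data_summary()
data_summary <- function(x, na.rm = TRUE) {
  if (!is.data.frame(x)) {
    stop("x must be a data frame")
  }
  num <- x[vapply(x, is.numeric, logical(1))]
  if (!na.rm && anyNA(num)) {
    warning("Data contains NA values!")
  }
  # One row per numeric column
  data.frame(
    min = vapply(num, min, numeric(1), na.rm = na.rm),
    median = vapply(num, median, numeric(1), na.rm = na.rm),
    sd = vapply(num, sd, numeric(1), na.rm = na.rm),
    max = vapply(num, max, numeric(1), na.rm = na.rm)
  )
}

# Group 1: expectations about the shape of the output
test_that("data_summary() returns one row per numeric column", {
  expect_equal(nrow(data_summary(iris)), 4)
  expect_equal(names(data_summary(iris)), c("min", "median", "sd", "max"))
})

# Group 2: expectations about errors and warnings
test_that("data_summary() handles errors correctly", {
  expect_error(data_summary(1:10), "data frame")
  expect_warning(data_summary(airquality, na.rm = FALSE), "NA values")
})
```

Each test_that() call carries a description, so a failure is reported against a named piece of functionality rather than a bare expectation.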

### Executing unit tests
With your test scripts saved in the package structure, you can easily re-run your tests at any time using the test() function in devtools. This function looks for all tests located in the tests/testthat or inst/tests directory with filenames beginning with test- and ending in .R, and executes each of them. As with the other devtools functions, you supply the path to the package as the first argument to the test() function.

In [None]:
# do not run 
# Run the tests on the datasummary package
test("datasummary")

# Final output 
== Results =====================================================================
Duration: 0.1 s

OK:       7
Failed:   1
Warnings: 1
Skipped:  0
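To see a run with a failing test without building a package, the sketch below writes a small test file to a temporary directory and runs it with testthat::test_dir(). This is a stand-in for devtools::test(), not the same function: test_dir() runs the test files in a directory directly. The file name and the tests inside it are made up for the illustration, and one test fails on purpose.

```r
library(testthat)

# A throwaway directory holding one test file
demo_dir <- file.path(tempdir(), "testthat-demo")
dir.create(demo_dir, showWarnings = FALSE)

writeLines(c(
  "test_that(\"addition works\", {",
  "  expect_equal(1 + 1, 2)",
  "})",
  "",
  "test_that(\"this test fails on purpose\", {",
  "  expect_equal(1 + 1, 3)",
  "})"
), file.path(demo_dir, "test-arithmetic.R"))

# Run every test-*.R file; keep going after failures so results can be inspected
results <- test_dir(demo_dir, reporter = "silent", stop_on_failure = FALSE)

# One row per test_that() block, with counts of expectations and failures
summary_df <- as.data.frame(results)
summary_df[, c("file", "test", "nb", "failed")]
```

When a test fails, the file and test columns tell you where to look: either the function has a bug, or the expected value in the test is out of date and needs updating.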