Make specifying url DRY
Add tests for safe_to_download
joelnitta committed Dec 12, 2023
1 parent e7f7130 commit a8da24d
Showing 4 changed files with 42 additions and 17 deletions.
7 changes: 5 additions & 2 deletions R/utils.R
@@ -292,14 +292,17 @@ is_unique <- function(x, allow_na = TRUE) {
  #'
  #' @param url Character vector of length 1; URL pointing to zip file to
  #' download ie "https://data.canadensys.net/ipt/archive.do?r=vascan&v=37.12"
+ #' @param online Logical vector of length 1; is this computer connected to the
+ #' internet? Defaults to curl::has_internet(), but provided for testing
+ #' purposes.
  #'
  #' @return Logical vector of length 1.
  #' @noRd
  #' @autoglobal
- safe_to_download <- function(url) {
+ safe_to_download <- function(url, online = curl::has_internet()) {
    get_safely <- purrr::safely(httr::GET)
    # Check for internet connection
-   if (!curl::has_internet()) {
+   if (!online) {
      return(FALSE)
    }
    # Check for functioning URL
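Note on the `online` argument: passing the connectivity flag as an argument with a default lets a test exercise the offline branch without disconnecting the machine. The base-R sketch below illustrates the pattern only. `is_reachable()` and its `check` argument are hypothetical names, not part of dwctaxon, and the stubbed `check` stands in for the HTTP request the real function makes.

```r
# Hypothetical illustration of injecting a connectivity flag through a
# default argument; this is not the dwctaxon implementation.
is_reachable <- function(url, online = TRUE, check = function(u) TRUE) {
  # Return early when offline; `url` is not evaluated on this path
  if (!online) {
    return(FALSE)
  }
  # Delegate the actual check to an injectable function (assumption:
  # the real code would issue an HTTP request here)
  isTRUE(check(url))
}

# Simulate being offline without touching the network
offline_result <- is_reachable("https://example.org/archive.zip", online = FALSE)

# Simulate a URL whose check fails
failed_result <- is_reachable(
  "https://example.org/missing.zip",
  check = function(u) FALSE
)
```

Because R evaluates arguments lazily, the early return means the `url` argument is never forced when `online = FALSE`. The same would hold for any function that returns before touching `url`.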
5 changes: 5 additions & 0 deletions inst/extdata/vascan_url.R
@@ -0,0 +1,5 @@
+ # The URL for the Vascular Plants of Canada is used in multiple places,
+ # so specify it here and source it as needed.
+
+ ## ---- set-url
+ vascan_url <- "https://data.canadensys.net/ipt/archive.do?r=vascan&v=37.12"
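Note on sharing the value: the `## ---- set-url` line is a chunk label that `knitr::read_chunk()` recognises, while plain R treats it as an ordinary comment, so the same file can also be loaded with `source()`. The sketch below shows only the `source()` side and uses base R alone. The temporary file, `example_url`, and the placeholder URL are assumptions for illustration and stand in for `inst/extdata/vascan_url.R`.

```r
# Write a small shared script to a temporary file (a stand-in for
# inst/extdata/vascan_url.R; the URL below is a placeholder)
shared_script <- tempfile(fileext = ".R")
writeLines(
  c(
    "## ---- set-url",
    "example_url <- \"https://example.org/archive.zip\""
  ),
  shared_script
)

# Load the value with source(); the chunk label line is just a comment to R.
# Sourcing into a dedicated environment avoids writing to the global one.
env <- new.env()
source(shared_script, local = env)
env$example_url
```

Keeping the value in one file means a version bump in the URL is a one-line change, at the cost of one extra `source()` or `read_chunk()` call wherever it is used.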
13 changes: 13 additions & 0 deletions tests/testthat/test-utils.R
@@ -281,7 +281,20 @@ test_that("check_fill_usage_id_name() works", {
  })

  test_that("Check for zip file ready to download works", {
+   # Simulate being offline
+   expect_equal(
+     safe_to_download(vascan_url, online = FALSE),
+     FALSE
+   )
+   # Rest of tests require an internet connection
    skip_if_offline(host = "r-project.org")
+   # URL used in vignette should work
+   # - load URL
+   source(system.file("extdata", "vascan_url.R", package = "dwctaxon"))
+   expect_equal(
+     safe_to_download(vascan_url),
+     TRUE
+   )
    # Check for 404
    expect_equal(
      safe_to_download(
34 changes: 19 additions & 15 deletions vignettes/real-data.Rmd
@@ -15,6 +15,8 @@ knitr::opts_chunk$set(
  # Increase width for printing tibbles
  old <- options(width = 140)
+ knitr::read_chunk(system.file("extdata", "vascan_url.R", package = "dwctaxon"))
  ```

  This vignette demonstrates using dwctaxon on "real life" data found in the wild. Our goal is to import the data and validate it.
@@ -34,10 +36,23 @@ We will use the [Database of Vascular Plants of Canada (VASCAN)](http://data.can

  The data can be obtained manually by going to the [VASCAN website](http://data.canadensys.net/ipt/resource.do?r=vascan), downloading the Darwin Core Archive, and unzipping it^[If you download the data manually, it may be a different version than the one used here, v37.12].

+ Alternatively, it can be downloaded and unzipped with R. First, we set up some temporary folders for downloading and specify the URL:
+
+ ```{r download-setup}
+ # - Specify temporary folder for downloading data
+ temp_dir <- tempdir()
+ # - Set name of zip file
+ temp_zip <- paste0(temp_dir, "/dwca-vascan.zip")
+ # - Set name of unzipped folder
+ temp_unzip <- paste0(temp_dir, "/dwca-vascan")
+ ```
+
+ ```{r, set-url}
+ ```
+
  ```{r, echo = FALSE, results = "asis"}
  # Check if file can be downloaded safely, quit early if not
- # Make sure this URL matches the one in the next chunk
- vascan_url <- "https://data.canadensys.net/ipt/archive.do?r=vascan&v=37.12"
  if (!dwctaxon:::safe_to_download(vascan_url)) {
    cat(
      paste0(
@@ -50,22 +65,11 @@ if (!dwctaxon:::safe_to_download(vascan_url)) {
  }
  ```

- Alternatively, it can be downloaded and unzipped with R:
+ Next, download and unzip the zip file.

  ```{r download-unzip}
- # Set up folders:
- # - Specify temporary folder for downloading data
- temp_dir <- tempdir()
- # - Set name of zip file
- temp_zip <- paste0(temp_dir, "/dwca-vascan.zip")
- # - Set name of unzipped folder
- temp_unzip <- paste0(temp_dir, "/dwca-vascan")
  # Download data
- download.file(
-   url = "https://data.canadensys.net/ipt/archive.do?r=vascan&v=37.12",
-   destfile = temp_zip, mode = "wb"
- )
+ download.file(url = vascan_url, destfile = temp_zip, mode = "wb")
  # Unzip
  unzip(temp_zip, exdir = temp_unzip)
@@ -74,7 +78,7 @@ unzip(temp_zip, exdir = temp_unzip)
  list.files(temp_unzip)
  ```

- Next, load the taxonomic data (`taxon.txt`) into R. It is a tab-separated text file, so we use `readr::read_tsv()` to load it.
+ Finally, load the taxonomic data (`taxon.txt`) into R. It is a tab-separated text file, so we use `readr::read_tsv()` to load it.

  ```{r load-data}
  vascan <- read_tsv(paste0(temp_unzip, "/taxon.txt"))