Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use_course() #196

Merged
merged 52 commits into from Jan 13, 2018
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
52 commits
Select commit Hold shift + click to select a range
259ef06
Import curl and httr
jennybc Jan 8, 2018
d324e7e
Download a zip file from dropbox or github (and ONLY from there)
jennybc Jan 8, 2018
6806711
Work on download_zip()
jennybc Jan 9, 2018
09cb6cd
Better regex substitutions
jennybc Jan 9, 2018
66da780
Fix mock
jennybc Jan 9, 2018
206c3d9
Skip a test on R < 3.2
jennybc Jan 9, 2018
67ee075
Use `done()`, position examples and links for future me
jennybc Jan 9, 2018
03483e7
Filename sanitization
jennybc Jan 9, 2018
a51377f
Delete message
jennybc Jan 9, 2018
4f956d3
Fix a test on Windows
jennybc Jan 9, 2018
5cb4ffb
Account for hexadecimal casing in R 3.1.3
jennybc Jan 10, 2018
86e50e2
Clarify notes re: Windows filenames
jennybc Jan 10, 2018
b6eb1de
First pass at tidy_unzip()
jennybc Jan 10, 2018
63599e9
Make a yep() and a nope()
jennybc Jan 10, 2018
861a10d
Add `destdir` to download_zip(), worry about overwriting existing file
jennybc Jan 10, 2018
32999d9
Add `pedantic` argument; care with parent when `destdir` specified
jennybc Jan 11, 2018
3416bcf
Put notes into the R script
jennybc Jan 11, 2018
61f1151
Put demo / manual tests in the package
jennybc Jan 11, 2018
f5e806d
Document download_zip()
jennybc Jan 11, 2018
4a5fa85
Oops
jennybc Jan 11, 2018
e043909
Add test
jennybc Jan 11, 2018
13079a6
Test subjects for tidy_unzip()
jennybc Jan 11, 2018
5e39b29
Delete check_host()
jennybc Jan 11, 2018
d888730
Handle lack of content-disposition header
jennybc Jan 11, 2018
f88b79a
(Attempt to) open target in file manager
jennybc Jan 11, 2018
44c5b17
Work on download_zip() docs
jennybc Jan 11, 2018
3fc5d1b
More docs re: download_zip() and tidy_unzip()
jennybc Jan 11, 2018
23cec5c
Progress on tidy_unzip()
jennybc Jan 11, 2018
732e59a
Re-run manual download_zip() tests
jennybc Jan 11, 2018
d681f96
Add a few ad hoc manual tests of tidy_unzip()
jennybc Jan 11, 2018
984d4c0
Function to identify unique top directory in a ZIP archive
jennybc Jan 12, 2018
f10bd66
More ZIP files I plan to test
jennybc Jan 12, 2018
04c1326
Tests for keep() and top_directory()
jennybc Jan 12, 2018
6b69104
Attend to paths and interactivity in tidy_unzip()
jennybc Jan 12, 2018
be5af8a
Record a few manual tests
jennybc Jan 12, 2018
0c0afbd
Helper to get path to test file
jennybc Jan 12, 2018
8f847df
Test tidy_unzip()
jennybc Jan 12, 2018
19fa24b
Record more manual testing
jennybc Jan 12, 2018
08520fb
Don't test a message emitted by curl
jennybc Jan 12, 2018
d205a37
Export and document use_course() and friends
jennybc Jan 12, 2018
83b8ad7
Editing
jennybc Jan 12, 2018
e713960
Move interactive operation into interactive block
jennybc Jan 12, 2018
1e5e121
Use the github zipball URL I meant to (another https one)
jennybc Jan 12, 2018
9914861
Don't export but do document download_zip() and tidy_unzip()
jennybc Jan 12, 2018
6d5d8d3
invisibly, invisibly, invisibly
jennybc Jan 12, 2018
b3b58ca
Add actual bit.ly example to use_course()
jennybc Jan 12, 2018
d06440d
Default to ~/Desktop
jennybc Jan 12, 2018
174b61d
More likely to open correct folder
jennybc Jan 13, 2018
0bde0d7
Work on docs
jennybc Jan 13, 2018
5275eff
Use @jimhester's improved keep function
jennybc Jan 13, 2018
dedd02d
Make docs match code
jennybc Jan 13, 2018
9c90e1a
Merge adjacent interactive blocks
jennybc Jan 13, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 2 additions & 0 deletions DESCRIPTION
Expand Up @@ -21,9 +21,11 @@ Imports:
clipr,
clisymbols,
crayon,
curl,
desc,
gh,
git2r,
httr,
rematch2,
rmarkdown,
rprojroot,
Expand Down
88 changes: 88 additions & 0 deletions R/course.R
@@ -0,0 +1,88 @@
download_zip <- function(url) {
stopifnot(is_string(url))
dl <- curl::curl_fetch_memory(url)

httr::stop_for_status(dl$status_code)
check_host(dl$url)
check_is_zip(dl)

cd <- content_disposition(dl)

filename <- make_filename(cd, fallback = basename(url))
message("filename:\n", filename)

writeBin(dl$content, filename)
invisible(filename)
}

check_host <- function(url) {
## one regex per ZIP file host we are prepared to handle
hosts <- c(
dropbox = "^https://dl.dropboxusercontent.com/content_link_zip/",
github = "^https://codeload.github.com"
)
m <- vapply(hosts, function(regex) grepl(regex, x = url), logical(1))
if (!any(m)) {
stop("Download URL has unrecognized form:\n", value(url), call. = FALSE)
}
invisible()
}

check_is_zip <- function(download) {
headers <- curl::parse_headers_list(download$headers)
if (headers[["content-type"]] != "application/zip") {
stop(
"Download does not have MIME type ", value("application/zip"), "\n",
"Instead it's ", value(headers[["content-type"]]), call. = FALSE
)
}
invisible()
}

## https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition
## https://tools.ietf.org/html/rfc6266
content_disposition <- function(download) {
headers <- curl::parse_headers_list(download$headers)
parse_content_disposition(headers[["content-disposition"]])
}

parse_content_disposition <- function(cd) {
if (!grepl("^attachment;", cd)) {
stop(
code("Content-Disposition"), " header doesn't start with ",
value("attachment"), "\n",
"Actual header: ", value(cd), call. = FALSE
)
}
message("content-disposition:\n", cd)

cd <- gsub("^attachment;\\s*", "", cd, ignore.case = TRUE)
Copy link
Member

@jimhester jimhester Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be sub() it will only match (once) at the start of the string.

cd <- strsplit(cd, "\\s*;\\s*")[[1]]
cd <- strsplit(cd, "=")
stats::setNames(
vapply(cd, `[[`, character(1), 2),
vapply(cd, `[[`, character(1), 1)
)
}

make_filename <- function(cd,
fallback = basename(tempfile())) {
## TO DO(jennybc): the element named 'filename*' is preferred but I'm not
## sure how to parse it yet, so targetting 'filename' for now
## https://tools.ietf.org/html/rfc6266
cd <- cd[["filename"]]
if (is.null(cd) || is.na(cd)) {
stopifnot(is_string(fallback))
return(sanitize_filename(fallback))
}

## I know I could use regex and lookahead but this is easier for me to
## maintain
if (grepl("^\"", cd) && grepl("\"$", cd)) {
cd <- gsub("^\"(.+)\"$", "\\1", cd)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be sub() not gsub(), you are only doing at most one substitution per string, and the conditional is not needed, just use the substitution directly.

test <- function(cd) {
  sub("^\"(.+)\"$", "\\1", cd)
}
test("foo/bar")
#> [1] "foo/bar"
test('"foo/bar"')
#> [1] "foo/bar"
test('foo/"bar"')
#> [1] "foo/\"bar\""

Created on 2018-01-09 by the reprex package (v0.1.1.9000).

}

sanitize_filename(cd)
}

sanitize_filename <- function(x) x
4 changes: 2 additions & 2 deletions R/helpers.R
Expand Up @@ -92,8 +92,8 @@ use_description_field <- function(name,
}

use_dependency <- function(package, type, version = "*") {
stopifnot(is.character(package), length(package) == 1)
stopifnot(is.character(type), length(type) == 1)
stopifnot(is_string(package))
stopifnot(is_string(type))

if (package != "R" && !requireNamespace(package, quietly = TRUE)) {
stop(package, " must be installed before you can take a dependency on it",
Expand Down
4 changes: 4 additions & 0 deletions R/utils.R
Expand Up @@ -69,3 +69,7 @@ is_testing <- function() {
interactive <- function() {
base::interactive() && !is_testing()
}

is_string <- function(x) {
length(x) == 1 && is.character(x)
}
88 changes: 88 additions & 0 deletions tests/testthat/test-course.R
@@ -0,0 +1,88 @@
context("use_course")

test_that("check_host() screens for DropBox and GitHub .zip download URLs", {
expect_error_free(check_host(
"https://dl.dropboxusercontent.com/content_link_zip/12345/file"
))
expect_error_free(check_host(
"https://codeload.github.com/USER/REPO/zip/master"
))

## a regular sharing link for a folder
expect_error(check_host(
"https://www.dropbox.com/sh/12345/67890?dl=0",
"URL has unrecognized form"
))
## GitHub URLs: browser, ssh, https
expect_error(
check_host("https://github.com/USER/REPO"),
"URL has unrecognized form"
)
expect_error(
check_host("git@github.com:USER/REPO.git"),
"URL has unrecognized form"
)
expect_error(
check_host("https://github.com/USER/REPO.git"),
"URL has unrecognized form"
)
})

test_that("check_is_zip() errors if MIME type is not 'application/zip'", {
with_mock(
check_host = function(url) NULL,
expect_error(
download_zip(
"https://cran.r-project.org/src/contrib/rematch2_2.0.1.tar.gz"
),
"does not have MIME type"
)
)
})

test_that("parse_content_disposition() parses Content-Description", {
## typical DropBox
expect_identical(
parse_content_disposition(
"attachment; filename=\"usethis-test.zip\"; filename*=UTF-8''usethis-test.zip\""
),
c(
"filename" = "\"usethis-test.zip\"",
"filename*" = "UTF-8''usethis-test.zip\""
)
)
## typical GitHub
expect_identical(
parse_content_disposition("attachment; filename=buzzy-master.zip"),
c("filename" = "buzzy-master.zip")
)
})

test_that("parse_content_disposition() errors on ill-formed `content-disposition` header", {
expect_error(
parse_content_disposition("aa;bb=cc;dd"),
"doesn't start with"
)
})

test_that("make_filename() gets name from `content-disposition` header", {
## DropBox
expect_identical(
make_filename(
c(
"filename" = "\"usethis-test.zip\"",
"filename*" = "UTF-8''usethis-test.zip\""
)
),
"usethis-test.zip"
)
## GitHub
expect_identical(
make_filename(c("filename" = "buzzy-master.zip")),
"buzzy-master.zip"
)
})

test_that("make_filename() uses fallback if no `content-disposition` header", {
expect_match(make_filename(NULL), "^file[0-9a-z]+$")
})