Unexpected error when reading directly from URL of compressed file #891

Closed
cboettig opened this issue Sep 20, 2018 · 3 comments

@cboettig cboettig commented Sep 20, 2018

If I do:

readr::read_tsv("https://github.com/cboettig/taxald/releases/download/v1.0.0/data.2fitis_hierarchy.tsv.bz2")

I see the somewhat cryptic error:

Error in make.names(x) : invalid multibyte string 1

Using download.file on the URL first resolves this error.

I'm guessing this is related to how readr guesses the file compression type? Would it make sense for readr to check the file extension in this case, since a magic number isn't available?
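
For context, a magic number is the fixed byte sequence at the start of a compressed file. The sketch below shows that kind of check on a local file. `detect_compression()` is a hypothetical helper written for illustration and is not readr's internal code.

```r
# Hypothetical helper (not part of readr): guess the compression format of a
# local file from its leading bytes.
detect_compression <- function(path) {
  # readBin() on a file path opens it in binary mode, so we get the raw
  # (still compressed) bytes rather than decompressed content.
  magic <- readBin(path, what = "raw", n = 6L)

  starts_with <- function(signature) {
    signature <- as.raw(signature)
    length(magic) >= length(signature) &&
      identical(magic[seq_along(signature)], signature)
  }

  if (starts_with(c(0x1f, 0x8b))) {
    "gzip"
  } else if (starts_with(c(0x42, 0x5a, 0x68))) {  # ASCII "BZh"
    "bzip2"
  } else if (starts_with(c(0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00))) {
    "xz"
  } else {
    "none"
  }
}
```

For a remote file those bytes only become available once reading has started, so the file extension is the cheaper signal to check up front.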

@yutannihilation yutannihilation commented Sep 21, 2018

In the current implementation, when file is a URL, only .gz is checked and supported.

readr/R/source.R

Lines 122 to 134 in 05890c3

if (is_url(path)) {
  if (requireNamespace("curl", quietly = TRUE)) {
    con <- curl::curl(path)
  } else {
    message("`curl` package not installed, falling back to using `url()`")
    con <- url(path)
  }
  if (identical(tools::file_ext(path), "gz")) {
    return(gzcon(con))
  } else {
    return(con)
  }
}

I guess this is because R provides gzcon() for gzip but has no equivalent for the other compression formats. Maybe readr should fall back to downloading the file and reading the local copy when the connection cannot be read directly?
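
A rough sketch of what such a fallback could look like, using only base R. `open_remote()` is a hypothetical name, the set of handled extensions is an assumption, and this is not what readr implements.

```r
# Hypothetical sketch of the fallback idea; this is not readr's implementation.
open_remote <- function(path) {
  ext <- tools::file_ext(path)

  if (identical(ext, "gz")) {
    # gzip can be decompressed on the fly from a streaming connection
    return(gzcon(url(path)))
  }

  if (ext %in% c("bz2", "xz")) {
    # No gzcon() counterpart for these formats, so download to a
    # temporary file and open that with the matching file connection.
    tmp <- tempfile(fileext = paste0(".", ext))
    utils::download.file(path, tmp, mode = "wb", quiet = TRUE)
    return(if (ext == "bz2") bzfile(tmp) else xzfile(tmp))
  }

  # Anything else is treated as uncompressed and read straight from the URL
  url(path)
}
```

The trade-off is that the bz2 and xz branches need disk space for the whole file before any parsing starts, whereas the gz branch can stream.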

@jimhester jimhester added the bug label Nov 13, 2018
@jimhester jimhester closed this in 542fa32 Nov 14, 2018

@jimhester jimhester commented Nov 14, 2018

The error message you receive in this case is now more informative.

readr::read_tsv("https://github.com/cboettig/taxald/releases/download/v1.0.0/data.2fitis_hierarchy.tsv.bz2")
#> Error: Reading from remote `bz2` compressed files is not supported,
#>   download the files locally first.
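
A minimal sketch of the "download locally first" workaround, assuming readr detects the compression of the local copy as reported earlier in the thread. `read_remote_compressed()` and its `reader` argument are hypothetical names introduced for illustration.

```r
# Hypothetical helper (not part of readr): download a remote compressed file,
# then read the local copy.
read_remote_compressed <- function(url, reader = readr::read_tsv, ...) {
  # Keep the original file name as a suffix so the .bz2 extension survives
  tmp <- tempfile(fileext = paste0("-", basename(url)))
  # mode = "wb" avoids corrupting binary data on Windows
  utils::download.file(url, tmp, mode = "wb", quiet = TRUE)
  reader(tmp, ...)
}

# Usage (requires network access and the readr package):
# read_remote_compressed(
#   "https://github.com/cboettig/taxald/releases/download/v1.0.0/data.2fitis_hierarchy.tsv.bz2"
# )
```

The downloaded copy is left in the session's temporary directory, which R removes when the session ends.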

@lock lock bot commented May 13, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators May 13, 2019