Suggestion: add remote connection for unzip() #39

Open
ArtemSokolov opened this issue Aug 14, 2019 · 3 comments
Labels
feature a feature request or enhancement

Comments

@ArtemSokolov

In R, it is generally possible to read remote files directly. For example,

rmt1 <- file.path("https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5674813",
                  "bin/13024_2017_219_MOESM2_ESM.tsv")
read.delim( rmt1 )
#     Gene    Module
#  1 RPH3A turquoise
#  2  PLEC turquoise
#  3  DLG4 turquoise
#  4 SEPT5 turquoise
#  5 PLCB1 turquoise
#  6 ACTN2 turquoise
# ...

However, unzipping a remote file always requires that the file is first downloaded:

rmt2 <- file.path("https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5674813",
                  "bin/13024_2017_219_MOESM5_ESM.zip")
download.file( rmt2, "local.zip" )
zip::unzip( "local.zip", "Data 1.tsv" )
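
The download-then-extract steps above can be hidden behind a small wrapper. This is only a minimal sketch: `unzip_remote()` is a hypothetical name, not a function in the zip package, and it still downloads the whole archive to a temporary file before extracting.

```r
# Hypothetical helper (not part of the zip package): download a remote
# archive to a temporary file, then extract from that local copy.
unzip_remote <- function(url, files = NULL, exdir = ".") {
  tmp <- tempfile(fileext = ".zip")
  # Remove the temporary archive when the function exits
  on.exit(unlink(tmp), add = TRUE)
  # mode = "wb" requests a binary transfer explicitly
  utils::download.file(url, tmp, mode = "wb", quiet = TRUE)
  zip::unzip(tmp, files = files, exdir = exdir)
}

# Usage with the archive from the example above:
# unzip_remote(rmt2, "Data 1.tsv")
```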

I think it would streamline workflows if files could be extracted directly through a remote connection (similar to how read.delim() works):

zip::unzip( rmt2, "Data 1.tsv" )
# Error in zip::unzip(rmt2, "Data 1.tsv") :
#   zip error: `Cannot open zip file `https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5674813/bin/13024_2017_219_MOESM5_ESM.zip` for reading` in file `zip.c:238`
# In addition: Warning message:
# In normalizePath(zipfile) :
#   path[1]="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5674813/bin/13024_2017_219_MOESM5_ESM.zip": No such file or directory
@gaborcsardi
Member

This is not (easily) possible, because R core does not let us use the R connection API in packages.

@ArtemSokolov
Author

Thanks for the quick response, @gaborcsardi! I'm not very familiar with the zip format, but perhaps readBin() could be used for the task?

@gaborcsardi
Member

I understand that this would make sense from an API standpoint, but I believe the implementation could not be streaming anyway, because unzip needs random access to parts of the file.
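
To illustrate the random-access point: a zip archive keeps its index (the central directory) near the end of the file, so a reader typically seeks to the tail first and then jumps back to the individual entries. The following is a minimal sketch in base R, assuming a local, seekable file. `read_eocd()` is a hypothetical helper name, not part of the zip package, and it does not handle Zip64 archives.

```r
# Hypothetical helper: locate the end-of-central-directory (EOCD) record
# of a local zip file and report where the central directory lives.
read_eocd <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size < 22) stop("file is too small to be a zip archive")
  con <- file(path, "rb")
  on.exit(close(con), add = TRUE)
  # The EOCD record is 22 bytes plus an optional comment of up to 65535 bytes,
  # so only the tail of the file needs to be read.
  tail_len <- min(size, 22 + 65535)
  seek(con, where = size - tail_len, origin = "start")
  buf <- readBin(con, what = "raw", n = tail_len)
  sig <- as.raw(c(0x50, 0x4b, 0x05, 0x06))  # "PK\005\006"
  # Scan backwards for the EOCD signature
  hit <- NA
  for (i in seq(tail_len - 21, 1)) {
    if (identical(buf[i:(i + 3)], sig)) {
      hit <- i
      break
    }
  }
  if (is.na(hit)) stop("no end-of-central-directory record found")
  rec <- buf[hit:(hit + 21)]
  # Decode an unsigned little-endian integer from raw bytes
  le <- function(bytes) sum(as.integer(bytes) * 256^(seq_along(bytes) - 1))
  list(
    entries   = le(rec[11:12]),  # total number of entries
    cd_size   = le(rec[13:16]),  # size of the central directory in bytes
    cd_offset = le(rec[17:20])   # offset of the central directory from file start
  )
}

# Usage on a downloaded archive:
# read_eocd("local.zip")
```

Because `seek()` needs a seekable source, this works on a local file but not on a plain sequential read of a remote URL. The reported entry count can be compared against the output of `zip::zip_list()` as a sanity check.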

@gaborcsardi gaborcsardi added the feature a feature request or enhancement label May 8, 2020