Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to download zip of multiple files #47

Merged
merged 3 commits into from
Jan 28, 2020

Conversation

adam3smith
Copy link
Contributor

@adam3smith adam3smith commented Jan 10, 2020

Please ensure the following before submitting a PR:

  • if changing documentation, edit files in /R not /man and run devtools::document() to update documentation
  • add code or new test files to /tests for any new functionality or bug fix
  • make sure R CMD check runs without error before submitting the PR

This PR does three things:

  1. Fixes a regression from 5ec375b that broke the ability to specify a vector of fileids to download
  2. Documents and test said functionality
  3. Adds an additional parameter unzip to get_file() to give users the option to download individual files (the current behavior) or a single .zip file (my preferred behavior). (see below)

I'll add more tests if you're OK with the general approach here, but wanted to check first. I'd also need access to the dataverse on demo to properly test (all tests using the demo DV currently fail as I assume you know)

@wibeasley
Copy link
Contributor

wibeasley commented Jan 11, 2020

This is cool. For now, do you mind submitting a PR that fixes the regression, but leaves out the zip function until we get some consensus around #48?

fix multi-download for datasets with folders and change logic
@adam3smith
Copy link
Contributor Author

I've reverted the zip feature as requested and will chime in on #48

In writing tests, I noticed that the current function, which was written before zip downloads included folder hierarchies, doesn't work for datasets with folders. Given that @pdurbin also suggests in #46 that there are downsides to requesting the zip and unpacking it, I've re-written the logic to download the files sequentially instead, not using the zip functionality of the API at all. That passes all tests and seems to perform faster even with small datasets.

If this looks good, I can clean up a bit more, but I think the tests are pretty good as they are.

@pdurbin
Copy link
Member

pdurbin commented Jan 11, 2020

I've re-written the logic to download the files sequentially instead, not using the zip functionality of the API at all.

That's what I ultimately recommended at jupyterhub/repo2docker#739 (comment) 😄

@adam3smith
Copy link
Contributor Author

(I realize there's a problem with the test, will fix this after we've agreed on the rest of this)

@adam3smith
Copy link
Contributor Author

@wibeasley ready for review here. Would be nice to merge soon so that we can start working on the larger issues in #48

@wibeasley wibeasley merged commit 8b425ab into IQSS:master Jan 28, 2020
@wibeasley
Copy link
Contributor

Thanks, @adam3smith!

@pdurbin
Copy link
Member

pdurbin commented Jan 29, 2020

@adam3smith I was just having beers tonight with @mfenner at PIDapalooza and we both think it's awesome that you're contributing.

And I said we probably wouldn't have you as a contributor if this package didn't have a maintainer. So thank you @wibeasley!

@kuriwaki kuriwaki linked an issue Dec 27, 2020 that may be closed by this pull request
1 task
@kuriwaki kuriwaki mentioned this pull request Dec 28, 2020
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Downloading multiple files
3 participants