additions to download_from_url and extract_archive #602

bentrevett · 2019-09-19T15:01:18Z

download_from_url always failed on non-Google Drive links with a KeyError: 'content-disposition' raised from this line. This is fixed by taking the filename from the URL itself instead of from the response headers.
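For illustration only, here is a minimal sketch of the failure and of the URL-based approach. It is not the code in this PR: `filename_from_url`, the example URL, and the example headers are all made up, and the sketch assumes the last path segment of the URL is the file name.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query and fragment."""
    return os.path.basename(urlparse(url).path)


# Hypothetical headers from a plain file server: no Content-Disposition.
headers = {"content-type": "application/gzip"}

header_missing = False
try:
    headers["content-disposition"]
except KeyError:
    # This is the kind of lookup that fails for non-Google Drive links.
    header_missing = True

# The URL path still carries a usable name.
name = filename_from_url("https://example.com/data/validation.tar.gz?dl=1")
```

Parsing the URL first keeps the query string out of the file name.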

This fix caused some issues where Google Drive links treated the end of the URL as the filename, so this is handled by setting the filename back to None after checking whether the URL is a Google Drive link.
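The Google Drive case could look roughly like the sketch below. This is an assumption about the shape of the logic, not the PR's code: `pick_filename` is a hypothetical helper, and the substring check on `drive.google.com` and the regular expression for the header are illustrative choices.

```python
import os
import re
from urllib.parse import urlparse


def pick_filename(url, headers):
    """Choose a file name for a download.

    For Google Drive links the URL tail is not the real file name, so
    the URL-derived name is reset to None and the Content-Disposition
    header is used instead.
    """
    filename = os.path.basename(urlparse(url).path)
    if "drive.google.com" in url:
        filename = None  # the end of the URL is not a file name here
    if filename is None:
        disposition = headers.get("content-disposition", "")
        match = re.search(r'filename="([^"]+)"', disposition)
        if match is None:
            raise RuntimeError("could not determine a file name")
        filename = match.group(1)
    return filename
```

With a link such as `https://drive.google.com/uc?export=download&id=...`, the URL tail would be `uc`, which is why the name is discarded before falling back to the header.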

I also added support for .zip files in extract_archive. This involved removing the archive argument and having the function determine it from the end of the filename. I had a quick look and this does not seem to break any existing functionality.
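As a rough sketch of extension-based dispatch (not the implementation in this PR), the idea is something like the following. The name `extract_archive_sketch`, the `to_path` parameter, and the exact set of accepted extensions are assumptions made for the example.

```python
import os
import tarfile
import zipfile


def extract_archive_sketch(from_path, to_path=None):
    """Extract a tar or zip archive, choosing the handler from the file name.

    Returns the extracted paths; raises NotImplementedError for any
    other extension.
    """
    if to_path is None:
        # Default to the directory that contains the archive.
        to_path = os.path.dirname(os.path.abspath(from_path))

    if from_path.endswith((".tar.gz", ".tgz", ".tar")):
        # Mode "r" lets tarfile detect the compression on its own.
        with tarfile.open(from_path, "r") as tar:
            names = tar.getnames()
            tar.extractall(to_path)
        return [os.path.join(to_path, name) for name in names]

    if from_path.endswith(".zip"):
        with zipfile.ZipFile(from_path, "r") as zf:
            names = zf.namelist()
            zf.extractall(to_path)
        return [os.path.join(to_path, name) for name in names]

    raise NotImplementedError("only tar and zip archives are handled here")
```

Only archives from trusted sources should be passed to `extractall`, since member names inside an archive can point outside the target directory.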

Also, the download_extract.py file did not seem to work in the first place. It calls os.path.split(path) on validation.tar.gz, so root ends up as an empty string, which causes an error. Changing the default to ./validation.tar.gz seems to fix that, but I am not 100% sure this is the intended functionality.
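The empty root is easy to reproduce with the standard library alone. The `parent_dir` helper below is a hypothetical alternative to changing the default path, shown only as a sketch.

```python
import os

# A bare file name has no directory component, so root is ''.
root, name = os.path.split("validation.tar.gz")

# With a leading './' the directory component is '.'.
root_fixed, name_fixed = os.path.split("./validation.tar.gz")


def parent_dir(path):
    """Directory containing path, falling back to '.' for bare file names."""
    return os.path.dirname(path) or "."
```

Either form gives later code a usable directory path, where the bare file name leaves it with an empty string.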


zhangguanheng66

LGTM. Thanks for the contribution. Could you add unit tests that cover both tar and zip archives? The lint errors should be fixed as well.

zhangguanheng66 · 2019-10-02T14:29:07Z

You can use flake8 to run the lint checks locally.

test/test_utils.py

+        with self.assertRaises(NotImplementedError):
+            utils.extract_archive(archive_path)
+
+        # remove file


zhangguanheng66

In Python 2, the exist_ok argument doesn't exist in os.makedirs(). If you really need it, you can check the Python version (example here).
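One way to do the version check, as a sketch rather than the linked example (`makedirs_compat` is a hypothetical name):

```python
import errno
import os
import sys


def makedirs_compat(path):
    """Create path and its parents, tolerating an existing directory."""
    if sys.version_info[0] >= 3:
        os.makedirs(path, exist_ok=True)
        return
    # Python 2: os.makedirs has no exist_ok, so ignore only the
    # "already exists" error, and only when path is a directory.
    try:
        os.makedirs(path)
    except OSError as err:
        if err.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

Calling it twice on the same path is safe on both major versions.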

zhangguanheng66 · 2019-10-02T15:35:25Z

@bentrevett Thanks for the contributions. Nice job. I just updated the PR with the master branch. Will merge it after the CI tests pass.

bentrevett added 2 commits September 19, 2019 15:37

added support for non-google drive links to download_from_url and support for .zip extensions in extract archive

01ea9b0

fixed bug in download_extract.py

6bd335a

zhangguanheng66 reviewed Sep 27, 2019

bentrevett added 3 commits October 2, 2019 13:20

fixed linting errors in utils

926db8c

added tests for utils.py

5d20dba

fixed copy and paste error

d8218ed

zhangguanheng66 reviewed Oct 2, 2019

fixed more flake8 lint errors

617acbc

zhangguanheng66 requested changes Oct 2, 2019

python 2 compatibility for make directory. also remove extracted folder.

15aae6f

zhangguanheng66 approved these changes Oct 2, 2019

Merge branch 'master' into zipfile

2124505

zhangguanheng66 merged commit fd31bf3 into pytorch:master Oct 2, 2019

zhangguanheng66 mentioned this pull request Feb 24, 2020

[Bug Fixing][BC Breaking] Unify tar and zip handling with extract_archive #692

Merged
