untar_data does not untar .tgz, .gz, or .tar.tgz files #1130
What is going on here
Here I will explain the current behavior of untar_data. Here is an example of the use of untar_data:
path = untar_data(URLs.PETS)
From the naming of the variable, it is common to assume that
untar_data(url:str, fname:PathOrStr=None, dest:PathOrStr=None, data=True)
"Download url if it doesn't exist to fname and un-tgz to folder dest"
So it also suggests that the first argument of If you look into the exact content of
However, if you try to visit that url or download the file using command like
with What I do find confusing here is that, though both the naming
if not fname.exists():
    print(f'Downloading {url}')
    download_url(f'{url}.tgz', fname)
So it is clear that the Now going back to your post, to download any dataset ending with
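The snippet quoted above appends `.tgz` to whatever URL it is given, so a URL that already ends in `.tgz` would be requested with the extension doubled. A minimal sketch of one workaround, assuming that behavior: trim the extension before passing the URL on. `strip_tgz_suffix` is a hypothetical helper written for this illustration, not part of fastai.

```python
def strip_tgz_suffix(url: str) -> str:
    """Return `url` without a trailing '.tgz'.

    Hypothetical helper: code that appends '.tgz' itself will then
    rebuild the original address instead of doubling the extension.
    """
    suffix = ".tgz"
    if url.endswith(suffix):
        return url[: -len(suffix)]
    # Anything else (.tar.gz, .gz, no extension) is returned untouched.
    return url
```

With the version of `untar_data` quoted in this thread, the call would then look like `untar_data(strip_tgz_suffix('https://s3.amazonaws.com/fast-ai-imageclas/mnist_png.tgz'))`. This does not help with `.tar.gz` or `.gz` files, since the appended extension would still be `.tgz`.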
Closing this. @odysseus0 you should put all that paragraph in a PR for the docs of untar_data.
@sgugger Do you think that there is any need to make changes to the fact that Should I create a PR for such feature recommendation? Sorry. I am still not very familiar with the general guidelines of contributing to open source.
That makes sense to me. I will start experimenting with it. However, another issue here is that the setup of I know this is really minuscule and you probably have way more important things to look after. In that case, should I simply start a PR for it?
Untar_data function gives an error "not a gzip file" when trying to download and untar a file.
Describe the bug
To Reproduce
untar_data('https://s3.amazonaws.com/fast-ai-imageclas/mnist_png.tgz')
untar_data('http://data.vision.ee.ethz.ch/cvl/food-101.tar.gz')
untar_data('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz')
Expected behavior
I expect the file to be downloaded and untarred.
Screenshots
Additional context
Getting an error that says "not a gzip file"
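For the three kinds of files listed under "To Reproduce", one way to unpack an already-downloaded file without relying on `untar_data` is to use the Python standard library directly. This is only a sketch: `extract_archive` is a hypothetical name, it is not fastai's implementation, and it assumes the file is trusted and already on disk. Note that a bare `.gz` such as `train-images-idx3-ubyte.gz` is a single compressed file rather than a tar archive, so it needs `gzip` instead of `tarfile`.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def extract_archive(archive, dest):
    """Unpack `archive` into directory `dest` and return the path written.

    Hypothetical helper for illustration; only use it on trusted archives.
    """
    archive, dest = Path(archive), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    if tarfile.is_tarfile(str(archive)):
        # Covers .tgz, .tar.gz and plain .tar: "r:*" detects the compression.
        with tarfile.open(str(archive), "r:*") as tar:
            tar.extractall(str(dest))
        return dest

    if archive.suffix == ".gz":
        # A bare .gz wraps one file; write it out under the name minus ".gz".
        target = dest / archive.stem
        with gzip.open(str(archive), "rb") as src, open(target, "wb") as out:
            shutil.copyfileobj(src, out)
        return target

    raise ValueError(f"Unsupported archive format: {archive}")
```

For a tar-based archive the function returns the destination folder; for a bare `.gz` it returns the path of the single decompressed file.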