Metadata download error - OSError: Consistency check failed #33

Closed
ch-shin opened this issue Jul 20, 2023 · 7 comments

ch-shin commented Jul 20, 2023

Hi, team!

I am trying to download the medium-scale dataset of the filtering track, but I keep failing with the following error.

OSError: Consistency check failed: file should be of size 122218957 but has size 56690589 ((…)f11adbfc933c.parquet).
We are sorry for the inconvenience. Please retry download and pass `force_download=True, resume_download=False` as argument.
If the issue persists, please let us know by opening an issue on https://github.com/huggingface/huggingface_hub.

It seems related to this issue huggingface/huggingface_hub#1498
Is there any bypass for downloading metadata, without using huggingface_hub?
Thanks.
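
For context, the error message above reports a mismatch between an expected file size and the size found on disk. A minimal sketch of that kind of check, using only the standard library, could look like this. The function name `size_matches` is hypothetical, and this is not huggingface_hub's own implementation:

```python
import os


def size_matches(path: str, expected_size: int) -> bool:
    """Return True if the file at `path` has exactly `expected_size` bytes.

    Hypothetical helper for illustration only; it mirrors the kind of
    comparison the error message describes, not the library's actual code.
    """
    return os.path.getsize(path) == expected_size
```

With the numbers from the report, a file of 56690589 bytes checked against an expected 122218957 bytes would fail this comparison, which points to a truncated download rather than a corrupted one.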

ch-shin changed the title from "Metadata download error - OSERror: Consistency check failed" to "Metadata download error - OSError: Consistency check failed" Jul 20, 2023

gabrielilharco (Contributor) commented

Hi @ch-shin, do you also get this error with force_download=True, resume_download=False? The issue you linked suggests this could also be caused by running out of storage; do you have enough? Alternatively, have you tried using snapshot_download?
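
The retry suggested here can be sketched as a small wrapper that calls a download function again when it raises OSError. This is only a sketch under assumptions: `retry_on_oserror` and its parameters are hypothetical names, and the repo id in the commented usage is a placeholder, since the thread does not name the repo.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_oserror(download: Callable[[], T], attempts: int = 3,
                     wait_seconds: float = 5.0) -> T:
    """Call `download()` up to `attempts` times, retrying on OSError.

    Hypothetical helper: `download` is any zero-argument callable that
    performs the download and raises OSError on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return download()
        except OSError as err:
            last_error = err
            if attempt < attempts:
                time.sleep(wait_seconds)  # brief pause before the next try
    raise last_error


# Possible usage (placeholder repo id, requires huggingface_hub and network):
# from huggingface_hub import snapshot_download
# retry_on_oserror(lambda: snapshot_download(
#     repo_id="<repo_id>", repo_type="dataset",
#     force_download=True, resume_download=False))
```

Passing the download as a callable keeps the wrapper independent of any particular library call, so the same helper works with or without force_download.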

ch-shin (Author) commented Jul 20, 2023

  • It looks like snapshot_download is used in download_upstream.py by default, right?
  • I got the same error with force_download=True, resume_download=False as input arguments in snapshot_download.
  • Yes, storage is enough.

Oh, I found they fixed the force_download flag very recently (huggingface/huggingface_hub#1549 (comment)). I will check it out and let you know how it goes 😇.
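
The storage question above can be checked programmatically before starting a large download. A minimal sketch using the standard library, where `has_free_space` is a hypothetical helper name and the target directory is an assumption:

```python
import shutil


def has_free_space(directory: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `directory` has at least
    `required_bytes` free. Hypothetical helper for illustration."""
    return shutil.disk_usage(directory).free >= required_bytes
```

For the file in the error message, `has_free_space(".", 122218957)` would check that the current directory's filesystem can hold that one parquet file. The full dataset needs correspondingly more.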

Wauplin commented Jul 24, 2023

Hi @ch-shin, sorry you're experiencing this issue. Maintainer of huggingface_hub here. Which version of huggingface_hub are you using? If the error is still happening, it would be good to update to the latest release (0.16.4) and retry. To be honest, we are actively tracking down this issue, but we don't have a reliable way to trigger it, which makes it very hard to debug (I have personally never experienced it, even after a lot of attempts 😕).
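
To answer the version question, the installed version can be read without importing the package itself. A minimal sketch using the standard library, where the helper name `installed_version` is hypothetical:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if the
    package is not installed. Hypothetical helper for illustration."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Possible usage:
# print(installed_version("huggingface_hub"))
```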

ch-shin (Author) commented Jul 25, 2023

@Wauplin Hi! Thank you for following up on this. I updated to 0.17.0.dev0 and still got the same error. If I pass force_download=True, resume_download=False, I get the following error instead.

ValueError(
                "We have no connection or you passed local_files_only, so force_download is not an accepted option."
            )

from https://github.com/huggingface/huggingface_hub/blob/2940a65b22e9552b0dd40f0b61f502f66896d46d/src/huggingface_hub/file_download.py#L1253
My guess is that this happens when network bandwidth is too low while downloading big files, so the etag is lost (but the download somehow proceeds through some exception handling and later fails the consistency check? I don't know 😇).

Wauplin commented Jul 25, 2023

@ch-shin Thanks for your feedback. Would you have time for another test? If possible, can you install huggingface_hub from this PR (huggingface/huggingface_hub#1561)? It will not solve the error, but the stack trace will be more detailed.

To install it:

pip install git+https://github.com/jiamings/huggingface_hub@main

Then retry your failing script (btw, which file from which repo are you downloading?), both with and without force_download, and copy-paste the full error stack trace printed in your terminal. Thanks a lot in advance!

ch-shin (Author) commented Aug 14, 2023

@Wauplin Sorry that I missed your comment 😓. Actually, I just upgraded my internet connection (25 Mbps --> 500 Mbps) and the problem is gone.

ch-shin closed this as completed Aug 14, 2023

ffalkenberg commented

We are also experiencing this bug at our company, with huggingface_hub 0.16.4 installed.
