XNLI dataset: NonMatchingChecksumError #690

Closed
xiey1 opened this issue Sep 30, 2020 · 5 comments · Fixed by #695

Comments

@xiey1

xiey1 commented Sep 30, 2020

Hi,
I tried to download the "xnli" dataset in Colab using
xnli = load_dataset(path='xnli')
but got a `NonMatchingChecksumError`:

`NonMatchingChecksumError Traceback (most recent call last)
in ()
----> 1 xnli = load_dataset(path='xnli')

3 frames
/usr/local/lib/python3.6/dist-packages/datasets/utils/info_utils.py in verify_checksums(expected_checksums, recorded_checksums, verification_name)
37 if len(bad_urls) > 0:
38 error_msg = "Checksums didn't match" + for_verification_name + ":\n"
---> 39 raise NonMatchingChecksumError(error_msg + str(bad_urls))
40 logger.info("All the checksums matched successfully" + for_verification_name)
41

NonMatchingChecksumError: Checksums didn't match for dataset source files:
['https://www.nyu.edu/projects/bowman/xnli/XNLI-1.0.zip']`

The same code worked in Colab several days ago but has now stopped working. Thanks!

@lhoestq
Member

lhoestq commented Oct 1, 2020

Thanks for reporting.
The data file must have been updated by the host.
I'll update the checksum with the new one.
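
For context, here is a minimal sketch of the kind of check that fails in the traceback above. It is an illustrative reimplementation, not the library's actual code: the exception class is a local stand-in, the `sha256_of` helper and the example URL are hypothetical, and using SHA-256 as the hash is an assumption.

```python
import hashlib


class NonMatchingChecksumError(Exception):
    """Local stand-in for the library's exception of the same name."""


def sha256_of(data: bytes) -> str:
    # Hypothetical helper: hex digest of the downloaded bytes.
    return hashlib.sha256(data).hexdigest()


def verify_checksums(expected: dict, recorded: dict) -> None:
    # Collect every URL whose recorded checksum differs from the expected one.
    bad_urls = [url for url in expected if expected[url] != recorded.get(url)]
    if bad_urls:
        raise NonMatchingChecksumError("Checksums didn't match:\n" + str(bad_urls))


url = "https://example.com/data.zip"  # hypothetical URL, not the XNLI host
expected = {url: sha256_of(b"original file contents")}
recorded = {url: sha256_of(b"file contents after the host changed them")}

try:
    verify_checksums(expected, recorded)
    mismatch_detected = False
except NonMatchingChecksumError as err:
    mismatch_detected = True
    print(err)
```

Any change to the hosted file, including a redirect or an error page served in its place, changes the digest, so the stored checksum no longer matches.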

@lhoestq
Member

lhoestq commented Oct 1, 2020

Well actually it looks like the link isn't working anymore :(

@lhoestq
Member

lhoestq commented Oct 1, 2020

The new link is https://cims.nyu.edu/~sbowman/xnli/XNLI-1.0.zip
I'll update the dataset script

@lhoestq
Member

lhoestq commented Oct 1, 2020

I'll do a release in the next few days to make the fix available for everyone.
In the meantime you can load xnli with

xnli = load_dataset('xnli', script_version="master")

This will use the latest version of the xnli script (available on the master branch) instead of the old one.

@xiey1
Author

xiey1 commented Oct 1, 2020

That's awesome! Thanks a lot!
